On Knowledge-Based Representations for Actionable Data …

I bumped into a professional acquaintance last week. After describing briefly a presentation I was about to give, he offered to broker introductions to others who might have an interest in the work I’ve been doing. To initiate the introductions, I crafted a brief description of what I’ve been up to for the past 5 years in this area. I’ve also decided to share it here as follows: 

As always, [name deleted], I enjoyed our conversation at the recent AGU meeting in Toronto. Below, I’ve tried to provide some context for the work I’ve been doing in the area of knowledge representations over the past few years. I’m deeply interested in any introductions you might be able to broker with others at York who might have an interest in applications of the same.

Since 2004, I’ve been interested in expressive representations of data. My investigations started with a representation of geophysical data in the eXtensible Markup Language (XML). Although this was successful, use of the approach underlined the importance of metadata (data about data) as an oversight. To address this oversight, a subsequent effort introduced a relationship-centric representation via the Resource Description Format (RDF). RDF, by the way, forms the underpinnings of the next-generation Web – variously known as the Semantic Web, Web 3.0, etc. In addition to taking care of issues around metadata, use of RDF paved the way for increasingly expressive representations of the same geophysical data. For example, to represent features in and of the geophysical data, an RDF-based scheme for annotation was introduced using XML Pointer Language (XPointer). Somewhere around this point in my research, I placed all of this into a framework.

A data-centric framework for knowledge representation.

A data-centric framework for knowledge representation.

 In addition to applying my Semantic Framework to use cases in Internet Protocol (IP) networking, I’ve continued to tease out increasingly expressive representations of data. Most recently, these representations have been articulated in RDFS – i.e., RDF Schema. And although I have not reached the final objective of an ontological representation in the Web Ontology Language (OWL), I am indeed progressing in this direction. (Whereas schemas capture the vocabulary of an application domain in geophysics or IT, for example, ontologies allow for knowledge-centric conceptualizations of the same.)  

From niche areas of geophysics to IP networking, the Semantic Framework is broadly applicable. As a workflow for systematically enhancing the expressivity of data, the Framework is based on open standards emerging largely from the World Wide Web Consortium (W3C). Because there is significant interest in this next-generation Web from numerous parties and angles, implementation platforms allow for increasingly expressive representations of data today. In making data actionable, the ultimate value of the Semantic Framework is in providing a means for integrating data from seemingly incongruous disciplines. For example, such representations are actually responsible for providing new results – derived by querying the representation through a ‘semantified’ version of the Structured Query Language (SQL) known as SPARQL. 

I’ve spoken formally and informally about this research to audiences in the sciences, IT, and elsewhere. With York co-authors spanning academic and non-academic staff, I’ve also published four refereed journal papers on aspects of the Framework, and have an invited book chapter currently under review – interestingly, this chapter has been contributed to a book focusing on data management in the Semantic Web. Of course, I’d be pleased to share any of my publications and discuss aspects of this work with those finding it of interest.

With thanks in advance for any connections you’re able to facilitate, Ian. 

If anything comes of this, I’m sure I’ll write about it here – eventually!

In the meantime, feedback is welcome.

April’s Contributions on Bright Hub

In April, I contributed two articles to the Web Development channel over on Bright Hub:

Recent Articles on Bright Hub

I’ve added a few more articles over on Bright Hub:

Annotation Modeling: In Press

Our manuscript on annotation modeling is one step closer to publication now, as late last night my co-authors and I received sign-off on the copy-editing phase. The journal, Computers and Geosciences, is now preparing proofs.
For the most part then, as authors, we’re essentially done.
However, we may not be able to resist the urge to include a “Note Added in Proof”. At the very least, this note will allude to:

  • The work being done to refactor Annozilla for use in a Firefox 3 context; and
  • How annotation is figuring in OWL2 (Google “W3C OWL2” for more).

Stay tuned …

CANHEIT 2008: York Involvement

York University will be well represented at CANHEIT 2008
Although you’ll find the details in CANHEIT’s online programme, allow me to whet your appetite regarding our contributions:

Synced-Data Applications: The Bastard Child of Convergence

At the Search Engine Strategies Conference in August 2006, in an informal conversation, Google CEO Eric Schmidt stated:

What’s interesting [now] is that there is an emergent new model, and you all are here because you are part of that new model. I don’t think people have really understood how big this opportunity really is. It starts with the premise that the data services and architecture should be on servers. We call it cloud computing – they should be in a “cloud” somewhere. And that if you have the right kind of browser or the right kind of access, it doesn’t matter whether you have a PC or a Mac or a mobile phone or a BlackBerry or what have you – or new devices still to be developed – you can get access to the cloud. There are a number of companies that have benefited from that. Obviously, Google, Yahoo!, eBay, Amazon come to mind. The computation and the data and so forth are in the servers.

My interpretation of cloud computing is summarized in the following figure.


Yesterday, I introduced the concept of Synced-Data Applications (SDAs). SDAs are summarized in the following figure.


SDAs owe their existence to the convergence of the cloud and the desktop/handheld.

Synced-Data Applications: The Future of End-User Software?

I recently asked: Is desktop software is dead?

Increasingly, I am of the opinion that desktop software is well on its way to extinction.

In its place, Synced-Data Applications (SDAs) have emerged.

One of the best examples I’ve recently run across is Evernote. Native Evernote applications exist for desktops (as well as handhelds) and for the cloud (e.g., via a Web browser). Your data is replicated between the cloud (in this example, Evernote’s Webstores) and your desktop(s)/handheld(s). Synced-Data Applications.

And with Google Gears, Google Docs has also entered the SDA software paradigm.

With SDAs, it’s not just about the cloud, and it’s not just about the desktop/handheld. It’s all about the convergence that this software paradigm brings.

A revised version of the figure I shared in the previous post on this thread is included below.

Once again, it emphasizes that interest is focused on the convergence between the isolated realm of the desktop/handheld on the one hand, and the cloud (I previously referred to this as the network) on the other.

It’s much, much less about commercial versus Open Source software. And yes, I remain unaware of SDA examples that live purely in the Open Source realm …

Is Desktop Software Dead?

When was the last time you were impressed by desktop software?

Really impressed?

After seeing (in chronological order) Steve Jobs, Al Gore and Tim Bray make use of Apple Keynote, I absolutely had to give it a try. And impressed I was – and to some extent, still am. For me, this revelation happened about a year ago. I cannot recall the previous instance – i.e., the time I was truly impressed by desktop software.

Although I may be premature, I can’t help but ask: Is desktop software dead?
A few data points:
  • Wikipedia states: “There is no page titled “desktop software”.” What?! I suppose you could argue I’m hedging my bets by choosing an obscure phrase (not!), but seriously, it is remarkable that there is no Wikipedia entry for “desktop software”!
  • Microsoft, easily the leading purveyor of desktop software, is apparently in trouble. Although Gartner’s recent observations target Microsoft Windows Vista, this indirectly spells trouble for all Windows applications as they rely heavily on the platform provided by Vista.
  • There’s an innovation’s hiatus. And that’s diplomatically generous! Who really cares about the feature/functionality improvements in, e.g., Microsoft Office? When was the last time a whole new desktop software category appeared? Even in the Apple Keynote example I shared above, I was impressed by Apple’s spin on presentation software. Although Keynote required me to unlearn habits developed through years of use Microsoft PowerPoint, I was under no delusions of having entered some new genre of desktop software.
  • Thin is in! The bloatware that is modern desktop software is crumbling under its own weight. It must be nothing short of embarrassing to see this proven on a daily basis by the likes of Google Docs. Hardware vendors must be crying in their beers as well, as for years consumers have been forced to upgrade their desktops to accommodate the latest revs of their favorite desktop OS and apps. And of course, this became a negatively reinforcing cycle, as the hardware upgrades masked the inefficiencies inherent in the bloated desktop software. Thin is in! And thin, these days, doesn’t necessarily translate to a penalty in performance.
  • Desktop software is reaching out to the network. Despite efforts like Microsoft Office Online, the lacklustre results speak for themselves. It’s 2008, and Microsoft is still playing catch up with upstarts like Google. Even desktop software behemoth Adobe has shown better signs of getting it (network-wise) with recent entres such as Adobe Air. (And of course, with the arrival of Google Gears, providers of networked software are reaching out to the desktop.)

The figure below attempts to graphically represent some of the data points I’ve ranted about above.

In addition to providing a summary, the figure suggests:

  • An opportunity for networked, Open Source software. AFAIK, that upper-right quadrant is completely open. I haven’t done an exhaustive search, so any input would be appreciated.
  • A new battle ground. Going forward, the battle will be less about commercial versus Open Source software. The battle will be more about desktop versus networked software.

So: Is desktop software dead?

Feel free to chime in!

To Do for Microsoft: Create a Wikipedia entry for “desktop software”.

Book Review: Google Web Toolkit

Automagically convert Java to JavaScript. 

Thus begins the seemingly curious proposition of the Google Web Toolkit (GWT). 
Of course, it’s about a lot more than that. 
For one thing, GWT addresses a key gap in the rapid delivery of the Asynchronous JavaScript and XML (AJAX) based applications that are driving eyeballs and mindshare to Google’s Web site.
By the time you’ve read Prabhakar Chaganti’s book on the GWT, you’ll be significantly wiser on at least two fronts. You’ll know that:
  1. There’s a broad-and-deep software engineering ecosystem around the GWT that is fueling progress and delivering highly significant results. 
  2. Chaganti is an excellent guide with the ability to negotiate this ecosystem and drive you towards tangible outcomes.

Using a task-oriented approach, the book proceeds as follows:

  • Chapter 1 rapidly places the GWT in context, and gets you started by downloading, installing and working with the samples provided. Available for Apple Mac OS X, Linux and Microsoft Windows, the GWT only requires the Java SDK as an installation prerequisite. The GWT is made available via the Apache Open Source license; this allows for the development of commercial and Open Source applications. 
  • With the Java SDK, the GWT and the Eclipse IDE, the developer has a well-integrated and powerful platform on which to develop applications. After illustrating the development of the obligatory “Hello World!” application at the outset of Chapter 2, attention shifts rapidly to use of Eclipse. Google’s Web-wired DNA is evident in everything they do, and the GWT is no exception. The GWT leverages the Java SDK and Eclipse to the fullest, while closing the gaps in developing AJAX-based applications in a very organized way. By the end of this Chapter, the reader knows how to develop a simple application with both client and server-side components and execute the same in both hosted (i.e., non-deployed) and Web hosted (i.e., executing within a Web-hosted Tomcat servlet container). Made explicit in this latter deployment is GWT’s ability to support a variety of Web browsers – i.e., Apple Safari, Microsoft Internet Explorer, Mozilla Firefox and Opera.
  • The creation of services is the focus of Chapter 3. To quote from this Chapter, and in the GWT context, service “… refers to the code that the client invokes on the server side in order to access the functionality provided by the server.” The author is quick to point out that this is a separate and distinct notion from that used in the context of Web services. True to its billing, this Chapter works the reader through the creation of a service definition interface (a client/server contract that defines the service’s functionality and establishes rules of usage) and service implementation. Particularly important in this Chapter is the creation of an asynchronous service definition interface, as this facilitates remote calls in the background to the server, and capitalizes on the AJAX support in the GWT. With definition and implementation taken care of, the remainder of the chapter focuses on use (i.e., consumption of the service by a client). Conceptual illustrations compliment screenshots to effectively convey this content. 
  • Whereas the previous chapter delivered a prime number service, Chapter 4 introduces no less than six services that really showcase the capabilities of this application paradigm. With ample explanation and illustration live searches, password strength checks, auto form fills, sorting tables, dynamically generated lists and Flickr-style editable labels are each considered. Not only does one recognize these as design patterns that are already in everyday use (e.g., Flickr, Google Docs, Maps and Search, etc.), one also realizes their potential for re-use in one’s own projects. 
  • Chapter 5 introduces five interfaces that are more complex than those presented in the previous chapter. These interfaces are pageable tables, editable tree nodes, log spy (the GWT spin on the UNIX tail utility), sticky notes and jigsaw puzzle. To reiterate, one recognizes these as design patterns already in everyday use, and the potential for re-usability.
  • Browser effects are the subject of Chapter 6. Here the author introduces the JavaScript Native Interface (JSNI) as a vehicle that allows JavaScript libraries (e.g., Moo.Fx and Rico) to be accessed directly from Java classes. A wrapper-based approach, independent of JSNI, is also introduced to leverage the Script.aculo.us effects. Although compelling effects can be achieved, cautionary words are included in this Chapter, as the impact may be diminished by browser-level incompatibilities.
  • By the end of Chapter 7, impressive calendar and weather widgets have been created, and readied for re-use. 
  • In Chapter 8, JUnit is introduced in the context of unit testing. Standalone tests plus test suites are given consideration; this includes tests involving asynchronous services.  
  • Although this is only the second book I’ve ever seen from Packt Publishing (the first I’ve reviewed elsewhere), I’ve become accustomed to expecting bonus content towards the end of the book. Chapter 9, which addresses internationalization and XML support, falls into this bonus category. Of course, it’s no surprise that Google expertise on internationalizations ranks high, and this is evident in GWT support for the same. The author provides an hors d’oeuvre of the possibilities. XML support is of particular personal interest, so I was delighted by the degree of support for creating and parsing XML documents. I share the author’s sentiments with respect to XML support wholeheartedly: I too hope that future releases of the GWT will provide broader and deeper support for XML.  
  • In the final chapter (Chapter 10), attention is given to increasingly automated methods for deploying GWT-based applications. Starting with a manual deployment in Tomcat, then an automated deployment with Ant, and finally an Ant-based deployment from within Eclipse. 
  • A single appendix details how to access and execute the examples provided throughout the book.
With the possible exception of a concluding chapter, page, paragraph or even sentence(!), to provide some sense of closure to the book, I am at a loss to report any omissions, oversights or errors of any consequence. And although it will have to wait for a follow-on contribution of some kind, additional discussion might be given to topics such as Google Gears or even Google Android.
Even though the book I reviewed was a complimentary copy provided by the publisher, I would happily pay for my own copy, and heartily recommend this book to others having interests in the GWT. 
By the way, Packt has an articulated scheme when it comes to Open Source projects:

Packt Open Source Project Royalty Scheme Packt believes in Open Source. When we sell a book written on an Open Source project, we pay a royalty directly to that project. As a result of purchasing one of our Open Source books, Packt will have given some of the money received to the Open Source project.In the long term, we see ourselves and yourselves, as customers and readers of our books, as part of the Open Source ecosystem, providing sustainable revenue for the projects we publish on. Our aim at Packt is to establish publishing royalties as an essential part of the service and support business model that sustains Open Source. 

I cannot suggest that Packt is unique in this approach. Regardless, their approach is certainly welcome.