In the current issue of the IBM Systems Journal, Rose et al. write:
Although much work has been done on virtualization of physical servers and provisioning of applications over a grid of such servers, less attention has been given to the idea of virtualization of data resources and the use of fewer languages for querying, reporting, and manipulating stored data.
Thus their objective is to present “the idea of information virtualization by using the eXtensible Markup Language (XML) data model as the framework.” Rose et al.:
- Describe their toolbox – XPath, a cursor model, data-access patterns, DFDL, processing languages
- Provide five use cases – a sensor-based computer system, a commercial broker system, an archival system, a file access method and a data-aggregator application
- Identify their reference implementation
Rose et al.’s work has much in common with a data model I developed with Keith Aldridge of York University. Motivated by an actual use case in global geophysics, our data model is also XML-based. More specifically, our model is based on the XML dialect Earth Sciences Markup Language (ESML), for a number of reasons:
- ESML makes use of XML Schema (XSD)
- ESML provides support for ASCII-format files
- ESML has Earth Science affinities
- ESML has industry standard affinities (specifically DFDL)
- ESML is being used in large-scale projects (like LEAD)
In addition to XPath and XQuery, Keith and I alluded to the use of XInclude, XPointer and XSLT.
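To make the XPath piece concrete, here is a minimal sketch in Python of querying an XML-described data set by path expression rather than by file layout. The document, its element and attribute names, and the `readings_for` helper are all hypothetical, invented for illustration; this is not ESML or DFDL syntax, and it is not taken from either paper. It relies only on the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML view of sensor data. The element and attribute names
# are assumptions made for this sketch, not ESML or DFDL constructs.
DOC = """
<dataset name="gravimeter">
  <reading station="A" time="2006-01-01T00:00:00Z">9.8061</reading>
  <reading station="B" time="2006-01-01T00:00:00Z">9.8073</reading>
  <reading station="A" time="2006-01-01T01:00:00Z">9.8064</reading>
</dataset>
"""


def readings_for(root, station):
    """Return the numeric readings for one station, in document order."""
    # An attribute predicate selects only the matching <reading> children.
    path = f"./reading[@station='{station}']"
    return [float(elem.text) for elem in root.findall(path)]


root = ET.fromstring(DOC)
station_a = readings_for(root, "A")
print(station_a)
```

The point of the sketch is that the caller names what it wants (readings from station A) and never touches byte offsets or column positions, which is the kind of separation an XML-based data model is meant to provide.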
Of course, as we discussed in our paper and elaborated on elsewhere, this is only the beginning: the beginning of a path that starts with metadata and leads to greater semantic expressivity and richness in the representation of data.