Annotation Paper Submitted to HPCS 2007 Event

I’ve blogged and presented recently (locally and at an international scientific event) on the topic of annotation and knowledge representation.

Working with co-authors Jerusha Lederman, Jim Freemantle and Keith Aldridge, I've prepared a written version of the recent AGU presentation and submitted it to the HPCS 2007 event. The abstract is as follows:

Semantically Enabling the Global Geodynamics Project:
Incorporating Feature-Based Annotations via XML Pointer Language (XPointer)

Earth Science Markup Language (ESML) is efficient and effective in representing scientific data in an XML-based formalism. However, features of the data being represented are not accounted for in ESML. Such features might derive from events, identifications, or some other source. In order to account for features in an ESML context, they are considered from the perspective of annotation. Although it is possible to extend ESML to incorporate feature-based annotations internally, complicating factors are identified that apply to ESML and most XML dialects. Rather than pursue the ESML-extension approach, an external representation for feature-based annotations via XML Pointer Language (XPointer) is developed. In previous work, it has been shown that it is possible to extract relationships from ESML-based representations, and capture the results in the Resource Description Framework (RDF). Application of this same requirement to XPointer-based annotations of ESML representations results in a revised semantic framework for the Global Geodynamics Project (GGP).
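To make the external-annotation idea a little more concrete, here is a minimal sketch, using Python's rdflib, of how an XPointer expression into an ESML document might be captured as an RDF annotation. The namespace, element names, XPointer target and feature vocabulary are illustrative assumptions only, not the actual ESML or GGP schemas.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

# Hypothetical annotation vocabulary -- the real GGP/ESML terms may differ.
ANNOT = Namespace("http://example.org/annotation#")

# An XPointer expression addressing a fragment of an ESML document,
# leaving the ESML itself untouched (element names are made up here).
target = URIRef(
    "http://example.org/ggp/station42.esml"
    "#xpointer(//timeSeries/segment[5])"
)

g = Graph()
annotation = URIRef("http://example.org/annotations/evt-001")

g.add((annotation, RDF.type, ANNOT.FeatureAnnotation))
g.add((annotation, ANNOT.annotates, target))  # external, feature-based annotation
g.add((annotation, ANNOT.featureType, Literal("event")))
g.add((annotation, ANNOT.description, Literal("Candidate event identified in the data")))

print(g.serialize(format="turtle"))
```

Because the annotation lives outside the ESML document and points in via XPointer, the same relationship-extraction step used previously for ESML itself can, in principle, be applied to these RDF statements as well.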

Once the paper is accepted, I’ll make a pre-submission version available online.

Because the AGU session I participated in has also issued a call for papers, I’ll be extending the HPCS 2007 submission in various interesting ways.

And finally, thoughts are starting to gel on how annotations may be worked into the emerging notions I’ve been having on knowledge-based heuristics.

Stay tuned.

Cooperative Computing Program Operating in Stealth Mode

Normally I don’t like to bandy about words like ‘cool’. It bespeaks narrative laziness. However, in this post I will purposely make an exception.

While at Platform, one of the coolest projects I worked on involved curriculum development.

Working with a team of faculty, administrators and industry representatives, we developed a curriculum for the Michigan Jewish Institute’s (MJI) degree program in Cooperative Computing.

What is Cooperative Computing? We defined it this way:

Cooperative Computing is the facilitated interchange of information between willing participants for individual or mutual gain.

Specifically:

  • Facilitated implies use of an enabling environment (e.g., .NET, J2EE, etc.), along with its associated data (e.g., XML with related standards and technologies) and programming models (e.g., C/C++, Java)
  • Interchange emphasizes the detailed (e.g., XML-based) interaction that is mediated via standards-based protocols and interfaces (e.g., SOAP plus Grid-enabled Web services)
  • Information is the payload of the interchange – and note that it is purposely information, rather than data. Why? Information is “data that has been interpreted, translated, or transformed to reveal the underlying meaning”. (Use of a lingua franca like XML ensures that this is the case – see the small sketch after this list.)
  • Willingness implies that the interaction is agreed upon – in fact, it’s most likely negotiated
  • Participants may be one or more people, but might also be other concrete entities (e.g., the environment) or even abstract entities (e.g., software and/or hardware components like agents, machines, robots, etc.)
  • Gain conveys the individual and/or mutual benefit derived from the interaction – e.g., the provision of service, a step in a business process, etc.
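As a toy illustration of facilitated interchange, the sketch below (with Python's standard library standing in for a full enabling environment; the element names and payload are invented for this example) shows one participant wrapping its data as self-describing XML and another interpreting that payload as information:

```python
import xml.etree.ElementTree as ET

# Sender: wrap raw data in a self-describing XML payload (illustrative schema).
def build_payload(reading: float, unit: str) -> bytes:
    msg = ET.Element("interchange")
    obs = ET.SubElement(msg, "observation", attrib={"unit": unit})
    obs.text = str(reading)
    return ET.tostring(msg, encoding="utf-8")

# Receiver: interpret the payload -- the unit attribute turns bare data
# into information with its meaning attached.
def interpret(payload: bytes) -> str:
    root = ET.fromstring(payload)
    obs = root.find("observation")
    return f"{obs.text} {obs.get('unit')}"

wire = build_payload(21.5, "degC")
print(interpret(wire))  # -> "21.5 degC"
```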

Use of this definition was consistent with MJI’s desire to use Cooperative Computing as an umbrella that could encompass Grid Computing, Web Services, Service Oriented Architectures (SOAs), and so on.

Although the curriculum was laid out almost exactly two years ago, it took MJI a little bit of time to do the real work of developing the content.

When I recently connected with marketing director Dov Stein, he indicated that the first cohort of students has almost completed the program.

If you open http://mji.edu/inside.asp?id=99929 and search for “Cooperative Computing” you can have a closer look at the curriculum. (Well, I said they were operating in stealth mode!)

PR Agency Deems My Article Cynical: Internal OGF Communication Leaked via Google Cache

I’m a huge fan of WordPress and Google.

While perusing my blog’s WordPress stats recently, I noticed that my opinion piece on the creation of the Open Grid Forum (OGF) was receiving interest.

On Googling “open grid forum”, my GRIDtoday article rated as the number three result. In first place was the OGF’s Web site itself, and in second place a breaking news article on the OGF in GRIDtoday. Not bad, given that Google reports some 17.7 million results (!) for that combination.

This prompted me to Google “open grid forum lumb”. Not surprisingly, my GRIDtoday article rated first out of some 822 results. Following four results pointing to my blog, and one more pointing to a Tabor Communications teaser, is the seventh result:

[gfac] FW: Final OGF Coverage Report
Harris also discusses a cynical article contributed by Ian Lumb of York University (formerly of Platform Computing Inc.), “Open Grid Forum: Necessary…but

http://www.ogf.org/pipermail/gfac/2006-July/000171.html – 12k – Cached – Similar pages

Somewhere between “… cynical article …” and a subject line that betrays an internal communication, my attention was grabbed!

So I clicked on the link and received back: “The requested URL /pipermail/gfac/2006-July/000171.html was not found on this server.” Darn!

Then I clicked on “Cached” … and:

This is G o o g l e‘s cache of http://www.ogf.org/pipermail/gfac/2006-July/000171.html as retrieved on 30 Sep 2006 05:14:59 GMT.
G o o g l e‘s cache is the snapshot that we took of the page as we crawled the web.

Excellent!

Below is an extract from the cached version of the page:

[gfac] FW: Final OGF Coverage Report

Linesch, Mark mark.linesch at hp.com
Thu Jul 6 16:15:25 CDT 2006

fyi mark
-----Original Message-----
From: Hatch, Marcie [mailto:Marcie.Hatch at zenogroup.com]
Sent: Thursday, July 06, 2006 1:08 PM
To: Linesch, Mark; Steve Crumb; tony.dicenzo at oracle.com; John Ehrig; Don Deutsch; Toshihiro Suzuki; robert.fogel at intel.com
Cc: Maloney, Nicole
Subject: Final OGF Coverage Report

Hi Team,

There have been nine pieces of total coverage resulting from the EGA/GGF merger announcement. The coverage has remained informative and continues to reiterate the key messages that were discussed during the press conference. Please note, the expected pieces by Patrick Thibodeau of Computerworld and Elliot King of Database Trends and Applications have not appeared, to date.

GRIDToday has featured four different pieces as a result of the announcement. Editor Derrick Harris summarized the various stories in an overview, providing the details of the announcement and points to the overall importance of grid computing. Harris also discusses his Q&A with Mark regarding the next steps for the OGF, the pace of standards adoption and how the OGF plans to balance the concerns of the commercial community with those of the research community.

Harris also discusses a cynical article contributed by Ian Lumb of York University (formerly of Platform Computing Inc.), “Open Grid Forum: Necessary…but Sufficient?” Lumb uses his experience working for Platform as a basis for his pessimistic outlook on grid computing. He states, “I remain a grid computing enthusiast, but as a realistic enthusiast, I believe that grid computing sorely needs to deliver definitive outcomes that really matter.”

Please let us know if you have any questions.

Kind regards,

Marcie Hatch

According to their Web site: “ZENO is a new-style communications company.” (Indeed!) And presumably, Marcie Hatch is one of their representatives. In this internal communication of the OGF’s Grid Forum Advisory Committee (GFAC), Ms. Hatch relays to OGF president and CEO Mark Linesch and colleagues her assessment of the coverage on the Enterprise Grid Alliance / Global Grid Forum merger announcement.

In the first paragraph of Ms. Hatch’s message, it is revealed that there have been nine items on the merger, although at least two more items were anticipated. The second paragraph introduces the coverage in GRIDtoday, and in the third paragraph, explicit reference to my GRIDtoday article is made. Before commenting on Ms. Hatch’s assessment of my article, let’s review how GRIDtoday editor Derrick Harris contextualized it originally:

However, not everyone is wholly optimistic about this new organization. Ian Lumb, former Grid solutions manager at Platform Computing, contributed an opinion piece questioning whether the OGF will be able to overcome the obstacles faced by the Grid market. While most in the Grid community are singing the praises of the OGF — and for good reason — it is nice to have a little balance, and to be reminded, quite honestly, that it will take a lot of work to get Grid computing to the place where many believe it should be.

Even with the benefit of hindsight, and Ms. Hatch’s assessment, I remain very comfortable with Harris’ contextualization of my article. And because it’s difficult to take the cynical spin from his words, I must assume that the cynical assessment derives from Ms. Hatch herself. For a variety of reasons, it’s very difficult for me to get through Ms. Hatch’s next sentence, “Lumb uses his experience working for Platform as a basis for his pessimistic outlook on grid computing.”, without laughing hysterically. I’m not sure how Ms. Hatch arrived at this assessment, as I appended to my GRIDtoday article the following in my bio:

Over the past eight years, Ian Lumb had the good fortune to engage with customers and partners at the forefront of Grid computing. For all but one of those eight years, Lumb was employed by Platform Computing Inc.

Now that’s a fairly positive spin for a cynic, and one that can be attested to by the Platform colleagues, customers and partners I interacted with. In re-reading my article, and indeed the earlier allusion to Platform in it, I believe it’s fairly clear that Ms. Hatch was unable to appreciate the Platform context. To reiterate, I needed to step away from the community so that I could appreciate the broader business and technical landscape. Ironically, even the OGF has acknowledged this broader landscape directly through the first of their two strategic goals. Ms. Hatch concludes her paragraph on my GRIDtoday article by quoting me directly. Not only is the quote not entirely a cynical one, it expresses sentiment that was conveyed by numerous others around the recent GridWorld event.

Not too surprisingly, I suppose, my GRIDtoday article did not make the “OGF News” page. Ironically, however, Globus Consortium president Greg Nawrocki’s blog post did:

July 2006 InfoWorld.com Blog, “A Broader Scope Needed for Grid Standards Bodies”
http://weblog.infoworld.com/gridmeter/archives/2006/07/a_broader_scope.html

Greg’s blog entry starts off: “There is a great article in a recent GRIDtoday from Ian Lumb detailing the Open Grid Forum’s necessity but questioning its sufficiency.”

For those of you who’ve read this far, I feel I owe you some lessons learned in closing, so here goes:

  • PR companies may position what they think you want to hear, but not necessarily what you need to hear – Engage in your own due diligence to ensure that their assessment matches your assessment, especially on matters that have any technical content.
  • OGF’s tagline is “Open Forum, Open Standards” – Hm?
  • Google results may inflate perspective, but Google cache delivers the goods – Semantics aside, is there any credibility in 17.7 million results for an entity created this past July? (I just re-ran the query and we’re up to 19.1 million results. Not bad for a few hours!) Google cache allowed me to view a mailing-list archive that, I expect, should’ve been off limits.

Cynically yours, Ian.

Gridness As Usual

In the wake of GridWorld, Intel’s Tom Gibbs writes in GRIDtoday:

The people toiling in the trenches in the Grid community have long stopped caring what it’s called or how folks outside the community think about it. They’re too busy making it work and haggling though standards needed to make it work better.

Understandable. However, if Grid Computing is to rise out of Gartner’s “Trough of Disillusionment”, a customer-centric perspective is needed. Based on their strategic priorities, even the Open Grid Forum acknowledges this.

NIST’s Guide to Secure Web Services

NIST has recently released a Guide to Secure Web Services. Their Computer Security Division describes the document as follows:

NIST is pleased to announce the public comment release of draft Special Publication (SP) 800-95, Guide to Secure Web Services. SP 800-95 provides detailed information on standards for Web services security. This document explains the security features of Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), the Universal Description, Discovery and Integration (UDDI) protocol, and related open standards in the area of Web services. It also provides specific recommendations to ensure the security of Web services-based applications.

Writing in Network World, M. E. Kabay extracts from the NIST report:

Perimeter-based network security technologies (e.g., firewalls, intrusion detection) are inadequate to protect SOAs [Service Oriented Architectures] … SOAs are dynamic, and can seldom be fully constrained to the physical boundaries of a single network. SOAP … is transmitted over HTTP, which is allowed to flow without restriction through most firewalls. Moreover, TLS [Transport Layer Security], which is used to authenticate and encrypt Web-based messages, is unsuitable for protecting SOAP messages because it is designed to operate between two endpoints. TLS cannot accommodate Web services’ inherent ability to forward messages to multiple other Web services simultaneously.
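The point about TLS being point-to-point is easiest to see with message-level security, where the signature travels with the payload through any number of intermediaries. Here is a minimal sketch using Python's cryptography package; in practice WS-Security and XML Signature would be applied to the SOAP envelope rather than signing raw bytes, and the payload shown is invented for the example:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# The requester signs the SOAP body itself, not the transport channel.
sender_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

soap_body = b"<getQuote><symbol>XYZ</symbol></getQuote>"  # illustrative payload
signature = sender_key.sign(soap_body, padding.PKCS1v15(), hashes.SHA256())

# The message (body plus signature) can be forwarded through any number of
# intermediary Web services; each hop, and the final recipient, can verify
# integrity without relying on a point-to-point TLS session.
sender_key.public_key().verify(signature, soap_body, padding.PKCS1v15(), hashes.SHA256())
print("signature verified end to end")
```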

The NIST document includes a number of recommendations, five of which Kabay highlights:

  • Replicate data and services to improve availability.
  • Use logging of transactions to improve accountability.
  • Use secure software design and development techniques to prevent vulnerabilities.
  • Use performance analysis and simulation techniques for end-to-end quality of service and quality of protection.
  • Digitally sign UDDI entries to verify the author of registered entries.

The NIST document definitely warrants consideration for anyone developing Web services.

Licensing Commercial Software for Grids: A New Usage Paradigm is Required

In the Business section of last Wednesday’s Toronto Star, energy reporter Tyler Hamilton penned a column on power-based billing by datacenter services provider Q9 Networks Inc. Rather than bill for space, Q9 chief executive officer Osama Arafat is quoted in Hamilton’s article stating:

… when customers buy co-location from us, they now buy a certain number of volt-amps, which is a certain amount of peak power. We treat power like space. It’s reserved for the customer.

Power-based billing represents a paradigm shift in quantifying usage for Q9.

Along with an entirely new business model, this shift represents a calculated, proactive response to market realities; to quote Osama from Hamilton’s article again:

Manufacturers started making the equipment smaller and smaller. Customers started telling data centre providers like us that they wanted to consolidate equipment in 10 cabinets into one.

The licensing of commercial software is desperately in need of an analogous overhaul.

Even if attention is restricted to the relatively simple case of the isolated desktop, multicore CPUs and/or virtualized environments are causing commercial software vendors to revisit their licensing models. If the desktop is networked in any sense, the need to recontextualize licensing is heightened.

Commercial software vendors have experimented with licensing locality in:

  • Time – Limiting licenses on the basis of time, e.g., allowing usage for a finite period of time with a temporary or subscription-based license, or time-insensitive usage in the case of a permanent license
  • Place – Limiting licenses on the basis of place, e.g., tying usage to hardware on the basis of a unique host identifier (both checks are sketched below)
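Both forms of locality reduce to simple checks at application start-up, along the lines of the sketch below. The license record, host-identifier scheme and date handling are illustrative assumptions; real license managers are considerably more involved:

```python
import datetime
import uuid

# Illustrative license record: locality in time (expiry) and place (host ID).
LICENSE = {
    "expires": datetime.date(2007, 12, 31),   # temporary/subscription license
    "host_id": "001a2b3c4d5e",                # tied to a unique host identifier
}

def current_host_id() -> str:
    # MAC address as a stand-in for a vendor-specific host identifier.
    return format(uuid.getnode(), "012x")

def license_valid(license: dict) -> bool:
    in_time = datetime.date.today() <= license["expires"]
    in_place = current_host_id() == license["host_id"]
    return in_time and in_place

if not license_valid(LICENSE):
    raise SystemExit("license check failed: wrong host or expired term")
```

Neither check says anything about how much the software is actually used, which is where these models begin to strain.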

Although commercial software vendors have attempted to be responsive to market realities, there have been only incremental modifications to the existing licensing models. Add to this the increased requirements emerging from areas such as Grid Computing, as virtual organizations necessarily transect geographic and/or organizational boundaries, and it becomes very clear that a new usage paradigm is required.

With respect to the licensing of commercial software, vendors are in a situation not unlike Q9’s prior to the development of power-based billing. What’s appealing about Q9’s new way of quantifying usage is its simplicity and, of course, its usefulness.

It’s difficult, however, to conceive of such a simple yet effective analog in the case of licensing commercial software. Perhaps this is where the Open Grid Forum (OGF) could play a facilitative role in developing a standardized licensing framework. To move swiftly towards tangible outcomes, however, the initial emphasis needs to be on a new way of quantifying the usage of commercial software, one that is not tailored to idealized and/or specific environments.
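By way of analogy with Q9’s volt-amps, one candidate unit of quantification is metered consumption – for example, core-hours drawn from a pool shared across a virtual organization, wherever the software happens to run. The sketch below is entirely illustrative, not a proposal from the OGF or any vendor:

```python
import time

class TokenPool:
    """Illustrative metered licensing: usage is billed in core-hours
    drawn from a pool shared across a virtual organization."""

    def __init__(self, core_hours: float):
        self.remaining = core_hours

    def charge(self, cores: int, wall_seconds: float) -> None:
        used = cores * wall_seconds / 3600.0
        if used > self.remaining:
            raise RuntimeError("license token pool exhausted")
        self.remaining -= used

pool = TokenPool(core_hours=10_000)

# A grid job checks out capacity for its actual consumption, wherever it runs.
start = time.time()
# ... run the licensed application on 8 cores ...
pool.charge(cores=8, wall_seconds=time.time() - start)
print(f"{pool.remaining:.1f} core-hours remaining")
```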

Licensing Commercial Software: A Defining Challenge for the Open Grid Forum?

Reporting on last week’s GridWorld event, GRIDtoday editor Derrick Harris states: “The 451 Group has consistently found software licensing concerns to be among the biggest barriers to Grid adoption.” Not surprisingly then, William Fellows (a principal analyst with The 451 Group) convened a panel session on the topic.

Because virtual organizations typically span geographic and/or organizational boundaries, the licensing of commercial software has been topical since Grid Computing’s earliest days. As illustrated below, traditional licensing models account for a single organization operating in a single geography (lower-left quadrant). Any deviation from this, as illustrated by any of the other quadrants, creates challenges for these licensing models as multiple geographies and/or multiple organizations are involved. Generally speaking the licensing challenges are most pronounced for vendors of end-user applications, as middleware needs to be pervasive anyway, and physical platforms (hardware plus operating system) have a distinct sense of ownership and place.

[Figure: grid_sw_licensing_vo.png – licensing challenges by quadrant: single vs. multiple organizations against single vs. multiple geographies, with traditional licensing covering only the single-organization, single-geography quadrant]

The uptake of multicore CPUs and virtualization technologies (like VMware) has considerably exacerbated the situation, as these break the simple, per-CPU licensing model employed by many Independent Software Vendors (ISVs), as illustrated below.

[Figure: grid_sw_licensing_hw.png – multicore CPUs and virtualization breaking the per-CPU licensing model]

In order to make progress on this issue, all stakeholders need to collaborate towards the development of recontextualized models for licensing commercial software. Even though this was apparently a relatively short panel session, Harris’ report indicates that the discussion resulted in interesting outcomes:

The discussion started to gain steam toward the end, with discussions about the effectiveness of negotiated enterprise licenses, metered licensing, token-based licenses and even the prospect of having the OGF develop a standardized licensing framework, but, unfortunately, time didn’t permit any real fleshing out of these ideas.

Although it’s promising that innovative suggestions were entertained, it’s even more interesting to me how the Open Grid Forum (OGF) was implicated in this context.

The OGF recently resulted from the merger of the Global Grid Forum (GGF) and the Enterprise Grid Alliance (EGA). Whereas Open Source characterizes the GGF’s overarching culture and philosophy regarding software, commercial software more aptly represents the former EGA’s vendor-heavy demographics. If OGF takes on the licensing of commercial software, it’s very clear that there will be a number of technical challenges. That OGF will also need to bridge the two solitudes represented by the former GGF and EGA, however, may present an even graver challenge.

Grid Computing’s Identity Crisis

Hanoch Eiron, Open Grid Forum (OGF) vice president of marketing, recently contributed a special feature to GRIDtoday. Even though Eiron’s contribution spans a mere three paragraphs, there is ample content to comment on.

Eiron opens with:

Let’s face it — the Grid hype by commercial vendors in the past few years was premature. Some would say that it has actually slowed the development of grids as it created customer expectations that could not be met.

IBM’s arrival on the Grid Computing scene, publicly marked by their endorsement of the Open Source Globus Toolkit, signified the dawn of vendor-generated hype. However, long before IBM sought to paint Grid Computing blue, it was Global Grid Forum (GGF) and Globus Project representatives who were the source of hype. Back in those BBB (Before Big Blue) days, academic gridders evangelized that Grid Computing represented the next phase in the ongoing evolution of Distributed Computing. And specifically with respect to Grid Computing standards and the Globus Toolkit:

This evolution in standards has wreaked havoc on the implementation front. For example, in moving from Versions 2 (protocol-specific implementation based on FTP, HTTP, LDAP, etc.) to 3 (introduction of Web services via OGSI) to 4 (refinement of previously introduced OGSI Web Services to WS-RF), the Open Source Globus Toolkit has undergone significant changes. When such changes break forward-compatibility in subsequent versions of the software, standards evolution becomes an impediment to adoption.

For a specific example, consider CERN’s gamble with Grid Computing:

The standards flux that resulted in evolving variants of the Globus Toolkit caused CERN and its affiliates some grief for at least two reasons.

  • First, projects like the LHC require significant advance planning. Evolving standards and implementations make advance planning even more challenging, and the allusions to gambling quite appropriate.
  • Second, despite the fact that CERN’s primary activity is academic research, CERN needs to provide a number of production-quality services. Again, such service levels are difficult to deliver on when standards and implementations are in a state of continuous change.

In other words, it’s not just vendors who have been guilty of hype and over-promising on deliverables.

Later in his first paragraph, Eiron states: “… it is clear that from a public perception standpoint, grids are now in a trough.” I couldn’t agree more. As the recent GridWorld event has ably demonstrated, considerable confusion exists about Grid Computing. Newbies, early adopters and even the Griderati are uncomfortable with the term, unclear on what it means and how it fits into the broader context of clustering, cyberinfrastructure, Distributed Computing, High Performance Computing (HPC), Service Oriented Architecture (SOA), Utility Computing, virtualization, Web Services, etc. (That adaptive enterprise and autonomic computing don’t receive much play is of mild consolation.) Grid Computing is in a trough because it is suffering from a serious identity crisis. Fortunately, Eiron and OGF are not in denial, and have plans to address this situation.

Eiron refers to Grid Computing’s latest poster child, eBay. And although I haven’t had the benefit of a deep dive on the technical aspects of the eBay Grid, I expect it to be a grid more in positioning than substance. In a GRIDtoday Q&A with Paul Strong, distinguished research scientist at eBay Research Labs, there is evidence of cluster-level workload management, clustered databases, farms of Web servers, and other examples of Distributed Computing technologies. However, nothing that Strong discusses seems that griddy. All of this echoes what I wrote previously in a GRIDtoday article:

The highest-profile demonstrations of Grid computing run the risk of trivializing Grid computing. It may seem harsh to paint the well-intentioned World Community Grid as technologically trivial, but in terms of full disclosure, this is not the most sophisticated demonstration of Grid computing. Equally damaging are those clustered applications (like Oracle 10g) that masquerade as Grid-enabled. Taking such license serves only to confuse and dilute the very essence of Grid computing.

Eiron’s own words serve well in summing up here:

It is clear that the community needs to do a better job of explaining the role of grids within the landscape of close and perhaps somewhat overlapping technologies, such as virtualization, services-oriented architecture (SOA), automation, etc. The Grid community also needs to better articulate how the architectures, industry standards and products can help customers reap the benefits of grids. It can use the perception trough as an opportunity to re-group and create a solid story that can be delivered upon, or morph into something else. It seems that much of the influence on how things will evolve is now in the Grid community’s own hands.

Of course, only time will tell if this window of opportunity is still open, and if the Grid Computing community is able to capitalize on it.

Grid: Early Adopters Prefer A Better Term

In a recent GRIDtoday article William Fellows, a savvy principal analyst with The 451 Group, states:

When asked, 70 percent of early adopters who responded to a survey said there is a better term than “Grid” to describe their distributed computing architectures: 23 percent said virtualization, 23 percent said HPC, 19 percent said utility computing, 19 percent said clustering, and 15 percent said SOA.

Sadly, this serves only to underline much of what I’ve been blogging about lately.