Foraging for Resources in the Multicore Present and Future

HPC consultant Wolfgang Gentzsch has thoughtfully updated the case for multicore architectures in the HPC context. Over on LinkedIn, via one of the HPC discussion groups, I responded with:

I also enjoyed your article, Wolfgang – thank you. Notwithstanding the drive towards cluster-on-a-chip architectures, HPC customers will require workload managers (WLMs) that interface effectively and efficiently with O/S-level features/functionalities (e.g., MCOPt Multicore Manager from eXludus for Linux, to re-state your example). To me, this is a need well evidenced in the past: For example, various WLMs were tightly integrated with IRIX’s cpuset functionality (http://www.sgi.com/products/software/irix/releases/irix658.html) to allow for topology-aware scheduling in this NUMA-based offering from SGI. In present and future multicore contexts, the appetite for petascale and exascale computing will drive the need for such WLM-O/S integrations. In addition to the multicore paradigm, what makes ‘this’ future particularly interesting is that some of these multicore architectures will exist in a hybrid (CPU/GPU) cloud – a cloud that may complement in-house resources via some bursting capability (e.g., Bright’s cloud bursting, http://www.brightcomputing.com/Linux-Cluster-Cloud-Bursting.php). As you also indicated in your article, it is incumbent upon all stakeholders to ensure that this future is as friendly as possible (e.g., for developers and users). To update a phrase originally spun by Herb Sutter (http://www.gotw.ca/publications/concurrency-ddj.htm) in the multicore context, not only is the free lunch over, it’s getting tougher to find and ingest lunches you’re willing to pay for!

We certainly live in interesting times!
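
For readers wondering what WLM–O/S integration looks like at the level of a single node, the sketch below pins worker processes to subsets of cores – the kind of placement decision a topology-aware scheduler automates. It is only an illustration under my own assumptions (Linux, with a naive two-way split of cores standing in for NUMA nodes); it is not the interface of any particular workload manager.

    # A minimal sketch (Python, Linux only) of pinning worker processes to
    # subsets of cores. The two-way split of logical CPUs standing in for
    # NUMA nodes is an assumption for illustration, not any WLM's interface.
    import multiprocessing
    import os

    def worker(cores):
        os.sched_setaffinity(0, cores)  # restrict this process to the given CPUs
        print(f"worker {os.getpid()} pinned to CPUs {sorted(os.sched_getaffinity(0))}")
        # ... NUMA-friendly work would go here ...

    if __name__ == "__main__":
        total = os.cpu_count() or 1
        if total >= 2:  # pretend the two halves are two NUMA nodes
            placements = [set(range(total // 2)), set(range(total // 2, total))]
        else:
            placements = [{0}]
        procs = [multiprocessing.Process(target=worker, args=(p,)) for p in placements]
        for p in procs:
            p.start()
        for p in procs:
            p.join()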

Parsing XML: Commercial Interest

Over the past few months, a topic I’ve become quite interested in is parsing XML. And more specifically, parsing XML in parallel.

Although I won’t take this opportunity to expound in any detail on what I’ve been up to, I did want to state that this topic is receiving interest from significant industry players. For example, here are two data points:

Parsing of XML documents has been recognized as a performance bottleneck when processing XML. One cost-effective way to improve parsing performance is to use parallel algorithms and leverage the use of multi-core processors. Parallel parsing for XML Document Object Model (DOM) has been proposed, but the existing schemes do not scale up well with the number of processors. Further, there is little discussion of parallel parsing methods for other parsing models. The question is: how can we improve parallel parsing for DOM and other XML parsing models, when multi-core processors are available?

Intel Corp. released a new software product suite that is designed to enhance the performance of XML in service-oriented architecture (SOA) environments, or other environments where XML handling needs optimization. Intel XML Software Suite 1.0, which was announced earlier this month, provides libraries to help accelerate XSLT, XPath, XML schemas and XML parsing. XML performance was found to be twice that of open source solutions when Intel tested its product …

As someone with a vested interest in XML, I regard data points such as these as very positive overall.
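
To make at least the coarsest-grained variant of parallel XML parsing concrete, here is a minimal sketch that simply parses many documents at once, one per worker process. To be clear, this is not the intra-document (DOM-level) parallel parsing described in the first data point above – it is the embarrassingly parallel case – and the input path is a placeholder of my own.

    # Parse many XML documents concurrently, one document per worker process.
    # The "data/*.xml" path is a hypothetical placeholder.
    import glob
    import xml.etree.ElementTree as ET
    from concurrent.futures import ProcessPoolExecutor

    def count_elements(path):
        """Parse one XML document and return (path, number of elements)."""
        root = ET.parse(path).getroot()
        return path, sum(1 for _ in root.iter())

    if __name__ == "__main__":
        paths = glob.glob("data/*.xml")  # hypothetical input set
        with ProcessPoolExecutor() as pool:
            for path, count in pool.map(count_elements, paths):
                print(f"{path}: {count} elements")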

HP Labs Innovates to Sustain Moore’s Law

I had the fortunate opportunity to attend a presentation by Dr. R. Stanley Williams at an HP user forum event in March 2000 in San Jose.

Subsequently, I ran across mention of his work at HP Labs in various places, including Technology Review.

So, when I read in PC World that

Hewlett-Packard researchers may have figured out a way to prolong Moore’s Law by making chips more powerful and less power-hungry

I didn’t immediately dismiss this as marketing hyperbole.

Because of Moore’s Law, transistor density has traditionally grabbed all the attention when it comes to next-generation chips. However, power consumption, and the heat it generates, are also gating (sorry, I couldn’t resist that!) factors in chip design. This is why the trend towards multicore chip architectures is so well established. With the multicore paradigm, sustaining the relevance of Moore’s Law becomes a responsibility shared across many cores, rather than one shouldered by a single, ever-faster core.

Williams has found a way to sustain the relevance of Moore’s Law without having to make use of a multicore architecture.

Working with Gregory S. Snider, the HP team has redesigned the Field Programmable Gate Array (FPGA) by introducing a nano-scale interconnect (a field-programmable nanowire interconnect, FPNI). As hinted earlier in this posting, the net effect is to allow for significantly increased transistor density and reduced power consumption. A very impressive combination!

Although Sun executive Scott McNealy is usually associated with the aphorism “The network is the computer”, it may now be HP’s turn to restate it – this time at the scale of the chip itself.

Usability and Parallel Computing

According to Wikipedia:

Usability is a term used to denote the ease with which people can employ a particular tool or other human-made object in order to achieve a particular goal.

And further on, it’s stated:

In human-computer interaction and computer science, usability usually refers to the elegance and clarity with which the interaction with a computer program or a web site is designed.

Although most people focus on interface design when they hear this term, the definition allows room for more.

For example, I’m now toying with the idea of replacing “Accessible” with “Usable” in the context of the recently blogged interest in Accessible Parallel Computing.

Parallel Computing Needs to be More Accessible

There are two truths about parallel computing.

1. Parallel computing is hard.

To quote from a March 2005 article by Herb Sutter in Dr. Dobb’s Journal:

Probably the greatest cost of concurrency is that concurrency really is hard: The programming model, meaning the model in programmers’ heads that they need to reason reliably about their program, is much harder than it is for sequential control flow.

2. Parallel computing is going mainstream.

To quote Sutter again:

… if you want your application to benefit from the continued exponential throughput advances in new processors, it will need to be a well-written, concurrent (usually multithreaded) application. And that’s easier said than done, because not all problems are inherently parallelizable and because concurrent programming is hard.

Because 1 and 2 are to some extent in opposition, we have an escalating tension: more and more applications need to go parallel, while writing them remains as hard as ever.

So, this means we have to do a better job of making parallel computing, well, less hard – i.e., more accessible.

Since I returned to York University last April, this has become resoundingly clear to me. In fact, it is resulting in an Accessible Parallel Computing Initiative. I hope to be able to share much more about this soon. For now, you can read over my abstract for the upcoming HPCS 2007 event:

Accessible Parallel Computing via MATLAB

Parallel applications can be characterized in terms of their granularity and concurrency. Whereas granularity measures computation relative to communication, concurrency considers the degree of parallelism present. In addition to classifying parallel applications, the granularity-versus-concurrency template provides some context for the strategies used to introduce parallelism. Despite the availability of various enablers for developing and executing parallel applications, actual experience suggests that additional effort is required to reduce the required investment and increase adoption. York University is pioneering an investment-reducing, adoption-enhancing effort based on the use of MATLAB, and in particular the MATLAB Distributed Computing Toolbox and Engine. In addition to crafting an appropriate environment for parallel computing, the researcher-centric York effort places at least as much emphasis on the development and execution of parallel codes. In terms of delivery to the York research community, MATLAB M-files will be shared in a tutorial context in an effort to build mindshare and directly engage researchers in parallel computing. Although MATLAB shows significant promise as a platform for parallel computing, some limitations have been identified. Of these, limited support for threaded applications in a shared-memory context and for the Message Passing Interface (MPI) are of gravest concern.
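
Since I can’t excerpt the toolbox code itself here, the sketch below illustrates the granularity-versus-concurrency template in plain Python rather than MATLAB: the chunk size stands in for granularity (computation per unit of dispatch overhead) and the worker count for concurrency. The kernel and the sizes are arbitrary choices of mine, purely for illustration.

    # Chunk size sets the granularity, worker count sets the concurrency.
    import math
    import time
    from multiprocessing import Pool

    def work(chunk):
        # Stand-in for a compute-bound kernel applied to a chunk of inputs.
        return [math.sqrt(x) * math.sin(x) for x in chunk]

    def run(n_items, n_workers, chunk_size):
        data = list(range(n_items))
        chunks = [data[i:i + chunk_size] for i in range(0, n_items, chunk_size)]
        start = time.perf_counter()
        with Pool(n_workers) as pool:
            pool.map(work, chunks)
        return time.perf_counter() - start

    if __name__ == "__main__":
        for workers in (1, 2, 4):
            for chunk in (100, 10_000):
                t = run(200_000, workers, chunk)
                print(f"workers={workers} chunk={chunk}: {t:.3f}s")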

As always, I welcome your feedback.

Licensing Commercial Software for Grids: A New Usage Paradigm is Required

In the Business section of last Wednesday’s Toronto Star, energy reporter Tyler Hamilton penned a column on power-based billing by datacenter services provider Q9 Networks Inc. Rather than billing for space, Q9 now bills for power; chief executive officer Osama Arafat is quoted in Hamilton’s article stating:

… when customers buy co-location from us, they now buy a certain number of volt-amps, which is a certain amount of peak power. We treat power like space. It’s reserved for the customer.

Power-based billing represents a paradigm shift in quantifying usage for Q9.

Along with an entirely new business model, this shift represents a calculated, proactive response to market realities; to quote Osama from Hamilton’s article again:

Manufacturers started making the equipment smaller and smaller. Customers started telling data centre providers like us that they wanted to consolidate equipment in 10 cabinets into one.

The licensing of commercial software is desperately in need of an analogous overhaul.

Even if attention is restricted to the relatively simple case of the isolated desktop, multicore CPUs and/or virtualized environments are causing commercial software vendors to revisit their licensing models. If the desktop is networked in any sense, the need to recontextualize licensing is heightened.

Commercial software vendors have experimented with licensing locality in:

  • Time – Limiting licenses on the basis of time, e.g., allowing usage for a finite period of time with a temporary or subscription-based license, or time-insensitive usage in the case of a permanent license
  • Place – Limiting licenses on the basis of place, e.g., tying usage to hardware on the basis of a unique host identifier (both forms of locality are sketched below)
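
The sketch below makes those two checks concrete; the license record layout and the host-identifier scheme are toy assumptions of mine and do not reflect any real vendor’s licensing technology.

    # A toy illustration of the two checks above -- time (an expiry date) and
    # place (a node-locked host identifier).
    import datetime
    import uuid

    def host_id():
        """Derive a (weak) host identifier from the MAC address."""
        return format(uuid.getnode(), "012x")

    def license_is_valid(lic, today=None, this_host=None):
        """lic is a dict with optional 'expires' (date) and 'host' (str) keys."""
        today = today or datetime.date.today()
        this_host = this_host or host_id()
        if "expires" in lic and today > lic["expires"]:
            return False  # time locality violated
        if "host" in lic and lic["host"] != this_host:
            return False  # place locality violated
        return True

    if __name__ == "__main__":
        subscription = {"expires": datetime.date(2007, 12, 31)}
        node_locked = {"host": host_id()}
        print(license_is_valid(subscription))  # depends on today's date
        print(license_is_valid(node_locked))   # True only on this machine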

Although commercial software vendors have attempted to be responsive to market realities, there have been only incremental modifications to the existing licensing models. Add to this the increased requirements emerging from areas such as Grid Computing, as virtual organizations necessarily transect geographic and/or organizational boundaries, and it becomes very clear that a new usage paradigm is required.

With respect to licensing, commercial software vendors find themselves in a situation not unlike Q9’s prior to the development of power-based billing. What’s appealing about Q9’s new way of quantifying usage is its simplicity and, of course, its usefulness.

It’s difficult, however, to conceive of such a simple yet effective analog in the case of licensing commercial software. Perhaps this is where the Open Grid Forum (OGF) could play a facilitative role in developing a standardized licensing framework. To move swiftly towards tangible outcomes, however, the initial emphasis needs to be on a new way of quantifying the usage of commercial software – one that is not tailored to idealized and/or specific environments.

Licensing Commercial Software: A Defining Challenge for the Open Grid Forum?

Reporting on last week’s GridWorld event, GRIDtoday editor Derrick Harris states: “The 451 Group has consistently found software licensing concerns to be among the biggest barriers to Grid adoption.” Not surprisingly then, William Fellows (a principal analyst with The 451 Group) convened a panel session on the topic.

Because virtual organizations typically span geographic and/or organizational boundaries, the licensing of commercial software has been topical since Grid Computing’s earliest days. As illustrated below, traditional licensing models assume a single organization operating in a single geography (lower-left quadrant). Any deviation from this, as illustrated by any of the other quadrants, creates challenges for these licensing models, as multiple geographies and/or multiple organizations are involved. Generally speaking, the licensing challenges are most pronounced for vendors of end-user applications, as middleware needs to be pervasive anyway, and physical platforms (hardware plus operating system) have a distinct sense of ownership and place.

[Figure: licensing quadrants – single versus multiple geographies and organizations (grid_sw_licensing_vo.png)]

The uptake of multicore CPUs and virtualization technologies (like VMware) has considerably exacerbated the situation, as these technologies break the simple, per-CPU licensing model employed by many Independent Software Vendors (ISVs), as illustrated below.

[Figure: the impact of multicore CPUs and virtualization on per-CPU licensing (grid_sw_licensing_hw.png)]

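To see why, consider a deliberately hypothetical box and count the licenses that different readings of “CPU” would demand:

    # Hypothetical dual-socket, quad-core server hosting eight single-vCPU VMs:
    # the license count depends entirely on what the vendor decides a "CPU" is.
    sockets = 2
    cores_per_socket = 4
    vms = 8
    models = {
        "per socket": sockets,
        "per core": sockets * cores_per_socket,
        "per VM": vms,
    }
    for model, count in models.items():
        print(f"{model:>10}: {count} licenses")
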
In order to make progress on this issue, all stakeholders need to collaborate towards the development of recontextualized models for licensing commercial software. Even though this was apparently a relatively short panel session, Harris’ report indicates that the discussion resulted in interesting outcomes:

The discussion started to gain steam toward the end, with discussions about the effectiveness of negotiated enterprise licenses, metered licensing, token-based licenses and even the prospect of having the OGF develop a standardized licensing framework, but, unfortunately, time didn’t permit any real fleshing out of these ideas.

Although it’s promising that innovative suggestions were entertained, it’s even more interesting to me how the Open Grid Forum (OGF) was implicated in this context.

The OGF recently resulted from the merger of the Global Grid Forum (GGF) and the Enterprise Grid Alliance (EGA). Whereas Open Source characterizes the GGF’s overarching culture and philosophy regarding software, commercial software more aptly represents the former EGA’s vendor-heavy demographics. If OGF takes on the licensing of commercial software, it’s very clear that there will be a number of technical challenges. That OGF will also need to bridge the two solitudes represented by the former GGF and EGA, however, may present an even graver challenge.
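
Of the ideas floated on the panel, token-based licensing is perhaps the easiest to make concrete. The sketch below captures the essence of a shared token pool for a virtual organization; the class, the capacity and the cost-per-feature accounting are entirely assumptions of mine, not anything proposed by the panel, the vendors, or OGF.

    # A toy token pool: features draw differing numbers of tokens from a
    # shared, negotiated entitlement.
    import threading

    class TokenPool:
        """A shared pool of license tokens."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.in_use = 0
            self.lock = threading.Lock()

        def checkout(self, tokens):
            with self.lock:
                if self.in_use + tokens > self.capacity:
                    return False  # would exceed the entitlement
                self.in_use += tokens
                return True

        def checkin(self, tokens):
            with self.lock:
                self.in_use = max(0, self.in_use - tokens)

    if __name__ == "__main__":
        pool = TokenPool(capacity=100)
        print(pool.checkout(60))  # True  -- e.g., a solver feature admitted
        print(pool.checkout(60))  # False -- a second solver must wait
        pool.checkin(60)
        print(pool.checkout(60))  # True  -- admitted once tokens are returned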