On the Use of Informal Ontologies in the Delivery of Service Oriented Architectures (SOAs)

In Service-Oriented Architecture: Concepts, Technology and Design, author Thomas Erl frames ontologies (section 10.2) in a top-down strategy for the delivery of a Service Oriented Architecture (SOA).

Ontologies are the first step in a multistep process that ultimately results in a Contemporary SOA (Erl, section 3.2.20):

Contemporary SOA represents an open, extensible, federated, composable architecture that promotes service-orientation and is comprised of autonomous, QoS-capable, vendor diverse, interoperable, discoverable, and potentially reusable services, implemented as Web services.

SOA can establish an abstraction of business logic and technology, resulting in a loose coupling between these domains.

SOA is an evolution of past platforms, preserving successful characteristics of traditional architectures, and bringing with it distinct principles that foster service-orientation in support of a service-oriented enterprise.

SOA is ideally standardized throughout an enterprise, but achieving this state requires a planned transition and the support of a still evolving technology set.

In the same chapter, Erl also provides an abridged Contemporary SOA definition:

SOA is a form of technology architecture that adheres to the principles of service-orientation. When realized through the Web services technology platform, SOA establishes the potential to support and promote these principles throughout the business process and automation domains of an enterprise.

In other words, buying into the top-down strategy can ultimately result in a Contemporary SOA and this is a big deal.

Erl also discusses the bottom-up strategy for delivering a SOA (section 10.2).

In striking contrast to the top-down strategy, and as Erl describes it, the bottom-up strategy does not incorporate ontologies. Despite the fact that “… the majority of organizations that are currently building Web services apply the bottom-up approach …” (Erl, pg. 368):

The bottom-up strategy is really not a strategy at all. Nor is it a valid approach to achieving a contemporary SOA. This is a realization that will hit many organizations as they begin to take service-orientation, as an architectural model, more seriously. Although the bottom-up design allows for the creation of Web services as required by applications, implementing an SOA at a later point can result in a great deal of retro-fitting and even the introduction of new standardized service layers positioned over the top of the non-standardized services produced by this approach.

After reading this chapter, one is left with the impression that Erl favors the agile strategy (Erl, section 10.4) as it attempts “… to find an acceptable balance between incorporating service-oriented design principles into business analysis environments without having to wait before introducing Web services technologies into technical environments.”

I would be willing to accept all of this on spec if it weren’t for the fact that it’s possible to create informal ontologies, in non-SOA contexts, during bottom-up processes.

And if this is possible in non-SOA contexts, then it’s reasonable that informal ontologies could be incorporated into the bottom-up strategy for SOA delivery.

I believe this is worth exploring because use of informal ontologies in a bottom-up strategy for SOA delivery may improve the potential for ultimately achieving a Contemporary SOA. (An outcome, you’ll recall from above, Erl stated wasn’t otherwise achievable.)

I also believe this is worth exploring because, as Erl states, most organizations are attempting to move towards SOAs from the bottom up.

Because the agile strategy (ideally) combines the best of both the top-down and bottom-up approaches, I also believe it’s worth exploring the potential for informal ontologies in this case as well.

Although further research is required, the figure below extends Erl’s Figure 10.3 (pg. 367) with a first-blush suggestion of how informal ontologies might be incorporated into the bottom-up strategy for SOA delivery.


It’s important to note that Erl’s original figure illustrates a five-step process that culminates with “Deploy services”.

Based on work I’ve done elsewhere, in this first-blush depiction, I believe the steps required to make use of informal ontologies would need to include:

  • “Extract service relationships” – In the work I’ve done elsewhere, this extraction has been achieved with Gleaning Resource Descriptions from Dialects of Languages (GRDDL). GRDDL uses XSLT to extract relationships from XML and represent them in RDF.
  • “Generate informal ontology” – These days, ontologies are often expressed in the Web Ontology Language (OWL). OWL is semantically richer and more expressive than plain XML. Much like the previous step, the informal ontology would likely be generated in OWL via processing that makes use of XSLT. This step might also involve the need to incorporate annotations.
  • “Integrate informal ontologies” – Because each act of modeling through deploying application services will result in an informal ontology, there will eventually be a pressing need to integrate these informal ontologies. This ontology integration, which may also involve top-down or formal ontologies, will provide the best possibilities for ultimately realizing a Contemporary SOA.
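
To make the first and third steps a little more concrete, here is a minimal Python sketch. It is not GRDDL: a small hand-written function stands in for the XSLT transformation, and the service description, the element names, the example.org namespace and the predicate names (hasOperation, consumes, produces) are all assumptions made up for illustration. Integration is reduced to its crudest form, a de-duplicated union of triples.

```python
import xml.etree.ElementTree as ET

# Hypothetical service description; element and attribute names are illustrative only.
SERVICE_XML = """
<service name="OrderService">
  <operation name="submitOrder">
    <input type="Order"/>
    <output type="Confirmation"/>
  </operation>
</service>
"""

BASE = "http://example.org/soa#"  # assumed namespace, not from any real deployment


def extract_relationships(xml_text):
    """Glean (subject, predicate, object) triples from the XML description."""
    root = ET.fromstring(xml_text)
    service = BASE + root.get("name")
    triples = []
    for op in root.findall("operation"):
        operation = BASE + op.get("name")
        triples.append((service, BASE + "hasOperation", operation))
        for child in op:
            # Inputs are "consumed" and outputs are "produced" by the operation.
            predicate = BASE + ("consumes" if child.tag == "input" else "produces")
            triples.append((operation, predicate, BASE + child.get("type")))
    return triples


def integrate(*ontologies):
    """Naive integration: the sorted, de-duplicated union of all triples."""
    merged = set()
    for triples in ontologies:
        merged.update(triples)
    return sorted(merged)


def to_ntriples(triples):
    """Serialize the triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(integrate(extract_relationships(SERVICE_XML))))
```

Real integration would also have to reconcile terms that differ in name but agree in meaning, which a simple union cannot do.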

Even at this early stage, the use of informal ontologies in the delivery of a SOA appears promising and worth investigating.

A Bayesian-Ontological Approach for Fighting Spam

When it comes to fighting spam, Bayesian and ontological approaches are not mutually exclusive.

They could be used together in a highly complementary fashion.

For example, you could use Bayesian approaches, as they are implemented today, to build a spam ontology. In other words, the Bayesian approach would be extended through the addition of knowledge-representation methods for fighting spam.
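
As a rough illustration of the idea, the Python sketch below scores tokens with Laplace-smoothed log-odds, as a naive Bayes filter might, and then promotes the strongly spam-leaning tokens to statements in a tiny spam "ontology". The threshold, the spam: prefix and the SpamIndicator class are hypothetical names chosen for this example, and a real spam ontology would need far richer relationships than a single type assertion.

```python
import math
from collections import Counter


def token_log_odds(spam_docs, ham_docs):
    """Laplace-smoothed log-odds that a token belongs to spam rather than ham."""
    spam = Counter(t for d in spam_docs for t in d.lower().split())
    ham = Counter(t for d in ham_docs for t in d.lower().split())
    vocab = set(spam) | set(ham)
    n_spam, n_ham, v = sum(spam.values()), sum(ham.values()), len(vocab)
    return {
        t: math.log((spam[t] + 1) / (n_spam + v))
        - math.log((ham[t] + 1) / (n_ham + v))
        for t in vocab
    }


def spam_concepts(log_odds, threshold=0.5):
    """Promote strongly spam-leaning tokens to (subject, predicate, object) triples."""
    return sorted(
        (f"spam:{t}", "rdf:type", "spam:SpamIndicator")  # hypothetical vocabulary
        for t, score in log_odds.items()
        if score > threshold
    )


if __name__ == "__main__":
    scores = token_log_odds(
        ["cheap pills now", "cheap loans now"],
        ["meeting notes attached", "lunch meeting tomorrow"],
    )
    for triple in spam_concepts(scores):
        print(triple)
```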

This is almost the opposite of the Wikipedia-based approach I blogged about recently.

In the Wikipedia-based approach, the ontology consists of ham-based knowledge.

In the alternative I’ve presented here, the ontology consists of spam-based knowledge.

Both approaches are technically viable. However, it’d be interesting to see which one actually works better in practice.

Either way, automated approaches for constructing ontologies, as I’ve outlined elsewhere, are likely to be of significant applicability here.

Another point is also clear: Either way, this will be a computationally intensive activity!

An Ontological Approach for Fighting Spam

Over the years, I’ve been impressed by Bayesian methods for fighting spam.

And although Bayesian methods improve by learning, they are ultimately statistically based.

In what I believe to be a first, Technion Faculty of Computer Science researchers have revealed their plans to develop an ontologically based solution for fighting spam. Also of interest is the fact that their raw data will come from Wikipedia.

These researchers could use the approach I’ve outlined elsewhere to build their ontologies.

Ultimately, it’ll be interesting to see how well this knowledge-based approach compares with Bayesian and other approaches in common usage today.

Virtual Ontologies

As noted elsewhere, I recently made a presentation at the Fall Meeting of the American Geophysical Union (AGU) in a session entitled Earth and Space Science Cyberinfrastructure: Application and Theory of Knowledge Representation.

Most presenters were focused on the development, integration, and use of formal ontologies. Such ontologies are created from the top down.

In contrast, with my collaborators I’ve been focused on ontology development from the bottom up (please see the figure below). This made me the only presenter who discussed the creation of informal ontologies. To convey the looser formality, some refer to such ontologies as folksonomies.


(There are a lot of acronyms in this figure, and a fairly complex workflow. If you’re interested in the details, please surf my blog, and especially my formal publications and elsewhere.)

As the figure seems to imply, informal ontologies are dynamically generated – possibly on demand. For this reason, they are also virtual ontologies. (I believe I’ve just coined a new term!)

Of course, integration of formal and informal ontologies will be a future requirement.

But at the moment, there’s still plenty of room for the basics at bottom 😉

Annotation Presentation to Remote Sensing Association

Next Wednesday, I’ll be making a presentation to the Ontario Association for Remote Sensing (OARS). The details are available online. As you can see from the abstract,

Incorporating Feature-Based Annotations into Automatically Generated Knowledge Representations

Earth Science Markup Language (ESML) is efficient and effective in representing scientific data in an XML-based formalism. However, features of the data being represented are not accounted for in ESML. Such features might derive from events (e.g., a gap in data collection due to instrument servicing), identifications (e.g., a scientifically interesting object in an image), or some other source. In order to account for features in an ESML context, consideration is given from the perspective of annotation, i.e., the addition of information to existing documents without changing the originals. Although it is possible to extend ESML to incorporate feature-based annotations internally (e.g., by extending the XML schema for ESML), there are a number of complicating factors that are identified. Rather than pursuing the ESML-extension approach, attention focuses on an external representation for feature-based annotations via XML Pointer Language (XPointer). In previous work, it has been shown that it is possible to extract relationships from ESML-based representations, and capture the results in the Resource Description Framework (RDF). Thus attention focuses here on exploring and reporting on this same requirement for XPointer-based annotations of ESML representations. Earth Science examples allow for illustration of this approach for introducing annotations into automatically generated knowledge representations.

my intention is to emphasize some of my recent research into annotation. Most of this work is being done in collaboration with Jerusha Lederman and Keith Aldridge of York University.
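
The external-annotation idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the XML below is a made-up stand-in rather than actual ESML, the hasFeature predicate and the example.org URIs are hypothetical, and the resolver handles only the child-sequence form of the XPointer element() scheme (for example element(/1/2), the second child of the document element). The point is that the annotation lives outside the data document, which is never modified.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an XML data description; this is not actual ESML syntax.
DATA_XML = """
<dataset id="example-dataset">
  <series day="101"/>
  <series day="102"/>
  <series day="103"/>
</dataset>
"""

# The annotation is stored externally and points into the unchanged document.
ANNOTATION = {
    "target": "element(/1/2)",
    "feature": "gap in data collection due to instrument servicing",
}


def resolve_child_sequence(root, pointer):
    """Resolve an element() child sequence such as 'element(/1/2)'."""
    if not (pointer.startswith("element(/") and pointer.endswith(")")):
        raise ValueError("only element() child sequences are supported here")
    steps = [int(s) for s in pointer[len("element(/"):-1].split("/")]
    if steps[0] != 1:
        raise ValueError("the first step must select the document element")
    node = root
    for step in steps[1:]:
        node = list(node)[step - 1]  # child positions are 1-based
    return node


def annotation_triple(document_uri, annotation):
    """Express the annotation as a (subject, predicate, object) triple."""
    subject = f"{document_uri}#{annotation['target']}"
    return (subject, "http://example.org/annot#hasFeature", annotation["feature"])


if __name__ == "__main__":
    root = ET.fromstring(DATA_XML)
    target = resolve_child_sequence(root, ANNOTATION["target"])
    print(target.tag, target.attrib)
    print(annotation_triple("http://example.org/data.xml", ANNOTATION))
```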

The above abstract very closely resembles a submission that was recently accepted for the Fall Meeting of the American Geophysical Union (AGU). I’ll be blogging more on that soon.

The Pressing Need to Integrate Ontologies

There is a pressing need to integrate ontologies.

I base this observation on purely anecdotal evidence. For example, I continue to notice papers on this topic at a number of scientific conferences. Of course, this is a welcome and expected outcome, as the need arises for inter-, intra- and extra-disciplinary ontology integration as various scientific disciplines develop their own ontologies.

In the bigger scheme of things, this is also a very positive sign that the promise of the Semantic Web is starting to be realized, as ontologies comprise a key component.

Why is this happening? Now? I believe it’s because it’s becoming easier to develop ontologies.

The first ontologies were likely developed manually (i.e., hand coded) from the top down. NASA’s SWEET is a good example of such a formal ontology. Today, formal ontologies are developed with the aid of editors like Protege or SWOOP.

In recent times, however, there’s been emphasis on developing ontologies from the bottom up.

The development of such informal ontologies (or folksonomies) is also being aided by leading-edge technologies:

  • Computer-Assisted Development of Ontologies (CADO) – Easy-to-use end user tools are being developed to directly and proactively engage scientists in the point-and-click, drag-and-drop process of developing ontologies. (Note: I purposely used “CAD” in the acronym here to resonate with “Computer Assisted Drafting” familiar to those with experience in, for example, industrial manufacturing.)
  • Automated Ontology Extractors (AOEs) – Technologies like GRDDL (Gleaning Resource Descriptions from Dialects of Languages) are being used to automate the extraction of relationships from XML-based representations. The relationships are captured in the Resource Description Framework (RDF). In turn, RDF can be recast in OWL (Web Ontology Language), resulting in an automatically generated ontology. This is an area that particularly fascinates me, and I’ve written about it elsewhere in some detail. One of my current research projects aims to integrate annotations encapsulated via XPointer into these automatically generated knowledge representations. This is an endeavor, I’m finding out, that bears much in common with the thrust of this post on integrating ontologies.
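
The "RDF recast in OWL" step in the second bullet can be illustrated with a deliberately simplistic Python sketch. It assumes the extracted relationships are already available as plain (subject, predicate, object) URI triples. The sample triples and the example.org URIs are hypothetical, and the recasting rule (every predicate becomes an owl:ObjectProperty, every subject or object an owl:NamedIndividual) is just one naive choice, not a description of how any particular tool works. The output is Turtle.

```python
# Hypothetical triples, standing in for relationships gleaned from an XML source.
SAMPLE_TRIPLES = [
    ("http://example.org/d#station42", "http://example.org/d#records", "http://example.org/d#temperature"),
    ("http://example.org/d#station42", "http://example.org/d#locatedIn", "http://example.org/d#ontario"),
]


def recast_as_owl(triples):
    """Declare OWL properties and individuals for plain RDF triples (Turtle output)."""
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "",
    ]
    properties = sorted({p for _, p, _ in triples})
    individuals = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    for p in properties:
        lines.append(f"<{p}> rdf:type owl:ObjectProperty .")
    for i in individuals:
        lines.append(f"<{i}> rdf:type owl:NamedIndividual .")
    for s, p, o in triples:
        lines.append(f"<{s}> <{p}> <{o}> .")  # the original assertions are kept as-is
    return "\n".join(lines)


if __name__ == "__main__":
    print(recast_as_owl(SAMPLE_TRIPLES))
```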

When I saw Web inventor Sir Tim Berners-Lee deliver a keynote at Bio-IT World in May 2005, it was quite clear that he was very excited about GRDDL. Why? He sees exemplars like GRDDL as vehicles for enabling ontology creation, and therefore ultimately realizing his original vision for a next-generation Web, i.e., a Semantic Web.

Because evangelism is one of the key values that Berners-Lee needs to continue to deliver on, it must be gratifying for him to see his efforts paying off in the current example of ontology development. In other words, we’re in the throes of shifting from a need to emphasize ontology development, to a need to emphasize ontology integration. There are clearly many challenges in integrating ontologies. Considerable emphasis is, and needs to continue to be, placed on this important topic. However, there’s little question that the ultimate outcome of a much more semantically enabled Web will be worth the investment.