Semantic Web

Let’s start by dissecting the two words in “semantic web.” Semantics is the branch of linguistics and logic concerned with meaning. The web is the infrastructure that lets connected computers exchange information (HTML, documents, etc.). As a technologist you might argue that my definition of the web is somewhat narrow because it excludes transaction processing, browser-resident applications, and so on. But for the purposes of this post, let’s forget about transactions and just think about information, research, and data.

The problem with today’s web is that information is presented only as documents (HTML web pages, Word documents, etc.). The items of data inside those documents are not linked by meaning and relationship in a way that makes things easy to find. Humans are capable of using the web to carry out tasks such as finding documents on BMW motorcycles, reserving a book at the local library, and searching for the lowest price for a particular DVD. However, a computer cannot accomplish the same tasks without human direction because web pages are designed to be read by people, not machines. The semantic web adds tags to web content that computers can understand, so that they can perform more of the tedious work involved in finding, sharing, and combining information on the web.

In essence, the semantic web teaches computers what the information means. With a fully functioning semantic web, computers will understand the meaning behind documents. The Semantic Web is not about links between web pages. It describes the relationships between things (like A is a part of B and Y is a member of Z) and the properties of things (like size, weight, age, and price). When you perform a search, you will be searching for things, not just documents. Computers will be able to help us find what we need.
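
To make “relationships between things” and “properties of things” a little more concrete, here is a minimal Python sketch. It stores each fact as a (subject, predicate, object) triple and then searches for things rather than documents. The item names, predicate names, and prices are all made up for illustration, and real semantic web tools use far richer vocabularies than this toy list.

```python
# A toy fact list: each fact is a (subject, predicate, object) triple.
# Every name and value below is hypothetical, chosen only for illustration.
TRIPLES = [
    ("bike_alpha", "is_a", "motorcycle"),
    ("bike_alpha", "made_by", "BMW"),
    ("bike_beta", "is_a", "motorcycle"),
    ("bike_beta", "made_by", "OtherMaker"),
    ("car_gamma", "is_a", "car"),
    ("car_gamma", "made_by", "BMW"),
    ("dvd_delta", "is_a", "DVD"),
    ("dvd_delta", "price", 12),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def things_made_by(triples, maker, kind):
    """Find things of a given kind from a given maker (a search for things)."""
    from_maker = {s for (s, _, _) in match(triples, predicate="made_by", obj=maker)}
    of_kind = {s for (s, _, _) in match(triples, predicate="is_a", obj=kind)}
    return sorted(from_maker & of_kind)


if __name__ == "__main__":
    print(things_made_by(TRIPLES, "BMW", "motorcycle"))
    print(match(TRIPLES, subject="dvd_delta", predicate="price"))
```

The point of the sketch is that the query asks a question about things (“which motorcycles are made by BMW?”) and gets things back, with no document text to read through.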

Two new technologies are being used to build the semantic web:

  • RDFa (Resource Description Framework in attributes) is a set of extensions to XHTML which is now a W3C Recommendation. RDFa takes attributes from XHTML’s meta and link elements and generalizes them so that they are usable on all elements. This allows you to annotate XHTML markup with semantics (more information: Rdfa.info/about).
  • Microformats are a web-based approach to semantic markup that re-uses existing XHTML and HTML tags to convey metadata and other attributes. This approach allows information intended for end-users (such as contact information, geographic coordinates, calendar events, and the like) to also be processed automatically by software (more information: Microformats.org/about).
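
To give a feel for how software can process this kind of markup, here is a rough Python sketch that pulls contact details out of a small hCard-style microformat snippet using only the standard library. The sample HTML, the person, and the company are invented for illustration, and the parser assumes a flat structure with one card and no nested tags inside a property. Real tools handle many more cases than this.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: ordinary HTML whose class names (vcard, fn,
# org, tel) follow the hCard microformat convention.
SAMPLE_HTML = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="org">Example Corp</span>,
  <span class="tel">555-0100</span>
</div>
"""

# The subset of hCard property names this sketch looks for.
HCARD_PROPERTIES = {"fn", "org", "tel"}


class HCardParser(HTMLParser):
    """Collect the text of elements whose class is an hCard property."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._current = None  # property whose text is being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_PROPERTIES:
                self._current = name

    def handle_data(self, data):
        if self._current is not None:
            self.card[self._current] = self.card.get(self._current, "") + data

    def handle_endtag(self, tag):
        # Assumes no nested tags inside a property element.
        self._current = None


def extract_hcard(html):
    """Return a dict of the hCard properties found in the HTML."""
    parser = HCardParser()
    parser.feed(html)
    return {name: text.strip() for name, text in parser.card.items()}


if __name__ == "__main__":
    print(extract_hcard(SAMPLE_HTML))
```

A person reading the page sees a name, a company, and a phone number as plain text. The class names are what let a program pick out which piece is which.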

For those of you who would like to see the markup, I suggest you take a look at the Firefox Operator plug-in. Operator is an extension for Firefox that adds the ability to interact with semantic data on web pages, including microformats.

Interested in seeing a few examples of what’s possible with the semantic web? Several good illustrations are included on Altova’s site (Altova makes a semantic web editing tool). One example describes a software consultant looking for information on SOAP; another illustrates how arranging travel will get easier (https://www.altova.com/semantic_web.html).

The jury is still out on how widely these technologies will be adopted. But if the standards take hold, the impact on web sites, web search, and web site SEO will be huge.

I welcome your comments.
Mike Brannan