The Semantic Web
How did the Semantic Web begin ?



A Brief Aside on Software Agents


Agents Make Communication Easier 


Agents Collect, Process and Exchange Web Content


Open Interaction Between Agents


Even Alien Agents Can Understand Each Other




Note: see an updated version of this page on the Home site.

As most things do, the Semantic Web started small.  An article in the May 2001 issue of Scientific American described a futuristic world where software agents automatically schedule an entire series of medical treatments via the Semantic Web.  The article states that:    

"The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users".   

Software agents have been a hot topic for about the last 15 years.  There are several research and commercial variations of an Agent Standard.  In all of them, rules and other powerful conceptual structures are implemented explicitly.  The idea of software 'agents' has grown beyond any reasonable expectation in the last few years.  

The Wikipedia defines a software agent as:

"... an abstraction, a logical model that describes software that acts for a user or other program in a relationship of agency. Such "action on behalf of" implies the authority to decide when (and if) action is appropriate. The idea is that agents are not strictly invoked for a task, but activate themselves".

Agents are generally credited with enabling communication between people and machines in an easier and more natural manner than specialized rule languages and knowledge 'templates'.   Later, it may be helpful to think of the five agents listed above as a packaging of semantic services in a form that is useful for a certain type of knowledge-intensive task.

For now, it is important to understand that the Semantic Web is something that software agents use to accomplish useful tasks.   

The Scientific American article continues:

"The real power of the Semantic Web will be realized when people create many programs that collect Web content from diverse sources, process the information and exchange the results with other programs. The effectiveness of such software agents will increase exponentially as more machine-readable Web content and automated services (including other agents) become available. The Semantic Web promotes this synergy: even agents that were not expressly designed to work together can transfer data among themselves when the data come with semantics".

There are three important elements to the vision described above.

1 - Collecting, processing and exchanging Web content from diverse sources.

2 - Interaction between agents increasing the effectiveness of the Semantic Web, that is, it creates a synergy.

3 - All agents can work together, even agents who have not been designed to work together.

To what degree has the article's vision of the future been realized in the five years since it was published ?


More ...
Early Definitions of the Semantic Web


The Object Web

The ideas of a "Semantic Web" had been around in various guises for several years prior to the Scientific American article.  One precursor of the Semantic Web was the idea of "Web Objects" or the "Object Web" kicking around in the mid-1990s.  The OMG was a big part of that phase of development of the technical infrastructure.   

It really began to take off when the World Wide Web Consortium (W3C) became the authoritative source of standards for the Semantic Web.     

The W3 Definition of the Semantic Web


Common Formats


How Data Relates To Objects


A Web of Data


Data, Metadata or Both ?


Distributed Subject Databases


Can Humans Really Read RDF Ontologies ?

According to the W3, the most authoritative source of standards, the Semantic Web is "about two things", that is:

1 - Common formats for interchange of data, where on the original Web we only had interchange of documents

2 - Language for recording how the data relates to real world objects.

They continue, "that allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing".

Above all, they see it as a web of data, which is interesting for two reasons.  

From the Wikipedia: "In general, data consist of propositions that reflect reality. A large class of practically important propositions are measurements or observations of a variable. Such propositions may comprise numbers, words, or images". 

Clearly, most of the 'data' in the Semantic Web will not be 'observations' per se but data about data, that is, meta-data about the domain ( a price series for hog bellies ), the type ( decimal numbers ), the format ( currency ), etc.  So, in effect, the W3 is blurring the distinction between data and meta-data.  There are no separate data structures for data and meta-data.  It is all data.  
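That collapse of data and meta-data into one structure can be sketched with plain (subject, predicate, object) triples.  A minimal sketch, with invented names and values for illustration:

```python
# A toy sketch (names invented): a data point and its meta-data are all
# stored as uniform (subject, predicate, object) triples -- "it is all data".
triples = [
    ("ex:hogBellies", "rdf:type",    "ex:PriceSeries"),  # meta-data: what it is
    ("ex:hogBellies", "ex:unit",     "USD"),             # meta-data: the format
    ("ex:hogBellies", "ex:datatype", "xsd:decimal"),     # meta-data: the type
    ("ex:hogBellies", "ex:price",    "92.15"),           # the observation itself
]

# One query mechanism serves both kinds of statement equally.
def objects_of(subject, predicate):
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects_of("ex:hogBellies", "ex:unit"))   # ['USD']
print(objects_of("ex:hogBellies", "ex:price"))  # ['92.15']
```

Nothing in the triple structure itself distinguishes the observation from its description; that distinction lives only in the predicates.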

The second interesting thing is the unending journey of the person or machine through many distributed databases connected by a common thing or subject, something like a distributed subject database.         
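The "connected not by wires but by being about the same thing" idea can be illustrated with a toy sketch, using two invented triple stores that share a subject identifier:

```python
# Two independent "databases" of (subject, predicate, object) triples.
# The subject URI 'ex:hogBellies' and the predicates are invented for illustration.
db_prices  = {("ex:hogBellies", "ex:price",  "92.15")}
db_biology = {("ex:hogBellies", "ex:partOf", "ex:Pig")}

# No wires, no joins declared in advance: merging is just set union,
# and the shared subject is what connects the two sources.
merged = db_prices | db_biology

# Everything known about the common subject, drawn from both databases at once.
about_hog_bellies = {(p, o) for s, p, o in merged if s == "ex:hogBellies"}
print(sorted(about_hog_bellies))
# [('ex:partOf', 'ex:Pig'), ('ex:price', '92.15')]
```

A traversal can repeat this at each step: every shared identifier is a doorway into the next database, which is what makes the journey "unending".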

Also note that the agent can be a software program or a person.  That is built into the definition at this point.  But I think it is clear that almost everyone connected with the Semantic Web effort understands that eyeball-to-eyeball encounters with page after page of arcane XML/RDF/OWL rule definitions are not something that anyone but an enthusiast wants to cope with on a daily basis. 

More on the subject of human-readable XML/RDF/OWL later.  In fact, this is a major challenge for creating a non-technical tool set usable by ordinary people. 


The Semantic Web and Markup Languages 




The Wikipedia has a good definition of "Semantic Web".

The Semantic Web, also known loosely as Web 3, is a project that intends to create a universal medium for information exchange by putting documents with computer-processable meaning (semantics) on the World Wide Web. Currently under the direction of the Web's creator, Tim Berners-Lee of the World Wide Web Consortium, the Semantic Web extends the Web through the use of standards, markup languages and related processing tools.

The markup languages mentioned are especially important.  There are several levels and layers of standards, some focused on Web resources such as the Resource Description Framework ( RDF ), others on metadata such as the eXtensible Markup Language ( XML ).  Still others concentrate on expressing query logic ( SPARQL ), on the interchange of rule sets ( the Rule Interchange Format ), or on business rule logic itself ( the Business Rules Markup Language, BRML ).
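The core idea behind SPARQL can be conveyed without the full standard: a query is a triple pattern containing variables, matched against a graph of triples.  A minimal sketch, with invented data, where variables are marked by a leading '?':

```python
# A toy graph of (subject, predicate, object) triples; names invented.
graph = [
    ("alice", "rdf:type", "Person"),
    ("bob",   "rdf:type", "Person"),
    ("alice", "knows",    "bob"),
]

def match(pattern, graph):
    """Return one variable-binding dict per triple matching the pattern."""
    results = []
    for triple in graph:
        binding = {}
        for pat, val in zip(pattern, triple):
            if pat.startswith("?"):
                binding[pat] = val      # a variable binds to whatever it meets
            elif pat != val:
                break                   # a constant must match exactly
        else:
            results.append(binding)
    return results

print(match(("?who", "rdf:type", "Person"), graph))
# [{'?who': 'alice'}, {'?who': 'bob'}]
```

Real SPARQL adds joins over multiple patterns, filters, and much more, but pattern matching with variables is the heart of it.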


The Semantic Web as a Semantic Network


Is It a Knowledge Web ?


The Unification of All Scientific Content !!!


An old concept from AI, the semantic network, may have a second life in the Semantic Web.  In a semantic network, ontologies of distinct types are interpreted within evaluation networks that get their meaning from the semantic relations in which they participate.  In a sense, the subjects of the ontologies discover their roles as a consequence of their relationships rather than by declaration or assignment.  This can be seen as a direct result of RDF 'entailment rules' and the consequent 'entailment nets'.
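A minimal sketch of one such entailment rule, with invented class names: if A is a subclass of B and B is a subclass of C, then "A is a subclass of C" is entailed, and applying the rule to closure yields a tiny 'entailment net'.

```python
# Invented starting facts, as (subject, predicate, object) triples.
facts = {
    ("Beagle", "subClassOf", "Dog"),
    ("Dog",    "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
}

def entail(facts):
    """Apply the transitivity rule until no new triples appear."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, _, b in list(closure):
            for b2, _, c in list(closure):
                if b == b2 and (a, "subClassOf", c) not in closure:
                    closure.add((a, "subClassOf", c))
                    changed = True
    return closure

closure = entail(facts)
print(("Beagle", "subClassOf", "Animal") in closure)  # True
```

The role of "Beagle" relative to "Animal" was never declared; it emerged from the network of relationships, which is the semantic-network idea in miniature.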

In the biological sciences, there is a huge movement afoot to create a workable set of medical ontologies.  According to the definition of 'semantic web' provided by Genomics & Proteomics ( nestled between 'self-organization' and 'semiochemical' ), the goal is the "unification of all scientific content by computer languages and technologies that permit the interrelationships between scientific concepts to be communicated between machines". 


The Semantic Web as Advanced Search and Search Engines

A very interesting phenomenon is Swoogle, a sort of Google for the Semantic Web. 

There is also an interesting comment on the Swoogle Blog, probably belonging more properly to the previous section. 

One vision that many of us have is that the Web is evolving into a collective brain for human society, complete with long term memory (web pages), active behaviors (web services and agents), a stream of consciousness (the Blogosphere) and a nervous system (Internet protocols).

An ambitious example from several years ago ( before the RDF standard and the W3 Semantic Web project ) is SHOE, which may still have a few lessons for the SW five years later. 


The Semantic Web as a Wiki on Steroids

Wikipedia 3.0 and The End of Google



On June 26, 2006 at 5:20am EST, the Evolving Trends web site published an article entitled "Wikipedia 3.0: The End of Google?".  By June 28th, two days later, the article had reached 650,000 people - by July 1st, it was being referenced by over 6,000 other sites and had been read by close to 2,000,000 people.     

This phenomenon demonstrated two things.  First, it demonstrated the ability of the Web to generate a tremendous surge of interest in a fairly specialized subject at short notice by selecting pithy, controversial titles.

Secondly, and more importantly, it seems to demonstrate a growing dissatisfaction with Google's approach to classifying knowledge via search engines and indexing.  Certainly anyone who has studied the efficacy of the Google indexing paradigm knows that a well-formed Google search may reveal no more than 10% of the interesting sites on a given subject, depending on circumstances.  While the resources of a Web with a hundred million or so pages were readily accessible by Google, a Web of ten billion pages has apparently overwhelmed the basic indexing and search technology.  The information you are looking for is probably out there somewhere, but it may take a long struggle and good luck to find it.   

Semantic Wikis may provide an alternative to the Google 'knowledge bottleneck'.  Google's near-monopoly on web search, combined with its emerging role as global censor of inconvenient truths, is probably fueling the dissatisfaction.


The Semantic Web is NOT Web 2.0 


... Well, Not Exactly.






Another tier of terminology calls the Semantic Web something like "Web 2.0", or more commonly "Web 3.0".  Presumably, the earlier advances of "Web 1.0" and "Web 1.5" technology were restricted to improved content, database integration, graphical widgets, etc.  However, far from clarifying definitions, the term "Web Number Whatever" seems to generate yet another level of debate on its own. 

The Wikipedia has an uncharacteristically vague definition of "Web 2.0".  The terms "FOAF" and "XFN" mentioned in the quote will be described in detail in the next section; for now, just note that they facilitate social networking between people. 

Earlier users of the phrase "Web 2.0" employed it as a synonym for "Semantic Web," and indeed, the two concepts complement each other. The combination of social-networking systems such as FOAF and XFN with the development of tag-based folksonomies, delivered through blogs and wikis, sets up a basis for a semantic-web environment. Although the technologies and services that make up Web 2.0 lack the effectiveness of an internet in which the machines can understand and extract meaning (as proponents of the Semantic Web envision), Web 2.0 may represent a step in that direction.

So the bottom line is that Web 2.0 may represent a step in the direction of the Semantic Web, which is not a ringing endorsement.  Web 2.0 as envisioned lacks the capability for machines to understand the meaning of things.

On the other hand, it is becoming clear that the Web 2.0 initiative will represent a step in that direction, and perhaps far more than a small step in terms of the way the Semantic Web is used by ordinary, non-technical people in their everyday activities.  It will require powerful and sophisticated user interfaces to make the new semantic universe accessible to the non-technical 95% of people in the world rather than just the technically inclined 5%.  Web 2.0 technology seems to be playing a major role in constructing these interfaces. 




A More Humane Machine Readable Language ?


How Humane Is It ?


Does It Translate ?


The proponents of these markup languages represent their creations as an improvement over implementing rules in programming logic, and that is true from the standpoint of flexibility, but I'm not sure they are any more readable than programming logic to ordinary human beings.  This makes the end user completely reliant on ontology editors to interact with the final representation of the knowledge.  Are markup languages the strength or the soft underbelly of the Semantic Web ?

It's a critical factor and, in my opinion, there doesn't seem to be a simple way to express rules in both machine-readable and human-readable form at this point in time, although there are some interesting efforts toward Semantic Web editors of various sorts.

There is also a strong international flavor to the Semantic Web, even at this early stage.  Unlike Web 1.0 and maybe Web 2.0, the Semantic Web "3.0" is not going to be an English-only affair.  Translation to and from some sort of structured English ( or German or Spanish or whatever ) is probably still an absolute requirement for a multi-lingual rule language that is usable by ordinary human beings.  



Semantic Services