Context is everything: why the semantic web is important
Posted by Bill Tue, 17 Jun 2008 11:17:00 GMT
There is a lot of discussion on the web about whether the semantic web will be the next big thing or whether it is something that can’t work; and about whether, to make it succeed, we should start top-down or bottom-up.
Whatever happens to the current set of W3C standards that make up the Semantic Web, one thing is for sure: semantics are important. They always have been and they always will be, because they are fundamental to the process of communication. What’s been changing in the last 20 or 30 years is the means we use to record and communicate information.
Let’s take the example of a spreadsheet holding, say, monthly sales figures for our made-up company Acme Ice Cream Inc.

For someone to understand this information, they need the file (eg sales2008.xls) and some software to process it (eg Excel) but also they need a whole lot of context and external knowledge. This ranges from a knowledge of English to an understanding of what the information is about and what it was intended to be used for. They need to know or be able to guess that “Mango” in this case is an ice cream flavour, not the fruit. You can’t tell from the spreadsheet if these are sales of the whole company, or one division, or one salesman, or what.
Things get more complicated if you want to exchange this information with another bit of software – say the company accounting system. The two bits of software need to have some kind of shared model of the data and what it means. And that’s where the fun starts. And why the systems integration industry is so large. You can do this integration one pair of systems at a time, but for N systems, that is N(N-1)/2 pairs – or in non-mathematical terms: if you have a lot of systems to be integrated, it’s a hell of a job.
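To put some numbers on that, here is a minimal Python sketch of the N(N-1)/2 formula. The function names are made up for illustration, and the second function assumes, purely for comparison, that a shared data model needs one adapter per system; that figure is not from any real project.

```python
def pairwise_integrations(n: int) -> int:
    """Number of distinct pairs among n systems: N(N-1)/2."""
    return n * (n - 1) // 2


def shared_model_adapters(n: int) -> int:
    """Assumption for comparison: one adapter per system to a shared model."""
    return n


for n in (2, 5, 10, 50):
    print(f"{n:>3} systems: {pairwise_integrations(n):>5} pairwise integrations, "
          f"{shared_model_adapters(n):>3} adapters to a shared model")
```

With 10 systems that is already 45 pairs to integrate, and with 50 systems it is 1225.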
It’s easy to look at the web and conclude that the number of different potential users of a piece of information is very large indeed: but even within one organization, there is a huge potential for re-using information more effectively. To enable this, we need some standardized way of representing data and data models – that’s where the W3C semantic web standards come in.
The simplest way of doing something often turns out in the long run to be the most powerful. HTTP is a good example of that, powering the web economy one request and response at a time. I think RDF will turn out to be the same.
The semantic web gets a bad name for being complicated: I think the rather turgid, not-really-intended-for-humans XML syntax is the main cause of this. (I always think of RDF data as bubbles joined up by lines: it’s much clearer). But the most important concepts are quite simple and quite familiar:
- the (subject, property, value) triple is the ‘atom’ of information: “Bill’s favourite food is cheese.”
- unique identifiers for everything: if two people are talking about the same thing, they should call it by the same name
- the graph as the basic data structure (in the mathematical sense of a set of nodes joined by edges): it’s more flexible than hierarchical or relational structures and a better match for the way people naturally think, supporting arbitrary links between different bits of data
- a type structure to assist analysis: if you know that Stilton and Camembert are types of cheese, then you can conclude that I will probably like them.
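The four ideas above can be sketched in a few lines of plain Python, without any RDF library. This is only an illustration under stated assumptions: the `ex:` identifiers, the property names `ex:favouriteFood` and `ex:typeOf`, and the helper functions are all invented here and are not part of any W3C vocabulary.

```python
# Each fact is a (subject, property, value) triple; the set of triples is the graph.
# All identifiers below are hypothetical names chosen for this example.
triples = {
    ("ex:Bill", "ex:favouriteFood", "ex:Cheese"),
    ("ex:Stilton", "ex:typeOf", "ex:Cheese"),
    ("ex:Camembert", "ex:typeOf", "ex:Cheese"),
}


def values(subject, prop):
    """Follow the edges labelled `prop` that leave `subject`."""
    return {v for s, p, v in triples if s == subject and p == prop}


def subtypes(category):
    """Everything declared to be a type of `category`."""
    return {s for s, p, v in triples if p == "ex:typeOf" and v == category}


def probably_likes(person):
    """If a person's favourite food is a category, guess they like its subtypes."""
    liked = set()
    for food in values(person, "ex:favouriteFood"):
        liked |= subtypes(food)
    return liked


print(sorted(probably_likes("ex:Bill")))  # ['ex:Camembert', 'ex:Stilton']
```

Because every node has a single agreed identifier, adding a new fact is just adding another triple to the set; nothing about the existing structure has to change.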
And as Jim Hendler put it, “a little semantics goes a long way”. We don’t need to add semantic mark-up to everything, or ensure that everything shares a common top level ontology. By extracting contextual information when data is being entered, maybe with a bit of judicious prompting of the user, it is often possible to capture semantic information automatically.
Let’s hope that we can start building some bridges between our isolated islands of information.


That is a good explanation people in the Web business should be able to understand.
I think, too, that the basic concepts are simple and familiar. Actually much more natural than the database concepts to which we grew accustomed over the years and which deformed our brains so much that we think they are natural :-)
Very well explained, with appropriate data. But adding a few other applications that use this stuff would make it more useful.