The importance of good URLs
One of our mantras at Swirrl is “data in context” and a key part of creating that context is the ability to link bits of data together, or link descriptive text and discussions to data: essentially using the same principles that work to make the web so powerful.
And to link to something in a web context, that thing needs to have a URL. Not only does a URL allow you to find information on an object (by getting your browser to follow the link and show you what’s there), it also allows you to identify the object clearly, so you can then refer to it in other places.
So good URLs make linking easy – and linking is important in lots of fairly obvious ways:
- You can refer to something so that you can make a statement about it. When you are talking about data, you want to be able to do this in a fine-grained way.
- You can organize your information with links, and you can do this in several ways at once: for example, a book can have both a table of contents and an index, as two separate sets of pointers into the same content.
- By making pointers to parts of the information, you make it easy for others to find particular things.
When we were scoping out our initial plans for Swirrl, we were influenced by Tom Coates’ excellent presentation at FOWA2006, “Native to a web of data”. (Slides here). One of Tom’s key points is “identify your first order objects and make them addressable”.
What are our first order objects? Swirrl works with data sets. A data set is a table of data, where each cell in the table is essentially an RDF statement, containing the value of a property of something. So we needed to have URLs for data sets, and for individual statements within a data set. We found it’s also handy to be able to individually address things and properties.
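To make the "each cell is a statement" idea concrete, here is a minimal Python sketch. It is an illustration only, not Swirrl's actual data model: the `Statement` type, the `table_to_statements` helper, and the sample fruit data are all hypothetical names and values made up for this example.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One cell of a data set (hypothetical structure, for illustration)."""
    thing: str     # the row: the thing being described
    prop: str      # the column: the property of that thing
    value: object  # the cell: the value of the property


def table_to_statements(table):
    """Flatten a table (thing -> {property -> value}) into statements."""
    return [
        Statement(thing, prop, value)
        for thing, props in table.items()
        for prop, value in props.items()
    ]


# Made-up sample data: two rows (things) and two columns (properties).
table = {
    "apple": {"colour": "red", "weight_g": 150},
    "banana": {"colour": "yellow", "weight_g": 120},
}

statements = table_to_statements(table)
for s in statements:
    print(s.thing, s.prop, s.value)
```

Seen this way, the data set, each statement, each thing and each property are all candidates for their own URL.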
OK, so now we know what should have its own URL – but how do we choose what that URL should be? Tom tells us (with a couple of my comments in brackets):
Good URLs should:
- be permanent references to resources (don’t change them!)
- have a 1-to-1 correlation with concepts
- use directories to represent hierarchy
- not reflect the underlying technology (users don’t want to see .aspx or .jsp or whatever in their URLs)
- reflect the structure of the data
- be predictable / guessable / hackable
- be as human readable as possible
- be – or expose – identifiers
We’ve tried to follow these guidelines, so for example, if you have a data set called “my_data” in a wiki called “my_wiki”, then the URL for this is http://www.swirrl.com/my_wiki/data_sets/my_data, which we think is about as simple and clear as you can get.
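One way to see why this pattern is "predictable / guessable / hackable" is that a few lines of code can build the URL from its parts and take it apart again. The Python sketch below assumes only the wiki/data_sets/name pattern shown above. The function names `data_set_url` and `parse_data_set_url` are hypothetical, and this is not a description of how Swirrl routes requests.

```python
from urllib.parse import quote, unquote, urlsplit

BASE = "http://www.swirrl.com"


def data_set_url(wiki, data_set):
    """Build a data set URL following the pattern described in the post."""
    # Percent-encode each segment so unusual names still yield a valid URL.
    return f"{BASE}/{quote(wiki, safe='')}/data_sets/{quote(data_set, safe='')}"


def parse_data_set_url(url):
    """Recover (wiki, data_set) from a URL of the same shape."""
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[1] != "data_sets":
        raise ValueError(f"not a data set URL: {url}")
    return unquote(parts[0]), unquote(parts[2])


url = data_set_url("my_wiki", "my_data")
print(url)
print(parse_data_set_url(url))
```

Because the hierarchy is visible in the path, a reader can guess the URL of another data set in the same wiki just by changing the last segment.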
