Why we link …

The incentives to put structured data on the Web seem to be slowly sinking in, but why does it make sense to link your data to other data? Why invest time and resources to offer 5 star data? Even though the interlinking itself is becoming more of a commodity these days – for example, the 24/7 platform we’re deploying in LATC is an interlinking cloud offering – the motivation for dataset publishers to set links to other datasets is, in my experience, not obvious.

I think it’s important to have a closer look at the motivation for interlinking data on the Web from a data integration perspective. Traditionally, you would download data from, say, Infochimps, or find it via CKAN or one of the many other places that either offer data directly or provide a data catalog. Then you would put it in your favorite (NoSQL) database and use it in your application. Simple, isn’t it?

Let’s say you’re using a dataset about companies such as the Central Contractor Registration (CCR). These companies typically have a physical address (or: location) attached.

Now, imagine I ask you to render the location of a selection of companies on a map. This requires you to look up the geographical coordinates of a company in a service such as Geonames.

I bet you can automate this, right? Maybe a bit of manual work involved, but not too much, I guess. So, all is fine, right?

Not really.

The next developer who comes along and wants to use the company data and nicely map it has to go through the exact same process: figure out which geo service to use, write some look-up/glue code, import the data and so on.
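To make that repeated effort concrete, here is a minimal sketch of the kind of look-up/glue code each consumer ends up writing. Everything in it is an assumption for illustration: the `GAZETTEER` dictionary is a hypothetical in-memory stand-in for a geo service, the company records and field names are invented, and the coordinates are made up.

```python
# Hypothetical stand-in for a geo service: place key -> (latitude, longitude).
# A real application would query a service such as Geonames instead.
GAZETTEER = {
    "springfield, us": (10.0, 20.0),  # made-up coordinates
    "shelbyville, us": (11.5, 21.5),  # made-up coordinates
}


def place_key(city, country):
    """Normalize a city/country pair so that it matches the gazetteer keys."""
    return f"{city.strip().lower()}, {country.strip().lower()}"


def locate(company):
    """Return (latitude, longitude) for a company record, or None if unmatched."""
    return GAZETTEER.get(place_key(company["city"], company["country"]))


# Invented company records; the field names are assumptions, not the CCR schema.
companies = [
    {"name": "Acme Corp", "city": " Springfield", "country": "US"},
    {"name": "Globex", "city": "Shelbyville", "country": "us"},
    {"name": "Initech", "city": "Nowhere", "country": "US"},
]

# Every consumer of the raw dataset has to repeat this step.
located = {c["name"]: locate(c) for c in companies}
```

The normalization and the unmatched case (`None`) are where the "bit of manual work" tends to hide, and every consumer of the raw data has to deal with them again.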

Wouldn’t it make more sense, from a re-usability point of view, if the original dataset provider (CCR in our example) had a look at its data, identified what entities (such as companies) are there and provided the links to other datasets (such as location data) up-front? This is, in a nutshell, what Tim Berners-Lee says concerning the 5th star of Open Data deployment:

Link your data to other people’s data to provide context.
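As a rough sketch of what publishing such links up-front could look like, the snippet below emits one link per company as an N-Triples line. The `example.org` URIs and the record fields are hypothetical, and using the FOAF `based_near` property to connect a company to a place is just one possible modeling choice, not something a particular publisher is known to do.

```python
# FOAF property used here to relate something to a nearby place.
BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"


def link_triple(subject_uri, predicate_uri, object_uri):
    """Serialize a single link between two resources as an N-Triples line."""
    return f"<{subject_uri}> <{predicate_uri}> <{object_uri}> ."


# Hypothetical records: the publisher has already resolved each company
# to the URI of a place in another dataset.
companies = [
    {
        "uri": "http://example.org/company/acme",
        "place_uri": "http://example.org/place/springfield",
    },
]

triples = [link_triple(c["uri"], BASED_NEAR, c["place_uri"]) for c in companies]

for line in triples:
    print(line)
```

With the link shipped as part of the dataset, a consumer can follow `place_uri` to the location data directly instead of writing their own matching code.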

To sum up: if you have data, think about providing this context – link it to other data on the Web and you make your data more useful and more usable and, in the long run, more used.

PS: the working title of this blog post was ‘As we may link’, to render homage to Vannevar Bush, but then I thought that might be a bit too cheesy ;)

About mhausenblas

Chief Data Engineer EMEA @MapR #bigdata #hadoop #apachedrill

Discussion

7 thoughts on “Why we link …”

  1. Thanks for the post, Michael! BTW, I think you should have gone forward with As We May Link… Check out this quote:

    The process of tying two items together is the important thing…

    – Vannevar Bush, As We May Think, The Atlantic (1945)

    Posted by John Erickson | 2011-05-23, 13:44
    • That was my thought as well. To use As We May Link as the title for the post. And lo! The very first comment expresses an identical sentiment.

      Thank you, John Erickson, for the archival link. Oh wow! You are bitwacker? A celebrity in our midst….

      Posted by Ellie K | 2011-05-28, 11:51
  2. Well thought out and explained

    Posted by Barry O'Gorman | 2011-05-24, 21:43
  3. Hi Michael, great post!

    The point is clear: but what about incentives to publish Linked Data instead of human data as today?

    I mean: the classical Web developer probably doesn’t care about the reuse of the data. It’s the first time that he has this opportunity, and the value chain isn’t so clear to him.

    Stefano Mazzocchi wrote something interesting some time ago:
    Freebase Gridworks, Data-Journalism and Open Data Network Effects

    The data curation is a personal time and knowledge investment.
    It’s not a technical problem, but a problem of value chain, based on the concept of the reuse of information. How developers see this reuse of information is critical, I think.

    (I’m an optimist, but a lot of developers aren’t so open-minded. :( )

    Posted by Matteo Brunati | 2011-05-25, 07:04

Trackbacks/Pingbacks

  1. Pingback: Perchè linked? « LinkedOpenData.it - 2011-05-23

  2. Pingback: The Value of Linked Data - semanticweb.com - 2011-05-24

  3. Pingback: Putting the Links into Linked Data | Talis Consulting | World leading expertise in Linked Data and the Semantic Web - 2011-09-26
