
Incentives for publishing and consuming linked data

Having read Adam Jacobs’ The Pathologies of Big Data and Stefano Mazzocchi’s Data Smoke and Mirrors, I found myself asking: what is the motivation for people to publish linked data, and in turn to consume it? (Sounds funny, you think? Well, just because the data is available doesn’t necessarily mean it is useful or actually used 😉)

Ok, so let’s start with a nice statement from Adam’s ACM article:

Here’s the big truth about big data in traditional databases: it’s easier to get the data in than out.

Yup, I think I agree, and I guess the same is true for Linked Data. There are tons of ‘cheap’ ways to publish in RDF (for example, for relational databases we’re currently trying to define a standard). However, there is still a need for high-quality data and high-quality links between the data items in order for the data to be used sensibly in applications!
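
To give a feel for how little code ‘cheap’ publishing can take, here is a minimal sketch using Python and rdflib that turns one relational row into RDF triples. The table layout, the FOAF terms and the example.org URIs are my own assumptions for illustration, not the standard mapping mentioned above.

    # Minimal sketch: expose one relational row as RDF triples with rdflib.
    # The table layout and the example.org URIs are invented for illustration.
    import sqlite3
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import FOAF, RDF

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT, homepage TEXT)")
    conn.execute("INSERT INTO artist VALUES (1, 'Radiohead', 'https://www.radiohead.com/')")

    EX = Namespace("http://example.org/artist/")
    g = Graph()
    g.bind("foaf", FOAF)

    for row_id, name, homepage in conn.execute("SELECT id, name, homepage FROM artist"):
        subject = EX[str(row_id)]                          # one URI per primary key
        g.add((subject, RDF.type, FOAF.Agent))             # one class per table
        g.add((subject, FOAF.name, Literal(name)))         # one predicate per column
        g.add((subject, FOAF.homepage, URIRef(homepage)))

    print(g.serialize(format="turtle"))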

Right, so my hunch is that for data providers there are a couple of reasons to publish their data in an open and easily accessible way, but I guess one main reason may be that by providing the raw data one can simply cut costs. Rather than writing a Web application that serves humans and offering an additional Web service/API (as flickr or delicious did), one can expose the original data directly via Linked Data and open up the possibility for others to develop cool applications on top of it (see also our recent work in this direction).

On the other hand, data consumers benefit from a single (RESTful) API with a uniform data model (RDF, in case it isn’t that obvious ;), which in turn simplifies the development of applications and allows data to be reused (just as the BBC no longer has to maintain artist and song data itself, but reuses MusicBrainz data).
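
As a rough sketch of what that uniform interface buys a consumer – assuming rdflib and DBpedia’s public Turtle export as one example source, not a fixed API – dereferencing a resource and querying it locally looks the same for any publisher:

    # Minimal sketch: dereference one Linked Data URI and query it locally.
    # The DBpedia URL is just one publicly available example source.
    from rdflib import Graph

    g = Graph()
    g.parse("http://dbpedia.org/data/Radiohead.ttl", format="turtle")

    # The same SPARQL works against any publisher that uses rdfs:label,
    # no matter how their backend actually stores the data.
    results = g.query("""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?label WHERE {
            <http://dbpedia.org/resource/Radiohead> rdfs:label ?label .
            FILTER (lang(?label) = "en")
        }
    """)
    for row in results:
        print(row.label)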

Let me know – what is your incentive to publish/consume Linked Data?

Discussion

11 thoughts on “Incentives for publishing and consuming linked data”

  1. My main incentive would be to consume linked data. The only concern I would have is that current linked data is a bit noisy. Is there a way or any research going on to reduce the noise of linked data? Thanks a lot.

    Posted by Olaf Henze | 2009-11-10, 14:28
  2. This discussion on SemanticOverflow.com includes some interesting ideas on the benefits to publishers of using linked data:

    http://www.semanticoverflow.com/questions/74/what-are-the-benefits-of-the-semantic-web-to-publishers

    I was one of the responders and tried to explain why I reckon it’s worthwhile. There are also interesting comments from Ian Davis, Bob du Charme, Zach Beauvais and others.

    Posted by billroberts | 2009-11-10, 14:40
  3. I think consuming is pretty straightforward, to add value to your offering for zero cost.

    Always going to be a winner once the secret gets out and enough geeks climb the slight hurdles.

    Posted by MrG | 2009-11-10, 14:48
  4. I think the issues raised in “The Pathologies of Big Data” should be discussed far more intensely in the Semantic Web / Linked Data community than they currently are.
    Currently, people are plunging ahead with RDFizing and interlinking, and that is a good thing. However, the elephant in the room seems to be ignored: data can still be hard to query and _understand_, even if they are based on a unified standard (RDF) and are interlinked, because of the sheer complexity and inter-dataset heterogeneity. I have the suspicion that the ‘raw data now!’ meme might backfire in some cases — having the most relevant data in simple, heterogeneous formats can be much more valuable than having all the data in a huge, complex mess.

    Posted by Matthias Samwald | 2009-11-10, 15:47
  5. Michael,

    Things exist on the Web for the following primordial reasons:

    1. To Be Discovered By Related Things (explicit action)
    2. To Discover Related Things (inadvertent action: since this is the by-product of other people doing #1)

    Yes, HTTP-based Linked Data (hyperdata linking) is a low-cost mechanism for accelerating the above, serendipitously – especially as Linked Data density on the Web continues to grow exponentially.

    DBpedia, Freebase, BBC, Reuters (via OpenCalais), and most recently, the New York Times, are collectively demonstrating these effects right now 🙂

    Posted by Kingsley Idehen | 2009-11-10, 19:40
  6. The answers given here and at SemanticOverflow.com regarding the incentives of publishers are convincing but they don’t address the whole question. Remember, publishing Linked Data is not only about publishing your dataset as RDF data on the Web; it is also about links. So, the second part of the question is, why would publishers create and publish links to other datasets; or why would they, at least, add linksets provided by others to their dataset? A possible answer is that a publisher may consider out-links as a valuable addition which could increase the popularity of the dataset.

    Posted by Olaf Hartig | 2009-11-10, 19:41
  7. Matthias,

    The quality of the Linked Data highway is a critical component of Linked Data business models. Thus, with high quality will come some compensation-oriented incentives. The free stuff can only go so far, etc.

    Kingsley

    Posted by Kingsley Idehen | 2009-11-10, 19:42
  8. For UniProt the value is in two things. First, reducing the cost for others to work with our data: e.g. load triples -> query, instead of write Perl -> load into database -> query (a minimal sketch of that workflow follows this thread). As a data provider for the life sciences, that is a nice benefit.

    And secondly, we gain enormously by linking our data to other sources: e.g. UniProtKB is linked, entity to entity, to more than 130 databases. The value of UniProtKB for its users is much higher with these links than without, and RDF and Linked Data will make it easier to maintain and exploit these links.

    Posted by Jerven | 2009-11-11, 18:59
  9. For me, RDF and Linked Data protocols are about standardisation (lower ‘s’). Every API you come across has its own URL structure, its own schema, its own methods… LD says if you can understand URIs and RDF, you can consume my metadata. You may not understand it, but you can use some standard tools to get it.

    It’s basically the XML of metadata schemas and linked-ness. People complain you can’t interpret the semantics in XML – that’s true, but it hasn’t stopped it taking off as a standard transfer protocol.

    Posted by Douglas Campbell | 2009-12-21, 22:57
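
A minimal sketch of the ‘load triples, then query’ workflow mentioned in comment 8, assuming rdflib, a locally downloaded UniProt RDF/XML file (the file name P12345.rdf is a placeholder) and the core:Protein / core:mnemonic terms from the UniProt core vocabulary – these specifics are illustrative assumptions, not a documented interface:

    # Minimal sketch of "load triples -> query": one generic loader, no
    # per-dataset Perl/import code. 'P12345.rdf' stands in for any RDF file
    # downloaded from a data provider; the UniProt core terms may differ.
    from rdflib import Graph

    g = Graph()
    g.parse("P12345.rdf", format="xml")

    results = g.query("""
        PREFIX core: <http://purl.uniprot.org/core/>
        SELECT ?protein ?mnemonic WHERE {
            ?protein a core:Protein ;
                     core:mnemonic ?mnemonic .
        }
    """)
    for protein, mnemonic in results:
        print(protein, mnemonic)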
