Oh – it is data on the Web

Posted by woddiscovery ⋅ 2010-04-14 ⋅ 29 Comments

Filed Under Atom, HATEOAS, OData

A little story about OData and Linked Data …

Others have already given high-level overviews of OData and Linked Data, but I was interested in two concrete questions: how to utilise OData in the Linked Data Web, and how to turn Linked Data into OData.

As already mentioned, I consider Atom, which forms one core part of OData, to be hyperdata that allows publishing data in the Web, not only on the Web. And indeed, the first OData example I examined (http://odata.netflix.com/Catalog/People) looked quite promising:

<entry>
  <id>http://odata.netflix.com/Catalog/People(196)</id>
  <title type="text">George Abbott</title>
  <updated>2010-04-13T12:02:01Z</updated>
  <author>
    <name />
  </author>
  <link rel="edit" title="Person" href="People(196)" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Awards" type="application/atom+xml;type=feed" title="Awards" href="People(196)/Awards" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/TitlesActedIn" type="application/atom+xml;type=feed" title="TitlesActedIn" href="People(196)/TitlesActedIn" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/TitlesDirected" type="application/atom+xml;type=feed" title="TitlesDirected" href="People(196)/TitlesDirected" />
  <category term="NetflixModel.Person" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
  <content type="application/xml">
    <m:properties>
      <d:Id m:type="Edm.Int32">196</d:Id>
      <d:Name>George Abbott</d:Name>
    </m:properties>
  </content>
</entry>

Note that there is a URI in the id element that can be used as the entity URI, and also link/@rel values that can be exploited as typed links. I ran it through OpenLink’s URI Burner (result) and hacked together a little XSLT that picks out the relevant bits, just to see what an RDF version might look like. Though the @rel values do not dereference (try it yourself: http://schemas.microsoft.com/ado/2007/08/dataservices/related/Awards), I thought, well, we can still handle it somehow as Linked Data.
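To make the idea concrete, here is a rough Python sketch of the "pick out the relevant bits" step (not the XSLT I actually used). It treats the id as the subject, every link whose @rel is a URI as a predicate, and the resolved @href as the object. Two things are assumptions on my side: the Atom namespace declaration, which the excerpt above omits, and the base URI used to resolve the relative hrefs. The function name entry_to_triples is made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "http://www.w3.org/2005/Atom"
# Assumption: relative hrefs resolve against the service root of the feed.
BASE = "http://odata.netflix.com/Catalog/"

# Trimmed copy of the entry above; the xmlns declaration is added here
# because the excerpt in the post does not show it.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://odata.netflix.com/Catalog/People(196)</id>
  <title type="text">George Abbott</title>
  <link rel="edit" title="Person" href="People(196)" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Awards"
        type="application/atom+xml;type=feed" title="Awards"
        href="People(196)/Awards" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/TitlesActedIn"
        type="application/atom+xml;type=feed" title="TitlesActedIn"
        href="People(196)/TitlesActedIn" />
</entry>"""


def entry_to_triples(entry_xml, base):
    """Return (subject, predicate, object) tuples for URI-typed links."""
    entry = ET.fromstring(entry_xml)
    subject = entry.findtext(f"{{{ATOM}}}id").strip()
    triples = []
    for link in entry.findall(f"{{{ATOM}}}link"):
        rel = link.get("rel", "")
        # Plain tokens such as "edit" are not URIs, so they are skipped.
        if not rel.startswith("http"):
            continue
        triples.append((subject, rel, urljoin(base, link.get("href"))))
    return triples


if __name__ == "__main__":
    # Print the result in an N-Triples-like form.
    for s, p, o in entry_to_triples(ENTRY, BASE):
        print(f"<{s}> <{p}> <{o}> .")
```

The predicates that come out are exactly the @rel URIs, so the triples are only as useful as those URIs are dereferenceable.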

Then I looked at some more OData examples, only to find that almost all of the other examples from the OData sources look more or less like the following (from http://datafeed.edmonton.ca/v1/coe/BusStops):

<entry m:etag="W/&quot;datetime'2010-01-14T22%3A43%3A35.7527659Z'&quot;">
  <id>http://datafeed.edmonton.ca/v1/coe/BusStops(PartitionKey='1000',RowKey='3b57b81c-8a36-4eb7-ac7f-31163abf1737')</id>
  <title type="text"></title>
  <updated>2010-04-13T15:42:53Z</updated>
  <author>
    <name />
  </author>
  <link rel="edit" title="BusStops" href="BusStops(PartitionKey='1000',RowKey='3b57b81c-8a36-4eb7-ac7f-31163abf1737')" />
  <category term="OGDI.coe.BusStopsItem" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
  <content type="application/xml">
    <m:properties>
      <d:PartitionKey>1000</d:PartitionKey>
      <d:RowKey>3b57b81c-8a36-4eb7-ac7f-31163abf1737</d:RowKey>
      <d:Timestamp m:type="Edm.DateTime">2010-01-14T22:43:35.7527659Z</d:Timestamp>
      <d:entityid m:type="Edm.Guid">b0d9924a-8875-42c4-9b1c-246e9f5c8e49</d:entityid>
      <d:stop_number>1000</d:stop_number>
      <d:street>Abbottsfield</d:street>
      <d:avenue>Transit Centre</d:avenue>
      <d:region>Edmonton</d:region>
      <d:latitude m:type="Edm.Double">53.57196999</d:latitude>
      <d:longitude m:type="Edm.Double">-113.3901687</d:longitude>
      <d:elevation m:type="Edm.Double">0</d:elevation>
    </m:properties>
  </content>
</entry>

What you immediately see is the XML payload in the content element, making heavy use of elements in the d: and m: namespaces. Both namespace URIs return a 404 and hence do not allow me to learn more about the schema (besides the fact that they are centrally maintained by Microsoft).
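A small Python sketch shows what a consumer can actually get out of such a payload: a flat list of entity-attribute-value rows in which every value is a literal. The namespace URIs for d: and m: are the ones OData uses for its data and metadata elements, but they are declared here by me since the excerpt above omits them. The helper name entry_to_eav is hypothetical, and defaulting an untyped property to Edm.String is my reading of the Atom format, not something the feed states.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
}
M_TYPE = "{%s}type" % NS["m"]

# Trimmed copy of the bus stop entry; xmlns declarations added for parsing.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
       xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <id>http://datafeed.edmonton.ca/v1/coe/BusStops(PartitionKey='1000',RowKey='3b57b81c-8a36-4eb7-ac7f-31163abf1737')</id>
  <content type="application/xml">
    <m:properties>
      <d:stop_number>1000</d:stop_number>
      <d:street>Abbottsfield</d:street>
      <d:latitude m:type="Edm.Double">53.57196999</d:latitude>
      <d:longitude m:type="Edm.Double">-113.3901687</d:longitude>
    </m:properties>
  </content>
</entry>"""


def entry_to_eav(entry_xml):
    """Flatten m:properties into (entity, attribute, value, type) rows."""
    entry = ET.fromstring(entry_xml)
    entity = entry.findtext("atom:id", namespaces=NS).strip()
    props = entry.find("atom:content/m:properties", NS)
    rows = []
    for prop in props:
        attribute = prop.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
        # Assumption: a property without m:type is treated as a string.
        edm_type = prop.get(M_TYPE, "Edm.String")
        rows.append((entity, attribute, prop.text, edm_type))
    return rows


if __name__ == "__main__":
    for row in entry_to_eav(ENTRY):
        print(row)
```

Every value in the resulting rows is a string or a number; none of them is a URI that would lead to another resource.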

So, what does this all mean?

Imagine a Web (a Web of Documents, if you wish), which is not based on HTML and hyperlinks, but on MS Word documents. The documents are all available on the Internet, so you can download them and consume the content. But after you’re done with a certain document that talks about a book, how do you learn more about it? For example, reviews about the book or where you can purchase it? Maybe the original document mentions that there is some more related information on another server. So you’d need to go there and look for the related bit of information yourself. You see? That’s what the Web is great at – you just click on a hyperlink and it takes you to the document (or section) you’re interested in. All the legwork is taken care of for you through HTML, URIs and HTTP.

Hm, right, but how is this related to OData?

Well, OData feels a bit like the above-mentioned scenario, just for data. Of course you (or rather a software program, I guess) can consume a single source, but that’s it. To sum up my impression so far:

- OData enables publishing structured data on the Web and, theoretically, in the Web (what’s the difference?);
- OData uses Atom (and APP) as a framework, with the actual data as (proprietary) XML payload;
- OData typically creates data silos; discovering data beyond a single source is, to put it nicely, not easy;
- Creating Linked Data from OData does not seem a promising route;
- Creating OData from Linked Data seems feasible and is desirable, in order to leverage tools such as Pivot.
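One way to pin down the "data silo" impression is to check whether any link in an entry leaves the host it was served from. The following Python sketch does just that. The function name outbound_links is made up, and the DBpedia URI in the second call is purely hypothetical: the Netflix feed contains no such link, it only illustrates what a link into the Web would look like.

```python
from urllib.parse import urljoin, urlsplit


def outbound_links(entity_uri, hrefs):
    """Return the resolved links that point to a different host."""
    home = urlsplit(entity_uri).netloc
    resolved = [urljoin(entity_uri, href) for href in hrefs]
    return [uri for uri in resolved if urlsplit(uri).netloc != home]


if __name__ == "__main__":
    person = "http://odata.netflix.com/Catalog/People(196)"
    hrefs = ["People(196)", "People(196)/Awards", "People(196)/TitlesActedIn"]

    # All hrefs from the entry resolve to the same host: a data island.
    print(outbound_links(person, hrefs))

    # Hypothetical: one extra link to another source changes the picture.
    print(outbound_links(person, hrefs + ["http://dbpedia.org/resource/George_Abbott"]))
```

This is a crude host-based test, of course; it says nothing about whether the target is meaningful data, only whether the source points beyond itself at all.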

Regarding the last bullet point, the ‘how to turn Linked Data into OData’, I will do some further research and keep you posted here.

About woddiscovery

Web of Data researcher and practitioner


Discussion

29 thoughts on “Oh – it is data on the Web”

Hello Michael,
This is an interesting and useful post and I thank you for the link.

I don’t have much to add except in regard to your mention of OData and Pivot. Certainly, combining the two seems like an obvious thing to do; however, at the time of writing Pivot is based on an (even) more proprietary representation than OData (I believe that is through necessity, though I stand to be corrected).

I have asked the question (http://getsatisfaction.com/live_labs_pivot/topics/is_there_a_roadmap_for_building_pivot_collections_from_odata) as to whether we can expect Pivot to consume OData in the future, and I have not yet received an answer (and am not expecting to).

Posted by Jamie Thomson | 2010-04-14, 09:35

- Thanks a lot, Jamie. Wasn’t aware of the fact re Pivot. Good to see people asking questions like that 😉
  
  Posted by woddiscovery | 2010-04-14, 09:43
Great write-up Michael 🙂

The key point for me is that “OData typically creates data silos; discovering data beyond a single source is, nicely put, not easy;”, that’s the main thing that stops me from any form of OData adoption and which relegates it to the realms of discussion rather than use (for me).

There are two points which I am keen to see handled. The first is dealing with the rel=”” relationships which 404, as mentioned previously for Atom rels in general, and to which you contributed valuable insight.

The final point which is indeed very desirable “Creating OData from Linked Data” is (imo) a major one; simple translations between Linked Data and OData (+GData) will aid adoption greatly, especially in critical enterprise environments where Microsoft are making the push. Interoperability is key.

Again great write-up and thanks for taking the time to analyze this properly 😉

Nathan

Posted by Nathan | 2010-04-14, 09:35

- Thank you Nathan, and indeed, I think the LinkedData2OData gateway is something I’d like to invest some time into, yes 😉
  
  Posted by woddiscovery | 2010-04-14, 09:45
Michael,

When using URIBurner, did you look at the collection of examples that I’ve put out re. generation of RDF based Linked Data from OData? Basically, the sequence is as follows:

1. Visit an OData Feed URL

2. Simply use the URIBurner Bookmarklet to generate a Resource Description

3. See RDF based Linked Data generated using SIOC (Container and Items which is basically what OData is offering)

4. Of course URIBurner makes proxy Generic HTTP URIs.

OData is Structured Data based on the EAV model. It’s a very good foundation for generating RDF based Linked Data.

Likewise, producing OData from RDF Linked Data is also very powerful; in the most basic sense, it exposes OData application developers, Information Workers, and Dallas Data Mart producers to the larger pool of RDF based Linked Data Spaces (from DBpedia to the entire LOD Cloud).

Virtuoso can already expose RDF based Linked Data as OData via SPASQL and ADO.NET. Naturally, we want to do better (since the ADO.NET route is platform specific), so you will soon see OData and GData (little revamp of our early implementation from way back) as native Virtuoso protocols; meaning: OData and GData will not only be Data Representation formats for Linked Data (which include our proxy Generic HTTP URIs), they will also act as Linked Data Space APIs.

Links:

1. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/ODataServiceExample — publishing OData using Virtuoso via ADO.NET

2. http://delicious.com/kidehen/odata_demo

Kingsley

Posted by Kingsley Idehen | 2010-04-14, 11:05

- I think I did use URI Burner correctly (see blog post)? I was not aware of the delicious links, no, but they all follow the same pattern I tried, or did I miss something?
  
  Still, I think my core criticism holds. If you, e.g. look at http://linkeddata.uriburner.com/about/html/http/schemas.microsoft.com/ado/2007/08/dataservices/metadata/nameLast (from one of your examples) you see what I mean 😉
  
  I agree that OData offers the *container* bits, yes, but the actual data is locked down and the discovery of *other* data is simply not there. I’d love to see how you, starting from one OData source, navigate to another OData source, using URI Burner (not talking about the schema level, yet).
  
  Good discussion – I think we should continue!
  
  Posted by woddiscovery | 2010-04-14, 11:16
Thanks Michael for this excellent post!

I think if we push further, we do see that it is Microsoft’s intention to make an OData-based world more accessible, but under their own terms. The (Azure-based) platform complement to OData is Microsoft Codename “Dallas” which promises to make data more discoverable, create an iStore-like ecosystem for data, etc, etc.

Some of the first things to note in watching the “Dallas” technical presentation from MIX10 are that (a) the data is distinctly old-skool “rectangular” and (b) SPARQL is nowhere to be found. And also, as Michael notes, there is effectively no linking; would you link from an Excel spreadsheet? Some might, but would have the bruises to show for it…

OData appears to be focused on providing hosted, programmatic access to rectangular data, whereas linked data is about creating a true Web of Data…

John

Posted by John S. Erickson, Ph.D. | 2010-04-14, 11:38

- John, I couldn’t have said it better, yes. Thanks for sharing your insights!
  
  Posted by woddiscovery | 2010-04-14, 11:40
- Hi John,
  I’ve never seen/heard the adjective “rectangular” used in this context before but I do like it – I’ve been after a colloquial way of referring to traditional rows&columns in a way that differentiates it from RDF – “rectangular” is nice 🙂
  
  Speaking as someone who is entrenched fairly and squarely in the Microsoft ecosystem (i.e. enterprise relational DB developer) I can understand why Microsoft would push this “rectangular” format – it is much closer to the world that my peers and I are much more used to dealing with. Nonetheless I am as disappointed as I’m sure you are that RDF/linked data is not something on their radar.
  
  Going back to the aforementioned Pivot – their XML-based data representation called “Pivot Collections” (which is not OData) *does* have the ability to link to other datasets, so this may come under the headline of linked data, although there’s no RDF to be seen of course. When you use the Pivot tool, this manifests itself as, for example, being able to “hop over” to a collection of actors while browsing a collection of movies.
  
  -Jamie
  
  Posted by Jamie Thomson | 2010-04-14, 13:29
Sorry, the Microsoft Codename “Dallas” links I embedded did not…er…embed. Let’s try again:

[1] Microsoft Codename “Dallas” landing page: http://www.microsoft.com/windowsazure/dallas/

[2] Microsoft Codename “Dallas” technical presentation at MIX10 by Moe Khosravy: http://live.visitmix.com/MIX10/Sessions/SVC02

Posted by John S. Erickson, Ph.D. | 2010-04-14, 11:44

- He he, fair enough – that’s what I always say: it’s all about the links 😉
  
  Posted by woddiscovery | 2010-04-14, 11:46
Similar to OData is of course DataRSS. Apart from the fairly confusing name (it’s Atom-based, not RSS-based) DataRSS is a pretty nice standard from a linked data point of view. It parses to RDF.

http://developer.yahoo.com/searchmonkey/smguide/datarss.html

http://search.cpan.org/dist/RDF-RDFa-Parser/lib/RDF/RDFa/Parser.pm#Atom_/_DataRSS

Posted by Toby Inkster | 2010-04-14, 14:00

I wish I could take credit for the term rectangular data but I must at least give credit to Flip Kromer of InfoChimps.org, who used the term in an interview with Paul Miller last year.

A good definition of rectangular data might be: a dataset that fits nicely inside a rectangle of observations (rows) and variables (columns), where the conceptual difference between rows and columns is clear (see, e.g., R, Stata and non-rectangular data).

Posted by John S. Erickson, Ph.D. | 2010-04-14, 15:23

I work on the OData team, but these opinions are mine, not necessarily my employer’s. With that out of the way 🙂

I really think that one of the keys to the @rel issue, is making OData’s $metadata queryable in much the same way the data is.

So something like this:

~/service.svc/$metadata/ResourceProperties('Person.Awards')

This would return metadata about the Person’s Awards relationship in OData’s Atom format.

Other than just being very useful for querying subsets of a service’s model, this could then be used in a @rel like this:

This way you are fully describing the ‘predicate’ in the link.

Alex

Posted by Alex James | 2010-04-14, 16:16

- Thanks a lot, Alex, for the comment. I very much appreciate hearing feedback from the OData community. Indeed, it looks like your proposed approach solves some of the problems concerning schema-level discovery. Looking forward to seeing your announced OData metadata service online, and happy to comment on it.
  
  Posted by woddiscovery | 2010-04-15, 08:00
Oops, I didn’t escape my XML

‘… this could then be used in a @rel like this:’

<link rel="$metadata/ResourceProperties('Person.Awards')" …

Posted by Alex James | 2010-04-14, 16:18

John & Michael,

OData is Structured Data. It also has LINKs that de-reference to Descriptor Resources. It just lacks the fidelity of RDF based Linked Data (at the current time).

SPARQL and RDF are but implementation details re. “HTTP based Linked Data” and specific requirements for “RDF based Linked Data”.

In the absolute worst case right now, OData provides the following immediate benefits to RDF based Linked Data:

1. Comprehension of the Entity-Attribute-Value model as the unifying Data Model for heterogeneously shaped data (I’ve long given up trying to split RDF hairs re. Data Model and Data Representation Formats; people think it’s a format, and we cannot change that)

2. Its structure is RDFizer friendly as you can see via my URIBurner and OData demos.

Personally, I am very happy to see both OData and GData helping to broaden the reach of Entity-Attribute-Value model based Linked Data.

BTW – Alex once authored a blog post titled Data 2.0, prior to his joining Microsoft. We talked about Data 2.0 long before there was an RDF based Linked Data meme; believe me, he groks the purity of High Fidelity Linked Data via dereferenceable Identifiers.

Alex: I think we should basically knock up a Data 3.0 document / post 🙂

Posted by Kingsley Idehen | 2010-04-14, 19:04

- Kingsley,
  
  Yes, OData is EAV and as Alex has shown, the schema-level discovery issue seems to be solvable. However, it seems that you either chose to ignore my main criticism (and my challenge above) or there is indeed not a solution to it. I’ll repeat it here for the record: though OData (as it uses Atom) can be used to connect data sources, in practice it doesn’t. Each OData source is a data island and hence OData is, as it is currently used, data ON the Web but not data IN the Web.
  
  Posted by woddiscovery | 2010-04-15, 08:08
Kingsley,

Yes it is high time for Data 3.0. For me Data 3.0 will also include thinking about how to bridge the gap between data and operations.

I see a world where we create web scoped expressions, and use shape transforming translations to allow for arbitrary composition of both REST and SOAP style endpoints with differing ‘input’ and ‘output’ shapes.

The bridge between these operations is EAV / OData etc.

Hmm… dreaming again.
Alex

Posted by Alex James | 2010-04-15, 01:01

More on the whole OData queryable metadata here:
http://www.odata.org/blog/2010/4/22/queryable-odata-metadata

Posted by Alex James | 2010-04-23, 02:39

Hi Michael,
Returning here (again). For some reason lots of roads seem to end at this blog post and I always enjoy coming back here for a refresher.

On the following point:
“OData typically creates data silos”
Can I play devil’s advocate and say that OData is typically going to be put on top of data stores that are already silo’d, so in some ways OData might be seen as reducing that “siloisation” (yes, I just made up a word) by opening it up on the web (if not in the web)?

Any thoughts on that?

-Jamie

Posted by Jamie Thomson | 2010-05-25, 10:09

- Jamie,
  
  Always good to see you around 😉
  
  I agree that OData helps open up the data, as it enables putting it on the Web. However, you might want to check out TimBL’s 5-star plan for eGov:
  
  … public information should be awarded a star rating based on the following criteria: one star for making the information public; a second is awarded if the information is machine-readable; a third star if the data is offered in a non-proprietary format; a fourth is given if it is in Linked Data format; a fifth if it has actually been linked.
  
  So, I guess OData would typically be 3-star, maxing out at 4 stars if we’re very generous 😉
  
  Cheers,
  Michael
  
  Posted by woddiscovery | 2010-05-25, 10:13
- “Generous” is an understatement! 🙂
  
  Never seen the 5 star plan before. Going to go and read now.
  
  Posted by Jamie Thomson | 2010-05-25, 10:35
Michael, re-reading this after a month and am struck again by your Web-based-on-Word-documents analogy!

Having been “graduated” after a decade from what is now “the world’s largest IT company…” I can say how infuriating it was to live every day with the stupidity of a massive “Web of Word.” The worst part about it was, the situation only got more painful over the decade as the company transitioned from ad hoc content management to drinking the SharePoint Kool-Aid dry — and licking the cup!

I totally see their motivation for the OData model: it is exactly the sort of thing grouchy old grey-suited CIOs in large corporations would want: a “webby” way to access, track and otherwise extract value from their gazigabytes of otherwise dead-ended data stores. Since there are no links in existing datasets, why would they accommodate them?

OData IMHO is not so much about creating silos, but rather about extracting more ROI from existing silos…

Posted by John Erickson | 2010-06-16, 19:04

Michael,

To complete the conversation, since I’ve only just seen your reply (after some nudging by @TallTed).

OData uses Atom or JSON as EAV model data representation. There is enough fidelity in the representation to craft renditions of DBpedia entity descriptions [1].

If you take OData at face value (which I don’t, due to its middleware DNA), there is a problem on the metadata side of things re. complex data types. I’ve long raised this concern with Alex and Co. at Microsoft.

Links:

1. http://dbpedia.org/resource/Linked_Data — look at the footer section of the descriptor page.

Posted by Kingsley Idehen | 2010-10-28, 20:35


Trackbacks/Pingbacks

Pingback: SSIS Junkie : Explaining the difference between OData & RDF by way of analogy - 2010-04-14
Pingback: Linked Data, OData, GData, DataRSS comparison matrix « Ultan O'Carroll ( … uoccou … ) - 2011-02-21
Pingback: JSON, data and the REST « Web of Data - 2011-08-07
Pingback: OData ou/et RDF ? | Blog de JP Gouigoux - 2013-04-08
