
Toying around with Riak for Linked Data

So I stumbled upon Rob Vesse’s tweet the other day, where he said he was about to use MongoDB for storing RDF. A week earlier I watched a nice video about links and link walking in Riak, “a Dynamo-inspired key/value store that scales predictably and easily” (see also the Wiki doc).

Now, I was wondering what it takes to store an RDF graph in Riak using Link headers. Let me say that it was very easy to install Riak and to get started with the HTTP interface.

The main issue then was how to map the RDF graph onto Riak buckets, objects and keys. Here is what I have come up with so far: I use an RDF resource-level approach with a special object key that I call :id, whose value is the RDF resource URI or the bNode. Further, in order to maintain the graph provenance, I store the original RDF document URI in the metadata of the Riak bucket. Each RDF resource is mapped onto a Riak object; for each literal RDF object value, the literal value is stored directly under a Riak object key, and for each resource object (URI ref or bNode), I use a Link header.

Enough words. Action.

Take the following RDF graph (in Turtle):


@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://sw-app.org/mic.xhtml#> .

:i foaf:name "Michael Hausenblas" ;
   foaf:knows <http://richard.cyganiak.de/foaf.rdf#cygri> .

To store the above RDF graph in Riak I would then use the following curl commands:

curl -X PUT -d 'Michael Hausenblas' http://127.0.0.1:8098/riak/res0/foaf:name


curl -X PUT -d 'http://sw-app.org/mic.xhtml#i' http://127.0.0.1:8098/riak/res0/:id


curl -X PUT -d 'http://richard.cyganiak.de/foaf.rdf#cygri' http://127.0.0.1:8098/riak/res1/:id


curl -X PUT -d 'http://sw-app.org/mic.xhtml#i' -H "Link: </riak/res1/:id>; riaktag=\"foaf:knows\"" http://127.0.0.1:8098/riak/res0/:id
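For more than a handful of triples, the same mapping could be generated rather than typed by hand. The following is only a rough sketch in Python, not a tested loader: the function name `riak_requests`, the tuple layout and the `resN` bucket numbering are assumptions made for illustration, and nothing is sent over the network. Unlike the commands above, it writes each :id object once, with all of its Link headers attached.

```python
def riak_requests(triples, base="http://127.0.0.1:8098/riak"):
    """Turn (subject, predicate, object, is_literal) tuples into PUT descriptions.

    Hypothetical helper: one bucket per RDF resource (res0, res1, ...),
    the resource URI stored under the special key ':id', literals stored
    under the predicate as key, resource objects expressed as Link headers.
    """
    buckets = {}  # resource URI -> bucket name, in order of first appearance

    def bucket_for(resource):
        if resource not in buckets:
            buckets[resource] = "res%d" % len(buckets)
        return buckets[resource]

    literal_puts = []
    links = {}  # subject URI -> list of Link header values
    for subject, predicate, obj, is_literal in triples:
        subject_bucket = bucket_for(subject)
        if is_literal:
            # Literal value: stored directly, keyed by the predicate.
            literal_puts.append({
                "url": "%s/%s/%s" % (base, subject_bucket, predicate),
                "body": obj,
                "headers": {},
            })
        else:
            # Resource value: a link to the ':id' object of the target bucket.
            # The '/riak' path prefix is taken from the commands in the post.
            target = "</riak/%s/:id>" % bucket_for(obj)
            links.setdefault(subject, []).append(
                '%s; riaktag="%s"' % (target, predicate))

    id_puts = []
    for resource, bucket in buckets.items():
        headers = {}
        if resource in links:
            headers["Link"] = ", ".join(links[resource])
        id_puts.append({
            "url": "%s/%s/:id" % (base, bucket),
            "body": resource,
            "headers": headers,
        })
    return literal_puts + id_puts


example = [
    ("http://sw-app.org/mic.xhtml#i", "foaf:name", "Michael Hausenblas", True),
    ("http://sw-app.org/mic.xhtml#i", "foaf:knows",
     "http://richard.cyganiak.de/foaf.rdf#cygri", False),
]
for request in riak_requests(example):
    print("PUT", request["url"], request["headers"], repr(request["body"]))
```

Each returned dictionary corresponds to one curl -X PUT call; how they are actually sent (curl, an HTTP library, a Riak client) is left open here.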

Then, querying the store is straightforward (here: list all people I know):

curl http://127.0.0.1:8098/riak/res0/:id/_,foaf:knows,_
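Each link-walking step in that URL has the form bucket,tag,keep, with an underscore acting as a wildcard. A small sketch of how such URLs might be assembled from a list of steps follows; the helper name `link_walk_url` and its parameters are made up for this example, and no URL escaping is attempted.

```python
def link_walk_url(bucket, key, steps, base="http://127.0.0.1:8098/riak"):
    """Build a Riak link-walking URL.

    Hypothetical helper: 'steps' is a list of (bucket, tag, keep) tuples,
    where None stands for the '_' wildcard in any position.
    """
    segments = []
    for step in steps:
        segments.append(",".join("_" if part is None else part for part in step))
    return "/".join([base, bucket, key] + segments)


# One hop: everybody linked from res0/:id via foaf:knows.
print(link_walk_url("res0", ":id", [(None, "foaf:knows", None)]))

# Two hops: friends of friends, keeping only the final step's results
# ("0" discards and "1" keeps a step's results).
print(link_walk_url("res0", ":id",
                    [(None, "foaf:knows", "0"), (None, "foaf:knows", "1")]))
```

The first call reproduces the query URL used above; chaining more steps is how a longer path through the graph would be walked.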

Yes, I know, the prefixes like foaf: etc. need to be taken care of (but that’s rather easy: they can be put in the bucket’s metadata as well, along with the prefix.cc service). Further, the bNodes might cause trouble. And there is no smushing via owl:sameAs or IFPs (yet). But the most challenging area is maybe how to map a SPARQL query onto Riak’s link walking syntax.
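As a rough illustration of the prefix handling, here is a sketch that expands a prefixed key or tag into a full URI. It assumes the prefix map has already been read from wherever it is kept (bucket metadata, prefix.cc, or elsewhere); the function name `expand_curie` and the plain dictionary are assumptions for this example.

```python
def expand_curie(name, prefixes):
    """Expand 'prefix:local' to a full URI using a prefix -> namespace map.

    Hypothetical helper: names whose prefix is not in the map (including
    full URIs and the special ':id' key, unless an empty prefix is
    registered) are returned unchanged.
    """
    prefix, separator, local = name.partition(":")
    if not separator or prefix not in prefixes:
        return name
    return prefixes[prefix] + local


prefixes = {"foaf": "http://xmlns.com/foaf/0.1/"}
print(expand_curie("foaf:knows", prefixes))
print(expand_curie(":id", prefixes))
print(expand_curie("http://sw-app.org/mic.xhtml#i", prefixes))
```

Registering the empty prefix would also rewrite :id, so the special key would need to be excluded explicitly in that case.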

Thoughts, anyone?


About woddiscovery

Web of Data researcher and practitioner

Discussion

7 thoughts on “Toying around with Riak for Linked Data”

  1. Nice one Michael :) FWIW I’ve been doing pretty much the same with REDIS for fast memory storage, where you map a triple onto a hash: the key is the subject, the value is a hashmap of properties and objects. Further, REDIS supports pub/sub and message queues, so you can essentially make a fast in-memory stream of triples (or changes).

    Thinking that all of these web techs are converging into something nice!

    Again, great post and hacking!

    Best,

    Nathan

    Posted by Nathan | 2010-10-14, 15:29
    • Nathan,

      Thanks! Do you have anything online available? Observations, a benchmark, whatever? Would be great to learn more about your experience as well.

      Cheers,
      Michael

      Posted by woddiscovery | 2010-10-15, 07:12
  2. I can see the appeal of storing RDF in MongoDB and the appeal of using HTTP to talk to MongoDB.

    Not sure I see the appeal of treating each triple as an HTTP “resource” w/ its own URI. I suppose it will work for small collections of data, but will not scale well.

    HTTP is a good fit for large, coarse-grained messages (whole docs|graphs), but a poor fit for small, chatty convos (single element in a doc|single triple).

    Posted by Mike Amundsen | 2010-10-14, 18:58
    • Mike,

      Thanks for sharing your thoughts! This is exactly what I mean … just because something is technically possible doesn’t mean it makes sense. However, what I don’t know (yet) is, what is in fact the typical size of the objects in Riak? Any idea?

      Cheers,
      Michael

      Posted by woddiscovery | 2010-10-15, 07:08
  3. Hi Michael,
    interesting post. What about using a GraphDB for RDF? RDF data could be seen as a graph. Where GraphDB helps, IMHO, is in performance on traversing relationships and Query languages.

    Although I don’t know of any native SPARQL support for GraphDB, I’d like to plug this gap using OrientDB or even the Tinkerpop Blueprints API. WDYT?

    bye,
    Luca Garulli

    Posted by Luca Garulli | 2010-10-29, 10:50
