So I stumbled upon Rob Vesse’s tweet the other day, where he said he was about to use MongoDB for storing RDF. A week earlier I watched a nice video about links and link walking in Riak, “a Dynamo-inspired key/value store that scales predictably and easily” (see also the Wiki doc).
The main issue then was how to map the RDF graph into Riak buckets, objects and keys. Here is what I came up so far – I use a RDF resource-level approach with a special object key that I called
:id, which is the RDF resource URI or the bNode. Further, in order to maintain the graph provenance, I store the original RDF document URI in the metadata of the Riak bucket. Each RDF resource is mapped into a Riak object; for each literal RDF object value the literal value is stored directly via an Riak object-key, for each resource object (URI ref or bNode), I use a Link header.
Enough words. Action.
Take the following RDF graph (in Turtle):
:i foaf:name "Michael Hausenblas" ;
foaf:knows <http://richard.cyganiak.de/foaf.rdf#cygri> .
To store the above RDF graph in Riak I would then using the following curl commands:
curl -X PUT -d 'Michael Hausenblas' http://127.0.0.1:8098/riak/res0/foaf:name
curl -X PUT -d 'http://sw-app.org/mic.xhtml#i' http://127.0.0.1:8098/riak/res0/:id
curl -X PUT -d 'http://richard.cyganiak.de/foaf.rdf#cygri' http://127.0.0.1:8098/riak/res1/:id
curl -X PUT -d 'http://sw-app.org/mic.xhtml#i' -H "Link: </riak/res1/:id>; riaktag=\"foaf:knows\"" http://127.0.0.1:8098/riak/res0/:id
Then, querying the store is straight-forward like this (here: list all people I know)
Yes, I know, the prefixes like
foaf: etc. need to be taken care of (but that’s rather easy, can be put in the bucket’s metadata as well, along with the prefix.cc service. Further, the bNodes might cause troubles. And there is no smushing via
owl:sameAs or IFPs (yet). But the most challenging area is maybe how to map a SPARQL query onto Riak’s link walking syntax.