Linked Open Data Caching – Establishing a Baseline with HTTP

The other day I was pondering Linked Open Data Source Dynamics, and as a starting point I wanted to learn more about the caching characteristics of LOD data sources. To establish a baseline, one should first look at what HTTP, one of the pillars of Linked Data, offers (see also RFC 2616, Caching in HTTP).

So, I hacked together a little PHP script that takes 17 sample resources from the LOD cloud (from representative datasets ranging from DBpedia through GeoSpecies to W3C WordNet) and checks which caching-related HTTP headers they send. The results of the LOD caching evaluation are somewhat deflating: more than half of the samples do not support cache control, and fewer than 20% support Last-Modified or ETag headers.
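For readers who want to try something similar, here is a minimal sketch of such a header check. It is written in Python, not the original PHP, and it is not the script used for the numbers above. The function names (`cache_support`, `fetch_headers`, `tally`) and the `Accept` value are assumptions made for illustration.

```python
from urllib.request import Request, urlopen

# The three response headers the evaluation looks at.
CACHE_HEADERS = ("Last-Modified", "ETag", "Cache-Control")


def cache_support(headers):
    """Report which caching-related headers are present in a response.

    `headers` is any mapping of header name to value. Names are compared
    case-insensitively, since HTTP header names are case-insensitive.
    """
    present = {name.lower() for name in headers}
    return {name: name.lower() in present for name in CACHE_HEADERS}


def fetch_headers(url, timeout=10):
    """Issue a HEAD request and return the response headers as a dict.

    Assumption: the server answers HEAD requests; the Accept value is
    only an example of asking for an RDF representation.
    """
    request = Request(url, method="HEAD",
                      headers={"Accept": "application/rdf+xml"})
    with urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())


def tally(reports):
    """Count, per header, how many sampled resources sent it."""
    return {name: sum(1 for report in reports if report[name])
            for name in CACHE_HEADERS}
```

Calling `cache_support(fetch_headers(url))` for each sample URL and passing the resulting list to `tally` gives per-header counts of the kind summarized above. Note that `urlopen` follows redirects, so the headers reported are those of the final response.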

LOD datasets sending Last-Modified header

LOD datasets sending ETag header

LOD datasets sending Cache-Control header

I know, I know, this is just a very limited experiment. And yes, very likely there are not yet that many applications out there consuming Linked Data and hence using up the whole bandwidth. However, given that one of the arguments for the scalability of the Web is the built-in HTTP caching mechanism, LOD dataset publishers might want to take a closer look at what the server or platform at hand is able to offer concerning caching support.
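To illustrate why these headers matter to a consumer, here is a small Python sketch of how a client could turn the validators from a cached response into a conditional request. The function name `revalidation_headers` is hypothetical; the header names `If-None-Match` and `If-Modified-Since` are the standard HTTP ones.

```python
def revalidation_headers(cached_headers):
    """Build conditional request headers from a cached response's validators.

    `cached_headers` is a mapping of the header names and values that came
    with the previously fetched representation.
    """
    lookup = {name.lower(): value for name, value in cached_headers.items()}
    conditional = {}
    if "etag" in lookup:
        # Ask the server to send a body only if the entity tag has changed.
        conditional["If-None-Match"] = lookup["etag"]
    if "last-modified" in lookup:
        # Ask the server to send a body only if it changed after this date.
        conditional["If-Modified-Since"] = lookup["last-modified"]
    return conditional
```

A server that supports these validators can answer such a request with 304 Not Modified and no body. When a data source sends neither header, the function returns an empty dict and the client has nothing to revalidate with, so it has to fetch the full representation again.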


About woddiscovery

Web of Data researcher and practitioner

