
Data and the Web – a great many of choices

Jan Algermissen recently compiled a very useful Classification of HTTP-based APIs. This, together with Mike Amundsen’s interesting review of Hypermedia Types, made me think about data and the Web.

One important aspect of data is “on the Web vs. in the Web” as Rick Jelliffe already highlighted in 2002:

To explain my POV, let me make a distinction between a resource being “on” the Web or “in” the Web. If it is merely “on” the Web, it does not have any links pointing to it. If a resource is “in” the Web, it has links from other resources to it. […] A service that has no means of discovery (i.e. a link) or advertising is “on” the Web but not “in” the Web, under those terms. It just happens to use a set of protocols but it is not part of a web. So it should not be called a web service, just an unlinked-to resource.

In 2007 Tom Heath repeated this essential statement in the context of Linked Data.
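
The distinction can be made concrete with a small link graph. What follows is only a minimal sketch: the function name `classify_resources` and the example.org URLs are hypothetical, and "has at least one link from another resource" is assumed as the test for being "in" the Web.

```python
from typing import Dict, Set


def classify_resources(links: Dict[str, Set[str]]) -> Dict[str, str]:
    """Label each resource "in" or "on" the Web.

    `links` maps a resource URL to the set of URLs it links to.
    A resource counts as "in" the Web if some *other* resource links to it;
    otherwise it is merely "on" the Web.
    """
    linked_to: Set[str] = set()
    for source, targets in links.items():
        # Self-links are ignored: they do not make a resource discoverable.
        linked_to.update(target for target in targets if target != source)

    resources = set(links) | linked_to
    return {url: ("in" if url in linked_to else "on") for url in resources}


# Hypothetical example graph.
links = {
    "http://example.org/blog": {"http://example.org/dataset"},
    "http://example.org/dataset": set(),
    "http://example.org/service": {"http://example.org/blog"},
}
status = classify_resources(links)
```

In this example the service links out to the blog but nothing links to it, so it stays "on" the Web; outgoing links alone do not change the classification.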

So I thought it made sense to revisit some (more or less) well-known data formats and services and try to pin down what “in the Web” means – a first step towards measuring how well integrated they are with the Web. In the following, I’ll call the degree to which they are in the Web the Link factor. I suggest that the Link factor ranges from -2 (totally “on the Web”) to +2 (totally “in the Web”), with the following attempt at a definition of the scale:

-2 … proprietary, desktop-centric document formats
-1 … structured data that can be exposed and accessed via the Web
 0 … standardised, Web-aligned (XML-based) formats or Web services
 1 … open, standardised (document) formats
 2 … fully REST-compliant, open (data) standards natively supporting links

Here is what I have so far – feel free to ping me if you disagree or have other suggestions:

Technology                 Examples                             Link factor
Documents                  MS Word, PDF                         -2
Spreadsheets               MS Excel                             -1
RDBMS                      Oracle DB, MySQL                     -1
NoSQL                      BigTable, HBase, Amazon S3, etc.      0
Hypertext and Hypermedia   HTML, VoiceXML, SVG, Google Docs      1
Hyperdata                  Atom, OData, Linked Data              2
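
For readers who prefer to see the scale as data, the table can be written down as a simple lookup. This is just a sketch in Python: the names `LINK_FACTORS` and `rank_by_link_factor` are made up for illustration, and the values are copied directly from the table.

```python
from typing import Dict, List

# Link factor per technology category, from -2 (totally "on the Web")
# to +2 (totally "in the Web"). Values are taken from the table.
LINK_FACTORS: Dict[str, int] = {
    "Documents": -2,
    "Spreadsheets": -1,
    "RDBMS": -1,
    "NoSQL": 0,
    "Hypertext and Hypermedia": 1,
    "Hyperdata": 2,
}


def rank_by_link_factor(factors: Dict[str, int]) -> List[str]:
    """Return the categories ordered from most to least "in the Web"."""
    return sorted(factors, key=lambda name: factors[name], reverse=True)


ranking = rank_by_link_factor(LINK_FACTORS)
```

Keeping the scale as plain integers makes it easy to compare two categories, or to adjust a value if the discussion leads to a different rating.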

About woddiscovery

Web of Data researcher and practitioner


8 thoughts on “Data and the Web – a great many of choices”

  1. Michael,

    How about this dichotomy?
    Hypermedia Resources (Hypertext or Hyperdata) and Non Hypermedia Resources (traditional platform and application specific resource types).

    It also becomes somewhat easier to understand the value proposition of middleware products that offer the following:

    1. Transformation of Non Hypermedia Resources into Hypermedia Resources
    2. Transformation of Hypertext Resources into Hyperdata Resources.


    Posted by Kingsley Idehen | 2010-03-01, 14:52
  2. Interesting classification, but I don’t completely agree, especially with the PDF bit. PDF documents (and even Word documents) are hypertext documents internally, but also as part of the Web, as it is easy to add external links to them (I do it all the time). So, PDF documents can be very much “in the Web”. Also, note that PDF is in fact an open ISO standard! I would even challenge the claim that PDF is “desktop-centric”. Rather, I would call PDF a “print-centric” format, where HTML is a screen-centric format.

    Posted by Knud Möller | 2010-03-02, 16:05
    • Knud,

      “… as it is easy to add external links to them (I do it all the time).”

      Really? Maybe you mean link *from* (within) them? If you can show me how to link *to*, say, a certain section of a MS Word document from the Web, I’ll come upstairs and get you a coffee (and/or) a cookie for free 😉

      Posted by woddiscovery | 2010-03-02, 16:11
    • Right, I meant linking from within the document to the outside. PDF has some disadvantages with respect to linking in the opposite direction – even though I think there is some arcane mechanism to link to specific pages or sections within a PDF document (does that qualify for a cookie?).

      Still, to say that PDF is a proprietary, document-centric format and not at all “in” the Web sounds wrong to me!

      Posted by Knud Möller | 2010-03-02, 16:34


