This post has been triggered by a Twitter thread, where I replied to @olyerickson that I think https://subj3ct.com is a good thing to have. Then, @hvdsomp noted (rightly!) that registries don’t scale (in reference to a conversation we had earlier on).
Big confusion, right? Michael says one thing and then the opposite on the very next day. Nah, not really.
Actually, turns out I’ve been quite consistent over time. In late 2008 I wrote in Talis’ NodMag #4 (on page 16):
Could you imagine reporting your new blog post, Wiki page or whatever you have to hand to an authority that takes care of adding it to a ‘central look-up repository’? I can’t, and there is at least one good reason for it: such things don’t scale. However, there are ways to announce and promote the content.
So, what is the difference between a UDDI-style registry (which, btw, did not exactly turn out to be a success) and what I’ll call a central point of access (CPoA) in the following?
Before I try to answer the question, let me first give you some examples of CPoAs in the Web of Data context:
- Services that reconcile data from different sources, like uberblic.org;
- Co-reference lookups, such as sameas.org;
- Generic Web of Data indexers, such as the LOD Cloud Cache, Falcons or Sindice;
- A namespace/prefix lookup: prefix.cc;
- Ontology lookups such as Cupboard;
- ‘Single-dataset’ lookups such as DBpedia’s lookup service or the LinkedGeoData online access interface;
- Dataset description stores, such as the RKB voiD store.
Some of these CPoAs employ automated techniques to fill their internal databank (such as Sindice or sameas.org), some of them depend on human input (for example prefix.cc). Some of them focus on a special kind of use case or domain (Cupboard or voiD stores), some try to be as generic as possible (Falcons, Sindice).
All of them, though, do share one principle: it’s up to you if you’re listed there or not (ok, technically, some might discover your data and index it, but that’s another story). The (subtle) difference is a priori vs. a posteriori: no one forces you to submit, say, your voiD file to a voiD store or to Sindice. However, if you want to increase your visibility, if you want people to find your valuable data and use it, you’ll need to promote it. So, I conclude: one effective way to promote your data (and schema, FWIW) is to ‘feed’ CPoAs. Contrast this with a centralised registry where you need to submit your stuff first, otherwise no one is able to find it (or, put in other words: if you don’t register, you’re not allowed to participate).
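To make the a priori vs. a posteriori distinction a bit more tangible, here is a minimal toy sketch in Python. Everything in it is hypothetical and made up for illustration: the classes `Web`, `CentralRegistry` and `CentralPointOfAccess`, their methods, and the example.org URI. It does not reflect how Sindice, a voiD store, or any other real service is implemented; it only models the idea that data on the Web is reachable by its URI no matter what, while a CPoA merely adds findability on top.

```python
# Toy model only: all class names, method names and URIs are hypothetical.

class Web:
    """Resources are dereferenceable by URI whether or not anyone indexes them."""

    def __init__(self):
        self.resources = {}

    def publish(self, uri, description):
        self.resources[uri] = description

    def dereference(self, uri):
        return self.resources.get(uri)


class CentralRegistry:
    """A priori: nothing can be found unless it was registered first."""

    def __init__(self):
        self.entries = {}

    def register(self, uri, description):
        self.entries[uri] = description

    def find(self, keyword):
        return sorted(u for u, d in self.entries.items() if keyword in d)


class CentralPointOfAccess:
    """A posteriori: indexes what is already on the Web; submitting is optional."""

    def __init__(self, web):
        self.web = web
        self.index = {}

    def submit(self, uri):
        # Voluntary promotion by the publisher.
        description = self.web.dereference(uri)
        if description is not None:
            self.index[uri] = description

    def crawl(self, seed_uris):
        # Automated discovery; the publisher does not have to do anything.
        for uri in seed_uris:
            self.submit(uri)

    def find(self, keyword):
        return sorted(u for u, d in self.index.items() if keyword in d)


uri = "http://example.org/void.ttl"
web = Web()
web.publish(uri, "voiD description of a music dataset")

registry = CentralRegistry()
cpoa = CentralPointOfAccess(web)

# Not registered: invisible in the registry world.
print(registry.find("music"))
# On the Web the data is there anyway; feeding the CPoA only makes it findable.
print(web.dereference(uri))
cpoa.submit(uri)
print(cpoa.find("music"))
```

The point of the sketch is the asymmetry: in the registry world, registering is the precondition for taking part at all, whereas here `submit` (or being picked up by `crawl`) is just promotion for something that already exists on the Web.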
Nevertheless, I stand by it: centralised, forced-to-sign-up registries are bad for the Web (of Data). They do not scale. CPoAs, such as those listed above, are not only good for the Web (of Data) but essential to make it usable, especially to help bridge the term-URI gap (or: enter the URI space), which I’ll flesh out in another post. Stay tuned!