Tim's Weblog
Tim Strehle’s links and thoughts on Web apps, software development and Digital Asset Management, since 2002.

Hunting for well-known Semantic Web vocabularies and terms

As a Semantic Web / Linked Data newbie, I’m struggling to find the right URIs for properties and values.

Say I have a screenshot as a PNG image file.

If I were to describe it in the Atom feed format, I’d make an “entry” for it, write the file size into the “link/@length” attribute, the “image/png” MIME type into the “link/@type” attribute, and a short textual description into “content” (with “@xml:lang” set to “en”). Very easy for me to produce, and the semantics would be clear to everyone reading the Atom standard.
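To make the Atom variant concrete, here is a minimal sketch in Python that builds such an entry with the standard library. The file URL, the file size and the description text are made-up placeholder values, and I'm assuming the link would use rel="enclosure". The required Atom elements "id", "title" and "updated" are left out to keep it short, so this is a fragment rather than a complete, valid entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize Atom elements without a prefix (default namespace).
ET.register_namespace("", ATOM_NS)


def build_entry(href, size_bytes, mime_type, description, lang="en"):
    """Build a bare-bones Atom entry describing one file."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    # File size goes into link/@length, MIME type into link/@type.
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {
        "rel": "enclosure",  # assumption: the file is linked as an enclosure
        "href": href,
        "length": str(size_bytes),
        "type": mime_type,
    })
    # Short textual description, with its language in @xml:lang.
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", {
        f"{{{XML_NS}}}lang": lang,
    })
    content.text = description
    return entry


# Placeholder values, not real data.
entry = build_entry(
    "http://example.com/screenshot.png",
    48213,
    "image/png",
    "Screenshot of the login dialog",
)
atom_xml = ET.tostring(entry, encoding="unicode")
print(atom_xml)
```

Everything here is a plain string or number; no URIs are needed for the property names or values, which is exactly what makes this format so easy to produce.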

Now I want to take part in the “SemWeb” and describe my screenshot in RDFa instead (to allow highly extensible data exchange between different vendors’ Digital Asset Management systems, for example). But suddenly life is hard: for each property (“file size”, “MIME type”, “description”) and for some values (“type: file”, “MIME type: image/png”, “language: English”) I’ve got to provide a URL (or URI).

I could make up URLs on my own domain – how about http://strehle.de/schema/fileSize ? But that would be missing the point and prevent interoperability. How to Publish Linked Data on the Web puts it like this: “A set of well-known vocabularies has evolved in the Semantic Web community. Please check whether your data can be represented using terms from these vocabularies before defining any new terms.”

The previous link lists about a dozen vocabularies. There’s a longer list in the State of the LOD Cloud report. And a W3C VocabularyMarket page. These all seem a bit dated and incomplete: none of them link to schema.org, one of the more important vocabularies in my opinion. (Browsing Semantic Web resources in general is no fun; you run into lots of outdated stuff and broken links.) And I haven’t found a good search engine that covers these vocabularies: I don’t want to browse twenty different sites to find out which one defines a “file size” term.

I’m pretty sure the Semantic Web pros know where to look, and how to do this best. Please drop me a line (e-mail or Twitter) if you can help :-)

Update: The answer is the Linked Open Vocabularies site. Check it out!

For the record, here’s what I’ve found so far for my screenshot example:

“file size”: https://schema.org/contentSize

“MIME type”: http://en.wikipedia.org/wiki/Internet_media_type or http://www.wikidata.org/wiki/Q1667978

“description”: http://purl.org/dc/terms/description or https://schema.org/text

“type: file”: http://en.wikipedia.org/wiki/Computer_file or http://www.wikidata.org/wiki/Q82753, or more specific: http://schema.org/MediaObject or http://schema.org/ImageObject or even http://schema.org/screenshot

“MIME type: image/png”: http://purl.org/NET/mediatypes/image/png or http://www.iana.org/assignments/media-types/image/png

“language: English”: http://en.wikipedia.org/wiki/English_language or http://www.lingvoj.org/languages/tag-en.html or https://www.wikidata.org/wiki/Q1860
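To see how some of these URIs might fit together, here is a rough sketch in Python that writes the screenshot description as N-Triples (the same statements could be embedded in a page as RDFa). It is only one possible combination, not a recommendation. The subject URL and the literal values are made-up placeholders. For the “MIME type” property I'm assuming http://purl.org/dc/terms/format, which is not one of the URIs listed above, and the language is expressed as an “en” tag on the description literal rather than through one of the language URIs.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URL of the screenshot being described.
SUBJECT = "http://example.com/screenshot.png"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang=None):
    """Format a string as an N-Triples literal with an optional language tag."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


# (subject, predicate, already-formatted object)
triples = [
    # "type: file", using the more specific schema.org class
    (SUBJECT, RDF_TYPE, iri("http://schema.org/ImageObject")),
    # "file size" (placeholder value)
    (SUBJECT, "https://schema.org/contentSize", literal("48213")),
    # "description", tagged as English
    (SUBJECT, "http://purl.org/dc/terms/description",
     literal("Screenshot of the login dialog", lang="en")),
    # "MIME type: image/png"; dcterms:format is my assumption for the property
    (SUBJECT, "http://purl.org/dc/terms/format",
     iri("http://purl.org/NET/mediatypes/image/png")),
]


def to_ntriples(rows):
    """Serialize the triples, one statement per line."""
    return "\n".join(f"{iri(s)} {iri(p)} {o} ." for s, p, o in rows)


ntriples = to_ntriples(triples)
print(ntriples)
```

Compared with the Atom version, every property name and two of the values are now URIs that someone else could look up or reuse.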