A new fight has broken out in specland, between the supporters of RDFa and supporters of microdata. Observers may be wondering why; both are methods of adding extra markup to existing content in order that machines may better understand the content. Semantic Web proponents (note capital letters) dream of a Web where all content is linked by said machines. Semantic Web sceptics have more humble aspirations of search engines better understanding micro-content (is this string of digits a book ISBN, or a phone number?).
RDFa was part of XHTML 2. It became a W3C standard (or, in their vernacular, a “recommendation”) in 2008. microdata was invented by Ian Hickson as part of HTML5 because he identified deficiencies in RDFa. microdata was subsequently modularised out of W3C HTML5, but microdata is part of HTML5; it validates, whereas RDFa doesn’t.
Note the history. Like football fights that break out because one guy called an opposing team fan’s pint “a pouff”, this isn’t about the actual slight at all; this is about the past, allegiances and alliances; it’s a clash of world views. This is XML versus non-XML; it’s the XHTML 2 gang against the uncouth young turks of HTML5. This is Rangers vs Celtic; it’s Blur vs Oasis; it’s Tiswas vs Swap Shop.
(Added 15:30 GMT: Re my framing the current debate as a talismanic battle, I should point out that I don’t mean Manu (whom I’ve always found to be courteous, thoughtful and a jolly good chap). Neither do I mean Marcos, who isn’t a WHATWG-er. But some of the discourse (“cowardice”, “suck metadata and fade, for all I care” on one side, and “TimBL’s RDF temple priests still mad as hell” on the other) suggests some, er, partisan feeling going on.)
What follows is the observation of a layman; I’ve not used much structured content, so am not an expert. (I once tried to use microformats for events at the Law Society, but their accessibility problems prevented it.)
In my opinion, the primary deficiency of Classic RDFa is that it’s too hard to write. For professional metadata-ologists it may be simple (but, hey, those guys understand Dublin Core!). The difficulty for me as an HTML wrangler was namespacing, CURIEs, and triples. This is XML land, and most web authors are not particularly adept with XML.
There’s also the problem that, in order to use RDFa properly, you needed an xmlns attribute that is separate from the content you’re actually marking up (you don’t any more in RDFa 1.1; see Manu’s comment). In a world where lots of content is syndicated by machine, or copied and pasted by authors (many of whom don’t really understand what they’re copying and pasting), this leads to breakage, as not all of the necessary moving parts get transferred to their new environment. Hixie wrote:
Copy-and-paste of the source becomes very brittle when two separate parts of a document are needed to make sense of the content. Copy-and-paste is how the Web evolved, so I think it is important to keep it functional and easy.
microdata solves this problem. It’s also easier to write than Classic RDFa (in my opinion), although I’m still mystified by the itemid attribute. I intend to start using microdata on this site soon (in order to plug the holes left by removal of the HTML5
I’ve been recommending that people use microdata. Its main advantages:
- it’s the basis of the schema.org vocabularies (schema.org being a consortium of Bing, Google, Yahoo and Yandex), so you get some SEO benefits
- it validates as HTML5 [as does RDFa Lite, as Gunnar points out in the comments]
- in Opera, dragging and dropping an item carries microdata with it.
Manu Sporny understood the problem that RDFa is hard to author for those of us who find the best ontology is a don’t-ology. Almost a year ago, he set about simplifying RDFa and came up with RDFa Lite. RDFa Lite greatly simplifies RDFa; in fact, you can search and replace microdata terms with RDFa terms (see his post Mythical Differences: RDFa Lite vs. Microdata).
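To make the “search and replace” claim concrete, here is a minimal Python sketch of that kind of rewrite. It is not Manu’s tool or anyone’s official converter: the function name `microdata_to_rdfa_lite` and the sample markup are made up for illustration, and it assumes only the simplest correspondence (an `itemscope` immediately followed by an `itemtype` becomes `vocab` plus `typeof`, and `itemprop` becomes `property`). It deliberately ignores `itemid`, `itemref` and any element whose attributes appear in a different order.

```python
import re

# Hypothetical pattern: matches itemscope directly followed by itemtype="...",
# splitting the type URL into a vocabulary part (up to the last "/") and a type name.
ITEMTYPE = re.compile(
    r'itemscope\s+itemtype="(?P<vocab>[^"]*/)(?P<type>[^"/]+)"'
)


def microdata_to_rdfa_lite(html: str) -> str:
    """Naively rewrite simple microdata attributes as RDFa Lite attributes."""
    # itemscope itemtype="http://schema.org/Person"
    #   -> vocab="http://schema.org/" typeof="Person"
    html = ITEMTYPE.sub(r'vocab="\g<vocab>" typeof="\g<type>"', html)
    # itemprop="name" -> property="name"
    html = re.sub(r'\bitemprop=', 'property=', html)
    return html


# Illustrative sample only; not taken from any real page.
SAMPLE = '''<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Manu Sporny</span>
</div>'''

if __name__ == "__main__":
    print(microdata_to_rdfa_lite(SAMPLE))
```

A regular expression is a blunt instrument for HTML, so treat this as a demonstration of how close the two syntaxes are rather than something to run over a production site.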
RDFa has multiple advantages, too:
- it’s compatible with existing RDFa data on the Web (which is why it uses many of the same patterns as microdata but uses a different syntax)
- you can use different vocabularies in the same item, which you can’t in microdata – see Jeni Tennison’s Using Multiple Vocabularies in Microdata. This allows you to support both schema.org and Facebook’s Open Graph Protocol (OGP) using a single markup language
- you can easily switch to full-fat RDFa in the future if you feel the need
- RDFa is also supported by schema.org (although it’s “experimental” at the moment). Manu has some screenshots of enhanced Google search results
- it also validates as HTML5 (thanks Gunnar, in the comments)
It seems to me that developers should just choose the one that meets their project’s needs.
The current fight, however, won’t allow that. The RDFa gang want to stop microdata going further in the standardisation process because RDFa became a Recommendation first, and microdata is quite similar to it. (This is a controversial perspective; see Manu’s comment.)
While I completely understand that having two competing standards makes life harder for developers in the short term, I agree with Marcos Caceres (who isn’t a WHATWG/HTML5 zealot), who counters Manu Sporny’s objection to microdata progressing thus:
I hope you will instead focus your energy on convincing the world that RDFa is the “correct technology” on its own merits and not place your bets on a mostly meaningless label (“Recommendation”) given by some (much loved, but) random standard organisation.
I see no technical reason to favour microdata or RDFa Lite; both do the job. So, developers: which tickles your fancy? RDFa Lite or microdata?