Scooby Doo and the proposed HTML5 <content> element

Note: Since writing this, I’ve continued vacillating and now support a <main> element; see “Why I changed my mind about the <main> element”.

Trigger warning: contains disagreement about accessibility.

I’ve been vacillating (ooh err, missus) for two weeks from one opinion to the other regarding a proposed (and rejected) <content> element. This weekend, The Mighty Steve Faulkner wrote an unofficial draft of a <maincontent> element.

Dude, where’s my content?

For a while, people have suggested that HTML add a <content> element that wraps main content, because many websites have something like <div id="content"> surrounding the area that authors identify as their main content, which they then use to position and style that central content area.

Fans of WAI-ARIA also like to hang role="main" on that area, to tell assistive technology where the main content of the page starts. I do this too.

The editor of HTML.next, Ian Hickson, rejected a new <content> element:

What would the element _mean_? If it’s just “the main content”, then that is what the element’s contents would mean even without the element, so really it means the element is meaningless. And in that case, <div> is perfect, since that’s what it is: a grouping element with no meaning.

The primary argument against a special element is that it isn’t necessary, because the beginning of “main content” can be identified by a process of elimination that I call the “Scooby Doo algorithm”: you always know that the person behind the ghost mask will be the sinister janitor of the disused theme park, simply because he’s the only person in the episode who isn’t Fred, Daphne, Velma, Shaggy, or Scooby. (Like most Scooby fans, I’m pretending Scrappy never existed.)

Similarly, the first piece of content that’s not in a <header>, <nav>, <aside>, or <footer> is the beginning of the main content, regardless of whether it’s contained in an <article> or a <div>, or whether it is a direct child of the <body> element.
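To make the process of elimination concrete, here is a minimal sketch in Python using the standard library’s html.parser. It is only an illustration of the idea, not how any browser or assistive technology actually works: the names ScoobyDooParser and find_main_start are made up for this example, and it makes the simplifying assumptions that the page is reasonably well-formed and that any text counts as content (it makes no special case for <script> or <style>).

```python
from html.parser import HTMLParser

# Elements whose contents the "Scooby Doo algorithm" rules out as main content.
EXCLUDED = {"header", "nav", "aside", "footer"}


class ScoobyDooParser(HTMLParser):
    """Record the first text in <body> that is not inside an excluded element."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.excluded_depth = 0  # how many excluded elements we are currently inside
        self.main_start = None   # first text found outside the excluded elements

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in EXCLUDED:
            self.excluded_depth += 1

    def handle_endtag(self, tag):
        if tag in EXCLUDED and self.excluded_depth > 0:
            self.excluded_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if (self.in_body and self.excluded_depth == 0
                and text and self.main_start is None):
            self.main_start = text


def find_main_start(html):
    """Return the first piece of text that survives the elimination, or None."""
    parser = ScoobyDooParser()
    parser.feed(html)
    parser.close()
    return parser.main_start


page = """
<body>
  <header><h1>Site name</h1></header>
  <nav><a href="/">Home</a></nav>
  <div id="main"><h2>Mystery solved</h2><p>It was the janitor.</p></div>
  <footer>Contact us</footer>
</body>
"""
print(find_main_start(page))  # Mystery solved
```

Note that the sketch never looks at the id or the element name of the wrapper: swapping the <div> for an <article>, or dropping the wrapper altogether, gives the same answer.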

Authors do need to be able to identify their main content, both for styling (in which case <div> seems to be the most appropriate element) and as a target for “skip links”, in which case the current <a href="#main">Skip nav</a> … <div id="main"> pattern still does the trick.

It’s worth noting that people often code “skip links” believing they’re required by WCAG 2, but if browsers implemented the Scooby Doo algorithm, that is explicitly not the case: “It is not the intent of this Success Criterion to require authors to provide methods that are redundant to functionality provided by the user agent.”

Many assistive technology user agents understand the ARIA role="main", so skip links should be unnecessary; ATs can home in on <div id="main" role="main"> by themselves, even without supporting the Scooby Doo algorithm.
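By contrast, finding an explicit landmark needs no elimination at all. The following is a rough sketch, again with Python’s html.parser, of how a tool might look for elements carrying role="main"; MainLandmarkFinder and find_main_landmarks are hypothetical names invented for this illustration, and real assistive technologies read the browser’s accessibility tree rather than parsing markup like this.

```python
from html.parser import HTMLParser


class MainLandmarkFinder(HTMLParser):
    """Collect (tag, id) for every start tag that carries role="main"."""

    def __init__(self):
        super().__init__()
        self.landmarks = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # role may hold several space-separated tokens; a missing value is None.
        roles = (attributes.get("role") or "").lower().split()
        if "main" in roles:
            self.landmarks.append((tag, attributes.get("id")))


def find_main_landmarks(html):
    """Return a list of (tag, id) pairs for elements marked role="main"."""
    finder = MainLandmarkFinder()
    finder.feed(html)
    finder.close()
    return finder.landmarks


page = '<nav>Links</nav><div id="main" role=main><p>Story</p></div>'
print(find_main_landmarks(page))  # [('div', 'main')]
```

The list can legitimately come back empty, or with more than one entry if an author has overused the role, which is the sort of authoring error the rest of this discussion worries about.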

This suggests to me that a new element isn’t required. But…

Paving cowpaths, ease for authors

Chaals (ex-Opera, now Yandex) wrote:

To turn the question around, if it is more convenient for authors to identify the main content, and not think about the classification of other parts, should we offer that facility as part of the platform? Or does it make sense to say that only the exhaustive identification of all supplementary content is an appropriate way to mark up a page anyway?

Chaals argues that it makes authoring easier: suddenly you get extra accessibility by just adding one <content> element, rather than adding the other elements that the Scooby Doo algorithm can then exclude. People using CMSs, who only control the textarea that gets lumped in as “main content” and can’t touch the surrounding areas, can now add an element without having to ask others to tweak templates.

But then, they can do this already, by surrounding their content with <div role="main">, and this already works in assistive technologies.

A flawed argument for a new element is that it paves a cowpath, so should be added to the language. It’s certainly the case that <div id="main"> and <div id="content"> are very frequently found in pages: they were #2 and #6 among the most frequently used ID attributes in the 2008 “MAMA: What is the Web made of?” report.

But not every cowpath needs paving. If it did, we’d also have a <logo> and a <container> element (#4 and #5 respectively), and we’d be recommending tables for layout. If something can be done automatically, without requiring extra authorial work, shouldn’t that be favoured? In the same way that we like HTML5 form types as they’re baked into the browser, shouldn’t the Scooby Doo algorithm be preferable?

Of course, the Scooby Doo algorithm requires the author to use <header>, <footer>, <nav> and <aside>, but if (s)he doesn’t want to, or isn’t able to, author HTML5, ARIA’s role="main" is there precisely as bridging technology.

There’s also the argument that authors expect there to be a <content> element, so its absence violates the Principle of Least Surprise. But I’m not sure that’s a valid argument. Implementing the Scooby Doo algorithm would mean that the main content area could be programmatically determined even on pages whose authors do nothing specifically for accessibility. ARIA exists for pages that aren’t in HTML5, or for use until the Scooby Doo algorithm is widely supported, and analysis shows that most ARIA is correctly used by authors.

Why add extra complexity, which means more to go wrong and thus potentially harms accessibility?

15 Responses to “Scooby Doo and the proposed HTML5 <content> element”

Comment by steve faulkner

hi bruce,

Of course, the Scooby Doo algorithm requires the author to use <header>, <footer>, <nav> and <aside>, but if (s)he doesn’t want/ isn’t able to author HTML5, ARIA’s role="main" is there precisely as bridging technology.

Using the term “bridging” implies at some point the “scooby doo” algorithm will be implemented. It also implies that the right containers for all content other than the main content area are the ones you’ve indicated. I don’t think this is always the case. I also think you should call the algorithm the “if everything else is marked up using the required containers” algorithm, as it relies on marking up everything else in the page correctly to work.

Why add an extra complexity, which is more to go wrong and thus potentially harms accessibility?

built in vs bolt on: the element would provide built-in semantics. how exactly is using an element for a well understood and widely used container structure adding complexity, as against telling authors that if they have marked up their content to make the scooby doo algorithm usable they needn’t worry about using role=main, but otherwise they have to add it?

Comment by Charlie

I agree with the general thrust: “content” is what’s left when the furniture (navigation, header, footer, etc.) has been removed; what’s inside the box. In addition, “content” itself is too vague, which is why we can have multiple “articles” or “sections” on the same page, something that struck me as counterintuitive when I first came across it. As I’ve never been keen on “div” as a tag name I’d be tempted to use “section” to group elements of editorial content. Not sure if there are problems with that (doesn’t “section” have some kind of special role?) but in any case, “section”, “article” and “content” are semantically all very close if nuanced (“section” is more structural than “article”) and I don’t see the need for three, which I think will make interchangeability [even more] inevitable.

This is probably why, in books, we have names for the “index”, “introduction”, “preface”, “frontispiece”, “table of contents” but never “content” or “contents”. It’s far easier to disambiguate all of the above than “content”.

Comment by steve faulkner

I’ve just been looking at some web applications (google docs/drive/mail etc). it makes no semantic sense to shoehorn the contents of the interface into containers such as navigation, article, header etc just to make the scooby algorithm work, when they are in fact menu bars and toolbars etc, but it does make sense (to me) to be able to mark up the main content pane with

Comment by Bruce

Steve, I spent five minutes marking up a gmail page with HTML5 elements.

[Image: gmail marked up with HTML5 elements]

http://farm9.staticflickr.com/8455/7975677659_0383ef03fa.jpg

Seems like the Scooby Doo algorithm kind of works. (Of course, some authors might put the top bar that links to other google properties inside the header rather than a separate aside, but that doesn’t change the scooby algorithm).

Should <menu> be considered part of the stuff that should be excluded from main content? It’s not content – it’s a toolbar. On the other hand, few people would mark it up with menu as it’s not “supported” in any browser yet, so would probably make it an unordered list, in which case the “main content” would start there, and the contextual ad would be considered to be part of the main content (Google no doubt believe it is).

Would a screenreader user believe the menu is part of role=main? Probably not, as menu would have a role=toolbar, which is a separate role.

Comment by Steve Fenton

I too have been vacillating between for and against. I have no objection to a content element as long as it has a clear meaning.

I can imagine entire pages being put inside a content element just as easily as I can imagine multiple content elements on a page. In both of these cases the content element becomes less useful. Once the incorrect usage becomes more common than the correct usage, I suppose we’ll update the specification to match what authors actually do and how browsers actually interpret it.

Comment by Bruce

Steve Fenton, that’s one of my worries too. A misused content element is the worst case. Used correctly, it adds nothing.

Scooby Doo seems to work, so I favour that.

Comment by steve faulkner

Hi Bruce, as you said the algorithm kind of works in the gmail case. a few comments:
1. the use of menu will be problematic if and when implemented, as it is meant to provide actual interactive UI, not just a semantic container. So unless the developers actually use menu to create a menu interface it will be a problem. And i doubt they will in many cases; just look under the hood of the gmail app to see how many actual HTML controls are being used.
2. i don’t agree that the email list/table should be marked up as an article.
3. the algorithm relies upon everything else being marked up as it should be, rather than just one thing; there are potentially many more points of failure there.
4. basing how practical the algorithm would be on the visual view of complex web apps (like gmail) is deceptively seductive. try turning CSS off to get a better view of the soupness of the interface and then you may get a sense that using a single semantic container for the main content would be a lot easier to get implemented than getting everything else marked up right.

Comment by Bruce

Hi Steve

1) yeah, is the interactive menu part of the main content or not? Hard to choose.

2) that’s to indicate that each email is an article. I’ve updated the image on Flickr to hopefully reflect that.

3) that’s the best argument for a new element, I think.

Comment by Alohci

Hi Bruce and Steve,

I have two questions.

1. It seems to me that if browsers can compute the “main” landmark reliably, then it’s far better than relying on authors to get it right. Is there a use case for the maincontent start tag being anywhere other than at the scooby-doo point, except when the author has insufficiently marked up the “other” content?

2. If the author has marked up the “other” content fully with header, footer, aside and nav elements, but places the maincontent somewhere in the DOM below the scooby-doo point, is there a danger that the text before the maincontent element will be lost to AT users?

Comment by steve faulkner

Hi Alohci

It seems to me that if browsers can compute the “main” landmark reliably, then it’s far better than relying on authors to get it right.

In order for the user agent to compute the main landmark correctly it relies upon the author using header, footer, article, navigation and aside and using them correctly.

I would suggest that there are more points of failure there than asking the author to use 1 element once, correctly.

Comment by steve faulkner

while I am here (again) would like to clarify a few inaccuracies:

regarding a proposed (and rejected) <content> element

The proposal for an element has been rejected by hixie; it has not been rejected by the rest of the web standards community. It is fair to say that there is no consensus on the rejection or addition. In as much as hixie gets the final word on what goes into or out of ‘HTML the living standard’, he does not have the same vice-like grip on W3C’s HTML5 or what comes after. So the idea of the concept being rejected from HTML is a little premature.

The editor of HTML.next, Ian Hickson,

Ian is not the editor of HTML.next; he is the editor of HTML the living standard. The W3C HTML Working Group’s draft charter states that the HTML WG will “continue the development of the HTML language” and whilst it will obviously take new features added to HTML the living standard into account, that will not be the sole source of new features it considers for inclusion in HTML.next.