HTML 5 is a mess

“HTML 5 is a mess”, said John Allsopp.

I agree. It is. It’s also several different kinds of mess, not all of which are avoidable or bad. Let’s look at them.

The backwards compatibility mess

The first kind of mess is because it builds on HTML 4. I’m sure if you were building a mark-up language from scratch you would include elements like footer, header and nav (actually, HTML 2 had a menu element for navigation that was deprecated in 4.01).

You probably wouldn’t have loads of computer-science-oriented elements like kbd, var and samp in preference to the structural elements that people “fake” with classes. Things like tabindex wouldn’t be there, as we all know that if you use properly structured code you don’t need to change the tab order, and accesskey wouldn’t make it because it’s undiscoverable to a user and may conflict with assistive technology. Accessibility would have been part of the design rather than bolted on.

But we know that now; we didn’t know that then. And HTML 5 aims to be compatible with legacy browsers and legacy pages. This page is written in HTML 5, and although your browser doesn’t “understand” it, it still renders it.

There was a cartoon in the ancient satirical magazine Punch showing a city slicker asking an old rural gentleman for directions to his destination. The rustic says “To get there, I wouldn’t start from here”. That’s where we are with HTML. If we were designing a spec from scratch, it would look much like XHTML 2, which I described elsewhere as “a beautiful specification of philosophical purity that had absolutely no resemblance to the real world”, and which was aborted by the W3C last week.

That’s one reason why HTML 5 is a mess. It’s built on a mess.

The process mess

Specifying HTML 5 is probably the most open process the W3C has ever had. Different groups with different interests battle it out over the mailing lists. Competing variant specs are springing up: Manu Sporny is devising a spec that incorporates RDFa in HTML 5, while The Mighty Steve Faulkner is writing one addressing accessibility issues including alt text and summary attributes. I’m hopeful that these will all be merged together, while simultaneously, parts of the spec are being split out into separate specifications where there are no dependencies, such as Web Storage and Web Database.

This amorphous process is messy. But, in my opinion, infinitely better than the ivory tower of XHTML 2 academia.

The spec mess

Then, of course, there’s the real mess that John Allsopp and Matt Wilcox complain about—imprecise, ambiguous specification or unnecessary restrictions.

Why the restrictions on the time element? Why the overlap between figure and aside?

Why is the nav element so loosely specified? (“Only sections that consist of major navigation blocks are appropriate for the nav element”—define “major”, if you please.) Why can you have nav inside a header and not in a footer?

If these things bother you, you can do something about it: email the Working Group and make your feelings known.

Post-publication clarifications

I should clarify two things after some tweets and a Skype conversation with Lachlan Hunt. The first is in fairness to Hixie: he changed the definition of nav to clarify that there can be more than one per page, and I agreed that the new definition was an improvement. But, as I’m a self-confessed dimwit, I only realised later that the word “major” leaves ambiguity as well. So that’s my brain parsing slowly, not Hixie’s bad.

Secondly, and most importantly, I haven’t had some kind of damascene conversion to the anti-HTML 5 camp. I firmly believe that it’s the best way forward for an Open Standard that allows us to write richer web pages and applications. And I’m sure that no-one, especially those in the Working Group, believes it’s perfect already. In fact, the Working Group consistently call for developer participation in reviewing the spec.

To reiterate: bitching in blogs (like I’m doing now!) and on Twitter is OK, but the best way to be sure that your views are heard is to mail the Working Group.

38 Responses to “ HTML 5 is a mess ”

Comment by Remy Sharp

As per my tweet, I’d agree some of the spec is a mess.

I still don’t understand what the point of the hgroup element is, and you’ve even explained it to me in baby language (though perhaps breakfast was taking my attention).

Equally, the whole drag and drop keyboard support is in the spec as it turns out; it’s just that no browser implemented it, probably because they didn’t spot it (I certainly didn’t until Hixie pointed me right at it).

Comment by Remy Sharp

@Henri – that’s the exact same explanation I’ve had for the header element, so exactly how is hgroup different from header? Which is the source of my frustration/misunderstanding.

Comment by Stephen Hay

I see your point, Bruce. As I briefly mentioned to you on Twitter, the beauty of is its semantic extensibility. Using classes would therefore not constitute “faking it”, although I understand your argument.

I also understand the so-called “semantic” elements in HTML5, like <nav> and the ill-chosen and asking-for-trouble <aside> were meant to replace commonplace usage of things like <div id="header">. But these semantic elements should be used very sparingly, because simply “changing” HTML when the going gets rough just won’t happen that way. And with things like <footer>, traditional uses of footers should be investigated (which would dictate that sections should be allowed).

As a colleague of mine put it yesterday, we don’t want to end up doing the following:

<div class="header">
<header>
</header>
<nav>
</nav>
</div>

simply because we can’t do

<header>
<nav>
</nav>
</header>

In that case, HTML5 isn’t helping us one bit.

Also, the names should be considered carefully, per your example of HTML2 “menu” being dropped (now it’s called “nav”). Even things like <video> can be dangerous… why video and not audio? Or some form of media which doesn’t exist yet? (Olfactory, who knows). I had always hoped for a more abstract and extensible element with specific attributes, so that a change in the spec would only mean the addition of an attribute, not a dropped or new element. Something like <media type="video|audio|whatever">.

Anyway, you understand what I mean. And I might be speaking for a lot of people when I say that, given the (seemingly) chaotic development process and 800+ invited experts, I’m unaware of how to actually be heard, and not simply be throwing an opinion out there. There’s a LOT of input the HTML5 people have to deal with. And there are plenty of people who aren’t convinced it’s a group effort after all (see http://www.zeldman.com/2009/07/13/html-5-nav-ambiguity-resolved/#comment-44761). But on that subject, I have no opinion, as I know too little about it.

Thought-provoking piece. Thanks.

Comment by Henri Sivonen

Remy, hgroup used to be called header. The old header was renamed to hgroup and a new header element was introduced, because the old naming caused confusion.

Comment by AlastairC

To me, Henri’s explanation was a bit like “42”, but then I read the spec on hgroup, and it was fine :)

I’m actually quite glad someone like Hixie is editor. It has to be one of the worst, hardest jobs in the world.

Apparently (second-hand remembered anecdote warning), Hixie is someone who, at University, didn’t just complete a computing assignment: he submitted 19 bugs about the lecturer’s framework/language that it was written in.

That’s a good quality for someone defining a spec. The sheer obsessiveness required to keep on top of it in any way, is not something many people could even consider taking on. I’ve tried, in much smaller ways.

My hat’s off to him, and I hope his skin remains thick.

Comment by Bruce

I have to agree with you, Alastair. I can’t imagine anyone else doing the job, although he would do it even better if he agreed with *me* more ;-)

Comment by Dustin Wilson

Great article. This does accurately describe HTML 5’s current status, but like you I think the fact that the public can get involved in its creation process is a great thing. I do think it’s a mess currently, but I’m sure in the end there can be some order to things.

I ran into a problem myself when I was rewriting my weblog in HTML 5. The footer to my website contains simply a disclaimer sentence along with a menu. Per the spec as it currently stands, I cannot put a nav within a footer. Multitudes of websites out there contain menus at the top and the bottom of their pages, specifically in what many would deem the website’s footer. In the case of a simple menu you could mark it up with just an unordered or ordered list, but if it was marked up with a parent nav element within the document’s header element, it should, for the sake of continuity, be able to be marked up the same way within the document’s footer element. In short, it’s my opinion that the footer element’s content model should be “Flow content, but with no heading content descendants, and no header or footer element descendants.”
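As a rough sketch of the problem described in this comment (the link targets and text are invented for illustration, and the "not allowed" label reflects the draft as the commenters describe it, not necessarily any later revision):

```html
<!-- What the commenter would like to write.
     Per the draft discussed in this thread, nav is not permitted inside footer. -->
<footer>
  <p>Disclaimer sentence goes here.</p>
  <nav>
    <ul>
      <li><a href="/about">About</a></li>     <!-- hypothetical URLs -->
      <li><a href="/contact">Contact</a></li>
    </ul>
  </nav>
</footer>

<!-- The fallback mentioned in the comment: a plain list with no nav wrapper. -->
<footer>
  <p>Disclaimer sentence goes here.</p>
  <ul>
    <li><a href="/about">About</a></li>
    <li><a href="/contact">Contact</a></li>
  </ul>
</footer>
```

The second form loses the symmetry with a nav-wrapped menu in the page header, which is the continuity argument being made.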

Comment by Bruce

Dustin, and Paul

Why not mail the WG with a link to your blog and the salient bit of code and ask how it *should* be marked up so that it can be as easily styled as it is currently, and make your suggestion there?

Joe – maybe too cynical? I’ve had a small degree of success.

Comment by Daniel Kvasnička jr.

It’s funny how all the people around HTML5 talk semantics and yet “header”, “footer” and “aside” are possibly the most non-semantic element names in the universe……

Comment by John Allsopp

Bruce,

I guess one of the reasons folks are resorting to raising their legitimate concerns in public fora, rather than directly with the HTML WG (or should that be the WhatWG, or maybe both?) is possibly that they don’t have a tonne of faith in the process.

In one sense, those in charge of developing HTML 5 have created the perfect process for ensuring that what they want in there is in there, and others’ input can be easily sidelined, all within a framework of “consensus”. For heaven’s sake, fully spec’d W3 recommendations are simply being sidelined; why would someone as an individual possibly think there’s any point in expending any effort in participating? After all, this is something I fit in between all the other personal and professional things in my life, as do many many others. Others are basically paid to work full time on this stuff. Personally, I have not even the remotest time to follow hundreds of emails a month on a mailing list, particularly when there’s a “dictator” (benevolent though he may be – in his own words) who can make whatever choice he deems appropriate, when sensible, well-developed W3 recs that solve the problems that HTML 5 is focussing on are simply being ignored. There may be very good reasons why something like the role attribute simply doesn’t fly, but so far I’ve seen only flimsy ones.

5 and a half years have gone since the commencement of this work. It is in a far less advanced state than most folks would believe. It in many parts contradicts its own design principles, and the stated philosophies of its developers.

A perfect example is found in a comment here. Henri Sivonen says on this very page

the point of hgroup is to mask a h2 element (that acts as a secondary title) from the outline algorithm

Great, we are introducing an extraneous “semantic” element because the sectioning algorithm needs it. Seriously! Seriously! Here’s a suggested solution: role="sectionstart" (or whatever value, doesn’t matter what it is). The problem is conflating what the element *is* (a heading) with the *role* it plays (a heading that delineates the start of a section). This conflation is a direct function of an apparent obsession with using element names alone for semantics, and a strong aversion to attributes among the developers of HTML 5. This is a problem we see with nav, and a problem with the extraneous article element (which is a section whose role is being independent from the rest of the content on the page).
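To make the two approaches concrete, here is a minimal sketch. The heading text is invented, and the role value is the placeholder suggested in this comment, not something defined in any specification:

```html
<!-- The draft's approach, as Henri describes it: hgroup wraps the headings
     so that the h2 (a secondary title) is masked from the outline algorithm. -->
<hgroup>
  <h1>HTML 5 is a mess</h1>
  <h2>But not every mess is avoidable</h2>
</hgroup>

<!-- The attribute-based alternative floated in this comment (hypothetical):
     the heading that starts a section says so itself, and no wrapper is needed. -->
<h1 role="sectionstart">HTML 5 is a mess</h1>
<h2>But not every mess is avoidable</h2>
```

In the first form the grouping element carries the information; in the second, an attribute on the heading would.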

Now, to get back to my point. Role has been raised and effectively sidelined; indeed the entire approach of using attributes to extend semantics has, despite it being what developers have been *trying* to do for a decade using class (how’s that for not “paving the cowpaths”?).

So, what’s the point in me participating in the WG? Raising issues already “dealt with”? That’s why folks are taking this stuff “to the streets” ;-)

I think there’s far more benefit publicly discussing these issues, where interestingly there seems, based on comments here, and on Zeldman’s recent posts, as well as the tweets I’m seeing in response to my own, really significant interest. What ultimately comes of that I really don’t know.

But a word of advice to the developers of HTML 5 – your core constituency is web developers who will decide the fate of HTML 5 by deciding whether to implement it. And the first group of these developers you need to reach out to are the “standardistas”. The folks who read and write the blogs. Who set opinions. Who influence others. The classic “early adopters”. Without these folks there’d be no HTML 5 effort, as these are the folks who as much as anyone made the whole concept of standards a practical reality.

And the feeling I get is there are many such folks uneasy about HTML 5. Not against it. But uneasy.

And, BTW, these are largely the folks who got behind the XHTML effort. So, disparaging them, which seems to be a bit of a pastime of late among HTML 5 folks is not a particularly good engagement strategy.

Anil Dash recently wrote his Law of Fail

Once a web community has decided to dislike an idea, the conversation will shift from criticizing the idea to become a competition about who can be most scathing in their condemnation

Don’t let it get there. I don’t want HTML 5 to go there. In fact, no one I know does. But, it could happen.

Comment by Bruce

John, thanks for another mammoth comment!

I’ll split some hairs before moving onto the main point:

Role has been raised and effectively sidelined, indeed the entire approach of using attributes to extend semantics has, despite it being what developers have been *trying* to do for a decade using class.

It’s unclear whether people who have written div class="footer" desperately want to use div role="footer" above footer if they had the choice available. There are lots of people using p class="title" rather than using h2, but that’s not an argument for abolishing h1-h6 in favour of role="heading".

You mention “a strong aversion to attributes among the developers of HTML 5”, but the new HTML 5 form fields are all attributes of input (as textarea should have been, but isn’t).

Personally, I prefer elements to attributes. But I prefer tightly defined roles (no pun intended) over loosely defined, whether they be elements or attributes.

Your main point is that the WHAT-WG don’t listen. I’ve had mixed success with suggesting changes. Sometimes my editorial requests have been accepted, but my requests to change structural definitions (eg, widen the application of time) have been rebuffed with the “show me a use-case” stock response. With microformats, the use cases for marking up arbitrary dates (precise but ancient dates, “fuzzy” dates such as “July 2007”, disallowed in time) are obvious. (And it’s not just me: Wikipedia editors and museum webmasters have tried to make that point).
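A small sketch of the kind of case at issue (the dates are made up, and which forms are accepted reflects the draft as described in this comment):

```html
<!-- A precise calendar date: the kind of value time is designed for. -->
<time datetime="2009-07-13">13 July 2009</time>

<!-- A "fuzzy" date with no day: per the comment above, not allowed in time
     under the draft being discussed. -->
<time datetime="2007-07">July 2007</time>
```

Museum catalogues and encyclopaedia articles are full of the second kind, which is why the restriction chafes.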

So I understand totally why the conversation has been taken to the streets. It’s one of the things I tried to do when getting really into HTML 5 after Xmas.

But if the conversation is only ever “on the streets” and never presented through the “proper channels” then there is a danger that it might not be heard by those with the power actually to make the changes that we, as individuals and a community, propose.

Re Law of Fail: +1. In fact +1000.

Comment by John Allsopp

Thanks Bruce,

It’s unclear whether people who have written div class="footer" desperately want to use div role="footer" above footer if they had the choice available

That’s true, but these are the cowpaths, and markup patterns developers are familiar and comfortable with.

Re Form fields, this is how form fields have always been marked up, so clearly here the decision has been (sensibly IMHO) to continue with current practice – and ironically, it is similar to role.

I’ve promised Ian and Henri that when I get my thoughts into sufficiently sensible consistent detailed shape I’ll certainly be contributing to the official channels.

My primary interest in participating in these public conversations is to make folks who don’t follow these issues in detail aware of my concerns.
And just to note, this current flurry of comments commenced with my reply to Ian Hickson’s comment on Jeffrey Zeldman’s post of a few days ago. Ian and Henri Sivonen have been participating in this conversation, which I think is very encouraging.

Comment by patrick h. lauke

That’s true, but these are the cowpaths, and markup patterns developers are familiar and comfortable with.

familiar and comfortable, yes, but born out of frustration and lack of a better alternative…

Comment by mattur

Don’t let it get there [a competition about who can be most scathing in their condemnation]. I don’t want HTML 5 to go there. In fact, no one I know does. But, it could happen.

You left off “again”. It could happen “again” – as it did with WaSP/standardistas promoting the idea that HTML was inferior/messy/sloppy/not a web standard. But I’d like to think we’ve all learned from that experience.

But a word of advice to the developers of HTML 5 – your core constituency is web developers who will decide the fate of HTML 5 by deciding whether to implement it. And the first group of these developers you need to reach out to are the “standardistas”.

To get web developers to adopt practices with no obvious benefits (eg using XHTML1 as text/html) an advocacy group is required.

To get web developers to adopt practices with obvious benefits (eg using new functionality in HTML5) no advocacy group is required.

Comment by John Allsopp

Patrick,

familiar and comfortable, yes, but born out of frustration and lack of a better alternative…

Indeed, but “The street finds its own uses for things”

These are the cowpaths. We should at least not dismiss them out of hand.

Comment by John Allsopp

Mattur,

You left off “again”. It could happen “again” – as it did with WaSP/standardistas promoting the idea that HTML was inferior/messy/sloppy/not a web standard. But I’d like to think we’ve all learned from that experience.

You’ll have to quote people and specifics there. Right now you are disparaging a fair number of folks, with no evidence. I’m not sure that this is a particularly good strategy for promoting engagement.

To get web developers to adopt practices with no obvious benefits (eg using XHTML1 as text/html) an advocacy group is required.

I’m really really not sure why so many folks get so het up about XHTML syntax served as text/html. Really. It really comes across as a kind of obsession.

In terms of benefits of the syntax, here you go – Simpler, cleaner syntax (empty elements are closed differently from non-empty elements, attributes are quoted, non-empty elements need to be closed), deprecation of presentational markup. A transitional approach to an extensible language. Baby steps. Cleaning up out-of-date aspects of HTML 4. No obvious benefits? A matter of opinion.

But, I will take some convincing that without the web standards “movement” we’d have anything like the open web we have now. So before you disparage such folks without even bothering to quote a single line by a single person, you should perhaps consider whether there’d be even the opportunity for an HTML 5 effort without them. Hint – how much attention are the developers of the most widely used browser paying to the effort?

To get web developers to adopt practices with obvious benefits (eg using new functionality in HTML5) no advocacy group is required.

Sadly, the history of the world is littered with better technologies not adopted. It pays to consider why. Advocacy and interests. Who is advocating for X/HTML5? And in whose interest is it? These are not rhetorical questions.

Comment by mattur

You’ll have to quote people and specifics there. Right now you are disparaging a fair number of folks, with no evidence.

You think standardistas have *not* been scathing about HTML?! Where’ve you been for the last decade, an alternate universe? :-)

We’re *still* seeing it with references to HTML5 as “a toddler specification” or “HobbyText Mockup Language” by leading standardistas. I’m not naming names.

I think we would have had an HTML5 effort even without the “Web Standards (except HTML4.01) Movement”, possibly a lot sooner, and quite possibly web markup standards would not have been frozen for a decade. If WaSP/standardistas had campaigned for web standards generally instead of focusing on XHTML, there wouldn’t be all the sturm und drang now about the demise of XHTML2.

deprecation of presentational markup

…and that will be one of the myths standardistas use(d) to disparage HTML ;-)

No obvious benefits? A matter of opinion.

Even the W3C (now) says: “[authors] get no reward for [using XHTML1], beyond the rather theoretical satisfaction of creating well-formed content.”

Who is advocating for X/HTML5?

Who is advocating for Javascript? Or XHR? Or RSS? Or PNG? Who advocated for HTML0/2/3/4? Or SSL? Or cookies? If it performs a useful function, adoption will follow support. If it doesn’t perform a useful function, an advocacy group is required :-)

Comment by Michael

My big issue with all these new element names is I’m not entirely sure why we are changing something like div class="footer" to div role="footer". Are these changes something that will help our users, or are we just standardizing a language, in which case – why is this a ‘role’ and not its own element?

Other than saving 5-6 characters when typing, what’s the benefit of using vs for developers, and for users?

Comment by Netmosfera

I’m tired of saying that html5 is a mess. Nobody will listen to you. But I’m persevering.

A lot of random issues that we can talk about:

aside is tangential but related content, and it does introduce a new section,
and, if it is a navigation menu, I should write:

[aside]
[nav][/nav]
[/aside]

but hey, I don’t need two nested sections. Basically, a specific section tag can’t have two roles.

- an article can’t be a nav
- but an article can be an aside
- and a nav can be an aside
- the body can be an article (read below)

[article] is useless: if the [body] –IS– the article, I’m obliged to introduce one unwanted section

[body][article][/article][/body]

and, after years, there is still no way to clearly distinguish SITE TITLE and DOCUMENT TITLE
read this: http://www.w3.org/Bugs/Public/show_bug.cgi?id=14540#c2

Every element is supposed to contain only data that is semantically related to it, but that is not the case:

[aside]
[!-- advertising banners here --]
[h1]Site Navigation[/h1]
[ul][/ul]
[/aside]

are the banners related? I think not.