HTML5 articles and sections: what’s the difference?
An article is an independent, stand-alone piece of discrete content. Think of a blog post, or a news item.
Consider this real-world article:
<article>
<h1>Bruce Lawson is World's Sexiest Man</h1>
<p>Legions of lovely ladies voted luscious lothario Lawson as the World's Sexiest Man today.</p>
<h2>Second-sexiest man concedes defeat</h2>
<p>Remington Sharp, jQuery glamourpuss and Brighton roister-doister, was gracious in defeat. "It's cool being the second sexiest man when number one is Awesome Lawson" he said, from his swimming pool-sized jacuzzi full of supermodels.</p>
</article>
It could be syndicated, either by RSS or other means, and makes sense without further contextualisation. Just as you can syndicate partial feeds, a “teaser” article is still an article:
<article>
<a href=full-story.html>
<h1>Bruce Lawson is World's Sexiest Man</h1>
<p>Legions of lovely ladies voted luscious lothario Lawson as the World's Sexiest Man today.</p>
<p>Read more</p>
</a>
</article>
Other articles can be nested inside an article, for example a transcript to a video:
<article>
<h1>Stars celebrate Bruce Lawson</h1>
<video>…</video>
<article class=transcript>
<h1>Transcript</h1>
<p>Priyanka Chopra: "He's so hunky!"</p>
<p>Konnie Huq: "He's a snogtabulous bundle of gorgeous manhood! And I saw him first, Piggy Chops!"</p>
</article>
</article>
The transcript is complete in itself, even though it’s related to the video in the outer article. The spec says “When article elements are nested, the inner article elements represent articles that are in principle related to the contents of the outer article.”
section
Section, on the other hand, isn’t “a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable”. It’s either a way of sectioning a page into different subject areas, or sectioning an article into … well, sections.
Consider this article:
<article>
<h1>Important legal stuff</h1>
<h2>Carrots</h2>
<p>Thingie thingie lah lah</p>
<h2>Parsnips</h2>
<p>Thingie thingie lah lah</p>
<h2>A turnip for the books</h2>
<p>Thingie thingie lah lah</p>
<strong>Vital caveat about the information above!</strong>
</article>
Does the “vital caveat about the information above” refer to the whole article, eg everything under the introductory h1, or does it refer only to the information under the preceding h2 (“A turnip for the books”)? In HTML4, there is no way to tell. In HTML5, the section element makes its meaning unambiguous (and therefore, more “semantic”):
<article>
<h1>Important legal stuff</h1>
<section>
<h2>Carrots</h2>
<p>Thingie thingie lah lah</p>
</section>
<section>
<h2>Parsnips</h2>
<p>Thingie thingie lah lah</p>
</section>
<section>
<h2>A turnip for the books</h2>
<p>Thingie thingie lah lah</p>
</section>
<strong>Vital caveat about the information above!</strong>
</article>
Now we can see that the vital caveat refers to the whole article. If it had been inside the final section element, it would unambiguously refer to that section alone. It would not have been correct to divide up this article with nested article elements, as they would not be independent discrete entities, which is why we used the section element.
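For comparison, here is a sketch of the alternative just described: the same placeholder article, but with the caveat moved inside the final section element, so that it applies to turnips only.

```html
<article>
<h1>Important legal stuff</h1>
<section>
<h2>Carrots</h2>
<p>Thingie thingie lah lah</p>
</section>
<section>
<h2>Parsnips</h2>
<p>Thingie thingie lah lah</p>
</section>
<section>
<h2>A turnip for the books</h2>
<p>Thingie thingie lah lah</p>
<!-- Inside the last section: the caveat now covers this section alone -->
<strong>Vital caveat about the information above!</strong>
</section>
</article>
```

The only difference from the previous example is which element is the caveat's parent.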
OK. So we’ve seen that we can have article inside article and section inside article. But we can also have article inside section. What’s that all about then?
article inside section
Imagine that your content area is divided into two units, one for articles about llamas, the other for articles about root vegetables. (Or see today’s Guardian home page with its main news, a section of election picks, a section of “latest multimedia” etc).
You’re not obliged to mark up your llama articles separately from your root vegetable articles, but you want to demonstrate that the two groups are thematically distinct, and perhaps you want them in separate columns, or you’ll use CSS and JavaScript to make a tabbed interface. In HTML4, you’d use our good but meaningless friend div. In HTML5, you use section which, like article, invokes the HTML5 outlining algorithm, while div doesn’t because it has no meaning. (A great read on the outlining algorithm is Lachlan Hunt’s A Preview of HTML 5):
<div role=main>
<section>
<h1>Articles about llamas</h1>
<article>
<h2>The daily llama: buddhism and South American camelids</h2>
<p>blah blah</p>
</article>
<article>
<h2>Shh! Do not alarm a llama</h2>
<p>blah blah</p>
</article>
</section>
<section>
<h1>Articles about root vegetables</h1>
<article>
<h2>Carrots: the orange miracle</h2>
<p>blah blah</p>
</article>
<article>
<h2>Swedes: don't eat people, eat root vegetables</h2>
<p>blah blah</p>
</article>
</section>
</div>
Why not article? Because, in this example, each section is a collection of independent entities, each of which could be syndicated—but you wouldn’t syndicate the collection as an individual entity.
Note that a section doesn’t need to be lots of articles; it could be a collection of paragraphs explaining your creative commons licensing, an author bio or a copyright notice. In our example, each article could contain sub-articles or sections, as explained above—or both.
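As a rough sketch of both points: the first fragment below is a section that holds plain paragraphs rather than articles, and the second is an article that contains sections as well as a nested article. The headings, text and the idea of a reader comment as the nested article are made up for illustration.

```html
<!-- A section that is not a collection of articles -->
<section>
<h1>Licensing</h1>
<p>Everything on this site is available under a Creative Commons licence.</p>
</section>

<!-- An article containing both sections and a sub-article -->
<article>
<h1>Carrots: the orange miracle</h1>
<section>
<h2>Growing</h2>
<p>blah blah</p>
</section>
<section>
<h2>Cooking</h2>
<p>blah blah</p>
</section>
<!-- Related to the outer article, but complete in itself -->
<article>
<h2>Reader comment</h2>
<p>blah blah</p>
</article>
</article>
```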
Finally, a conclusion!
Jeremy Keith writes that authors are confused about when to use the two elements. I think the name article is a cause of confusion; perhaps post or entry or even story would be more intuitive if you’re thinking about blog or news sites (although not all sites are like that, of course).
But I disagree that the two elements are so similar that they should be amalgamated. Jeremy writes:
the only thing that distinguishes the definition of article from the definition of section is the presence of the phrase “self-contained”. A section groups together thematically-related content. An article groups together self-contained thematically-related content. That distinction is too fine to warrant a separate element, in my opinion.
I agree that the difference between them is the “self-contained”ness. But, personally, I find it pretty easy to work out whether something is self-contained or not and have tried to explain it above. Your comments will hopefully let me know if I’ve explained it clearly enough. (I think it’s very tough explaining it in the terse language required in normative sections of a specification).
It seems to me that any brand-new elements take time to learn; people can’t be expected to grasp the difference between them immediately, as if in a matching exercise. Dan Cederholm’s Simplequiz showed that in 2003 many of us didn’t understand HTML4 elements properly. How many of us would have chosen ol rather than ul, going only by the name and a single line from the spec, if asked for the most appropriate element for breadcrumb trails, or chosen dt as the most appropriate element for the speaker’s name in a dialogue (as the HTML4 spec wrongly specifies)? But seven years down the line, I imagine we all agree that it would have been wrong to amalgamate dl, ul and ol.
I also think the spec isn’t sufficiently clear (and emailed the Working Group): the definition for article says “The article element represents a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable, e.g. in syndication.”
This suggests that if you have a self-contained composition that you do not intend to be distributable via syndication, you shouldn’t use article.
Section says “Authors are encouraged to use the article element instead of the section element when it would make sense to syndicate the contents of the element” – here, the intent of syndication is diluted into “it would make sense to syndicate the content”.
I suggest that article be amended to say something similar, eg “The article element represents a self-contained composition in a document, page, application, or site which would make sense if independently distributed or reused, e.g. in syndication.” so that the two mentions of article match.
If we didn’t have an article element, we’d be left with lots of different riffs on section class=article, section class=story or section class=post, which is what HTML5 tries to avoid.
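To illustrate, here is a sketch of how three different authors might mark up the same kind of content without a dedicated element, next to the single article equivalent. The class names are the ones mentioned above; the headline is a placeholder borrowed from the llama example.

```html
<!-- Without article: each author invents a different convention -->
<section class=article>
<h1>Shh! Do not alarm a llama</h1>
<p>blah blah</p>
</section>

<section class=story>
<h1>Shh! Do not alarm a llama</h1>
<p>blah blah</p>
</section>

<section class=post>
<h1>Shh! Do not alarm a llama</h1>
<p>blah blah</p>
</section>

<!-- With article: one element that every author and tool can agree on -->
<article>
<h1>Shh! Do not alarm a llama</h1>
<p>blah blah</p>
</article>
```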
18 Responses to “HTML5 articles and sections: what’s the difference?”
This is a great explanation, Bruce. I think both are necessary and I’ve begun using them already for some of my own personal projects.
I see great potential for an expansion or reimagining of RSS, that if you enabled it, you could simply crawl your site for <article> blocks and syndicate them automagically.
Loving HTML5.
Thanks Bruce,
That’s cleared up a few problems I was having trying to decide when which should be used.
I’m using article in a search engine for each result. I use section to contain the collection of results.
I was confused by article as well. The name seems to be restrictive. But then I got over the connection between blogs and news articles.
I’m still alarmed by the way that HTML5 seems to have decided to turn the hypertext markup language into the blog markup language.
While it is clear that many kinds of web page have sections, it’s certainly not clear that many kinds of web pages – aside from blogs and news portals – have parts of their content that could be described as ‘articles’. And even in those contexts, a single article might actually be larger than a single page. Any other use of ‘article’ is using it to mark up something that, while it’s not actually, really, an article, strictly, it kind of can be treated in the same way as an article. That’s just bad naming. Like having a fruit markup language that declares an ‘apple’ element, and says ‘use it for solid, round fruit, such as apples, and oranges. Can even apply to bananas when they’re viewed from a certain angle’.
I suppose that online stores have ‘articles’, but that’s actually misleading – an article on the page about an article for sale might legitimately be marked up as an article by virtue of it being an article, not by virtue of being about an article.
An article can be larger than a single page, as well as smaller. Can the article element accurately capture the fact that this is a subsection within a larger article, which itself might make sense in isolation?
I just think something less focused on the publication of textual, journalistically presented items might have made a better choice. ‘item’, ‘contentItem’, or similar seems to sum up the intent of ‘article’, without the bloggy, newsy, magaziney overtones.
Hmm… I think James has a good point, especially when articles break over multiple pages.
Obviously, all sites have a distinction between sections of content, and standalone pieces of content that could be pulled out and exist on their own. Maybe the ‘article’ element needs a permalink attribute, so that excerpted versions and multipage versions could link to the full version?
One last question: How is ‘section’ different from what ‘div’ is supposed to be?
I just think of a “section” as being bigger than an “article.” Referring to http://www.usconstitution.net/const.html to remind me in the future that my thinking is narrow.
We need a mnemonic!
Good post, but only one area still confuses me:
The article element represents a self-contained composition in a document, page, application, or site which would make sense if independently distributed or reused, e.g. in syndication
You said that article could be post or story, but because the spec says that article also covers an application it confuses matters.
For example, http://gist.github.com – if you’re logged in, you see a sidebar of saved gists – that’s fine, it’s a sidebar element, but the main content is the text box – the application (or perhaps if it was more of a stand alone widget on the page). The spec would suggest to put this in to an article element.
Currently I don’t know. For apps like http://jsbin.com I’m using divs over sections and articles. In fact, until I understand it completely, I’m using article and section for content based markup and divs for non-content, non-semantic elements – and I’m including widgets and applications as not having semantics (obviously they would within them, but not at the top level).
- Remy, the Brighton roister-doister
Bruce asked me to name an example usage for the article element within an application. A simple example would be individual emails within an email application. They’re a component of the application and can be independently re-used. Same goes for items in a feed reader.
I don’t think the definition really goes for the text box in Remy’s example. It’s a component of the application for sure, but it does not make much sense on its own.
hi there! thanks for making the difference between <article> and <section> clear, now that’s really much better!
i have some questions here though about other uses of <article> and <section>, perhaps we can discuss about this a bit?
1. let’s say we have an e-commerce website, and there is this product listing page that we’re showing different items. does it make sense if i use for wrapping a list of products that uses to list down the products which has long description for each?
Soap Brand 1: Long description
Soap Brand 2: Long description
Soap Brand 3: Long description
Or is strictly used for blogs or news, that it doesn’t have a place in e-commerce or other types of website? Or perhaps a still makes a better sense here?
2. let’s say we have a sidebar now, . we have different sections for the sidebar, such as
Blogroll
Twitter Updates
Related Posts
Is suitable for such usage in this situation?
My problem with the new article tag is that “article” seems to suggests text.
When it comes to syndication, there are plenty of content types that could be syndicated, like “event”, “address” or even video. It sounds strange to wrap them in an article tag because they are in no way articles as we know them in real life.
Section is a more generic term that doesn’t have that same real life connotation, so for me it would feel more natural to mark up an event with the section tag.
HTML5 is still not popular in China, it’s nice to get some knowledge here.
article sounds a lot like a “type” of section.
thanks for the writeup! It really helps to clear up confusion. My vote would have been to name article, item. Much more generic and applicable to things other than text.
I’m in agreement with Comment by Jason Rhodes. Except for a few mentions on this page, all the web discussions of HTML5’s sectioning agents have centered on blog posts and news articles. Granted, we live in an age of two-paragraph chunked text, but some lengthier multi-page presentations still occur.
Offering a short story or a chapter spread across several page views, neither work having “natural” or “conceptual” internal headings, seems beyond the scope of <article> and <section>. My best guess at a solution for screen readers is to make each page a <section> with an hgroup made up of an <h1>title, an <h2>page 3 of 9. Whether the hgroup is hidden off-screen with margin-9999 or not, it will need to be there.
Would this be pretty close to best-practice? Or, whatever will eventually become best-practice?
Brilliant! Thank you for the very clear explanation of this issue, Bruce.