Archive for the 'HTML5' Category

Live Regions resources

Yesterday I asked “What’s the most up-to-date info on aria-live regions (and <output>) support in AT?” for some client work I’m doing. As usual, Twitter was responsive and helpful.

Heydon replied

Should be fine, support is good for live regions. Not sure about output, though … Oh, you’re adding the p _with_ the other XHR content? That will have mixed results in my experience.

Brennan said

I’ve seen some failed announcements with live-regions on VoiceOver, especially with iframes. (Announcement of the title seems to kill any pending live content). output has surprisingly good support but (IIRC) is not live by default on at least one browser (IE, I think).
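A minimal sketch of the pattern these replies point towards, assuming the safest approach is to have the live region in the page from the start and only change its text later. The ids, the button and the messages are made up for illustration, and the explicit aria-live on <output> is a belt-and-braces assumption for browsers where it isn't live by default.

```html
<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8">
<title>Live region sketch</title>

<!-- The empty live region exists at page load, so assistive technology
     is already watching it when its text changes. -->
<p id="status" role="status" aria-live="polite"></p>

<!-- <output> maps to role=status; aria-live is repeated explicitly as an
     assumption, in case a browser does not treat it as live by default. -->
<output id="result" aria-live="polite"></output>

<button type="button" id="save">Save</button>

<script>
  // Hypothetical handler: update the text inside the existing regions
  // rather than injecting new live-region elements along with the content.
  document.getElementById('save').addEventListener('click', function () {
    document.getElementById('status').textContent = 'Changes saved.';
    document.getElementById('result').textContent = '3 items updated';
  });
</script>
</html>
```

Test it in the screen readers your users actually have; as the replies above suggest, results vary.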

Some more resources people pointed me to:

No to Service Worker, Yes to Wank!

My chum and co-author Remington Sharp tweeted

We need a universally recognised icon/image/logo for "works offline".

Like the PWA or HTML5 logo. We need to be able to signal to visitors that our URLs are always available.

To the consumer, the terms Progressive Web App or Service Worker are meaningless. So I applied my legendary branding, PR and design skills to come up with something that will really resonate with a web user: the fact that this app works online, offline – anywhere.

So the new logo is a riff on the HTML5 logo, because this is purely web technologies. It has the shield, a wifi symbol on one side and a crossed out wifi symbol on the other, and a happy smile below to show that it’s happy both on and offline. Above it is the acronym “wank” which, of course, stands for “Works anywhere—no kidding!”

Take it to use on your sites. I give the fruits of my labour and creativity free, as a gift to humanity.

Accessible data tables

I’ve been working on a client project and one of the tasks was remediating some data tables. As I began researching the subject, it became obvious that most of the practical, tested advice comes from my old mates Steve Faulkner and Adrian Roselli.

I’ve collated them here so they’re in one place when I need to do this again, and in case you’re doing similar. But all the glory belongs with them. Buy them a beer when you next see them.

Making accessible tagged PDFs with Prince

Love them or hate them, PDFs are a fact of life for many organisations. If you produce PDFs, you should make them accessible to people with disabilities. With Prince, it’s easy to produce accessible, tagged PDFs from semantic HTML, CSS and SVG.

It’s an enduring myth that PDF is an inaccessible format. In 2012, the PDF profile PDF/UA (for ‘Universal Accessibility’) was standardised. It’s the U.S. Library of Congress’ preferred format for page-oriented content and the International Standard for accessible PDF technology, ISO 14289.

Let’s look at how to make accessible PDFs with Prince. Even if you already have Prince installed, grab Prince 13 and install it; it’s a free license for non-commercial use. Prince is available for Windows, Mac, Linux and FreeBSD desktops, and wrappers are available for Java, C#/.NET, ActiveX/COM, PHP, Ruby on Rails and Node/JavaScript for integrating Prince into websites and applications.

Here’s a trivial HTML file, which I’ve called prince1.html.

<!DOCTYPE html>
<html>
<meta charset=utf-8>
<title>My lovely PDF</title>
<style>
        h1 {color:red;}
        p {color:green;}
</style>
<h1>Lovely heading</h1>
<p>Marvellous paragraph!</p>
</html>

From the command line, type

$ prince prince1.html

Prince has produced prince1.pdf in the same folder. (There are many command line switches to choose the name of the output file, combine files into a single PDF etc., but that’s not relevant here. Windows fans can also use a GUI.)

Using Adobe Acrobat Pro, I can inspect the tag structure of the PDF produced:

Acrobat screenshot: no tags available

As you can see, Acrobat reports “No Tags available”. This is because it’s perfectly legitimate to make inaccessible PDFs – documents intended only for printing, for example. So let’s tell Prince to make a tagged PDF:

$ prince prince1.html --tagged-pdf

Inspecting this file in Acrobat shows the tag structure:

Acrobat screenshot showing tags

Now we can see that under the <Document> tag (PDF’s equivalent of a <body> element), we have an <H1> and a <P>. Yes, PDF tags often —but not always— have the same name as their HTML counterparts. As Adobe says

PDF tags are similar to tags used in HTML to make Web pages more accessible. The World Wide Web Consortium (W3C) did pioneering work with HTML tags to incorporate the document structure that was needed for accessibility as the HTML standard evolved.

However, the fact that the PDF now has structural tags doesn’t mean it’s accessible. Let’s try making a PDF with the PDF/UA profile:

$ prince prince1.html --pdf-profile="PDF/UA-1"

Prince aborts, giving the error “prince: error: PDF/UA-1 requires language specification”. This is because our HTML page is missing the lang attribute on the HTML element, which tells assistive technologies which language the text is written in. This is very important to screen reader users, for example; the pronunciation of the word “six” is very different in English and French.

Unfortunately, this is a very common error on the Web; WebAIM recently analysed the accessibility of the top 1,000,000 home pages and discovered that a whopping 97.8% of home pages had detectable accessibility failures. A missing language specification was the fifth most common error, affecting 33% of sites.

screenshot from webaim showing most common accessibility errors on top million homepages
Image courtesy of webaim.org, © WebAIM, used by kind permission

Let’s fix our web page by amending the HTML element to read <html lang=en>.

Now it princifies without errors. Inspecting it in Acrobat Pro, we see a new <Annot> tag has appeared. Right-clicking on it in the tag inspector reveals it to be the small Prince logo image (that all free licenses generate), with alternate text “This document was created with Prince, a great way of getting web content onto paper”:

Acrobat screenshot with annotation on the Prince logo added with free licenses

Generating the <Annot> with alternate text and checking that the document’s language is specified are what allow us to produce a fully accessible PDF, which is why we generally advise using the --pdf-profile="PDF/UA-1" command line switch rather than --tagged-pdf.

Adobe maintains a list of Standard PDF tags, most of which can easily be mapped by Prince to HTML counterparts.

Customising Prince’s default mappings

Prince can’t always map HTML directly to PDF tags. This could be because there isn’t a direct counterpart in HTML, or it could be because the source markup has conflicting markup and styling.

Let’s look at the first scenario. HTML has a <main> element, which doesn’t have a one-to-one correspondence with a single PDF tag. On many sites, there is one article per document (a Wikipedia entry, for example), and it’s wrapped by a <main> element, or some other element serving to wrap the main content.

Let’s look at the Wikipedia article for stegosaurus, because it is the best dinosaur.

We can see from browser developer tools that this article’s content is wrapped with <div id="bodyContent">. We can tell Prince to map this to the PDF <Art> tag, defined as “Article element. A self-contained body of text considered to be a single narrative”, by adding a declaration in our stylesheet:

#bodyContent { prince-pdf-tag-type: Art; }

On another site, we might want to map the <main> element to <Art>. The same method applies:

main { prince-pdf-tag-type: Art; }

Different authors’ conventions over the years are one reason why Prince can’t necessarily map everything automatically (although, by default, HTML <article> gets mapped to <Art>).

Therefore, in this new build of PrinceXML, much of the mapping of HTML elements to PDF tags has been moved out of the logic of Prince and into the default stylesheet html.css in the style sub-folder. This makes it clearer how Prince maps HTML elements to PDF tags, and allows the author to override or customise it if necessary.

Here is the relevant section of the default mappings:

article { prince-pdf-tag-type: Art }
section { prince-pdf-tag-type: Sect }
blockquote { prince-pdf-tag-type: BlockQuote }
h1 { prince-pdf-tag-type: H1 }
h2 { prince-pdf-tag-type: H2 }
h3 { prince-pdf-tag-type: H3 }
h4 { prince-pdf-tag-type: H4 }
h5 { prince-pdf-tag-type: H5 }
h6 { prince-pdf-tag-type: H6 }
ol { prince-pdf-tag-type: OL }
ul { prince-pdf-tag-type: UL }
li { prince-pdf-tag-type: LI }
dl { prince-pdf-tag-type: DL }
dl > div { prince-pdf-tag-type: DL-Div }
dt { prince-pdf-tag-type: DT }
dd { prince-pdf-tag-type: DD }
figure { prince-pdf-tag-type: Div } /* figure grouper */
figcaption { prince-pdf-tag-type: Caption }
p { prince-pdf-tag-type: P }
q { prince-pdf-tag-type: Quote }
code { prince-pdf-tag-type: Code }
img, input[type="image"] {
    prince-pdf-tag-type: Figure;
    prince-alt-text: attr(alt);
}
abbr, acronym {
    prince-expansion-text: attr(title)
}

There are also two new properties, prince-alt-text and prince-expansion-text, which can be overridden to support the relevant ARIA attributes.
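As a hedged sketch of such an override: the rules below assume that attr() can read aria-label in the same way the default stylesheet reads alt, which is worth checking against the Prince documentation before relying on it. The file name otter.png and the label text are made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8">
<title>Alternate text from ARIA</title>
<style>
  /* Assumption: attr() accepts aria-label just as html.css uses attr(alt). */
  img[aria-label] { prince-alt-text: attr(aria-label); }

  /* An element that only acts as an image through ARIA gets tagged as a
     Figure, with its label used as the alternate text. */
  [role="img"][aria-label] {
    prince-pdf-tag-type: Figure;
    prince-alt-text: attr(aria-label);
  }
</style>
<h1>Otters</h1>
<!-- otter.png is a hypothetical file name -->
<img src="otter.png" aria-label="An otter floating on its back">
<div role="img" aria-label="Five stars out of five">★★★★★</div>
</html>
```

Inspect the result in the Acrobat tag tree, as earlier, to confirm the alternate text arrived.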

Uncle Hakon shouting at me in Paris
Uncle Håkon shouting at me last month in Paris

Taking our lead from wikipedia again, we might want to produce a PDF table of contents from the ‘Contents’ box. Here is the Contents for the entry about otters (which are the best non-dinosaurs):

screenshot of wikipedia's in-page table of contents

The box is an unordered list inside a <div id="toc">. To make this into a PDF Table of Contents (<TOC>), I add these lines to Prince’s html.css (because obviously I can’t touch the Wikipedia source files):

#toc ul {prince-pdf-tag-type: TOC;} /*Table of Contents */
#toc li {prince-pdf-tag-type: TOCI;} /* TOC item */

This produces the following tag structure:

Acrobat screenshot showing PDF table of contents based on the wikipedia table of contents

On one of my personal sites, I use HTML <nav> as the wrapper for my internal navigation, so would use these declarations instead:

nav ul {prince-pdf-tag-type: TOC;}
nav li {prince-pdf-tag-type: TOCI;}

Only internal links are appropriate for a PDF Table of Contents, which is why Prince can’t automatically map <nav> to <TOC> but makes it easy for you to do so, either by editing html.css directly, or by pulling in a supplementary stylesheet.

Mapping when semantics and styling conflict

There are a number of tricky questions when it comes to tagging when markup and style conflict. For example, consider this markup which is used to “fake” a bulleted list visually:


<!DOCTYPE html>
<html lang=en>
<meta charset=utf-8>
<title>My lovely PDF</title>
<style>
div div {display:list-item;
    list-style-type: disc;
    list-style-position: inside;}
</style>
<div>

    <div>One</div>
    <div>Two</div>
    <div>Three</div>

</div>

Browsers render it something like this:

what looks like a bulleted list in a browser

But this merely looks like a bulleted list; it isn’t structurally anything other than three meaningless <div>s. If you need this to be tagged in the output PDF as a list (so a screen reader user can use a keyboard shortcut to jump from list to list, for example), you can use these lines of CSS:

body>div {prince-pdf-tag-type: UL;}
div div {prince-pdf-tag-type: LI;}

Prince creates custom OL-L and UL-L tags which are role-mapped to PDF’s list structure tag <L>. Prince also sets the ListNumbering attribute when it can infer it.

Mapping ARIA roles

Often, developers supplement their HTML with ARIA roles. This can be particularly useful when retrofitting legacy markup to be accessible, especially when that markup contains few semantic elements — the usual example is adding role=button to a set of nested <div>s that are styled to look like a button.

Prince does not do anything special with ARIA roles, partly because, as WebAIM reports,

they are often used to override correct HTML semantics and thus present incorrect information or interactions to screen reader users

But by supplementing Prince’s html.css, an author can map elements with specific ARIA roles to PDF tags. For example, if your webpage has many <div role="article"> elements, you can map these to PDF <Art> tags thus:

div[role="article"] {prince-pdf-tag-type: Art;}

Conclusion

As with HTML, the more structured and semantic the markup is, the better the output will be. But of course, Prince cannot verify that alternate text is an accurate description of the function of an image. Ultimately claiming that a document meets the PDF/UA-1 profile actually requires some human review, so Prince has to trust that the author has done their part in terms of making the input intelligible. Using Prince, it’s very easy to turn long documents —even whole books— into accessible and attractive PDFs.

Structured data and Google

Domain-specific markup for fun and profit

It didn’t come as a surprise to Dull Old Web Farts (DOWFs) like me to learn last month that Google gives a search boost to sites that use structured data (as well as rewarding sites for being performant and mobile-friendly). Google has brilliant heuristics for analysing the content of sites, but developers being explicit and marking up their content using subject-specific vocabularies means more robust results.

For the first time (to my knowledge), Google has published some numbers on how structured data affects business. The headlines:

  • Jobrapido’s overall organic traffic grew by 115%, and they have seen a 270% increase in new user registrations from organic traffic
  • After the launch of job posting structured data, Google organic traffic to ZipRecruiter job pages converted at a rate three times higher than organic traffic from other search engines. The Google organic conversion rate on job pages was also more than 4.5 times higher than it had been previously, and the bounce rate for Google visitors to job pages dropped by over 10%.
  • In the month following implementation, Eventbrite saw roughly a 100-percent increase in the typical year-over-year growth of traffic from Google Search
  • Traffic to all Rakuten Recipe pages from search engines soared 2.7 times, and the average session duration was now 1.5 times longer than before.

Impressive, indeed. So how do you do it? For this site, I chose a vocabulary from schema.org:

These vocabularies cover entities, relationships between entities and actions, and can easily be extended through a well-documented extension model. Over 10 million sites use Schema.org to markup their web pages and email messages. Many applications from Google, Microsoft, Pinterest, Yandex and others already use these vocabularies to power rich, extensible experiences.

Because this is a blog, I chose the BlogPosting schema, and I use the HTML5 microdata syntax. So each article is marked up like this:

<article itemscope itemtype="http://schema.org/BlogPosting">
  <header>
  <h2 itemprop="headline" id="post-11378">The HTML Treasure Hunt</h2>
  <time itemprop="dateCreated pubdate datePublished" 
    datetime="2019-05-20">Monday 20 May 2019</time>
  </header>
    ...
</article>

The values for the microdata attributes are specified in the schema vocabulary, except the pubdate value on itemprop, which isn’t from schema.org but is required by Apple for watchOS because, well, Apple likes to be different.

And that’s basically it. All of this, of course, is taken care of by one WordPress template, so it’s automatic.

Metadata partial copy-paste necrosis for misery and loss

One thing puzzles me, however; Google documentation says that Google Search supports structured data in any of three formats (JSON-LD, RDFa and microdata), but notes “Google recommends using JSON-LD for structured data whenever possible”.

However, no reason is given for preferring JSON-LD except “Google can read JSON-LD data when it is dynamically injected into the page’s contents, such as by JavaScript code or embedded widgets in your content management system”. I guess this could be an advantage, but one of the other “features” of JSON-LD is, in my opinion, a bug:

The markup is not interleaved with the user-visible text

I strongly feel that metadata that is separated from the user-visible data associated with it is highly susceptible to metadata partial copy-paste necrosis. User-visible text is also developer-visible text. When devs copy/paste that, it’s very easy to forget to copy any associated metadata that’s not interleaved, leading to errors. (And Google will penalise errors: structured data will not show up in search results if “The structured data is not representative of the main content of the page, or is potentially misleading”.)
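To make the difference concrete, here is a rough sketch of the same post metadata in both syntaxes. The microdata version is trimmed from the template earlier in this post; the JSON-LD version is an illustrative equivalent written for this comparison, not markup taken from this site.

```html
<!-- Microdata: the metadata travels with the visible text.
     Copy the <article> and the properties come along with it. -->
<article itemscope itemtype="http://schema.org/BlogPosting">
  <h2 itemprop="headline">The HTML Treasure Hunt</h2>
  <time itemprop="datePublished" datetime="2019-05-20">Monday 20 May 2019</time>
</article>

<!-- JSON-LD alternative: the same facts live in a separate block.
     Copy only the <article> and the metadata is left behind; edit only
     the visible text and the metadata silently goes stale. -->
<article>
  <h2>The HTML Treasure Hunt</h2>
  <time datetime="2019-05-20">Monday 20 May 2019</time>
</article>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "The HTML Treasure Hunt",
  "datePublished": "2019-05-20"
}
</script>
```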

An example of metadata partial copy-paste necrosis can be seen in the commonly-recommended accessible form pattern:

<label for="my-input">Your name:</label>
<input id="my-input"/>
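One plausible way this goes wrong (an illustration added here, not part of the original pattern): the block is pasted for a second field, the visible text is edited, and the for/id pair is forgotten.

```html
<label for="my-input">Your name:</label>
<input id="my-input">

<!-- Pasted copy: the label text was changed, the for/id pair was not.
     The id is now duplicated, and this label points at the first input,
     so clicking "Your email:" focuses the name field. -->
<label for="my-input">Your email:</label>
<input id="my-input" type="email">
```

Nothing looks wrong on screen, which is exactly why separated metadata is easy to get wrong.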

As Thomas Caspars wrote

I’ve contacted chums in Google to ask why JSON-LD is preferred, but had no reply. (I may go as far as trying to “reach out” next time.)

Andrew wrote

I’m pretty sure Google prefers JSON-LD over microdata because it’s easier for them to stealborrow the data for their own use in that format. When I was working on a screen-scraping project a few years ago, I found that to be the case. Since then, I’ve come to believe that schema.org is really about making it easier for the big guys to profit from data collection instead of helping site owners improve their SEO. But I’m probably just being a conspiracy theorist.

Speculation and conspiracy theories aside, until there’s a clear reason why I should use JSON-LD over interleaved microdata, I’m keeping it as it is.

Google replies

Updated 23 May: Dan Brickley, a Google employee who is Lord of Schema.org, wrote this thread on Twitter:

On Smart TVs

When I was doing developer relations at Opera, I did everything I could to avoid having to go near the Opera TV part of the business – which was basically an app store of HTML5 websites for “Smart” TVs. This was for three reasons. Firstly, the world of Smart TVs was a world of closed standards. Secondly, as Patrick Lauke wrote, the chips in the early Smart TVs were cheap and crappy, which seriously crippled the web experience.

But the main reason was that I felt Smart TVs were a solution looking for a problem.

TVs are big, beautiful screens and good speakers, sitting in a social space. No-one wants to surf Facebook on a screen that mum and dad are also watching, especially controlling it with arrows on a remote control unit.

A cheap device like a Chromecast allows people to control the content with a portable device they already have and know how to use – be it a laptop, tablet or phone, and see the movie/ photos on a big screen with family and friends. TVs are meant to be dumb; your phone is smart and a little USB gizmo connects the two.

So why did Opera have a B2B division developing Smart TV offerings? (And after the dismemberment of Opera, it continues as Vewd.) The answer: because TV set manufacturers wanted Smart TV, so would throw money at us. But why did the set manufacturers want it?

A news report last week made it clear: it’s about collecting data off users, hitting the network a lot to phone that data back to HQ, then “monetizing” it, in some cases, advertising to viewers. That helps reduce the cost at sale of TVs. As the Verge puts it: Taking the smarts out of smart TVs would make them more expensive. From their interview with TV manufacturer Vizio’s CTO Bill Baxter:

So look, it’s not just about data collection. It’s about post-purchase monetization of the TV.

This is a cutthroat industry. It’s a 6-percent margin industry, right? I mean, you know it’s pretty ruthless. You could say it’s self-inflicted, or you could say there’s a greater strategy going on here, and there is. The greater strategy is I really don’t need to make money off of the TV. I need to cover my cost.

…there are ways to monetize that TV and data is one, but not only the only one. It’s sort of like a business of singles and doubles, it’s not home runs, right? You make a little money here, a little money there. You sell some movies, you sell some TV shows, you sell some ads, you know.

I have a Smart TV (it’s difficult to find a new one that isn’t). I’ve never connected it to the web, for just this reason.

The practical value of semantic HTML

Bruce’s guide to writing HTML for JavaScript developers

It has come to my attention that many in the web standards gang are feeling grumpy about some Full Stack Developers’ lack of deep knowledge about HTML. One well-intentioned article about 10 things to learn for becoming a solid full-stack JavaScript developer said

As for HTML, there’s not much to learn right away and you can kind of learn as you go, but before making your first templates, know the difference between in-line elements like <span> and how they differ from block ones like <div>. This will save you a huge amount of headache when fiddling with your CSS code.

This riled me too. But, as it’s Consumerfest and goodwill to all is compulsory, I calmed down. And I don’t want to instigate a pile-on of the author of this piece; it’s indicative of an industry trend to regard HTML as a bit of an afterthought, once you’ve done the real work of learning and writing JavaScript. If the importance of good HTML isn’t well-understood by the newer breed of JavaScript developers, then it’s my job as a DOWF (Dull Old Web Fart) to explain it.

Gather round, Fullstack JavaScript Developers – together we’ll make your apps more usable, and my blood pressure lower.

What is ‘good’ HTML?

Firstly, let’s reach a definition of ‘good’ HTML. Many DOWFs used to get very irked about (X)HTML being well-formed: proper closing tags, quoted attributes and the like. Those days are gone. Sure, it’s good practice to validate your HTML, just like you lint your JavaScript (it can catch errors and make your code more maintainable), but browsers are very forgiving.

In fact, part of what we commonly call ‘HTML5’ is the Parsing Algorithm which is like an HTML ninja – incredibly powerful, yet rarely noticed. It ensures that all modern browsers construct the same DOM from the same HTML, regardless of whether the HTML is well-formed or not. It’s the greatest fillip to interoperability we’ve ever seen.

By ‘good’ HTML, I mean semantic HTML, a posh term for choosing the right HTML element for the content. This isn’t a philosophical exercise; it has directly observable practical benefits.

For example, consider the <button> element. Using this gives you some browser behaviour for free:

  • A button is focusable via the keyboard. I bet you, dear reader, know all the keyboard shortcuts for your IDE; it makes development much faster. Many people use only the keyboard when using a webpage. I do it because I have multiple sclerosis – therefore the fine motor control required to use a mouse can be difficult for me. My neighbour has arthritis, so she prefers to use the keyboard.
  • Buttons can be activated using the space bar or the enter key; you don’t have to remember to listen for these keypresses in your script.
  • Inside a <form>, it doesn’t even need JavaScript to work.

“What’s that?”, you say. “Everyone has JavaScript”. No, they don’t. Most people do, most of the time. But I guarantee you that everyone is without JavaScript sometimes.
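To make the three points about <button> concrete, here is a small sketch comparing it with a <div> dressed up as a button. The /subscribe URL, the ids and the field names are made up for illustration.

```html
<form action="/subscribe" method="post">
  <label for="email">Email</label>
  <input id="email" name="email" type="email">
  <!-- Focusable, activated with Space or Enter, and submits the form
       with no script at all. -->
  <button type="submit">Subscribe</button>
</form>

<!-- A div has to have all of that re-created by hand, and still does
     nothing at all when JavaScript is unavailable. -->
<div role="button" tabindex="0" id="fake-subscribe">Subscribe</div>
<script>
  var fake = document.getElementById('fake-subscribe');
  function submitForm() {
    document.forms[0].submit();
  }
  fake.addEventListener('click', submitForm);
  fake.addEventListener('keydown', function (event) {
    // Re-implement the keyboard activation that <button> gives for free.
    if (event.key === 'Enter' || event.key === ' ') {
      event.preventDefault();
      submitForm();
    }
  });
</script>
```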

Here’s another example: semantically linking a <label> to its associated <input> increases usability for a mouse-user or touch-screen user, because clicking in the label focusses into the input field. See this in action (and how to do it) on MDN.

This might be me, with my MS; it might be you, on a touch-screen device on a bumpy train, trying to check a checkbox. How much easier it is if the hit area also includes the label “uncheck to opt out of cancelling us not sending you spam forever”. (Compare the first and second identical-looking examples in a checkbox demo.)
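A minimal sketch of the two ways to associate a label with its control; the names and label text are made up for illustration.

```html
<!-- Explicit association: the label's for attribute matches the input's id -->
<input type="checkbox" id="newsletter" name="newsletter">
<label for="newsletter">Send me the newsletter</label>

<!-- Implicit association: the label wraps the input -->
<label>
  <input type="checkbox" name="offers">
  Send me special offers
</label>
```

In both cases, clicking or tapping the text toggles the checkbox, so the hit area is the whole label rather than a tiny box.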

But the point is that by choosing the right element for the job, you’re getting browser behaviour for free that makes your app more usable to a range of different people.

Invisible browser behaviours

With me so far? Good. The browser behaviours associated with the semantics of <button> and <label> are obvious once you know about them – because you can see them.

Other semantics aren’t so obvious to a sighted developer with a desktop machine and a nice big monitor, but they are incredibly useful for those who do need them. Let’s look at some of those.

HTML5 has some semantics that you can use for indicating regions on a page. For example, <nav>, <main>, <header>, <footer>.

If you wrap your main content – that is, the stuff that isn’t navigation, logo and main header etc – in a <main> tag, a screen reader user can jump immediately to it using a keyboard shortcut. Imagine how useful that is – they don’t have to listen to all the content before it, or tab through it to get to the main meat of your page.

And for people who don’t use a screenreader, that <main> element doesn’t get in the way. It has no default styling at all, so there’s nothing for you to remove. For those that need it, it simply works; for those that don’t need it, it’s entirely transparent.

Similarly, using <nav> for your primary navigation provides screenreader users with a shortcut key to jump to the navigation so they can continue exploring your marvellous site. You were probably going to wrap your navigation in a <div class="nav"> to position it and style it; why not choose <nav> instead (it’s shorter!) and make your site more usable to the 15% of the world who have a disability?

For more on this, I humbly point you to my 2014 post Should you use HTML5 header and footer?. A survey of screenreader users last year showed that 80% of respondents will use regions to navigate – but they can only do so if you choose to use them instead of wrapping everything in <div>s. Now you know they exist, why wouldn’t you use them?
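Put together, a page using these regions might look something like this sketch (the links and text are placeholders):

```html
<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8">
<title>Page regions</title>
<body>
<header>
  <a href="/">Site name</a>
  <!-- Screen reader users can jump straight to the navigation -->
  <nav>
    <ul>
      <li><a href="/articles/">Articles</a></li>
      <li><a href="/about/">About</a></li>
    </ul>
  </nav>
</header>
<!-- ...or skip past it, straight to the main content -->
<main>
  <h1>Page heading</h1>
  <p>The main meat of the page.</p>
</main>
<footer>
  <p>Copyright and contact details.</p>
</footer>
</body>
</html>
```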

Update: here’s a YouTube video of blind screenreader user Leonie Watson talking through how she navigates this site using the HTML semantics we’ve discussed.

YouTube video

New types of devices

We’re seeing more and more types of devices connecting to the web, and semantic HTML can help these devices display your content in a more usable way to their owners. And if your site is more usable than your competitors’, you win, and your boss will erect a massive gold statue of you in the office car park. (Trust me, your boss told me. They’ve already ordered the plinth.)

Let’s look at one example, the Apple Watch. Here are some screenshots and excerpts from the transcript of an Apple video introducing watchOS 5:

We’ve brought Reader to watchOS 5 where it automatically activates when following links to text heavy web pages. It’s important to ensure that Reader draws out the key parts of your web page by using semantic markup to reinforce the meaning and purpose of elements in the document. Let’s walk through an example. First, we indicate which parts of the page are the most important by wrapping it in an article tag.

diagram of an article element wrapping content on an Apple Watch

Specifically, enclosing these header elements inside the article ensure that they all appear in Reader. Reader also styles each header element differently depending on the value of its itemprop attribute. Using itemprop, we’re able to ensure that the author, publication date, title, and subheading are prominently featured.

Apple Watch diagram showing how it uses microdata attributes to layout and display information about an article

itemprop is an HTML5 microdata attribute. There are shared vocabularies documented at schema.org, which was founded by Google, Microsoft, Yahoo and Yandex. Using schema.org vocabularies with microdata can make your pages display better in search results:

Many applications from Google, Microsoft, Pinterest, Yandex and others already use these vocabularies to power rich, extensible experiences.

Update: Google published numbers on how using structured data boosts conversions.

(If you plan to put things into microdata, please note that Apple, being Apple, go their own way, and don’t use a schema.org vocabulary here. Le sigh. See my article Content needs a publication date! for more. Or view source on this page to see how I’m using microdata on this article.)

Apple watchOS also optimises display of items wrapped in <figure> elements:

Reader recognizes these tags and preserves their semantic styles. For this image, we use figure and figcaption elements to let the Reader know that the image is associated with the below caption. Reader then positions the image alongside its caption.

Apple Watch diagram showing how it lays out figures and captions if appropriately marked up

You probably know that HTML5 greatly increased the number of different <input> types. For example, <input type="email"> on a mobile device shows a keyboard with the “@” symbol and “.” that are in all email addresses; <input type="tel"> on a mobile device shows a numeric keypad.

On desktop browsers, where you have a physical keyboard, you may get different User Interface benefits, or built-in validation. You don’t need to build any of this; you simply choose the right semantic that best expresses the meaning of your content, and the browser will choose the best display, depending on the device it’s on.
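As a sketch, a contact form that leans on these types might look like this. The /contact URL, the ids and the field names are made up for illustration.

```html
<form action="/contact" method="post">
  <!-- Mobile keyboards show "@" and "."; browsers check the format -->
  <label for="contact-email">Email</label>
  <input id="contact-email" name="email" type="email" autocomplete="email">

  <!-- Mobile devices show a numeric keypad -->
  <label for="contact-tel">Phone</label>
  <input id="contact-tel" name="tel" type="tel" autocomplete="tel">

  <!-- Browsers that support it show their own date picker -->
  <label for="contact-date">Preferred date</label>
  <input id="contact-date" name="date" type="date">

  <button type="submit">Send</button>
</form>
```

Each control has a <label>, so a device that shows inputs full-screen has some context to display.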

In watchOS, input types take up the whole watch screen, so choosing the correct one is highly desirable.

First, choose the appropriate type attribute and element tag for your form controls.
WebKit supports a variety of form control types including passwords, numeric and telephone fields, date, time, and select menus. Choosing the most relevant type attribute allows WebKit to present the most appropriate interface to handle user input.

Secondly, it’s important to note that unlike iOS and macOS, input methods on watchOS require full-screen interaction. Label your form controls or specify aria label or placeholder attributes to provide additional context in the status bar when a full-screen input view is presented.

Apple Watch showing different full-screen keyboards for different email types

I didn’t choose to use <article> and itemprop and input types because I wanted to support the Apple Watch; I did it before the Apple Watch existed, in order that my code is future-proof. By choosing the right semantics now, a machine that I don’t know about yet can understand my content and display it in the best way for its users. You are opting out of this if you only use <div> or <span> because, by definition, they have “no special meaning at all”.

Summary

I hope I’ve shown you that choosing the correct HTML isn’t purely an academic exercise. Perhaps you can’t use all the above (and there’s much more that I haven’t discussed) but when you can, please consider whether there’s an HTML element that you could use to describe parts of your content.

For instance, when building components that emit banners and logos, consider wrapping them in <header> rather than <div>. If your styling relies upon classnames, <header class="header"> will work just as well as <div class="header">.

Semantic HTML will give usability benefits to many users, help to future-proof your work, potentially boost your search engine results, and help people with disabilities use your site.

And, best of all, thinking more about your HTML will stop Dull Old Web Farts like me moaning at you.

What’s not to love? Have a splendid holiday season, whatever you celebrate – and here’s to a semantic 2019!

More!

Editing the W3C HTML5 spec

On 3 November, the Queen and Uncle Timbo (Sir Tim Berners-Lee to you) came round to Lawson Towers to threaten me with a punch in the face and a karate chop if I didn’t co-edit the W3C HTML5 spec.

So I said yes — they look pretty frightening, I think you’ll agree.

Tim Berners Lee with fist raised, and the Queen making a karate chop gesture

There’s a pretty groovy and diverse team, including Patricia Aas who works on the Vivaldi browser; my friend and ex-Opera Devrel colleague, Shwetank Dixit who lives and works in India for Barrier Break, an accessibility organisation; ex-Opera chum Sangwhan Moon; long-haired digital trouble maker who does “Open Standards at GDS” (Government Digital Service), Terence Eden; Xiaoqian Wu of W3C and Steve Faulkner of The Paciello Group, another accessibility organisation.

I’m currently advising Wix Engineering on web standards. As an independent consultant, I’m not representing Wix, but obviously any relevant lessons we’ve learned by open-sourcing our Stylable CSS for Components pre-processor will be fed back to the W3C Working Groups.

A lot of people have asked me why there are two HTML specs – the Living Standard at WHATWG, and the HTML5.x specs at W3C. What’s the difference? And which should you use? The answer: it depends. (Well, this is web development, after all).

Spec implementors (primarily, browser vendors) implement from the WHATWG spec. The painstakingly algorithmic WHATWG document is exactly what we need so that browsers implement interoperably. But it’s hard for web authors (the million or so normal folk worldwide who write HTML to make websites) to get advice on how best to write good HTML from the WHATWG spec.

I’ve long used Steve Faulkner’s excellent guidance on HTML and ARIA, for example. The WHATWG spec is a future-facing document; lots of ideas are incubated there. The W3C spec is a snapshot of what works interoperably. Authors who don’t care much about what may or may not be round the corner, but who need solid advice on what works now, may find this spec easier to use.

For example, the WHATWG spec talks of the outlining algorithm, whereby a heading element such as <h1> changes its “level” depending on how it’s nested in <article>, <section> etc. I wish this actually worked, because it’s useful and elegant. However, as the W3C spec says “There are currently no known native implementations of the outline algorithm in graphical browsers or assistive technology user agents” and so advises “Authors should use heading rank (h1-h6) to convey document structure.”
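
To make the difference concrete, here is a small sketch of the two authoring styles (the heading text is invented for illustration):

```html
<!-- Outline-algorithm style: every heading is an h1 and the level is
     meant to be implied by nesting. Per the W3C note quoted above,
     this is not something authors can rely on. -->
<article>
  <h1>Accessible data tables</h1>
  <section>
    <h1>Captions</h1>
    <section>
      <h1>Hiding a caption visually</h1>
    </section>
  </section>
</article>

<!-- Heading-rank style: the level is stated explicitly with h1-h6 -->
<article>
  <h1>Accessible data tables</h1>
  <section>
    <h2>Captions</h2>
    <section>
      <h3>Hiding a caption visually</h3>
    </section>
  </section>
</article>
```

The second version carries the document structure in the heading elements themselves, which is what the W3C advice asks for.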

I believe the latter spec is more useful to people writing websites today.

I’ve got lots of friends in WHATWG, and I meant it when I wrote that the strength of the web over the last 7 years has been due to WHATWG. In my opinion, the work that WHATWG did saved the Web from irrelevance, while the W3C went meandering around XML-land instead. I’ve discussed this with some friends in WHATWG and W3C, too, to canvass opinion. They told me HTML needs a cute little mascot, so I’m stepping up to the plate.

I don’t want the W3C and WHATWG specs to diverge in substantive matters, and hope that useful authoring advice we produce can make its way into both specs.

Vive open standards!

On URLs in Progressive Web Apps

I’m writing this as a short commentary on Stuart Langridge’s post The Importance of URLs which you should read (he’s surprisingly clever, although he looks like the antichrist in that lewd hat).

Stuart says

I approve of the Lighthouse team’s idea that you don’t qualify as an add-to-home-screen-able app if you want a URL bar

Opera’s implementation of Progressive Web Apps differs from Chrome’s here (we only take the content layer of Chromium; we implement all the UI ourselves, precisely so we can do our own thing). Regardless of whether the developer has chosen display: standalone or display: fullscreen in order to hide the URL bar, Opera will display it if the app is served over HTTP because we think that the user should know exactly where she is if the app is served over an insecure connection. Similarly, if the user follows a link from your app that goes outside its domain, Opera spawns a new tab and forces display: browser so the URL bar is shown.

But I take Jeremy Keith’s point:

I want people to be able to copy URLs. I want people to be able to hack URLs. I’m not ashamed of my URLs …I’m downright proud.

One of the superpowers of the Web is URLs, and fullscreen progressive web apps hide them (deliberately). After our last PWA meeting with the Chrome team in early February, I was talking about just this with Andreas Bovens, the PM for Opera for Android. We mused about some mechanism (a new gesture?) that would allow the user to see and copy (if they want) the URL of the current page. I’ve already heard of examples when developers are making their own “share this” buttons — and devs re-implementing browser functionality is often a klaxon signalling something is missing from the platform.

When I mentioned our musings on Twitter this morning, Alex Russell said “we’ve been discussing the same.” It is, as Chrome chappie Owen Campbell-Moore said, “a difficult UX problem indeed”, which is one reason that Andreas and I parked our discussion. One of Andreas’ ideas is a long press on the current page, which would offer an option to copy or share the URL of the page you’re currently viewing. (This means that a long press would not be available as an action for site owners to use on their sites. Probably not a big deal?)

What do you think? How can we best allow the user to see the current URL in a discoverable way?

Web Components, accessibility and the Priority of Constituencies

Gosh, what a snappy title. I’m not expecting a job offer from Buzzfeed any time soon.

Today, Apple sent their consolidated feedback on Web Components to the webapps Working Group. The TL;DR: they like the concept, are “considering significant implementation effort”, but want lots of changes first including removal of subclassing, eg <button is="my-button">.

I think this is bad; this construct means existing HTML elements can be progressively enhanced – in the example above, browsers that don’t support components or don’t support JavaScript get a functional HTML <button> element. It also means that, by enhancing existing HTML elements, your components get the default browser behaviour for free – so, in this example, your snazzy my-button element inherits focussability and activation with return or spacebar without you having to muck about with tabindex or keyboard listeners. (I wrote about this in more detail last year in On the accessibility of web components. Again.)
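
Here is a sketch of what that subclassing looks like in practice. It assumes the customElements.define API with its extends option, which is the form this feature takes in the HTML Standard rather than anything shown in Apple’s feedback. The class name and the "snazzy" class are hypothetical, and browser support for the is attribute varies.

```html
<!-- Without JavaScript or component support, this is still a working button -->
<button is="my-button">Save</button>

<script>
  // Only attempt the enhancement where custom elements exist at all
  if ('customElements' in window) {
    // Hypothetical component: subclassing HTMLButtonElement keeps the
    // native focus handling and keyboard activation
    class MyButton extends HTMLButtonElement {
      connectedCallback() {
        // Purely presentational enhancement for this sketch
        this.classList.add('snazzy');
      }
    }

    // The third argument names the built-in element being extended
    customElements.define('my-button', MyButton, { extends: 'button' });
  }
</script>
```

Note that there is no tabindex and no keyboard listener anywhere in that script; the behaviour comes from the underlying <button>.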

Apple raised a bug Remove the support for inherting from builtin subclasses of HTMLElement and SVGElement and notes “without this hack accessibility for trivial components is harder as more things have to be done by hand” (why “this hack”? A loaded term). However, it calls for removal because “Subclassing existing elements is hard as implementation-wise identity is both object-based and name / namespace based.”

Implementation is hard. Too hard for the developers at Apple, it appears. So Web developers must faff around adding ARIA and tabindex and keyboard listeners (so most won’t) and the inevitable consequence of making accessibility hard is that assistive technology users will suffer.

HTML has a series of design principles, co-edited by Maciej Stachowiak who sent Apple’s feedback. One of those is called “Priority of Constituencies” which says

In case of conflict, consider users over authors over implementors over specifiers over theoretical purity. In other words costs or difficulties to the user should be given more weight than costs to authors; which in turn should be given more weight than costs to implementors; which should be given more weight than costs to authors of the spec itself, which should be given more weight than those proposing changes for theoretical reasons alone.

Fine words. What changed?