A very good friend of mine wrote to urge me not to promote the jCaps jQuery plugin to generate overlaid video captions from a transcript.
The reasoning was that my proof of concept is a hack and we will soon have real specifications for Media Text Associations and a Media Multitrack API. Of course, those specs need to be agreed and—here’s the rub—implemented in browsers.
My friend also worries that by propagating a hack, I might entrench worse practice and therefore discourage adoption of the proper way. This argument is compelling. However, it worries me, because it basically means that HTML5 video would be inaccessible today by design, waiting for proper accessibility jam tomorrow.
What do you think is the right way forward? Hacking for accessibility now, in a manner that’s acknowledged to be not “proper”, or waiting?
Those nice chaps at 360innovate have started work on a jQuery plugin called jCaps.
As it’s all open licensed, if anyone fancies participating, it would be jolly. Here’s a wishlist that I’d love to see, and would write if I were not so scripting-challenged:
It would be useful to point to the transcriptions via the aria-describedby attribute on the video element, the values of which point to the id of an element containing the transcript (which could therefore be an article, a div or whatever). You can have multiple values on aria-describedby (like you can with class) so that allows you to point to different translations.
This allows you to have transcripts anywhere in the source, so greater flexibility for laying out the page.
If there are multiple transcripts for a single video, they need to have a lang attribute, and the script should construct a select element so the user can choose the language they wish to see transcripts in. (The first one offered should be the transcript that matches the lang attribute on the html element; if there isn’t one, the default should be the first in source order.)
I’d love to see a sexy (and stylable) skin around the video element that had a YouTube-stylee user interface that houses all the buttons for turning on captions, transcripts and the language picker.
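For anyone tempted, here is a rough sketch of the aria-describedby and language-picker items on that wishlist. It is not jCaps code: the function names (parseIdRefs, pickDefaultTranscript, buildLanguageOptions) and the { id, lang } shape are all made up for illustration, and the DOM wiring is left in comments so the logic can be read on its own.

```javascript
// Hypothetical helpers, not part of jCaps.

// Split an aria-describedby value into its space-separated ID references.
function parseIdRefs(describedBy) {
  return (describedBy || "").trim().split(/\s+/).filter(Boolean);
}

// Choose the transcript to offer first: the one whose lang matches the
// page language, otherwise the first in source order.
// `transcripts` is an assumed array of { id, lang } objects in source order.
function pickDefaultTranscript(transcripts, pageLang) {
  if (transcripts.length === 0) return null;
  const wanted = (pageLang || "").toLowerCase();
  const match = transcripts.find(
    (t) => (t.lang || "").toLowerCase() === wanted
  );
  return match || transcripts[0];
}

// Describe the <option> elements a language picker would need.
function buildLanguageOptions(transcripts, pageLang) {
  const chosen = pickDefaultTranscript(transcripts, pageLang);
  return transcripts.map((t) => ({
    value: t.id,
    label: t.lang,
    selected: t === chosen,
  }));
}

// Possible browser wiring (assumed markup, not run here):
// const video = document.querySelector("video");
// const ids = parseIdRefs(video.getAttribute("aria-describedby"));
// const transcripts = ids
//   .map((id) => document.getElementById(id))
//   .filter(Boolean)
//   .map((el) => ({ id: el.id, lang: el.lang }));
// const options = buildLanguageOptions(
//   transcripts,
//   document.documentElement.lang
// );
```

Because the transcripts are found by id, they can live anywhere in the source, which is the layout flexibility mentioned above.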
Anyone up for the challenge, and generous enough to release it as a BSD-licensed library/plug-in?
Note that this technique will have a limited shelf-life. The HTML Accessibility Working Group have two specifications that will enable captioning etc to be done natively once the spec is agreed and (of course) implemented.
My hubcap-thieving Scally chum Jake Smith emailed, expressing concern about the fact that the codec impasse means we have to encode video twice, once as Ogg and once as H264, to deliver in HTML5:
My concern is from that of a business. Encoding as OGG will only further questions from clients, rather than answering them. “So, this video you’re encoding… I can’t watch it on my Mac (safari)? And I still can’t see it on my iPhone?”
There’s the obvious “be damned with licenses” and encode as MP4 anyway, but then I have to encode twice, which is ok for the odd video, but could be a right arse long term, as that’s more cost to client, and as far as they’re concerned why not pay once for encoding to FLV?
From my (business) point of view, there is no point in chasing HTML5 for video. No matter how much I want to do the right thing…
I’ve only worked for quasi-public sector sites for whom profit isn’t an imperative, and I’ve been absorbed thinking about open-ness and standards, so hadn’t given Jake’s perspective much thought.
To me, the negatives are:
Double encoding is time, extra process and more storage
Flash “works” – change is expensive
The advantages to using open HTML5 video are:
It’s (ultimately) a better user experience, as the user doesn’t have to worry about plugins (a major source of worry for non-techy users)
It works on iPhones and (eventually) other mobile browsers
As a web designer, you can do fancy stuff with CSS etc as it’s native in the browser (this may not matter to business; depends what they want to do with the video)
You can have a textual transcript, which can be scripted into synchronised video captions: great for “Search Engine Optimisation” and “DDA compliance”
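To give a flavour of that last point, here is a minimal sketch of turning a timed transcript into synchronised captions. It assumes each chunk of transcript has already been read into a { start, end, text } object with times in seconds; that shape and the name activeCue are hypothetical, not how any particular plugin stores its data.

```javascript
// Hypothetical helper: find the transcript chunk to show at a given time.
// `cues` is an assumed array of { start, end, text }, times in seconds.
function activeCue(cues, currentTime) {
  return (
    cues.find((c) => currentTime >= c.start && currentTime < c.end) || null
  );
}

// Possible browser wiring (assumed markup, not run here):
// const video = document.querySelector("video");
// const overlay = document.getElementById("caption-overlay");
// video.addEventListener("timeupdate", () => {
//   const cue = activeCue(cues, video.currentTime);
//   overlay.textContent = cue ? cue.text : "";
// });
```

The same text stays in the page as an ordinary transcript, which is where the search engine and accessibility benefits come from.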
Anyone care to wade in with some business reasons for or against double-encoding and using HTML5 video?
If you’re British, it’s not “awesome”. That’s an American word, like “sidewalk”, “gas” for petrol, “critter”, “varmint”, “tarnation” and “gotten” as the third form of the verb. Americans, you’re welcome to use them; they’re your words, but they are not English.
If you want a knee-jerk circle-jerk response to mediocre design, the term is “Brendan Dawesome”.
If you want to express actual approbation for something, the English terms are “spiffing”, “top-hole”, “wizard” or “ticketyboo”.