Impasse on HTML 5 video

The video element is the one that seems to excite n00bz the most when I do introductory talks about HTML 5, yet it was always the element that seemed to me to be furthest away from cross-browser interoperability.

Originally, the specification (being an Open Standard) said:

User agents should support Ogg Theora video and Ogg Vorbis audio, as well as the Ogg container format.

Ogg is a free, open standard container format maintained by the Xiph.Org Foundation, which claims that it is unrestricted by software patents. Firefox and experimental video builds of Opera support Ogg for the <video> element.

However, there were complaints from Nokia (PDF) and Apple that Ogg formats are not good enough technologically, and that they are still within patent lifetime and therefore susceptible to submarine patents (unexpected future patent challenges).

Therefore, the spec was revised with the much more wishy-washy:

It would be helpful for interoperability if all browsers could support the same codecs. However, there are no known codecs that satisfy all the current players: […] This is an ongoing issue and this section will be updated once more information is available.

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020620.html

Yesterday, Ian Hickson removed any mention of codecs from the spec because of the impasse and non-interoperable implementations: there is no suitable codec that all vendors are willing to implement and ship. He writes:

Apple refuses to implement Ogg Theora in Quicktime by default (as used by Safari), citing lack of hardware support and an uncertain patent landscape.

Google has implemented H.264 and Ogg Theora in Chrome, but cannot provide the H.264 codec license to third-party distributors of Chromium, and have indicated a belief that Ogg Theora’s quality-per-bit is not yet suitable for the volume handled by YouTube.

Opera refuses to implement H.264, citing the obscene cost of the relevant patent licenses.

Mozilla refuses to implement H.264, as they would not be able to obtain a license that covers their downstream distributors.

Microsoft has not commented on their intent to support video at all.
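In practice, a split like this means a page has to offer the same clip in more than one encoding and let each browser take the one it can decode. The following is only a rough sketch of that selection, not anything from the spec text: the file names and the two stand-in "browsers" are made up for illustration, and the answers mirror the strings that the video element's canPlayType() method returns ("", "maybe" or "probably").

```typescript
// The three answers canPlayType() can give; "" means "cannot play".
type CanPlay = "" | "maybe" | "probably";

interface VideoSource {
  src: string;  // hypothetical file name
  type: string; // MIME type plus codecs parameter
}

// Return the first source the browser is confident about,
// otherwise the first one it might be able to play.
function pickSource(
  sources: VideoSource[],
  canPlayType: (type: string) => CanPlay
): VideoSource | undefined {
  let firstMaybe: VideoSource | undefined;
  for (const source of sources) {
    const answer = canPlayType(source.type);
    if (answer === "probably") {
      return source;
    }
    if (answer === "maybe" && firstMaybe === undefined) {
      firstMaybe = source;
    }
  }
  return firstMaybe;
}

// The same clip in two encodings (illustrative names only).
const sources: VideoSource[] = [
  { src: "clip.ogv", type: 'video/ogg; codecs="theora, vorbis"' },
  { src: "clip.mp4", type: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' },
];

// Stand-ins for a Theora-only browser, an H.264-only browser,
// and a browser with no video codec support at all.
const theoraOnly = (type: string): CanPlay =>
  type.indexOf("video/ogg") === 0 ? "probably" : "";
const h264Only = (type: string): CanPlay =>
  type.indexOf("video/mp4") === 0 ? "maybe" : "";
const noVideo = (_type: string): CanPlay => "";
```

In a real page the second argument would wrap the browser's own method, for example (type) => document.createElement("video").canPlayType(type), and listing several source elements inside the video element gets the same fallback without any script.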

So does that end the dream of interoperable, no-plugin video in the browser? Perhaps not entirely.

Some point to the BBC format, Dirac, because “the BBC has no IPR interest in any implementation of Dirac by anyone, based on the Dirac software or not”. However, the answer to the question “Do you infringe any patents?”

The short answer is that we don’t know for certain, but we’re pretty sure we don’t. We haven’t employed armies of lawyers to trawl through the tens of thousands of video compression techniques.

is unlikely to satisfy those who are worried about submarine patents in Ogg.

Sun Microsystems’ announcement of their OMS Video format was exciting:

The Web needs royalty-free video and audio codecs.

…OMS Video seeks to bring an updated royalty-free variant from the h.26x video codec lineage to the open source / royalty-free community. With the open source / royalty-free codec community, the OMS Video initiative seeks to collaborate through shared interests and common royalty-free technologies, to unleash innovation, and to update outmoded RAND (“reasonable and nondiscriminatory”) standardization processes to Web speed.

An OMS Video specification is published, but with the Oracle buy-out, I’ve no idea what’s happening with the specification.

I’m not pointing the finger at any other browser manufacturers because it’s entirely understandable after the EOLAS patent debacle that browser vendors are frightened of submarine patents. (I work for Opera which supports Ogg Theora/Vorbis but this is a personal site not the opinion of employers blah.)

What this sorry story really shows is that closed standards and proprietary formats are the enemy of an Open Web.

16 Responses to “Impasse on HTML 5 video”

Comment by Henri Sivonen

Your post makes things seem bad. Things seem much better from another point of view: Firefox and Chrome support Theora natively. Opera will if labs builds are any indication. Safari supports Theora with XiphQT. IE supports Theora with Cortado, and hopefully there will be an ActiveX component for making things work even better in IE.

The best way to move things forward now is to publish a lot of interesting Theora-only content on the Web using the video element.

Comment by Tobias Horvath

As you already stated, Bruce, some of the big players are frightened by future patent fiascos and would rather not step into the “free” Ogg Theora hype.

I believe there is no real answer to the problem right now—especially considering how much the professional video content authors are embracing H.264 due to widespread adoption and outstanding quality.

I am not endorsing this, I know the licensing scheme is really horrible for anyone but the end user. I believe there should be an effort to make H.264 more open. Allow browser vendors to implement it for free if the browser is free? Dream? Probably :(

The open video web has many hurdles right now.

Comment by Bruce

Henri – what would be good is if there were some “youtube”-type site (oggtube?) that allowed me to upload .avis from Win Movie Maker and converted them to .oggs, much like YouTube does with .flvs.

Comment by Dustin Wilson

I wrote about this problem a while back myself, and I really wonder if we’re ever going to get a quality cross-browser video element implementation.

The main problem with Theora isn’t that it might infringe software patents but that its quality-to-filesize ratio is horrendous. Many people pushing for it have never even taken a look at Theora itself and have never encoded files themselves. Theora’s main competition on the web, Flash Video, has a much higher ratio and can be encoded at least twice as fast; there are decade-old codecs out there that provide better quality than Theora. For websites such as YouTube it’s impractical simply because of the long encoding times it needs, not to mention that the resulting video quality would be lower than the Flash video that’s used today. End users would have to download files that are 33% to 50% larger to obtain the same video quality that is used today (which on YouTube isn’t that great as it is). It’s something the end user would notice.

If Theora’s going to be the choice then the browser developers need to contribute code and resources to the Theora project to improve it first, but with formats like Dirac out there why bother? Dirac has been used in the real world, and it’s top notch.

Is even attempting to work with the MPEG group an option here? Two of the browsers are pushing for MPEG 4. There’s got to be some way for them to put aside their greed and benefit the greater good.

Comment by Tobias Horvath

Re: Dustin’s post: Great insight on the subject! I too believe the quality of Theora is sub-par.

I wouldn’t consider Apple’s stance on the subject as greed tho. They started supporting H.264 a long time ago, and even waited for the licensing scheme to get to a point where end users would not have to pay licensing fees to watch content. H.264 is deeply embedded in all things iTunes store, iPhone, iPod, AppleTV, and the new QuickTime X in Snow Leopard. There are millions of dollars in research and development that they have invested. Diverting from this path is far from being a simple step.

We will not have a one-codec world. And back to the original subject: We will need a free open source alternative that everyone can agree upon. From the video guy perspective, I fear this cannot be Ogg Theora.

Funny side note: All our clients still receive MPEG-1 files for approval. MPEG-1 for video is what IE 6 is for the web.

Comment by John Foliot

@bruce: FFMPEG does codec conversion (but can be CPU intensive) – http://ffmpeg.org/

The bigger problem is that currently <video> has no native support for captioning, although some of the recent work done with Ogg suggests that it is relatively easy to implement. Highly recommend you follow Silvia Pfeiffer’s blog at: http://ow.ly/ggA1

Comment by Kyle Weems

The video element is one of those HTML5 teases that I’m becoming somewhat skeptical of ever becoming a reality because of all the codec snafus. Even if all the other browser makers went with Ogg Theora (although it sounds like it’s a sub-optimal choice for compression reasons, which is sad to hear), I don’t see Microsoft getting off its butt and including any support for any codec for years, which will unhappily defeat the purpose of having such a lovely element in the first place.

I’d like to be wrong about that, though.

Comment by Dustin Wilson

Oh, no. I didn’t mean it that way. Apple’s stance in using MPEG 4 isn’t based upon greed. It’s based upon the fact that they chose it for their products already, and tools are available for their platforms (and many others) to easily create the format. The ridiculous licensing fees for MPEG 4, however, are greed, and that doesn’t really have much to do with Apple even if they are part of the MPEG standards body. Google just wants it because much of YouTube is already in MPEG 4 for the iPhone’s benefit and because the alternative being brought to the table is perceived by them (and rightly so) to be vastly inferior.

Theora is just as mired in theoretical software patents as Dirac is, actually. If Dirac were chosen, I believe the BBC would be pleased, but I don’t believe they really have a vested interest in it like, say, Microsoft does with EOT in the web font format discussion. I’m afraid that no matter what open source alternative to MPEG 4 is chosen there’ll be claim chowder on its programming.

My stance on the whole situation is that if Theora is chosen it needs some considerable programming contributions; it’s most definitely workable in my opinion and the developers are working to improve the format. Excuse my language, but if an open sourced video format is chosen the browsers need to grow some balls and be ready for a battle because it’ll happen no matter what open sourced format is chosen.

Captioning is a problem, but there’s an external (to the video file) text-based subtitling format many players support already or some XML-based standardized subtitling format could be created.

Comment by Philip Jägenstedt

Dustin, your claims about quality for the YouTube range of resolution/bitrate are greatly exaggerated. For FLV usually H.263/MP3 is used, which is easy to beat with Theora/Vorbis. I’d be thrilled to see your sources for the claims about 33% to 50% bitrate increase with Theora…

H.264 is the real competition, and Greg Maxwell did a good comparison (http://people.xiph.org/~greg/video/ytcompare/comparison.html). The point isn’t that Theora is superior to H.264, but that it is close enough that quality is no longer the deciding factor.

Dirac is nice, but its real-world use is pretty much limited to use within the BBC to transmit HD content in the same bandwidth as uncompressed SD video. It hasn’t been tuned for SD or below-SD (YouTube) resolutions, and if you try the encoder for yourself that will become obvious. Nonetheless, I hope that Dirac has a role to play in the future.

Comment by Dave

The only question about Theora’s quality was “is it good enough for web use?” The answer is a clear yes. Those who think this is a competition between Theora and H.264 are confused.

If Theora, or a similar freely implementable codec, did not exist then Opera would not have suggested the video tag in the first place.

After all, if you have no problems with codecs that aren’t freely re-usable, then what was the big problem with Flash or Silverlight, which give you a choice of several for your personal, non-commercial use at no cost?
