Archive for March, 2009

Indian English

It’s often said that English is the second language in India, and certainly that’s true when Hindi speakers from the North are doing business with their colleagues from the South, who speak the Dravidian languages of Tamil, Malayalam, Telugu or Kannada. Some say English is the glue that has held India together, although the extensive railway has a comparable claim, while my travelling companion Shwetank believes that the Civil Service is the historical reason. (Whatever it was, it seems to me as an outsider during the Indian general elections that belief in democracy is what holds India together now. That and cricket: "Ek desh ek junoon" (one nation, one obsession), as the TV ads for the Indian Premier League say.)

But it’s also equally accurate to describe English as an alternative language which educated speakers of the same language will employ if they determine that it’s the better language to express a particular idea, switching unconsciously between their own language and English, sometimes mid-sentence.

It makes watching Bollywood masala movies much easier for the Hindi-challenged like me. There will be a stream of Hindi and, in the middle, "Wow wow wow, I love you" to help me understand what’s going on. (Not that masala movies are particularly complicated, anyway; they’re pretty light on plot, relying instead on gorgeous scenery, costumes and lots of songs. Top Bollywood tip: a man who wears too much gold jewellery, has a moustache or smokes is invariably the baddie.)

Indian English can often seem either elaborately formal ("Excuse me good sir, may I impertinently enquire as to your occupation in your country of origin?" I was asked) or somewhat quaint, almost Enid Blyton-esque—I assume that many idioms are frozen in the late 1940s when the British left. So, when police arrested a gang who were stealing gas from cylinders they sold as full, it was reported in my morning paper that "sleuths nabbed neer-do-wells". A man who dressed in a burka in order to visit his girlfriend was "bashed up" when discovered.

There are also a few peculiarly Indian formations. The back of a building or rear of an aeroplane is usually referred to as "the backside" ("Is a backside seat acceptable, Mr Lawson?", to which the answer can anatomically only be "yes"). Near my hotel is a shop offering "gentlemens’ suitings and shirtings". The word "even" has been commandeered as a synonym for "also", as in "Even I need to go to the bank" for "I need to go to the bank, too". Even I’ve found myself saying this, it’s so common.

Not all Indian English is as charming: sexual harassment is linguistically trivialised as "eve-teasing". Perhaps the local ladies might carry some scissors for retaliation, adding reciprocal insult-by-euphemism to the injury by calling it "sausage-snipping"?

Cultural differences

An Indian gentleman corrected my mistake that Mumbai is the capital of India, by explaining that India has three capitals: the political capital is New Delhi, the centre of modern culture is Mumbai and the spiritual capital is Varanasi.

America, this one-time Chicago resident continued, also has three capitals: Washington DC as political capital, New York as cultural, and its spiritual capital is Las Vegas.

Unkind and unfair, but it made me laugh.

Mumbai

Mumbai is a city that’s going places, both literally and metaphorically. Everyone is moving, all the time. Seemingly every building is under construction or being demolished. I haven’t felt this sort of dynamism since I lived in Bangkok in the late 90s (where the traffic is far worse than in Mumbai).

We visited Veermata Jijabai Technological Institute, where we were shown around before our lecture. It’s a charming old place, built in the mid-19th century to study textile technology, so it’s only appropriate that it’s stayed on the cutting edge of applied technology by becoming a centre of excellence for high-voltage research and computer science. (Photos)

The talks went very well; Shwetank and I got a real buzz from the audience who numbered 300 (we expected around 50).

The organiser, Aditya Sengupta, wrote the next day:

Many of my friends, acquaintances and complete strangers came up to me and expressed their happiness for having attended the lecture. The insights you provided were both eye-opening and immensely useful. The response has been overwhelmingly positive. We had far more attendees than we had expected or hoped for. It is extremely rare indeed that the auditorium gets filled to capacity – for anything.

When you’re far from home and missing the wife and kids, it lifts the spirits to know that what you’ve done has been appreciated.

Dinner was a meet-up at the Hard Rock Café where lots of Indian blogging and twitter luminaries came to sip a beer with us. I was taught some elementary Hindi (like “Tum khubsurat ho. Aati kya Khandala?” which means “Hello, how are you?” but my accent means I get some odd looks when I use it).

Then, a quick tour around the offices of local company Pin Storm and so to bed.

Strengths and weaknesses of Indian tech scene

One of the attendees, Jayesh, emailed me with the question “what do you think about the technology scene in India? What are the strong points and what are the weak points? And what are the measures for overcoming the weak points?”

Great question Jayesh, and I’m replying here rather than by email, as this is the opinion of an outsider, and I’d love to get Indian people’s views.

As I see it, India’s strong points are the skills and knowledge of its computer students and the fact that it recognised the importance of the computer industry early so began training people early.

It is weaker in shifting courses away from older technologies to the new (apparently there are still institutions where they study WAP and WML!); bureaucracy moves slowly while trends in IT move fast.

Although Indian students love Open Source, they haven’t yet shown the same affection for Open Standards (there are a lot of “best viewed in IE” monstrosities here). In any economy, development costs increase if you have to tinker with your IE-only code to take account of all the other browsers and the endless parade of new devices that come out. Using Open Web Standards means (in theory) write once and use everywhere; in practice, that means only tweaking for legacy browsers and crippled devices, of which there is a dwindling number, so Open Web Standards reduce development costs as well as promoting inclusion by being available on cheaper machines or mobiles.

This is the best way for an economy like India’s, where the emphasis can’t be on being the cheapest in the world, as that is unsustainable as an economic strategy: there will always be another nation waiting in the wings to claim the dubious honour of being the primary source of super-cheap labour, but that relies on sweat-shop wages, and therefore perpetuates social inequality and misery. Everything I read before I came, all the newspapers I’m reading now, and conversations with the students I meet suggest that’s not how Indian people wish to see their nation developing further.

Some proprietary standards seem attractive, as they can be made fast – but that’s because they are made, developed (and sometimes abandoned) with decisions made in secret, according to the needs and ambitions of companies that report to an anonymous group of shareholders that demands feeding with healthy numbers every quarter. Can you always be certain that their needs will exactly align with the needs of a nation like India? (If there is one thing that the current economic crisis has shown us, it’s that the chasing of short term profits for institutions doesn’t always translate into economic health for communities.)

Using Open Standards (and participating in their development) facilitates a sustainable economy that promotes inclusion rather than exploitation by allowing you to work smarter, not cheaper.

Health warning: I’m not an economist; this is just based on my pre-trip research and listening to students here.

What do you think?

Chandigarh, Pilani, Jaipur

I arrived in India on Holi, the day when everyone gets drunk and throws colourful powder over each other. Therefore, instead of doing the sensible thing and going to bed, I drank a shed load of beers and had a meal that the Indian Opera interns cooked—a tremendous chicken biryani created by Saif, with chop-by-chop instructions from his mum over a mobile from South India.

After a couple of jetlaggy days in the Indian office, writing my presentation with Shwetank, we took a 12 hour night train from Chandigarh to Jaipur. Once there, we took a taxi for five hours across the Rajasthan desert, seeing camels

Rajasthan Camel

and the world’s most overloaded truck:

Most overloaded truck in the world

Finally we reached our destination, BITS-Pilani, where I gave a talk on the advantages of Open Web Standards in developing nations, and was regaled with a good number of really searching questions. (BITS people who asked how you can get involved with specifying HTML 5: sign up to the WHATWG mailing list.)

Then, back across the desert, stopping for daal and aloo parathas, to the Nana ki Havali, a beautiful old ancestral home converted into a hotel, where there was a welcome beer waiting for me. I also went out for a shave: there are few more luxurious feelings than a really close shave with a really sharp blade by someone who knows what he’s doing.

The next day was a rest day, so Shwetank and I hired the same taxi driver and toured the sights of Jaipur, the pink city: the Jal Mahal and Amber Fort:

Jaipur, India

We also visited Jantar Mantar (an old astronomical observatory), the museum and palace, where we saw the world’s largest cannon, the world’s largest sundial and the world’s largest silver water holders (which the maharajah filled with Ganges water and took to England to attend the coronation of Edward VII, as he mistrusted English water).

Today, we flew to Mumbai in preparation for speaking at VJTI tomorrow.

The things I do for Opera, eh?

Opera India tour

Tomorrow night, Shwetank and I go by overnight train and then taxi to Birla Institute of Technology and Science (BITS) in Pilani, the first stop on our Opera India University Tour where we’ll be showing such acronyms as SVG, CSS, HTML 5 and explaining the why of the technologies in an Indian context.

Looks like we’ve picked the right time to visit Pilani; Wikipedia says:

Pilani is known for its extreme climate. Summer temperatures reach up to 50 degree Celsius from May to July, while Winter temperatures reach sub-zero levels between December and January. Months of October and March are generally considered the most pleasant.

Some people have expressed surprise that I’m not going to South By Southwest. I’ve never been before and I’d love to go, but I want to make a difference, so the chance to spread the Open Web Standards gospel to thousands of India’s brightest developers is unmissable.

Marking up a blog with HTML 5 (part 2)

Further refining the HTML 5 structure

Last month, I replumbed this blog to use HTML 5 for the markup and replaced the basic framework (a completely typical collection of divs to hold headers, footers and sidebars) with new HTML 5 structural tags. Browsers can’t yet do anything useful with those new elements but I showed that, with a bit of coaxing, browsers can be persuaded to style them with CSS, and JavaScript can access those new HTML 5 elements.

What I didn’t do then – because I needed to bury myself in the specs and do some research – is use HTML 5 to mark up the real guts of the site, to give articles, comments and datestamps real semantics.

This should not be considered a tutorial. It’s an experiment. The specs are ambiguous, so I can’t be sure I’m using every element properly, and there’s not exactly a huge body of examples in the wild to draw from. Therefore, if you disagree with my markup choices please do let me know. Nicely.

You can use my WordPress HTML 5 theme, which is based on the excellent Kubrick theme. I’d be dead chuffed if you’d let me know if you do use it. You’ll probably want to comment out references to plugins (unless you also use them). I don’t support the theme and I’m sorry that my PHP is so shockingly bad.

The blog home page

An interesting thing about a blog homepage is that there are generally the last 5 or so posts, each with a heading, a "body" and data about the post (time, who wrote it, how many comments etc.) and usually a link to another page that has the full blog post (if the homepage just showed an excerpt) and its comments.

HTML 5 has an article element which I use to wrap each story:

The article element represents a section of a page that consists of a composition that forms an independent part of a document, page, or site. This could be a forum post, a magazine or newspaper article, a Web log entry, a user-submitted comment, or any other independent item of content.

Let’s look in more detail at the guts of how I mark up each blogpost.

Anatomy of a blog post

diagram of article structure; explanation follows

The wrapper is no longer a generic div but an article. Within that is a header, comprising a heading (the title of the blogpost) and then the time of publication, marked up using the time element.

Then there are the pearls of wit and wisdom that constitute each of my posts, marked up as paragraphs, blockquotes etc., and pulled unchanged out of the database. Following that is data about the blog post (category, how many comments) marked up as a footer and, in the case of pages that show a single blogpost, there are comments expressing undying admiration and love. Finally, there may be navigation from one article to the next.
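As a rough sketch of the structure just described (this is not the theme's actual output; the title, date, URLs and comment count are made up for illustration, and putting the nav inside the article is my assumption), a single post might look something like this:

```html
<article>
  <header>
    <h2>Title of the blog post</h2>
    <!-- Machine-readable date in the attribute, human-readable date as content -->
    <time datetime="2009-03-01">1 March 2009</time>
  </header>

  <!-- Post body, pulled unchanged from the database -->
  <p>The pearls of wit and wisdom go here.</p>

  <!-- Data about the post -->
  <footer>
    Posted in <a href="/category/html5/">HTML 5</a>.
    <a href="#comments">3 comments</a>
  </footer>

  <!-- Single-post pages only: each comment is a nested article -->
  <article>...</article>

  <!-- Optional navigation from one article to the next -->
  <nav>
    <a href="/previous-post/">Previous post</a>
    <a href="/next-post/">Next post</a>
  </nav>
</article>
```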

Data about the article

Following the content there is some “metadata” about the post: what category it’s in, how many comments there are. I’ve marked this up as footer. I previously used aside which “represents a section of a page that consists of content that is tangentially related to the content around the aside element, and which could be considered separate from that content” but decided that it was too much of a stretch; data about a post is intimately related.

footer is a much better fit: “A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like.” I was initially thrown off-course by the presentational name of the element; my use here isn’t at the bottom of the page, or even at the bottom of the article, but it certainly seems to fit the bill – it’s information about its section, containing author name, links to related documents (comments) and the like. There’s no reason that you can’t have more than one footer on a page; the spec’s description says "the footer element represents a footer for the section it applies to" and a page may have any number of sections. The spec also says "Footers don’t necessarily have to appear at the end of a section, though they usually do."

This does, however, raise an interesting question about WAI-ARIA. In the structural redesign, I gave the page’s "main" footer a role="contentinfo" attribute, on the grounds that assistive technology users (or search engines) might wish to jump straight to that information about the page they’re using.

I’m assuming that it would not be helpful for each article’s footer metadata to also have the same ARIA role. Additionally, the ARIA spec says contentinfo is "metadata that applies to the parent document", which I read as meaning the whole web page, not each individual article.
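To make that concrete, here is a minimal sketch of the approach, assuming a page with one post (the text content is invented for illustration). ARIA landmark roles are set with the role attribute, and only the page-level footer gets one:

```html
<body>
  <article>
    <h2>Post title</h2>
    <p>Post content.</p>
    <!-- Article-level footer: no landmark role, on the assumption above -->
    <footer>Posted in HTML 5. 3 comments.</footer>
  </article>

  <!-- Page-level footer: the only contentinfo landmark on the page -->
  <footer role="contentinfo">Site-wide copyright and contact details.</footer>
</body>
```

Whether assistive technology really does expect a single contentinfo per page is exactly the open question here, so treat this as one possible reading rather than settled practice.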

Those who know more about ARIA and HTML 5 than I do have suggested that there is an automatic one-to-one correspondence between the HTML 5 footer element and role="contentinfo" (see Comparison of ARIA landmark roles and HTML5 structural elements by Steve Faulkner and ARIA in HTML5 Integration: Document Conformance (Draft) by Henri Sivonen).

Henri’s draft (but not Steve’s) suggests an approximate correspondence between HTML 5’s header and ARIA’s banner role.

If I’m right that assistive technology users expect only one instance of banner and contentinfo (and I’d very much like to discuss this), we might need to revisit these assumptions.

Comments

I’ve marked up comments as articles, too, as the spec says that an article could be “a user-submitted comment”, but nested these inside the parent. These are headed with the date and the time of the comment and name of its author. I tried wrapping these in a header too, but it wouldn’t validate as each header requires at least one heading within it. As the author and time of a comment don’t feel like a heading, and there was not one there before, I’ve left them as plain text. But I’m undecided.
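A hedged sketch of what that nesting could look like (the commenter's name, the date and the text are invented; the real theme output may differ in detail):

```html
<article>
  <header>
    <h2>Title of the blog post</h2>
  </header>
  <p>Post content.</p>

  <!-- Each comment is an article nested inside the post's article -->
  <article>
    <!-- Author and time left as plain text rather than wrapped in a header,
         since there is no heading to put inside one -->
    Comment by A. Reader on
    <time datetime="2009-03-02T10:15:00+00:00">2 March 2009, 10:15am</time>
    <p>Text of the comment.</p>
  </article>
</article>
```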

The original WordPress install had an ordered list of comments, which I’ve removed; some things have an implied order and, now they’re marked up with unambiguously parsable dates and times, it’s trivial to programmatically determine the order. I thought it might be fun to generate numbers with CSS using this code:

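/* Count each comment (an article nested inside the post's article) */
article article {counter-increment: comment;}
/* Print the running count before each comment */
article article:before {content: counters(comment, ".");}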

and, for those using the Opera 10 alpha (or the more-recently released Safari 4 beta), those generated numbers should be styled using the punky demand font from nerfect.com, used as a CSS web font with kind permission from its creator, Mr Walters. (Internet Explorer has been able to embed DRM-ed fonts since version 4, but I was unable to understand how to use its WEFT tool.)

Times and dates

Most blogs, news sites and the like provide dates of article publication (and dates of events and the like).

Microformats people, the most vocal advocates of marking up dates and times, believe that computer-formatted dates are best for people: their wiki says “the ISO8601 YYYY-MM-DD format for dates is the best choice that is the most accurately readable for the most people worldwide, and thus the most accessible as well”.

I don’t agree (and neither do candidates in my vox pop of non-geeks, my wife, brother and parents). Therefore I’ve used the HTML 5 time element to give a machine parsable date to computers, while giving people a human-readable date. Blog posts get the date, while comments get the date and time.

The spec is quite hard to understand, in my opinion, but the format you use is 2004-02-28T15:19:21+00:00, where T separates the date and the time, and the + (or a -) is the offset from UTC. Dates on their own don’t need a timezone; full datetimes do. Oddly, the spec suggests that if you use a time without a date, you don’t need a timezone either.
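Here is a small sketch of those three cases as I understand the draft spec at the time of writing (the human-readable text inside each element is just an example and can be whatever suits your readers):

```html
<!-- Date only: no timezone needed -->
<time datetime="2004-02-28">28 February 2004</time>

<!-- Full date and time: T separates them, and the offset from UTC is required -->
<time datetime="2004-02-28T15:19:21+00:00">28 February 2004 at 3.19pm</time>

<!-- Time only: on my reading of the spec, no timezone needed either -->
<time datetime="15:19">3.19pm</time>
```

The point of the split is that computers read the datetime attribute while people read the element's content, so each gets the format that suits them.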

There’s considerable controversy over the time element at the moment. Recently one of the inner circle, Henri Sivonen, wrote that it’s for marking up future events only and not for timestamping blogs or news items: “The expected use cases of hCalendar are mainly transferring future event entries from a Web page into an application like iCal.” This seems very silly to me; if there is a time element, why not allow me to mark up any time or date?

The spec for time does not mention the future event-only restriction: "The time element represents a precise date and/or a time in the proleptic Gregorian calendar" and gives three examples, two of which are about the past and none of which are "future events". (Henri calls this a "spec bug". I’m not picking on Henri, by the way; I have loads of respect for him. It’s just that he has written some of the most quotable quotes since I’ve been looking at this.)

Although the spec doesn’t (currently) limit use of the element, it does limit format to precise dates in "the proleptic Gregorian calendar". This means I can mark up an archive page for "all blog posts today" using time, but not "all July 2008 posts" as that’s not a full YYYY-MM-DD date. Neither can you mark up precise but ancient dates, so the date of Julius Caesar’s assassination, 15 March 44 BC, is not compatible.

It is inconsistent that some dates may be marked up and not others, and it’s a problem that’s going to get worse: we’ll see some dates marked up with HTML 5 and others, such as fuzzy dates and ancient dates, marked up as microformats because they’re not allowed in the official markup scheme, fragmenting such data. There are already search tools that look for dates, such as SearchMonkey and YQL, and examples of historical dates marked up as a microformat in Wikipedia or in museum databases.

Henri writes "The time element is meant as a replacement for the microformat abbr design pattern in hCalendar (if the microformat community embraces time; if not, time is pretty much pointless in HTML 5)". Henri is right: time is pretty much pointless in HTML5 if it’s not embraced by the microformats community, but why would they embrace it? It prevents them doing a lot of what they do now, so what do they gain?

I suggest the spec be amended to allow dates like "July 1966" and "3 January 1077" to be compatible with the time element. Restricting it to "future events" is likely to make it a stillborn element, or make that bit of the spec completely irrelevant to its de facto use.

Stay tuned

That’s enough for now. Corrections to the CSS as I notice abominations. More write-up to come about sections and headings. Any comments?

Forget the mobile web: One site should work for all

In answer to Jakob Nielsen’s recent advice:

Mobile phone users struggle mightily to use websites, even on high-end devices. To solve the problems, websites should provide special mobile versions.

A load of nonsense, of course. So I’ve written Forget the mobile web: One site should work for all, published today at ZDnet.co.uk:

Access to the web is a human right, says Bruce Lawson. It should not matter if you browse using a mobile phone, or with an assistive technology because of a disability. You should still have access to the same website a desktop user enjoys.

The Wall Street Journal has an interesting article called Squeezed! Small businesses have to decide whether creating a Web site for mobile devices is worth the expense:

Small firms have to decide whether to build or remake Web sites to accommodate them. The investment may be hefty, but so is the risk of ignoring an expanding market. … Companies can avoid the confusion and hassle of having two separate Web addresses by creating a single Web site that is viewable on either a PC or cellphone.

“One Web” isn’t a new idea, of course. Here’s a presentation called W3C – One Web: Going Mobile by Steve Bratt, former CEO of the W3C from November 2006.

Strategy to develop Short Breaks for Disabled Children

This week’s usability atrocity is editorial rather than technical, but demonstrates the same fundamental error: a failure to consider that the audience is not the author, and the arrogant belief that the audience should work around the author’s laziness. Author or web developer: both should serve the visitor.

A neighbour of mine has a disabled child. She was therefore sent Birmingham City Council’s four page executive summary of the Strategy to develop Short Breaks for Disabled Children, Young People and their Families 2009 – 2011 [58K PDF] but found it complicated to read as her first language is not English and asked me for help. (25 October 2010: Version hosted on my site as original authors deleted it from their site.)

What a pile of bullshit it is.

It seems to be all motherhood and apple pie, but it’s difficult to tell; although they claim “parent views count”, it’s a shame that no-one wrote a summary in language likely to make sense to parents or indeed anyone outside the consultation/Local Authority industries.

It kicks off explaining that this is a joint strategy with the three PCTs in Birmingham, yet does not explain what a PCT is. The approach “promotes good outcomes for children and young people”; does “promotes good outcomes” mean the same as “gets good results”? As defined by whom?

The summary tells us that the strategy “adopts the logic model approach to service design”. What does this gobbledegook mean?

Here’s another horror:

To transform short break services for disabled children Birmingham City Council has received a major financial investment from Government, in the form of a ring fenced grant, of £5,806,000 revenue and £2,311,700 capital spread over a two year period beginning April 2009.

What does “ring fenced grant” mean to someone who doesn’t understand the language of budgets and spreadsheets?

Why not say “We have received a government grant of £5,806,000 revenue and £2,311,700 capital spread over two years from April 2009. We can only use this money to improve short break services for disabled children”. (I changed “transform” to “improve” on the assumption that the grant is not given for the purpose of worsening the short break service).

There are several more crimes against the English language. For example, the passive voice creeps in for no apparent reason: “consultation has also taken place with other stakeholders and the key messages to emerge were…”.

Why complicate this? I assume this means “We have talked to other interested parties, and they told us..”? If that’s what it means, why not say that?

The first commissioning objective is “to develop a more dynamic model”. What does “dynamic model” mean? Kate Moss after a snout full of coke?

Objective six tells me of a “distinctive gap”. In what way is it distinctive? Is it wider than other gaps that they identified? Narrower? Smells vaguely of chives or raspberries?

Objective seven is “to establish robust infrastructure…to deliver short breaks services”. If “robust infrastructure” means “buildings that don’t fall down on disabled children”, then it has my unqualified support. If it doesn’t mean that, what does it mean? If it’s an objective, I’d like to know what it is, please.

The authors will apparently “work collaboratively with the family information service”. The definition of “collaborate” is “work together on a common enterprise or project”, so “working with” anyone is collaboration by definition.

There will be a “dynamic process that engages disabled children”; in what sense will it be dynamic (aren’t processes supposed not to change all the time)? What are “opportunity change plans” – or is this a typo for “with the opportunity to change plans”?

I also noticed some horrible jargon that seems to deliberately obscure the message:

There are no plans to disinvest in existing residential provision in Birmingham. An increase in capacity is being sought through the more flexible use of residential provision…

I’m interested in the word “disinvest”. As an opposite to “invest”, it would seem that they do not plan to withdraw money or reduce funding to existing residential provision, although this message is diluted by using the passive rather than “We have no plans…” and not using the more direct “We will not reduce funding…”.

But “not disinvesting” doesn’t sound like there will be more money going into the system. At the same time “an increase in capacity is being sought” [by whom?]. So my decoding suggests that what they really mean is “Our funding of existing residential provision will not change, but we will require working practices to change so that we can accommodate more children”.

That may or may not be a good thing; I can’t judge. But it sounds like it might have a profound effect on staff and the disabled children, so I wonder why it is written so unclearly.

I’m interested to know why there was no budget to proof-read this document or even to turn it into the language of people like me – a parent. Currently, it reads like a memo between policy wonks.

Or is it deliberately designed to obscure the message and get through on the nod?