The road to HTML5: conformance of HTML4 documents

By crisp on Sunday 3 February 2008 23:37 - Comments (3)
Categories: HTML5, Views: 11.263

Recently I ran the frontpage through the (experimental) HTML5 validator by Henri Sivonen to see how forwards-compatible we are. The result wasn't too bad: just 13 errors.

I'm writing this because Henri himself recently posted an analysis of recent validations through the service to the HTML WG mailing list, in which he makes some recommendations about current practices that are non-conforming under the current HTML5 specification draft. It seems we suffer from some of these too.

Let's go through the errors. Fortunately there are far fewer of them since the HTML5 WG relaxed the content model restrictions in December of last year. Before that, this kind of markup was deemed invalid because it mixed block-level and inline-level elements within another block-level element:
    <li><a href="">Frontpage</a>
        <ul>
            <li><a href=""></a></li>
            <li><a href="">Core</a></li>
            <li><a href="">Life</a></li>
            <li><a href="">Pro</a></li>
        </ul>
    </li>

Error no. 1:
Error: Bad value Content-Language for attribute http-equiv on element meta.

I don't think this should be an error. W3C itself describes this meta-tag as a means to identify the intended audience for the document (in our case: a Dutch or Dutch-speaking audience). The fact that we use a meta-tag instead of an actual HTTP header doesn't change its meaning.
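
To make the comparison concrete, here is a small sketch of the two ways of stating the audience language. The value `nl` is my assumption for a Dutch audience; the header is shown in a comment because it belongs in the HTTP response rather than in the markup.

```html
<!-- Sent by the server as a real HTTP response header:
     Content-Language: nl -->

<!-- The in-document equivalent that the HTML5 validator rejects -->
<meta http-equiv="Content-Language" content="nl">
```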

Errors no. 2, 6, 11:
Error: Attribute cellspacing not allowed on element table at this point.

We use cellspacing=0 on tables since IE doesn't yet support the CSS border-spacing property and border-collapse isn't always preferable.
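
For comparison, a minimal sketch of the attribute next to its CSS counterpart. The class name `listing` is hypothetical and only there for illustration. Browsers that implement border-spacing should render both tables the same, while IE ignores the declaration, which is why the attribute stays in our markup for now.

```html
<!DOCTYPE html>
<html>
<head>
<title>cellspacing versus border-spacing</title>
<style>
  /* Hypothetical class name, used only for this illustration */
  table.listing { border-spacing: 0; }
</style>
</head>
<body>
<!-- Presentational attribute: flagged by the HTML5 validator -->
<table cellspacing="0"><tr><td>A</td><td>B</td></tr></table>

<!-- CSS counterpart: conforming, but not applied by IE -->
<table class="listing"><tr><td>A</td><td>B</td></tr></table>
</body>
</html>
```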

Errors no. 3, 4, 5, 7, 8, 9, 10, 12, 13:
Error: Attribute width not allowed on element col at this point.

We insert (calculated) width attributes in order to force more consistent table rendering across browsers. Using the width attribute is less verbose than using style="width:__px" and has the same effect. Given the many different types of tables that we generate, this is a better solution for us than having to specify a separate class for each width that we might want to set on a column.
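
A rough sketch of the two forms side by side. The widths 120 and 80 are made-up values standing in for whatever we calculate, not numbers taken from our pages.

```html
<!-- Attribute form: what we generate today, non-conforming in the HTML5 draft -->
<table>
  <colgroup>
    <col width="120">
    <col width="80">
  </colgroup>
  <tr><td>Name</td><td>Score</td></tr>
</table>

<!-- Inline-style form: conforming, but more verbose for the same widths -->
<table>
  <colgroup>
    <col style="width: 120px">
    <col style="width: 80px">
  </colgroup>
  <tr><td>Name</td><td>Score</td></tr>
</table>
```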

So basically these are just 3 distinct errors, for each of which we have a solid use case. Looking further at other pages I find errors on the use of <center>, which is a valid error (also in HTML4 Strict) but is still a legacy of our CMS (and obviously using <div class="center"> wouldn't really make it better). I also find errors on the size attribute for <input>, for which there is no CSS alternative (and which really doesn't belong in CSS anyway), so that is also a use case for conformance in HTML5.

Henri himself had some interesting use cases as well:

- cases where the http-equiv Content-Type meta isn't the first element in the head-section
- whitespace in href-attributes; I feel that leading and trailing whitespace in href-attributes should be ignored. Furthermore, I think it is bad practice not to URL-encode characters that are not valid in a URI, and a space character inside such an attribute might indicate some other error (such as forgetting to properly enclose the attribute value in quote characters), so spaces within a URI should imo still be an error.
- border=0 attributes; especially for images within an anchor this can be considered an anti-presentational attribute, since no one really likes the old default linked-image border style (and CSS directives will still take precedence).
- <acronym> has disappeared in HTML5 in favor of <abbr>, which is striking since IE did not support <abbr> until IE7 (and even IE7 gives it no default styling).
- <wbr> still seems to be used a lot although it has never been part of any specification. Someone noted that since Firefox 3 will support &shy; this may soon not be an issue anymore, but imo <wbr> is more like Zero Width Space (&#8203;), and as long as we have to support IE6, which has poor Unicode support, I don't think there is a real solution for this.
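
A few of the cases in this list, sketched as markup. The file names and sample text are made up for illustration, so treat this as a rough picture of the alternatives rather than as output from the validator.

```html
<!DOCTYPE html>
<html>
<head>
<!-- The Content-Type meta comes first, before anything else in the head -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Hypothetical examples</title>
<style>
  /* CSS replacement for border="0" on linked images */
  a img { border: 0; }
</style>
</head>
<body>
<!-- A space inside the URI is percent-encoded as %20 instead of left raw -->
<p><a href="my%20page.html">Encoded link</a></p>

<!-- Linked image without the presentational border attribute -->
<p><a href="index.html"><img src="logo.png" alt="Logo"></a></p>

<!-- Zero Width Space as a line-break opportunity instead of <wbr> -->
<p>averyveryverylong&#8203;unbrokenword</p>
</body>
</html>
```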

There are a number of other cases where you may or may not agree that they should be made conforming in HTML5. In any case the road to HTML5 may not be without obstacles as long as we have to support browsers like IE (including IE8 and later versions, without having to resort to some meta-tag).



By Tweakers user TeeDee, Monday 4 February 2008 09:42

My own site is many times smaller, but it's nice to see that my investment in working standards-compliantly is starting to pay off.

Of the 15 errors (with HTML5 exp. and the HTML5 parser), 12 come from an externally loaded feed that I didn't feel like fixing up completely (align attributes on an image).

My employer's website also 'does' reasonably well. I ran a few spot checks and fortunately it's all not too bad.

By Tweakers user MueR, Monday 4 February 2008 11:34

On my website, I got 4 errors, of which 2 were cellpadding/spacing.
The most striking one was this:
Error: Attribute value not allowed on element input at this point.

By Tweakers user 147846, Friday 16 May 2008 09:16

On a rather big website of mine I got 7 errors, 1 fatal error and one warning. I'm quite happy with these results; the site is written in HTML4 Strict.

The one thing that caught my eye was the fatal error. It pointed at the Google Analytics code, so should I now conclude that you can't get an HTML5-valid website when using Google Analytics? Or is it just the way I place it (at the bottom of the page)?

I haven't gotten any "Attribute value not allowed at this point" errors though.
