HTML5 - why not use XML syntax?

By crisp on Sunday 08 July 2007 02:00
Category: HTML, Views: 620

The XML-fanboys are at it again, this time tripping over the actual syntax used in W3C documents such as the HTML 5 differences from HTML 4 doc. Next comes a flurry of mails from people suggesting that HTML5 should actually make XML-syntax an author-conformance requirement.

Let's take a look at some of the arguments:

- It is claimed that XML-syntax is more logical and is easier to teach.
I agree to some extent: consistency is a great thing, and I do encourage people to, for instance, always quote their attributes and include tags that are optional by spec. However, I do not see any logic in having to explicitly close elements that by nature cannot have any content, such as <br>. Also, how many people always write out implicit elements such as <tbody> in their XHTML markup today?

- It is claimed that XML-syntax is easier to parse
I agree that an XML-parser is much less complex than a parser that has to deal with less strict syntax requirements, but we're not talking about XML-parsers here. We're talking about HTML-parsers, which are still required to deal with non-strict syntax whether that syntax is conforming or not. The parser won't become any less complex, so parsing probably won't become much faster either.

- It is claimed that XML-syntax is easier to read
Since when is markup-syntax meant for human consumption? Should indentation also be made a conformance-requirement? You can always use a prettifier or just reserialize to XML-syntax from a built DOM-tree. This is clearly a non-issue.

- It is claimed that you can use XML-syntax in HTML today because every browser supports it
That is not true. Browsers have never implemented HTML as a true SGML-application (by nature only validators use a true SGML parser; you can't blame them for following the specifications when implementations don't), and thus they won't trip over the XML short-close syntax for empty elements. It is still not possible, however, to use for instance <script/>. Also, the use of an XML-declaration will force some browsers into quirks mode.
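To make the parsing argument above a bit more concrete, here is a minimal sketch using only Python's standard library. It assumes that `xml.etree.ElementTree` is a fair stand-in for "an XML-parser" and that `html.parser.HTMLParser` is a fair stand-in for a lenient HTML tokenizer; the latter is not a browser-grade HTML parser and builds no DOM. The helper names `xml_accepts` and `html_start_tags` are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def xml_accepts(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient tokenizer that simply records every start tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Feed markup to the lenient tokenizer and return the start tags found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    html_style = "<p>line one<br>line two</p>"   # unclosed <br>, fine in HTML
    xml_style = "<p>line one<br/>line two</p>"   # short-close syntax

    print("XML parser accepts HTML style:", xml_accepts(html_style))
    print("XML parser accepts XML style: ", xml_accepts(xml_style))
    print("HTML tokenizer sees:", html_start_tags(html_style))
```

The XML parser can afford to be simple because it is allowed to give up on the first form. An HTML parser has no such luxury, and making the XML form an author requirement would not remove the code that handles the other one.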

Now I will present some arguments against making XML-syntax an author-conformance requirement:

- The more strict you make author conformance the less likely people will comply
HTML has been around for some time now and most people have learnt by example. Making the rules more strict will only confuse these people and drive them away from standards as they will not understand why suddenly what they've learned isn't "valid" anymore.

- It will punish people who have done the "right thing" in the past using HTML
Some people who are standards-aware have consciously avoided the "faux XHTML" trap and conformed to strict HTML compliance. An XML-syntax requirement will place an extra burden upon these people to make their documents HTML5-compliant.

- It makes markup needlessly verbose
And therefore doesn't cater to people who need to make their documents as small as possible while still conforming.
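As a rough illustration of the verbosity point, the sketch below compares the byte size of the same small table written with optional tags omitted and written in full XML-syntax. Both markup strings and the helper names `size_in_bytes` and `extra_bytes` are invented for this example; real-world savings depend entirely on the document.

```python
# The same two-by-two table, written two ways.
# HTML 4 lets authors omit <tbody> and the end tags of <tr> and <td>.
HTML_FORM = "<table><tr><td>a<td>b<tr><td>c<td>d</table>"

# XML-syntax needs every element closed; <tbody> is written out explicitly
# here so that both forms describe the same tree.
XML_FORM = (
    "<table><tbody>"
    "<tr><td>a</td><td>b</td></tr>"
    "<tr><td>c</td><td>d</td></tr>"
    "</tbody></table>"
)


def size_in_bytes(markup: str) -> int:
    """Size of the markup when encoded as UTF-8."""
    return len(markup.encode("utf-8"))


def extra_bytes(short_form: str, long_form: str) -> int:
    """How many more bytes the long form costs than the short form."""
    return size_in_bytes(long_form) - size_in_bytes(short_form)


if __name__ == "__main__":
    print("HTML form:", size_in_bytes(HTML_FORM), "bytes")
    print("XML form: ", size_in_bytes(XML_FORM), "bytes")
    print("Overhead: ", extra_bytes(HTML_FORM, XML_FORM), "bytes")
```

Compression narrows the gap on the wire, but the uncompressed difference still grows with every row and cell.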

XML-fanboys are still allowed to use the XML-serialization of HTML5 (whether it be called XHTML1.5, XHTML5, HTML5/XML or whatever), or are encouraged to join the XHTML2 WG, whose work will probably never see a browser-implementation.
