HTML serialization: position of DOCTYPE declaration
The W3C serialization specification says that with the HTML output method, the DOCTYPE declaration should appear "immediately before" the first start-element tag. That implies it should appear after any opening comments and processing instructions. I'm pretty sure we output it before any opening comments and PIs.
Need to check against the HTML spec to see what makes most sense.
Note also, we've had a couple of people trying to output PHP using the HTML output method, and stumbling over the fact that the HTML output method terminates PIs with ">" rather than "?>". We've also had requests to suppress the DOCTYPE output for HTML5.
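To make the PI difference concrete, here is a minimal sketch of how the two output methods terminate a processing instruction. The function serialize_pi is hypothetical and written only for illustration; it is not Saxon code and does not call any Saxon API.

```python
def serialize_pi(target: str, data: str, method: str = "xml") -> str:
    """Illustrative only: render a processing instruction with the
    terminator that the XML or HTML output method uses."""
    # The HTML output method closes a PI with ">", the XML method with "?>".
    closing = ">" if method == "html" else "?>"
    body = f"{target} {data}" if data else target
    return f"<?{body}{closing}"


print(serialize_pi("php", "echo 'hi';", "xml"))   # <?php echo 'hi';?>
print(serialize_pi("php", "echo 'hi';", "html"))  # <?php echo 'hi';>
```

The second form is what trips up people generating PHP: the PHP processor expects "?>" as the closing delimiter.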
#2 Updated by Michael Kay about 1 year ago
I've confirmed that we output the DOCTYPE before any comments, using the simple XQuery command line:
-qs:"<!--hey!-->,<html><head><title>Howdy</title></head><body>How are you?</body></html>" !method=html !html-version=5.0
I can't think of any good reason why the serialization spec says what it does, but it feels like a deliberate statement rather than something that slipped in by accident.
The HTML5 spec allows comments both before and after the DOCTYPE declaration, and there doesn't seem to be any strong reason for preferring one over the other. (Processing instructions aren't defined in the HTML syntax of HTML5, though the recovery features in the parser will cause them to be treated as comments.)
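A rough way to check which of the two layouts a serializer produces is sketched below. This is an assumption-laden helper, not part of Saxon or of test case output-0233: the name doctype_position and the sample strings are hypothetical, and the PI pattern simply stops at the first ">", so it would misread a PI whose data contains ">".

```python
import re

# One leading comment or processing instruction, with optional whitespace before it.
_PROLOG_ITEM = re.compile(r"\s*(<!--.*?-->|<\?.*?>)", re.DOTALL)
_DOCTYPE = re.compile(r"\s*<!DOCTYPE\b", re.IGNORECASE)


def doctype_position(serialized: str) -> str:
    """Classify where the DOCTYPE sits relative to leading comments and PIs.

    Returns "first" if nothing precedes it, "after-prolog" if at least one
    comment or PI precedes it, and "missing" if no DOCTYPE is found before
    the first element or text.
    """
    pos = 0
    skipped = 0
    while True:
        if _DOCTYPE.match(serialized, pos):
            return "first" if skipped == 0 else "after-prolog"
        item = _PROLOG_ITEM.match(serialized, pos)
        if item is None:
            return "missing"
        pos = item.end()
        skipped += 1


# Hand-written samples (not captured Saxon output) for the two layouts.
print(doctype_position("<!DOCTYPE HTML><!--hey!--><html></html>"))  # first
print(doctype_position("<!--hey!--><!DOCTYPE HTML><html></html>"))  # after-prolog
```

Under the reading of the serialization spec discussed above, the second layout ("after-prolog") is the one the spec appears to require, while the first is what we currently emit for HTML.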
#3 Updated by Michael Kay about 1 year ago
Further observation: the requirement to output the DOCTYPE "immediately before the first element" has been there since XSLT 1.0.
Moreover, the same requirement applies to the XML output method (and for XML, this is what Saxon currently does).
For HTML, I have added test case output-0233.