Bug #4401

HTML serialization: position of DOCTYPE declaration

Added by Michael Kay 8 months ago. Updated 8 months ago.

Status: New
Priority: Low
Assignee:
Category: -
Sprint/Milestone: -
Start date: 2019-12-03
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:

Description

The W3C serialization specification says that with the HTML output method, the DOCTYPE declaration should appear "immediately before" the first start-element tag. That implies it should appear after any opening comments and processing instructions. I'm pretty sure we output it before any opening comments and PIs.
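
To make the two orderings concrete, here is a minimal Python sketch. It is not Saxon code: the function name place_doctype, its parameters, and the literal "<!DOCTYPE HTML>" string are assumptions made for illustration, and nodes are represented as already-serialized strings.

```python
def place_doctype(prolog, root, doctype="<!DOCTYPE HTML>", before_first_element=True):
    """Join serialized prolog nodes, the DOCTYPE, and the root element.

    prolog: serialized comments/PIs that precede the first element (hypothetical input).
    before_first_element: True follows the reading of the serialization spec given above;
    False mimics what the description suspects the current output looks like.
    """
    if before_first_element:
        parts = [*prolog, doctype, root]   # comments/PIs, then DOCTYPE, then <html>
    else:
        parts = [doctype, *prolog, root]   # DOCTYPE first, then comments/PIs
    return "".join(parts)


prolog = ["<!--hey!-->"]
root = "<html><head><title>Howdy</title></head><body>How are you?</body></html>"

spec_order = place_doctype(prolog, root, before_first_element=True)
suspected_order = place_doctype(prolog, root, before_first_element=False)
```

Here spec_order begins with the comment followed by the DOCTYPE, while suspected_order begins with the DOCTYPE followed by the comment.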

Need to check against the HTML spec to see what makes most sense.

Note also, we've had a couple of people trying to output PHP using the HTML output method, and stumbling over the fact that the HTML output method outputs PIs with a closing ">" rather than "?>". We've also had requests to suppress the DOCTYPE output for HTML5.
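
The difference in PI termination can be sketched as follows. This is an illustrative Python sketch only: serialize_pi and its parameters are hypothetical names, not part of any Saxon API.

```python
def serialize_pi(target, data, method="html"):
    """Serialize a processing instruction for the given output method.

    The HTML output method closes a PI with ">", the XML output method with "?>".
    (Hypothetical helper, for illustration only.)
    """
    closer = ">" if method == "html" else "?>"
    return f"<?{target} {data}{closer}"


html_pi = serialize_pi("php", "echo 'hi';", method="html")
xml_pi = serialize_pi("php", "echo 'hi';", method="xml")
```

With the HTML method the result ends in a bare ">", which is why PHP code emitted this way is left without its expected "?>" terminator.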

History

#1 Updated by Michael Kay 8 months ago

  • Description updated

#2 Updated by Michael Kay 8 months ago

I've confirmed that we output the DOCTYPE before any comments, using this simple XQuery command line:

-qs:"<!--hey!-->,<html><head><title>Howdy</title></head><body>How are you?</body></html>" !method=html !html-version=5.0

I can't think of any good reason why the serialization spec says what it does, but it feels like a deliberate statement rather than something that slipped in by accident.

The HTML5 spec allows comments both before and after the DOCTYPE declaration, and there doesn't seem to be any strong reason for preferring one over the other. (Processing instructions aren't defined in the HTML syntax of HTML5, though the recovery features in the parser will cause them to be treated as comments.)

#3 Updated by Michael Kay 8 months ago

Further observation: the requirement to output the DOCTYPE "immediately before the first element" has been there since XSLT 1.0.

Moreover, the same requirement applies to the XML output method. (And for XML, this is what Saxon currently does.)

For HTML, I've added test case output-0233.
