XML->HTML: No newline after doctype declaration
Added by Denis Maier over 1 year ago
Hi, first post here.
I'm converting XML to HTML, and in my output files there is no newline after the doctype declaration, i.e. the root element is on the same line as the doctype declaration, like so:
<!DOCTYPE HTML><html lang="de">
Is this normal? Do I need to use some setting to add the newline? Or is there something wrong with my xsl? I can, of course, prepare a MWE if that's necessary, but I thought maybe I'm missing something fundamental.
Replies (11)
RE: XML->HTML: No newline after doctype declaration - Added by Michael Kay over 1 year ago
The serialization specification says:
If the HTML output method MUST output a document type declaration, it MUST be serialized immediately before the first element, if any, and the name following <!DOCTYPE MUST be HTML or html.
I guess we're interpreting "immediately" as meaning that intervening whitespace isn't allowed. Perhaps that's an excessively pedantic interpretation of the spec, but we generally prefer to do exactly what it says unless there's a good reason otherwise.
Is there any particular reason you want the newline here?
Have you tried indent="yes"?
RE: XML->HTML: No newline after doctype declaration - Added by Denis Maier over 1 year ago
Yes, I have indent="yes" on xsl:output.
> Is there any particular reason you want the newline here?
No, nothing particular. It's just that the doctype declaration appears with the newline, on a line of its own, in just about every HTML tutorial you'll find online, e.g.:
- https://www.w3schools.com/html/
- https://wiki.selfhtml.org/wiki/HTML/Tutorials/Grundger%C3%BCst
- https://developer.mozilla.org/en-US/docs/Learn/Getting_started_with_the_web/HTML_basics
So, I assumed that this is the somewhat more official way of doing things.
But, if it isn't a problem to have the declaration and the root element on the same line, I can live with that. I just wanted to make sure I'm not missing anything.
RE: XML->HTML: No newline after doctype declaration - Added by Michael Kay over 1 year ago
It seems SaxonJ does the same. I would have expected the newline with indent="yes".
RE: XML->HTML: No newline after doctype declaration - Added by Denis Maier over 1 year ago
Oh, it looks like I’ve posted to the wrong forum. I’m actually using the Java version (through Oxygen) and the .NET version (via transform on the command line). There’s no JS involved.
RE: XML->HTML: No newline after doctype declaration - Added by Denis Maier over 1 year ago
Well, that's weird. I've tested with transform on the command line. There, I seem to be getting the result I want. Is that possible?
RE: XML->HTML: No newline after doctype declaration - Added by Michael Kay over 1 year ago
If you can tell us precisely WHAT you tested with transform on the command line....?
RE: XML->HTML: No newline after doctype declaration - Added by Denis Maier over 1 year ago
catalog = schema/catalog-bits-v2-1-with-base.xml
xsl = bits2html.xsl

html:
	transform input.xml -xsl:$(xsl) -catalog:$(catalog) -expand:off -o:output.html

Run with: make html
The start of the XSL looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    exclude-result-prefixes="xlink"
    >
  <xsl:output method="html" indent="yes" version="5" html-version="5" encoding="UTF-8"/>
  <xsl:preserve-space elements="p div"/>
  <xsl:strip-space elements="*"/>
RE: XML->HTML: No newline after doctype declaration - Added by Denis Maier over 1 year ago
Do you need more info from me?
As I said in the original post, I can boil it down to an MWE if that is necessary.
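For reference, a minimal stylesheet along the following lines should be enough to reproduce this. It is only a sketch: the template and the literal html/head/body content are made up for illustration and are not taken from bits2html.xsl; only the xsl:output settings match the ones shown above. Any well-formed XML file can serve as input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Same serialization settings as in the real stylesheet -->
  <xsl:output method="html" indent="yes" version="5" html-version="5" encoding="UTF-8"/>

  <!-- Hypothetical template: ignores the input and emits a fixed HTML skeleton -->
  <xsl:template match="/">
    <html lang="de">
      <head>
        <title>MWE</title>
      </head>
      <body>
        <p>Test</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

With this, the question is simply whether the serialized output starts with <!DOCTYPE HTML><html lang="de"> on a single line, or with the root element on a line of its own.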
RE: XML->HTML: No newline after doctype declaration - Added by Martin Honnen over 1 year ago
I wonder whether Saxon so far behaves as it does because https://www.w3.org/TR/xslt-xquery-serialization-31/#HTML_DOCTYPE states:
If the HTML output method MUST output a document type declaration, it MUST be serialized immediately before the first element
RE: XML->HTML: No newline after doctype declaration - Added by Michael Kay over 1 year ago
I have changed the HTML5 serializer in SaxonJ (which will also affect SaxonCS and SaxonC) so that if indentation is on, a newline is output after the DOCTYPE declaration. This change will apply from 12.2; I'm not retrofitting it to the 11.x or earlier branches.
RE: XML->HTML: No newline after doctype declaration - Added by Denis Maier over 1 year ago
Thanks! That's great.