PI output bug in @method="html"
I'm working on a DITA to HTML5 project and generating HTML that contains PHP instructions. As you know, a PHP instruction is written as a processing instruction, in the form <?php [PHP instruction] ?>.
However, when I generate a processing instruction in HTML output, Saxon does not close it with "?>".
Here is a sample XSLT file and the result:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="xs math"
    version="3.0">
    <xsl:output method="html" encoding="UTF-8" html-version="5.0"/>
    <xsl:template name="xsl:initial-template">
        <!-- XSLT processor information -->
        <xsl:variable name="vendor" as="xs:string" select="system-property('xsl:vendor')"/>
        <xsl:variable name="vendorUrl" as="xs:string" select="system-property('xsl:vendor-url')"/>
        <xsl:variable name="productName" as="xs:string" select="system-property('xsl:product-name')"/>
        <xsl:variable name="productVersion" as="xs:string" select="system-property('xsl:product-version')"/>
        <html>
            <head>
                <title>Saxon PI Output Test</title>
            </head>
            <body>
                <main>
                    <xsl:comment select="'Running on: ' || $productName || ' (' || $vendorUrl || ') Version: ' || $productVersion"/>
                    <p>Here is PI output:</p>
                    <xsl:processing-instruction name="php">includeInnerHtml('C_19.php')</xsl:processing-instruction>
                </main>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>
<!DOCTYPE HTML>
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <title>Saxon PI Output Test</title>
   </head>
   <body>
      <main><!--Running on: SAXON (http://www.saxonica.com/) Version: PE 18.104.22.168-->
         <p>Here is PI output:</p>
         <?php includeInnerHtml('C_19.php')></main>
   </body>
</html>
This occurs only with HTML output. If I change to @method="xml", the problem does not occur.
Please fix this problem.
#1 Updated by Michael Kay 6 months ago
This behaviour is in accordance with section 7.3 of the XSLT 3.0/XQuery 3.1 Serialization specification at https://www.w3.org/TR/xslt-xquery-serialization-31/#HTML_CHARDATA :
The HTML output method MUST terminate processing instructions with > rather than ?>. It is a serialization error [err:SERE0015] to use the HTML output method when > appears within a processing instruction in the data model instance being serialized.
This reflects the rule in HTML 4.01 https://www.w3.org/TR/html401/appendix/notes.html#h-B.3.6
A processing instruction begins with <? and ends with >
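To make the quoted rule concrete, here is a minimal sketch (not taken from the report) that can be run with either output method. The template name with-gt and the PHP snippet inside it are hypothetical, chosen only to show a processing instruction whose content contains ">".

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <!-- Change method to "xml" to compare the two serializations. -->
  <xsl:output method="html" html-version="5.0" encoding="UTF-8"/>

  <!-- Default entry point: the processing instruction from the report.
       method="html" writes: <?php includeInnerHtml('C_19.php')>
       method="xml"  writes: <?php includeInnerHtml('C_19.php')?> -->
  <xsl:template name="xsl:initial-template">
    <html>
      <head><title>PI terminator test</title></head>
      <body>
        <xsl:processing-instruction name="php">includeInnerHtml('C_19.php')</xsl:processing-instruction>
      </body>
    </html>
  </xsl:template>

  <!-- Hypothetical second entry point: the PI content contains ">".
       Under method="html" the quoted rule makes this a serialization
       error (err:SERE0015), because ">" would end the PI too early. -->
  <xsl:template name="with-gt">
    <html>
      <head><title>PI containing a greater-than sign</title></head>
      <body>
        <xsl:processing-instruction name="php">if ($a &gt; 1) { echo 'x'; }</xsl:processing-instruction>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Invoking the named template with-gt as the initial template should, per the specification text above, fail under method="html" and succeed under method="xml".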
I have not been able to find any reference to processing instruction syntax in the HTML5 specifications, though they do refer to processing instruction nodes in the data model.
I would be interested to know if there are any practical use cases for processing instructions in HTML, and whether either syntax is accepted or rejected by deployed HTML browsers.
#2 Updated by Toshihiko Makita 6 months ago
Thank you for your reply.
I understand that the definition of a processing instruction differs between HTML and XML.
For your reference, <?php [PHP instruction] ?> is interpreted by the PHP processor on the server and is never passed to browser applications on the client side.
Also, processing instructions seem to be disallowed by the HTML5 specification:
22.214.171.124 Tag open state https://html.spec.whatwg.org/multipage/parsing.html#tag-open-state
Please close this ticket.
#3 Updated by Michael Kay 6 months ago
As a workaround for generating PHP code, I would suggest one of the following:
(a) Use method="xhtml"
(b) Use character-maps to output the "<?" and "?>" delimiters, perhaps from some obscure characters like ⬅︎ and ➡︎.
(c) Add a "?" at the end of the content of the PI, so that it is serialized with a closing "?>".
(d) Write some Java to customize the serializer - not as difficult as it might seem (call Configuration.setSerializerFactory, supplying a subclass of SerializerFactory that overrides newHTMLEmitter() to return a subclass of HTML50Emitter which overrides the
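
Workarounds (b) and (c) can be sketched in a single stylesheet as follows. This is an illustration under assumptions, not code from the thread: the character-map name php-delimiters is made up, and the two mapped characters (U+2B05 and U+27A1, the arrows without their variation selectors) are arbitrary choices. Any characters that never occur in the real content would do.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <xsl:output method="html" html-version="5.0" encoding="UTF-8"
              use-character-maps="php-delimiters"/>

  <!-- Workaround (b): map two otherwise unused characters to the PHP
       delimiters. The replacement strings are written without escaping. -->
  <xsl:character-map name="php-delimiters">
    <xsl:output-character character="&#x2B05;" string="&lt;?php "/>
    <xsl:output-character character="&#x27A1;" string=" ?&gt;"/>
  </xsl:character-map>

  <xsl:template name="xsl:initial-template">
    <html>
      <head><title>PHP delimiter workarounds</title></head>
      <body>
        <main>
          <!-- (b) Plain text wrapped in the two mapped characters;
               serialized as <?php includeInnerHtml('C_19.php') ?> -->
          <xsl:text>&#x2B05;includeInnerHtml('C_19.php')&#x27A1;</xsl:text>

          <!-- (c) A real PI whose content ends with "?"; the HTML method
               appends ">", giving <?php includeInnerHtml('C_19.php') ?> -->
          <xsl:processing-instruction name="php">includeInnerHtml('C_19.php') ?</xsl:processing-instruction>
        </main>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

With (b), every occurrence of the mapped characters in text and attribute values is replaced, so they must be reserved for this purpose. With (c), the same stylesheet would emit "??>" if the output method were later switched to xml.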