Issues with FOP+Saxon generating the FOP intermediate format (area tree)
Added by Nico Kutscherauer 1 day ago
Hi,
As the title says, I have a problem generating the FOP intermediate XML format when FOP and Saxon are used together. I have reproduced the issues and described the problems in this GitHub repository.
The user perspective is:
I have the following in my XSL-FO input:
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:fox="http://xmlgraphics.apache.org/fop/extensions">
<!-- ... -->
<fo:declarations>
<pdf:catalog xmlns:pdf="http://xmlgraphics.apache.org/fop/extensions/pdf">
<pdf:dictionary type="normal" key="ViewerPreferences">
<pdf:boolean key="DisplayDocTitle">true</pdf:boolean>
</pdf:dictionary>
</pdf:catalog>
</fo:declarations>
<!-- ... -->
</fo:root>
If Saxon is on the classpath, the resulting FOP area tree contains this:
<document xmlns="http://xmlgraphics.apache.org/fop/intermediate"
version="2.0">
<header>
<pdf:catalog>
<!-- ... -->
</pdf:catalog>
The namespace declaration for the prefix pdf is missing, so the result is not well-formed!
If Xalan is on the classpath, the result is well-formed:
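The missing declaration can be confirmed independently of FOP and Saxon. The following is a minimal sketch, assuming only the JDK's built-in JAXP parser; the class name `NamespaceCheck` and the trimmed sample documents are illustrative and not taken from the repository. It reports whether a document parses in namespace-aware mode, which is where an undeclared prefix is rejected.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of FOP or Saxon.
class NamespaceCheck {

    /** Returns true if the given XML parses with a namespace-aware parser. */
    static boolean parsesNamespaceAware(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and prints nothing itself.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // An unbound prefix such as "pdf" is reported as a fatal error.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String ifNs = "http://xmlgraphics.apache.org/fop/intermediate";
        // Trimmed stand-ins for the two area trees shown in this post.
        String unbound = "<document xmlns=\"" + ifNs + "\" version=\"2.0\">"
                + "<header><pdf:catalog/></header></document>";
        String bound = "<document xmlns=\"" + ifNs + "\" version=\"2.0\">"
                + "<header><pdf:catalog xmlns:pdf=\"apache:fop:extensions:pdf\"/></header></document>";
        System.out.println("without declaration: " + parsesNamespaceAware(unbound));
        System.out.println("with declaration:    " + parsesNamespaceAware(bound));
    }
}
```

A check like this could be run over the generated intermediate file in a test, so that a change of XSLT processor on the classpath is noticed early.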
<document xmlns="http://xmlgraphics.apache.org/fop/intermediate" version="2.0">
<header>
<pdf:catalog xmlns:pdf="apache:fop:extensions:pdf">
<!-- ... -->
</pdf:catalog>
I did some debugging and provided the details here. I'm not sure whether that is useful.
I don't expect that this is a Saxon bug, but I'm not deep enough into SAX/JAXP to argue that or to identify the problem in FOP. Do you have a hint or a guess as to how this could happen?
Thanks & Best Regards,
Nico
Replies (2)
RE: Issues with FOP+Saxon generating the FOP intermediate format (area tree) - Added by Michael Kay about 23 hours ago
Without looking too deeply into the gory detail, I suspect the problem is caused by known weaknesses in the design of the JAXP ContentHandler interface. Specifically, the details of the events that are passed across, especially for namespaces, depend on the property settings of the XML "Parser" (that is, the component that issues the events). But in an application like this, Saxon has no opportunity to configure the parser, nor even to discover what these property settings are. We therefore decided to mandate that if an application wants to supply the input using this interface, it is required to conform to our expectations on these settings, and that we wouldn't incur the (significant) cost of verifying what is passed over.
See in particular the Javadoc comments on ReceivingContentHandler:
* <p>The {@code ReceivingContentHandler} is written on the assumption that it is receiving events
* from a parser configured with {@code http://xml.org/sax/features/namespaces} set to true
* and {@code http://xml.org/sax/features/namespace-prefixes} set to false.</p>
* <p>When running as a {@code TransformerHandler}, we have no control over the feature settings
* of the sender of the events, and if the events do not follow this pattern then the class may
* fail in unpredictable ways.</p>
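For comparison, this is roughly what the event stream looks like when the sender is a real parser with the feature settings quoted above. It is a minimal sketch using only the JDK's SAX parser; the class name `FeatureDemo` and the one-element input are illustrative assumptions. It logs the prefix mapping and the element event so the order and content of the calls can be inspected.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Saxon or FOP.
class FeatureDemo {

    /** Parses the XML and returns a log of the namespace-related SAX events. */
    static List<String> events(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // The two settings named in the ReceivingContentHandler Javadoc.
        reader.setFeature("http://xml.org/sax/features/namespaces", true);
        reader.setFeature("http://xml.org/sax/features/namespace-prefixes", false);

        List<String> log = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                log.add("startPrefixMapping " + prefix + "=" + uri);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // With namespace-prefixes=false, xmlns:* declarations are not reported as attributes.
                log.add("startElement {" + uri + "}" + localName
                        + " qName=" + qName + " attributes=" + atts.getLength());
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<pdf:catalog xmlns:pdf=\"http://xmlgraphics.apache.org/fop/extensions/pdf\"/>";
        events(xml).forEach(System.out::println);
    }
}
```

With these settings the namespace binding is visible only through `startPrefixMapping()`, not through the attribute list, so a sender that skips that call gives the receiver no way to learn about the binding.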
There's also a comment on the startElement method:
* <p>This event allows up to three name components for each
* element:</p>
*
* <ol>
* <li>the Namespace URI;</li>
* <li>the local name; and</li>
* <li>the qualified (prefixed) name.</li>
* </ol>
*
* <p>Saxon expects all three of these to be provided.
We also rely on startPrefixMapping() and endPrefixMapping() calls happening.
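To illustrate the event pattern described here, the following is a minimal sketch of a sender that writes one namespaced element to a TransformerHandler. It assumes nothing about FOP's actual code; the class name `PrefixMappingSender` is hypothetical. Note that `TransformerFactory.newInstance()` returns whichever implementation is found on the classpath, which is exactly the variable in this thread, and the cast assumes that implementation supports SAX (the JDK default, Xalan and Saxon do).

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sender, written for illustration only.
class PrefixMappingSender {

    static final String PDF_NS = "http://xmlgraphics.apache.org/fop/extensions/pdf";

    /** Sends a single pdf:catalog element through an identity TransformerHandler. */
    static String emitCatalog() throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        handler.startDocument();
        // Announce the binding before the element that uses the prefix.
        handler.startPrefixMapping("pdf", PDF_NS);
        // Supply all three name components: URI, local name and qualified name.
        handler.startElement(PDF_NS, "catalog", "pdf:catalog", new AttributesImpl());
        handler.endElement(PDF_NS, "catalog", "pdf:catalog");
        handler.endPrefixMapping("pdf");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(emitCatalog());
    }
}
```

Removing the `startPrefixMapping()` and `endPrefixMapping()` calls from this sketch is one way to test whether a given processor tolerates their absence; the Javadoc quoted above indicates that Saxon makes no such promise.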
RE: Issues with FOP+Saxon generating the FOP intermediate format (area tree) - Added by Michael Kay about 21 hours ago
I strongly suspect that the startPrefixMapping() and endPrefixMapping() methods on the ContentHandler are not being called.