systemId of input seems to be lost along the way

Added by Douglas Atique over 8 years ago

Hi,

I am trying to run a test Java program that uses Saxon-HE 9.7.0-2 to do an XSLT 2.0 transform. The Java code consists of a simple main class that goes like:

    SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
    TransformerHandler th1 = tf.newTransformerHandler(new StreamSource("stage1.xsl"));
    TransformerHandler th2 = tf.newTransformerHandler(); // this will be a second transformation step when I am finished with the stylesheet above
    Transformer xf = tf.newTransformer();
    StreamResult out = new StreamResult(System.out);
    th2.setResult(out);
    SAXResult stage2 = new SAXResult(th2);
    th1.setResult(stage2);
    SAXResult stage1 = new SAXResult(th1);
    InputSource in = new InputSource(new File("input.xml").toURI().toURL().toExternalForm());
    xf.transform(in, stage1);

The stage1 stylesheet is a very simple (at this point) script, in which I am trying to get the URI of the input document (headers omitted):

    <xsl:template match="/">
      <xsl:attribute name="path"><xsl:value-of select="document-uri(/)"/></xsl:attribute>
    </xsl:template>

Now I run this test and put breakpoints in the DocumentPool.add methods, and I get the following stack trace when the program stops at one of them:

    DocumentPool.add:64  <-- here I see the document added to the pool without a system ID (which will cause document-uri function to fail)
    Controller.registerDocument:1461
    XsltTransformer.transform:553
    TransformerImpl.transform:183
    TransformerHandlerImpl.endDocument:172  <-- here the systemID is null and I don't see it propagated anywhere
    ContentHandlerProxy.close:288  <-- here I still see a systemID instance variable pointing to the correct input document URI
    ProxyReceiver.close:103
    ComplexContentOutputter.close:618
    ReceivingContentHandler.endDocument:244
    AbstractSAXParser.endDocument:745
    XMLDocumentFragmentScannerImpl.scanDocument:515
    XML11Configuration.parse:848
    XML11Configuration.parse:777
    XMLParser.parse:141
    AbstractSAXParser.parse:1213
    Sender.sendSAXSource:449
    Sender.send:151
    IdentityTransformer:transform:362
    Main.main:...

I would like to ask: Is this a bug I found? Or am I doing something wrong when I invoke the XSLT transform? I have tried invoking Saxon on the command-line for the same stage1.xsl and input.xml, and the document-uri function works as expected.


Replies (3)

RE: systemId of input seems to be lost along the way - Added by Michael Kay over 8 years ago

I'm confused that you're supplying an InputSource to the Transformer.transform() method. This can't work, surely?

But I think the essence of the issue is that you're running an identity transform to pipe input into the first transformation step. I can't see why you want to do that, but you presumably have your reasons. Should the document-uri() (or system ID) of the output of an identity transform be the same as the document-uri() (or system ID) of the input? I don't think any of the specs are likely to tell us. In general I think you should assume that the system ID of in-memory documents that have transient existence as intermediate steps in a pipeline (whether DOM trees in memory or SAX streams) is undefined.

RE: systemId of input seems to be lost along the way - Added by Douglas Atique over 8 years ago

Well... actually I forgot to mention I have also been testing a custom EntityResolver to retrieve my DTD from the classpath. So there are actually a few more lines between that InputSource instantiation and the transform invocation. I create an XMLReader, set its entity resolver, create a SAXSource, pass it both the InputSource and the XMLReader, and then invoke the transform on the SAXSource. Sorry, I omitted that part.
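
For readers following along, the omitted lines might look roughly like the sketch below. It is only an illustration of the steps just described, not the original code: the class name ResolvingSource, the helper names, and the classpath directory passed to classpathResolver are all hypothetical.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper class; names are illustrative only.
class ResolvingSource {

    // Assumed resolver: looks up any *.dtd system ID by file name under a
    // classpath directory such as "/dtd/". Returning null tells the parser
    // to fall back to its default resolution.
    static EntityResolver classpathResolver(String resourceDir) {
        return (publicId, systemId) -> {
            if (systemId == null || !systemId.endsWith(".dtd")) {
                return null;
            }
            String name = systemId.substring(systemId.lastIndexOf('/') + 1);
            InputStream dtd = ResolvingSource.class.getResourceAsStream(resourceDir + name);
            if (dtd == null) {
                return null;
            }
            InputSource src = new InputSource(dtd);
            src.setSystemId(systemId); // keep the original system ID on the DTD
            return src;
        };
    }

    // Builds a SAXSource whose XMLReader uses the given EntityResolver and
    // whose InputSource carries the system ID of the input document.
    static SAXSource build(String systemId, EntityResolver resolver) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        return new SAXSource(reader, new InputSource(systemId));
    }
}
```

The SAXSource returned by build(...) would then be passed to transform() in place of the bare InputSource shown in the first post.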

I managed to get the systemId after all. You pointed me to the mistake right away. I had followed a recipe found on Stack Overflow in which the Transformer is created without arguments (tf.newTransformer()), as I did above. Now I call getTransformer() on the first TransformerHandler to obtain the transformer instead, and that way the systemId gets through and document-uri(/) works.
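
One plausible reading of that change, sketched with plain JAXP calls only. The thread does not show the revised code, so the wiring below (the first stage driven directly through its own Transformer, with its output sent to the second handler) is an assumption, and the class and method names are hypothetical.

```java
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;

// Hypothetical sketch; not the original poster's code.
class PipelineSketch {

    static void run(Source stylesheet, Source input, Result out) throws Exception {
        // With Saxon on the classpath this is expected to return Saxon's factory.
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();

        TransformerHandler th1 = tf.newTransformerHandler(stylesheet);
        TransformerHandler th2 = tf.newTransformerHandler(); // identity placeholder for stage 2
        th2.setResult(out);

        // Use the stage-1 transformer itself rather than a separate identity
        // transformer from tf.newTransformer(), so the input (and its system ID)
        // is handed straight to the stylesheet.
        Transformer xf = th1.getTransformer();
        xf.transform(input, new SAXResult(th2));
    }
}
```

Here the input argument would be the SAXSource built from the XMLReader and InputSource, and the stylesheet argument would be new StreamSource("stage1.xsl").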

Now I need to figure out the next step. I want to use resolve-uri(), document-uri(/) and document() to resolve some node text as a relative path to an included file, and make document() find the same DTD in the included file. Right now, it looks like any included document's DTD will not be found, even though the top-level document's DTD is found.

Thanks for pointing me the way.

RE: systemId of input seems to be lost along the way - Added by Douglas Atique over 8 years ago

I just got hold of another answer of yours (http://saxon-help.narkive.com/wqnRay2M/wiring-an-entityresolver-into-saxon) telling me to define a URIResolver that returns a SAXSource already equipped with the custom EntityResolver I need. That's great!
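
A minimal sketch of such a URIResolver, assuming the EntityResolver is supplied by the caller. The class name ResolvingUriResolver is hypothetical, and the URI handling is deliberately simple (it resolves href against base with java.net.URI and nothing more).

```java
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical sketch: every document requested via document() is parsed
// by an XMLReader that has the custom EntityResolver installed.
class ResolvingUriResolver implements URIResolver {

    private final EntityResolver entityResolver;

    ResolvingUriResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Resolve the requested href against the base URI, if there is one.
            URI resolved = (base == null || base.isEmpty())
                    ? URI.create(href)
                    : URI.create(base).resolve(href);

            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setEntityResolver(entityResolver);

            // The InputSource carries the resolved system ID of the included file.
            return new SAXSource(reader, new InputSource(resolved.toString()));
        } catch (Exception e) {
            throw new TransformerException(e);
        }
    }
}
```

It could be installed with the standard JAXP setURIResolver method on the TransformerFactory or on the Transformer, so that included documents find the same DTD as the top-level one.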

Thanks for your valuable help.
