Bug #5615
closed
Saxon 11 - Cannot have two different documents with the same document-uri
Fixed in Maintenance Release:
Description
I'm getting this unhandled error, which probably comes from reusing a transformer (the code we use is quite complex in those parts):
7065 ERROR [ main ] ro.sync.quickfix.QuickFixExecutor - Cannot generate late quick fixes: net.sf.saxon.trans.XPathException: Cannot have two different documents with the same document-uri file:/D:/projects/../SchematronQF/add/element/selectAttrValue/topic.dita
net.sf.saxon.trans.XPathException: Cannot have two different documents with the same document-uri file:/D:/projects/../SchematronQF/add/element/selectAttrValue/topic.dita
at net.sf.saxon.om.DocumentPool.add(DocumentPool.java:69)
at net.sf.saxon.Controller.registerDocument(Controller.java:1004)
at net.sf.saxon.Controller.makeSourceTree(Controller.java:1359)
at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:343)
at net.sf.saxon.jaxp.TransformerImpl.transform(TransformerImpl.java:75)
Would it be a good idea for Controller.registerDocument to check whether the document is already in the pool before adding it?
    if (getDocumentPool().find(uri) == null) {
        sourceDocumentPool.add(doc, uri);
    }
Controller.registerDocument() is called from two places: from DocumentFn (implementing doc() and document()) and from Controller.makeSourceTree(), which handles the "primary input" to a transformation (to the extent that's still a meaningful concept). The main reason for putting the primary input in the document pool is so that it isn't parsed again if the transformation then uses doc() to access it. Yes, I guess it wouldn't do any harm if we find the document URI is already in use to just skip that: it just means anyone who tries to use document-uri() on the document is going to wonder why it hasn't got one.
Ok, I do not understand this part of what you are saying:
it just means anyone who tries to use document-uri() on the document is going to wonder why it hasn't got one.
So is there a side effect to checking whether the document is already in the pool when Controller.registerDocument is called?
document-uri() gives a result only for a document that's in the pool. That's because returning a document-uri() UUU for a document provides a guarantee that doc('UUU') will return that document; and that's why two documents aren't allowed to have the same document URI.
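The guarantee described above can be illustrated with a toy model. This is a minimal sketch, not Saxon's actual DocumentPool: the class ToyDocumentPool and its methods are hypothetical names, and documents are represented by plain Objects. It only shows the invariant that a URI maps to at most one document, so that a document-uri value always leads back to the same document via doc().

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a document pool (hypothetical; not Saxon's implementation). */
public class ToyDocumentPool {
    private final Map<String, Object> byUri = new HashMap<>();

    /** Registers doc under uri; rejects a different document with the same URI. */
    public void add(Object doc, String uri) {
        Object existing = byUri.get(uri);
        if (existing != null && existing != doc) {
            throw new IllegalStateException(
                "Cannot have two different documents with the same document-uri " + uri);
        }
        byUri.put(uri, doc);
    }

    /** Models doc(uri): returns the pooled document, or null if none is registered. */
    public Object find(String uri) {
        return byUri.get(uri);
    }

    /** Models document-uri(doc): only a document that is in the pool has one. */
    public String documentUri(Object doc) {
        for (Map.Entry<String, Object> entry : byUri.entrySet()) {
            if (entry.getValue() == doc) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

In this model, skipping registration when the URI is already taken avoids the exception, but the second document then has no document-uri at all, which is the side effect mentioned earlier in the thread.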
So if the XSLT uses document-uri() on a node from the current XML source, it would not return anything? That might be a problem. How about automatically discarding a document that is already in the pool when Controller.registerDocument is called?
    TreeInfo cachedDoc = getDocumentPool().find(uri);
    if (cachedDoc != null) {
        getDocumentPool().discard(cachedDoc);
    }
    sourceDocumentPool.add(doc, uri);
We can't do anything that would make the data mutable - the result of document-uri() applied to a node can't change over time.
The problems with this started with fn:transform(), which muddies the scope rules for things like the immutability of the result of doc().
I'm not seeing the entire picture here, so you can choose what's best, perhaps even not fixing this issue if it looks like something few people would run into, or something we could fix on our side by discarding all documents in the pool when reusing the transformer. Alternatively, in the method "net.sf.saxon.Controller.makeSourceTree(Source, int)", if there is already a DocumentKey for that source in the pool, could we skip creating a new document from it and just use the document in the pool?
Looking more closely at what we are doing on our side to cause this problem: we are reusing a Transformer and each time giving it a javax.xml.transform.Source that has the same system ID but a reader with different contents. So on our side we can clear the pool after every transformation.
More generally, in the method "net.sf.saxon.Controller.makeSourceTree(Source, int)", building the key as "new DocumentKey(source.getSystemId())" does not capture the fact that the source contents (reader, input stream, or parser) may differ from what is currently in the pool, so in my opinion it would make sense to remove the key from the pool, if it exists, before adding the new document.
Is there any good reason to reuse the Transformer rather than creating a new one? I usually recommend creating a new Transformer for every transformation.
I think we usually reuse the transformer because it has a pre-compiled stylesheet. Other projects, like the DITA Open Toolkit, may reuse the transformer because it has a document pool and there are lots of DITA topics, each with links to various targets which may be the same file. Once the DITA OT has loaded a target file using document(), it is useful that the target file is not loaded again when another processed topic refers to the same target.
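For the case where the only reason for reuse is the pre-compiled stylesheet, the following is a minimal sketch using standard JAXP only: compile the stylesheet once into a javax.xml.transform.Templates object and create a new Transformer from it for each run. The class name FreshTransformerPerRun, the inline stylesheet, and the system ID used below are assumptions for illustration. The sketch does not address the DITA OT case, where sharing loaded documents across runs is the point of reusing the transformer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class FreshTransformerPerRun {
    // Hypothetical stylesheet: emits the string value of the root element as text.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='/*'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Compiles the stylesheet once; the Templates object can be reused across runs. */
    static Templates compile() throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(XSLT)));
    }

    /** Runs one transformation with a new Transformer created from the shared Templates. */
    static String run(Templates templates, String xml, String systemId)
            throws TransformerException {
        Transformer transformer = templates.newTransformer();
        // Same system ID on every call, but the reader content differs per call.
        StreamSource source = new StreamSource(new StringReader(xml), systemId);
        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }
}
```

Calling run() twice with the same system ID and different XML content uses two independent Transformer instances, so no per-transformer state is carried from the first run into the second.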
One more problem related to this: if we enable this feature:
net.sf.saxon.lib.FeatureKeys.ASSERTIONS_CAN_SEE_COMMENTS
and publish a simple XML document with an XSLT stylesheet, I always get this error reported by Saxon:
net.sf.saxon.trans.XPathException: Cannot have two different documents with the same document-uri file:/D:/projects/eXml_Saxon11.3/test/EXM-24141/assert.xml
at net.sf.saxon.om.DocumentPool.add(DocumentPool.java:71)
at net.sf.saxon.Controller.registerDocument(Controller.java:972)
at net.sf.saxon.Controller.makeSourceTree(Controller.java:1327)
at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:345)
at net.sf.saxon.jaxp.TransformerImpl.transform(TransformerImpl.java:75)
because the XML document is first parsed and added to the pool in the assertion-related packages:
Add file:/D:/projects/.../assert.xml
java.lang.Exception
at net.sf.saxon.om.DocumentPool.add(DocumentPool.java:68)
at net.sf.saxon.sxpath.XPathDynamicContext.setContextItem(XPathDynamicContext.java:79)
at net.sf.saxon.sxpath.XPathExpression.createDynamicContext(XPathExpression.java:145)
at com.saxonica.ee.schema.Assertion.testComplex(Assertion.java:252)
at com.saxonica.ee.validate.ValidationStack.testAssertion(ValidationStack.java:595)
at com.saxonica.ee.validate.ValidationStack.testAssertions(ValidationStack.java:589)
at com.saxonica.ee.validate.ValidationStack.endElement(ValidationStack.java:532)
at net.sf.saxon.event.ProxyReceiver.endElement(ProxyReceiver.java:149)
at net.sf.saxon.event.ProxyReceiver.endElement(ProxyReceiver.java:149)
at com.saxonica.ee.validate.AttributeInheritor.endElement(AttributeInheritor.java:63)
at net.sf.saxon.event.PathMaintainer.endElement(PathMaintainer.java:59)
at net.sf.saxon.event.DocumentValidator.endElement(DocumentValidator.java:79)
at net.sf.saxon.event.ReceivingContentHandler.endElement(ReceivingContentHandler.java:609)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
- Assignee set to Michael Kay
- Priority changed from Low to Normal
- Status changed from New to Closed
I haven't been able to reproduce this on 12.x so I'm assuming it is fixed, and I think it's best now to close the issue. Please raise a new issue if there is still a problem.