Bug #5615 (closed)
Saxon 11 - Cannot have two different documents with the same document-uri
Description
I'm getting this unhandled error, which probably comes from reusing a transformer; the code we use is quite complex in those parts:
7065 ERROR [ main ] ro.sync.quickfix.QuickFixExecutor - Cannot generate late quick fixes: net.sf.saxon.trans.XPathException: Cannot have two different documents with the same document-uri file:/D:/projects/../SchematronQF/add/element/selectAttrValue/topic.dita
net.sf.saxon.trans.XPathException: Cannot have two different documents with the same document-uri file:/D:/projects/../SchematronQF/add/element/selectAttrValue/topic.dita
at net.sf.saxon.om.DocumentPool.add(DocumentPool.java:69)
at net.sf.saxon.Controller.registerDocument(Controller.java:1004)
at net.sf.saxon.Controller.makeSourceTree(Controller.java:1359)
at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:343)
at net.sf.saxon.jaxp.TransformerImpl.transform(TransformerImpl.java:75)
Would it be a good idea for Controller.registerDocument to check whether the document is already in the pool before adding it?
    if (getDocumentPool().find(uri) == null) {
        sourceDocumentPool.add(doc, uri);
    }
Updated by Michael Kay over 2 years ago
Controller.registerDocument() is called from two places: from DocumentFn (implementing doc() and document()) and from Controller.makeSourceTree(), which handles the "primary input" to a transformation (to the extent that's still a meaningful concept). The main reason for putting the primary input in the document pool is so that it isn't parsed again if the transformation then uses doc() to access it. Yes, I guess it wouldn't do any harm, if we find the document URI is already in use, to just skip that: it just means anyone who tries to use document-uri() on the document is going to wonder why it hasn't got one.
Updated by Radu Coravu over 2 years ago
Ok, I do not understand this part of what you are saying:
it just means anyone who tries to use document-uri() on the document is going to wonder why it hasn't got one.
So is there a side effect to checking whether the document is already in the pool when Controller.registerDocument is called?
Updated by Michael Kay over 2 years ago
document-uri() gives a result only for a document that's in the pool. That's because returning a document-uri() UUU for a document provides a guarantee that doc('UUU') will return that document; and that's why two documents aren't allowed to have the same document URI.
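The invariant described above can be illustrated with a toy model. This is only a sketch: ToyDocumentPool and its methods are hypothetical names and are not Saxon's DocumentPool implementation. It assumes that re-adding the same document object is harmless, that a different object under an existing URI is rejected, and that a document only reports a URI if it is the one the pool would return for that URI.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only (hypothetical class, NOT Saxon's DocumentPool):
// it illustrates the "one URI, one document" invariant from the comment above.
final class ToyDocumentPool {
    private final Map<String, Object> byUri = new HashMap<>();

    // Registers doc under uri. Re-adding the same object is a no-op (an assumption
    // of this sketch); a different object under an existing URI is rejected.
    void add(Object doc, String uri) {
        Object existing = byUri.get(uri);
        if (existing == null) {
            byUri.put(uri, doc);
        } else if (existing != doc) {
            throw new IllegalStateException(
                "Cannot have two different documents with the same document-uri " + uri);
        }
    }

    // Models doc(uri): returns the registered document, or null if there is none.
    Object find(String uri) {
        return byUri.get(uri);
    }

    // Models document-uri(doc): a URI is reported only when doc is the pooled
    // document for that URI, so find(documentUri(doc)) always returns doc.
    String documentUri(Object doc) {
        for (Map.Entry<String, Object> entry : byUri.entrySet()) {
            if (entry.getValue() == doc) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

In this model, a document that was skipped because its URI was already taken simply has no document-uri, which is the side effect mentioned above.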
Updated by Radu Coravu over 2 years ago
So if the XSLT uses document-uri on a node from the current XML source, it would not return anything? Might be a problem. How about automatically discarding a document that is already in the pool when Controller.registerDocument is called?
    TreeInfo cachedDoc = getDocumentPool().find(uri);
    if (cachedDoc != null) {
        getDocumentPool().discard(cachedDoc);
    }
    sourceDocumentPool.add(doc, uri);
Updated by Michael Kay over 2 years ago
We can't do anything that would make the data mutable - the result of document-uri() applied to a node can't change over time.
The problems with this started with fn:transform(), which muddies the scope rules for things like the immutability of the result of doc().
Updated by Radu Coravu over 2 years ago
I'm not seeing the entire picture here. You can choose to do what's best, maybe even not fix this issue if it looks like something few people would run into, or something we could fix on our side by discarding all documents in the pool when reusing the transformer. How about this: in the method "net.sf.saxon.Controller.makeSourceTree(Source, int)", if there is already a DocumentKey for that source in the pool, we no longer create a new document from it and just use the document in the pool?
Updated by Radu Coravu over 2 years ago
Looking more closely at what we are doing on our side to cause this problem: we are reusing a Transformer and each time giving it a javax.xml.transform.Source that has the same system ID but a reader with different contents. So on our side we can clear the pool after every transformation. In general, in the method "net.sf.saxon.Controller.makeSourceTree(Source, int)", a key created with "new DocumentKey(source.getSystemId())" does not capture the fact that the source contents (reader, input stream, or parser) may differ from what is currently in the pool, so in my opinion it would make sense to remove the key from the pool, if it exists, before adding it.
Updated by Michael Kay over 2 years ago
Is there any good reason to reuse the Transformer rather than creating a new one? I usually recommend creating a new Transformer for every transformation.
Updated by Radu Coravu over 2 years ago
I think we usually reuse the transformer because it has a pre-compiled stylesheet. Other projects, like the DITA Open Toolkit, may reuse the transformer because it has a document pool: there are lots of DITA topics, each with links to various targets which may be the same file. Once the DITA OT loads a target file using document(), it is useful that, for another processed topic which refers to the same target, the target file is not loaded again.
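For the pre-compiled stylesheet case, standard JAXP already separates the compiled stylesheet (javax.xml.transform.Templates) from the per-run Transformer, so the compilation cost can be paid once while each transformation gets a fresh Transformer. The following is a minimal sketch of that pattern using only JAXP interfaces; the class name, the identity stylesheet, and the system ID are hypothetical, and it has not been verified against the reporter's setup. The trade-off raised above still applies: as the thread notes, documents loaded with document() during one run are not shared with the next when the Transformer is not reused.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: compile the stylesheet once, create a new Transformer per run.
final class FreshTransformerPerRun {
    // Identity stylesheet, used only for illustration.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiles the stylesheet once; the Templates object can be shared between runs.
    static Templates compile() {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSLT)));
        } catch (TransformerException e) {
            throw new IllegalStateException("Stylesheet compilation failed", e);
        }
    }

    // Runs one transformation with a fresh Transformer. The same systemId may be
    // passed on every call even though the reader content differs each time.
    static String run(Templates templates, String xml, String systemId) {
        try {
            Transformer transformer = templates.newTransformer();
            StreamSource source = new StreamSource(new StringReader(xml), systemId);
            StringWriter out = new StringWriter();
            transformer.transform(source, new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

A caller would keep the result of compile() for the lifetime of the application and call run() once per input, for example twice with the same system ID and different reader contents.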
Updated by Radu Coravu over 2 years ago
- File asserts.zip added
One more problem related to this: if we enable the feature net.sf.saxon.lib.FeatureKeys.ASSERTIONS_CAN_SEE_COMMENTS and publish a simple XML document with an XSLT stylesheet, I always get this error reported by Saxon:
net.sf.saxon.trans.XPathException: Cannot have two different documents with the same document-uri file:/D:/projects/eXml_Saxon11.3/test/EXM-24141/assert.xml
at net.sf.saxon.om.DocumentPool.add(DocumentPool.java:71)
at net.sf.saxon.Controller.registerDocument(Controller.java:972)
at net.sf.saxon.Controller.makeSourceTree(Controller.java:1327)
at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:345)
at net.sf.saxon.jaxp.TransformerImpl.transform(TransformerImpl.java:75)
This happens because the XML document is first parsed and added to the pool in the assertion-related packages:
Add file:/D:/projects/.../assert.xml
java.lang.Exception
at net.sf.saxon.om.DocumentPool.add(DocumentPool.java:68)
at net.sf.saxon.sxpath.XPathDynamicContext.setContextItem(XPathDynamicContext.java:79)
at net.sf.saxon.sxpath.XPathExpression.createDynamicContext(XPathExpression.java:145)
at com.saxonica.ee.schema.Assertion.testComplex(Assertion.java:252)
at com.saxonica.ee.validate.ValidationStack.testAssertion(ValidationStack.java:595)
at com.saxonica.ee.validate.ValidationStack.testAssertions(ValidationStack.java:589)
at com.saxonica.ee.validate.ValidationStack.endElement(ValidationStack.java:532)
at net.sf.saxon.event.ProxyReceiver.endElement(ProxyReceiver.java:149)
at net.sf.saxon.event.ProxyReceiver.endElement(ProxyReceiver.java:149)
at com.saxonica.ee.validate.AttributeInheritor.endElement(AttributeInheritor.java:63)
at net.sf.saxon.event.PathMaintainer.endElement(PathMaintainer.java:59)
at net.sf.saxon.event.DocumentValidator.endElement(DocumentValidator.java:79)
at net.sf.saxon.event.ReceivingContentHandler.endElement(ReceivingContentHandler.java:609)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
Updated by Michael Kay almost 2 years ago
- Assignee set to Michael Kay
- Priority changed from Low to Normal
Updated by Michael Kay 6 months ago
- Status changed from New to Closed
I haven't been able to reproduce this on 12.x so I'm assuming it is fixed, and I think it's best now to close the issue. Please raise a new issue if there is still a problem.