Bug #4729
TransformerFactory doesn't accept ACCESS_EXTERNAL_STYLESHEET property
Status: closed
% Done: 100%
Description
In the codebase I have existing code like this (boiled down to the essence):
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
Transformer transformer = transformerFactory.newTransformer();
//set transformer properties...
StringReader stringReader = new StringReader(someXml);
transformer.transform(new StreamSource(stringReader), someOutput);
When I include Saxon as a library to use it at some other place in my code, the above code breaks. It raises a java.lang.IllegalArgumentException: Unrecognized configuration feature: http://javax.xml.XMLConstants/property/accessExternalDTD
I saw there was a discussion in https://saxonica.plan.io/issues/4234 about this, but it seems not to have been solved. I tried Saxon-HE 10.2 and 9.9.1-7; both throw the exception.
If I remove ACCESS_EXTERNAL_DTD/ACCESS_EXTERNAL_STYLESHEET it works, but then the code really does try to access an external DTD!
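One way to keep such code from failing with an uncaught exception is to probe whether the factory accepts each attribute and then decide explicitly what to do. The following is only a sketch under assumptions: the class name TolerantFactoryConfig and the helper trySetAttribute are hypothetical, and failing closed is just one possible policy.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

public class TolerantFactoryConfig {

    /**
     * Hypothetical helper: tries to set a JAXP attribute and reports whether
     * the factory accepted it. JAXP specifies IllegalArgumentException for
     * attributes the implementation does not recognize.
     */
    static boolean trySetAttribute(TransformerFactory factory, String name, Object value) {
        try {
            factory.setAttribute(name, value);
            return true;
        } catch (IllegalArgumentException e) {
            // The factory found on the classpath does not know this attribute.
            return false;
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = TransformerFactory.newInstance();
        boolean dtdRestricted = trySetAttribute(factory, XMLConstants.ACCESS_EXTERNAL_DTD, "");
        boolean xslRestricted = trySetAttribute(factory, XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        if (!dtdRestricted || !xslRestricted) {
            // Fail closed rather than run with external access silently left open.
            throw new IllegalStateException(
                "Factory " + factory.getClass().getName() + " rejected a security attribute");
        }
        System.out.println("Hardened " + factory.getClass().getName());
    }
}
```

Swallowing the exception without checking the return value would reopen the external-DTD problem described above, so the result should always be inspected.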
I think you should support the above properties as you wrote in https://saxonica.plan.io/issues/4234: "However, when input is provided in the form of a StreamSource, and we instantiate the XMLReader ourselves from within Saxon, then we should arguably take account of these properties supplied to the TransformerFactory by setting corresponding properties on the XMLReader that we instantiate. I will look at making that change."
I think this is especially important because Saxon is "registered" as the factory returned by TransformerFactory.newInstance() as soon as it is on the classpath. Of course, in my own code I can change this and use something like TransformerFactory.newDefaultInstance, but imagine a third-party library that obtains a transformer via newInstance... Maybe it would also be an option to provide a Saxon dependency that includes the Saxon transformers but does not "register" itself as the standard transformer. I don't know whether this is easy to achieve, as I do not know the details of this mechanism. But if it were possible, it would be easier to use the Saxon transformer just where I want it and rely on the default transformers elsewhere.
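For code under the application's own control, the newDefaultInstance route mentioned above might look like the sketch below. It assumes Java 9 or later (where TransformerFactory.newDefaultInstance() exists); the class and method names are hypothetical.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

public class DefaultFactoryExample {

    /** Builds a hardened factory from the JDK's built-in implementation (Java 9+). */
    static TransformerFactory newHardenedDefaultFactory() {
        // newDefaultInstance() bypasses the JAXP lookup, so a Saxon JAR on the
        // classpath does not change which implementation is returned here.
        TransformerFactory factory = TransformerFactory.newDefaultInstance();
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("Secure processing not supported", e);
        }
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }

    public static void main(String[] args) {
        System.out.println(newHardenedDefaultFactory().getClass().getName());
    }
}
```

This does not help with third-party libraries that call newInstance() themselves, which is the harder case raised in this report.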
Updated by Michael Kay about 4 years ago
Two questions:
(a) Are you using the XML parser built in to the JDK, or some other parser (such as Apache Xerces)
(b) If the JDK parser, which JDK version?
This is all about persuading Saxon to configure the XML parser on your behalf, which is tricky because different parsers don't all behave in the same way.
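A quick way to answer both questions is to print which implementations JAXP actually resolves at run time. This is a small diagnostic sketch; the class name JaxpDiagnostics is hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

public class JaxpDiagnostics {

    /** Reports the JDK version and the JAXP implementations found via the standard lookup. */
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
            + "\nSAXParserFactory=" + SAXParserFactory.newInstance().getClass().getName()
            + "\nTransformerFactory=" + TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without the Saxon JAR on the classpath shows whether the parser, the transformer factory, or both change.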
Updated by Michael Kay about 4 years ago
The other point I would make here is that we will never guarantee 100% compatibility of Saxon's JAXP TransformerFactory implementation with the one in the JDK. There are a number of reasons:
(a) the JAXP interfaces are under-specified
(b) the JAXP interface specifications have changed over the years (not the formal interfaces, but details such as requiring particular configuration properties to be recognized) and we have no input to these changes
(c) there's no interoperability test suite for JAXP
(d) the JDK implementation is XSLT 1.0 with vendor extensions that are permitted but not required for XSLT conformance
Therefore it's unwise to put an application into production that relies solely on JAXP interfaces, and expect it to work whatever TransformerFactory it finds lying around on the classpath. You need to test your application against every JAXP TransformerFactory that you intend to support.
Updated by Matthias H about 4 years ago
I use the built in parsers of JDK11. Without the Saxon library TransformerFactory.newInstance returns com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl (which is the same as newDefaultInstance).
If Saxon cannot guarantee the compatibility, wouldn't it be even more important to have an easy way to opt out of Saxon being used everywhere newInstance is used?
If you could provide 2 Saxon dependencies, a user could choose:
- I want Saxon and use it everywhere (via newInstance) or
- I want Saxon but I define explicitly where I want to use it (for example via BasicTransformerFactory)
Updated by Michael Kay about 4 years ago
"If Saxon cannot guarantee the compatibility, wouldn't it be even more important to have an easy way to opt out of Saxon being used everywhere newInstance is used?"
It would be great if we could do that but we can't: the JAXP instantiation mechanism isn't under our control. The only way we can opt out is by not including the relevant entry in the JAR manifest, which would stop the mechanism working for everybody. That's what we chose to do with XPath - we no longer register as an XPath service provider, and you only get Saxon if you explicitly request it. But we decided it would be too disruptive to do the same on the XSLT side.
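On the application side, the standard JAXP lookup consults the system property javax.xml.transform.TransformerFactory before it looks at service providers on the classpath, so a deployment can pin the implementation globally. A minimal sketch, assuming the JDK class name reported for JDK 11 earlier in this thread; the class PinTransformerFactory is hypothetical.

```java
import javax.xml.transform.TransformerFactory;

public class PinTransformerFactory {

    // JDK built-in implementation, as reported for JDK 11 earlier in this thread.
    static final String JDK_FACTORY =
        "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    /** Pins the JAXP lookup for the whole JVM, then uses the normal newInstance() path. */
    static TransformerFactory newPinnedFactory() {
        // The system property takes precedence over service providers on the classpath.
        System.setProperty("javax.xml.transform.TransformerFactory", JDK_FACTORY);
        return TransformerFactory.newInstance();
    }

    public static void main(String[] args) {
        System.out.println(newPinnedFactory().getClass().getName());
    }
}
```

The property is JVM-wide, so it also affects third-party libraries that call newInstance(). The same setting can be passed on the command line with -Djavax.xml.transform.TransformerFactory=... instead of being set in code.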
Updated by Michael Kay about 4 years ago
Bug #4234 was concerned with the JAXP validation interface, not with the transformation interface. The fix for that bug caused the three properties XMLConstants.FEATURE_SECURE_PROCESSING, XMLConstants.ACCESS_EXTERNAL_DTD and XMLConstants.ACCESS_EXTERNAL_SCHEMA to be recognised on the validation API, but it made no changes to the transformation API, and did not cause XMLConstants.ACCESS_EXTERNAL_STYLESHEET to be recognised.
I propose to fix this so that the transformation interface conforms with JAXP 1.5. Note however that this will not fix the underlying problem, which is that a transformation that requires Xalan should explicitly load Xalan, rather than using the JAXP loading mechanism indiscriminately. There may well be other aspects of your transformation that make it dependent on Xalan (or on XSLT 1.0).
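Explicit loading can be done per call site with the two-argument TransformerFactory.newInstance(String, ClassLoader). The sketch below is illustrative only: the class ExplicitFactories and its load helper are hypothetical, and the JDK class name is the one reported for JDK 11 in this thread.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

public class ExplicitFactories {

    // JDK built-in XSLT 1.0 processor (class name as reported for JDK 11 in this thread).
    static final String JDK_XSLTC =
        "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";
    // Saxon's JAXP factory class.
    static final String SAXON = "net.sf.saxon.TransformerFactoryImpl";

    /** Loads exactly the named factory; no classpath discovery is involved. */
    static TransformerFactory load(String className) {
        // A null class loader means the thread context class loader is used.
        return TransformerFactory.newInstance(className, null);
    }

    public static void main(String[] args) {
        System.out.println("XSLT 1.0 code uses: " + load(JDK_XSLTC).getClass().getName());
        try {
            System.out.println("Saxon code uses: " + load(SAXON).getClass().getName());
        } catch (TransformerFactoryConfigurationError e) {
            System.out.println("Saxon is not on the classpath");
        }
    }
}
```

With this pattern each transformation names the processor it was written and tested against, so adding or removing a JAR does not silently switch implementations.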
Updated by Michael Kay about 4 years ago
Also note, the fix for #4234 didn't actually result in the ACCESS_EXTERNAL_SCHEMA property being recognized on the schema validation interface, because of difficulties interpreting exactly what the JAXP specification was supposed to mean.
Updated by Michael Kay about 4 years ago
Discussed at team meeting. We decided that it might make sense to apply these properties before calling the relevant resolver, rather than leaving the resolver to make the decision. This would generalise more easily to different kinds of resource and resolver, especially resources not relevant to XSLT 1.0 processors, such as unparsed text and JSON resources.
Updated by Michael Kay about 4 years ago
I'm now implementing this.
The JAXP definition is hopelessly underspecified, even allowing for the fact that it assumes XSLT 1.0. For example, it doesn't say what happens if you call setAttribute() with one of these properties more than once: are they supposed to be cumulative? I'm assuming the last one wins. I'll attempt to respect the intent, rather than the detail.
ACCESS_EXTERNAL_DTD will be passed straight to the XML parser - which means it's ignored if the user supplies the parser, and is only effective if Saxon instantiates the parser itself.
ACCESS_EXTERNAL_STYLESHEET will map to the Saxon Configuration property Feature.ALLOWED_PROTOCOLS, which will be changed to affect all resources fetched directly by Saxon (schemas, source documents, stylesheet modules, queries, JSON documents, etc.), and to kick in before calling any URI resolver. On the Validation APIs, ACCESS_EXTERNAL_SCHEMA will set the same Saxon Configuration property.
Updated by Michael Kay about 4 years ago
I decided to roll back on this, and go for a minimum change that fixes the bug.
In Saxon 10 we introduced a configuration property Feature.ALLOWED_PROTOCOLS which has the format of the JAXP constants such as ACCESS_EXTERNAL_STYLESHEET. If ACCESS_EXTERNAL_STYLESHEET is supplied on the TransformerFactory, I shall simply set Feature.ALLOWED_PROTOCOLS on the Configuration. This property is currently used by the "standard" resolvers (such as the standard URI resolvers), and can be overridden or ignored if a custom resolver is used.
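For readers unfamiliar with the value format: the JAXP access properties take a comma-separated list of protocols (URI schemes), with "all" permitting everything and the empty string permitting nothing. The following sketch shows how a resolver might apply such a value. It is not Saxon's implementation; the class and method names are hypothetical, and the special "jar:scheme" form described in the JAXP documentation is deliberately left out.

```java
import java.net.URI;

public class AllowedProtocolsSketch {

    /**
     * Returns true if the URI's scheme appears in a JAXP-style protocol list.
     * "all" allows every scheme; an empty list allows none.
     */
    static boolean isAllowed(String allowedProtocols, URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null) {
            // Relative URI: the caller is expected to resolve it against a base first.
            return false;
        }
        String list = allowedProtocols.trim();
        if (list.equalsIgnoreCase("all")) {
            return true;
        }
        for (String protocol : list.split(",")) {
            if (protocol.trim().equalsIgnoreCase(scheme)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        URI dtd = URI.create("http://example.com/doc.dtd");
        System.out.println(isAllowed("file, http", dtd));
        System.out.println(isAllowed("", dtd));
    }
}
```

A custom resolver that skips a check of this kind would not honour the restriction, which is the caveat noted above.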
The ACCESS_EXTERNAL_DTD property will be passed through to the XML parser in cases where Saxon instantiates the XML parser. It won't affect any user-supplied parser (e.g. an XMLReader in a SAXSource).
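When the application supplies its own XMLReader, it therefore has to restrict the parser itself. A minimal sketch, assuming the JDK's built-in SAX parser (which accepts ACCESS_EXTERNAL_DTD via SAXParser.setProperty as of JAXP 1.5); the class and method names are hypothetical, and other parsers such as Apache Xerces may reject the property.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class HardenedSaxSource {

    /** Wraps the XML text in a SAXSource whose reader refuses external DTDs. */
    static SAXSource hardenedSource(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        SAXParser parser = spf.newSAXParser();
        // Empty string: no protocol is permitted for fetching external DTDs.
        parser.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        XMLReader reader = parser.getXMLReader();
        return new SAXSource(reader, new InputSource(new StringReader(xml)));
    }

    /** Runs an identity transformation over the hardened source. */
    static String identityTransform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(hardenedSource(xml), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(identityTransform("<a/>"));
    }
}
```

Because the restriction lives on the reader, it applies regardless of which TransformerFactory ends up being used.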
Updated by Michael Kay about 4 years ago
- Subject changed from Including Saxon breaks existing code (IllegalArgumentException) to TransformerFactory doesn't accept ACCESS_EXTERNAL_STYLESHEET property
- Category set to JAXP Java API
- Status changed from New to Resolved
- Assignee set to Michael Kay
- Applies to branch 10, trunk added
- Fix Committed on Branch 10, trunk added
Saxon's TransformerFactory now accepts the ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_STYLESHEET properties on the getAttribute() and setAttribute() methods. The first is passed to any Saxon-instantiated SAX parser; the second is implemented by setting the Configuration property Feature.ALLOWED_PROTOCOLS.
A new set of JUnit tests has been written as jaxptests/AllowedProtocolsTest.java
Updated by O'Neil Delpratt about 4 years ago
Bug fix applied in the Saxon 10.3 maintenance release
Updated by O'Neil Delpratt about 4 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 10.3 added
Updated by Michael Kay almost 2 years ago
- Has duplicate Bug #5844: Issue when using Jakarta JAXB Implementation added