Support #5980
open
Should setting parser property affect XsltCompiler?
Description
Using Saxon 12.1 HE, the code
Processor processor = new Processor(false);
Configuration configuration = processor.getUnderlyingConfiguration();
configuration.setParseOptions(configuration.getParseOptions().withParserProperty("http://www.oracle.com/xml/jaxp/properties/" + "entityExpansionLimit", "128000"));
XsltCompiler xsltCompiler = processor.newXsltCompiler();
XsltExecutable xsltExecutable = xsltCompiler.compile(new File("C:\\Users\\marti\\Downloads\\entity-expansion-size.xsl"));
System.out.println(xsltExecutable);
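with the XSLT from https://saxonica.plan.io/attachments/64064 throws an exception, which indicates that the parser property I set is not taken into account by the parser that XsltCompiler uses. The JDK message, shown here in German, says that the parser found more than 64000 entity expansions in the document and that this is the limit imposed by the JDK: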
Error on line 1 column 1
SXXP0003 Error reported by XML parser: JAXP00010001: Der Parser hat mehr als 64000
Entityerweiterungen in diesem Dokument gefunden. Dies ist der von JDK vorgeschriebene
Grenzwert.: JAXP00010001: Der Parser hat mehr als 64000 Entityerweiterungen in diesem
Dokument gefunden. Dies ist der von JDK vorgeschriebene Grenzwert.
Exception in thread "main" net.sf.saxon.s9api.SaxonApiException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001: Der Parser hat mehr als 64000 Entityerweiterungen in diesem Dokument gefunden. Dies ist der von JDK vorgeschriebene Grenzwert.
at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:974)
at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:996)
at org.example.Main2.main(Main2.java:17)
Caused by: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001: Der Parser hat mehr als 64000 Entityerweiterungen in diesem Dokument gefunden. Dies ist der von JDK vorgeschriebene Grenzwert.
at net.sf.saxon.resource.ActiveSAXSource.deliver(ActiveSAXSource.java:224)
at net.sf.saxon.resource.ActiveStreamSource.deliver(ActiveStreamSource.java:65)
at net.sf.saxon.event.Sender.send(Sender.java:104)
at net.sf.saxon.style.StylesheetModule.sendStylesheetSource(StylesheetModule.java:157)
at net.sf.saxon.style.StylesheetModule.loadStylesheet(StylesheetModule.java:231)
at net.sf.saxon.style.Compilation.compileSingletonPackage(Compilation.java:113)
at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:969)
... 2 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001: Der Parser hat mehr als 64000 Entityerweiterungen in diesem Dokument gefunden. Dies ist der von JDK vorgeschriebene Grenzwert.
at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:178)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:284)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1408)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1332)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(XMLDocumentFragmentScannerImpl.java:1844)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2985)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:534)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1216)
at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
at net.sf.saxon.resource.ActiveSAXSource.deliver(ActiveSAXSource.java:190)
... 8 more
When I change the code to use a DocumentBuilder, e.g.
Processor processor = new Processor(false);
Configuration configuration = processor.getUnderlyingConfiguration();
configuration.setParseOptions(configuration.getParseOptions().withParserProperty("http://www.oracle.com/xml/jaxp/properties/" + "entityExpansionLimit", "128000"));
DocumentBuilder docBuilder = processor.newDocumentBuilder();
XdmNode doc = docBuilder.build(new File("C:\\Users\\marti\\Downloads\\entity-expansion-size.xsl"));
XsltCompiler xsltCompiler = processor.newXsltCompiler();
XsltExecutable xsltExecutable = xsltCompiler.compile(doc.asSource());
System.out.println(xsltExecutable);
the code runs, and the file is parsed and compiled without error, meaning the parser property is taken into account.
Shouldn't that also happen for the direct use of XsltCompiler.compile(file)?
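For context on what the property itself does, independent of Saxon, here is a minimal sketch using only the JDK. It assumes the JDK's built-in JAXP parser is the one picked up by SAXParserFactory; the class and method names are hypothetical, and the small limits (10 and 100) are chosen only to keep the test document tiny.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EntityLimitDemo {
    // Same property URI as in the Saxon code above.
    static final String LIMIT_PROPERTY =
            "http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit";

    // Builds a document that references the internal entity "e" `count` times.
    public static String xmlWithEntityReferences(int count) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE r [<!ENTITY e \"x\">]><r>");
        for (int i = 0; i < count; i++) {
            sb.append("&e;");
        }
        return sb.append("</r>").toString();
    }

    // Returns true if the document parses with the given limit,
    // false if the parser rejects it with a SAXException.
    public static boolean parsesWithLimit(String xml, String limit) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.setProperty(LIMIT_PROPERTY, limit);
        try {
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = xmlWithEntityReferences(20);
        System.out.println("limit 10:  " + parsesWithLimit(xml, "10"));
        System.out.println("limit 100: " + parsesWithLimit(xml, "100"));
    }
}
```

The property only has an effect if it reaches the parser instance that actually reads the document, which is the point of the question above.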
Updated by Michael Kay over 1 year ago
The standard parse options set at configuration level are not used when parsing stylesheets because we want to have more control: for example we force schema and DTD validation to be off and line numbering to be on.
If you use the API XsltCompiler.compile(Source) then you can supply an AugmentedSource which contains parse options, and these will be used largely unchanged, except that some options will be force-set; see StylesheetModule.makeStylesheetParseOptions().
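A sketch of what that suggestion might look like, reusing the property from the original report. This is untested and rests on two assumptions: that Saxon 12 exposes a public AugmentedSource(Source, ParseOptions) constructor, and that the stylesheet path is passed as the first command-line argument. The class name is hypothetical.

```java
import java.io.File;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.lib.AugmentedSource;
import net.sf.saxon.lib.ParseOptions;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.XsltCompiler;
import net.sf.saxon.s9api.XsltExecutable;

public class CompileWithParseOptions {
    public static void main(String[] args) throws SaxonApiException {
        Processor processor = new Processor(false);

        // Derive the options from the configuration defaults, as in the original report.
        ParseOptions options = processor.getUnderlyingConfiguration()
                .getParseOptions()
                .withParserProperty(
                        "http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit",
                        "128000");

        // Wrap the stylesheet source so the parse options travel with it
        // (assumed constructor; see the note above).
        StreamSource stylesheet = new StreamSource(new File(args[0]));
        AugmentedSource source = new AugmentedSource(stylesheet, options);

        XsltCompiler xsltCompiler = processor.newXsltCompiler();
        XsltExecutable xsltExecutable = xsltCompiler.compile(source);
        System.out.println(xsltExecutable);
    }
}
```

Per the comment above, some options are force-set when the stylesheet is parsed, so not every setting carried by the AugmentedSource is guaranteed to survive.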
Updated by Michael Kay over 1 year ago
This came up again in #6152. The documentation is very poor at explaining exactly what is affected when you set parser options at configuration level. The problem is that there are about 100 places that do use the Configuration's parser options, and plenty of others that don't, which makes it very difficult to make any kind of definitive statement; it also means that any changes we make to try and rationalise things are almost certain to break something.