Support #5980

open

Should setting parser property affect XsltCompiler?

Added by Martin Honnen 10 months ago. Updated 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: s9api API
Sprint/Milestone: -
Start date: 2023-04-19
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch: 12
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

Using Saxon 12.1 HE, the code

        Processor processor = new Processor(false);

        Configuration configuration = processor.getUnderlyingConfiguration();
        configuration.setParseOptions(configuration.getParseOptions().withParserProperty("http://www.oracle.com/xml/jaxp/properties/" + "entityExpansionLimit", "128000"));

        XsltCompiler xsltCompiler = processor.newXsltCompiler();

        XsltExecutable xsltExecutable = xsltCompiler.compile(new File("C:\\Users\\marti\\Downloads\\entity-expansion-size.xsl"));

        System.out.println(xsltExecutable);

with the XSLT from https://saxonica.plan.io/attachments/64064 gives an exception indicating that the parser property I have set is not taken into account by the parser XsltCompiler uses:

Error on line 1 column 1 
  SXXP0003   Error reported by XML parser: JAXP00010001: The parser has encountered more than 64000
  entity expansions in this document. This is the limit imposed by the JDK.: JAXP00010001:
  The parser has encountered more than 64000 entity expansions in this document. This is
  the limit imposed by the JDK.
Exception in thread "main" net.sf.saxon.s9api.SaxonApiException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001: The parser has encountered more than 64000 entity expansions in this document. This is the limit imposed by the JDK.
	at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:974)
	at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:996)
	at org.example.Main2.main(Main2.java:17)
Caused by: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001: The parser has encountered more than 64000 entity expansions in this document. This is the limit imposed by the JDK.
	at net.sf.saxon.resource.ActiveSAXSource.deliver(ActiveSAXSource.java:224)
	at net.sf.saxon.resource.ActiveStreamSource.deliver(ActiveStreamSource.java:65)
	at net.sf.saxon.event.Sender.send(Sender.java:104)
	at net.sf.saxon.style.StylesheetModule.sendStylesheetSource(StylesheetModule.java:157)
	at net.sf.saxon.style.StylesheetModule.loadStylesheet(StylesheetModule.java:231)
	at net.sf.saxon.style.Compilation.compileSingletonPackage(Compilation.java:113)
	at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:969)
	... 2 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001: The parser has encountered more than 64000 entity expansions in this document. This is the limit imposed by the JDK.
	at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
	at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:178)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:284)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1408)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1332)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(XMLDocumentFragmentScannerImpl.java:1844)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2985)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:534)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1216)
	at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
	at net.sf.saxon.resource.ActiveSAXSource.deliver(ActiveSAXSource.java:190)
	... 8 more

When I change the code to use a DocumentBuilder e.g.

        Processor processor = new Processor(false);

        Configuration configuration = processor.getUnderlyingConfiguration();
        configuration.setParseOptions(configuration.getParseOptions().withParserProperty("http://www.oracle.com/xml/jaxp/properties/" + "entityExpansionLimit", "128000"));

        DocumentBuilder docBuilder = processor.newDocumentBuilder();

        XdmNode doc = docBuilder.build(new File("C:\\Users\\marti\\Downloads\\entity-expansion-size.xsl"));
        
        XsltCompiler xsltCompiler = processor.newXsltCompiler();

        XsltExecutable xsltExecutable = xsltCompiler.compile(doc.asSource());

        System.out.println(xsltExecutable);

the code runs without error: the file is parsed and compiled, which means the parser property is taken into account.

Shouldn't that also happen for the direct use of XsltCompiler.compile(file)?
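
For reference, the limit named in the error message is a setting of the JDK's built-in parser, and its effect can be seen without Saxon. The following is a minimal sketch, assuming the JDK's default SAX parser (obtained via SAXParserFactory.newDefaultInstance(), Java 9 or later); the class name, method names and the tiny test document are made up for illustration. It sets the same property explicitly in both runs, so it does not depend on the JDK's default value.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class EntityLimitDemo {
    // Same property URI as in the issue description.
    static final String LIMIT_PROPERTY =
            "http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit";

    /** Builds a small document that references the entity "e" the given number of times. */
    static String documentWithExpansions(int count) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE r [<!ENTITY e \"x\">]><r>");
        for (int i = 0; i < count; i++) {
            sb.append("&e;");
        }
        return sb.append("</r>").toString();
    }

    /** Returns true if the document parses with the given entity expansion limit. */
    static boolean parses(String xml, int limit) {
        try {
            // Assumption: the JDK's built-in parser; other parsers may not recognise the property.
            SAXParser parser = SAXParserFactory.newDefaultInstance().newSAXParser();
            parser.setProperty(LIMIT_PROPERTY, String.valueOf(limit));
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            // Fatal parse error; for this well-formed input that means a limit was exceeded.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = documentWithExpansions(100);
        System.out.println("limit 10:   " + parses(xml, 10));
        System.out.println("limit 1000: " + parses(xml, 1000));
    }
}
```

The point of the sketch is only that the property has to reach the parser instance that actually reads the file, which is what the two Saxon snippets above differ in.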

Actions #1

Updated by Michael Kay 10 months ago

The standard parse options set at configuration level are not used when parsing stylesheets because we want to have more control: for example we force schema and DTD validation to be off and line numbering to be on.

If you use the API XsltCompiler.compile(Source) then you can supply an AugmentedSource which contains parse options, and these will be used largely unchanged, except that some options will be force-set; see StylesheetModule.makeStylesheetParseOptions().
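
A minimal sketch of that approach, assuming Saxon 12 on the classpath and the AugmentedSource(Source, ParseOptions) constructor; the class and method names are hypothetical, and this has not been checked against the stylesheet attached to the issue. It builds parse options carrying the parser property and wraps the stylesheet source in an AugmentedSource before compiling.

```java
import java.io.File;
import javax.xml.transform.stream.StreamSource;

import net.sf.saxon.lib.AugmentedSource;
import net.sf.saxon.lib.ParseOptions;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.XsltCompiler;
import net.sf.saxon.s9api.XsltExecutable;

public class CompileWithParseOptions {
    // Same property URI as in the issue description.
    static final String LIMIT_PROPERTY =
            "http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit";

    /** Compiles a stylesheet, passing the parser property via an AugmentedSource. */
    static XsltExecutable compile(Processor processor, File stylesheet)
            throws SaxonApiException {
        // Parse options attached to this one source rather than to the Configuration.
        ParseOptions options = new ParseOptions()
                .withParserProperty(LIMIT_PROPERTY, "128000");

        // Wrap the plain source; Saxon may still force-set some options for stylesheets.
        AugmentedSource source =
                new AugmentedSource(new StreamSource(stylesheet), options);

        XsltCompiler compiler = processor.newXsltCompiler();
        return compiler.compile(source);
    }

    public static void main(String[] args) throws SaxonApiException {
        Processor processor = new Processor(false);
        XsltExecutable executable = compile(processor, new File(args[0]));
        System.out.println(executable);
    }
}
```

Compared with the DocumentBuilder workaround in the description, this keeps the direct compile path and scopes the property to the stylesheet source alone.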

Actions #2

Updated by Martin Honnen 10 months ago

Thanks, Mike, that makes sense.

Actions #3

Updated by Michael Kay 7 months ago

This came up again in #6152. The documentation is very poor at explaining exactly what is affected when you set parser options at configuration level. The problem is that there are about 100 places that do use the Configuration's parser options, and plenty of others that don't, which makes it very difficult to make any kind of definitive statement; it also means that any changes we make to try and rationalise things are almost certain to break something.
