Bug #1793
closed
Problem with EntityResolver and document()
100%
Description
Reported by Nicolas Houillon:
Hi,
I ran into a strange problem when using an entity resolver with the parser that reads files loaded with the document() function.
I use document() to load a set of files that I want to merge. This worked fine until the URL in their doctype declaration stopped resolving.
As I can't change the original XML files, I wrote an entity resolver and supplied it to the parser used by Saxon with System.setProperty("javax.xml.parsers.SAXParserFactory", "org.test.TempSAXParserFactory");.
The strange part is that my entity resolver is called on only half of the document() calls; the other half fail as if it weren't there at all (the first call works, the second fails, the third works, the fourth fails, and so on).
I made an example to illustrate the problem and am attaching it to this post; you can run it with mvn test.
The expected result would be:
==================================
Source document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<documents>
<path>xml/1.xml</path>
<path>xml/2.xml</path>
<path>xml/3.xml</path>
<path>xml/4.xml</path>
<path>xml/5.xml</path>
<path>xml/6.xml</path>
</documents>
==================================
Starting transform
==================================
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/1.xml
systemId : 1
systemId match, using resource dtd
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/2.xml
systemId : 2
systemId match, using resource dtd
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/3.xml
systemId : 3
systemId match, using resource dtd
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/4.xml
systemId : 4
systemId match, using resource dtd
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/5.xml
systemId : 5
systemId match, using resource dtd
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/6.xml
systemId : 6
==================================
Transform ended
==================================
==================================
Result document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<result>
<value>1</value>
<value>2</value>
<value>3</value>
<value>4</value>
<value>5</value>
<value>6</value>
</result>
But I get:
==================================
Source document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<documents>
<path>xml/1.xml</path>
<path>xml/2.xml</path>
<path>xml/3.xml</path>
<path>xml/4.xml</path>
<path>xml/5.xml</path>
<path>xml/6.xml</path>
</documents>
==================================
Starting transform
==================================
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/1.xml
systemId : 1
systemId match, using resource dtd
Recoverable error on line 7 of test.xsl:
FODC0002: I/O error reported by XML parser processing
file:/home/houillon/workspace/saxon-test/xml/2.xml:
/home/houillon/workspace/saxon-test/xml/2 (No such file or directory)
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/3.xml
systemId : 3
systemId match, using resource dtd
Recoverable error on line 7 of test.xsl:
FODC0002: I/O error reported by XML parser processing
file:/home/houillon/workspace/saxon-test/xml/4.xml:
/home/houillon/workspace/saxon-test/xml/4 (No such file or directory)
Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/5.xml
systemId : 5
systemId match, using resource dtd
Recoverable error on line 7 of test.xsl:
FODC0002: I/O error reported by XML parser processing
file:/home/houillon/workspace/saxon-test/xml/6.xml:
/home/houillon/workspace/saxon-test/xml/6 (No such file or directory)
==================================
Transform ended
==================================
==================================
Result document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<result>
<value>1</value>
<value>3</value>
<value>5</value>
</result>
I tried Saxon-HE versions 9.4.0.7 and 9.5.0.2 from the Maven repository, with similar results.
Updated by Michael Kay over 11 years ago
We definitely need to fix this bug, but as a workaround, define a URIResolver that returns a SAXSource containing the required XMLReader fully initialized with its EntityResolver. Saxon will not cache or modify such an XMLReader, unlike one that it creates for itself using the JAXP factory mechanism.
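The workaround described above could look roughly like the following. This is a minimal sketch, not code from Saxon or from the attached example: the class name EntityResolvingURIResolver is hypothetical, and it assumes that href can be resolved against base with java.net.URI (plain file: or http: URIs, no spaces or other characters that need escaping). It uses only standard JAXP and SAX calls.

```java
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper: hands the XSLT processor a SAXSource whose XMLReader
// already carries the user's EntityResolver.
public class EntityResolvingURIResolver implements URIResolver {

    private final EntityResolver entityResolver;

    public EntityResolvingURIResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Turn the (possibly relative) href into an absolute system ID.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : new URI(base).resolve(href).toString();

            // Create a fresh reader for every call, so that no parser state
            // is shared between document() calls.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver(entityResolver);

            return new SAXSource(reader, new InputSource(systemId));
        } catch (Exception e) {
            throw new TransformerException("Cannot create source for " + href, e);
        }
    }
}
```

It would be registered through the standard JAXP call transformer.setURIResolver(new EntityResolvingURIResolver(myEntityResolver)), where myEntityResolver stands for the resolver you already wrote. Creating a new parser per document costs a little more than reusing one, which seems an acceptable trade-off for a temporary workaround.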
Updated by O'Neil Delpratt over 11 years ago
- Status changed from New to Resolved
Bug fixed and committed to Subversion on the 9.4 and 9.5 branches; the fix will be available in the next maintenance release of each branch. The fix was to maintain the user EntityResolver when the sourceParser is reused.
Updated by O'Neil Delpratt over 11 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in version set to 9.4.0.8, 9.5.1.1
Bug now closed. The fix was successfully applied to the Saxon maintenance releases 9.4.0.8 and 9.5.1.1.