Bug #1793

closed

Problem with EntityResolver and document()

Added by O'Neil Delpratt almost 9 years ago. Updated almost 9 years ago.

Status: Closed
Priority: Normal
Category: Internals
Sprint/Milestone: -
Start date: 2013-06-06
% Done: 100%

Description

Reported by Nicolas Houillon:

Hi,

I encountered a really weird problem when using an entity resolver with the parser used to read files loaded with the document() function.

I use document() to load a bunch of files that I want to merge. This worked fine until the URL in their DOCTYPE declaration stopped working.

As I can't change the original XML files, I wrote an entity resolver and supplied it to the parser used by Saxon with System.setProperty("javax.xml.parsers.SAXParserFactory", "org.test.TempSAXParserFactory");
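For readers unfamiliar with this setup, an entity resolver of the kind described could be sketched as follows. This is not the reporter's TempEntityResolver (the attachment is not reproduced here): the class name LocalDtdResolver, the inline DTD, and the ".dtd" matching rule are all assumptions made for illustration. It also uses the plain SAX EntityResolver interface, whereas the name and baseURI fields in the log output below suggest the reporter's class implements EntityResolver2.

```java
import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver: serves a local DTD instead of fetching a dead URL. */
class LocalDtdResolver implements EntityResolver {

    // Assumption: a stand-in for the real DTD, which the report does not show.
    private static final String LOCAL_DTD = "<!ELEMENT value (#PCDATA)>";

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Assumption: every identifier ending in ".dtd" maps to the local copy.
        if (systemId != null && systemId.endsWith(".dtd")) {
            InputSource local = new InputSource(new StringReader(LOCAL_DTD));
            local.setSystemId(systemId);
            return local;
        }
        // Returning null tells the parser to resolve the entity itself.
        return null;
    }
}
```

Such a resolver is normally installed on a parser with XMLReader.setEntityResolver(new LocalDtdResolver()); the custom SAXParserFactory named in the system property above would presumably do this for every parser it creates.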

Now, the weird part is that my entity resolver is called on only half of the document() calls, while the other half fail as if it weren't there at all (the first call works, the second fails, the third works, the fourth fails, etc.).

I made an example to illustrate the problem, which I'm attaching to this post. You can run it with mvn test.

The expected result would be:

==================================
Source document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<documents>
   <path>xml/1.xml</path>
   <path>xml/2.xml</path>
   <path>xml/3.xml</path>
   <path>xml/4.xml</path>
   <path>xml/5.xml</path>
   <path>xml/6.xml</path>
</documents>
==================================
Starting transform
==================================

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/1.xml
systemId : 1

systemId match, using resource dtd

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/2.xml
systemId : 2

systemId match, using resource dtd

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/3.xml
systemId : 3

systemId match, using resource dtd

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/4.xml
systemId : 4

systemId match, using resource dtd

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/5.xml
systemId : 5

systemId match, using resource dtd

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/6.xml
systemId : 6
==================================
Transform ended
==================================
==================================
Result document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<result>
   <value>1</value>
   <value>2</value>
   <value>3</value>
   <value>4</value>
   <value>5</value>
   <value>6</value>
</result>
But I get:

==================================
Source document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<documents>
   <path>xml/1.xml</path>
   <path>xml/2.xml</path>
   <path>xml/3.xml</path>
   <path>xml/4.xml</path>
   <path>xml/5.xml</path>
   <path>xml/6.xml</path>
</documents>
==================================
Starting transform
==================================

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/1.xml
systemId : 1

systemId match, using resource dtd

Recoverable error on line 7 of test.xsl:
  FODC0002: I/O error reported by XML parser processing
  file:/home/houillon/workspace/saxon-test/xml/2.xml:
  /home/houillon/workspace/saxon-test/xml/2 (No such file or directory)

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/3.xml
systemId : 3

systemId match, using resource dtd

Recoverable error on line 7 of test.xsl:
  FODC0002: I/O error reported by XML parser processing
  file:/home/houillon/workspace/saxon-test/xml/4.xml:
  /home/houillon/workspace/saxon-test/xml/4 (No such file or directory)

Resolving with TempEntityResolver :
name : null
publicId : null
baseURI : file:/home/houillon/workspace/saxon-test/xml/5.xml
systemId : 5

systemId match, using resource dtd

Recoverable error on line 7 of test.xsl:
  FODC0002: I/O error reported by XML parser processing
  file:/home/houillon/workspace/saxon-test/xml/6.xml:
  /home/houillon/workspace/saxon-test/xml/6 (No such file or directory)
==================================
Transform ended
==================================
==================================
Result document
==================================
<?xml version="1.0" encoding="UTF-8"?>
<result>
   <value>1</value>
   <value>3</value>
   <value>5</value>
</result>

I tried Saxon-HE versions 9.4.0.7 and 9.5.0.2 from the Maven repository, with similar results.

Actions #1

Updated by O'Neil Delpratt almost 9 years ago

  • Description updated (diff)
Actions #2

Updated by Michael Kay almost 9 years ago

We definitely need to fix this bug, but as a workaround, define a URIResolver that returns a SAXSource containing the required XMLReader fully initialized with its EntityResolver. Saxon will not cache or modify such an XMLReader, unlike one that it creates for itself using the JAXP factory mechanism.
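The workaround described above could be sketched roughly as follows, using only standard JAXP and SAX classes. The class name ResolvingUriResolver is hypothetical, and resolving href against base with java.net.URI is an assumption about how the document URIs should be built; it is not taken from Saxon's code or from the reporter's project.

```java
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

/** Hypothetical URIResolver that hands back a fully configured XMLReader. */
class ResolvingUriResolver implements URIResolver {

    private final EntityResolver entityResolver;

    ResolvingUriResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Assumption: href is resolved against base as a plain URI reference.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : URI.create(base).resolve(href).toString();

            // Create a fresh reader per document so no pooled parser is involved.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver(entityResolver);

            // The SAXSource carries both the reader and the document location.
            return new SAXSource(reader, new InputSource(systemId));
        } catch (Exception e) {
            throw new TransformerException("Cannot build source for " + href, e);
        }
    }
}
```

It would be registered through the standard JAXP call transformer.setURIResolver(new ResolvingUriResolver(myEntityResolver)), where myEntityResolver stands for the user's own resolver. Creating a new reader for each document is the simplest way to keep parser reuse out of the picture, at the cost of some parser start-up overhead.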

Actions #3

Updated by O'Neil Delpratt almost 9 years ago

  • Status changed from New to Resolved

Bug fixed and committed to Subversion on the 9.4 and 9.5 branches; the fix will be available in the next maintenance release of each branch. The fix was to retain the user's EntityResolver when the sourceParser is reused.

Actions #4

Updated by O'Neil Delpratt almost 9 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in version set to 9.4.0.8 9.5.1.1

Bug now closed. The fix was successfully applied to the Saxon maintenance releases 9.4.0.8 and 9.5.1.1.
