Bug #2187

closed

Entity Re-Declarations Bug

Added by Yvon DUBOIS about 10 years ago. Updated almost 9 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: XSLT conformance
Sprint/Milestone: -
Start date: 2014-10-17
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 9.5, 9.6
Fix Committed on Branch: 9.5, 9.6
Fixed in Maintenance Release:
Platforms:

Description

Hi, According to "Extensible Markup Language (XML) 1.1 (Second Edition)", W3C Recommendation 16 August 2006, edited in place 29 September 2006, the first declaration of an entity is binding. See "4.2 Entity Declarations": "If the same entity is declared more than once, the first declaration encountered is binding; ..."

URL: http://www.w3.org/TR/xml11/#sec-entity-decl

To my knowledge, from Saxon 8.3 up to Saxon-EE 9.5.1.5J, Saxon binds the last declaration instead.

Our multilingual international environment makes extensive use of external entity declarations for images. The order in which they are bound is crucial for our operations. Example:

For a Spanish publication, our first declaration would be something like

and a second declaration further down the DTD modules for common images in other languages like

We expect the Spanish image to be used, but Saxon binds the second declaration and we end up with the common image rather than the "es"panol one.

We've been using a work-around where we use a reverse order of entity declarations when generating output from XML. But we need to keep the Recommendation order for compliance with the authoring software and composition engine.

Is this a non-compliance in Saxon that was not yet reported? Or is there a command line option to address this issue (searched the documentation but was unsuccessful)? It would be great to cease using this reverse order of entity declarations.

We currently use the following Saxon and Java versions:

Saxon-EE 9.5.1.5J from Saxonica

Java version 1.7.0_67

Regards,

Yvon


Files

entityTest.zip (248 KB), Yvon DUBOIS, 2014-11-18 23:55
Actions #1

Updated by Michael Kay about 10 years ago

  • Priority changed from High to Normal

If this is a bug, then it is a bug in the XML parser you are using, not in Saxon. Saxon knows nothing of entity declarations; that's left entirely to the XML parser. By default, on the Java platform, Saxon uses the XML parser built in to the JDK. If you are having problems with the conformance of this parser, then the first thing to do is switch to using the version of Xerces from Apache, which is far more conformant. If that still gives problems, then you need to raise the issue with the developers of that parser, not with Saxon.

Actions #2

Updated by Michael Kay about 10 years ago

  • Status changed from New to Rejected

Closing this as "rejected" on the grounds that if there is a bug, it's a bug in the XML parser, not in Saxon.

Actions #3

Updated by Yvon DUBOIS about 10 years ago

Thank you, Michael, for your reply; it helped me focus on the area that might be the cause. I started by identifying the XML parser used in our environment:

Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
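For anyone who wants to run the same check outside Saxon, here is a minimal sketch that asks JAXP which SAX parser it would hand out by default. The class name WhichParser and the method readerClassName are made up for this illustration; the JAXP calls themselves are standard.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper (not part of Saxon): reports the default JAXP SAX parser.
final class WhichParser {

    // Returns the implementation class name of the XMLReader that JAXP creates.
    static String readerClassName() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            XMLReader reader = factory.newSAXParser().getXMLReader();
            return reader.getClass().getName();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("No usable SAX parser found", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Using parser " + readerClassName());
    }
}
```

The result depends on the classpath and on the javax.xml.parsers.SAXParserFactory system property. Assuming the Apache Xerces jar is on the classpath, setting that property to org.apache.xerces.jaxp.SAXParserFactoryImpl should select Apache Xerces rather than the JDK's built-in copy.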

Then I created a test set of files to isolate where the order of entity resolution becomes non-compliant.

I found that the XPath function "unparsed-entity-uri()" gives the wrong result. In the attached zip file, the XML file uses a processing instruction, and it works fine in IE8. The XML editor we use (currently XMetaL) also keeps the first declaration. When generating an actual HTML file with Saxon 9.5 EE, the last declaration is kept for the image. Can you point me in the right direction to find a solution to this issue?

Thank you so much and Best Regards,

Yvon

Actions #4

Updated by Michael Kay about 10 years ago

  • Category set to XSLT conformance
  • Status changed from Rejected to In Progress
  • Assignee set to Michael Kay

OK, sorry, I see now that the parser is reporting all the entity declarations to Saxon, and that Saxon has a choice which one to use.

I've raised a spec bug to ask for the spec in this area to be clarified, since this should really be defined in the XPath data model and/or the XML infoset:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=27369

But I agree that Saxon should be taking the first of the entities, not the last.
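To illustrate the rule under discussion, here is a minimal sketch of a SAX DTDHandler that keeps only the first declaration of each unparsed entity, as XML section 4.2 requires. This is not Saxon's actual patch: the class FirstDeclarationWins, the method systemIdOf, the entity name "logo" and the example.com URLs are all invented for this illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical handler (not Saxon code): the first unparsed entity declaration wins.
final class FirstDeclarationWins implements DTDHandler {

    private final Map<String, String> systemIds = new LinkedHashMap<>();

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        // Notations are not needed for this illustration.
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId,
                                   String systemId, String notationName) {
        // Ignore any re-declaration: the first declaration is binding.
        if (!systemIds.containsKey(name)) {
            systemIds.put(name, systemId);
        }
    }

    // Returns the system identifier bound to the entity, or null if undeclared.
    String systemIdOf(String entityName) {
        return systemIds.get(entityName);
    }

    public static void main(String[] args) throws Exception {
        // Made-up document declaring the same unparsed entity twice.
        String xml =
              "<!DOCTYPE doc [\n"
            + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
            + "  <!ENTITY logo SYSTEM \"http://example.com/es/logo.gif\" NDATA gif>\n"
            + "  <!ENTITY logo SYSTEM \"http://example.com/common/logo.gif\" NDATA gif>\n"
            + "]>\n"
            + "<doc/>";
        FirstDeclarationWins handler = new FirstDeclarationWins();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setDTDHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        System.out.println("logo -> " + handler.systemIdOf("logo"));
    }
}
```

Whether the handler sees both declarations depends on the parser. The sketch only shows that when a parser reports every declaration, the receiving code has to discard the later ones itself.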

Actions #5

Updated by Michael Kay about 10 years ago

  • Found in version set to 9.5

Patched in Subversion on the 9.5, 9.6, and 9.7 branches for both the TinyTree and LinkedTree builders.

Actions #6

Updated by Michael Kay about 10 years ago

  • Status changed from In Progress to Resolved

Submitted this repro, with minor tweaking, as W3C XSLT 3.0 test case unparsed-entity-uri-050.

Actions #7

Updated by O'Neil Delpratt almost 10 years ago

Bug fix applied in the Saxon 9.6.0.3 maintenance release.

Actions #8

Updated by O'Neil Delpratt almost 10 years ago

  • Fixed in version set to 9.6.0.3
Actions #9

Updated by O'Neil Delpratt almost 10 years ago

  • Found in version changed from 9.5 to 9.5 9.6
Actions #10

Updated by O'Neil Delpratt over 9 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in version changed from 9.6.0.3 to 9.5.1.10

Bug fix applied in the Saxon 9.5.1.10 maintenance release

Actions #11

Updated by O'Neil Delpratt almost 9 years ago

  • Applies to branch 9.5, 9.6 added
Actions #12

Updated by O'Neil Delpratt almost 9 years ago

  • Fix Committed on Branch 9.5, 9.6 added
  • Fixed in Maintenance Release 9.6.0.3, 9.5.1.10 added
