Bug #3018

closed

Encoding issue with SEF files

Added by Michael Kay about 8 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Category:
XSLT export
Sprint/Milestone:
-
Start date:
2016-11-04
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
9.7
Fix Committed on Branch:
9.7, trunk
Fixed in Maintenance Release:
Platforms:

Description

An SEF file containing the attribute card="°" is not being successfully read (on .NET). It fails with IllegalStateException complaining about the value of this attribute. It seems the SEF file has no encoding declaration, so there is scope for reading the file using the wrong encoding.


Files

test_decl.zip (937 Bytes), added by T Hata, 2016-11-06 05:16
Actions #1

Updated by Michael Kay about 8 years ago

  • Status changed from New to Resolved
  • Fix Committed on Branch 9.7, 9.8 added

I have changed the default serialization properties set in ExpressionPresenter so it always uses encoding="utf-8", omit-xml-declaration="no", and version="1.0". Although I haven't actually reproduced the problem, I'm pretty confident this should fix it.
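For illustration only, here is a minimal sketch of what those three serialization properties do, using the standard JAXP Transformer API rather than Saxon's ExpressionPresenter. The class name `ExplicitEncodingDemo` and the `arg` element are hypothetical; only the `card` attribute value comes from this issue.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; this is not the code that was changed in Saxon.
public class ExplicitEncodingDemo {

    // Serializes a tiny document whose attribute holds U+00B0 (degree sign),
    // forcing UTF-8 output with an XML declaration.
    static byte[] serialize() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element e = doc.createElement("arg");   // element name is an assumption
        e.setAttribute("card", "\u00B0");
        doc.appendChild(e);

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
        t.setOutputProperty(OutputKeys.VERSION, "1.0");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // The bytes are UTF-8, so decode them as UTF-8 for display.
        System.out.println(new String(serialize(), StandardCharsets.UTF_8));
    }
}
```

With the declaration present, a reader no longer has to guess the encoding of the file; it can take it from the declaration.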

Actions #2

Updated by T Hata about 8 years ago

So, for now, just prepending <?xml version="1.0" encoding="UTF-8"?> to the export file should enable it to work on .NET?

Actions #3

Updated by Michael Kay about 8 years ago

Prepending an XML declaration should work provided that the XML declaration is correct. Obviously if the file is in iso-8859-1 and you prepend a declaration saying it is in UTF-8 then this will fail. I wouldn't be confident about the correct encoding without knowing more detail of the exact circumstances.
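To make the mismatch concrete, the following sketch (class and method names are hypothetical) decodes the bytes of the degree sign under the wrong charset in each direction. It uses plain `String` decoding, which substitutes U+FFFD for malformed input; an XML parser reading a mislabelled file would normally report an error instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing what a wrong encoding label does to U+00B0.
public class EncodingMismatchDemo {

    // File written as UTF-8 but read as ISO-8859-1.
    static String utf8ReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
    }

    // File written as ISO-8859-1 but read as UTF-8.
    static String latin1ReadAsUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    // Renders each UTF-16 code unit as U+XXXX so console encoding is irrelevant.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String card = "\u00B0";
        System.out.println(hex(utf8ReadAsLatin1(card))); // U+00C2 U+00B0
        System.out.println(hex(latin1ReadAsUtf8(card))); // U+FFFD
    }
}
```

In either direction the attribute value that reaches the loader is no longer the single character U+00B0.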

Actions #4

Updated by T Hata about 8 years ago

I tried to work around the problem by prepending an XML declaration, but it didn't work:

P:\>"C:\Program Files\Saxonica\SaxonEE9.7N\bin\Transform.exe" -it:main -t -xsl:test_decl.sef
Saxon-EE 9.7.0.11N from Saxonica
.NET 4.0.30319.34014 on Microsoft Windows NT 6.2.9200.0
Using license serial number [snip]
URIResolver.resolve href="file:/P://test_decl.sef" base="null"
Using parser org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser
java.lang.IllegalStateException
        at com.saxonica.trans.PackageLoaderPE$23.loadFrom(PackageLoaderPE.java:1565)
        at com.saxonica.trans.PackageLoaderPE.loadExpression(PackageLoaderPE.java:934)
        at com.saxonica.trans.PackageLoaderPE.readNamedTemplate(PackageLoaderPE.java:518)
        at com.saxonica.trans.PackageLoaderPE.readComponents(PackageLoaderPE.java:391)
        at com.saxonica.trans.PackageLoaderPE.loadPackageElement(PackageLoaderPE.java:232)
        at com.saxonica.trans.PackageLoaderPE.loadPackageDoc(PackageLoaderPE.java:153)
        at net.sf.saxon.style.StylesheetModule.loadStylesheet(StylesheetModule.java:246)
        at net.sf.saxon.style.Compilation.compileSingletonPackage(Compilation.java:101)
        at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:859)
        at net.sf.saxon.Transform.doTransform(Transform.java:727)
        at cli.Saxon.Cmd.DotNetTransform.Main(Unknown Source)
Fatal error during transformation: java.lang.IllegalStateException:  (no message)

No problem on Java with the same file:

P:\>java -classpath saxon9ee.jar net.sf.saxon.Transform -it:main -t -xsl:test_decl.sef
Saxon-EE 9.7.0.11J from Saxonica
Java version 1.8.0_111
Using license serial number [snip]
Stylesheet compilation time: 963.9542ms
Processing  (no source document) initial template = main
Hello, World!
<?xml version="1.0" encoding="UTF-8"?>Execution time: 22.1675ms
Memory used: 23084928
Actions #5

Updated by Michael Kay about 8 years ago

  • Status changed from Resolved to In Progress
  • Assignee changed from Michael Kay to O'Neil Delpratt
Actions #6

Updated by O'Neil Delpratt about 8 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Bug fix committed in the class PackageLoaderPE.

The offending occurrence indicator card="°" was represented as a string literal in the equality check. We changed this to its Unicode escape '\u00B0', which worked. It is unclear why Java and .NET differ in their handling of this character.

Added an NUnit test case for .NET.
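A minimal sketch of the escape-based comparison follows. It is not the actual PackageLoaderPE code; the class, constant, and method names are hypothetical. The point is only that a `\u00B0` escape is pure ASCII in the source file, so the compiled constant does not depend on the encoding the compiler assumes for that file.

```java
// Hypothetical illustration; names do not come from the Saxon source.
public class CardinalityCheck {

    // Written as a Unicode escape, so the source file stays pure ASCII.
    static final String DEGREE_SIGN = "\u00B0";

    // Returns true when the attribute value is exactly the single character U+00B0.
    static boolean isDegreeSign(String card) {
        return DEGREE_SIGN.equals(card);
    }

    public static void main(String[] args) {
        System.out.println(isDegreeSign(String.valueOf((char) 0xB0))); // true
        System.out.println(isDegreeSign("\u00C2\u00B0"));              // false
    }
}
```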

Actions #7

Updated by O'Neil Delpratt about 8 years ago

  • Status changed from Resolved to Closed
  • Fixed in Maintenance Release 9.7.0.12 added

Bug fix applied in the Saxon 9.7.0.12 maintenance release.

Actions #8

Updated by O'Neil Delpratt over 7 years ago

  • Applies to branch deleted (9.8)
  • Fix Committed on Branch trunk added
  • Fix Committed on Branch deleted (9.8)
