Bug #4467
Closed
transformToString() Encoding issue
100%
Description
Issue reported by a Saxon/C user of the PHP extension:
transformToString() => encoding issue with ISO-8859-1 specific characters (output utf-8?).
The string from the transformation comes back in the ISO-8859-1 encoding but is being decoded as UTF-8 in the C++ code by the JNI function GetStringUTFChars.
I also noticed that NewStringUTF can potentially cause encoding issues too.
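To illustrate why the two encodings cannot be swapped freely, here is a minimal sketch of re-encoding ISO-8859-1 bytes as UTF-8. The helper name latin1ToUtf8 is hypothetical and is not part of SaxonC. Note also that, per the JNI specification, GetStringUTFChars and NewStringUTF work with modified UTF-8, which differs from standard UTF-8 for U+0000 and for supplementary characters; that difference is not handled here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of SaxonC): re-encode ISO-8859-1 bytes as UTF-8.
// Each ISO-8859-1 byte has the same value as its Unicode code point, so
// bytes below 0x80 are copied and bytes 0x80..0xFF become two-byte sequences.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size() * 2);
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            utf8.push_back(static_cast<char>(c));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (c >> 6)));   // lead byte 0xC2 or 0xC3
            utf8.push_back(static_cast<char>(0x80 | (c & 0x3F))); // continuation byte
        }
    }
    return utf8;
}

// Usage sketch: 'é' is one byte (0xE9) in ISO-8859-1 but two bytes (0xC3 0xA9) in UTF-8.
inline void latin1ToUtf8Example() {
    assert(latin1ToUtf8("\xE9") == "\xC3\xA9");
}
```

Because the byte sequences differ for every character above 0x7F, reading one encoding as the other corrupts exactly the ISO-8859-1 specific characters mentioned in the report.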
Updated by O'Neil Delpratt almost 5 years ago
So it looks like the JNI function GetStringUTFChars successfully encodes the jstring to a UTF-8 char array. But we have the meta element:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
The browser will then try to render the document as ISO-8859-1, when in fact transformToString has encoded the string as UTF-8.
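One way a caller could reconcile the returned bytes with the declared charset is to convert the UTF-8 result to ISO-8859-1 before serving it. The sketch below assumes standard UTF-8 input and uses a hypothetical helper, utf8ToLatin1, which is not a SaxonC function and is not how the issue was resolved in SaxonC 11.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of SaxonC): convert UTF-8 to ISO-8859-1.
// Returns false if the input is malformed or contains a character above
// U+00FF, which ISO-8859-1 cannot represent. On failure, 'latin1' holds
// only the bytes converted so far.
bool utf8ToLatin1(const std::string& utf8, std::string& latin1) {
    latin1.clear();
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) {
            // ASCII is identical in both encodings.
            latin1.push_back(static_cast<char>(lead));
        } else if (lead == 0xC2 || lead == 0xC3) {
            // Only these lead bytes encode U+0080..U+00FF.
            if (i + 1 >= utf8.size()) return false;
            unsigned char trail = static_cast<unsigned char>(utf8[++i]);
            if ((trail & 0xC0) != 0x80) return false;
            latin1.push_back(static_cast<char>(((lead & 0x03) << 6) | (trail & 0x3F)));
        } else {
            return false; // stray continuation byte or character outside ISO-8859-1
        }
    }
    return true;
}
```

For example, the UTF-8 bytes 0xC3 0xA9 ('é') convert to the single byte 0xE9, while the euro sign (0xE2 0x82 0xAC) is rejected because it has no ISO-8859-1 representation.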
Updated by O'Neil Delpratt almost 3 years ago
- Status changed from In Progress to Resolved
We have redesigned the handling of string encoding in SaxonC 11.
Updated by O'Neil Delpratt almost 3 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in version set to 11.1
Bug fix applied in the SaxonC 11.1 release.
Updated by O'Neil Delpratt over 2 years ago
- Related to Support #4638: Output xsl3 transform to UTF-16LE with BOM added