transformToString() Encoding issue
Issue reported by Saxon/C user in the PHP extension:
transformToString() => encoding issue with ISO-8859-1 specific characters (output utf-8?).
The string from the transformation comes back in the ISO-8859-1 encoding, but is being decoded as UTF-8 in the C++ code by the JNI function GetStringUTFChars.
I also noticed that NewStringUTF can potentially cause encoding issues too.
#1 Updated by O'Neil Delpratt 6 months ago
So it looks like the JNI function GetStringUTFChars successfully encodes the jstring into a UTF-8 char array. But the output contains the meta element:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
The browser will then try to render the document as ISO-8859-1, when in fact transformToString has encoded the string as UTF-8.
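One possible workaround on the C++ side is to re-encode the returned bytes so that they match the declared charset. The following is only a minimal sketch, not the Saxon/C fix: utf8ToLatin1 is a hypothetical helper, and it assumes the input is standard UTF-8 (GetStringUTFChars actually returns JNI "modified UTF-8", which differs for U+0000 and for characters outside the Basic Multilingual Plane).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Saxon/C): re-encode a UTF-8 byte string as
// ISO-8859-1 so the bytes agree with a charset=ISO-8859-1 declaration.
// Throws std::invalid_argument on malformed input and std::range_error for
// code points above U+00FF, which ISO-8859-1 cannot represent.
std::string utf8ToLatin1(const std::string& utf8) {
    std::string out;
    out.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) {
            // ASCII is identical in both encodings.
            out.push_back(static_cast<char>(lead));
        } else if (lead == 0xC2 || lead == 0xC3) {
            // Two-byte sequences with these lead bytes cover U+0080..U+00FF.
            if (i + 1 >= utf8.size()) {
                throw std::invalid_argument("truncated UTF-8 sequence");
            }
            const unsigned char cont = static_cast<unsigned char>(utf8[++i]);
            if ((cont & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            out.push_back(static_cast<char>(((lead & 0x03) << 6) | (cont & 0x3F)));
        } else if (lead >= 0xC4 && lead <= 0xF4) {
            // Valid lead byte, but the code point is above U+00FF.
            throw std::range_error("code point outside ISO-8859-1");
        } else {
            // Stray continuation byte, overlong lead (0xC0/0xC1), or 0xF5+.
            throw std::invalid_argument("malformed UTF-8 lead byte");
        }
    }
    // ISO-8859-1 uses one byte per character, so output never grows.
    assert(out.size() <= utf8.size());
    return out;
}
```

For example, the UTF-8 bytes C3 A9 (é) become the single byte E9. The other direction is probably simpler where it is an option: if the stylesheet declares UTF-8 as its output encoding, the meta element and the bytes returned by transformToString should agree without any conversion.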