Bug #424
closed
Serialization of Japanese characters corrupts XML
Description
SourceForge user: kcritz
I am constructing a DOM in Java that includes
Japanese characters. When I try to serialize this DOM,
the "<" character of the closing tag after certain Japanese
text is not written properly. The text itself is also
not written properly.
I have attached several files which demonstrate the issue:
- A simplified Java test file
- A screenshot of the Japanese section of the file
- An example of the result file
- A screenshot of the result file
Interestingly enough, the result file is parseable by
Xerces, though JADE has trouble reading it.
Am I doing something wrong in my serialization, or is
this a legit bug in SAXON?
Files
Updated by Anonymous over 20 years ago
SourceForge user: kcritz
Logged In: YES
user_id=189759
Using SAXON 6.5.3, if you're interested
Updated by Anonymous over 20 years ago
SourceForge user: mhkay
Logged In: YES
user_id=251681
PLEASE do not enter suspected bugs in this area of the site
until they have been confirmed. There is a bright yellow
notice asking you not to do this on the "Submit New" page -
I fail to see how people can fail to see this.
I want people to be able to browse the bugs area knowing
that it only contains real bugs.
I'm afraid I can't see what's wrong with the output. It
appears to be correctly encoded UTF-8, and is a well-formed
XML file. I can't tell whether the output is correct,
because I don't know what the encoding used in your Java
source file is - it doesn't appear to be UTF-8, as far as I
can see.
I am closing this bug because you raised it in the wrong
place. Please use the saxon-help list or forum.
Michael Kay
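The encoding question raised in the reply above can be checked with a small, self-contained test. What follows is only a minimal sketch and is not the reporter's attached test file. It assumes the standard JAXP identity transformer (whichever implementation is on the classpath) rather than any SAXON-specific API, and the class name `JapaneseSerializationSketch`, the element name `greeting`, and the sample text are all hypothetical. The Japanese text is written with Unicode escapes so that the result does not depend on the encoding the compiler assumes for the source file, and the output encoding is requested explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class JapaneseSerializationSketch {

    // Three Japanese characters (the word "nihongo"), written as Unicode
    // escapes so the literal is independent of the source file encoding.
    // Hypothetical sample text, not taken from the original report.
    static final String SAMPLE_TEXT = "\u65E5\u672C\u8A9E";

    // Builds a one-element DOM and serializes it to UTF-8 bytes.
    static byte[] serialize() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("greeting");
        root.appendChild(doc.createTextNode(SAMPLE_TEXT));
        doc.appendChild(root);

        // A Transformer created without a stylesheet is an identity transform.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        // Write to a byte stream, not a Writer, so the serializer controls
        // the character-to-byte encoding itself.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Parses the serialized bytes back and returns the root element's text.
    static String roundTrip(byte[] xmlBytes) throws Exception {
        Document parsed = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return parsed.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String recovered = roundTrip(serialize());
        System.out.println("Round trip preserved text: "
                + SAMPLE_TEXT.equals(recovered));
    }
}
```

If the round trip preserves the text here but not when the same characters are typed directly into the source file, the likely cause is the encoding the compiler assumed for that file (for `javac`, the `-encoding` option), not the serializer.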