Bug #424

closed

Serialization of Japanese characters corrupts XML

Added by Anonymous almost 20 years ago. Updated about 12 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: Serialization
Sprint/Milestone: -
Start date:
Due date:
% Done: 0%
Estimated time:
Legacy ID: sf-966759
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

SourceForge user: kcritz

I am constructing a DOM from Java which includes Japanese characters. When I try to serialize this DOM, the "<" character of a close-tag after certain Japanese text is not written properly. Also, the text itself is not written properly.

I have attached several files which demonstrate the issue:

  • A simplified Java test file
  • A screenshot of the Japanese section of the file
  • An example of the result file
  • A screenshot of the result file

Interestingly enough, the result file is parseable by Xerces, though JADE has trouble reading it.

Am I doing something wrong in my serialization, or is this a legit bug in Saxon?


Files

EncodingTest.java (9.56 KB) Anonymous, 2004-06-04 19:35
EncodingTest.java.png (9.56 KB) Anonymous, 2004-06-04 19:39
EncodingTestResult.xml (9.56 KB) Anonymous, 2004-06-04 19:40
EncodingTestResult.xml.png (9.56 KB) Anonymous, 2004-06-04 19:40
#1

Updated by Anonymous almost 20 years ago

SourceForge user: kcritz

Logged In: YES

user_id=189759

Using Saxon 6.5.3, if you're interested.

#2

Updated by Anonymous almost 20 years ago

SourceForge user: mhkay

Logged In: YES

user_id=251681

PLEASE do not enter suspected bugs in this area of the site until they have been confirmed. There is a bright yellow notice asking you not to do this on the "Submit New" page - I fail to see how people can fail to see this.

I want people to be able to browse the bugs area knowing that it only contains real bugs.

I'm afraid I can't see what's wrong with the output. It appears to be correctly encoded UTF-8, and is a well-formed XML file. I can't tell whether the output is correct, because I don't know what the encoding used in your Java source file is - it doesn't appear to be UTF-8, as far as I can see.

I am closing this bug because you raised it in the wrong place. Please use the saxon-help list or forum.

Michael Kay
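
Editorial sketch: the attached EncodingTest.java is not reproduced on this page, so the following is not the reporter's code. It is a minimal illustration, under stated assumptions, of the point made in the reply above: if the Japanese text is written with Unicode escapes, the result no longer depends on the encoding the compiler assumes for the source file, and the serialized bytes can be checked by parsing them back. The class name `EncodingSketch` and its helper methods are hypothetical, and the sketch uses whichever JAXP `TransformerFactory` is on the classpath, which may or may not be Saxon.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration class; not taken from the bug report.
final class EncodingSketch {

    // Three Japanese characters written as Unicode escapes, so the string
    // is the same whatever encoding javac assumes for this file.
    static final String JAPANESE = "\u65E5\u672C\u8A9E";

    // Build a one-element DOM containing the Japanese text.
    static Document buildDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("text");
            root.appendChild(doc.createTextNode(JAPANESE));
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize with an identity transform, asking explicitly for UTF-8.
    // The result goes to a byte stream rather than a Writer, so the
    // serializer (not the platform default charset) chooses the bytes.
    static byte[] serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parse the serialized bytes again and return the root element's text.
    static String readBack(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(buildDocument());
        // Round trip: the parsed text should equal the original string.
        System.out.println(JAPANESE.equals(readBack(bytes)));
        // Decoding the same bytes with the wrong charset loses the text,
        // which is how correct UTF-8 output can look corrupted in a viewer.
        String misread = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(misread.contains(JAPANESE));
    }
}
```

If the round-trip check succeeds while the file still looks wrong in an editor or screenshot, the likelier culprit is the charset used to read the source file or to display the result, not the serializer. When literal Japanese characters are kept in the source instead of escapes, the source encoding can be declared to the compiler with `javac -encoding`.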
