Problem converting unicode to ascii

Added by Anonymous over 17 years ago

Legacy ID: #3976396 Legacy Poster: Toralion (toralion05)

Hello,

I am trying to convert XML to RTF and have problems converting Unicode characters to ANSI (RTF allows only ANSI chars). For that purpose I have to convert any character above code point 255 into its numeric value (string-to-codepoints) and write it (under the rules of RTF) into my target document.

The problem is: when I use Saxon8B from within Oxygen (I use it to develop the XSLs), everything works fine, but if I call the same template from my Java application, also using Saxon8B, umlauts are not translated correctly. E.g.

With Saxon from within Oxygen, Ä remains "Ä" (decimal code point 196).
With Saxon from my application, Ä becomes "Ã„" (decimal code points 195 8222).

Does anyone know if there are any 'special' settings or parameters in Saxon that could account for this difference? Any ideas? I would be thankful for any suggestion.

P.S. In both cases I use Saxon8B 8.7.1.
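For readers unfamiliar with the RTF rule mentioned above, here is a minimal Java sketch of that escaping step. It is not the poster's stylesheet (which does this in XSLT via string-to-codepoints); the class and method names are hypothetical. It assumes the usual RTF convention that a character is written as `\uN` with N a signed 16-bit decimal number, followed by one fallback character (here `?`) for readers that ignore the keyword. Escaping of backslashes and braces is deliberately left out.

```java
public class RtfEscapeSketch {

    // Hypothetical helper: leaves characters up to code point 255 untouched
    // and writes every other UTF-16 code unit as an RTF Unicode keyword
    // (backslash, 'u', signed 16-bit decimal value) plus a '?' fallback.
    static String escapeAbove255(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char unit = text.charAt(i);
            if (unit > 255) {
                // The cast to short yields the signed 16-bit value,
                // so code units above 32767 come out negative.
                out.append("\\u").append((int) (short) unit).append('?');
            } else {
                out.append(unit);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00C4 stays as it is; U+20AC (decimal 8364) is escaped.
        System.out.println(escapeAbove255("\u00C4 \u20AC"));
    }
}
```

Working on UTF-16 code units rather than code points means a character outside the Basic Multilingual Plane is emitted as two keywords, one per surrogate.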


Replies (2)

RE: Problem converting unicode to ascii - Added by Anonymous over 17 years ago

Legacy ID: #3976487 Legacy Poster: Michael Kay (mhkay)

It's hard to say what you're doing wrong because you don't say much about what you are doing. The output "Ã„" suggests a file in UTF-8 encoding that's being mis-displayed by software that doesn't understand UTF-8 or that doesn't know that the encoding is UTF-8. So the simplest explanation would be that you have set (or defaulted) the encoding to UTF-8 on xsl:output. If you're not doing anything special I would suggest that you use encoding="iso-8859-1". That's not quite the same as "ansi", which despite the name is a Microsoft-proprietary variant of iso-8859-1 differing in positions 128-159.
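The reported code points can be reproduced outside Saxon. The following Java sketch (class and method names are hypothetical) encodes Ä as UTF-8 and then decodes the two resulting bytes as windows-1252, which is the kind of mismatch described above. It only illustrates the symptom; it says nothing about where in the poster's pipeline the mismatch occurred.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes those bytes with the wrong
    // charset (windows-1252) and returns the resulting code points.
    static int[] misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String wrong = new String(utf8, Charset.forName("windows-1252"));
        return wrong.codePoints().toArray();
    }

    public static void main(String[] args) {
        // U+00C4 is the bytes C3 84 in UTF-8; read as windows-1252 they
        // become U+00C3 and U+201E.
        System.out.println(Arrays.toString(misdecode("\u00C4"))); // prints [195, 8222]
    }
}
```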

RE: Problem converting unicode to ascii - Added by Anonymous over 17 years ago

Legacy ID: #3978227 Legacy Poster: Toralion (toralion05)

Hello Michael,

thanks for your fast response... I fixed the problem today. It was not the output format that led to this problem. The problem was the way I read the XML. When I use StreamSource to read the xmlFile I get the above-mentioned problem:

--> Source source = new StreamSource(xmlFile);

The solution was to use a SAXSource to read the XML:

--> Source source = new SAXSource(xmlFile);

I don't know why the encoding of the XML file is not interpreted correctly when I read it via StreamSource, or what the differences between StreamSource and SAXSource are... but as long as it works... c'est la vie. ;-)

Thanks again for your help
Toralion
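The thread does not say what type xmlFile has, so the cause stays open. One possibility, offered only as an assumption, is that the input reached the parser as an already decoded character stream: a StreamSource built on a Reader hands over characters, so the encoding declared in the XML is no longer used, whereas a byte stream lets the parser detect it. The Java sketch below shows that difference with the JDK's built-in identity transformer instead of Saxon; all class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SourceEncodingDemo {

    // A UTF-8 encoded document whose only text content is U+00C4.
    static final byte[] XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>\u00C4</a>"
            .getBytes(StandardCharsets.UTF_8);

    // Byte stream: the parser reads the declaration and decodes as UTF-8.
    static Source fromBytes() {
        return new SAXSource(new InputSource(new ByteArrayInputStream(XML)));
    }

    // Character stream decoded with the wrong charset before parsing
    // (assumed here to be windows-1252, e.g. a platform default).
    static Source fromMisdecodedReader() {
        Reader reader = new InputStreamReader(
            new ByteArrayInputStream(XML), Charset.forName("windows-1252"));
        return new StreamSource(reader);
    }

    // Runs an identity transformation and returns the root element's text.
    static String textOf(Source source) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            DOMResult result = new DOMResult();
            identity.transform(source, result);
            return ((Document) result.getNode()).getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        textOf(fromBytes()).codePoints().forEach(System.out::println);
        textOf(fromMisdecodedReader()).codePoints().forEach(System.out::println);
    }
}
```

If this is what happened, passing a File, a system ID or an InputStream to either source class should avoid the problem, since the parser then sees the raw bytes.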
