Apparent Bug in Saxon 9.0.0.4 Handling \u2028

Added by Anonymous over 16 years ago

Legacy ID: #4904799 Legacy Poster: W. Eliot Kimber (drmacro)

I have content that contains character \u2028 (line separator). Any way I copy this character to the output results in character \u8232. Here is a test case that demonstrates the problem:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:template match="/">
    <result>
      <via-sequence>
        <xsl:sequence select="/"/>
      </via-sequence>
      <via-copy-of>
        <xsl:copy-of select="/"/>
      </via-copy-of>
      <via-apply-templates>
        <xsl:apply-templates/>
      </via-apply-templates>
    </result>
  </xsl:template>
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:if test="contains(., '&#x2028;')">
      <msg> + DEBUG: in text(): found 2028. </msg>
      <xsl:value-of select="."/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

Test input:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <test>Before2028[&#x2028;]after2028</test>
</doc>

And the output I get running the transform using Oxygen 9.2 and Saxon 9.0.0.4 (at least as reported by Oxygen's component list dialog):

<?xml version="1.0" encoding="UTF-8"?><result><via-sequence><doc> <test>Before2028[&#8232;]after2028</test> </doc></via-sequence><via-copy-of><doc> <test>Before2028[&#8232;]after2028</test> </doc></via-copy-of><via-apply-templates><doc><test><msg> + DEBUG: in text(): found 2028. </msg>Before2028[&#8232;]after2028</test></doc></via-apply-templates></result>

Cheers,
Eliot


Replies (3)

RE: Apparent Bug in Saxon 9.0.0.4 Handing \u2 - Added by Anonymous over 16 years ago

Legacy ID: #4904804 Legacy Poster: W. Eliot Kimber (drmacro)

Follow up: If I add this character map, then the 2028 character is correctly output:

<xsl:character-map name="bugfix">
  <xsl:output-character character="&#x2028;" string="&#x2028;"/>
</xsl:character-map>

(that is, when I use the map to construct the result document).
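
For readers unfamiliar with character maps: in XSLT 2.0 a map only takes effect when the serializer is told to use it, normally through use-character-maps on xsl:output (or on xsl:result-document). The post does not show that part, so the following is only a sketch of the likely wiring, reusing the map name "bugfix" from above; the method and encoding values are assumptions.

```xml
<!-- Top-level declarations inside the xsl:stylesheet element. -->

<!-- Assumed wiring: ask the serializer to apply the "bugfix" map. -->
<xsl:output method="xml" encoding="UTF-8" use-character-maps="bugfix"/>

<!-- The map from the post: U+2028 is replaced by itself. Replacement
     strings from a character map are written without further escaping,
     so the literal character appears in the output instead of a
     character reference. -->
<xsl:character-map name="bugfix">
  <xsl:output-character character="&#x2028;" string="&#x2028;"/>
</xsl:character-map>
```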

RE: Apparent Bug in Saxon 9.0.0.4 Handing \u2 - Added by Anonymous over 16 years ago

Legacy ID: #4904956 Legacy Poster: Michael Kay (mhkay)

Decimal 8232 = hex 2028. The character references &#8232; and &#x2028; are completely equivalent. It's not possible in standard XSLT to control whether hex or decimal character references are used by the serializer, but in Saxon you can control it using the saxon:character-representation attribute on xsl:output. Michael Kay
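
The equivalence can be checked from within XSLT 2.0 itself. The stylesheet below is a minimal sketch, not part of the original thread: the element names check, same-character and codepoint are made up for illustration. The value "hex" for saxon:character-representation and the http://saxon.sf.net/ namespace are as best recalled from the Saxon documentation and should be verified against the release in use; whether the attribute changes the output for U+2028 specifically rests on the reply above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:saxon="http://saxon.sf.net/"
                exclude-result-prefixes="saxon"
                version="2.0">

  <!-- Saxon-specific extension attribute (assumed value "hex");
       other XSLT processors are expected to ignore it. -->
  <xsl:output method="xml" encoding="UTF-8"
              saxon:character-representation="hex"/>

  <xsl:template match="/">
    <check>
      <!-- Both references are resolved by the XML parser to the same
           code point, so the comparison yields true. -->
      <same-character>
        <xsl:value-of select="'&#8232;' eq '&#x2028;'"/>
      </same-character>
      <!-- string-to-codepoints() reports code points as decimal
           integers, here 8232. -->
      <codepoint>
        <xsl:value-of select="string-to-codepoints('&#x2028;')"/>
      </codepoint>
    </check>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, same-character should contain true and codepoint should contain 8232, which is hex 2028 written in decimal.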

RE: Apparent Bug in Saxon 9.0.0.4 Handing \u2 - Added by Anonymous over 16 years ago

Legacy ID: #4905479 Legacy Poster: W. Eliot Kimber (drmacro)

Doh! I'm clearly totally brainwashed to see a 4-digit value as a hex number. There are days I forget decimal values are even an option. Sorry for the false alarm. Cheers, Eliot
