Indent adds significant whitespace

Added by Anonymous over 17 years ago

Legacy ID: #4059630 Legacy Poster: jbreure (jbreure)

Hi,

I have found what looks to me like a bug in the indentation produced by Saxon. The serialization specification says the following:

Whitespace characters SHOULD NOT be added in places where the characters would constitute significant whitespace, for example, in the content of an element whose content model is known to be mixed.

Link: http://www.w3.org/TR/xslt-xquery-serialization/#xml-indent

But when using the attribute indent="yes" with method="xml" on the output element, Saxon actually adds indentation whitespace inside an element with mixed content if its first or last child node is an element. As an example, see the 3rd and 4th "b" elements in the following case (I hope the whitespace will display correctly):

-----------------------------------------------
<?xml version="1.0"?>
<a>
<b><c>test</c><d attr="value"/></b>
<b>this is <c>mixed</c> content</b>
<b><c/>this is also <c>mixed</c> content</b>
<b>and this is also <c>mixed</c> content<c/></b>
</a>
-----------------------------------------------

was turned into:

-----------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<a>
<b>
<c>test</c>
<d attr="value"/>
</b>
<b>this is <c>mixed</c> content</b>
<b>
<c/>this is also <c>mixed</c> content</b>
<b>and this is also <c>mixed</c> content<c/>
</b>
</a>
-----------------------------------------------

Please let me know if I'm wrong or if this should be raised as a bug.

Thanks,
JB.


Replies (3)

RE: Indent adds significant whitespace - Added by Anonymous over 17 years ago

Legacy ID: #4059631 Legacy Poster: jbreure (jbreure)

The whitespace did not really show up, so here is another version of the example:

-----------------------------------------------
<?xml version="1.0"?>
<a>
<b><c>test</c><d attr="value"/></b>
<b>this is <c>mixed</c> content</b>
<b><c/>this is also <c>mixed</c> content</b>
<b>and this is also <c>mixed</c> content<c/></b>
</a>
-----------------------------------------------

was turned into:

-----------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<a>
[space]<b>
[space][space]<c>test</c>
[space][space]<d attr="value"/>
[space]</b>
[space]<b>this is <c>mixed</c> content</b>
[space]<b>
[space][space]<c/>this is also <c>mixed</c> content</b>
[space]<b>and this is also <c>mixed</c> content<c/>
[space]</b>
</a>
-----------------------------------------------

RE: Indent adds significant whitespace - Added by Anonymous over 17 years ago

Legacy ID: #4059941 Legacy Poster: Michael Kay (mhkay)

Are you (a) using Saxon-SA, and (b) validating the output against a schema? If not, the content model of the output is unknown, and is therefore not "known to be mixed". Clearly, if adding whitespace were banned whenever the content model is unknown, no indentation could take place at all.

RE: Indent adds significant whitespace - Added by Anonymous over 17 years ago

Legacy ID: #4061109 Legacy Poster: jbreure (jbreure)

OK, I am not using schema-aware Saxon, so I guess that's the reason. I would have thought that Saxon could work out the content model of an element by looking at its children, but I guess this information is not available when serializing. In our case, because the added whitespace caused a problem, we just turned indenting off. Thanks, JB.
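For anyone landing here with the same problem, the workaround above (turning indenting off) can be sketched through the standard JAXP API. This is only a minimal illustration, not Saxon-specific code: the class name IndentOff and the helper serialize are made up for this example, and it runs on whatever TransformerFactory is on the classpath (the JDK's built-in one by default), so the indented output of other processors may differ from what Saxon produced. In a stylesheet, the equivalent is indent="no" on the output element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class IndentOff {
    // Hypothetical helper: copies the input through an identity transform
    // and serializes it with the "indent" property set explicitly.
    public static String serialize(String xml, boolean indent) throws TransformerException {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");
        // Leave out the XML declaration so only the element content is printed.
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // The mixed-content cases from the example in this thread, on one line.
        String xml = "<a><b>this is <c>mixed</c> content</b>"
                   + "<b><c/>this is also <c>mixed</c> content</b>"
                   + "<b>and this is also <c>mixed</c> content<c/></b></a>";
        // With indent off, the serializer adds no whitespace of its own.
        System.out.println(serialize(xml, false));
    }
}
```

With indenting off, the serializer has no reason to insert whitespace anywhere, so mixed content comes out as it went in. The cost is that element-only content is no longer pretty-printed either.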
