Support #6501
closed
Unexpected space at the beginning of each line (except the first) when exporting as csv
Description
When we export data as CSV and declare the return type of a template or function, an extra space appears to be added by mistake at the beginning of each line (except the first). The template without a return type is the following:
<xsl:template mode="export-ok" match="*">
<xsl:value-of select="@*" separator=";"/>
<xsl:value-of select="$EOL"/>
</xsl:template>
and generates the correct output:
10;2024-08-09;season
11;2024-08-09;daily
12;2024-08-09;season
The template with a return type is the following:
<xsl:template mode="export-nok" match="*" as="xs:string*">
<xsl:value-of select="@*" separator=";"/>
<xsl:value-of select="$EOL"/>
</xsl:template>
and generates the wrong output:
10;2024-08-09;season
 11;2024-08-09;daily
 12;2024-08-09;season
Attached you will find this simple test scenario.
Updated by Michael Kay 5 months ago
There's a possible complication here that the template is called from within xsl:message, and the specification gives the processor complete discretion over the output of xsl:message. However, I think that's probably not relevant, in that you would get the same effect if you used xsl:document.
The spec for xsl:message says:
If the xsl:message instruction contains a sequence constructor, then the sequence obtained by evaluating this sequence constructor is used to construct the content of the new document node, as described in 5.8.1 Constructing Complex Content.
The rules for "constructing complex content" say that:
Any consecutive sequence of strings in the sequence is converted to a single text node, whose string value contains the content of each of the strings in turn, with U+0020 (SPACE) used as a separator between successive strings.
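The difference between strings and text nodes under this rule can be seen in isolation. The following is a minimal sketch, not taken from the attached test case: the element names `result`, `strings` and `text-nodes` are made up for illustration, and the stylesheet is assumed to be run from the named entry point `xsl:initial-template` (for example with Saxon's `-it` option).

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Entry point: no source document is needed -->
  <xsl:template name="xsl:initial-template">
    <result>
      <!-- Three adjacent atomic strings: joined with a single space,
           so the content becomes "a b c" -->
      <strings>
        <xsl:sequence select="'a', 'b', 'c'"/>
      </strings>
      <!-- Three adjacent text nodes: merged without any separator,
           so the content becomes "abc" -->
      <text-nodes>
        <xsl:value-of select="'a'"/>
        <xsl:value-of select="'b'"/>
        <xsl:value-of select="'c'"/>
      </text-nodes>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

The space separator applies only to atomic values; text nodes that end up next to each other are simply concatenated.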
The rules for an xsl:template with as="xs:string*" say:
The result of evaluating the sequence constructor is then converted to the required type using the function conversion rules.
The result of evaluating the sequence constructor in your case is a sequence of two text nodes.
The function conversion rules say:
If the expected type is a sequence of a generalized atomic type (possibly with an occurrence indicator *, +, or ?), the following conversions are applied: Atomization is applied to the given value, resulting in a sequence of atomic values....
So after conversion the result of the template is a sequence of two strings. You're applying the template to a sequence of three <train> elements, so the result should be a sequence of six strings, and the rules say that a space will be inserted between each of these six strings.
If you want to avoid this, then don't convert the text nodes to strings - specify the return type as text()*.
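A sketch of that suggestion as a complete stylesheet might look like the following. Several details are assumptions rather than part of the original test case: the mode name `export-text` is hypothetical, `$EOL` is assumed to be a line feed (the original stylesheet defines it elsewhere), and the input is assumed to contain `train` elements somewhere in the document.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumed line terminator; the original value of $EOL is not shown -->
  <xsl:variable name="EOL" as="xs:string" select="'&#10;'"/>

  <!-- Assumed input shape: <train> elements carrying the data as attributes -->
  <xsl:template match="/">
    <xsl:apply-templates select="//train" mode="export-text"/>
  </xsl:template>

  <!-- Declaring text()* keeps the results as text nodes, so they are not
       atomized to strings and no space separator is inserted between them -->
  <xsl:template mode="export-text" match="*" as="text()*">
    <xsl:value-of select="@*" separator=";"/>
    <xsl:value-of select="$EOL"/>
  </xsl:template>
</xsl:stylesheet>
```

With `as="text()*"` each `xsl:value-of` result stays a text node, and adjacent text nodes are merged without a separator when the result is assembled.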
Updated by Michael Kay 5 months ago
Incidentally, using <xsl:value-of select="@*" separator=";"/> to generate CSV output is probably ill-advised as the order of attributes is unpredictable.
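One way to make the column order explicit is to name the attributes in the desired order. This is only a sketch: the attribute names `id`, `date` and `type`, the mode name `export-ordered`, the `train` element selection and the value of `$EOL` are all assumptions, since the original input document is not shown.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumed line terminator -->
  <xsl:variable name="EOL" as="xs:string" select="'&#10;'"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//train" mode="export-ordered"/>
  </xsl:template>

  <xsl:template mode="export-ordered" match="*" as="text()*">
    <!-- Hypothetical attribute names, listed in the intended column order.
         string() yields an empty string for a missing attribute, so every
         row keeps the same number of fields -->
    <xsl:value-of select="string(@id), string(@date), string(@type)" separator=";"/>
    <xsl:value-of select="$EOL"/>
  </xsl:template>
</xsl:stylesheet>
```

Writing `@id, @date, @type` without `string()` would also fix the order, but a missing attribute would then drop out of the sequence and shift the remaining columns.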
Updated by Johan Gheys 5 months ago
Thanks Michael for the clarifications and advice, which are always appreciated. Indeed, the "problem" also occurs with an xsl:document, but for simplicity I used an xsl:message here. As you said, the "problem" is solved when the return type is defined as text()* instead of xs:string*. I was a bit confused by the documentation for xsl:value-of ("Evaluates an expression as a string, and outputs its value to the current result tree"), and mistakenly assumed that declaring xs:string* as the return type would simply be clearer, without causing any additional conversion.
Updated by Michael Kay 5 months ago
- Tracker changed from Bug to Support
- Status changed from New to Closed