Support #6384
closed
Inconsistent results for substring-after() in XQuery
Description
The result of the expression "substring-after($who, '#')" differs between Saxon-HE and Saxon-EE when used in a transformation. For the input string "#GA", Saxon-HE returns "GA" while Saxon-EE leaves the input string unaltered. Files for reproducing the behavior are attached. Please note that this is not reproducible in XSLT, where both versions of Saxon succeed.
Files
Updated by Michael Kay 8 months ago
The XQuery code declares a default collation, which the XSLT code doesn't. It's almost certainly the collation that accounts for the EE/HE difference, because Saxon-EE uses ICU for collation support whereas Saxon-HE uses the native JDK libraries.
The semantics of substring functions in the presence of a default collation are fairly peculiar because they cause certain characters to be treated as ignorable. Unless you really want a collation-sensitive substring, I would avoid this area.
Updated by Michael Kay 8 months ago
It's worth reading https://www.w3.org/TR/xpath-functions-31/#substring.functions for an explanation of what is happening here.
The specification of the substring-after function says:
The function returns the substring of the value of $arg1 that follows in the value of $arg1 the first occurrence of a sequence of collation units that provides a minimal match to the collation units of $arg2 according to the collation that is used.
If the second argument of substring-after() is a character (such as '#') that is ignored for collation purposes, then the sequence of collation units for the second argument is empty. An empty sequence matches at the beginning of the string, so the entire string is returned.
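To make that reasoning concrete, here is a minimal Java sketch of a toy model. It is not Saxon's, ICU's, or the JDK's implementation. It assumes, purely for illustration, that each non-ignorable character forms exactly one collation unit and that the set of ignorable characters is passed in explicitly. The class name CollationSketch and the methods units and substringAfter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class CollationSketch {

    // Toy model (an assumption): every non-ignorable char is one collation unit.
    // Real collations are far more elaborate.
    static List<Character> units(String s, Set<Character> ignorable) {
        List<Character> out = new ArrayList<>();
        for (char c : s.toCharArray()) {
            if (!ignorable.contains(c)) {
                out.add(c);
            }
        }
        return out;
    }

    // Returns the part of arg1 that follows the first minimal match of arg2's units.
    static String substringAfter(String arg1, String arg2, Set<Character> ignorable) {
        List<Character> pattern = units(arg2, ignorable);
        if (pattern.isEmpty()) {
            // No collation units to find: the (zero-length) match is at the
            // start of arg1, so everything after it is the whole string.
            return arg1;
        }
        for (int start = 0; start < arg1.length(); start++) {
            if (ignorable.contains(arg1.charAt(start))) {
                continue; // a minimal match does not begin with an ignorable char
            }
            int p = 0;     // index into the pattern's units
            int i = start; // index into arg1
            while (i < arg1.length() && p < pattern.size()) {
                char c = arg1.charAt(i);
                if (ignorable.contains(c)) {
                    i++;   // ignorable chars inside a match contribute no unit
                    continue;
                }
                if (c != pattern.get(p)) {
                    break; // mismatch at this start position
                }
                p++;
                i++;
            }
            if (p == pattern.size()) {
                return arg1.substring(i); // i is just past the last matched char
            }
        }
        return ""; // no match
    }

    public static void main(String[] args) {
        // '#' is significant: behaves like the Saxon-HE result in this issue.
        System.out.println(substringAfter("#GA", "#", Set.of()));    // prints GA
        // '#' is ignorable: behaves like the Saxon-EE result in this issue.
        System.out.println(substringAfter("#GA", "#", Set.of('#'))); // prints #GA
    }
}
```

The two calls in main differ only in whether '#' is treated as ignorable, which mirrors the HE/EE difference reported above. Which characters a real collation ignores depends on the collation and its parameters.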
Updated by Michael Kay 8 months ago
- Tracker changed from Bug to Support
- Status changed from New to Closed