xsl:sort using @lang locales
I have a question regarding sort support in Saxon 9.9 and Saxon 10. One of our Oxygen XML Editor users is trying to sort German and French terms according to locale-specific sort rules (de-DE, de-AT, fr-FR, fr-CA), but it seems that Saxon only responds to the base language rules (de-AT = de = de-DE and fr-CA = fr = fr-FR). In the attached input file you can see the desired result for de-AT and fr-CA in the comments.
#1 Updated by Michael Kay 8 months ago
If the lang attribute is present on xsl:sort, and if no other relevant attributes (such as collation) are present, then Saxon will obtain a Java Locale using the logic in JavaCollationFactory.getLocale() (which splits the supplied lang value into language and country parts), and then gets a Java Collator for that Locale. The set of locales available depends on the JVM installation.
If you're concerned about accurate collation, then I'd advise setting the collation attribute to a UCA collation URI, and making sure the ICU implementation is used (which means you need Saxon-PE or higher, and ICU must be on the classpath).
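As an illustration of that advice, here is a minimal sketch of an xsl:sort that names a UCA collation URI explicitly. The URI scheme (http://www.w3.org/2013/collation/UCA with a lang parameter) comes from the XPath Functions and Operators 3.1 specification; the input element names terms and term are assumptions made for the example, not taken from the attached file.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input shape: <terms><term>...</term>...</terms> -->
  <xsl:template match="terms">
    <terms>
      <xsl:for-each select="term">
        <!-- The collation attribute is used instead of lang;
             the UCA URI requests Canadian French ordering -->
        <xsl:sort select="."
                  collation="http://www.w3.org/2013/collation/UCA?lang=fr-CA"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </terms>
  </xsl:template>

</xsl:stylesheet>
```

The same specification defines a fallback parameter (appended as ;fallback=no) that asks the processor to raise an error rather than silently substitute a different collation, which may help to detect a configuration where the requested collation is not actually available.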
#2 Updated by Michael Kay 8 months ago
I've confirmed that if you use a collation attribute rather than a lang attribute, then it now uses ICU collations rather than Java collations. This leads to a difference between fr-FR and fr-CA, though the results for Germany and Austria are still the same.
(I've always doubted whether the traditional differences noted by collation experts still exist in the 21st century - I think collation standards nowadays are much more likely to vary from one publisher to another, rather than from one country to another. Austrians surely read the same books that Germans do, and the indexes at the back of the book aren't going to be re-sorted for the Austrian market. But I'm happy to leave that question to the ICU experts).
Perhaps in the case where xsl:sort is used with a lang attribute and no collation attribute, Saxon-PE and -EE should now be using the ICU/UCA collation rather than the Java collation.
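In the meantime, a stylesheet can approximate that behaviour itself. The sketch below builds the UCA collation URI from a language code, relying on the fact that the collation attribute of xsl:sort is an attribute value template. The parameter name sort-lang and the terms/term element names are hypothetical, chosen only for this example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter holding the desired locale, e.g. de-AT or fr-CA -->
  <xsl:param name="sort-lang" select="'de-AT'"/>

  <!-- Hypothetical input shape: <terms><term>...</term>...</terms> -->
  <xsl:template match="terms">
    <terms>
      <xsl:for-each select="term">
        <!-- The {$sort-lang} expression is evaluated at run time,
             so one stylesheet can serve several locales -->
        <xsl:sort select="."
                  collation="http://www.w3.org/2013/collation/UCA?lang={$sort-lang}"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </terms>
  </xsl:template>

</xsl:stylesheet>
```

Whether two locales such as de-DE and de-AT then produce different orderings still depends on the collation data behind the implementation, as noted above.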