collation parameter alphanumeric=yes

Added by Martin Honnen over 1 year ago

There seems to be some overlap between the W3C UCA collation parameter numeric=yes and the Saxon extension parameter alphanumeric=yes. Perhaps I am just misreading the documentation, but why does Saxon HE 10.8 .NET, for

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method 'text';
declare option output:item-separator ' ';

'B5 C10 C8 D11 D13 D3 D7 D9 E12 E8'  
=> tokenize('\s+') 
=> sort('http://www.w3.org/2013/collation/UCA?alphanumeric=yes')

give me B5 C8 C10 D3 D7 D9 D11 D13 E8 E12, while neither SaxonCS Query nor the various Java editions seem to take alphanumeric=yes into account? For example, with SaxonCS I get B5 C10 C8 D11 D13 D3 D7 D9 E12 E8, and I get the same B5 C10 C8 D11 D13 D3 D7 D9 E12 E8 with Saxon HE 11.4 and Saxon EE 11.4.

In the end I found that using

'B5 C10 C8 D11 D13 D3 D7 D9 E12 E8'  
=> tokenize('\s+') 
=> sort('http://www.w3.org/2013/collation/UCA?numeric=yes;fallback=yes')

seems to give me the same results on .NET and Java (and even SaxonJS), but I am still confused about why the Java and SaxonCS versions seem to ignore the alphanumeric=yes parameter.
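
For comparison, the same ordering can probably be obtained without relying on any collation parameter, by passing a sort key to the three-argument form of fn:sort. This is only a minimal sketch: it assumes every token is a non-digit prefix followed by a run of digits (as in the sample data), and it has not been checked against each of the Saxon editions mentioned above.

```xquery
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method 'text';
declare option output:item-separator ' ';

'B5 C10 C8 D11 D13 D3 D7 D9 E12 E8'
=> tokenize('\s+')
=> sort(
     (),  (: empty collation argument: fall back to the default collation :)
     function($t) {
       (: assumption: each token looks like letters followed by digits :)
       replace($t, '\d+$', ''),              (: textual prefix, e.g. 'C' :)
       xs:integer(replace($t, '^\D+', ''))   (: numeric suffix, e.g. 10  :)
     }
   )
```

The key function returns a two-item sequence, so tokens are ordered first by prefix and then by the integer value of the suffix. A token without a trailing number would make the xs:integer cast fail, so real data may need a guard.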

