Order By German vowels and special characters
Added by Anonymous over 14 years ago
Legacy ID: #8444894 Legacy Poster: ailli (ailli)
Today I found out that Saxon seems to treat special characters such as the German umlauts (ä, ö, ü) or French characters such as é in a rather unfortunate way when it comes to ordering. Consider the following statement:

let $d := MilchÖlWasser for $i in $d//liquid order by $i ascending return $i

This returns the following result: MilchWasserÖl (By the way, those are the German words for milk, water and oil.) One (at least a German-speaking person) would expect the following result: MilchÖlWasser. The reason is that Ö may be replaced by Oe in this case (ä=ae, ö=oe, ü=ue). As far as I'm concerned, the French would also expect to find é somewhere around e in an ordered set.

My question now is whether Saxon strictly sticks to some standard here, i.e. ordering by UTF-8 hex values or something like that. Is there a way to achieve language-based ordering by setting a parameter or something? Writing a filter would be rather inefficient, I guess.

If there is nothing Saxon can do about this, it might be a useful extension for upcoming releases. For example, one could configure Saxon's sorting algorithm to stick to some language rules. Best Regards!
Replies (4)
RE: Order By German vowels and special characters - Added by Anonymous over 14 years ago
Legacy ID: #8444922 Legacy Poster: Michael Kay (mhkay)
To get a language-sensitive collation for German in XSLT, use <xsl:sort lang="de"/>. To get a language-sensitive collation for German in XQuery, use order by $i collation "http://saxon.sf.net/collation?lang=de". Unfortunately, the XQuery version is not portable across XQuery processors. There are other parameters you can set to control sorting with more precision, for example whether case is significant. See http://www.saxonica.com/documentation/extensibility/collation.html for details.
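To illustrate the difference between plain code-point ordering and a language-sensitive collation, here is a minimal Java sketch that runs outside Saxon. It uses the JDK's java.text.Collator for the German locale; that Saxon's lang=de collation behaves the same way is an assumption, not something confirmed in this thread. The class and method names are hypothetical.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class GermanSortDemo {

    /** Sorts by plain String comparison, i.e. by UTF-16 code unit values. */
    static List<String> sortByCodePoint(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    /** Sorts with the JDK's collator for the German locale. */
    static List<String> sortGerman(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collator german = Collator.getInstance(Locale.GERMAN);
        copy.sort(german); // Collator implements Comparator<Object>
        return copy;
    }

    public static void main(String[] args) {
        // "\u00D6l" is "Öl", escaped so the source file encoding does not matter.
        List<String> liquids = List.of("Milch", "\u00D6l", "Wasser");
        System.out.println(sortByCodePoint(liquids)); // [Milch, Wasser, Öl]
        System.out.println(sortGerman(liquids));      // [Milch, Öl, Wasser]
    }
}
```

The first result matches the ordering reported in the original question (Ö, U+00D6, sorts after every unaccented ASCII letter); the second is the ordering a German speaker would expect.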
RE: Order By German vowels and special characters - Added by Anonymous over 14 years ago
Legacy ID: #8452382 Legacy Poster: ailli (ailli)
I read your post and followed the links. Unfortunately, I can't get my collation set as described. What I do is the following: I have a web application built on an MVC architecture. Based on the parameters provided, the controller fetches the data with XQuery, ordered as requested. The result is then pushed into the view, where another XQuery statement renders the data as HTML. The only explicit ordering happens at the controller level. I plugged in your statement as described, but it had no effect. Then I tried implementing my own collation in Java, but still no impact. I'm a bit stuck here. Is it possible that the XQuery statements for the view implicitly override the collation again? Is there a way to force Saxon to use a specific collation for all statements? Best Regards!
RE: Order By German vowels and special characters - Added by Anonymous over 14 years ago
Legacy ID: #8453774 Legacy Poster: Michael Kay (mhkay)
I'm sorry, but I can't tell why your code isn't working without seeing your code. General information about the design of your code isn't enough. It could be a very simple error like misspelling the collation name. Please try to drill down until you can find a free-standing piece of code that produces different results from those you expect it to produce, preferably in an environment where anyone can replicate the problem.
RE: Order By German vowels and special characters - Added by Anonymous over 14 years ago
Legacy ID: #8454036 Legacy Poster: ailli (ailli)
Hi, I found the mistake today. What I do in Java is parse the GET parameters and build order by constraints from what I receive. As the last element I always append an empty sequence because of the final comma. I added collation 'http://saxon.sf.net/collation?lang=de;' once at the end, so it was applied to () only and thus had no impact on the result. So, as advice to anyone who ever has the same problem: make sure you add the collation to EVERY field that has to consider it, i.e.: order by $item/name ascending collation '...', $item/age descending collation '...' Setting it only at the end of the order by does NOT imply that it is set globally, as I assumed. Thanks and Regards!
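For anyone generating the order by clause from Java in a similar way, a minimal sketch of the fix might look like the following. The class OrderByBuilder and the type SortKey are hypothetical names, not the poster's actual code. The sketch attaches the collation to every sort key and joins the keys with commas, so no trailing empty sequence is needed. It assumes the sort expressions have already been validated and are not raw user input.

```java
import java.util.ArrayList;
import java.util.List;

final class OrderByBuilder {

    // Collation URI quoted from the reply earlier in this thread.
    static final String GERMAN = "http://saxon.sf.net/collation?lang=de";

    /** One sort key: an XQuery expression plus a direction (hypothetical helper type). */
    static final class SortKey {
        final String expr;
        final boolean ascending;

        SortKey(String expr, boolean ascending) {
            this.expr = expr;
            this.ascending = ascending;
        }
    }

    /** Builds an order by clause in which EVERY key carries the collation. */
    static String build(List<SortKey> keys, String collationUri) {
        if (keys.isEmpty()) {
            return ""; // nothing to sort by, so emit no clause at all
        }
        List<String> specs = new ArrayList<>();
        for (SortKey k : keys) {
            specs.add(k.expr
                    + (k.ascending ? " ascending" : " descending")
                    + " collation \"" + collationUri + "\"");
        }
        // Joining avoids the trailing comma and the dummy () key.
        return "order by " + String.join(", ", specs);
    }

    public static void main(String[] args) {
        List<SortKey> keys = List.of(
                new SortKey("$item/name", true),
                new SortKey("$item/age", false));
        System.out.println(build(keys, GERMAN));
    }
}
```

Because each key gets its own collation, the position of a key in the list no longer affects whether it is sorted with German rules.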