Bug #5325

closed

format-integer: German representation

Added by Christian Grün about 2 years ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: Localization
Sprint/Milestone: -
Start date: 2022-02-15
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 10, 11, trunk
Fix Committed on Branch: 10, 11, trunk
Fixed in Maintenance Release:
Platforms: Java

Description

This query …

for $i in (0, 16, 21, 111, 1111, 10000, 1000000000000, 1000000000000000000)
return format-integer($i, 'w', 'de'), '',
for $i in (0, 10000000, 1000000000, 10000000000)
return format-integer($i, 'w;o', 'de')

… returns …

Zero
Sechszehn
EinundZwanzig
Einhundertundelf
Eintausend Einhundertundelf
Zehntausend
Eintausend Milliarde
Eine Milliarde Milliarde

e
Zehn Millionenste
Eine Milliardeste
Zehn Milliardeste

… and this would be correct:

Null
Sechzehn
Einundzwanzig
Einhundertelf
Eintausendeinhundertelf
Zehntausend
Eine Billion
Eine Trillion

Nullte
Zehnmillionste
Milliardste
Zehnmilliardste

I’ll be glad to help if you think I can.
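The expected values "Eine Billion" for 10^12 and "Eine Trillion" for 10^18 follow the long scale used in German, where each "-llion" name covers a step of 10^6 and the matching "-lliarde" name sits 10^3 above it. A minimal Java sketch of that naming rule follows; the class and method names are hypothetical and are not taken from Saxon or ICU.

```java
// Illustrative sketch only: GermanLongScale is a hypothetical helper,
// not part of Saxon or ICU.
final class GermanLongScale {
    // Prefixes for 10^6, 10^12, 10^18, 10^24, 10^30 (and the "-lliarde" forms 10^3 above each).
    private static final String[] PREFIXES = {"Mi", "Bi", "Tri", "Quadri", "Quinti"};

    private GermanLongScale() {}

    /** Returns the German long-scale name for 10^exponent, for exponents 6, 9, 12, ... 33. */
    static String nameForPowerOfTen(int exponent) {
        if (exponent < 6 || exponent % 3 != 0) {
            throw new IllegalArgumentException("exponent must be a multiple of 3 and at least 6");
        }
        int index = exponent / 6; // 1 = Mi-, 2 = Bi-, 3 = Tri-, ...
        if (index > PREFIXES.length) {
            throw new IllegalArgumentException("exponent too large for this sketch");
        }
        // Exponents divisible by 6 take "-llion"; the ones in between take "-lliarde".
        String suffix = (exponent % 6 == 0) ? "llion" : "lliarde";
        return PREFIXES[index - 1] + suffix;
    }
}
```

With this rule, `nameForPowerOfTen(9)` gives "Milliarde", `nameForPowerOfTen(12)` gives "Billion", and `nameForPowerOfTen(18)` gives "Trillion", which is why "Eintausend Milliarde" and "Eine Milliarde Milliarde" are not the expected spellings.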

Actions #1

Updated by Michael Kay about 2 years ago

  • Subject changed from format-date: German representation to format-integer: German representation

Actions #2

Updated by Michael Kay about 2 years ago

So what's the current usage of Billion vs Milliarde in German? (I never trust official data on this kind of thing. English dictionaries still list the usage of billion = 10^12, which I think fell out of use about 75 years ago in favour of 10^9. The same goes for collations: officially Austrian and German are supposed to collate differently, but I'm sure book publishers don't produce two versions of a book with different indexes.)

Rhetorical question really. I've no intention of setting up as an authority on this. See response to bug #5324 - we need to drop the user-contributed localisers that are still used in Saxon-HE.

(I tried to spell localiser with a "z" above but the software corrected it. Again, the spell-checkers have got it wrong - "-ize" and "-ise" endings were equally acceptable in British English until American software companies decided otherwise.)

Actions #3

Updated by Christian Grün about 2 years ago

Interesting to hear about the usage of billion; Germans are still taught that British people use it exclusively for 10^12; presumably, as it reflects the German way of counting.

German collation may depend on the context: different rules apply for lists of places vs. names (DIN 5007, variant 1 vs. variant 2). In practice, the first variant is clearly predominant and often used for names as well. As far as I know, the Austrian collation order (which places ä after az, and similar) has only been relevant for name listings in some particular phone books. I would be pretty surprised if an Austrian user or customer asked for it.

I used Saxon-EE 10 for testing, but without ICU in the classpath. If you find any reason for fixing existing code, the first three cases (Null, Sechzehn, Einundzwanzig) may be the most relevant ones.

Actions #4

Updated by Michael Kay about 2 years ago

It takes teachers (and localization experts) a while to catch up with language changes. I remember my German grandfather once met the German teacher at my school and commented afterwards that it was 50 years since he had heard German spoken so "correctly"...

Actions #5

Updated by Christian Grün about 2 years ago

;·)

One last note I forgot: The most compelling reason to use localization in Austria is to have Januar replaced by Jänner. I just found ICU is smart enough to handle that.

Actions #6

Updated by Christian Grün about 2 years ago

And it’s never good to say “last”:

ICU seems to add soft hyphens for (at least some) spelled out numbers in German. If it’s Saxon that takes care of capitalizing the result, the soft hyphens should probably be ignored:

Query: format-integer(21, 'Ww', 'de')

Returned: Ein­Und­Zwanzig

Expected: Ein­und­zwanzig

Actions #7

Updated by Michael Kay about 2 years ago

For the query

for $i in (0, 16, 21, 111, 1111, 10000, 1000000000000, 1000000000000000000) return format-integer($i, 'w', 'de')

Saxon-EE (based on ICU-J) returns

null sechzehn ein­und­zwanzig ein­hundert­elf ein­tausend­ein­hundert­elf zehn­tausend eine billion 1.000.000.000.000.000.000

and I propose to align the Saxon-HE output with this.

Actions #8

Updated by Michael Kay about 2 years ago

  • Category set to Localization
  • Status changed from New to In Progress
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal
  • Applies to branch 10, 11, trunk added
  • Fix Committed on Branch 10, 11, trunk added
  • Platforms Java added

Partly fixed: the cardinal numbers in the example now produce the same output in Saxon-HE as in Saxon-EE (ICU-J).

Actions #9

Updated by Michael Kay over 1 year ago

I have changed the code for converting ICU output to title case so it doesn't capitalize the letter after a soft hyphen.

This change is on the 11 and 12 branches only.
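The idea behind this change can be sketched in Java roughly as follows. This is not the actual Saxon code: the class and method names are hypothetical, and the sketch assumes the input is already lower case (as the ICU output shown earlier is) and that any non-letter other than the soft hyphen (U+00AD) starts a new word.

```java
// Illustrative sketch only: SoftHyphenTitleCase is a hypothetical helper,
// not the code committed to Saxon.
final class SoftHyphenTitleCase {
    private static final int SOFT_HYPHEN = 0x00AD;

    private SoftHyphenTitleCase() {}

    /**
     * Title-cases the first letter of each word. A soft hyphen is copied
     * through unchanged and does not count as a word boundary, so the
     * letter after it is left as it is.
     */
    static String toTitleCase(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean atWordStart = true;
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == SOFT_HYPHEN) {
                out.appendCodePoint(cp);          // keep it, stay inside the current word
            } else if (Character.isLetter(cp)) {
                out.appendCodePoint(atWordStart ? Character.toTitleCase(cp) : cp);
                atWordStart = false;
            } else {
                out.appendCodePoint(cp);          // space, punctuation, digit: next letter starts a word
                atWordStart = true;
            }
        }
        return out.toString();
    }
}
```

Applied to the ICU output for 21 (with soft hyphens between "ein", "und" and "zwanzig"), only the initial "e" is capitalized, while "eine billion" becomes "Eine Billion".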

Actions #10

Updated by Michael Kay over 1 year ago

  • Status changed from In Progress to Resolved

Actions #11

Updated by O'Neil Delpratt about 1 year ago

  • Fixed in Maintenance Release 12.0 added

Bug fix applied in the Saxon 12.0 major release. (Issue remains open awaiting Saxon 11 maintenance release.)

Actions #12

Updated by O'Neil Delpratt about 1 year ago

  • Fixed in Maintenance Release 11.5 added

Bug fix applied in the Saxon 11.5 maintenance release.

Actions #13

Updated by O'Neil Delpratt about 1 year ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 10.9 added

Bug fix applied in the Saxon 10.9 maintenance release.
