Bug #6183

open

"fallback=no" on UCA collation not working in EE edition

Added by Trevor Lawrence 8 months ago. Updated 6 months ago.

Status: In Progress
Priority: Normal
Category: -
Sprint/Milestone: -
Start date: 2023-08-23
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms: Java

Description

We are using a licensed (as of 05/04/23) edition of SaxonJ-EE 11.5.

We're using the <xsl:sort> instruction within an <xsl:for-each-group> with a custom-defined UCA collation URI. While debugging an issue with it we decided to add the fallback=no parameter to see if we were misusing one of the options, and instead got a message saying it's not supported in Saxon-HE.

Attached to this issue is a minimal stylesheet that reproduces the issue. Invoking it with this command:

java -cp <path_to_xml_resolver_4.6.4>:<path_to_saxon_EE_11.5> \
         net.sf.saxon.Transform \
         -xsl:transform-with-EE-feature.xsl \
         -o:file.xml \
         -it \
         -config:<path_to_our_config_and_license>

gives:

Error at xsl:sort on line 9 column 141 of transform-with-EE-feature.xsl:
  XTDE1035  Failed to load collation
  http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no: Error in UCA Collation URI http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no: fallback=no is not supported in Saxon-HE
Errors were reported during stylesheet compilation

We currently pull down Saxon-EE from your new Maven repo and XML Resolver from Maven central.


Files

transform-with-EE-feature.xsl (550 Bytes) Trevor Lawrence, 2023-08-23 03:22
Actions #1

Updated by Michael Kay 8 months ago

The fact that you're getting this message means that you're running internally with a JavaPlatform rather than JavaPlatformPE, which suggests strongly that you've picked up the Saxon-HE software rather than Saxon-PE or Saxon-EE.

Two immediate suggestions: (a) use "-t" on the command line to see what version Saxon thinks it's running; (b) use the alternate entry point com.saxonica.Transform (in place of net.sf.saxon.Transform) to force loading of Saxon-EE.

Actions #2

Updated by Trevor Lawrence 8 months ago

Michael Kay wrote in #note-1:

The fact that you're getting this message means that you're running internally with a JavaPlatform rather than JavaPlatformPE, which suggests strongly that you've picked up the Saxon-HE software rather than Saxon-PE or Saxon-EE.

Two immediate suggestions: (a) use "-t" on the command line to see what version Saxon thinks it's running; (b) use the alternate entry point com.saxonica.Transform (in place of net.sf.saxon.Transform) to force loading of Saxon-EE.

Adding -t gives this output:

SaxonJ-EE 11.5 from Saxonica
Java version 11.0.15
Using license serial number A011351
Error at xsl:sort on line 9 column 141 of transform-with-EE-feature.xsl:
  XTDE1035  Failed to load collation
  http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no: Error in UCA Collation URI http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no: fallback=no is not supported in Saxon-HE
Errors were reported during stylesheet compilation

Using the com.saxonica.Transform class leads to the same result.

If it's relevant, we're using a Red Hat build of OpenJDK on Windows:

> java --version
openjdk 11.0.15 2022-04-19 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.15+9-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.15+9-LTS, mixed mode)
Actions #3

Updated by Michael Kay 8 months ago

Thanks for the info. So, first conjecture disproved. That's progress of a kind.

Actions #4

Updated by Trevor Lawrence 8 months ago

Michael Kay wrote in #note-3:

Thanks for the info. So, first conjecture disproved. That's progress of a kind.

I don't know whether this is helpful information or not, but the reason we added fallback=no is that we were not seeing the expected behavior when tweaking our UCA URI. We found that the presence of the collation attribute with just the base UCA URI did cause some changes in sorting, but further tweaking with the parameters didn't seem to have any effect.

I'll fully admit, however, that my confidence level in our ability to interpret the UCA spec is quite low.

Actions #5

Updated by Michael Kay 8 months ago

I tried this:

java -cp bin/11.5/ee/SaxonEE11-5J/saxon-ee-11.5.jar:license-dir net.sf.saxon.Query -t -qs:"contains('aa','a','http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no')" 

with output

SaxonJ-EE 11.5 from Saxonica
Java version 11.0.6
Using license serial number V010895
Analyzing query from {contains('aa','a','http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no')}
Analysis time: 460.394833 milliseconds
<?xml version="1.0" encoding="UTF-8"?>true

I then tried without the license file on the classpath and got

SaxonJ-EE 11.5 from Saxonica
Java version 11.0.6
No license file found - running with licensable features disabled
Analyzing query from {contains('aa','a','http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no')}
Error 
  FOCH0002  Error in UCA Collation URI
  http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;strength=4;alternate=shifted;fallback=no: fallback=no is not supported in Saxon-HE
Static error(s) in query

So I'm looking for some way in which Saxon can report that a specific license file was found, and then behave as if it wasn't...

Actions #6

Updated by Trevor Lawrence 8 months ago

Michael Kay wrote in #note-5:

So I'm looking for some way in which Saxon can report that a specific license file was found, and then behave as if it wasn't...

I did some troubleshooting on the "license not being picked up" angle before filing this issue, and found that with our typical setup I was able to successfully:

  1. Use a <xsl:source-document> with streamable="true".
  2. Use the extension function saxon:string-to-hexBinary.

It's only been the <xsl:sort> with the fallback=no UCA collation that's caused issues.

My command-line reproduction of the issue was using a barebones config file with just a relative path to our license file, though our typical setup picks it up as a classpath resource.

Actions #7

Updated by Michael Kay 8 months ago

I've now run your stylesheet, under 11.5, with an EET-S license, and it's working fine.

My next line of investigation is whether the EEJ build that I'm running is identical to the one you installed from our Maven repo.

Actions #8

Updated by Michael Kay 8 months ago

  • Status changed from New to In Progress
  • Assignee set to Norm Tovey-Walsh

I have reproduced the problem when running with the Saxon-EE jar file from the Maven repository. It looks as if there is something wrong with this build. While we investigate, could you please try switching to the download from the saxonica.com web site?

Actions #9

Updated by Norm Tovey-Walsh 8 months ago

TL;DR we think you are running in an environment where you have the Saxon EE jar file on the class path but you do not have the ICU4J jar file on the class path. If you put the ICU4J jar file on the class path, the problem should go away.

The problem isn't in the Maven build per se. The jar files are bit-for-bit identical except for the class-path defined in the jar manifest.

For the "download this zip file" distribution, we include a few critical dependencies in a lib subfolder and add them automatically to the classpath in the manifest.

For the Maven distribution, we don't do that because the dependencies are listed in the POM file and Maven will (should?) automatically add them to the classpath.
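
To see which distribution of a jar you have, you can read the Class-Path entry from its manifest. The following is a minimal sketch using only the standard java.util.jar API; the class and method names are illustrative and are not part of Saxon.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Illustrative helper (not part of Saxon): lists the Class-Path entries
// declared in a jar's manifest.
class ManifestClassPath {

    // Returns the whitespace-separated Class-Path entries, or an empty list
    // if the manifest declares none.
    static List<String> classPathEntries(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (value == null || value.trim().isEmpty()) {
            return Collections.<String>emptyList();
        }
        return Arrays.asList(value.trim().split("\\s+"));
    }

    // Convenience overload that opens the jar at the given path.
    static List<String> classPathEntries(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            return manifest == null
                    ? Collections.<String>emptyList()
                    : classPathEntries(manifest);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the jar to inspect
        for (String entry : classPathEntries(args[0])) {
            System.out.println(entry);
        }
    }
}
```

Run it with the path to the Saxon-EE jar as the only argument. If nothing is printed, the manifest adds no dependencies, so they must come from the build tool or an explicit classpath.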

Mike did a nice bit of detective work and we sorted out that the problem you're seeing arises if you're running EE but the ICU4J jar file is not available. (Creating an "ICU" collator fails and we fall back to a "UCA" collator and that leads to the confusing error message.)

Is it possible that you're using the jar file you got from our Maven repository but you're not using it with Maven? If you're constructing the class path through some other means, you need to make sure that the ICU4J jar files are also on the class path.

Please let us know if that fixes the problem. (Or not, of course.)
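
A quick way to check whether ICU4J is visible to the JVM is to try loading one of its classes. This is only a sketch: com.ibm.icu.text.RuleBasedCollator is a real ICU4J class, but the helper below is illustrative, and it is an assumption that this probe matches whatever check Saxon performs internally.

```java
// Illustrative probe (not Saxon code): reports whether a class can be loaded
// from the current classpath.
class Icu4jCheck {

    // Returns true if the named class can be loaded without initializing it.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, Icu4jCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String probe = "com.ibm.icu.text.RuleBasedCollator";
        System.out.println(probe
                + (isClassAvailable(probe) ? " is" : " is NOT")
                + " on the classpath");
    }
}
```

Run it with the same -cp value you pass to Saxon, so that the result reflects the classpath the transformation actually sees.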

Actions #10

Updated by Trevor Lawrence 8 months ago

I can confirm the transform compiles and runs successfully with the icu4j-59.2.jar (from the saxonica.com download) included in the classpath. Thank you!

Norm Tovey-Walsh wrote in #note-9:

Is it possible that you're using the jar file you got from our Maven repository but you're not using it with Maven? If you're constructing the class path through some other means, you need to make sure that the ICU4J jar files are also on the class path.

We do use it with Maven, but when submitting this issue I wanted to eliminate as many variables as possible.

Norm Tovey-Walsh wrote in #note-9:

For the Maven distribution, we don't do that because the dependencies are listed in the POM file and Maven will (should?) automatically add them to the classpath.

I'm not sure this is true (or at least it doesn't appear to be with my Maven build). The com.saxonica:Saxon-EE:11.5 artifact's pom.xml lists the icu4j library as an "optional" dependency. From my reading of the Maven docs, this means that it's not automatically brought in transitively and needs to be explicitly added as a dependency in the dependent project, i.e. ours. I updated our pom.xml to pull in icu4j explicitly and that fixed the issue in our production setup.

Thanks for setting up the Maven repo, by the way. It's one less thing for me to have to manage (publishing our own Saxon-EE artifacts for local consumption).

Actions #11

Updated by Michael Kay 8 months ago

Thanks for that feedback.

Just an extra bit of detail about what's actually happening here. Saxon-EE uses ICU for collation support if it's available, otherwise it uses the collations offered natively by the JDK. In your situation we were using the JDK collations. We believe (rightly or wrongly) that the JDK collations are not 100% conformant with the Unicode Collation Algorithm. By putting the parameter "fallback=no" in the collation URI, you are saying you either want a collation that is 100% conformant, or you want your transformation to fail. So if you're using the JDK collations, we reject a collation URI that specifies fallback=no. We inaccurately say in the message that this is because you are using Saxon-HE; we should improve the message.
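
The rule described above can be sketched as follows. This is not Saxon's actual code: the class, the method names and the icuAvailable flag are illustrative assumptions, and the parser only handles the semicolon-separated key=value form used in the URI from this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (not Saxon's implementation) of the fallback=no rule.
class UcaFallbackSketch {

    // Splits the query part of a UCA collation URI into key/value pairs.
    static Map<String, String> parseParams(String uri) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = uri.indexOf('?');
        if (q < 0 || q == uri.length() - 1) {
            return params; // no parameters present
        }
        for (String pair : uri.substring(q + 1).split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    // A URI with fallback=no is accepted only when a fully conformant (ICU)
    // collator is available; any other URI may fall back to the JDK collator.
    static boolean isAccepted(String uri, boolean icuAvailable) {
        boolean fallbackForbidden = "no".equals(parseParams(uri).get("fallback"));
        return icuAvailable || !fallbackForbidden;
    }

    public static void main(String[] args) {
        String uri = "http://www.w3.org/2013/collation/UCA?lang=en;maxVariable=symbol;"
                + "strength=4;alternate=shifted;fallback=no";
        System.out.println("with ICU:    " + isAccepted(uri, true));
        System.out.println("without ICU: " + isAccepted(uri, false));
    }
}
```

Under this sketch, the URI from the bug report is rejected exactly when ICU is unavailable, which matches the behaviour seen before icu4j was added to the classpath.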

Actions #12

Updated by Norm Tovey-Walsh 6 months ago

  • Assignee changed from Norm Tovey-Walsh to Michael Kay

I think there are two options here:

  1. We can document that the ICU4J dependency is optional and tell users that if they want to use ICU4J, they have to include the dependency themselves in Maven (or their build tool of choice). This is consistent with the way we've always done releases and does not unilaterally impose ICU4J on any user that's getting the distribution through Maven. Or,
  2. We can change the dependency so that it isn't optional. This will fix this problem (I think) with the consequence that it will be harder (though perhaps not impossible) to use the Maven distribution of SaxonJ without using ICU4J.

I don't have a good, intuitive sense of which would be better.

(Mike, I assume you've fixed the error message, or is that still open as well?)

Actions #13

Updated by Michael Kay 6 months ago

Norm, yes, the diagnostics were improved: commit dated 24 Aug 2023, in module UcaCollatorUsingJava.

Actions #14

Updated by Norm Tovey-Walsh 6 months ago

  • Assignee changed from Michael Kay to Norm Tovey-Walsh

I'll propose some documentation improvements.
