Bug #4444
closedProblem with count(//parent::node()) and schematron comparison operators
100%
Description
Beginning with Saxon HE 9.5.1-4, the library no longer properly handles the use of count(//parent::node()) properly when doing comparisons in schematron.
Oddly enough, if you output the value in the assert using value-of, it is successful. For example:
This asserts:
<sch:pattern>
<sch:rule context="sp:Bin_Center_Lookup/sp:bin_center_field_name">
<sch:let name="cnt2" value="count(//parent::node())" />
<sch:assert test="$cnt2 eq 379">
Test1 No output count</sch:assert>
</sch:rule>
</sch:pattern>
While this works as expected:
<sch:pattern>
<sch:rule context="sp:Bin_Center_Lookup/sp:bin_center_field_name">
<sch:let name="cnt2" value="count(//parent::node())" />
<sch:assert test="$cnt2 eq 379">
Test2 Output count: <sch:value-of select="$cnt2"/></sch:assert>
</sch:rule>
</sch:pattern>
Example files for validation attached.
We may just be somehow calling the schematron incorrectly from our libraries, but this appears to be the only place where we are running into issues.
Files
Updated by Michael Kay almost 5 years ago
When I execute the XPath query count(//parent::node())
directly against your source document, I get the answer 379, which is what you appear to be expecting.
If schematron is getting a different answer, then that must be because it is doing something different. I'm afraid we can't debug schematron code for you. You haven't even told us which schematron processor you are using. Is it one that generates XSLT? If you can send us the XSLT and convince us that a known XSLT stylesheet is not producing the correct output, then that's something we can investigate.
Have you checked any variations on the expression such as number($cnt2) eq 379
or $cnt = 379
or $cnt = '379'
? One possible explanation, if $cnt
displays as "379" but doesn't compare "eq" to 379, is that it's a string rather than a number. Without knowing schematron, I've no idea if that's a real possibility.
Updated by Michael Kay almost 5 years ago
- Status changed from New to In Progress
Updated by Michael Kay almost 5 years ago
I have generated the XSLT stylesheet using the "skeleton" schematron implementation and am running the XSLT directly against the source document, which demonstrates that this does appear to be a Saxon problem. The XPath expression count(//parent::node())
is evaluated successfully, but there seems to be a problem with the variable $cnt2. Adding xsl:message output to show the variable value causes everything to run fine; this is usually an indication of some incorrect optimisation rewrite, which doesn't happen when an extra reference to the variable is added.
Updated by Michael Kay almost 5 years ago
In the absence of a second reference to the variable, the variable has been inlined, giving an expression like
if test="count(//parent::node()) = 379" then () else <svrl:failed-assert..../>
This then triggers another optimization, where count(X) = N
is replaced by count(subsequence(X, 1, N+1)) = N
. The idea of this optimization is to avoid evaluating more items in the expression than are needed to establish whether or not the sequence size is 379.
The debugger shows that the actual length of the subsequence() is 380.
It looks to me as if the expression isn't eliminating duplicates. Clearly the naive nested-loop evaluation strategy for a descendant/parent path expression is going to include duplicates that need to be eliminated, but there doesn't appear to be any code on the expression tree to do this.
Updated by Michael Kay almost 5 years ago
ValueComparison.optimizeCount()
is responsible for the rewrite of the count(X) = N
expression, and it starts by doing
sequence = sequence.unordered(false, false);
which is done to avoid unnecessary sorting of the input sequence (the order of the sequence does not affect its size). The first argument to this call is a flag retainAllNodes
which should be set to true, indicating that although the order of the sequence does not matter, the presence or absence of duplicates does.
Setting this argument to true
appears to solve the problem.
Updated by Michael Kay almost 5 years ago
Incidentally, a more efficient query to count the number of nodes that have children is probably
count(//node()[child::node()])
Writing it this way doesn't involve any sorting or elimination of duplicates.
Updated by Michael Kay almost 5 years ago
- Category set to Internals
- Status changed from In Progress to Resolved
- Assignee set to Michael Kay
- Applies to branch 9.9, trunk added
- Fix Committed on Branch 9.9, trunk added
Fixed on the 9.9 and development branches. I'm not planning to provide a fix for earlier releases.
Updated by Jordan Padams almost 5 years ago
Thanks so much! Is there a plan for a new maintenance release in the near-term?
Updated by Michael Kay almost 5 years ago
We base the decision on the number of outstanding patches and their severity. There are currently 12 outstanding patches for 9.9 and the rate has slowed to about 6 a month, so I think it's likely to be 4-6 weeks before we decide another maintenance release is due. Suggest you try the workaround in comment #6 - which is a better way to write the query anyway.
Updated by O'Neil Delpratt over 4 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.9.1.7 added
Patch applied in the 9.9.1.7 maintenance release.
Please register to edit this issue