Bug #2633
closed
9.7.0.3 regression: XQuery compilation does not terminate
Fix Committed on Branch:
9.7
Fixed in Maintenance Release:
Description
The attached XQuery code is a Java 6 parser created by REx. It works OK on Saxon-HE 9.7.0.2, but when I run it on 9.7.0.3, with '-t', I see this output
Saxon-HE 9.7.0.3J from Saxonica
Java version 1.8.0_40
Analyzing query from JavaParser.xquery
and it then appears to go into a loop, fully occupying one processor. A stack trace acquired while observing this is attached. This does not occur when using '-opt:0'.
Best regards,
Gunther
Files
- Category set to Performance
- Status changed from New to In Progress
- Assignee set to Michael Kay
- Priority changed from Low to Normal
I think it's not actually looping; it's just embarking on an optimization strategy with very poor (possibly exponential) performance characteristics.
It's basically looking at the multi-way "or" expression around line 10868 to 10896, and is attempting to turn this into a single GeneralComparison (i.e. turn (X = 1 or X = 2 or X = 3...) into (X = (1,2,3,...)) so that it can benefit from the optimizations that exist for general comparisons. In fact it decides it can't do the optimization because it doesn't have enough type information ($state has type item()+), but it takes a mighty long time deciding this.
I think the problem is that there is a mixture of top-down and bottom-up optimization which is causing a lot of retrying of things that have already been attempted unsuccessfully.
(A secondary problem - or rather, opportunity - is that the optimization could probably be refined so that it works on this particular query; but we need to focus on solving the performance issue for the case where it fails.)
I think this is a regression caused by the fix to bug #2598. John Lumley's comment was prescient: "A solution (which might possibly be expensive in deep or trees due to repeated sub-tree optimisation) is to add optimizeChildren() before the attempt to use a general comparison."
I've solved the performance problem by:
(a) invoking super.optimize() (which causes optimization of the children) before we do anything else, and
(b) attempting the conversion to GeneralComparison only on a top-level OrExpression, i.e. one whose parent is not an OrExpression.
It remains
(i) to check the orginal DITA test case for bug 2598
(ii) to see if we can actually get the optimization to work for this particular test case.
The optimization is now working: the entire OrExpression gets converted into
map:untyped-contains($MAP, data(head($state)))
where $MAP is a map constructed at compile-time containing the set of values that we are testing for, and untyped-contains() is a variant of the map:contains() function where untyped atomic values match values of any other type.
The optimization was failing previously because of an error that we've seen elsewhere: the automatic refactoring done by IntelliJ when we separated Location out of Expression rewrote this code to look for matching Location objects rather than matching Expression objects. Another patch is being committed to correct this.
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
- Found in version deleted (
Saxon-HE 9.7.0.3)
- Applies to branch 9.7 added
- Fix Committed on Branch 9.7 added
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.7.0.4 added
Bug fix applied in the Saxon 9.7.0.4 maintenance release.
Please register to edit this issue
Also available in: Atom
PDF