Bug #2664

closed

XQuery optimize leads to stackoverflow

Added by Matthew Halverson about 8 years ago. Updated about 8 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: XQuery conformance
Sprint/Milestone: -
Start date: 2016-03-06
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 9.7
Fix Committed on Branch: 9.7
Fixed in Maintenance Release:
Platforms:

Description

In Saxon-HE 9.7.0.2J, the following XQuery (run with @java -cp saxon9he.jar net.sf.saxon.Query test.xqy@)

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "text";
declare context item := doc('test.xml'); (: error occurs with or without this line :)

for $x in distinct-values(doc('test.xml')/report//section//title)
where count(/report//section[title=string($x)]) > 1
return $x

causes a stack overflow error when run against the following XML

<report>
    <title>Main title</title>
    <section>
        <title>sec1</title>
    </section>

    <section>
        <title>sec1</title>
        <section>
            <title>sec21</title>
        </section>
    </section>      
</report>

The error does not occur in the following three variants:

loop uses default context

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "text";
declare context item := doc('test.xml');

for $x in distinct-values(/report//section//title)
where count(doc('test.xml')/report//section[title=string($x)]) > 1
return $x

loop and where both use explicit context

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "text";
declare context item := doc('test.xml');

for $x in distinct-values(doc('test.xml')/report//section//title)
where count(doc('test.xml')/report//section[title=string($x)]) > 1
return $x

both loop and where use default context

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "text";
declare context item := doc('test.xml');

for $x in distinct-values(/report//section//title)
where count(/report//section[title=string($x)]) > 1
return $x

The original query and XML, as well as the complete stack trace, are attached.


Files

test.xqy (291 Bytes): XQuery causing error. Matthew Halverson, 2016-03-06 08:52
test.xml (251 Bytes): XML causing error. Matthew Halverson, 2016-03-06 08:52
stacktrace.txt (69.4 KB): Complete stacktrace. Matthew Halverson, 2016-03-06 08:54
Actions #1

Updated by Michael Kay about 8 years ago

  • Category set to XQuery conformance
  • Status changed from New to In Progress
  • Assignee set to Michael Kay

Thanks for reporting it. I've reproduced the problem. Initial investigation shows:

(a) it happens under HE but not under EE

(b) the optimizer is attempting to rewrite the "where" clause so it doesn't have any dependency on the context item (by introducing a new variable). It appears to do the rewrite successfully, but it still thinks there is a dependency there, so it tries again... and again...

Actions #2

Updated by Michael Kay about 8 years ago

The optimizer is binding a variable to "." outside the FLWOR expression because it's a good idea to remove focus dependencies from the where clause (so that it can be turned into a predicate). But then it notices that there is only one reference to this variable, so it inlines it, which leaves the expression back where it started.
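For illustration only, the intermediate form after the first rewrite might look roughly like the query below. This is a hand-written sketch, not Saxon's internal expression tree: the variable name $ctx is hypothetical (it stands in for the system-generated variable), and root($ctx) is used as an approximation of what the leading "/" means once the context item has been captured in a variable.

```xquery
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "text";
declare context item := doc('test.xml');

(: $ctx is a hypothetical name standing in for the variable that the
   optimizer binds to "." outside the FLWOR expression :)
let $ctx := .
for $x in distinct-values(doc('test.xml')/report//section//title)
(: the where clause no longer refers to the context item directly :)
where count(root($ctx)/report//section[title=string($x)]) > 1
return $x
```

In this form the where clause has no focus dependency. Inlining the single reference to $ctx turns root($ctx) back into a path that starts from ".", which is equivalent to the original where clause, so the rewrite is triggered again.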

Actions #3

Updated by Michael Kay about 8 years ago

Saxon-EE doesn't hit the problem because the optimizer takes a completely different path: it builds an index to support evaluation of the where clause.

Actions #4

Updated by Michael Kay about 8 years ago

  • Status changed from In Progress to Resolved
  • Fix Committed on Branch 9.7 added

A patch has been committed on the 9.7 branch to stop system-created variables bound to "." from being subsequently inlined.

Actions #5

Updated by Michael Kay about 8 years ago

  • Status changed from Resolved to In Progress

I'm re-opening this because although the patch works, I think it only addresses the symptoms and not the root cause.

The variable (let $zz:76543 := .) that Saxon introduces outside the FLWOR expression should not be inlined in subsequent optimization phases because it is used in the where clause, and the where clause is evaluated repeatedly. Variables that are used in a looping subexpression should never be inlined. So something has gone wrong with the analysis.

Actions #6

Updated by Michael Kay about 8 years ago

Indeed, the code that iterates over the operands of a FLWOR expression is making no attempt to classify any of the operands as "higher order" - that is, evaluated repeatedly. The only reason we get away with that is that this information is normally used only during streamability analysis, which is XSLT-only, and generalised FLWOR expressions do not occur in XSLT.

Actions #7

Updated by Michael Kay about 8 years ago

  • Status changed from In Progress to Resolved

I have replaced the previous patch with a more extensive patch that correctly computes which subexpressions of a FLWOR expression are evaluated repeatedly.

Actions #8

Updated by O'Neil Delpratt about 8 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.7.0.4 added

Bug fix applied in the Saxon 9.7.0.4 maintenance release.
