Project

Profile

Help

Bug #5054

Internal error optimising an inline function that calls its containing named function

Added by Mary Holstege 3 months ago. Updated about 1 month ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
XQuery Java API
Sprint/Milestone:
-
Start date:
2021-07-30
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
10, trunk
Fix Committed on Branch:
10, trunk
Fixed in Maintenance Release:
Platforms:

Description

Running a particular XQuery program with Saxon EE 10.3 raises an internal error (see stack trace). Note that Saxon HE 10.3 has no problem with it. Code attached in zip.

Executed via: java -cp saxon-ee-10.3.jar net.sf.saxon.Query -traceout:#err -q:bug.xqy

Stack trace: java.lang.RuntimeException: Internal error evaluating function this:split-tree at line 30 in module file:/home/mary/SVG/BUGREPORT/geo/splittree.xqy at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:655) at net.sf.saxon.expr.UserFunctionCall.callFunction(UserFunctionCall.java:560) at net.sf.saxon.expr.UserFunctionCall.evaluateItem(UserFunctionCall.java:491) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:151) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:148) at net.sf.saxon.expr.SystemFunctionCall.evaluateArguments(SystemFunctionCall.java:454) at net.sf.saxon.expr.FunctionCall.iterate(FunctionCall.java:546) at net.sf.saxon.expr.CardinalityChecker.evaluateItem(CardinalityChecker.java:283) at net.sf.saxon.expr.ItemChecker.evaluateItem(ItemChecker.java:255) at com.saxonica.ee.bytecode.ByteCodeCandidate.evaluateItem(ByteCodeCandidate.java:299) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:151) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:148) at net.sf.saxon.expr.LetExpression.eval(LetExpression.java:545) at net.sf.saxon.expr.LetExpression.iterate(LetExpression.java:519) at net.sf.saxon.expr.ItemChecker.iterate(ItemChecker.java:231) at com.saxonica.ee.bytecode.ByteCodeCandidate.iterate(ByteCodeCandidate.java:328) at net.sf.saxon.expr.instruct.ForEach.iterate(ForEach.java:557) at net.sf.saxon.expr.ForExpression.process(ForExpression.java:407) at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(ElementCreator.java:352) at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(ElementCreator.java:298) at net.sf.saxon.expr.instruct.Instruction.process(Instruction.java:142) at com.saxonica.ee.bytecode.ByteCodeCandidate.process(ByteCodeCandidate.java:149) at net.sf.saxon.query.XQueryExpression.run(XQueryExpression.java:455) at net.sf.saxon.s9api.XQueryEvaluator.run(XQueryEvaluator.java:370) at net.sf.saxon.Query.runQuery(Query.java:895) at net.sf.saxon.Query.doQuery(Query.java:423) at net.sf.saxon.Query.main(Query.java:97) Caused by: java.lang.RuntimeException: Internal error evaluating function anon:f_1561063579 at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:655) at net.sf.saxon.functions.hof.UserFunctionReference$BoundUserFunction.call(UserFunctionReference.java:411) at net.sf.saxon.functions.SystemFunction.dynamicCall(SystemFunction.java:459) at net.sf.saxon.functions.hof.FoldLeftFn$FoldLeftFold.processItem(FoldLeftFn.java:56) at net.sf.saxon.functions.FoldingFunction.call(FoldingFunction.java:51) at net.sf.saxon.expr.FunctionCall.iterate(FunctionCall.java:548) at net.sf.saxon.expr.CardinalityChecker.evaluateItem(CardinalityChecker.java:283) at net.sf.saxon.expr.ItemChecker.evaluateItem(ItemChecker.java:255) at com.saxonica.ee.bytecode.ByteCodeCandidate.evaluateItem(ByteCodeCandidate.java:299) at net.sf.saxon.expr.LetExpression.evaluateItem(LetExpression.java:576) at com.saxonica.ee.bytecode.ByteCodeCandidate.evaluateItem(ByteCodeCandidate.java:299) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:151) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:148) at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:645) ... 26 more Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 4 out of bounds for length 4 at net.sf.saxon.value.Closure.saveContext(Closure.java:88) at net.sf.saxon.value.MemoClosure.(MemoClosure.java:71) at net.sf.saxon.expr.parser.Evaluator$9.evaluate(Evaluator.java:245) at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:645) ... 39 more

java.lang.RuntimeException: Internal error evaluating template rule at line 18 in module file:/home/mary/SVG/BUGREPORT/bug.xqy
at com.saxonica.ee.bytecode.ByteCodeCandidate.process(ByteCodeCandidate.java:162)
at net.sf.saxon.query.XQueryExpression.run(XQueryExpression.java:455)
at net.sf.saxon.s9api.XQueryEvaluator.run(XQueryEvaluator.java:370)
at net.sf.saxon.Query.runQuery(Query.java:895)
at net.sf.saxon.Query.doQuery(Query.java:423)
at net.sf.saxon.Query.main(Query.java:97)

Caused by: java.lang.RuntimeException: Internal error evaluating function this:split-tree at line 30 in module file:/home/mary/SVG/BUGREPORT/geo/splittree.xqy at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:655) at net.sf.saxon.expr.UserFunctionCall.callFunction(UserFunctionCall.java:560) at net.sf.saxon.expr.UserFunctionCall.evaluateItem(UserFunctionCall.java:491) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:151) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:148) at net.sf.saxon.expr.SystemFunctionCall.evaluateArguments(SystemFunctionCall.java:454) at net.sf.saxon.expr.FunctionCall.iterate(FunctionCall.java:546) at net.sf.saxon.expr.CardinalityChecker.evaluateItem(CardinalityChecker.java:283) at net.sf.saxon.expr.ItemChecker.evaluateItem(ItemChecker.java:255) at com.saxonica.ee.bytecode.ByteCodeCandidate.evaluateItem(ByteCodeCandidate.java:299) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:151) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:148) at net.sf.saxon.expr.LetExpression.eval(LetExpression.java:545) at net.sf.saxon.expr.LetExpression.iterate(LetExpression.java:519) at net.sf.saxon.expr.ItemChecker.iterate(ItemChecker.java:231) at com.saxonica.ee.bytecode.ByteCodeCandidate.iterate(ByteCodeCandidate.java:328) at net.sf.saxon.expr.instruct.ForEach.iterate(ForEach.java:557) at net.sf.saxon.expr.ForExpression.process(ForExpression.java:407) at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(ElementCreator.java:352) at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(ElementCreator.java:298) at net.sf.saxon.expr.instruct.Instruction.process(Instruction.java:142) at com.saxonica.ee.bytecode.ByteCodeCandidate.process(ByteCodeCandidate.java:149) ... 5 more Caused by: java.lang.RuntimeException: Internal error evaluating function anon:f_1561063579 at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:655) at net.sf.saxon.functions.hof.UserFunctionReference$BoundUserFunction.call(UserFunctionReference.java:411) at net.sf.saxon.functions.SystemFunction.dynamicCall(SystemFunction.java:459) at net.sf.saxon.functions.hof.FoldLeftFn$FoldLeftFold.processItem(FoldLeftFn.java:56) at net.sf.saxon.functions.FoldingFunction.call(FoldingFunction.java:51) at net.sf.saxon.expr.FunctionCall.iterate(FunctionCall.java:548) at net.sf.saxon.expr.CardinalityChecker.evaluateItem(CardinalityChecker.java:283) at net.sf.saxon.expr.ItemChecker.evaluateItem(ItemChecker.java:255) at com.saxonica.ee.bytecode.ByteCodeCandidate.evaluateItem(ByteCodeCandidate.java:299) at net.sf.saxon.expr.LetExpression.evaluateItem(LetExpression.java:576) at com.saxonica.ee.bytecode.ByteCodeCandidate.evaluateItem(ByteCodeCandidate.java:299) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:151) at net.sf.saxon.expr.parser.Evaluator$5.evaluate(Evaluator.java:148) at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:645) ... 26 more Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 4 out of bounds for length 4 at net.sf.saxon.value.Closure.saveContext(Closure.java:88) at net.sf.saxon.value.MemoClosure.(MemoClosure.java:71) at net.sf.saxon.expr.parser.Evaluator$9.evaluate(Evaluator.java:245) at net.sf.saxon.expr.instruct.UserFunction.call(UserFunction.java:645) ... 39 more Fatal error during query: java.lang.RuntimeException: Internal error evaluating template rule at line 18 in module file:/home/mary/SVG/BUGREPORT/bug.xqy

BUGREPORT20210730.zip (45.6 KB) BUGREPORT20210730.zip Zip file of code Mary Holstege, 2021-07-30 23:42

History

#1 Updated by Michael Kay 3 months ago

Thanks Mary. The crash location is the same as the unresolved bug #4934, so perhaps we'll kill two birds with one stone.

I'm afraid I probably won't get a chance to look at it till after Balisage, unless it's really urgent. I'm looking forward to your talk -- we seem to have similar topic areas from a first glance!

#2 Updated by Mary Holstege 3 months ago

Not urgent at all. It is standing in the way of me sharing some code to do well-separated pair decomposition, it all.

#3 Updated by Michael Kay 2 months ago

  • Status changed from New to In Progress

Problem reproduced on the current 10.x branch.

The critical error is this:

Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 4 out of bounds for length 4
	at net.sf.saxon.value.Closure.saveContext(Closure.java:88)
	at net.sf.saxon.value.MemoClosure.<init>(MemoClosure.java:71)

which is some kind of problem in allocating the size of a stack frame.

#4 Updated by Michael Kay 2 months ago

The problem is with the anonymous function on line 365 of splittree.xqy. The stack frame map for this function has only four variables (the two parameters $tree and $node, the Saxon-allocated switch variable, and $point), but the function clearly declares more variables than this ($depth, $points, $box) etc, and some of these at least are retained in the compiled code.

It looks at first sight as if the code for allocating variable slots isn't handling switch expressions correctly. Note that the switch has been compiled to an if-then-else expression presumably because it only has three branches.

#5 Updated by Michael Kay 2 months ago

At compile time we are correctly allocating a stack frame with 16 slots to the anonymous user function. But at run time, when we come to do a dynamic call on the anonymous function (from the fold-left call), the stack frame has mysteriously been replaced by one with only 4 slots. It's still the same UserFunction object, but it's a different SlotManager object (allocated much later).

I'm monitoring cases where the SlotManager/StackFrameMap of a function gets replaced, and it seems to happen repeatedly:

Replacing stack frame 16 to 14
Replacing stack frame 14 to 14
Replacing stack frame 14 to 14
...
Replacing stack frame 14 to 4

This is happening as a result of optimization. I suspect that the inline function is being optimised to use fewer local variables, and we are using a reduced stack frame for the optimised form of the function, but we are not actually replacing the function body with the optimised version.

#6 Updated by Michael Kay 2 months ago

In fact, the optimisation of the inline function has been done incorrectly. For some reason the optimiser has decided that many of the variables in the default branch of the switch are unreferenced, and has therefore removed their bindings, but references to these variables (and their slot numbers) actually remain in the expression tree.

#7 Updated by Michael Kay 2 months ago

On the second optimization pass of the FLWOR expression on line 377, we ask whether there are any references to the variable $depth in the first let clause, and the answer comes back false: because the variable reference $depth on line 404 has been rebound to a simple XPath-style LetExpression.

This appears to be happening in the course of inlining of the function call (line 404) this:node($last-id + 1, $depth + 1, $bottom).

This expression is initially rewritten as

let $id:=$last-id + 1, $depth := $depth + 1, $points:= $bottom return
map {
    "kind": "node",
    "id": $id,
    "depth": $depth,
    "points": $points,
    "size": count($points)
  }

It goes wrong after that. The rewrite seems correct, though the reuse of variable names complicates the debugging.

Because some of these variables are used only once, this should subsequently be rewritten as

map {
    "kind": "node",
    "id": $last-id + 1,
    "depth": $depth + 1,
    "points": $bottom,
    "size": count($bottom)
  }

Confirmed that this rewrite takes place, and at this stage the variable $depth is correctly bound to the let clause of the 12-clause FLWOR expression.

Shortly afterwards, the 12-clause FLWOR is rewritten as a LetExpression (nested 12 deep), On entry to FLWORExpression.rewriteForOrLet(), the binding of $depth looks correct. On exit from this method the binding of $depth appears to be incorrect.

#8 Updated by Michael Kay 2 months ago

Well, I've got to a very strange point in the debugger.

We've just rewritten the FLWOR expression as a nested LetExpression, and the third action operand of the Choose is Operand@12562 with childExpression set to this nested LetExpression@11503 (let $depth := ...). And Choose.optimize() invokes operand.optimize() on this operand, which does childExpression().optimize(), which somehow invokes FLWORExpression.optimize() on the original FLWOR expression.

Very bizarre.

Looking more carefully, we are optimising the Choose (switch) expression on line 366, and while this is still going on, we re-enter the optimiser to process the same expression recursively.

This is happening essentially because of the call on line 419: the body of the named function naive-split-tree() contains an anonymous function which contains a call on naive-split-tree(); we are examining that call to see if it is eligible for inlining, and end up looking at code that is in the process of being rewritten.

#9 Updated by Michael Kay 2 months ago

At one level the solution is obvious: prevent inlining of a function call that appears within an inline (anonymous) function (whether or not it's a call to a containing function).

The only problem is knowing that we are within an anonymous function. We can't establish this by searching parent pointers in the expression tree.

The best bet seems to be to strengthen the check against recursive optimisation of a UserFunctionReference (@UserFunctionReference.java#110) which was clearly designed to catch this kind of problem, but is failing to do so in this case.

NoteL workaround is to suppress function inlining (-opt:-f)

#10 Updated by Michael Kay 2 months ago

  • Subject changed from Internal error running XQuery with Saxon EE to Internal error optimising an inline function that calls its containing named function
  • Status changed from In Progress to Resolved
  • Applies to branch trunk added
  • Fix Committed on Branch 10, trunk added

#11 Updated by Michael Kay about 2 months ago

  • Status changed from Resolved to In Progress
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal

Reopening. Both Saxon 10 and Saxon 11 are now looping on XSLT3 test case -s:higher-order-functions -t:higher-order-functions-075 and this patch is the obvious culprit.

Interestingly, the test case follows the same pattern as the one in this bug issue: an anonymous function makes a call on the containing named function; the only obvious difference is that in this case, the containing function is XSLT.

#12 Updated by Michael Kay about 2 months ago

It's clearly in an optimization loop (traced entry to UserFunctionReference.optimize()) but the looping is much more complex in this case because there are multiple inline functions involved, some of them nested.

And looking at the stack in the debugger, it's caused by recursive inlining.

I think we have to find a way to recognise that the outer named function is not inline-able because it is "indirectly recursive" (or "statically recursive") - it contains a call on itself within its body, even though the recursive call is indirect, because it's within an anonymous function.

(I don't know, incidentally, how the patch exposed this problem, but I don't think the patch is wrong: it's just incomplete).

#13 Updated by Michael Kay about 2 months ago

Inlining is suppressed if the target function is beyond a certain size, computed by ExpressionTool.expressionSize(). Modifying this function so that the expression size of a UserFunctionReference is the size of the body of the referenced function has the effect that inlining causes the function size to grow to a level beyond the threshold for further inlining. Which ensures that this test passes.

#14 Updated by Michael Kay about 2 months ago

  • Status changed from In Progress to Resolved

#15 Updated by O'Neil Delpratt about 1 month ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 10.6 added

Bug fix applied in the Saxon 10.6 maintenance release

Please register to edit this issue

Also available in: Atom PDF