Bug #6555

closed

java.lang.StackOverflowError when compile Xquery with and without tracing enabled

Added by Joe Che about 2 months ago. Updated about 2 months ago.

Status: Resolved
Priority: High
Assignee:
Category: XQuery Java API
Sprint/Milestone: -
Start date: 2024-10-01
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch: 12, trunk
Fix Committed on Branch: 12, trunk
Fixed in Maintenance Release:
Platforms: .NET, Java

Description

Hi Team, we are migrating from Saxon-EE 9 to 12.5. During regression testing, a simple XQuery raises java.lang.StackOverflowError at compile time.

This issue can be reproduced with the attached query compileXQueryWithStackOverflowError_test003.xquery


Actions #1

Updated by Joe Che about 2 months ago

Please change the Priority of this issue to high.

Actions #2

Updated by Michael Kay about 2 months ago

  • Priority changed from Low to High
  • Fix Committed on Branch 12, trunk added

Thanks. It's gone into an infinite optimization loop while inlining variables. Haven't seen one of those for a long while.

As a workaround, you can disable variable inlining using -opt:-v on the command line or XQueryCompiler.getUnderlyingStaticContext().setOptimizerOptions() in the Java API.

Actions #3

Updated by Michael Kay about 2 months ago

I'm not sure yet whether these observations are critical to the problem, or merely incidental.

We're calculating the dependencies of a SimpleStepExpression incorrectly. This is an expression of the form A/axis::B, where A is known to select a singleton. We're using the default algorithm of combining the dependencies of the two operands, which means we're treating the expression as dependent on the focus because the RH operand is dependent on the focus, even though if A is a simple variable reference then this clearly isn't the case.
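The dependency calculation described above can be sketched with a toy model. This is not Saxon's code: the class names, the `FOCUS` and `LOCAL_VARS` markers, and the `naive_dependencies` method are all hypothetical, chosen only to contrast "union of both operands" with a calculation that recognises that the right-hand step gets its focus from A.

```python
from dataclasses import dataclass

# Hypothetical dependency markers for this sketch
FOCUS = "focus"            # depends on the context item, position or size
LOCAL_VARS = "local-vars"  # depends on local variables


@dataclass(frozen=True)
class VarRef:
    name: str

    def dependencies(self):
        return {LOCAL_VARS}


@dataclass(frozen=True)
class ContextItem:
    def dependencies(self):
        return {FOCUS}


@dataclass(frozen=True)
class AxisStep:
    axis: str
    test: str

    def dependencies(self):
        # An axis step on its own always navigates from the context item
        return {FOCUS}


@dataclass(frozen=True)
class SimpleStep:
    """Toy model of A/axis::B where A selects a singleton."""
    start: object
    step: AxisStep

    def naive_dependencies(self):
        # Default algorithm: union of both operands (reports a spurious focus dependency)
        return self.start.dependencies() | self.step.dependencies()

    def dependencies(self):
        # The step is evaluated with the focus set by `start`, so its focus
        # dependency is satisfied internally and must not leak outwards
        return self.start.dependencies() | (self.step.dependencies() - {FOCUS})
```

In this model `SimpleStep(VarRef("payload"), AxisStep("child", "element()"))` is reported as focus-dependent by the naive method but not by the corrected one, while `./child::*` remains focus-dependent under both because the dependency comes from the left-hand operand.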

Perhaps as an indirect consequence of this, it seems that we decide that the predicate in the where clause is focus-dependent, which is bad news because it means we can't turn it into a simple filter expression to do further optimizations. So we try to factor out the context dependency, which doesn't exist because the dependency is spurious, and perhaps it's this logic (a rewrite is possible, so do it, but it turns out not to be possible) that accounts for the infinite regress. I don't think that's a full explanation yet, but that's the avenue I'm pursuing.

There's a bit of a paradox here because the where clause exists($logActivity) is self-evidently true, from type analysis, which means that there's no need to compute $logActivity, which means the whole FLWOR expression should reduce to something very simple. Perhaps that's how the optimisation would proceed if it hadn't gone into a screaming loop so early.

Actions #4

Updated by Michael Kay about 2 months ago

Not quite. We are indeed inferring that the where clause is focus dependent, but that's because we have done some perfectly correct variable inlining that makes it so: we've inlined $logActivity, and we've inlined $fault, and the expansion of $fault is indeed focus-dependent.

So we do have a predicate that depends on ".", and we do try to factor out this dependency, and this appears to succeed; but in the reconstructed FLWOR expression the offending subexpression root(.) (from the inlining of $fault) is still there. I'm wondering if this is due to not paying enough attention (well, any attention...) to the presence of the injected Trace code.

Actions #5

Updated by Michael Kay about 2 months ago

Coming back after sleeping on this, I think I should focus on the fact that the unusual aspect of this query is the use of a WHERE clause in a FLWOR expression that has no FOR clause. Two reasons: (a) the optimisation strategy for WHERE clauses is all designed on the assumption that it's going to be evaluated repeatedly, and it's a pointless exercise if that isn't the case; and (b) it's only because the WHERE clause isn't within a FOR loop that we get into a mess by injecting a focus dependency that wasn't there originally.

Note also: I found that this failure occurs whether or not trace code has been injected [I should have read the title more carefully]. That makes the debugging rather easier.

Actions #6

Updated by Michael Kay about 2 months ago

I found that the optimisation for exists(X) based on the static cardinality of X is triggering only for the cases where the cardinality is 0 or 1+, it is not triggering where (as here) it is exactly one. Fixing this makes the test case run successfully, but of course it doesn't solve the bug, it merely bypasses it.
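As a rough illustration of the cardinality-based rewrite of exists(X), here is a minimal sketch. It is an assumption-laden toy, not the Saxon implementation: the function name `simplify_exists` and the convention of using `None` for "no upper bound" are invented for this example. The point is that testing the lower bound covers both "1+" and "exactly one".

```python
from typing import Optional


def simplify_exists(min_card: int, max_card: Optional[int]) -> Optional[bool]:
    """Return the statically known value of exists(X), or None if it
    has to be evaluated at run time.

    min_card / max_card are the static cardinality bounds of X;
    max_card=None means "unbounded" (a convention of this sketch).
    """
    if max_card == 0:
        # X is statically empty
        return False
    if min_card >= 1:
        # Covers one-or-more AND exactly-one: only the lower bound matters
        return True
    # Cardinality 0..1 or 0..many: cannot decide statically
    return None
```

With this formulation an expression of cardinality exactly one, such as a constructed element, gives `simplify_exists(1, 1) == True`, which is the case that was being missed.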

I thought I would be able to reinstate the failure by changing the condition to something less tractable, like exists($logActivity//text()), but no, it's not as simple as that. Optimization bugs tend to happen only under very specific conditions, and to make it trigger, I think I'm going to have to temporarily revert the change to exists().

Actions #7

Updated by Michael Kay about 2 months ago

  • Subject changed from java.lang.StackOverflowError when compile Xquert with and without tracing enabled to java.lang.StackOverflowError when compile Xquery with and without tracing enabled

Actions #8

Updated by Michael Kay about 2 months ago

Running with -explain gives an account of how it's looping:

OPT : At line 26 of file:/Users/mike/bugs/2024/6555-Che/test.xq
OPT : Inlined references to $saxon:dot1026871825
OPT : Expression after rewrite: let $payload := <local:getTaskOutputForSuspendOrderResponse {Block($taskData, $suspendOrderResponse)}/> where exists(<oms:LogActivity {<oms:OrderActivity {Block(<oms:TaskData {$taskData}/>, <oms:SuspendOrderResponse {$suspendOrderResponse}/>, <oms:Fault {(((root((.) treat as node()))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Envelope))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Body))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Fault)}/>, <oms:OutputData {$payload}/>)}/>}/>)  return $payload/child::element()
OPT : At line 26 of file:/Users/mike/bugs/2024/6555-Che/test.xq
OPT : Inlined references to $saxon:dot1074389766
OPT : Expression after rewrite: let $payload := <local:getTaskOutputForSuspendOrderResponse {Block($taskData, $suspendOrderResponse)}/> where exists(<oms:LogActivity {<oms:OrderActivity {Block(<oms:TaskData {$taskData}/>, <oms:SuspendOrderResponse {$suspendOrderResponse}/>, <oms:Fault {(((root((.) treat as node()))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Envelope))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Body))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Fault)}/>, <oms:OutputData {$payload}/>)}/>}/>)  return $payload/child::element()

Unfortunately it's not tracing all the rewrite events - in particular it's not tracing the extraction of "." into an outer let expression, only its reversal. I'll add the missing tracing, and we get:

OPT : At line 26 of file:/Users/mike/bugs/2024/6555-Che/test.xq
OPT : Factored out context item
OPT : Expression after rewrite: let $Q{http://saxon.sf.net/}dot573200870 := . return (let $payload := <local:getTaskOutputForSuspendOrderResponse {Block($taskData, $suspendOrderResponse)}/> where exists(<oms:LogActivity {<oms:OrderActivity {Block(<oms:TaskData {$taskData}/>, <oms:SuspendOrderResponse {$suspendOrderResponse}/>, <oms:Fault {(((root(($Q{http://saxon.sf.net/}dot573200870) treat as node()))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Envelope))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Body))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Fault)}/>, <oms:OutputData {$payload}/>)}/>}/>)  return $payload/child::element())
OPT : At line 26 of file:/Users/mike/bugs/2024/6555-Che/test.xq
OPT : Inlined references to $saxon:dot573200870
OPT : Expression after rewrite: let $payload := <local:getTaskOutputForSuspendOrderResponse {Block($taskData, $suspendOrderResponse)}/> where exists(<oms:LogActivity {<oms:OrderActivity {Block(<oms:TaskData {$taskData}/>, <oms:SuspendOrderResponse {$suspendOrderResponse}/>, <oms:Fault {(((root((.) treat as node()))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Envelope))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Body))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Fault)}/>, <oms:OutputData {$payload}/>)}/>}/>)  return $payload/child::element()
OPT : At line 26 of file:/Users/mike/bugs/2024/6555-Che/test.xq
OPT : Factored out context item
OPT : Expression after rewrite: let $Q{http://saxon.sf.net/}dot1277933280 := . return (let $payload := <local:getTaskOutputForSuspendOrderResponse {Block($taskData, $suspendOrderResponse)}/> where exists(<oms:LogActivity {<oms:OrderActivity {Block(<oms:TaskData {$taskData}/>, <oms:SuspendOrderResponse {$suspendOrderResponse}/>, <oms:Fault {(((root(($Q{http://saxon.sf.net/}dot1277933280) treat as node()))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Envelope))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Body))/child::element(Q{http://schemas.xmlsoap.org/soap/envelope/}Fault)}/>, <oms:OutputData {$payload}/>)}/>}/>)  return $payload/child::element())

which makes it clear what's happening. The optimization of a WHERE clause tries to get rid of a context-dependent expression by binding a variable to the context item at an outer level. The optimization of variable references then inlines variables that are only referenced once, where the reference is not within a loop. These two rewrites are clearly in conflict.

The simplest resolution seems to be to avoid the attempt to "factor out dot" from a where clause if the where clause is not within a loop.
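The conflict between the two rewrites, and the effect of the proposed guard, can be reproduced in a deliberately tiny model. Everything here is hypothetical (the `Flwor` record, the rule functions, the `guard` flag and the rewrite limit); it only mimics the shape of the loop seen in the trace, not Saxon's optimizer.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Flwor:
    predicate_uses_dot: bool  # the where clause refers to "." directly
    dot_bound_outside: bool   # an outer `let $dot := .` wraps the FLWOR
    is_repeated: bool         # the where clause is inside a for (or similar) clause


def factor_out_dot(e: Flwor, guard: bool) -> Flwor:
    """Rewrite A: bind "." to an outer variable so the predicate loses its focus dependency."""
    if guard and not e.is_repeated:
        return e  # proposed resolution: skip when the where clause is not in a loop
    if e.predicate_uses_dot:
        return replace(e, predicate_uses_dot=False, dot_bound_outside=True)
    return e


def inline_single_use(e: Flwor) -> Flwor:
    """Rewrite B: inline a variable referenced once, where the reference is not in a loop."""
    if e.dot_bound_outside and not e.is_repeated:
        return replace(e, predicate_uses_dot=True, dot_bound_outside=False)
    return e


def optimize(e: Flwor, guard: bool, max_rewrites: int = 50):
    """Apply one rule at a time until none fires; return (result, rewrites applied)."""
    rewrites = 0
    while True:
        for rule in (lambda x: factor_out_dot(x, guard), inline_single_use):
            rewritten = rule(e)
            if rewritten != e:
                e = rewritten
                rewrites += 1
                break
        else:
            return e, rewrites  # no rule fired: fixpoint reached
        if rewrites >= max_rewrites:
            # Stand-in for the StackOverflowError in the real compiler
            raise RuntimeError("rewrites keep undoing each other")
```

Starting from `Flwor(predicate_uses_dot=True, dot_bound_outside=False, is_repeated=False)`, `optimize(..., guard=False)` never settles and hits the rewrite limit, whereas `guard=True` leaves the expression alone. When `is_repeated` is true, the factoring fires once and is not undone, because the reference is then inside a loop and is not inlined.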

Actions #9

Updated by Michael Kay about 2 months ago

  • Status changed from New to Resolved
  • Platforms .NET added

Fixed by changing the "factorOutDot" optimization to apply only to WHERE clauses where IsRepeated() is true, i.e. if the where clause is within a FOR or similar clause.
