Project

Profile

Help

Bug #2825

closed

XQuery performance degradation with Saxon-HE 9.7.0.6

Added by Gunther Rademacher over 8 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Performance
Sprint/Milestone:
-
Start date:
2016-07-05
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
9.7, trunk
Fix Committed on Branch:
9.7, trunk
Fixed in Maintenance Release:
Platforms:

Description

The attached XQuery code is a REx parser that was generated from the EcmaScript grammar at http://bottlecaps.de/rex/EcmaScript.ebnf. It also has some extra tracing on some parsing functions.

Originally I had observed that a similar parser seemed not to terminate, but some investigation showed that it indeed works, however doing a lot of excessive work.

Using this command line to parse a single integer literal,

java net.sf.saxon.Query Ecmascript.xquery input={1}

the parser terminates quickly when run on 9.7.0.5, tracing 96 lines. On 9.7.0.6, there are 450924 trace lines. The result is OK, though.


Files

EcmaScript.xquery (472 KB) EcmaScript.xquery Gunther Rademacher, 2016-07-05 23:13
Actions #1

Updated by Michael Kay over 8 years ago

Thanks for reporting it. I've reproduced the results and have no immediate idea of the cause.

A rather strange effect is the execution profile counts for 9.7.0.6, many of which are powers of 2:

file line instruction count average time (gross) total time (gross) average time (net) total time (net)

"*ascript.xquery" 7212 function p:parse-NumericLiteral 65536 0.019 1229.020 0.017 1135.094

"*ascript.xquery" 7538 function p:parse-MemberExpression 32768 0.146 4793.314 0.028 914.005

"*ascript.xquery" 7271 function p:parse-Literal 65536 0.031 2053.640 0.012 772.322

"*ascript.xquery" 7342 function p:parse-PrimaryExpression 65536 0.043 2838.317 0.011 734.901

"*ascript.xquery" 7764 function p:parse-LeftHandSideExpression 32768 0.187 6118.195 0.017 557.469

"*ascript.xquery" 4518 function p:parse-UnaryExpression 32768 0.218 7151.053 0.015 490.553

"*ascript.xquery" 4425 function p:parse-PostfixExpression 32768 0.202 6633.534 0.015 483.854

whereas the corresponding execution counts for 9.7.0.5 are typically 1 or 2.

Actions #2

Updated by Michael Kay over 8 years ago

More specifically, in 9.7.0.5 there is one call on p:parse-Program resulting in 2 calls on p:parse-Program-1; in 9.7.0.6 there is one call on p:parse-Program resulting in 4 calls on p:parse-Program-1; and the doubling of calls seems to continue in the same vein.

Actions #3

Updated by Michael Kay over 8 years ago

Confirmed that the problem also occurs in the current 9.8 code, and that it still occurs with bytecode generation off.

But it goes away if optimisation is disabled (-opt:0).

(Note, there's an oddity in the profile output: multiple entries for the p:trace function on line 664. I guess this is because there are multiple copies of the function body as a result of inlining, and we're failing to combine them for profiling purposes.)

The problem does not go away if tail call optimization is disabled (which can only be done by commenting out code internally).

Actions #4

Updated by Michael Kay over 8 years ago

This is proving very difficult to track down.

The most obvious symptom of the problem is that function parse-Program-1 makes three recursive calls on itself when it should only make one. The reason it does so is that the $state variable indicates that parsing has not yet finished. I think we need to pin down at what point the value of $state deviates from previous releases. This is hampered by the fact that (as a result of the changes made under bug #2707) there are legitimate differences in the detailed execution plan.

Actions #5

Updated by Michael Kay over 8 years ago

  • Category set to Performance
  • Status changed from New to In Progress
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal

Got it. It was very simple in the end - on some paths (I'm not sure exactly what the preconditions are) we are evaluating the arguments of a function call twice. Where the expression used as the argument of a function call (possibly after inlining of variables) is a complex expression, that can obviously be very expensive. And it accounts for the "powers of two" observation in the profile.

Actions #6

Updated by Michael Kay over 8 years ago

  • Status changed from In Progress to Resolved

For reference, the problem was introduced by the patch for bug #2793.

A fix has been committed for the 9.7 and 9.8 branches.

Actions #7

Updated by Michael Kay over 8 years ago

  • Applies to branch 9.8 added
  • Fix Committed on Branch 9.7, 9.8 added
Actions #8

Updated by Gunther Rademacher over 8 years ago

Thanks for resolving this so quickly! I have fetched the fix from svn an I can confirm that it works fine now.

Actions #9

Updated by O'Neil Delpratt over 8 years ago

  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.7.0.7 added

Bug fix applied in the Saxon 9.7.0.7 maintenance release.

Actions #10

Updated by O'Neil Delpratt over 8 years ago

  • Status changed from Resolved to Closed
Actions #11

Updated by O'Neil Delpratt over 7 years ago

  • Applies to branch trunk added
  • Applies to branch deleted (9.8)
Actions #12

Updated by O'Neil Delpratt over 7 years ago

  • Fix Committed on Branch trunk added
  • Fix Committed on Branch deleted (9.8)

Please register to edit this issue

Also available in: Atom PDF