Bug #2401: Performance regression for positional patterns - Saxon - Saxonica Developer Community

Actions

Send by e-mail Copy link

Bug #2401

closed

Performance regression for positional patterns

Added by Michael Kay over 9 years ago. Updated about 9 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Michael Kay

Category:

Performance

Sprint/Milestone:

9.6.0.7

Start date:

2015-06-17

Due date:

% Done:

100%

Estimated time:

Legacy ID:

Applies to branch:

9.6

Fix Committed on Branch:

9.6

Fixed in Maintenance Release:

9.6.0.7

Platforms:

Description

The cost of matching a positional pattern such as

match="XYZ[P]/node()[last()]

appears to have increased substantially between 9.5 and 9.6, at least in the case where the predicate P is expensive to evaluate. In 9.5, P is evaluated only if the node being tested is the last child of an XYZ, but in 9.6, P is tested before testing whether the node is the last child.

Both strategies appear to b correct and it's not clear that one is superior to the other in all cases. In the absence of any cost-based optimization, or learning approaches, or parallel threads so we can report a mismatch as soon as either condition is found to be false, it's not clear what the best approach is.

Actions

Copy link

Updated by Michael Kay over 9 years ago

I've added a rewrite to FilterExpression so the pattern child::node()[last()] becomes child::node()[empty(following-sibling::node())].

This has a dramatic effect on this test case: 9.5 = 13s, 9.6 without patch = 2m 45s, 9.6 with patch = 3s.

But this doesn't help with the general case. Given

A[P]/B[Q]

performance is very sensitive to whether P or Q is evaluated first, and either might be better.

Actions

Copy link

Updated by Michael Kay over 9 years ago

Need to revise the patch to handle the filter [position() != last()] correctly.

Actions

Copy link

Updated by Michael Kay over 9 years ago

Indeed, as predicted in comment #3, a further refinement was needed to the patch: exposed by XSLT3 tests position-3003 / position-5001.

A further patch has been committed.

Actions

Copy link

Updated by Michael Kay over 9 years ago

Status changed from New to Resolved

Closing this.

For 9.7 I have added some crude metrics to help us decide the order of evaluation of predicates (and parent/ancestor qualifiers). These appear to handle some test cases quite well: at least the cases where one predicate uses grossly inefficient constructs such as the preceding axis.

Actions

Copy link