Bug #3902
closedRegex non-greedy matching failure
100%
Description
See https://saxonica.plan.io/boards/3/topics/7300
Made into XSLT3 test case analyze-string-099
Updated by Michael Kay over 6 years ago
The path taken by the non-greedy code for OpRepeat at Operation#884 is quite bewildering. Can't see the logic to it at all. Will need to study how this works in some examples where we actually get the right answer.
Note also in passing, the optimization that recognizes that a match must start with a particular character and fast-forwards to that character (at REMatcher#439) doesn't seem to be working as well as it should - although it recognizes that there's a fixed prefix it still seems to look for a match at every character position.
Updated by Michael Kay over 6 years ago
Not many tests exercise this path. In fact, only the following six tests do so:
In XSLT3: regex-syntax-0861, -0955
In QT3: fn-matches-50, re00974, -5, -6
And all of these are concerned with giving a boolean match/no-match result, not in extracting the matching substring.
Updated by Michael Kay over 6 years ago
Concerning the easier problem of the poorly-performing prefix comparison, the issue is that when there is a mismatch on the last character of the prefix (which in this case is the only character) we're taking the path as if the prefix matched, which negates the optimization in the case where the prefix has length 1.
Fixed this on the 9.9 branch.
Updated by Michael Kay over 6 years ago
Updated by Michael Kay over 6 years ago
I have replaced the impenetrable code (for non-greedy OpRepeat with a variable length match) with a greatly simplified version. This is working for the test case in question, but it doesn't yet handle min/max cardinality.
With this simple implementation, one of the tests in fn-matches-50 (the Perl regex test suite) fails with infinite backtracking: the regex in question is matches('abcde', '(?:r?)*?r|(.{2,4})')
(a classic one that allows an infinite number of zero-length matches).
Solved this the traditional way of adding a ForceProgressIterator.
With this change all tests are passing. So given that we haven't implemented min and max, we clearly need more tests!
Updated by Michael Kay over 6 years ago
- Status changed from New to Resolved
- Fix Committed on Branch 9.8, trunk added
I have added a number of tests to QT3 fn-matches to handle non-greedy matching with min/max constraints.
All XSLT3 / QT3 tests now passing.
Updated by O'Neil Delpratt about 6 years ago
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.9.0.1 added
Bug fix applied in the Saxon 9.9.0.1 major release.
Leave open until fix applied in the next Saxon 9.8 maintenance release.
Updated by O'Neil Delpratt about 6 years ago
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.8.0.15 added
Bug fix applied in the Saxon 9.8.0.15 maintenance release.
Please register to edit this issue