Bug #3429
closed
Regular expression in fn:replace does not match (but should)
100%
Description
Hello,
I have an XSLT with a regex that extracts the unit from a numeric value, e.g. "10%" results in "%".
With e.g. Saxon HE 9.6.0.7 the result is as expected:
<?xml version="1.0" encoding="UTF-8"?>
<unit>%</unit>
With Saxon HE 9.8.0.4 however the result is different:
<?xml version="1.0" encoding="UTF-8"?>
<unit>10%</unit>
OS is Linux, but another user experiences the problem under Windows as well.
I attached a minimal example XSLT, which may also be used as its own input; command line is:
java -jar saxon9he.jar -s:regex_min.xsl -xsl:regex_min.xsl
Updated by Michael Kay about 7 years ago
Confirmed that there appears to be a regression here between 9.7 and 9.8.
Updated by Michael Kay about 7 years ago
Added to QT3 test suite as test fn-replace-56.
Updated by Michael Kay about 7 years ago
- Category set to XPath conformance
- Status changed from New to Resolved
- Assignee set to Michael Kay
The Saxon regex engine, given a sequence containing a repeatable term (\d*) followed by another term (.?), attempts to establish whether the boundary is unambiguous: that is, whether, given a particular character in the input, it is possible to determine unambiguously whether it belongs to the first term or the second. Because this eliminates the need for backtracking, it can deliver substantial performance improvements; the test for unambiguity was therefore improved in 9.8. But it has wrongly decided that this case is unambiguous: although a digit cannot match the second term (.), it can match the third (\d+), and the second term is allowed to be empty.
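To illustrate the effect described above, here is a minimal sketch using java.util.regex rather than Saxon's own engine. The pattern is hypothetical: it is assembled only from the three terms named in this comment, and it assumes the second term is a literal dot (since a digit is said not to match it). The real pattern from the attached stylesheet is not reproduced here. A possessive quantifier (\d*+) stands in for "the engine decided no backtracking is needed".

```java
import java.util.regex.Pattern;

public class BacktrackDemo {
    // Hypothetical pattern built from the terms named in the comment:
    // \d* then an optional literal dot then \d+ (assumption, not the reporter's regex).
    static final Pattern BACKTRACKING = Pattern.compile("\\d*\\.?\\d+");

    // Same pattern, but the first term is possessive, i.e. it never gives
    // characters back. This mimics treating the boundary as unambiguous.
    static final Pattern NO_BACKTRACKING = Pattern.compile("\\d*+\\.?\\d+");

    static boolean matchesWithBacktracking(String s) {
        return BACKTRACKING.matcher(s).matches();
    }

    static boolean matchesWithoutBacktracking(String s) {
        return NO_BACKTRACKING.matcher(s).matches();
    }

    public static void main(String[] args) {
        // \d* first takes "10", then must give back "0" so that \d+ can match.
        System.out.println(matchesWithBacktracking("10"));    // true
        // \d*+ keeps "10", the optional dot matches empty, and \d+ finds nothing.
        System.out.println(matchesWithoutBacktracking("10")); // false
        // With a dot present the boundary really is clear, so both succeed.
        System.out.println(matchesWithoutBacktracking("1.5")); // true
    }
}
```

When the match fails in this way, a replace call leaves the input unchanged, which is consistent with the "10%" result reported for 9.8.0.4.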
Ideally we should check whether a character that matches the Nth term can also match any subsequent term, allowing for the fact that some of the subsequent terms can match an empty string. For the present, however, I will fix it so that the match is considered ambiguous if the second term allows a repeat count of zero.
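The two checks described above could be sketched roughly as follows. This is not Saxon's code: the Term class, the method names, and the restriction to ASCII when probing for overlapping character classes are all assumptions made for illustration.

```java
import java.util.List;
import java.util.function.IntPredicate;

public class BoundaryCheck {
    /** Hypothetical model of a regex term: a character class and a minimum repeat count. */
    static final class Term {
        final IntPredicate chars;
        final int minOccurs;

        Term(IntPredicate chars, int minOccurs) {
            this.chars = chars;
            this.minOccurs = minOccurs;
        }

        boolean canBeEmpty() {
            return minOccurs == 0;
        }
    }

    // Assumption: probing ASCII only is enough for this illustration.
    static boolean overlaps(Term a, Term b) {
        for (int c = 0; c < 128; c++) {
            if (a.chars.test(c) && b.chars.test(c)) {
                return true;
            }
        }
        return false;
    }

    /** The "ideal" check: look past later terms for as long as they may match empty. */
    static boolean isUnambiguousIdeal(List<Term> terms, int n) {
        for (int j = n + 1; j < terms.size(); j++) {
            Term later = terms.get(j);
            if (overlaps(terms.get(n), later)) {
                return false; // a character of term n could also start term j
            }
            if (!later.canBeEmpty()) {
                return true;  // term j must consume a character, so nothing beyond it is reachable
            }
        }
        return true;
    }

    /** The interim fix: treat the boundary as ambiguous whenever the next term may be empty. */
    static boolean isUnambiguousInterim(List<Term> terms, int n) {
        if (n + 1 >= terms.size()) {
            return true;
        }
        Term next = terms.get(n + 1);
        return !next.canBeEmpty() && !overlaps(terms.get(n), next);
    }

    public static void main(String[] args) {
        Term digitsStar = new Term(Character::isDigit, 0);   // \d*
        Term optionalDot = new Term(c -> c == '.', 0);       // optional literal dot (assumed)
        Term digitsPlus = new Term(Character::isDigit, 1);   // \d+
        List<Term> terms = List.of(digitsStar, optionalDot, digitsPlus);

        System.out.println(isUnambiguousIdeal(terms, 0));   // false
        System.out.println(isUnambiguousInterim(terms, 0)); // false
    }
}
```

The interim version is more conservative: it gives up the no-backtracking optimisation for any sequence whose second term is optional, even where the ideal check would show that no later term overlaps.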
Updated by O'Neil Delpratt about 7 years ago
- % Done changed from 0 to 100
- Fix Committed on Branch 9.8 added
Updated by O'Neil Delpratt about 7 years ago
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.8.0.5 added
Bug fix applied in the Saxon 9.8.0.5 maintenance release.