Bug #3429
closed
Regular expression in fn:replace does not match (but should)
Category: XPath conformance
Fix Committed on Branch: 9.8
Fixed in Maintenance Release: 9.8.0.5
Description
Hello,
I have an XSLT stylesheet with a regex that extracts the unit from a numeric value, e.g. "10%" results in "%".
With e.g. Saxon HE 9.6.0.7 the result is as expected:
<?xml version="1.0" encoding="UTF-8"?>
<unit>%</unit>
With Saxon HE 9.8.0.4 however the result is different:
<?xml version="1.0" encoding="UTF-8"?>
<unit>10%</unit>
OS is Linux, but another user experiences the problem under Windows as well.
I attached a minimal example XSLT, which may also be used as its own input; command line is:
java -jar saxon9he.jar -s:regex_min.xsl -xsl:regex_min.xsl
Confirmed that there appears to be a regression here between 9.7 and 9.8.
Added to QT3 test suite as test fn-replace-56.
- Category set to XPath conformance
- Status changed from New to Resolved
- Assignee set to Michael Kay
The Saxon regex engine, given a sequence containing a repeatable term (\d*) followed by another term (\.?), attempts to establish whether the boundary is unambiguous: that is, whether, given a particular character in the input, it is possible to determine unambiguously whether it belongs to the first term or the second. Because this eliminates the need for backtracking, it can deliver substantial performance improvements; the test for unambiguity was therefore improved in 9.8. But it has wrongly decided that this case is unambiguous: although a digit cannot match the second term (\.), it can match the third (\d+), and the second term is allowed to be empty.
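The effect of skipping backtracking at this boundary can be reproduced outside Saxon. The sketch below uses java.util.regex, not Saxon's own engine, and the pattern is an assumed reconstruction from the three terms named above rather than the regex in the attached stylesheet. A possessive quantifier (\d*+) stands in for a matcher that has decided the boundary is unambiguous; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Assumed pattern built from the three terms discussed above.
    static final Pattern BACKTRACKING = Pattern.compile("\\d*\\.?\\d+");

    // Possessive \d*+ never gives characters back, mimicking a matcher
    // that believes no backtracking is needed at this boundary.
    static final Pattern NO_BACKTRACKING = Pattern.compile("\\d*+\\.?\\d+");

    // Removes every match of the pattern from the input.
    static String stripNumber(Pattern p, String input) {
        return p.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripNumber(BACKTRACKING, "10%"));    // %
        System.out.println(stripNumber(NO_BACKTRACKING, "10%")); // 10%
    }
}
```

With backtracking, \d* first takes "10", then gives back "0" so that \d+ can match, and the number is removed. Without it, \d* keeps both digits, \d+ finds nothing to match, and the input comes back unchanged, which is the symptom reported for 9.8.0.4.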
Ideally we should check whether a character that matches the Nth term can also match any subsequent term, allowing for the fact that some of the subsequent terms can match an empty string. For the present, however, I will fix it so that the match is considered ambiguous if the second term allows a repeat count of zero.
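The two rules described above can be contrasted in a toy model. This is only a sketch under stated assumptions: the Term class, the ASCII-only overlap test, and all method names are hypothetical and do not reflect Saxon's internal data structures.

```java
import java.util.List;
import java.util.function.IntPredicate;

public class AmbiguityCheck {
    // Hypothetical model of a regex term: a character test plus a flag
    // saying whether the term may match the empty string.
    static final class Term {
        final IntPredicate matches;
        final boolean nullable;
        Term(IntPredicate matches, boolean nullable) {
            this.matches = matches;
            this.nullable = nullable;
        }
    }

    // True if some ASCII character is accepted by both tests (toy simplification).
    static boolean overlap(IntPredicate a, IntPredicate b) {
        for (int c = 0; c < 128; c++) {
            if (a.test(c) && b.test(c)) return true;
        }
        return false;
    }

    // Interim rule: the boundary after terms[n] is ambiguous whenever the
    // next term may match zero times. Assumes n + 1 < terms.size().
    static boolean unambiguousInterim(List<Term> terms, int n) {
        Term next = terms.get(n + 1);
        return !next.nullable && !overlap(terms.get(n).matches, next.matches);
    }

    // Fuller rule: compare terms[n] with every later term that could consume
    // the next character, stopping at the first term that cannot be empty.
    static boolean unambiguousFull(List<Term> terms, int n) {
        for (int i = n + 1; i < terms.size(); i++) {
            Term later = terms.get(i);
            if (overlap(terms.get(n).matches, later.matches)) return false;
            if (!later.nullable) return true;
        }
        // Everything after terms[n] may be empty, so what follows the
        // sequence is unknown: stay conservative.
        return false;
    }

    // The three terms from this issue: \d*  \.?  \d+
    static List<Term> exampleTerms() {
        IntPredicate digit = c -> c >= '0' && c <= '9';
        IntPredicate dot = c -> c == '.';
        return List.of(new Term(digit, true), new Term(dot, true), new Term(digit, false));
    }

    public static void main(String[] args) {
        System.out.println(unambiguousInterim(exampleTerms(), 0)); // false
        System.out.println(unambiguousFull(exampleTerms(), 0));    // false
    }
}
```

Both rules report the boundary after \d* as ambiguous for this case. The interim rule is the stricter of the two: in this model it also rejects a sequence such as \d* \.? [a-z]+, which the fuller rule would accept, so it gives up some of the optimization in exchange for a simpler test.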
- % Done changed from 0 to 100
- Fix Committed on Branch 9.8 added
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.8.0.5 added
Bug fix applied in the Saxon 9.8.0.5 maintenance release.