Bug #6061

closed

Crash in SaxonCS using ";n" in regex flags

Added by Michael Kay 11 months ago. Updated 8 months ago.

Status: Closed
Priority: Low
Assignee:
Category: -
Sprint/Milestone: -
Start date: 2023-06-02
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 11, 12, trunk
Fix Committed on Branch: 11, 12, trunk
Fixed in Maintenance Release:
Platforms: .NET

Description

I’m evaluating SaxonCS 12.2. When I run a transformation using SaxonCS.exe with the attached XSLT, I get the error below. It happens when the XSLT contains a regular expression with the “;n” flag (i.e. using the .NET regex engine). In the example I use analyze-string, but I can reproduce this with other regex functions as well. It works without the “;n” flag.

<xsl:analyze-string select="." regex="a" flags=";n">

Saxon license expires in 19 days
System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Saxon.Hej.regex.JRegexIterator.next()
   at Saxon.Hej.om.FocusTrackingIterator.next()
   at Saxon.Hej.expr.instruct.AnalyzeString.AnalyzeStringElaborator.<>c__DisplayClass1_0.<elaborateForPush>b__0(Outputter out, XPathContext context)
   at Saxon.Hej.expr.instruct.Copy.CopyElaborator.<>c__DisplayClass0_0.<elaborateForPush>b__0(Outputter output, XPathContext context)
   at Saxon.Hej.expr.instruct.TemplateRule.applyLeavingTail(Outputter output, XPathContext context)
System.Exception: Internal error evaluating template rule  at line 14 in module file:///D:/Temp/saxontest/template.xslt
---> System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Saxon.Hej.regex.JRegexIterator.next()
   at Saxon.Hej.om.FocusTrackingIterator.next()
   at Saxon.Hej.expr.instruct.AnalyzeString.AnalyzeStringElaborator.<>c__DisplayClass1_0.<elaborateForPush>b__0(Outputter out, XPathContext context)
   at Saxon.Hej.expr.instruct.Copy.CopyElaborator.<>c__DisplayClass0_0.<elaborateForPush>b__0(Outputter output, XPathContext context)
   at Saxon.Hej.expr.instruct.TemplateRule.applyLeavingTail(Outputter output, XPathContext context)
   --- End of inner exception stack trace ---
   at Saxon.Hej.expr.instruct.TemplateRule.applyLeavingTail(Outputter output, XPathContext context)
   at Saxon.Hej.trans.Mode.handleRuleNotNull(Rule rule, TraceListener traceListener, XPathContextMajor context, Item item, TemplateRule previousTemplate, ParameterSet parameters, ParameterSet tunnelParameters, Outputter output)
   at Saxon.Hej.trans.Mode.applyTemplates(ParameterSet parameters, ParameterSet tunnelParameters, NodeInfo separator, Outputter output, XPathContextMajor context, Location locationId)
   at Saxon.Hej.trans.rules.ShallowCopyRuleSet.process(Item item, ParameterSet parameters, ParameterSet tunnelParams, Outputter out, XPathContext context, Location locationId)
   at Saxon.Hej.trans.Mode.applyTemplates(ParameterSet parameters, ParameterSet tunnelParameters, NodeInfo separator, Outputter output, XPathContextMajor context, Location locationId)
   at Saxon.Hej.trans.XsltController.applyTemplates(Sequence source, Receiver out)
   at Saxon.Hej.s9api.AbstractXsltTransformer.applyTemplatesToSource(Source source, Receiver out)
   at Saxon.Hej.s9api.Xslt30Transformer.applyTemplates(Source source, Destination destination)
   at Saxon.Hej.Transform.processFile(Source source, XsltExecutable sheet, File outputFile, CommandLineOptions options)
   at Saxon.Hej.Transform.doTransform(String[] args)
Fatal error during transformation: Exception: Internal error evaluating template rule  at line 14 in module file:///D:/Temp/saxontest/template.xslt
Exiting with code 2
Actions #1

Updated by Michael Kay 11 months ago

  • Description updated (diff)
Actions #2

Updated by Michael Kay 11 months ago

I have reproduced the issue.

Actions #3

Updated by Michael Kay 11 months ago

A reminder of the code structure here. When the ";n" flag is used, the code uses the class JRegexIterator, which is a transpilation to C# of the Java class of the same name, with no significant changes other than the standard ones (e.g. calling substring() with start and length rather than start and end). The class is written to call the Java regex API classes Pattern and Matcher, which are emulated in the C# product; the emulated classes call the underlying C# regex API.

What seems to be happening here is that the emulation is not quite accurate enough.

We're matching the string "Hello James" against the regex "James|Mary".

On the Java side:

  • On the first call to next(), matcher.find() returns true, start=6, end=11; current="Hello ", nextSubstring="James", prevEnd=0.
  • On the second call to next(), we set current="James", nextSubstring = null, prevEnd = matcher.end() = 11
  • On the third call to next(), matcher.find() returns false and prevEnd < theString.length() is false (11 < 11), so we return null, indicating end of sequence.

On the CS side:

  • On the first call to next(), matcher.find() returns true, start=6, end=11; current="Hello ", nextSubstring="James", prevEnd=0.
  • On the second call to next(), we set current="James", nextSubstring = null, prevEnd = matcher.end() = 11
  • On the third call to next(), we call matcher.find(), but the emulated method computes its start position as latestMatch?.Index, which is 6, the start of the previous match. We should surely add latestMatch?.Length so that the search starts at 11.
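
The Java-side sequence above can be reproduced with java.util.regex directly. The following is a minimal sketch, not the JRegexIterator source: the class name SegmentWalk and the method segments() are made up for illustration, and the loop is a simplification of the iterator's next() logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified illustration only; not the actual JRegexIterator source.
class SegmentWalk {

    // Splits the input into alternating non-matching and matching substrings,
    // in the order an analyze-string style iteration would visit them.
    static List<String> segments(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int prevEnd = 0; // end offset of the previous match
        while (matcher.find()) {
            int start = matcher.start();
            int end = matcher.end();
            if (start > prevEnd) {
                out.add(input.substring(prevEnd, start)); // non-matching text
            }
            out.add(input.substring(start, end));         // matching text
            prevEnd = end;
        }
        if (prevEnd < input.length()) {
            out.add(input.substring(prevEnd));            // trailing non-matching text
        }
        return out;
    }

    public static void main(String[] args) {
        // "James" occupies offsets 6..11 of "Hello James"
        System.out.println(segments("James|Mary", "Hello James"));
    }
}
```

For "Hello James" this yields the two segments "Hello " and "James", and the loop ends because the next find() starts after offset 11 and returns false.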

That change seems to fix it. But there could be edge cases we need to worry about; the next thing to do is to check the test cases for this area.
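
To illustrate why the start position matters, here is a hedged Java sketch of an emulated find(). The class EmulatedFind, its fields, and nextStart() are hypothetical and stand in for the C# emulation, whose source is not shown in this issue; Matcher.find(int) is used only to mimic a search that begins at an explicit offset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the emulated Matcher; all names are illustrative.
class EmulatedFind {
    private final Matcher matcher;
    private final boolean addLength; // true = the change proposed above
    private int lastIndex = -1;      // start of the latest match, -1 if none yet
    private int lastLength = 0;      // length of the latest match

    EmulatedFind(String regex, String input, boolean addLength) {
        this.matcher = Pattern.compile(regex).matcher(input);
        this.addLength = addLength;
    }

    // Returns the start offset of the next match, or -1 if there is none.
    int nextStart() {
        int from = 0;
        if (lastIndex >= 0) {
            // Without the length, the search restarts at the previous match.
            from = addLength ? lastIndex + lastLength : lastIndex;
        }
        if (!matcher.find(from)) {
            return -1;
        }
        lastIndex = matcher.start();
        lastLength = matcher.end() - matcher.start();
        return lastIndex;
    }

    public static void main(String[] args) {
        EmulatedFind old = new EmulatedFind("James|Mary", "Hello James", false);
        System.out.println(old.nextStart() + ", " + old.nextStart());     // prints "6, 6"
        EmulatedFind fixed = new EmulatedFind("James|Mary", "Hello James", true);
        System.out.println(fixed.nextStart() + ", " + fixed.nextStart()); // prints "6, -1"
    }
}
```

With the flag off, the second search finds "James" at offset 6 again. An iterator holding prevEnd=11 would then ask for a substring starting at 11 with length 6 - 11 = -5, which would be consistent with the ArgumentOutOfRangeException in the trace, although that link is an inference rather than something confirmed here. Zero-length matches are one edge case this sketch does not handle.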

Actions #4

Updated by Michael Kay 11 months ago

  • Status changed from New to Resolved
  • Applies to branch 11, 12, trunk added
  • Fix Committed on Branch 11, 12, trunk added
  • Platforms .NET added

It seems there are only a handful of unit tests for this feature; they passed before the change and they still pass after it. The feature really needs more tests.

Actions #5

Updated by O'Neil Delpratt 10 months ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 12.3 added

Bug fix applied in the Saxon 12.3 maintenance release.

Actions #6

Updated by O'Neil Delpratt 10 months ago

  • Status changed from Closed to Resolved

Leaving this issue as Resolved until the fix is released for Saxon 11.

Actions #7

Updated by Debbie Lockett 8 months ago

  • Status changed from Resolved to Closed
  • Fixed in Maintenance Release 11.6 added

Bug fix applied in the Saxon 11.6 maintenance release.
