Bug #5637

closed

More detailed message for regular expression error

Added by Octavian Nadolu about 2 months ago. Updated 23 days ago.

Status: Resolved
Priority: Low
Assignee:
Category: Diagnostics
Sprint/Milestone: -
Start date: 2022-08-08
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch: 11, trunk
Fix Committed on Branch: 11, trunk
Fixed in Maintenance Release:
Platforms: .NET, Java

Description

If I validate the following XSL with Saxon 11.4, I get this error message:

"Error in regular expression: Incorrect syntax for Java regular expression".

I would expect the message to also include the reason for the regex error, something like this:

"Error in regular expression: Incorrect syntax for Java regular expression: Unclosed group near index 7 ((.+)"

Perhaps the "throw new XPathException" statement in the JavaRegularExpression class could be changed to something like this:

throw new XPathException("Incorrect syntax for Java regular expression: " + e.getMessage(), e);

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <xsl:template match="/">
        <xsl:analyze-string select="." regex=" ((\.+)" flags=";j">
            <xsl:matching-substring/>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>
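
For reference, the detail requested above is what the JDK itself reports when it rejects the pattern. The following is a minimal standalone sketch using only java.util.regex, not Saxon code; the class name RegexMessageDemo and the helper nativeDiagnostic are made up for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexMessageDemo {

    /**
     * Returns the JDK's diagnostic for an invalid pattern,
     * or null if the pattern compiles. Hypothetical helper, not part of Saxon.
     */
    public static String nativeDiagnostic(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            // getMessage() is multi-line: description and index, then the pattern,
            // then (when the index falls inside the pattern) a caret line.
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The regex attribute from the stylesheet above, including its leading space.
        System.out.println(nativeDiagnostic(" ((\\.+)"));
    }
}
```

The leading space in the regex attribute is part of the pattern, which is why the reported index counts it.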
Actions #1

Updated by Michael Kay 23 days ago

The code currently does

throw new XPathException("Incorrect syntax for Java regular expression", e);

so we have the Java exception as a nested exception.

In XPathException.getMessage() we expand the message with the message from the underlying cause exception, so getMessage() returns

 Incorrect syntax for Java regular expression: Dangling meta character '*' near index 0
**
^

(3 lines)

The StandardErrorReporter calls getExpandedMessage(), which calls formatNestedMessages(). Because the Java exception is a RuntimeException, this unfortunately adds a stack trace to the message, which we really don't want in this case.

In fact, we end up with three copies of the nested message being output, as well as the stack trace.

In the light of this, it seems best to avoid including the nested exception as a cause, and only include its message, so we're left with

throw new XPathException("Incorrect syntax for Java regular expression: " + e.getMessage());

which produces the output:

Error on line 1 column 9 of file:/Users/mike/GitHub/saxon2020-saxon11/src/test/xmark/:
   Incorrect syntax for Java regular expression: Dangling meta character '*' near index 0
**
^

(4 lines)

I think we should also add the error code FORX0001.
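
The message-only approach described above can be sketched in isolation as follows. This is an illustration under assumptions, not the committed Saxon change: DemoXPathError is a hypothetical stand-in for XPathException, and the errorCode field is a simplification of however the real class carries its code. The point is only that the native message is copied into the text while no cause is attached, so a reporter walking the cause chain has nothing further to print.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class MessageOnlyWrapDemo {

    /** Hypothetical stand-in for an XPath error; NOT Saxon's XPathException. */
    public static class DemoXPathError extends Exception {
        public final String errorCode;

        public DemoXPathError(String message, String errorCode) {
            super(message); // no cause is passed, so getCause() stays null
            this.errorCode = errorCode;
        }
    }

    /** Compiles a native Java regex, converting syntax errors to DemoXPathError. */
    public static Pattern compile(String regex) throws DemoXPathError {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            // Keep only the text of the native exception, not the exception itself.
            throw new DemoXPathError(
                    "Incorrect syntax for Java regular expression: " + e.getMessage(),
                    "FORX0001"); // error code as proposed in the comment above
        }
    }

    public static void main(String[] args) {
        try {
            compile("**");
        } catch (DemoXPathError err) {
            System.out.println(err.errorCode + "  " + err.getMessage());
            System.out.println("cause: " + err.getCause());
        }
    }
}
```

The trade-off is that the original stack trace is no longer reachable from the thrown error, which is acceptable here because the fault lies in the user's regex rather than in the processor.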

Actions #2

Updated by Michael Kay 23 days ago

I was looking at this in the context of an invalid regex in fn:matches(). The xsl:analyze-string case is a little different.

With the supplied example, and with my changes applied, we're getting

Error in xsl:analyze-string/@regex on line 4 column 65 of test.xsl:
  XTDE1140  Error in regular expression: Incorrect syntax for Java regular expression:
  Unclosed group near index 7
 ((\.+)
Errors were reported during JIT compilation of template rule with match="/"

The phrase "Error in regular expression" seems redundant here, but I think I'll live with it, because the ";j" case is a bit unusual. But I will change the word "Java" to "native" so it works on both Java and C#.

Actions #3

Updated by Michael Kay 23 days ago

  • Category set to Diagnostics
  • Status changed from New to Resolved
  • Assignee set to Michael Kay
  • Applies to branch 11, trunk added
  • Fix Committed on Branch 11, trunk added
  • Platforms .NET, Java added

Running these test cases in 12.x also revealed some opportunities to improve the diagnostics there.
