Bug #6022

closed

Eager evaluation prematurely throws exception in analyze-string/non-matching-string.

Added by Paul Merchant over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: Internals
Sprint/Milestone: -
Start date: 2023-05-08
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 12, trunk
Fix Committed on Branch: 12, trunk
Fixed in Maintenance Release:
Platforms: .NET, Java

Description

The following transformation should produce 22 files in the /tmp/xml directory when run with itself as the input file. However, after generating the first 21 files it throws an exception reporting that the calculated path for the 22nd file does not match the regular expression given to analyze-string.

Command:

java -classpath Saxon-HE-12.2.jar:xmlresolver-5.1.1.jar net.sf.saxon.Transform -s:xsl.xsl -xsl:xsl.xsl

XSLT file (xsl.xsl):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    xmlns:lfn="local"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="fn lfn xs"
    version="3.0">

  <xsl:output indent="yes"/>
  
  <xsl:template match="/">
    <xsl:for-each select="1 to 22">      
      <xsl:result-document href="file://{lfn:path(xs:string(.))}" method="xml" indent="yes" >
        <result>          
          <iter><xsl:value-of select="."/></iter>
        </result>
      </xsl:result-document>
    </xsl:for-each>    
  </xsl:template>


  <xsl:function name="lfn:path" as="xs:string">
    <xsl:param name="iter" as="xs:string"/>
    
    <xsl:variable name="path" select="'/tmp/xml/doc-' || $iter || '.xml'"/>
    
    <xsl:variable name="regex">^/tmp/xml/doc-\d+\.xml$</xsl:variable>

    <!-- split the path into protocol/host, directory path, and query string components-->

    <xsl:analyze-string select="$path" regex="{$regex}">
      <xsl:matching-substring>
        <xsl:value-of select="$path"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="fn:error(fn:QName('error', 'err:path-syntax'), 'The path &quot;' || $path || '&quot; does not match regex &quot;' || $regex || '&quot;')"/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>
  
</xsl:stylesheet>


Related issues

Has duplicate Saxon - Bug #6026: Unexpected function call triggers cardinality check and error (Duplicate, 2023-05-09)

Has duplicate Saxon - Bug #6082: Saxon 12 XQuery: Issue with compile-time error checking (Closed, Michael Kay, 2023-06-19)

Actions #1

Updated by Paul Merchant over 1 year ago

I should note that this issue also appears in releases 12.0 and 12.1 but is not present in 11.5.

Actions #2

Updated by Michael Kay over 1 year ago

I've reproduced this running SaxonJ-HE 12.2 from the command line, but it works fine when running SaxonJ-EE in IntelliJ.

Which makes it a bit tricky to debug...

Actions #3

Updated by Michael Kay over 1 year ago

For the record, it also works in SaxonJ-EE 12.2 when run from the command line with a license file, but fails without one.

Now reproduced in IntelliJ by renaming my license file so I'm effectively running without one.

Actions #4

Updated by Michael Kay over 1 year ago

Noted that if I change the XSLT to say (1 to 20) I get no failure; if I change it to (1 to 24) then it still fails on file 22. So it's not just because it's the last one. Weird.

What seems to be happening is that the call on error() is loop-lifted because it has no dependencies on the context item. The execution strategy changes after the first 20 evaluations (or so) because (since Saxon 12) we're choosing between lazy and eager evaluation based on experience. But an expression that's been loop lifted in this way should force lazy evaluation to avoid spurious errors.

No idea why this is happening in HE and not EE; probably EE does some other optimization which takes things down a different path by chance.
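
To illustrate why a loop-lifted error() call must be bound lazily, here is a minimal Java sketch. It is not Saxon code: the class and method names (LoopLiftSketch, raiseError, analyzeLazily, analyzeEagerly) are hypothetical, and a plain Supplier stands in for a lazily evaluated variable.

```java
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names exist in Saxon.
final class LoopLiftSketch {

    // Stand-in for the lifted error() call: evaluating it always fails.
    static String raiseError() {
        throw new IllegalStateException("path does not match regex");
    }

    // Lazy binding: the lifted expression is wrapped and only evaluated
    // if the non-matching branch actually asks for its value.
    static String analyzeLazily(String path, String regex) {
        Supplier<String> lifted = LoopLiftSketch::raiseError;
        return path.matches(regex) ? path : lifted.get();
    }

    // Eager binding: the lifted expression is evaluated up front, so the
    // error surfaces even when the path matches the regex.
    static String analyzeEagerly(String path, String regex) {
        String lifted = raiseError();
        return path.matches(regex) ? path : lifted;
    }
}
```

With a matching path such as /tmp/xml/doc-22.xml, analyzeLazily returns the path, while analyzeEagerly throws before the regex is ever tested, which mirrors the spurious failure reported here.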

Actions #5

Updated by Michael Kay over 1 year ago

Here's the -explain optimizer trace:

OPT : At line 29 of file:/Users/mike/bugs/2023/6022-Merchant/test.xsl
OPT : Inlined constant variable regex
OPT : Expression after rewrite: ^/tmp/xml/doc-\d+\.xml$
OPT : At line 38 of file:/Users/mike/bugs/2023/6022-Merchant/test.xsl
OPT : Lifted (fn:string-join(...)) above (analyzeString) on line 33
OPT : Expression after rewrite: let $Q{http://saxon.sf.net/generated-variable}v0 := string-join(convertTo_xs:string(data(error(QName(error, err:path-syntax), concat(The path ", $path, " does not match regex ", ^/tmp/xml/doc-\d+\.xml$, ")))),  ) return (exactly-one(convertTo_xs:string(data(AnalyzeString($path, ^/tmp/xml/doc-\d+\.xml$, , ValueOf($path), ValueOf($Q{http://saxon.sf.net/generated-variable}v0))))))

What should happen now is that the generated variable v0 is flagged as requiring lazy evaluation. Somewhere along the line this flag is being lost or ignored.

Actions #6

Updated by Michael Kay over 1 year ago

I've changed the lazily() method of Elaborator to pass a parameter indicating that lazy evaluation is required, but this isn't enough to fix the problem.

What seems to be happening is that we have in effect a LetExpression with two variables:

let $path := concat('/tmp/xml/doc-', $iter, '.xml'),
     $v0 := string-join(error(), ...)
return analyze-string(...)

and because it's decided to evaluate the first variable eagerly (which is reasonable) it's also evaluating the second one eagerly (which isn't).

Actions #7

Updated by Michael Kay over 1 year ago

Changing LetExprElaborator.eagerly() to invoke LetExprElaborator.lazily() when the needsLazyEvaluation flag is set solves the problem.

Regression testing.
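
The shape of this fix can be sketched in a few lines of Java. This is an assumption-laden miniature, not the actual LetExprElaborator: the Binding class and the Supplier-based lazily()/eagerly() signatures are invented for illustration, and only the needsLazyEvaluation flag name is taken from the comment above.

```java
import java.util.function.Supplier;

// Hypothetical miniature of the delegation described above; not Saxon code.
final class LetElaboratorSketch {

    // A let-binding: an expression plus the flag the optimizer would set
    // when it lifts the expression out of a conditional branch.
    static final class Binding {
        final Supplier<String> expression;
        final boolean needsLazyEvaluation;

        Binding(Supplier<String> expression, boolean needsLazyEvaluation) {
            this.expression = expression;
            this.needsLazyEvaluation = needsLazyEvaluation;
        }
    }

    // Lazy strategy: defer evaluation until the value is requested.
    static Supplier<String> lazily(Binding b) {
        return b.expression;
    }

    // Eager strategy: evaluate now and hand back the computed value,
    // unless the binding is flagged, in which case delegate to lazily().
    static Supplier<String> eagerly(Binding b) {
        if (b.needsLazyEvaluation) {
            return lazily(b);
        }
        String value = b.expression.get();
        return () -> value;
    }
}
```

In this sketch an unflagged binding such as $path is still evaluated immediately, while a flagged binding such as $v0 is only evaluated if the returned Supplier is actually called.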

Actions #8

Updated by Michael Kay over 1 year ago

Seeing some regression:

as: 3, catalog: 2, choose: 2, xslt-compat: 1

as-1303 fails with:

java.lang.ClassCastException: class net.sf.saxon.value.MemoClosure cannot be cast to class net.sf.saxon.om.GroundedValue (net.sf.saxon.value.MemoClosure and net.sf.saxon.om.GroundedValue are in unnamed module of loader 'app')
	at net.sf.saxon.expr.parser.ExpressionTool.eagerEvaluate(ExpressionTool.java:195)
	at net.sf.saxon.expr.instruct.GlobalVariable.getSelectValue(GlobalVariable.java:645)
	at net.sf.saxon.expr.instruct.GlobalVariable.actuallyEvaluate(GlobalVariable.java:739)
	at net.sf.saxon.expr.instruct.GlobalVariable.evaluateVariable(GlobalVariable.java:708)

Fixed by changing ExpressionTool.eagerEvaluate() to call Sequence.materialize() rather than assuming the returned Sequence will be a GroundedValue.
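
The difference between casting and materializing can be shown with a small self-contained Java sketch. The types below (Sequence, Grounded, Memo) are hypothetical stand-ins that only loosely mirror Sequence, GroundedValue and MemoClosure; they are not the Saxon classes.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical miniature of the two kinds of value; not Saxon code.
final class MaterializeSketch {

    interface Sequence {
        Grounded materialize();
    }

    // A fully evaluated, in-memory value.
    static final class Grounded implements Sequence {
        final List<String> items;
        Grounded(List<String> items) { this.items = items; }
        public Grounded materialize() { return this; }
    }

    // A deferred value that evaluates once on demand and remembers the result.
    static final class Memo implements Sequence {
        private final Supplier<List<String>> source;
        private Grounded cached;
        Memo(Supplier<List<String>> source) { this.source = source; }
        public Grounded materialize() {
            if (cached == null) {
                cached = new Grounded(source.get());
            }
            return cached;
        }
    }

    // Fragile: assumes evaluation always yields a grounded value.
    static Grounded eagerByCast(Sequence s) {
        return (Grounded) s;
    }

    // Robust: works for grounded and deferred values alike.
    static Grounded eagerByMaterialize(Sequence s) {
        return s.materialize();
    }
}
```

Passing a Memo to eagerByCast throws a ClassCastException analogous to the one in the trace above, whereas eagerByMaterialize forces the deferred value and returns it.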

Actions #9

Updated by Michael Kay over 1 year ago

  • Category set to Internals
  • Status changed from New to Resolved
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal
  • Applies to branch trunk added
  • Fix Committed on Branch 12, trunk added
  • Platforms .NET added

QT4 tests running OK.

I'll commit this fix.

Actions #12

Updated by Michael Kay over 1 year ago

  • Has duplicate Bug #6026: Unexpected function call triggers cardinality check and error added

Actions #13

Updated by Michael Kay over 1 year ago

  • Has duplicate Bug #6082: Saxon 12 XQuery: Issue with compile-time error checking added

Actions #14

Updated by O'Neil Delpratt over 1 year ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 12.3 added

Bug fix applied in the Saxon 12.3 maintenance release.
