Project

Profile

Help

Bug #4786

accumulator value of root node is wrong when not using streaming

Added by Martin Honnen about 1 year ago. Updated 7 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
XSLT conformance
Sprint/Milestone:
-
Start date:
2020-10-07
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
10, 9.9, trunk
Fix Committed on Branch:
10, 9.9, trunk
Fixed in Maintenance Release:
Platforms:

Description

It seems that Saxon Java (tested with Saxon 10.2 HE and EE) wrongly computes the accumulator-before value of the root node of a tree e.g. for the example document

<?xml version="1.0" encoding="UTF-8"?>
<foo/>

and the sample stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">
    
    <xsl:output indent="yes" omit-xml-declaration="yes"/>
    
    <xsl:accumulator name="a1" as="xs:string" initial-value="'init'" streamable="yes">
        <xsl:accumulator-rule match="document-node()" select="$value || ', ' || 'matched root /'"/>
        <xsl:accumulator-rule match="node()" select="$value || ', ' || 'matched node /' || ancestor-or-self::node()/node-name() => string-join('/')"/>
    </xsl:accumulator>
    
    <xsl:param name="p1">
        <foo/>
    </xsl:param>
    
    <xsl:template name="xsl:initial-template">
        <xsl:source-document href="sample1.xml" streamable="yes" use-accumulators="a1">
            <xsl:apply-templates select="." mode="accu-test"/>
        </xsl:source-document>
    </xsl:template>
    
    <xsl:mode name="accu-test" streamable="yes"/>
    
    <xsl:template match="." mode="accu-test">
        <processed node="/{ancestor-or-self::node()/node-name() => string-join('/')}" accumulator-value="{accumulator-before('a1')}"/>
        <xsl:apply-templates mode="#current"/>
    </xsl:template>
    
</xsl:stylesheet>

processing with streaming gives the (in my view correct) result

<processed node="/" accumulator-value="init, matched root /"/>
<processed node="/foo" accumulator-value="init, matched root /, matched node /foo"/>

while processing without streaming gives the wrong result for the value of the accumulator on the root node:

<processed node="/" accumulator-value="init"/>
<processed node="/foo" accumulator-value="init, matched root /, matched node /foo"/>

Somehow the value output for the root node is only the initial value but the matching rule's select expression doesn't seem to be applied when giving the accumulator value for the root, although for the child node the value added by the rule for the root node is included.

History

#1 Updated by Michael Kay about 1 year ago

I suspect this is a consequence of the buffering that is performed when processing the start of a streamed document. In order to ensure that the expression N instance of Tis streamable, including the caseN instance of document-node(element(E)), the startDocument event is not processed until the first element start tag is encountered. This also ensures that the function unparsed-entity-uri()works at all times, including while processing the document start. This buffering means that comments and processing instructions encountered before the first element are accumulated in memory before dealing with thestartDocument()` event.

#2 Updated by Martin Honnen about 1 year ago

The issue is, that in my view, Saxon Java gets it right with streaming but wrong without streaming.

I am not quite sure about your last comment, it seems to explain why with streaming a slightly different processing approach might produce a different result.

But Saxon-JS 2, for instance, which doesn't support streaming, gives the result

<processed node="/" accumulator-value="init, matched root /"/>
<processed node="/foo" accumulator-value="init, matched root /, matched node /foo"/>

that is, the same result that Saxon Java gives with streaming.

Thus, this also suggest the problem is in the non-streaming accumulator implementation of Saxon Java.

#3 Updated by Michael Kay about 1 year ago

I'm just working on it now, as it happens, and yes, you are right.

Added test cases accumulator-087 and accumulator-087s for the unstreamed and streamed cases respectively.

#4 Updated by Michael Kay about 1 year ago

The problem is that in the AccumulatorData data structure, we actually store two accumulator values associated with the "before document-node" event: the first is the initial value, the second is the value after processing the accumulator rule that matches the document node. We store the accumulator values for the document as an ordered list and find the appropriate value using a binary-chop algorithm. When there are two values with the same key this algorithm can choose either of the two values unpredictably and with this dataset it is choosing the wrong one.

#5 Updated by Michael Kay about 1 year ago

Fixed by adding to AccumulatorData.processRule():

        if (node.getNodeKind() == Type.DOCUMENT && !isPostDescent && values.size() == 1) {
            // Overwrite the accumulator's initial value with the "before document start" value. Bug 4786.
            values.clear();
        }

#6 Updated by Michael Kay about 1 year ago

  • Status changed from New to Resolved
  • Applies to branch 9.9, trunk added
  • Fix Committed on Branch 10, 9.9, trunk added

Fixed on 9.9, 10.x, and 11.x branches

#7 Updated by Martin Honnen about 1 year ago

Does that fix also help if the root node is an element node?

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">
    
  <xsl:output indent="yes" omit-xml-declaration="yes"/>
    
  <xsl:accumulator name="a1" as="xs:string" initial-value="'init'">
      <xsl:accumulator-rule match="foo" select="$value || ', matched ' || path()"/>
  </xsl:accumulator>
    
  <xsl:param name="p1">
      <foo/>
  </xsl:param>
    
  <xsl:param name="p2" as="element(foo)">
      <foo/>
  </xsl:param>
    
  <xsl:template name="xsl:initial-template">
      <xsl:apply-templates select="$p1/node(), $p2"/>
  </xsl:template>
       
  <xsl:template match="foo">
      <foo-processed accumulator-value="{accumulator-before('a1')}" 
                     root="{node-name(root())}"
                     path="{path()}" instance-of-foo="{. instance of element(foo)}"/>
  </xsl:template>
    
</xsl:stylesheet>

Saxon-JS 2 gives me

<foo-processed accumulator-value="init, matched /Q{}foo[1]" root="" path="/Q{}foo[1]" instance-of-foo="true"/>
<foo-processed accumulator-value="init, matched /Q{}foo[1]" root="foo" path="/Q{}foo[1]" instance-of-foo="true"/>

while Saxon Java gives me

<foo-processed accumulator-value="init, matched /Q{}foo[1]"
               root=""
               path="/Q{}foo[1]"
               instance-of-foo="true"/>
<foo-processed accumulator-value="init"
               root="foo"
               path="Q{http://www.w3.org/2005/xpath-functions}root()"
               instance-of-foo="true"/>

So the patch would also need to cover a root node being an element node, not only a document-node.

#8 Updated by Martin Honnen about 1 year ago

Would

        if ((node.getNodeKind() == Type.DOCUMENT || (node.getNodeKind() == Type.ELEMENT && node.getParent() == null))&& !isPostDescent && values.size() == 1) {

cover the case of element root nodes?

#9 Updated by Michael Kay about 1 year ago

  • Status changed from Resolved to In Progress

#10 Updated by Michael Kay 11 months ago

  • Status changed from In Progress to Resolved

This was resolved a while ago, but there are a couple of glitches:

(a) test accumulator-068 is failing but that turns out to be unrelated to this bug

(b) the Saxon-JS results are incorrect, because path() applied to a node in a tree not rooted at a document is giving the wrong result.

#11 Updated by O'Neil Delpratt 7 months ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 10.5 added

Bug fix applied to Saxon 10.5 maintenance release.

Please register to edit this issue

Also available in: Atom PDF