Project

Profile

Help

Bug #4786

accumulator value of root node is wrong when not using streaming

Added by Martin Honnen 15 days ago. Updated 7 days ago.

Status:
In Progress
Priority:
Normal
Assignee:
-
Category:
XSLT conformance
Sprint/Milestone:
-
Start date:
2020-10-07
Due date:
% Done:

0%

Estimated time:
Legacy ID:
Applies to branch:
10, 9.9, trunk
Fix Committed on Branch:
10, 9.9, trunk
Fixed in Maintenance Release:

Description

It seems that Saxon Java (tested with Saxon 10.2 HE and EE) wrongly computes the accumulator-before value of the root node of a tree e.g. for the example document

<?xml version="1.0" encoding="UTF-8"?>
<foo/>

and the sample stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">
    
    <xsl:output indent="yes" omit-xml-declaration="yes"/>
    
    <xsl:accumulator name="a1" as="xs:string" initial-value="'init'" streamable="yes">
        <xsl:accumulator-rule match="document-node()" select="$value || ', ' || 'matched root /'"/>
        <xsl:accumulator-rule match="node()" select="$value || ', ' || 'matched node /' || ancestor-or-self::node()/node-name() => string-join('/')"/>
    </xsl:accumulator>
    
    <xsl:param name="p1">
        <foo/>
    </xsl:param>
    
    <xsl:template name="xsl:initial-template">
        <xsl:source-document href="sample1.xml" streamable="yes" use-accumulators="a1">
            <xsl:apply-templates select="." mode="accu-test"/>
        </xsl:source-document>
    </xsl:template>
    
    <xsl:mode name="accu-test" streamable="yes"/>
    
    <xsl:template match="." mode="accu-test">
        <processed node="/{ancestor-or-self::node()/node-name() => string-join('/')}" accumulator-value="{accumulator-before('a1')}"/>
        <xsl:apply-templates mode="#current"/>
    </xsl:template>
    
</xsl:stylesheet>

processing with streaming gives the (in my view correct) result

<processed node="/" accumulator-value="init, matched root /"/>
<processed node="/foo" accumulator-value="init, matched root /, matched node /foo"/>

while processing without streaming gives the wrong result for the value of the accumulator on the root node:

<processed node="/" accumulator-value="init"/>
<processed node="/foo" accumulator-value="init, matched root /, matched node /foo"/>

Somehow the value output for the root node is only the initial value but the matching rule's select expression doesn't seem to be applied when giving the accumulator value for the root, although for the child node the value added by the rule for the root node is included.

History

#1 Updated by Michael Kay 12 days ago

I suspect this is a consequence of the buffering that is performed when processing the start of a streamed document. In order to ensure that the expression N instance of Tis streamable, including the caseN instance of document-node(element(E)), the startDocument event is not processed until the first element start tag is encountered. This also ensures that the function unparsed-entity-uri()works at all times, including while processing the document start. This buffering means that comments and processing instructions encountered before the first element are accumulated in memory before dealing with thestartDocument()` event.

#2 Updated by Martin Honnen 7 days ago

The issue is, that in my view, Saxon Java gets it right with streaming but wrong without streaming.

I am not quite sure about your last comment, it seems to explain why with streaming a slightly different processing approach might produce a different result.

But Saxon-JS 2, for instance, which doesn't support streaming, gives the result

<processed node="/" accumulator-value="init, matched root /"/>
<processed node="/foo" accumulator-value="init, matched root /, matched node /foo"/>

that is, the same result that Saxon Java gives with streaming.

Thus, this also suggest the problem is in the non-streaming accumulator implementation of Saxon Java.

#3 Updated by Michael Kay 7 days ago

I'm just working on it now, as it happens, and yes, you are right.

Added test cases accumulator-087 and accumulator-087s for the unstreamed and streamed cases respectively.

#4 Updated by Michael Kay 7 days ago

The problem is that in the AccumulatorData data structure, we actually store two accumulator values associated with the "before document-node" event: the first is the initial value, the second is the value after processing the accumulator rule that matches the document node. We store the accumulator values for the document as an ordered list and find the appropriate value using a binary-chop algorithm. When there are two values with the same key this algorithm can choose either of the two values unpredictably and with this dataset it is choosing the wrong one.

#5 Updated by Michael Kay 7 days ago

Fixed by adding to AccumulatorData.processRule():

        if (node.getNodeKind() == Type.DOCUMENT && !isPostDescent && values.size() == 1) {
            // Overwrite the accumulator's initial value with the "before document start" value. Bug 4786.
            values.clear();
        }

#6 Updated by Michael Kay 7 days ago

  • Status changed from New to Resolved
  • Applies to branch 9.9, trunk added
  • Fix Committed on Branch 10, 9.9, trunk added

Fixed on 9.9, 10.x, and 11.x branches

#7 Updated by Martin Honnen 7 days ago

Does that fix also help if the root node is an element node?

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">
    
  <xsl:output indent="yes" omit-xml-declaration="yes"/>
    
  <xsl:accumulator name="a1" as="xs:string" initial-value="'init'">
      <xsl:accumulator-rule match="foo" select="$value || ', matched ' || path()"/>
  </xsl:accumulator>
    
  <xsl:param name="p1">
      <foo/>
  </xsl:param>
    
  <xsl:param name="p2" as="element(foo)">
      <foo/>
  </xsl:param>
    
  <xsl:template name="xsl:initial-template">
      <xsl:apply-templates select="$p1/node(), $p2"/>
  </xsl:template>
       
  <xsl:template match="foo">
      <foo-processed accumulator-value="{accumulator-before('a1')}" 
                     root="{node-name(root())}"
                     path="{path()}" instance-of-foo="{. instance of element(foo)}"/>
  </xsl:template>
    
</xsl:stylesheet>

Saxon-JS 2 gives me

<foo-processed accumulator-value="init, matched /Q{}foo[1]" root="" path="/Q{}foo[1]" instance-of-foo="true"/>
<foo-processed accumulator-value="init, matched /Q{}foo[1]" root="foo" path="/Q{}foo[1]" instance-of-foo="true"/>

while Saxon Java gives me

<foo-processed accumulator-value="init, matched /Q{}foo[1]"
               root=""
               path="/Q{}foo[1]"
               instance-of-foo="true"/>
<foo-processed accumulator-value="init"
               root="foo"
               path="Q{http://www.w3.org/2005/xpath-functions}root()"
               instance-of-foo="true"/>

So the patch would also need to cover a root node being an element node, not only a document-node.

#8 Updated by Martin Honnen 7 days ago

Would

        if ((node.getNodeKind() == Type.DOCUMENT || (node.getNodeKind() == Type.ELEMENT && node.getParent() == null))&& !isPostDescent && values.size() == 1) {

cover the case of element root nodes?

#9 Updated by Michael Kay 7 days ago

  • Status changed from Resolved to In Progress

Please register to edit this issue

Also available in: Atom PDF