Project

Profile

Help

Bug #5560

closed

intersect doesn’t seem to work correctly for attributes

Added by Gerrit Imsieke over 2 years ago. Updated about 2 years ago.

Status:
Closed
Priority:
High
Assignee:
-
Category:
-
Sprint/Milestone:
Start date:
2022-06-11
Due date:
% Done:

100%

Estimated time:
Applies to JS Branch:
2, Trunk
Fix Committed on JS Branch:
2, Trunk
Fixed in JS Release:
SEF Generated with:
Platforms:
Company:
-
Contact person:
-
Additional contact persons:
-

Description

While developping XPathle, I stumbled across SaxonJS behaviour that was different than in the Java version.

Consider the commented-out code at https://github.com/gimsieke/xpathle/blob/main/xpathle-lib.xsl#L184

If you use the commented-out lines that contain intersect instead of the preceding two xsl:when ⁠s and recompile the SEF and use the example “XSLT 2.0 for postprocessing a CSS parse tree” and enter, for example, //@as as your guess, the as attributes will not be highlighted.

Comparing generated IDs fixed it for me, and this can immediately be tested on https://gimsieke.github.io/xpathle/

This issue hit me with the v1.2 SEF that Saxon EE 9.9 generated, and I therefore purchased and used Saxon EE 11 for generating an .sef.json, which unfortunately didn’t fix the issue. (I don’t regret buying a new EE version though.)


Files

sample2.xml (68 Bytes) sample2.xml Martin Honnen, 2022-06-16 10:47
sheet2.xsl (814 Bytes) sheet2.xsl Martin Honnen, 2022-06-16 10:47
sheet2.xsl.saxoncs-export.sef.json (2.93 KB) sheet2.xsl.saxoncs-export.sef.json Martin Honnen, 2022-06-16 10:47
sheet2.xsl.saxonjs-export.sef.json (6.38 KB) sheet2.xsl.saxonjs-export.sef.json Martin Honnen, 2022-06-16 10:47
Actions #2

Updated by Norm Tovey-Walsh over 2 years ago

  • Priority changed from Low to High
  • Sprint/Milestone set to SaxonJS 3.0
Actions #3

Updated by Martin Honnen over 2 years ago

I have tried to understand what could go wrong, I have not been able to identify what goes wrong, but so far all my attempts suggest that the problem only occurs if SaxonJS runs a SaxonEE exported stylesheet sef.json stylesheet. On the fly or with a SaxonJS/XX compiler generated sef.json stylesheet the problem does not occur.

Sample stylesheet is:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:mode on-no-match="shallow-copy"/>
  
  <xsl:param name="guess-path" select="//@id"/>

  <xsl:template match="@*">
    <xsl:next-match/>
    <xsl:if test=". intersect $guess-path">
      <xsl:attribute name="highlight">test</xsl:attribute>
    </xsl:if>
  </xsl:template>
  
  <xsl:template match="/" name="xsl:initial-template">
    <xsl:next-match/>
    <xsl:comment>Run with {system-property('xsl:product-name')} {system-property('xsl:product-version')} {system-property('Q{http://saxon.sf.net/}platform')}</xsl:comment>
  </xsl:template>
  
</xsl:stylesheet>

Now when I run it directly or with xslt3 (of SaxonJS 2.4) compiled sef.json through xslt3 it gives the right output:

<?xml version="1.0" encoding="UTF-8"?><root id="el1" highlight="test" class="foo">
  <item id="el2" highlight="test" class="bar"/>
</root><!--Run with SaxonJS 2.4 Node.js-->

However, when I use SaxonCS or EE to compile for SaxonJS, I get the wrong output:

<?xml version="1.0" encoding="UTF-8"?><root id="el1" class="foo">
  <item id="el2" class="bar"/>
</root><!--Run with SAXON JS 11.3 Node.js-->

Input sample is e.g.

<root id="el1" class="foo">
  <item id="el2" class="bar"/>
</root>

I am afraid I am not skilled enough to identify whether the SaxonCS/EE generated sef.json is wrong (it doesn't seem to contain an intersect, for instance) or whether the SaxonJS runtime can't handle the perhaps optimized SaxonCS/EE compiled sef.json for that sample.

Actions #4

Updated by Gerrit Imsieke over 2 years ago

Thanks Martin for investigating and for preparing a minimal test case!

Actions #5

Updated by Martin Honnen over 2 years ago

SaxonCS/EE seems to compile the intersect check into something like exists(among(..)) while SaxonJS seems to compile intersect in the XSLT/XPath into intersect in the SEF as well. But I have now tested that a template matching element nodes instead of attribute nodes and using the same intersect test gives the same SEF compilation with exists(among(..)) with SaxonCS/EE but that SaxonJS executes that test fine with element nodes. The SaxonJS among function in the end seems to do === which I think, in terms of JavaScript objects like nodes should do an identity check but from the minimized/garbled source I am not able to see/figure what kind of arguments are processed.

Actions #6

Updated by Martin Honnen over 2 years ago

In terms of among with element nodes it indeed seems that two DOM element nodes end up being compared with ===, but as SaxonJS seems to handle attribute nodes differently it appears some wrapper/closure around the node ends up being compared and these wrappers/closures, even if having the same attribute node (data) inside (e.g. id="el1") compared with === are not the same objects so the comparison returns false. That seems to be basically the reason why the SaxonCS/EE compiled test=". intersect $nodes" fails with . being attribute node and attribute nodes being in $nodes, it does some comparison with among doing some === comparison failing to identify identical attributes as not a DOM attribute node (reference) is compared with another DOM attribute node reference but two SaxonJS data structures that are not the same object but reference the same attribute fail the === comparison.

Actions #7

Updated by Martin Honnen over 2 years ago

Replacing the code in among doing return S===T with return S instanceof Node ? S===T : S.parent === T.parent && S.name === T.name && S.namespaceURI === T.namespaceURI} seems to fix the attribute handling. But I have no idea whether it breaks other things or which other type of wrappers for which the second part after the colon doesn't make sense could be passed to the function.

Actions #8

Updated by Michael Kay over 2 years ago

The basic problem is that we are using a === b to compare nodes for identity, and this doesn't work for attributes because the attribute axis constructs attribute nodes "on the fly". We should be using DomUtils.isSameNode(a, b).

The "among" operator is basically an optimisation for a one-to-many intersect. This is most commonly encountered in a boolean context

if (X intersect (P, Q, R))

and the main benefit of the optimisation is to avoid sorting nodes into document order. The XJ compiler does this optimisation; XX does not.

Actions #9

Updated by Michael Kay over 2 years ago

Come to think of it, the code for "among" has another bug. If we don't sort the sequence into document order, and don't eliminate duplicates, there's a danger that (P intersect (P, Q, R, P)) is going to return (P, P). rather than (P).

Added a QT3 test case for this: -s:op-intersect -t:K2-SeqIntersect-48

SaxonJ is passing the test because it effectively rewrites the expression as

((P, Q, R, P)[. is P][1])

That is, it knows the expression can return at most one node so it stops processing the input after one node has been found.

Actions #10

Updated by Michael Kay over 2 years ago

I'm inclined to fix this by scrapping the SingletonIntersectExpr ("among" operator) in SaxonJ, rewriting instead as

let $x := §ARG0§ return head(§ARG1§[. is $x])

and dropping support for "among" in SaxonJS 3 (SEF files will probably need to be recompiled for SaxonJS 3 anyway).

Reducing the number of instructions is always good, as each instruction now carries a lot of code and potential bugs (bytecode, streaming, etc).

Actions #11

Updated by Norm Tovey-Walsh over 2 years ago

  • Sprint/Milestone changed from SaxonJS 3.0 to SaxonJS 2.5
Actions #12

Updated by Norm Tovey-Walsh about 2 years ago

  • Sprint/Milestone changed from SaxonJS 2.5 to SaxonJS 3.0
  • Applies to JS Branch deleted (1.0, 2)

I'm moving this back to Saxon 3. If we're going to rewrite it entirely in 3, requiring SEF files to be recompiled, I don't think there's a lot of value in trying to fix the implementation in 2.x

Actions #13

Updated by Norm Tovey-Walsh about 2 years ago

  • Sprint/Milestone changed from SaxonJS 3.0 to SaxonJS 2.5

On further discussion, perhaps we do want to try to fix this in 2.5.

Actions #14

Updated by Michael Kay about 2 years ago

I think the "quick fix" is simply to change the code at Expr.js#89 (in my build) from

node === m

to

DU.isSameNode(node, m)

(But check for the potential bug identified in comment #9)

Actions #15

Updated by Debbie Lockett about 2 years ago

  • Status changed from New to Resolved
  • Applies to JS Branch 2, Trunk added
  • Fix Committed on JS Branch 2, Trunk added

Confirmed that Martin's sample reproduced the issue; unit test iss5560 now added.

Fix in implementation code for "among" in Expr.js (as suggested in comment #note-10) committed on saxonjs2 and main branches.

There doesn't seem to be any problem with potential bug identified in #note-9; the new QT3 test -s:op-intersect -t:K2-SeqIntersect-48 passes with SaxonJS.

Actions #16

Updated by Norm Tovey-Walsh about 2 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in JS Release set to SaxonJS 2.5

Fixed in SaxonJS 2.5.

Please register to edit this issue

Also available in: Atom PDF Tracking page