Project

Profile

Help

Bug #6415

closed

Saxon-HE 12.4J trace for unreferenced XSLT variables seems off

Added by A Galtman 14 days ago. Updated 9 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Diagnostics
Sprint/Milestone:
-
Start date:
2024-05-05
Due date:
% Done:

0%

Estimated time:
Legacy ID:
Applies to branch:
12, trunk
Fix Committed on Branch:
12, trunk
Fixed in Maintenance Release:
Platforms:
Java

Description

The XSLT transformation below declares some global and local variables without referencing them. The trace behavior I'm seeing looks incorrect because the name of a trace element doesn't always match the line/column data.

XSLT Stylesheet, unreferenced-vars.xsl

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  
  <xsl:variable name="global-var1"><a/></xsl:variable>
  <xsl:variable name="global-var2"><b/></xsl:variable>
  <xsl:variable name="global-var3"><c/></xsl:variable>
  
  <xsl:template name="xsl:initial-template">
    <xsl:variable name="var1"><a/></xsl:variable>
    <xsl:variable name="var02"><b/></xsl:variable>
    <xsl:variable name="var003"><c/></xsl:variable>
    <root/>
  </xsl:template>
</xsl:stylesheet>

(The zeros in the variable names are just to make it easier to distinguish, in the trace, between line/column pointing to a different instruction and line being off by one.)

Trace

The trace includes mention of the lines that declare var02 and var003 but not var1.

Further examination shows that in trace, @name does not match @line and @column. For instance, <trace name="var1" line="10" column="32"...> has a mismatch because line 10 column 32 is the declaration of var02, not var1.

  
<trace saxon-version="12.4" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template name="xsl:initial-template" line="8" column="45" module="unreferenced-vars.xsl">
  <trace name="var1" line="10" column="32" module="unreferenced-vars.xsl">
   <trace name="var02" line="11" column="33" module="unreferenced-vars.xsl">
    <trace name="var003" line="12" column="12" module="unreferenced-vars.xsl">
     <LRE name="root" line="12" column="12" module="unreferenced-vars.xsl">
     </LRE>
    </trace>
   </trace>
  </trace>
 </xsl:template>
</trace>

Command using Saxon-HE 12.4J:

java -cp "%SAXON_CP%" net.sf.saxon.Transform -T -it -xsl:unreferenced-vars.xsl >output.xml

Question

In general, should users expect unreferenced variables to be traced? In this stylesheet, the unreferenced global variables are not traced, but the unreferenced local variables are.

If I modify the XSLT to reference the variables, I still get the trace elements with data mismatches and I also get xsl:document and LRE elements that look correct.

Actions #1

Updated by Michael Kay 14 days ago

What's happening here is a little bit strange. Saxon injects the trace instructions into the tree, then does optimization. The optimization removes the unused local variable declarations, but it doesn't remove the trace instructions, which are now reporting the evaluation of a phantom variable that is no longer there. You can see the detail of what's happening if you run with -explain.

Why the mismatched line numbers? I need to look into this more carefully, but I suspect what is happening is that after the "let" expression (representing the local variable) is eliminated, we get two trace instructions next to each other, and these are elided into a single trace instruction and this is where the incorrect line numbers are picked up.

It's always been an open question whether tracing should suppress optimisation rewrites. On the one hand, tracing is there to help with debugging, and you want the output to be intelligible, which it isn't if there are a lot of radical rewrites. On the other hand, the reason you are debugging might be because of "bad" rewrites, for example loop-lifting calls to external functions with side-effects, and you don't wan't Heisenbugs where the code works when tracing and not otherwise. So on the whole, we don't suppress optimisations just because tracing is enabled. One major exception is global variables: IIRC correctly we evaluate globals eagerly if tracing is enabled. The other exception is that the presence of trace expressions in the tree may mean that the optimizer fails to detect opportunities for optimization that it would detect otherwise.

I think, though I would need to check my facts, that tracing also suppresses lazy evaluation (which is a run-time optimization rather than an expression tree rewrite). Lazy evaluation makes the order of evaluation of subexpressions very chaotic indeed.

Actions #2

Updated by Michael Kay 10 days ago

I've been doing work on a fairly radical upheaval of line/column maintenance, and although much of this will probably only appear in the 13 branch, I'll report it here because some of if may find its way back into 12.x.

Firstly, if line numbering is requested and we're building from a SAX source, then I'm injecting a new filter into the document building pipeline that changes the location information from what SAX reports, which is the location of the end of the current event, to instead report the end of the previous event. That's working well except where the end of the previous event is in a different external entity. In that case SAX doesn't give us good enough information, so I'm having to approximate by saying that if the location for the current event has a different system ID from the location of the previous event, we use the current event in preference.

Next, I've added to the linked tree the ability to hold line/column information (in a compressed format) for text nodes. This is worth doing because the dominant usage of the linked tree is for stylesheets, and with the introduction of text value templates, text nodes are rather more "interesting" than they used to be for purposes such as tracing. Current progress is that the text node holds a correct line/column number, but this isn't finding its way into the instructions in the compiled stylesheet, so they aren't being traced properly.

Currently looking at how to record more precisely the location of an expression contained within curly braces within a text node within a stylesheet, now that we have a reliable datum for the location of the text node. We have some machinery for doing this kind of thing, but it doesn't quite handle this scenario.

Actions #3

Updated by Michael Kay 9 days ago

After first injecting the trace instructions, the named template body looks like this:

<componentTracer>
 <traceExp name='var1' line='9' col='31'>
  <let baseUri='file:/Users/mike/bugs/2024/6415-Galtman/test.xslt' ns='xsl=~' line='9' var='Q{}var1' as='1ND' slot='-999'>
   <traceExp line='9' col='35'>
    <doc base='file:/Users/mike/bugs/2024/6415-Galtman/test.xslt' validation='preserve'>
     <elem name='a' nsuri='' flags='l'>
      <empty/>
     </elem>
    </doc>
   </traceExp>
   <traceExp name='var02' line='10' col='32'>
    <let line='10' var='Q{}var02' as='1ND' slot='-999'>
     <traceExp line='10' col='36'>
      <doc base='file:/Users/mike/bugs/2024/6415-Galtman/test.xslt' validation='preserve'>
       <elem name='b' nsuri='' flags='l'>
        <empty/>
       </elem>
      </doc>
     </traceExp>
     <traceExp name='var003' line='11' col='33'>
      <let line='11' var='Q{}var003' as='1ND' slot='-999'>
       <traceExp line='11' col='37'>
        <doc base='file:/Users/mike/bugs/2024/6415-Galtman/test.xslt' validation='preserve'>
         <elem name='c' nsuri='' flags='l'>
          <empty/>
         </elem>
        </doc>
       </traceExp>
       <traceExp name='root' line='12' col='12'>
        <elem line='12' name='root' nsuri=''>
         <empty/>
        </elem>
       </traceExp>
      </let>
     </traceExp>
    </let>
   </traceExp>
  </let>
 </traceExp>
</componentTracer>

After optimization, it looks like this:

<componentTracer>
 <traceExp name='var1' line='9' col='31'>
  <traceExp name='var02' line='10' col='32'>
   <traceExp name='var003' line='11' col='33'>
    <traceExp name='root' line='12' col='12'>
     <elem baseUri='file:/Users/mike/bugs/2024/6415-Galtman/test.xslt' ns='xsl=~' line='12' name='root' nsuri=''>
      <empty/>
     </elem>
    </traceExp>
   </traceExp>
  </traceExp>
 </traceExp>
</componentTracer>

So the line/column information on the trace instructions seems to be correct. But I think the trace listener, when a trace instruction is encountered, is being given position information for the instruction wrapped by that trace instruction, which in this case is another trace instruction.

Actions #4

Updated by Michael Kay 9 days ago

Indeed, that's the explanation. The effect of the trace expression is to do

TraceListener.enter(body, properties, context)

where body is the child of the trace expression; and the trace listener takes the line/column information from the body expression.

I think that the answer to this is to suppress the relevant trace. Either during the optimization rewrite, we should eliminate a trace expression if its child is a trace expression; or at run-time, a trace expression should do nothing if its child is a trace expression.

Actions #5

Updated by Michael Kay 9 days ago

  • Category set to Diagnostics
  • Status changed from New to Resolved
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal
  • Applies to branch 12, trunk added
  • Fix Committed on Branch 12, trunk added
  • Platforms Java added

I changed TraceExpression.optimize() as described. After an unused variable is removed from the tree, the optimization of the trace expression now discovers that its wrapped expression is another trace expression, so it drops itself from the tree.

Please register to edit this issue

Also available in: Atom PDF