Project

Profile

Help

Bug #4339

XQuery variables seem to get inlined even when optimizations are disabled

Added by Radu Coravu 11 months ago. Updated 6 months ago.

Status:
Closed
Priority:
Low
Assignee:
Category:
Diagnostics
Sprint/Milestone:
-
Start date:
2019-10-11
Due date:
% Done:

0%

Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:

Description

For our xquery debugger we set the net.sf.saxon.lib.FeatureKeys.OPTIMIZATION_LEVEL property to be "new OptimizerOptions(0)" meaning no optimizations.

Then we load the associated XQuery and when the execution goes to line 62 we compute the list of local variables with some code which looks like this:

  public static List<Variable> getXQueryLocalVariables(
      XPathContext pathContext,
      InstructionInfo instructionInfo) {

    ArrayList<Variable> localVariables = new ArrayList<Variable>();

    StackFrame stackFrame = pathContext.getStackFrame();
    if (stackFrame != null) {
      SlotManager slotMgr = stackFrame.getStackFrameMap();
      if (slotMgr != null) {        
        List variablesQNames = slotMgr.getVariableMap();
        if (variablesQNames != null) {
          if (logger.isDebugEnabled()) {
            logger.debug("Locals var fingerprints array size : " + variablesQNames.size());
          }
       .............
    return localVariables;
  }

The list of "variablesQNames" obtained with Saxon 9.8 used to be much larger, it reported 19 variables, it reported for example the variable "process_inst" which is declared and used only once. But with Saxon 9.9 only 4 variables are reported: "documentSeq1", "in-xml", "booleanSequence" and "attr"

xqueryDebug.zip (1.43 KB) xqueryDebug.zip Radu Coravu, 2019-10-11 07:17

History

#1 Updated by Radu Coravu 11 months ago

Is there any other property we can set to see all these attributes again in the debugger? Otherwise we can accept that the behavior has changed and update our automatic tests accordingly.

#2 Updated by Michael Kay 11 months ago

If I run with -opt:0 from the command line, then the query fails saying orders.xml does not exist, and thereby showing that the unused variable $doc_node has not been eliminated.

If I change it so $doc_node binds to literal XML, the -explain output tells us:

OPT : At line 38 of file:/Users/mike/bugs/2019/4339-coravu/xqueryDebug/testSequenceTypes.xquery
OPT : Pre-evaluated function call fn:boolean(...)
OPT : Expression after rewrite: true()
OPT : At line 38 of file:/Users/mike/bugs/2019/4339-coravu/xqueryDebug/testSequenceTypes.xquery
OPT : Pre-evaluated function call fn:boolean(...)
OPT : Expression after rewrite: true()

showing us that the two calls on fn:boolean() with constant arguments are pre-evaluated despite the presence of the -opt:0 flag. I don't regard that as a bug, though it would be consistent to put that optimization under the control of one of the optimization flags.

But the -explain output confirms that all the variables are still there; so we need to look more carefully at exactly how you are trying to programmatically achieve the same effect as -opt:0 on the command line.

Could you try and provide a repro that demonstrates a sequence of Java calls equivalent to the calls you are making?

#3 Updated by Michael Kay 11 months ago

For 10.0 I have put the optimization of constant expressions such as boolean(0) under the control of a new optimization flag "n" (CONSTANT_FOLDING) so that -opt:0 will now suppress this.

However, there are still many simple optimizations that cannot be suppressed.

#4 Updated by Michael Kay 11 months ago

  • Category set to Diagnostics
  • Status changed from New to AwaitingInfo
  • Assignee set to Michael Kay

#5 Updated by Radu Coravu 11 months ago

As our XQuery/XSLT debugger code is quite large and convoluted I'm not sure I can come up with sample source code in a decent amount of time.

With the "net.sf.saxon.Configuration" we do in our code something like:

        config.setConfigurationProperty(
            FeatureKeys.OPTIMIZATION_LEVEL,
            new OptimizerOptions(0);

and then later on:

net.sf.saxon.Configuration.setTraceListener(TraceListener)

and after the transformation is started on the trace listener's callback:

net.sf.saxon.lib.TraceListener.enter(InstructionInfo, XPathContext)

we call the code "getXQueryLocalVariables" which was already pasted in the initial email.

#6 Updated by Michael Kay 11 months ago

For the XSLT case I can see potential confusion if the OptimizerOptions defined at Configuratation level and at CompilerInfo level disagree: the many paths seem to be taking the optimizer options from the Configuration level rather than from the CompilerInfo level. But for XQuery, as far as I can see, the effective value of optimizerOptions is always taken directly from the Configuration.

I've tried a little test case like this:

public void testOptimizationOff() {
        try {
            Processor p = new Processor(false);
            p.setConfigurationProperty(Feature.OPTIMIZATION_LEVEL, 0);
            p.setConfigurationProperty(Feature.TRACE_OPTIMIZER_DECISIONS, true);
            DocumentBuilder b = p.newDocumentBuilder();
            XdmNode foo = b.build(new StreamSource(new StringReader("<foo><bar/></foo>")));
            XQueryCompiler c = p.newXQueryCompiler();
            XQueryExecutable expr = c.compile("let $x := //bar return count($x)");
            expr.explain(p.newSerializer(System.out));
            XQueryEvaluator eval = expr.load();
            eval.setContextItem(foo);
            XdmValue val = eval.evaluate();
            assertEquals(val.size(), 1);
        } catch (SaxonApiException e) {
            fail(e.getMessage());
        }
    }

With the OPTIMIZATION_LEVEL setting removed, the trace of optimizer decisions includes

OPT : At line 1 of null OPT : Inlined references to $x OPT : Expression after rewrite: count((/)/descendant::element(Q{}bar))

and the resulting expression contains no <let> expression. With optimization level set to 0, the optimizer trace disappears, and the resulting expression includes <let var="Q{}x">.

So as far as I can see the switch is basically working, and the problem must be with some detail of how you are setting it.

#7 Updated by Michael Kay 11 months ago

Actually I now see that if I do

p.setConfigurationProperty(Feature.OPTIMIZATION_LEVEL, new OptimizerOptions(0));

I get

java.lang.IllegalArgumentException: The value of http://saxon.sf.net/feature/optimizationLevel must be a string

and the same if I use FeatureKeys.OPTIMIZATION_LEVEL rather than Feature.OPTIMIZATION_LEVEL. Perhaps you are catching the error and ignoring it?

The move from FeatureKeys to Feature was designed to add some type safety into this interface, but it doesn't work well for this property which for historic reasons allows either an integer or a string.

#8 Updated by Radu Coravu 11 months ago

Sorry, my original code should have been:

 config.setConfigurationProperty( FeatureKeys.OPTIMIZATION_LEVEL, new OptimizerOptions(0).toString()); 

so we are setting the optimization level correctly. But thanks for the extra details, I'll look more into this on my side.

#9 Updated by Radu Coravu 11 months ago

I reduced our debugger code to simpler code using only Saxon classes, but now my sample code works, it started reporting all variables on the last line of the XQuery, so I'm now back to square one, trying to comment out large parts of our debugger code and see when it starts working in our code. I'm attaching my sample code below because in a way it exemplifies how our debugger works with the Saxon code:

public static void main(String[] args) throws SaxonApiException, InstantiationException, IllegalAccessException, XPathException {
	    Configuration config = new Configuration();
	    config.setConfigurationProperty(net.sf.saxon.lib.FeatureKeys.ALLOW_MULTITHREADING, false);
	    config.setConfigurationProperty(net.sf.saxon.lib.FeatureKeys.OPTIMIZATION_LEVEL, 0);
	    config.setCompileWithTracing(true);
	    config.setLineNumbering(true);
	    config.setTraceListener(new TraceListener() {
	      
	      @Override
	      public void startCurrentItem(Item currentItem) {
	      }
	      
	      @Override
	      public void setOutputDestination(net.sf.saxon.lib.Logger stream) {
	      }
	      
	      @Override
	      public void open(Controller controller) {
	        new Exception().printStackTrace();
	      }
	      
	      @Override
	      public void leave(InstructionInfo instruction) {
	      }
	      
	      @Override
	      public void enter(InstructionInfo instruction, XPathContext context) {
	        System.err.println("ENTER " + instruction + " line " + instruction.getLineNumber());
	        System.err.println("++++++++++");
	        printXQueryLocalVariables(context, instruction);
	      }
	      
	      @Override
	      public void endCurrentItem(Item currentItem) {
	      }
	      
	      @Override
	      public void close() {
	      }
	    });
	    
	    DynamicQueryContext dynamicQueryContext = new DynamicQueryContext(config);
	    StaticQueryContext staticQueryContext = config.newStaticQueryContext();
	    staticQueryContext.setBaseURI(new File("D:\\projects\\eXml//test/debugger/xquery/Books/").toURI().toString());
	    staticQueryContext.setCompileWithTracing(true);
	    
	    StreamSource source = new StreamSource(new StringReader("<root/>"));
	    TreeInfo tree = staticQueryContext.getConfiguration().buildDocumentTree(source);
	    NodeInfo root = tree.getRootNode();
	    DocumentInfo document = new DocumentInfo(root);
	    dynamicQueryContext.setContextItem(document);
	
	    XQueryExpression expression = staticQueryContext.compileQuery("(:namespace declarations :)\n" + 
	        "declare namespace sam=\"http://www.aaa.aaa/a\";\n" + 
	        "declare namespace huhu=\"http://www.aaa.aaa/b\";\n" + 
	        "\n" + 
	        "(: document-node sequence :)\n" + 
	        "let $documentSeq1 := doc(\"books.xml\")\n" + 
	        "let $doc_node := doc('order.xml')\n" + 
	        "\n" + 
	        "let $sam:extern := <a namespace=\"huhu\"/>\n" + 
	        "\n" + 
	        "(: elements sequence :)\n" + 
	        "let $elementSeq := $documentSeq1//author\n" + 
	        "(: inline element :)\n" + 
	        " let $in-xml := <in-xml>\n" + 
	        "        <!-- this is in-xml -->\n" + 
	        "      <?test see?>\n" + 
	        "    <a z=\"2\"><b>xyz</b></a>\n" + 
	        "    </in-xml>    \n" + 
	        "let $elem := $in-xml/a\n" + 
	        "(: attribute node :)\n" + 
	        "let $attr := $in-xml/a/@z\n" + 
	        "let $c := count($attr)\n" + 
	        "(: comment node :)\n" + 
	        "let $comm := $in-xml/comment()\n" + 
	        "\n" + 
	        "(: pi node :)\n" + 
	        "let $process_inst := $in-xml/processing-instruction()\n" + 
	        "\n" + 
	        "(: text node :)\n" + 
	        "let $txt := $in-xml/a/text()\n" + 
	        "\n" + 
	        "(: sequence of strings :)\n" + 
	        "let $stringSequence := ('apple', 'banana', 'carrot', 'dog', 'egg', 'fig')\n" + 
	        "(: sequence of integer :)\n" + 
	        "let $integerSequence := (1, 2, 3, 4, 5,6)\n" + 
	        "(: sequence of boolean :)\n" + 
	        "(:let $booleanSequence := (contains($stringSequence[0],'apple'), contains($stringSequence[0],'mac')):)\n" + 
	        "let $booleanSequence := (boolean('0'), boolean('1'))\n" + 
	        "(: sequence of dates :)\n" + 
	        "\n" + 
	        "let $a:= (xs:float(1), xs:float(2)) \n" + 
	        "\n" + 
	        "let $dateSequence1 := ($booleanSequence, $integerSequence)\n" + 
	        "let $dateSequence2 := ($booleanSequence, $integerSequence, $process_inst, string($attr))\n" + 
	        "\n" + 
	        "let $elementSeqLength := count($elementSeq)\n" + 
	        "(:let $stringSequence := $elementSeq[1]//text():)\n" + 
	        "\n" + 
	        "let $dateSequence3 := (xs:date('1998-02-02'),xs:date('1998-02-01'))\n" + 
	        "\n" + 
	        "return\n" + 
	        "<test> {\n" + 
	        "    $attr, $elementSeqLength,$elem, $a, $booleanSequence\n" + 
	        "    , true()\n" + 
	        "    , <a> $a </a>\n" + 
	        "    ,$txt\n" + 
	        "    ,$comm\n" + 
	        "    ,$dateSequence1\n" + 
	        "    ,$dateSequence2\n" + 
	        "    ,$dateSequence3\n" + 
	        "}\n" + 
	        "<test2>{string($attr)}</test2>\n" + 
	        "</test>\n" + 
	        "");
	    ByteArrayOutputStream baos = new ByteArrayOutputStream();
	    expression.run(dynamicQueryContext, new StreamResult(baos), new Properties());
	  }
	
	private static void printXQueryLocalVariables(
		      XPathContext pathContext,
		      InstructionInfo instructionInfo) {

		StackFrame stackFrame = pathContext.getStackFrame();
		if (stackFrame != null) {
			SlotManager slotMgr = stackFrame.getStackFrameMap();
			if (slotMgr != null) {        
				List variablesQNames = slotMgr.getVariableMap();
				System.err.println("LOCALS " + variablesQNames);
			}
		}
	}

#10 Updated by Radu Coravu 11 months ago

After commenting enough code on our side, the problem with some local variables not being reported on the trace listener seemed to be triggered by the fact that we set our own error listener to the configuration class:

config.setErrorListener(errorListener);

I replaced the line above with:

config.getParseOptions().setErrorListener(errorListener);

and the automatic test started to work, reporting all 14 local variables on the last line of the XQuery.

It seems that if I call this line:

config.getDefaultStaticQueryContext().setErrorListener(listener);

again the variables start to be incorrectly reported. The error listener's callback methods are not getting called at all in this particular case so I have no idea what the problem is. I will probably make a workaround in our code, avoid calling "config.getDefaultStaticQueryContext().setErrorListener" and see what tests fail.

#11 Updated by Michael Kay 11 months ago

That observation is so surprising that if it came from anyone but you I would dismiss it out of hand! Very strange, but I will investigate.

#12 Updated by Radu Coravu 11 months ago

It is silly but I think I started to understand the problem. Calling "net.sf.saxon.Configuration.setErrorListener(ErrorListener)" calls "getDefaultStaticQueryContext()" which lazily initializes an object. So if I call "Configuration.setErrorListener" very early in the code, the "net.sf.saxon.Configuration.defaultStaticQueryContext" object gets initialized very early, before certain settings are set to the Configuration. So if I move the "Configuration.setErrorListener(ErrorListener)" method call later on, after lots of other settings have been set on the configuration then things seem to work on my side. So I have a workaround for this.

#13 Updated by Michael Kay 11 months ago

Makes sense. With a lot of configuration issues, the problem is that we keep data in more than one place and it's not clear whether they should be allowed to get out of sync.

#14 Updated by Radu Coravu 11 months ago

Thanks for all the help, you can probably close this one as I will just move the "setErrorListener" down in the code, after all the other properties have been set.

#15 Updated by Michael Kay 10 months ago

  • Status changed from AwaitingInfo to In Progress

Changing status to "in progress". I want to do some further investigation to see if we can reduce the problems of storing configuration properties such as optimization options and error listeners in multiple places.

#16 Updated by Michael Kay 6 months ago

  • Status changed from In Progress to Closed

Decided to close this with no further action.

Please register to edit this issue

Also available in: Atom PDF