Bug #6438

open

Command Line and Config File option differences

Added by Adrian Bird 5 months ago. Updated 5 months ago.

Status: New
Priority: Low
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2024-05-24
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

I'm looking at moving from command line options to a config file. Checking both against the documentation, I found some discrepancies between the command line options and the corresponding config file options.

The 2 main issues are with TIMING / -t and TRACE_OPTIMIZER_DECISIONS / -explain: in both cases the command line and config options do not produce the same output, although the documentation says they should.

There are also a couple of other issues below.

For each point I've copied the description from the documentation and tried the 3 ways of using each option: -xxx on the command line, xxx="" in the config file and --xxx:yyy on the command line.
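For reference, the three forms can be sketched for the timing option as follows. This is a minimal illustration and not the exact setup used for the tests: the file names (config.xml, style.xsl), the jar name and the java launch line are hypothetical. The -config: option and the configuration namespace are those given in the Saxon documentation.

```python
import xml.etree.ElementTree as ET

# Namespace used by Saxon configuration files.
SAXON_CONFIG_NS = "http://saxon.sf.net/ns/configuration"


def build_config(option: str, value: str, edition: str = "HE") -> str:
    """Return a minimal configuration file that sets one attribute on <global>."""
    ET.register_namespace("", SAXON_CONFIG_NS)
    root = ET.Element(f"{{{SAXON_CONFIG_NS}}}configuration", {"edition": edition})
    ET.SubElement(root, f"{{{SAXON_CONFIG_NS}}}global", {option: value})
    return ET.tostring(root, encoding="unicode")


def three_forms(short_flag: str, feature: str, value: str = "true") -> dict:
    """Return the three invocations compared in this report.

    The jar name and file names are hypothetical placeholders.
    """
    base = ["java", "-cp", "saxon-he-12.4.jar", "net.sf.saxon.Transform",
            "-xsl:style.xsl", "-it:initialTemplate"]
    return {
        "short flag": base + [short_flag],
        "config file": base + ["-config:config.xml"],
        "feature flag": base + [f"--{feature}:{value}"],
    }


if __name__ == "__main__":
    print(build_config("timing", "true"))
    for name, argv in three_forms("-t", "timing").items():
        print(f"{name}: {' '.join(argv)}")
```

The "config file" form relies on config.xml containing the text returned by build_config("timing", "true").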

1) TIMING / -t

timing description from the Configuration Features:

This is set to true to cause basic timing and tracing information to be output to the standard error output stream. The name of the feature is poorly chosen, since much of the information that is output has nothing to do with timing, for example the names of output files for xsl:result-document are traced, as are the names of schema documents loaded.

-t description:

Display version and timing information to the standard error output. The output also traces the files that are read and written, and extension modules that are loaded.

timing description from the <global> element:

Outputs progress messages to System.err. Equivalent to the -t option on the command line.

Conclusion:

No two of the 3 ways of displaying timing information produce the same output. Setting --timing:true on the command line gives the basic information. Setting -t gives a bit more and matches its description, except that extension modules are not displayed. Setting timing="true" in the config file gives the same information as --timing:true on the command line, plus the extension modules.

Here's an example of what each produces - there are some notes in brackets.

--timing:true on command line gives:

Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input1.xml using class net.sf.saxon.tree.tiny.TinyBuilder         (Source file)
Tree built in 1.3368ms
Tree size: 6 nodes, 3 characters, 0 attributes
Writing to file:/L:/Tests/Saxon/SaxonConfiguration/Output2.txt                                                              (Output from xsl:result-document)
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input2.xml using class net.sf.saxon.tree.linked.LinkedTreeBuilder (File read with doc())
Tree built in 0.2556ms

-t gives:

SaxonJ-HE 12.4 from Saxonica
Java version 11.0.23
Stylesheet compilation time: 398.3237ms
Processing file:/L:/Tests/Saxon/SaxonConfiguration/Input1.xml initial template = initialTemplate
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input1.xml using class net.sf.saxon.tree.tiny.TinyBuilder         (Source file)
Tree built in 1.6525ms
Tree size: 6 nodes, 3 characters, 0 attributes
Writing to file:/L:/Tests/Saxon/SaxonConfiguration/Output2.txt                                                              (Output from xsl:result-document)
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input2.xml using class net.sf.saxon.tree.linked.LinkedTreeBuilder (File read with doc())
Tree built in 0.2684ms
Execution time: 163.6502ms
Memory used: 13Mb

timing="true" in config file gives:

Loading com.dummy.ColumnNumber                                                                                              (Extension module)
Loading com.dummy.LineNumber                                                                                                (Extension module)
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input1.xml using class net.sf.saxon.tree.tiny.TinyBuilder         (Source file)
Tree built in 1.4289ms
Tree size: 6 nodes, 3 characters, 0 attributes
Writing to file:/L:/Tests/Saxon/SaxonConfiguration/Output2.txt                                                              (Output from xsl:result-document)
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input2.xml using class net.sf.saxon.tree.linked.LinkedTreeBuilder (File read with doc())
Tree built in 0.3191ms

The common set is:

Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input1.xml using class net.sf.saxon.tree.tiny.TinyBuilder         (Source file)
Tree built in 1.6525ms
Tree size: 6 nodes, 3 characters, 0 attributes
Writing to file:/L:/Tests/Saxon/SaxonConfiguration/Output2.txt                                                              (Output from xsl:result-document)
Building tree for file:/L:/Tests/Saxon/SaxonConfiguration/Input2.xml using class net.sf.saxon.tree.linked.LinkedTreeBuilder (File read with doc())
Tree built in 0.2684ms

-t adds:

SaxonJ-HE 12.4 from Saxonica
Java version 11.0.23
Stylesheet compilation time: 398.3237ms
Processing file:/L:/Tests/Saxon/SaxonConfiguration/Input1.xml initial template = initialTemplate
...
Execution time: 163.6502ms
Memory used: 13Mb

--timing:true adds nothing

timing="true" adds:

Loading com.dummy.ColumnNumber (Extension module)
Loading com.dummy.LineNumber (Extension module)
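The breakdown above was done by eye. A minimal sketch of doing the same comparison mechanically is shown here, assuming the stderr of each run has been captured as text. The sample logs are abbreviated copies of the output above with the bracketed notes dropped; the function names are illustrative only. Millisecond timings are masked because they differ from run to run.

```python
import re

# Abbreviated stderr captures from the three runs (bracketed notes removed).
FEATURE_FLAG_LOG = """\
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Tree built in 1.3368ms
Tree size: 6 nodes, 3 characters, 0 attributes
"""

SHORT_FLAG_LOG = """\
SaxonJ-HE 12.4 from Saxonica
Java version 11.0.23
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Tree built in 1.6525ms
Tree size: 6 nodes, 3 characters, 0 attributes
Execution time: 163.6502ms
Memory used: 13Mb
"""

CONFIG_FILE_LOG = """\
Loading com.dummy.ColumnNumber
Loading com.dummy.LineNumber
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Tree built in 1.4289ms
Tree size: 6 nodes, 3 characters, 0 attributes
"""


def normalise(log: str) -> list:
    """Drop blank lines and mask millisecond timings, which vary per run."""
    return [re.sub(r"\d+(?:\.\d+)?ms", "<N>ms", line.strip())
            for line in log.splitlines() if line.strip()]


def compare(logs: dict) -> tuple:
    """Return (lines common to every log, lines each log adds beyond those)."""
    lines = {name: normalise(text) for name, text in logs.items()}
    shared = set.intersection(*(set(v) for v in lines.values()))
    first = next(iter(lines.values()))
    common = [line for line in first if line in shared]
    extras = {name: [line for line in v if line not in shared]
              for name, v in lines.items()}
    return common, extras


if __name__ == "__main__":
    common, extras = compare({
        "--timing:true": FEATURE_FLAG_LOG,
        "-t": SHORT_FLAG_LOG,
        'timing="true"': CONFIG_FILE_LOG,
    })
    print("common:", common)
    for name, added in extras.items():
        print(name, "adds:", added)
```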

2) TRACE_OPTIMIZER_DECISIONS / -explain

-explain command line option description:

Display an execution plan and other diagnostic information for the stylesheet. This is a representation of the expression tree after rewriting by the optimizer. It combines the XSLT instructions and the XPath expressions into a single tree. If no file name is specified the output is sent to the standard error stream. The output is in XML format.

TRACE_OPTIMIZER_DECISIONS (-explain) in Configuration Features description:

If this option is set, Saxon will output (to the standard error output) detailed information about the rewrites to the expression tree made by the optimizer. This information is mainly useful for internal system debugging, but it is also possible to digest it to analyze the ways in which the expression has been optimized for the purpose of performance analysis and tuning.

traceOptimizerDecisions description from the <global> element:

Causes tracing of decisions made by the optimizer.

Conclusion:

All 3 methods output the decisions made by the optimizer, although the -explain description does not mention that. The Configuration Features entry for TRACE_OPTIMIZER_DECISIONS (-explain) says the configuration option is the same as -explain, but setting the configuration option does not output the XML expression tree after rewriting by the optimizer.

3) STANDARD_ERROR_OUTPUT_FILE

STANDARD_ERROR_OUTPUT_FILE in Configuration Features description:

STANDARD_ERROR_OUTPUT_FILE is the name of a file to which Saxon will redirect output that would otherwise go to the operating system standard error stream (System.err). This is the fallback destination for various tracing and diagnostic output. In some cases a more specific mechanism exists to select the destination for particular kinds of output. Note that if the Configuration is used in more than one processing thread, the messages from different threads will be interleaved in the output file. A more selective approach is to use a different ErrorListener in different processing threads, and arrange for each ErrorListener to write to its own logging destination.

standardErrorOutputFile description from the <global> element:

Redirects output which would otherwise go to the standard error output stream System.err, to this file.

Conclusion:

With this set in the config file I still see the following line on System.err (the line isn't output when using the LinkedTree):

Tree size: 6 nodes, 3 characters, 0 attributes

Setting the feature both in the config file and on the command line, with different file names, splits the output between the two files, at least in the case I tried (with timing="true" in the config file). Using the output from above for timing="true" I get the following:

  • the Extension module details go to the file specified in the config file
  • the other details go to the file specified on the command line.

The output from -explain and -T still goes to System.err when this is set, but as the description above says, "In some cases a more specific mechanism exists to select the destination for particular kinds of output".

4) traceExternalFunctions

TRACE_EXTERNAL_FUNCTIONS / -TJ description from Configuration Features:

If this option is set, Saxon will output (to the standard error output) progress information about its attempts to locate and disambiguate references to reflexive Java extension functions. This is useful for diagnostics if the XQuery or XSLT compiler is failing to locate user-written extension functions

traceExternalFunctions description from the <global> element:

Provides diagnostics when external functions are dynamically loaded.

-TJ description:

Switches on tracing of the binding of calls to external Java methods. This is useful when analyzing why Saxon fails to find a Java method to match an extension function call in the stylesheet, or why it chooses one method over another when several are available.

Conclusion:

Setting this gives the following output when I've used string() and number() in my stylesheet. I'm surprised these are traced as external functions. I get the same output by using -TJ or setting traceExternalFunctions in the config file.

Looking for function Q{http://www.w3.org/2005/xpath-functions}string#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
Looking for function Q{http://www.w3.org/2005/xpath-functions}number#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
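To confirm that the traced names are standard functions rather than extension functions, the namespace can be pulled out of each "Looking for function" line. The sketch below assumes the line format is exactly as in the two samples above (the regular expression is inferred from those lines only); http://www.w3.org/2005/xpath-functions is the standard XPath function namespace.

```python
import re

# Standard namespace for XPath/XSLT built-in functions.
FN_NAMESPACE = "http://www.w3.org/2005/xpath-functions"

# Pattern inferred from the sample trace lines: Q{uri}local#arity.
LOOKUP = re.compile(
    r"^Looking for function Q\{(?P<uri>[^}]*)\}(?P<local>[^#]+)#(?P<arity>\d+)$"
)

TRACE = """\
Looking for function Q{http://www.w3.org/2005/xpath-functions}string#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
Looking for function Q{http://www.w3.org/2005/xpath-functions}number#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
"""


def parse_lookups(trace: str) -> list:
    """Return (namespace, local name, arity) for each function lookup line."""
    found = []
    for line in trace.splitlines():
        match = LOOKUP.match(line.strip())
        if match:
            found.append((match["uri"], match["local"], int(match["arity"])))
    return found


if __name__ == "__main__":
    for uri, local, arity in parse_lookups(TRACE):
        kind = "built-in" if uri == FN_NAMESPACE else "other namespace"
        print(f"{local}#{arity}: {kind}")
```

Both lookups in the sample are in the standard function namespace, which is why seeing them in a trace of external functions is unexpected.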

5) Initial Template

-it[:template-name] description:

Selects the initial named template to be executed. If this is namespaced, it can be written as {uri}localname. If the template name is omitted, the default is xsl:initial-template. When this option is used, you do not need to supply a source file, but if you do, you must supply it using the -s option.

initialTemplate description from the <xslt> element:

The name of a named template within a stylesheet where execution should begin.

Conclusion:

I was expecting to be able to set my own initial template name in the config file and then use -it without a name on the command line to default to that template, but I got the message:

 XTDE0040  Template xsl:initial-template does not exist.

It seems that setting the initial template in the config file is ignored.

6) Saxon Configuration File "Applies to" column

There seem to be multiple cases where HE is not included in the column but the configuration item works in HE. I've used traceExternalFunctions and traceOptimizerDecisions above and also disableXslEvaluate and enableAssertions previously in HE.

Adrian

Actions #1

Updated by Michael Kay 5 months ago

I think these problems are most likely to occur for options that directly affect the command line layer of the code. Using --x or loading a configuration file is likely to be setting the option in the Configuration object, but not affecting the local variables in the command line layer itself (some of which are probably set, and indeed used, before the configuration is even created).

There's another problem, though that's probably separate, that some of these options make very little sense EXCEPT when running from the command line -- or at any rate, when a Configuration is used for a single-shot transformation or query as distinct from a long-running service hosting a variety of work.
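The first point above can be pictured with a toy model. This is purely illustrative: the class and attribute names are hypothetical and do not reflect Saxon's source code. It only shows how a short flag handled in a command line layer could produce more output than the same feature set directly on a shared configuration object.

```python
from dataclasses import dataclass, field


@dataclass
class Configuration:
    """Stand-in for a shared configuration object (not Saxon's class)."""
    features: dict = field(default_factory=dict)


@dataclass
class CommandLineLayer:
    """Stand-in for a command line front end that keeps its own local state."""
    config: Configuration = field(default_factory=Configuration)
    show_time: bool = False  # local variable, consulted only by the front end

    def apply(self, arg: str) -> None:
        if arg == "-t":
            # Short flag: sets the local variable AND the configuration feature.
            self.show_time = True
            self.config.features["timing"] = True
        elif arg.startswith("--"):
            # --name:value form: only reaches the configuration object.
            name, _, value = arg[2:].partition(":")
            self.config.features[name] = (value == "true")

    def report(self) -> list:
        """List the kinds of message this run would emit."""
        lines = []
        if self.show_time:
            lines.append("version banner")
        if self.config.features.get("timing"):
            lines.append("tree building messages")
        if self.show_time:
            lines.append("execution time")
        return lines


if __name__ == "__main__":
    for arg in ("-t", "--timing:true"):
        layer = CommandLineLayer()
        layer.apply(arg)
        print(arg, "->", layer.report())
```

In this model a configuration file would behave like the --name:value branch: it changes the configuration features but never touches show_time.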

Actions #2

Updated by Adrian Bird 5 months ago

Can I add an additional comment about RECOGNIZE_URI_QUERY_PARAMETERS and -p?

There seems to be some conflict between the -p command line option and the RECOGNIZE_URI_QUERY_PARAMETERS configuration item, which at first glance seem to be the same. The description of the doc() function says: "The standard URI resolver has an option (set using -p on the command line, or via options on the Configuration or TransformerFactory classes) to recognize query parameters in the URI." RECOGNIZE_URI_QUERY_PARAMETERS is the only configuration option which seems to relate to the -p command line option.

A bit of testing shows that the RECOGNIZE_URI_QUERY_PARAMETERS configuration option allows query parameters to be set on the doc() function and works fine (I haven't tried document()). The -p option description says "Enable recognition of query parameters (such as xinclude=yes) in the StandardURIResolver", but it also says that -p turns on the command line options -sa and -u and that it is only available in Saxon-PE and Saxon-EE. That is quite different from the RECOGNIZE_URI_QUERY_PARAMETERS configuration option, which is documented as applying to HE, PE and EE.

There is also a description of -p in the Whitespace Stripping section of the document which doesn't mention needing Saxon-PE and Saxon-EE.
