Feature #4051: Documentation namespace in XSLT - Saxon - Saxonica Developer Community

Added by Michael Kay over 5 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee: Michael Kay
Category: Saxon extensions
Start date: 2018-11-26
Applies to branch: 9.9

Description

FINAL OUTCOME: PROPOSAL NOT IMPLEMENTED

It is common practice in serious XSLT programming to put documentation comments in elements/attributes using a namespace dedicated to this purpose. For example, the XSLTdoc tool uses xmlns:xd="http://www.pnp-software.com/XSLTdoc"
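As an illustration, a documented template might look like the sketch below. The xd:doc element name and its free-text content are assumptions made for this example, not a statement of the XSLTdoc vocabulary. XSLT itself only requires that such a top-level element be in a non-null namespace other than the XSLT namespace; the processor then treats it as a user-defined data element and ignores it.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xd="http://www.pnp-software.com/XSLTdoc"
    exclude-result-prefixes="xd">

  <!-- Documentation element (hypothetical content): ignored by the XSLT
       processor, intended to be read by a documentation tool -->
  <xd:doc>Copies each chapter title into an h1 heading.</xd:doc>

  <xsl:template match="chapter/title">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>

</xsl:stylesheet>
```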

Unfortunately this namespace "leaks" into the semantics of the XSLT code, even at runtime. Even with exclude-result-prefixes, a namespace declared in the stylesheet needs to be retained in the static context accessible at execution time in case it is used, for example, when casting a dynamically-constructed string to a QName. This also means that it needs to appear in exported SEF files, cluttering up the file and reducing performance.
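A minimal sketch of the kind of code that forces the binding to survive until run time, assuming an XSLT 3.0 processor. The parameter name and the local name "doc" are invented for the example. Because the prefix is only known when the stylesheet runs, the cast to xs:QName can only be resolved if the xd binding is still part of the static context at that point, regardless of exclude-result-prefixes.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xd="http://www.pnp-software.com/XSLTdoc"
    exclude-result-prefixes="#all">

  <!-- Hypothetical parameter: the prefix is supplied by the caller at run time -->
  <xsl:param name="prefix" as="xs:string" select="'xd'"/>

  <xsl:template name="xsl:initial-template">
    <out>
      <!-- The string is built dynamically, so the prefix must be resolved
           against the stylesheet's namespace bindings during execution -->
      <xsl:value-of
          select="namespace-uri-from-QName(xs:QName($prefix || ':doc'))"/>
    </out>
  </xsl:template>

</xsl:stylesheet>
```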

There might be a case for an attribute saxon:exclude-namespaces which is stronger in its effect than exclude-result-prefixes: it should mean that the namespace binding is available only for resolving element and attribute names appearing literally in the stylesheet, and not for resolving QNames-in-content, either at compile time or at run time. This means it would not clutter up SEF files.

Updated by Michael Kay over 5 years ago

I have produced an initial implementation of this feature.

The attribute saxon:documentation-prefixes can be used on the outermost element of the top-level module in a package; it applies to the whole package. The value is a whitespace-separated list of namespace prefixes (which must be declared). If a prefix is in the list, then the corresponding (prefix, uri) pair cannot be used for the names of stylesheet objects such as variables, functions, and templates; and is not part of the static context for XPath expressions. It is therefore not included in SEF export files.
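Based only on the description in this comment, usage would have looked roughly like the sketch below. The change was later reverted, so this attribute should not be assumed to exist in any released Saxon version. Standard XSLT permits extension attributes of this kind on XSLT elements and allows a processor to ignore them.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    xmlns:xd="http://www.pnp-software.com/XSLTdoc"
    saxon:documentation-prefixes="xd">

  <!-- Under the proposal, the xd binding would be usable only for
       documentation markup such as this element (content is hypothetical) -->
  <xd:doc>Identity template for everything not handled elsewhere.</xd:doc>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```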

Updated by Michael Kay over 5 years ago

Status changed from New to Resolved
Applies to branch 9.9 added
Fix Committed on Branch 9.9 added

Updated by Michael Kay over 5 years ago

Status changed from Resolved to In Progress

Re-opening this, to review the proposed syntax and semantics.

I don't think the proposal goes far enough. The problem is that in a substantial stylesheet, namespace declarations (e.g. for functions in different modules) can easily proliferate, and these all find their way into the run-time expression tree (and the SEF file), serving no useful purpose but potentially affecting performance.

The problem is exactly how and where to get rid of them.

Updated by Michael Kay over 5 years ago

We almost certainly need in-scope namespaces in the expression tree for compile-time operations; the time to get rid of them when not needed is probably when exporting the expression tree to a SEF file.

I did an experiment running the XSLT30 test suite with -export:on, having patched the code to stop outputting the "ns" attribute in the SEF file entirely. This causes 1259 out of 11476 tests to fail. Main causes of failure:

Function calls like accumulator-after/before, system-property, key, and format-number that take a lexical QName as an argument. (Note that in the vast majority of cases the argument is supplied as a string literal anyway; in other cases it can be supplied as an EQName to remove the dependency.)
For literal result elements, we generate a "namespaces" attribute in the SEF containing the namespaces to be included in the result tree; these rely on the "ns" attribute containing the relevant bindings.
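To make the EQName point concrete, here is a small sketch using the standard key() function. The idx namespace URI, the key name, and the element names are invented for the example. The first call depends on the idx prefix being resolvable; the second spells out the namespace URI in the Q{uri}local form and so needs no prefix binding.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:idx="http://example.com/index"
    exclude-result-prefixes="#all">

  <!-- Hypothetical key in a namespace -->
  <xsl:key name="idx:by-id" match="item" use="@id"/>

  <xsl:template match="ref">
    <!-- Lexical QName: resolving 'idx' requires the namespace binding -->
    <xsl:copy-of select="key('idx:by-id', @target)"/>

    <!-- EQName: self-contained, no dependency on in-scope namespaces -->
    <xsl:copy-of select="key('Q{http://example.com/index}by-id', @target)"/>
  </xsl:template>

</xsl:stylesheet>
```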

Looking at the detail, we (specifically, the XJ compiler) are outputting an "ns" attribute for any expression whose static namespace context differs from its parent element; generally this means there is one ns attribute on the top-level expression for each component, which is perfectly acceptable. There is one exception: we seem to be generating an ns attribute unconditionally on calls to system functions that depend on the static context. This looks redundant to me.

The XX compiler is generating far more "ns" attributes: the problem is the cost of comparing the namespace context of an element with that of its parent to decide that they are the same, which is probably greater than the cost of just generating the attribute unconditionally.

I've done work to reduce this with a call on an extension function saxon:has-local-namespaces() to test whether a child element has namespaces different from the parent, but I'm not convinced it's doing a good job yet.
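For readers who want to see what such a comparison involves, the following is a sketch written with standard XPath functions only. It is not the implementation of saxon:has-local-namespaces(), whose signature and behaviour are not given here; the f namespace and function name are invented. It has to walk the in-scope prefixes of both the element and its parent, which hints at why the check is not free.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="#all">

  <!-- Hypothetical helper: true if $e has no parent element, or if its
       in-scope namespace bindings differ from those of its parent -->
  <xsl:function name="f:has-local-namespaces" as="xs:boolean">
    <xsl:param name="e" as="element()"/>
    <xsl:variable name="parent" select="$e/parent::*"/>
    <xsl:sequence select="
        if (empty($parent)) then true()
        else (
          (: a prefix on $e that is missing from, or bound differently on, the parent :)
          (some $p in in-scope-prefixes($e) satisfies
              (not($p = in-scope-prefixes($parent))
               or namespace-uri-for-prefix($p, $e)
                  ne namespace-uri-for-prefix($p, $parent)))
          (: a prefix on the parent that has been undeclared on $e :)
          or (some $p in in-scope-prefixes($parent) satisfies
              not($p = in-scope-prefixes($e)))
        )"/>
  </xsl:function>

</xsl:stylesheet>
```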

Updated by Michael Kay over 5 years ago

After further thought, I'm inclined to the following.

Add two extension attributes which can (only) be used on the outermost element of the top-level module in a stylesheet package:

saxon:static-namespaces defines the namespaces that are available during static processing of the stylesheet. The value is a set of prefixes (or #default, or #all) which expands to a set of (prefix=uri) bindings using the namespace declarations on the containing element. This set of namespaces is used for all static resolution of lexical QNames (and other namespace prefixes) throughout the package. This includes names used in the declaration of stylesheet components such as functions, templates, variables, and modes; names used in all XPath expressions and patterns; prefixes used in @extension-element-prefixes, @exclude-result-prefixes, and xsl:namespace-alias; and names used in QName-valued arguments of functions such as function-available, system-property, format-number, key, and accumulator-before/after, provided they appear as string literals in a static call of the function. They do not, of course, affect the resolution of literal result element or attribute names, which are resolved by the XML parser in the usual way.

saxon:dynamic-namespaces defines the namespaces that are available during execution of the stylesheet. This must be a subset of the static namespaces. By default, all static namespaces are available. The namespaces in this list are used to provide the default namespace context for xsl:evaluate (in the absence of a namespace-context attribute); when casting from string to QName; and for dynamic calls to functions such as function-available, accumulator-before/after, etc. In addition, a namespace that is not present in the list automatically becomes an "excluded" namespace for the purpose of literal result elements (that is, it is not copied to result elements created using an LRE). For a great many stylesheets, setting saxon:dynamic-namespaces="" (an empty set of bindings) will be quite appropriate.
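A sketch of how the two attributes might have been written, following the description above. Both attributes were only ever a proposal and were reverted, so they should not be assumed to exist in any released Saxon version; the f namespace and function are invented for the example, and the whitespace-separated value syntax is an assumption.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/"
    xmlns:f="http://example.com/functions"
    saxon:static-namespaces="xs f"
    saxon:dynamic-namespaces="">

  <!-- 'xs' and 'f' are resolved statically, at compile time -->
  <xsl:function name="f:double" as="xs:integer">
    <xsl:param name="n" as="xs:integer"/>
    <xsl:sequence select="$n * 2"/>
  </xsl:function>

  <!-- With an empty dynamic set, no binding would be kept for run-time use
       (xsl:evaluate, string-to-QName casts, dynamic function-name lookups),
       and none would be copied to this literal result element -->
  <xsl:template match="/">
    <result><xsl:value-of select="f:double(21)"/></result>
  </xsl:template>

</xsl:stylesheet>
```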

The advantages of using these attributes are:

(a) a single set of namespaces is established for use throughout the stylesheet. It is no longer necessary, for example, to declare prefixes used in variable or function names in every module that contains such a reference; they are effectively declared once, at the package level.

(b) the compiler is freed from the burden of maintaining lots of different namespace contexts when performing rewrites such as function inlining. Because all expressions have the same static namespace context, it only has to be recorded once at the root of the expression tree.

(c) the number of namespaces copied into the SEF file is greatly reduced, which reduces the size of the file and improves speed of loading. Most of the namespaces that currently appear in SEF files are never used. (Note that names of components such as variables and functions, and references thereto, will always appear in the SEF file as EQNames; therefore namespace bindings that allocate a prefix to these namespaces are not needed.)

Updated by Michael Kay over 5 years ago

Status changed from In Progress to Closed
Fix Committed on Branch deleted (9.9)

Having implemented and tested this feature, I found that it delivered no measurable performance benefits, and therefore cannot be justified. I have therefore reverted the changes, and am closing the issue with no change to the product.

Updated by Michael Kay over 5 years ago

Description updated (diff)
