Bad performance of xsl:result-document with deeply nested output content
I have a problem with bad performance of xsl:result-document when outputting html content with more complicated/nested structure.
This is my frequent case as I generate bigger html fragments on server and paste them to defined places in my html (with xsl:copy-of).
Attachment shows it in a simple way.
#2 Updated by Debbie Lockett about 1 year ago
- Subject changed from Bad performance of xsl:result-document to Bad performance of xsl:result-document with deeply nested output content
- Status changed from New to In Progress
- Priority changed from Low to Normal
The problem is certainly to do with producing output that is deeply nested. (And not particular to nested tables as in the provided test, it also occurs for nested divs, etc.)
In the output content contruction method (Expr.makeComplexContent), effectively all child nodes are copied before being appended to a parent. However, the copying mechanism (DomUtils.copyItem) is recursive (copying a node involves copying all of its children). So we end up copying each node something like n^2 times where n is its nested depth.
Which is clearly madness! And bad news for performance.
This may then happen all over again as the output is actually appended to the HTML page.
The node copying was added to fix the issues https://saxonica.plan.io/boards/5/topics/6589 and https://saxonica.plan.io/issues/2944, but clearly the construction should be made smarter to recognise when a child node is already a copy, and so does not need to be copied again (and again and again...)
#3 Updated by Debbie Lockett about 1 year ago
JS unit tests added ('performance' dir).Various code fixes:
- References to variables always return a copy, to prevent node destruction (re bug https://saxonica.plan.io/issues/2944) - i.e. for 'gVarRef', 'varRef', 'supplied' in Expr.js
- Nodes are labelled with the property _saxonIsTemporary when they are created by Saxon-JS when constructing a temporary tree e.g. in the Context.createElement method (used by DomUtils.shallowCopy; in 'compElem' & 'elem' in Expr.js); when copying text nodes in DomUtils.copyItem, or creating them in 'valueOf' in Expr.js
- Element copying in Expr.makeComplexContent will now only occur if it is really necessary, i.e. when the element is not already a copy, i.e. when the element does not have the property _saxonIsTemporary, e.g. when it is an element in the HTML page or a source document
- Ensure that context.resultDocument is set to window.document for the context supplied to Expr.makeComplexContent when:
(a) transform option destination is appendToBody|prependToBody|replaceBody, or
(b) the constructed content of an xsl:result-document will be added to the HTML page (i.e. href is '?.' or begins with #)
Then the constructed content in the relevant document fragment can just be moved across to the target, without the need for further copying, because uses of context.createElement method will have produced HTMLElements within makeComplexContent as required (re issue https://saxonica.plan.io/boards/5/topics/6589).
#4 Updated by Michael Kay about 1 year ago
We found that not only are newly created nodes sometimes being copied unnecessarily, but the code for deep copy of an element takes exponential time (specifically O(2^n) where n is the depth of the subtree). The existing code invokes shallowCopy with the content being a sequence containing copies of the original element's children; the shallowCopy routine then copies these again. So each of the children is copied twice; the grandchildren are therefore copied 4 times, the greatgrandchildren 8 times, and so on.
#5 Updated by Debbie Lockett about 1 year ago
The previous "fix" to always return copies for references to variables (i.e. for 'gVarRef', 'varRef', 'supplied') has been retracted. This was unnecessary (and slow) in various cases. New mechanisms are used to determine when to make copies of nodes.
Instead of labelling created nodes with _saxonIsTemporary (as in previous fix), we label with _saxonIsLocal, as outlined below.
Firstly, we now do more at compile time (Java code fixes on 9.7 and 9.8 branches). The ExpressionTool.isLocalConstructor() method is added to ask whether a supplied expression is a nested node constructor. This returns true if the child expression creates nodes that will only be used as children of some parent node (meaning that they do not need to be copied). The child expression (i.e. compElem, copy, doc, elem, comment, compAtt, att, namespace, procInst, valueOf) is then labelled with an "l" flag when isLocal() is true.
In Saxon-JS, when this flag is present, the created node is labelled with 'node._saxonIsLocal = true'.
We check for this label during the append method of Expr.makeComplexContent. If child._saxonIsLocal == true, then the child is "attachable", i.e. it may be possible to attach the child directly to the parent without needing a copy. A copy is made if the child is not attachable, or if the parent and child are not of the same type (i.e. both HTMLElements or both XML elements) - this ensures that constructed trees are homogeneous. Once attached to a parent, the child is no longer attachable, so we set child._saxonIsLocal = false.
Also made the fix for Mike's previous point: removed the use of DomUtils.shallowCopy when doing DomUtils.copyItem on an element.
Please register to edit this issue