Bug #6598
java.lang.ArrayIndexOutOfBoundsException processing 1.0 stylesheet
Description
Processing the XSLT 4.0 schema with the W3C's (1.0) xsd.xsl
issues an error message then throws an exception:
Warning: /xs:schema/xs:defaultOpenContent not matched.
Warning: /xs:schema/xs:defaultOpenContent/xs:any not matched.
Warning: /xs:schema/xs:defaultOpenContent/xs:any/@processContents not matched.
Error at xsl:copy on line 676 column 14 of xsd.xsl:
XTDE0410 An attribute node (processContents) cannot be created after a child of the
containing element. Most recent element start tag was output at line 674 of module xsd.xsl
In template rule with match="@*" on line 668 of xsd.xsl
Focus
Context item: /xs:schema/xs:defaultOpenContent[1]/xs:any[1]/@processContents
Context position: 1
Local variables
$fqgi = doc()
invoked by xsl:apply-templates at file:/Volumes/Saxonica/src/qt4cg/qtspecs/xsd.xsl#677
In template rule with match="*" on line 668 of xsd.xsl
Focus
Context item: /xs:schema/xs:defaultOpenContent[1]/xs:any[1]
Context position: 4
Local variables
$fqgi = doc()
java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
at com.saxonica.ee.config.SlotManagerEE.showStackFrame(SlotManagerEE.java:82)
at net.sf.saxon.lib.StandardDiagnostics.logStackTrace(StandardDiagnostics.java:279)
at net.sf.saxon.lib.StandardErrorReporter.outputStackTrace(StandardErrorReporter.java:634)
Updated by Michael Kay about 2 months ago
The AIOOB exception occurs while formatting the diagnostic stack trace, so let's deal with that one first.
The stack frame one below the top stack frame has a stack frame map saying there are two variables, but the array used to hold variables has only one entry.
So there are two parts to that problem. Firstly, it shouldn't happen. Secondly, if it does, the stack trace display should be more resilient.
Updated by Michael Kay about 2 months ago
I've fixed the resilience problem; the stack trace is now reporting the inconsistency and recovering, outputting:
Error at xsl:copy on line 676 column 14 of xsd.xsl:
XTDE0410 An attribute node (processContents) cannot be created after a child of the
containing element. Most recent element start tag was output at line 674 of module xsd.xsl
In template rule with match="@*" on line 668 of xsd.xsl
Focus
Context item: /xs:schema/xs:defaultOpenContent[1]/xs:any[1]/@processContents
Context position: 1
Local variables
$fqgi = doc()
invoked by xsl:apply-templates at file:/Users/mike/bugs/2024/6598-ToveyWalsh/xsd.xsl#677
In template rule with match="*" on line 668 of xsd.xsl
Focus
Context item: /xs:schema/xs:defaultOpenContent[1]/xs:any[1]
Context position: 4
Inconsistent stack frame: expecting 2 variables, but found 1
invoked by xsl:apply-templates at file:/Users/mike/bugs/2024/6598-ToveyWalsh/xsd.xsl#677
In template rule with match="*" on line 668 of xsd.xsl
Focus
Context item: /xs:schema/xs:defaultOpenContent[1]
Context position: 24
Inconsistent stack frame: expecting 2 variables, but found 1
invoked by xsl:apply-templates (tail calls omitted) at file:/Users/mike/bugs/2024/6598-ToveyWalsh/xsd.xsl#176
In template rule with match="/" on line 72 of xsd.xsl
Focus
Context item: /
Context position: 1
Local variables
$doctitle = u"Schema document for ... g/1999/XSL/Transform"
$docIsProlific = true()
An attribute node (processContents) cannot be created after a child of the containing element. Most recent element start tag was output at line 674 of module xsd.xsl
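The recovery shown in the trace above can be sketched roughly as follows. This is a minimal illustration only, not the actual Saxon change: FrameDisplaySketch and describeFrame are hypothetical names, and the real code in SlotManagerEE.showStackFrame may be structured quite differently. The idea is to compare the number of names in the stack frame map with the length of the slot array before indexing into it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stack frame display; not Saxon's SlotManagerEE.
final class FrameDisplaySketch {

    /**
     * Formats the local variables of one frame. If the declared names and
     * the slot array disagree in length, the mismatch is reported instead
     * of indexing past the end of the array.
     */
    static List<String> describeFrame(List<String> variableNames, Object[] slots) {
        List<String> lines = new ArrayList<>();
        if (variableNames.size() != slots.length) {
            lines.add("Inconsistent stack frame: expecting " + variableNames.size()
                    + " variables, but found " + slots.length);
            return lines;
        }
        lines.add("Local variables");
        for (int i = 0; i < slots.length; i++) {
            lines.add("$" + variableNames.get(i) + " = " + slots[i]);
        }
        return lines;
    }
}
```

With two declared names and a one-entry slot array, the method returns the single "Inconsistent stack frame" line rather than throwing.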
Updated by Michael Kay about 2 months ago
The stack frame that is expecting 2 variables thinks that both variables are called $fqgi.
I've stopped it at the applyTemplates call processing xs:defaultOpenContent in the unnamed mode. JIT template rule compilation kicks in here as it's the first use of the template rule, and slots are allocated, giving (correctly) a stack frame map with one variable, for $fqgi.
My suspicion now is that this template (at line 668) has a union pattern with match="*|@*", which means there are going to be two copies of the template body for the two match patterns, and that somehow the stack frames for both are going to be intermingled.
Sure enough, when we come to handle the @processContents attribute, we find ourselves dealing with a template rule whose body is non-null (indicating it has been initialised) but whose initializer is non-null (indicating it has not). This leads to the template rule being initialised for a second time, which leads to duplicate slots being allocated -- which almost certainly does no harm until we get a failure that causes the stack trace to be printed.
Updated by Michael Kay about 2 months ago
I'm now looking at why the stylesheet fails in the first place. It's fairly obvious: at line 665 we have
<xsl:element name="div">
<xsl:value-of select="concat('&lt;',name(),'>')"/>
<xsl:copy>
and if the context node is an attribute (in this case, the @processContents attribute) the xsl:value-of will produce a text node, and the xsl:copy will then produce an attribute node. Running in the browser with a 1.0 processor this is going to be a recoverable error ("implementations may either signal the error or ignore the attribute.")
This became a non-recoverable error in 2.0:
[ERR XTDE0410] It is a non-recoverable dynamic error if the result sequence used to construct the content of an element node contains a namespace node or attribute node that is preceded in the sequence by a node that is neither a namespace node nor an attribute node.
This is a documented incompatibility: §J1.4 para 6 ("In classifying such errors as non-recoverable, the Working Group used the criterion that no stylesheet author would be likely to write code that deliberately triggered the error and relied on the recovery action.")
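The rule quoted above can be illustrated with a small check over the kinds of node in a content sequence. This is a sketch only: Xtde0410Sketch, Kind and firstViolation are hypothetical names invented for the illustration and do not correspond to anything in Saxon, which detects the condition while writing to the result tree.

```java
import java.util.List;

// Hypothetical model of the XTDE0410 condition; not Saxon code.
final class Xtde0410Sketch {

    enum Kind { ATTRIBUTE, NAMESPACE, ELEMENT, TEXT }

    /**
     * Returns the index of the first attribute or namespace node that
     * follows a node of some other kind, or -1 if there is none.
     */
    static int firstViolation(List<Kind> content) {
        boolean seenChild = false;
        for (int i = 0; i < content.size(); i++) {
            Kind k = content.get(i);
            boolean attributeLike = (k == Kind.ATTRIBUTE || k == Kind.NAMESPACE);
            if (attributeLike && seenChild) {
                return i;          // attribute after a child: the error case
            }
            if (!attributeLike) {
                seenChild = true;  // from here on, attributes are too late
            }
        }
        return -1;
    }
}
```

In the failing stylesheet the content of the div element is a text node (from xsl:value-of) followed by an attribute node (from xsl:copy), which corresponds to the sequence TEXT, ATTRIBUTE in this model.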
Updated by Michael Kay about 2 months ago
- Category set to Internals
- Status changed from New to In Progress
- Assignee set to Michael Kay
- Priority changed from Low to Normal
Back with the problem of dual-initialisation of the template rule.
We split the match="*|@*" template into two separate TemplateRule objects, with separate TemplateRuleInitializer objects. When the first of these fires (assuming JIT compilation is on), the initializer calls XSLTemplate.compileTemplateRule(), which processes the source content of the template rules, and calls setBody() on all of them, to the same compiled expression. But the initializer of the other TemplateRule objects isn't set to null, so when they are first invoked, their initializer is called. As well as unnecessarily recompiling the body of the template rule, this calls ExpressionTool.allocateSlots() to allocate stackframe slots to local variables, and this goes wrong because the stackframe map already contains entries, and the newly allocated entries are additive.
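The additive behaviour can be shown with a toy model. This is an assumption-laden sketch: FrameMap and allocateSlots here are hypothetical simplifications, and the real ExpressionTool.allocateSlots() works on an expression tree rather than a list of names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of slot allocation; the names are illustrative, not Saxon's API.
final class SlotAllocationSketch {

    /** Hypothetical stand-in for a stack frame map: a growing list of variable names. */
    static final class FrameMap {
        final List<String> names = new ArrayList<>();

        /** Appends a variable and returns its slot number. */
        int allocate(String name) {
            names.add(name);
            return names.size() - 1;
        }
    }

    /** Allocates one slot per local variable, appending to whatever the map already holds. */
    static void allocateSlots(List<String> localVariables, FrameMap map) {
        for (String v : localVariables) {
            map.allocate(v);
        }
    }
}
```

Calling allocateSlots twice on the same map with the single variable fqgi leaves the map holding two entries, both named fqgi, which matches the frame that "thinks that both variables are called $fqgi".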
So, what's the solution? First thought is that when we copy the template body to all the "cloned" template rules, we also set their initializer to null to indicate that they have been fully initialized. But I don't think that quite works. For example, the initializer does type checking of the template body taking into account knowledge of the match pattern, and checking when the context item is * may produce a different expression tree from the case where the context item is @*. So, I think we do need to execute the initializer on the cloned rule, but it needs to be sensitive to the fact that some initialization has already been done.
In fact, I think it potentially needs to re-allocate slot numbers, because as a result of typechecking and optimization it has potentially produced a different expression tree, which potentially contains different local variables (for example branches of a conditional might have been removed if they were dependent on the type of the context node).
The call on XSLTemplate.jitCompile() already detects a second call so there's no harm in calling this again.
Hmmm. I think there's another issue here, which is that the body expression that results from compiling the first "clone" template is copied to the other clones AFTER type checking against the match pattern of the first clone, which might produce an expression body that's wrong for other match patterns. I'm going to try and produce a test case to demonstrate that.
Updated by Michael Kay about 2 months ago
Wrote test case apply-templates-003.
First time through, TemplateRuleInitializer.init() calls jitCompiler, calls XSLTemplate.compileDeclaration(), which copies the compiled body to all cloned template rules, and this happens before any type checking or optimization.
The TemplateRuleInitializer then uses the union pattern (in this case A|B|C) as the context item type, rather than the specific branch. As a result, the optimization of the body expression doesn't optimize away the if (. instance of element(A)) branches.
Tried changing TemplateRuleInitializer.init() so it skips much of the logic if templateRule.getBody() is non-null. This causes failures in a few tests, for example next-match-017 which fails saying "Can't elaborate a local variable reference before slot numbers have been allocated". This test doesn't involve a union pattern, it involves the same template rule being imported/included at two different precedence levels.
Generally, this code is proving a bit fragile... Any way we change it seems to break some rarely-executed path.
Updated by Michael Kay about 2 months ago
I now have a clearer picture of what is going wrong.
At the point we apply templates to the defaultOpenContent element, we do the lazy initialization of its template rule, building the code and creating a stackFrameMap with one variable name.
At the point we apply templates to the processContents attribute (called from within the above template), we do another lazy initialization, this time to the "clone" of the above template rule. This is where we create a new stackFrameMap, doubling up the number of slots. This is wasteful but is in itself harmless; the new stackFrameMap has two variables rather than one, but one goes unused.
The problem is that the new oversized stack frame map gets written to both clones. The first template rule, which is still being executed, now has a slot array of length 1 (based on the stack frame map at the time execution started), but has a stackFrameMap of length 2 (because it was overwritten when the clone template rule was initialized). This creates an inconsistency when the stack frame is later examined while creating the stack trace. The inconsistency probably does no harm otherwise.
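The ordering of events described above can be replayed in a few lines. Again this is only a sketch under simplifying assumptions: SharedFrameMapSketch and its methods are hypothetical, and the shared map is modelled as a plain list of names rather than Saxon's real stack frame map.

```java
import java.util.ArrayList;
import java.util.List;

// Toy replay of the sequence of events; not Saxon code.
final class SharedFrameMapSketch {

    /** Hypothetical map of variable names, shared by both clones of the rule. */
    static final List<String> sharedMap = new ArrayList<>();

    /** Lazy initialization of one clone: appends its variable to the shared map. */
    static void initializeClone() {
        sharedMap.add("fqgi");
    }

    /** Starting execution sizes the slot array from the map as it is at that moment. */
    static Object[] openFrame() {
        return new Object[sharedMap.size()];
    }

    /** Returns {slots in the running frame, names in the shared map}. */
    static int[] replay() {
        sharedMap.clear();
        initializeClone();                  // clone for match="*" is initialized
        Object[] outerFrame = openFrame();  // outer rule starts running with 1 slot
        initializeClone();                  // clone for match="@*" is initialized mid-execution
        return new int[] { outerFrame.length, sharedMap.size() };
    }
}
```

The running frame keeps the one-slot array it was given when execution started, while the map it refers to has grown to two names by the time the stack trace is produced.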
It's not at all clear how best to solve the problem. A substantial redesign seems not to be the right choice, given that the symptoms are very minor and easily masked. But sticky-plaster fixes all seem to have unwanted side-effects.
Updated by Michael Kay about 2 months ago
I think the sensible thing to do is to have a single Initializer at the level of the XSLTemplate, rather than at the level of the individual (clone) TemplateRule, and for this to be invoked when any of the clones is needed; it should do the initialization of all of them at the same time. That's a somewhat bigger change than I would like, but smaller fixes don't seem to work.
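One possible shape for such a template-level initializer is sketched below. It is purely illustrative: Template, Rule, ensureInitialized and invoke are hypothetical names, the "body" is a placeholder string rather than a compiled expression, and the sketch ignores the question of per-pattern type checking raised earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: one initializer owned by the template, shared by all its clones.
final class TemplateLevelInitSketch {

    static final class Rule {
        final String pattern;
        String body;                 // null until the owning template is initialized

        Rule(String pattern) {
            this.pattern = pattern;
        }
    }

    static final class Template {
        final List<Rule> clones = new ArrayList<>();
        private boolean initialized = false;
        int initCount = 0;           // counts how often the real work was done

        Template(String... patterns) {
            for (String p : patterns) {
                clones.add(new Rule(p));
            }
        }

        /** Invoked when any clone is needed; initializes all clones in one pass. */
        synchronized void ensureInitialized() {
            if (initialized) {
                return;
            }
            initCount++;
            for (Rule r : clones) {
                r.body = "compiled body for " + r.pattern;
            }
            initialized = true;
        }

        String invoke(int cloneIndex) {
            ensureInitialized();
            return clones.get(cloneIndex).body;
        }
    }
}
```

In this model, invoking the clone for "*" and then the clone for "@*" runs the initialization exactly once, so slots would be allocated a single time for the template as a whole.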
Really it would make sense to go back to having one TemplateRule per applicable mode, rather than one per sub-pattern per mode.