Bug #4381
closedlast() cannot be used when streaming (grounded-but-consuming focus)
0%
Updated by Michael Kay about 5 years ago
- Subject changed from last() cannot be used when streaming to last() cannot be used when streaming (grounded-but-consuming focus)
I think this is going to be a tough one.
In the instruction
<xsl:apply-templates select="current-group()!copy-of()" mode="add-index"/>
We try to process the items in the current group one at a time, as they are encountered, rather than buffering the whole group in memory. But the template that's invoked calls last(), which can't be evaluated without reading ahead to the end of the group, and this can't be done unless the group is buffered.
It would be bad news to have to buffer the whole group for a group-adjacent instruction just in case this problem arises. We could do the buffering only for the case where a grounded-consuming operand is supplied to apply-templates, though even that would impose a cost on the many cases where last() doesn't get called.
The called template could pass back some kind of response at run-time that says "I can't process any of these items until I know how many there are going to be", triggering buffering dynamically, but that's a whole new mechanism. (And it's tricky because some processing might have already been done at the point where the problem is detected.)
Updated by Michael Kay about 5 years ago
This scenario is actually explored in detail in section 19.7 of the spec, see
https://www.w3.org/TR/xslt-30/#grounded-consuming-constructs
Updated by Michael Kay about 5 years ago
I created a variant of the test case (si-group-065) that uses xsl:for-each
rather than xsl:apply-templates
, This fails in the same way. In principle this case differs because the call on last()
is statically detectable.
Updated by Michael Kay about 5 years ago
For the time being I've patched this to produce a regular dynamic error with the message
Saxon streaming restriction: last() cannot be used when consuming a sequence of streamed nodes, even if the items being processed are grounded.
It would be nice if we could recommend some workaround that forces the buffering. An obvious idea is to write select="[current-group()!copy-of()]*?"
, because array contents are always held in memory. However, this doesn't work. Many such expressions including this one get optimized away. One that doesn't is map{'x':current-group()!copy-of()}?x
. The streaming is handled using a BufferingFeed
, and although this accumulates the results in memory, it still processes the sequence using the streamed push pipeline, which has no mechanism for evaluating last()
.
Updated by Martin Honnen about 5 years ago
Based on your comments I have tried whether using a variable
<xsl:for-each-group select="*" group-adjacent="node-name()">
<xsl:variable name="group" select="current-group()!copy-of()"/>
<xsl:apply-templates select="$group" mode="add-index"/>
</xsl:for-each-group>
ensures the buffering and then allows the use of last()
and that seems to do the job with Saxon 9.9.1.5.
Updated by Michael Kay about 5 years ago
Thanks for that workaround. I'm mildly surprised that the variable isn't inlined by the optimizer.
Updated by Michael Kay almost 5 years ago
I have been doing a number of experiments to find a solution to this problem, without a great deal of success.
I tried introducing an extension function, saxon:materialize(), to force the sequence to be fully evaluated (buffered), but (a) that's not really a solution to the conformance issue, and (b) even if buffered, it's still quite hard to get last() to work correctly because we're still using a ManualIterator and it's not clear at what point we inject the known size of the context sequence.
I tried doing some static detection: we have the ability to detect that the body of for-each depends on last(), and I tried extending this to apply-templates (by examining all the template rules in the target mode); but even if we know the dependency exists, it's hard to make use of this knowledge (for similar reasons: where do we inject the known last position into the manual iterator).
Updated by Michael Kay almost 5 years ago
- Status changed from New to Resolved
- Fix Committed on Branch trunk added
I noticed that for filter expressions, we seem to handle a predicate that invokes last() successfully (of course, we know this statically). FilterExpressionAdjunct.getFeedMaker()
detects this situation and generates a special BufferingFilterExpressionFeed
when it occurs. This is similar to the design I was attempting for a ForEach
with a static dependency on last()
.
I have added similar logic for xsl:for-each
and xsl:apply-template
s, and the two new test cases are now passing. There's probably a need to handle some other cases such as for-each-group
and analyze-string
, but we can assume they are rare and deal with them when they arise.
I'll leave this as a fix for 10.0 only, I don't want to risk disruptions to the 99 streaming code which has become quite stable.
Updated by Michael Kay over 4 years ago
- Project changed from Saxon to Non-Conformances
- Category deleted (
Streaming)
Updated by Michael Kay over 4 years ago
The attempt to tackle this for xsl:apply-templates
(by looking at all the template rules in the mode to see if they call last()
) fails when JIT template rule compilation is enabled; it requires a global perspective which is not available in this situation.
So I've scrapped this code, which means we now fail test si-group-064.
Please register to edit this issue