Project

Profile

Help

Bug #4381

closed

last() cannot be used when streaming (grounded-but-consuming focus)

Added by Michael Kay over 4 years ago. Updated about 4 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Start date:
2019-11-12
Due date:
% Done:

0%

Estimated time:
Applies to branch:
9.9, trunk
Fix Committed on Branch:
trunk
Fixed in Maintenance Release:
Found in version:
Fixed in version:
Platforms:

Description

See https://saxonica.plan.io/boards/3/topics/7672

Reproduced as XSLT3 test si-group-064

Actions #1

Updated by Michael Kay over 4 years ago

  • Subject changed from last() cannot be used when streaming to last() cannot be used when streaming (grounded-but-consuming focus)

I think this is going to be a tough one.

In the instruction

<xsl:apply-templates select="current-group()!copy-of()" mode="add-index"/>

We try to process the items in the current group one at a time, as they are encountered, rather than buffering the whole group in memory. But the template that's invoked calls last(), which can't be evaluated without reading ahead to the end of the group, and this can't be done unless the group is buffered.

It would be bad news to have to buffer the whole group for a group-adjacent instruction just in case this problem arises. We could do the buffering only for the case where a grounded-consuming operand is supplied to apply-templates, though even that would impose a cost on the many cases where last() doesn't get called.

The called template could pass back some kind of response at run-time that says "I can't process any of these items until I know how many there are going to be", triggering buffering dynamically, but that's a whole new mechanism. (And it's tricky because some processing might have already been done at the point where the problem is detected.)

Actions #2

Updated by Michael Kay over 4 years ago

This scenario is actually explored in detail in section 19.7 of the spec, see

https://www.w3.org/TR/xslt-30/#grounded-consuming-constructs

Actions #3

Updated by Michael Kay over 4 years ago

I created a variant of the test case (si-group-065) that uses xsl:for-each rather than xsl:apply-templates, This fails in the same way. In principle this case differs because the call on last() is statically detectable.

Actions #4

Updated by Michael Kay over 4 years ago

For the time being I've patched this to produce a regular dynamic error with the message

Saxon streaming restriction: last() cannot be used when consuming a sequence of streamed nodes, even if the items being processed are grounded.

It would be nice if we could recommend some workaround that forces the buffering. An obvious idea is to write select="[current-group()!copy-of()]*?", because array contents are always held in memory. However, this doesn't work. Many such expressions including this one get optimized away. One that doesn't is map{'x':current-group()!copy-of()}?x. The streaming is handled using a BufferingFeed, and although this accumulates the results in memory, it still processes the sequence using the streamed push pipeline, which has no mechanism for evaluating last().

Actions #5

Updated by Martin Honnen over 4 years ago

Based on your comments I have tried whether using a variable

            <xsl:for-each-group select="*" group-adjacent="node-name()">
                <xsl:variable name="group" select="current-group()!copy-of()"/>
                <xsl:apply-templates select="$group" mode="add-index"/>
            </xsl:for-each-group>

ensures the buffering and then allows the use of last() and that seems to do the job with Saxon 9.9.1.5.

Actions #6

Updated by Michael Kay over 4 years ago

Thanks for that workaround. I'm mildly surprised that the variable isn't inlined by the optimizer.

Actions #7

Updated by Michael Kay about 4 years ago

I have been doing a number of experiments to find a solution to this problem, without a great deal of success.

I tried introducing an extension function, saxon:materialize(), to force the sequence to be fully evaluated (buffered), but (a) that's not really a solution to the conformance issue, and (b) even if buffered, it's still quite hard to get last() to work correctly because we're still using a ManualIterator and it's not clear at what point we inject the known size of the context sequence.

I tried doing some static detection: we have the ability to detect that the body of for-each depends on last(), and I tried extending this to apply-templates (by examining all the template rules in the target mode); but even if we know the dependency exists, it's hard to make use of this knowledge (for similar reasons: where do we inject the known last position into the manual iterator).

Actions #8

Updated by Michael Kay about 4 years ago

  • Status changed from New to Resolved
  • Fix Committed on Branch trunk added

I noticed that for filter expressions, we seem to handle a predicate that invokes last() successfully (of course, we know this statically). FilterExpressionAdjunct.getFeedMaker() detects this situation and generates a special BufferingFilterExpressionFeed when it occurs. This is similar to the design I was attempting for a ForEach with a static dependency on last().

I have added similar logic for xsl:for-each and xsl:apply-templates, and the two new test cases are now passing. There's probably a need to handle some other cases such as for-each-group and analyze-string, but we can assume they are rare and deal with them when they arise.

I'll leave this as a fix for 10.0 only, I don't want to risk disruptions to the 99 streaming code which has become quite stable.

Actions #9

Updated by Michael Kay about 4 years ago

  • Project changed from Saxon to Non-Conformances
  • Category deleted (Streaming)
Actions #10

Updated by Michael Kay about 4 years ago

The attempt to tackle this for xsl:apply-templates (by looking at all the template rules in the mode to see if they call last()) fails when JIT template rule compilation is enabled; it requires a global perspective which is not available in this situation.

So I've scrapped this code, which means we now fail test si-group-064.

Please register to edit this issue

Also available in: Atom PDF