Project

Profile

Help

Bug #5484

closed

Streaming with xsl:source-document: early exit

Added by Michael Kay about 2 months ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Streaming
Sprint/Milestone:
-
Start date:
2022-05-11
Due date:
% Done:

0%

Estimated time:
Legacy ID:
Applies to branch:
11, trunk
Fix Committed on Branch:
11, trunk
Fixed in Maintenance Release:
Platforms:

Description

There are tests such as sx-intersect-021 that are annotated with early-exit-possible="true" where we are not in fact doing an early exit.

In this particular example, the code within the xsl:source-document instruction is statically reduced to <out>0</out> - it doesn't use any information from the source document at all. Perhaps that's not something that happens in real life very often, but we should handle it intelligently.

A complicating factor here is the possible presence of accumulators. I'm not sure how this works if we actually do an early exit. For example what happens with

<xsl:source-document href="in.xml">
  <xsl:if test="exists(*/item)">Gotcha!</xsl:if>
  <xsl:value-of select="accumulator-after('xxx')"/>
</xsl:source-document>

As far as I can see, the call on exists() is going to throw a QuitParsingException when it hits the first item and the xsl:source-document instruction is going to exit without evaluating the accumulator.

Need some test cases in this area...

Actions #1

Updated by Michael Kay about 2 months ago

  • Project changed from 4 to Saxon
  • Category set to Streaming
  • Priority changed from Low to Normal

There is indeed a problem, demonstrated by new test case source-document/stream-208.

The call on accumulator-after() function appears to return the value of the accumulator at the point where "early exit" was signalled on the main processing loop.

Actions #2

Updated by Michael Kay about 2 months ago

  • Status changed from New to In Progress

First step here: don't register an AccumulatorWatch with the WatchManager if there are no accumulators. (Currently AccumulatorRegistryEE.registerSelectedAccumulators() registers a watch that does nothing, which is a known inefficiency).

This gives me no new failures in the source-document tests.

Next step: in WatchManager, don't do an early exit if there is an AccumulatorWatch registered. That's easier said than done. What exactly should the WatchManager do if there's a QuitParsingException but there are outstanding accumulator (or other?) watches. Needs some experimenting.

Actions #3

Updated by Michael Kay about 2 months ago

We're still getting the value of the accumulator at the point where exists() thinks it has finished. This is because ExistenceFeed.action() calls getNextOutputter().close(), which eventually triggers a close() on the BlockFeed representing the body of the xsl:source-document instruction, which causes evaluation of the instructions that follow the consuming instruction, including the instruction that evaluates the accumulator.

I think this is going to need a modification to all Feeds that terminate early. We can probably recognize these as they call Terminator.terminate(), and there are only 13 of them.

Actions #4

Updated by Michael Kay about 2 months ago

This all seems to be working nicely, though it's under-tested.

The tests that exit early are supposed to be marked as such, but we're not always issuing a warning on early exit, so the test suite driver isn't able to check this rigorously. I'm now trying to improve this.

The WatchManager does warning(quit) when it catches a QuitParsingException during startElement() processing. I suspect it needs to do the same on other paths.

Actions #5

Updated by Michael Kay about 2 months ago

The sx-IntersectExpr tests are proving troublesome. Many of them are marked to expect an early exit. This is largely because, as written, static type inference can work out that the intersection is empty (so the input document is not needed). We're not capturing that as an "early exit" condition.

If I change the type of the "insertion" operand so it overlaps the type of the streamed operand, we are reading the streamed operand to completion. But we could be smarter - we could detect dynamically that the intersection of a set of streamed nodes and a set of unstreamed nodes is bound to be empty.

Indeed, I'm inclined to question the whole thing. If both operands of "intersect" are streamed nodes, then we have two consuming operands so the expression is non-streamable. If only one operand is streamed, then the intersection will always be empty. Except, possibly, when one operand contains a blend of streamed and unstreamed nodes; but then the streamed nodes can always be ignored. (The same arguments apply to "except").

POSTSCRIPT: this is incorrect. Union, intersect, and except expression do allow two consuming operands. This has been addressed under bug #5488

Actions #6

Updated by Michael Kay about 2 months ago

  • Status changed from In Progress to Resolved
  • Applies to branch 11, trunk added
  • Fix Committed on Branch 11, trunk added

Fixed on the 11 and 12 branches. Decided not to fix this on 10.x in the interests of stability.

Actions #7

Updated by Michael Kay about 1 month ago

  • Status changed from Resolved to In Progress

Reopened because test case mode-1107b is failing; it is expecting the accumulator watch to have been registered, but it is null.

Actions #8

Updated by Michael Kay about 1 month ago

  • Status changed from In Progress to Resolved

Fixed; in AccumulatorRegistryEE.getStreamingAccumulatorValue(), need to allow for the MultiAccumulatorWatch being null when there are no applicable accumulators.

Please register to edit this issue

Also available in: Atom PDF