Bug #5627
closed
PushToPullIterator does not implement the `discharge()` method
Applies to branch:
11, trunk
Fix Committed on Branch:
trunk
Fixed in Maintenance Release:
Description
Bug #5177 introduced the method SequenceIterator.discharge()
to solve the dilemma that some iterators hold resources, and therefore need to be explicitly closed if not read to completion, but some consumers of iterators aren't able to issue a close() because they don't know whether the user-written code will read more data or not. The idea is that a consumer that can't guarantee to call close() can instead call discharge()
to provide notification of the fact that close()
might not be called.
The PushToPullIterator
has a discharge() method that throws a NotImplementedException
. This hasn't caused any failure on the 11.x branch, but it is causing a couple of streaming tests to fail on the 12.x branch, and it's entirely possible that similar conditions could cause an 11.x failure. Specifically, tests stream-107 and stream-109 are failing.
What basically triggers the error is that on the Saxon 12.x path (work in progress) we're creating an unnecessary MemoClosure to hold intermediate results; and the MemoClosure is the classic case where we simply don't know whether it's going to be read to completion. In this particular case the input to the MemoClosure comes from a PushPullIterator (we're reading streamed XML input, which is being pushed to a buffer by the XML parser in one thread, while the iterator is reading the data from that buffer in another thread. If the iterator isn't properly closed then the parsing thread is going to hang.
I can get rid of the problem fairly easily in this instance by not creating the MemoClosure, but that doesn't solve the underlying problem.
Implementing discharge()
in the PushPullIterator would mean reading ahead to the end of the file, "just in case", which would destroy streaming, and eliminate the possibility of an early exit when we've read all the data we need. So there's an issue here that needs to be addressed.
I'm wondering about a different design (replacing the discharge() mechanism) where any iterator that holds resources registers itself at the level of the Controller, and the Controller tidies up any unclosed resources when it shuts down (at the end of a transformation or query).
I don't think it's too much of a problem to ensure that all affected iterators have access to the context and hence to the Controller.
I'm not quite sure how this will work for free-standing XPath expressions where each evaluation gets a new Controller.
I think the solution might be to register any SequenceIterator
that holds resources (for example, the UnparsedTextLines
iterator) with a java.lang.ref.Cleaner
, owned by the Controller. If the SequenceIterator is garbage collected without having been closed, the clean() method can close the underlying resource.
I've done some initial experiments with the Cleaner
mechanism and it looks encouraging.
Essentially, any SequenceIterator that needs to close resources should register a cleanup action with the Cleaner held by the Controller. I've prototyped this with the UnparsedTextIterator.
I've now extended the mechanism to the (small number of) iterators that need to release resources, such as CollectionIterator, MultithreadedContextMappingIterator, PushPullIterator. The Cleaner is now owned by the Configuration, not by the Controller. The mechanism seems to be working well.
The discharge() method is dropped.
Work is now needed to make the code transpile to C#.
So far the changes have only been made on the 12.x branch. This probably makes sense unless and until a user reports hitting a problem in this area - or at least, until we have some field exposure of the new mechanism to be confident it doesn't cause any new problems.
Note if we do consider retrofitting it to Saxon 11, the Cleaner requires Java 9, but Saxon 11 is currently supported on Java 8.
Further testing shows that I'm getting failures on XSLT streaming tests that do an "early exit", so further work is needed.
Those test failures occur only when running with the "-strict" option, and are due to limitations of the "early exit" test assertion. So, false alarm.
- Status changed from New to Resolved
- Applies to branch 11, trunk added
- Fix Committed on Branch trunk added
- Platforms Java added
Marking as resolved, though I decided to apply the change only on the 12.x branch for the time being.
- Fixed in Maintenance Release 12.0 added
Bug issue fix applied in the Saxon 12.0 Major Release.
Leaving this bug marked as Resolved until fix applied on the Saxon 11 branch.
Please register to edit this issue
Also available in: Atom
PDF