Project

Profile

Help

Bug #4469

closed

XQJ bindDocument() fails java.lang.IllegalStateException: Attempt to output end tag with no matching start tag

Added by Michael Kay about 4 years ago. Updated about 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
XQuery Java API
Sprint/Milestone:
-
Start date:
2020-02-27
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
9.9, trunk
Fix Committed on Branch:
9.9, trunk
Fixed in Maintenance Release:
Platforms:

Description

Revealed by JUnit test case TestXQueryUsingXQJ/testBindDocument

The problem seems to be that when TinyTreeWalker encounters a TinyTextualElement node, it should deliver a sequence of three events: startElement, text, endElement, but it is only sending the last two of the three.

I've no idea why this unit test failure hasn't been picked up earlier.

XQJ is pretty well the only thing using the pull pipeline these days (everything else is in push mode), so the code doesn't get well tested.

Also failing on the development branch.

Actions #1

Updated by Michael Kay about 4 years ago

The problem only arises with mixed content, for example with a document like

<a>xx<b>yy</b></a>

When the TinyTextualElement <b>yy</b> is encountered and the most recent event was a text event, the TinyTreeWalker assumes that we've already emitted the "yy" and that we can now emit the endElement event for </b>, whereas in fact we haven't we've only just emitted the xx.

Actions #2

Updated by Michael Kay about 4 years ago

The answer to this is probably to maintain a queue of events waiting to be output. When next() is called and there are items on the queue, it emits the next item on the queue. When a TinyTextualElement is processed, it should put three events on the queue, for the startElement, text, and endElement.

Actions #3

Updated by Michael Kay about 4 years ago

I wrote a new batch of unit tests for the TinyTreeWalker and discovered a lot more cases that were failing.

Attempted a rewrite, but found it challenging to get it all right.

In principle it's just an inversion of the logic used by TinyElementImpl.copy(), but it seems harder to achieve in pull mode because of all the state that needs to be maintained.

Since TinyTreeWalker is just an optimization anyway, and it's not used on any very important paths, it seems simplest to scrap it and just use the generic TreeWalker instead (this works on all trees. It has its own complexities, being based on a stack of iterators, but it works...).

This exposed one other bug: we get a NullPointerException if getSystemId() is called when the tree walker is in a particular state.

Actions #4

Updated by Michael Kay about 4 years ago

  • Category set to XQuery Java API
  • Status changed from New to Resolved
  • Applies to branch 9.9, trunk added
  • Fix Committed on Branch 9.9, trunk added

Fixed.

Replaced use of TinyTreeWalker by the generic TreeWalker.

Made changes to the generic TreeWalker to make it more resilient to methods being called when there is no current node (e.g. before reading the first node and after reading the last).

Actions #5

Updated by O'Neil Delpratt about 4 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.9.1.7 added

Patch applied in the 9.9.1.7 maintenance release.

Please register to edit this issue

Also available in: Atom PDF