Bug #5600
closedStack Overflow traversing an array that was built incrementally
100%
Description
I have stumbled across a bug that shows up only when running Saxon from Java, not in Oxygen.
In the attached stylesheet, an array is sent through tan:array-permutations(), which normally returns an array with 12,288 items. It does so very quickly in Oxygen (v. 24.1).
But when I call this stylesheet from a Java-based application, using Saxon 10.3, 10.8, and 11.3 HE, I get the following error:
Exception in thread "main" java.lang.StackOverflowError
at net.sf.saxon.ma.parray.ImmList2.iterator(ImmList2.java:120)
. . . . . . .
[PREVIOUS LINE REPEATED ABOUT 1000 TIMES]
When the population of the first member of $test-array is reduced by say 36 items, then it works fine in Java.
I first encountered the problem in another context with the cryptic SXLM0001 Too any nested function calls. May be due to infinite recursion. After about four hours of troubleshooting, I isolated the problem to the example in the attached stylesheet.
In Java I'm working with IntelliJ 2021.2.3 build #IC-212.5457.46
Files
Updated by Martin Honnen over 2 years ago
Which processor did you use in oXygen, only Saxon EE? Or HE as well?
Updated by Michael Kay over 2 years ago
The symptoms suggest that the array is being created as an unbalanced tree, leading to stack overflow when subsequently scanning the tree in a recursive tree walk.
Updated by Martin Honnen over 2 years ago
Interesting, I tried the code with SaxonCS and SaxonJS from the command line and it works to output e.g.
<?xml version="1.0" encoding="UTF-8"?>
<output xmlns:array="http://www.w3.org/2005/xpath-functions/array"
xmlns:tan="tag:textalign.net,2015:ns"
xmlns:xs="http://www.w3.org/2001/XMLSchema">12288</output>
but both Saxon HE 11.3 J and EE 11.3 J from the command line just crash out with lots of at net.sf.saxon.ma.parray.ImmList2.iterator(ImmList2.java:120)
lines.
But oXygen 24.1 also gives me a StackOverflow when using Saxon 10.6 EE.
Updated by Michael Kay over 2 years ago
The failure is going to be sensitive to the amount of stack space available.
Updated by Michael Kay over 2 years ago
array:join() is being called to join 12288 arrays each apparently containing one member which is a sequence of length 3.
array:join() is implemented by repeated application of array:concat().
It looks to me as if the ImmList2 binary tree is never rebalanced when adding a singleton array using concat(). This is clearly a bug.
Updated by Michael Kay over 2 years ago
There is in fact an attempt at rebalancing, but it's not effective. After 13 or so additions, the tree looks like this:
((1,1),1)
(((1,1),1),1)
(((1,1),1),(1,1))
(((1,1),1),((1,1),1))
(((1,1),1),(((1,1),1),1))
(((1,1),1),((((1,1),1),1),1))
(((1,1),1),(((((1,1),1),1),1),1))
(((1,1),1),((((((1,1),1),1),1),1),1))
(((1,1),1),(((((((1,1),1),1),1),1),1),1))
(((1,1),1),((((((((1,1),1),1),1),1),1),1),1))
(((1,1),1),(((((((((1,1),1),1),1),1),1),1),1),1))
I think it's rebalacing at the top level, but isn't recursively rebalancing below the top level.
Updated by Michael Kay over 2 years ago
The code isn't recursively rebalancing the trees created during the rebalancing operation. When we add this, we get
((1,1),1)
(((1,1),1),1)
(((1,1),1),(1,1))
(((1,1),1),((1,1),1))
(((1,1),1),(((1,1),1),1))
(((1,1),1),(((1,1),1),(1,1)))
(((1,1),1),(((1,1),1),((1,1),1)))
(((1,1),1),(((1,1),1),(((1,1),1),1)))
(((1,1),1),(((1,1),1),(((1,1),1),(1,1))))
(((1,1),1),(((1,1),1),(((1,1),1),((1,1),1))))
(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),1))))
(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),(1,1)))))
(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),((1,1),1)))))
(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),1)))))
(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),(1,1))))))
(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),(((1,1),1),((1,1),1))))))
which looks much healthier. But it's still not quite right.
Updated by Michael Kay over 2 years ago
Add another call on rebalance() and we get
((1,1),1)
(((1,1),1),1)
(((1,1),1),(1,1))
(((1,1),1),((1,1),1))
(((1,1),1),(((1,1),1),1))
(((1,1),1),(((1,1),1),(1,1)))
(((1,1),1),(((1,1),1),((1,1),1)))
(((1,1),1),(((1,1),1),(((1,1),1),1)))
(((1,1),1),(((1,1),1),(((1,1),1),(1,1))))
(((1,1),1),(((1,1),1),(((1,1),1),((1,1),1))))
((((1,1),1),((1,1),1)),(((1,1),1),(((1,1),1),1)))
((((1,1),1),((1,1),1)),(((1,1),1),(((1,1),1),(1,1))))
((((1,1),1),((1,1),1)),(((1,1),1),(((1,1),1),((1,1),1))))
((((1,1),1),((1,1),1)),(((1,1),1),(((1,1),1),(((1,1),1),1))))
((((1,1),1),((1,1),1)),(((1,1),1),(((1,1),1),(((1,1),1),(1,1)))))
((((1,1),1),((1,1),1)),(((1,1),1),(((1,1),1),(((1,1),1),((1,1),1)))))
which is now looking properly balanced. I'm just slightly concerned (a) that we're doing more work than is needed to keep it balanced, and (b) that I'm not 100% confident the process is guaranteed to terminate.
Updated by Michael Kay over 2 years ago
A couple of thoughts on this:
(a) I'm sure there must be a better way of creating an iterator that walks a binary tree, rather than using the stack of iterators we currently employ.
(b) It would be nice to use the "ZenoChain" machinery that we now use for incrementally-constructed sequences, rather than having a completely separate mechanism. The ZenoChain consolidates chunks of the sequence away from its ends, meaning there are fewer objects to traverse and less work keeping the structure balanced. If it's good enough for sequences it should be good enough for arrays...
Updated by Michael Kay over 2 years ago
Added this as XSLT3 test case arrays-306.
Updated by Michael Kay over 2 years ago
- Subject changed from SaxonHE via Java fails to create array in an iteration to Stack Overflow traversing an array that was built incrementally
- Category set to Performance
- Status changed from New to Resolved
- Assignee set to Michael Kay
- Applies to branch 11, trunk added
- Fix Committed on Branch 11, trunk added
Updated by Michael Kay over 2 years ago
Another litte thought: instead of array:join()
working sequentially through the supplied arrays doing an incremental array:concat()
on each one, it could do a divide-and-conquer approach (split the sequence in two, do array-join on each, then array-concat the results) which would mean much less incremental rebalancing is needed on the constructed tree.
Updated by Joel Kalvesmaki over 2 years ago
Not sure if this helps. I revised the function, replacing array:join()
with an incremental application of array:append()
. The revised function worked fine with large sets in Oxygen, but I encountered the same error conditions in Java.
Updated by Michael Kay over 2 years ago
Have you tried increasing the stack size in Java? It's clear from the evidence that a little more stack space will give you breathing room.
Updated by Joel Kalvesmaki over 2 years ago
I just tried doubling and tripling memory allowances, but no success--still the same error. My copy of VisualVM didn't show any unusual roofing of the memory during execution. But I may not know what to look for. I tried to remove IntelliJ from the equation by building and running from the CLI, but build dependencies were the customary thorn in my side, and I couldn't get it off the ground.
I wrote another revised permutation function, this time depending not upon arrays but upon an element tree, and the Java environment had no problem with the 12K+ permutations. Performance was only slightly longer, 136 ms extra for an operation that took on average 6,018 ms, averaged across 16 tests.
Updated by Michael Kay over 2 years ago
- Status changed from Resolved to In Progress
Reopened because this patch causes a StackOverflow in a Docbook test case (see internal bug #5612)
Decided to go back to the theory and re-read how binary tree balancing should be done. There's a clear description here:
https://appliedgo.net/balancedtree/
which emphasises that the heights of the subtrees should be compared, not their sizes. We are comparing their sizes.
Updated by Michael Kay over 2 years ago
We currently hold the size (aka weight) of a tree in each node, so perhaps we should look for an algorithm that balances trees by weight rather than height:
https://en.wikipedia.org/wiki/Weight-balanced_tree
(But note, for some reason we only hold keys in leaf nodes. No idea why that decision was made, but let's try to avoid reinventing the whole thing...)
The balancing operations (rotations) for a weight-balanced tree are identical to those for a height-balanced tree; the difference is the criterion for deciding whether the tree is balanced. This is given in Wikipedia as size(left) <= α * size(right) && size(right) <= α * size(left)
, where the optimum value of α is 0.29289 (= 1 - √2/2).
Updated by Michael Kay over 2 years ago
With this change, the DocBook test passes. But I now get a StackOverflow in XSLT3 test arrays-306, creating an iterator over an array, which clearly shows that the tree is not balanced. I've made changes to fix that, and now arrays-305 gives infinite recursion while rebalancing. It looks as if the threshold level of imbalance is too low, allowing oscillation. Perhaps this is because I took the threshold value from a binary tree used as a search tree rather than a list, where the size of a new parent node is M+N+1 rather than M+N.
Updated by Michael Kay over 2 years ago
- Status changed from In Progress to Resolved
- Platforms .NET, Java added
After a good deal more work trying to find a balancing algorithm that (a) always left the tree balanced, and (b) didn't itself "oscillate" with constant rebalancing, I decided to do what has been my intention for some time: re-implement the ImmutableArrayItem
class to use a ZenoChain
as its backing storage, rather than an ImmList
. The Immlist
structure can now be dropped. This means that extensible arrays and extensible sequences now both use the same underlying data structure, so we have better re-use internally. The ZenoChain should be more economical on memory and give better traversal (iterator) performance; it should be equally efficient for get(), append(), and prepend() operations, but it may be a little slower for put(), insert(), and remove() at arbitrary positions.
Updated by Debbie Lockett over 2 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 11.4 added
Bug fix applied in the Saxon 11.4 maintenance release.
Updated by Michael Kay over 2 years ago
- Applies to branch 10 added
Note that the bug is also present in 10.x (the new test arrays-306 fails with a StackOverflow)
Please register to edit this issue