More than 1G of text in a TinyTree
In 11.0 we lifted limits so that the LargeTextBuffer, which holds text nodes in the TinyTree, can expand beyond 2^31 characters. However, the offsets into the buffer (in the alpha and beta arrays) are still 32-bit ints, so this doesn't achieve much. In fact, all it seems to achieve is that we no longer fail cleanly when the limit is exceeded.
Tree size: 2397877 nodes, -1857909795 characters, 298610 attributes
java.lang.ArrayIndexOutOfBoundsException: -32768
    at java.util.ArrayList.elementData(ArrayList.java:424)
    at java.util.ArrayList.get(ArrayList.java:437)
    at net.sf.saxon.str.LargeTextBuffer.getSegment(LargeTextBuffer.java:255)
    at net.sf.saxon.str.LargeTextBuffer.substring(LargeTextBuffer.java:377)
    at net.sf.saxon.tree.tiny.TinyTextImpl.getStringValue(TinyTextImpl.java:50)
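The negative values in this output are what one would expect from 32-bit signed arithmetic wrapping around. The sketch below reproduces them under two assumptions that are not confirmed here: that the buffer is divided into segments of 65536 characters (so the segment index is the offset shifted right by 16 bits), and that the character count wrapped exactly once. The class name OffsetWrapDemo and its methods are hypothetical, not part of Saxon.

```java
// Illustrative sketch only; not Saxon code.
final class OffsetWrapDemo {

    // ASSUMPTION: segments of 2^16 = 65536 characters.
    static final int SEGMENT_BITS = 16;

    // Segment index derived from a 32-bit signed offset.
    static int segmentIndex(int offset) {
        return offset >> SEGMENT_BITS;
    }

    // Reinterpret a wrapped 32-bit count as an unsigned value
    // (valid only if the count wrapped exactly once).
    static long unwrappedCount(int wrapped) {
        return Integer.toUnsignedLong(wrapped);
    }

    public static void main(String[] args) {
        int lastValid = Integer.MAX_VALUE;   // 2^31 - 1
        int next = lastValid + 1;            // wraps to Integer.MIN_VALUE

        // The first offset past the limit maps to a negative segment index.
        System.out.println(segmentIndex(next));          // -32768

        // The reported character count, read as unsigned.
        System.out.println(unwrappedCount(-1857909795)); // 2437057501
    }
}
```

Under these assumptions, the first offset past 2^31 - 1 yields exactly the index -32768 seen in the exception, and the reported character count corresponds to roughly 2.4 billion characters.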
A possible design to increase the capacity without penalty for "ordinary" users might be for the alpha and beta entries to hold pointers into the "current region" of the text buffer, with a separate index holding a mapping from ranges of node numbers to regions.
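A minimal sketch of this idea follows, assuming that regions are opened in ascending node-number order and that each region's starting position is kept as a 64-bit value. The class RegionIndex and its methods are hypothetical names for illustration; they do not exist in Saxon, and the real design might differ substantially.

```java
import java.util.Arrays;

// Hypothetical sketch: maps ranges of node numbers to regions of the text
// buffer, so that per-node offsets can stay 32-bit and region-relative.
final class RegionIndex {

    private int[] firstNode = new int[4];     // first node number of each region
    private long[] regionBase = new long[4];  // absolute start offset of each region
    private int size = 0;

    // Open a new region beginning at the given node number.
    // ASSUMPTION: called with strictly ascending node numbers.
    void startRegion(int firstNodeNr, long base) {
        if (size == firstNode.length) {
            firstNode = Arrays.copyOf(firstNode, size * 2);
            regionBase = Arrays.copyOf(regionBase, size * 2);
        }
        firstNode[size] = firstNodeNr;
        regionBase[size] = base;
        size++;
    }

    // Convert a node's region-relative offset to an absolute 64-bit offset.
    long absoluteOffset(int nodeNr, int relativeOffset) {
        // Fast path: a tree with a single region starting at 0 pays
        // nothing beyond this check.
        if (size == 1 && regionBase[0] == 0L) {
            return relativeOffset;
        }
        int pos = Arrays.binarySearch(firstNode, 0, size, nodeNr);
        // Not found: binarySearch returns -(insertionPoint) - 1, and the
        // containing region is the one just before the insertion point.
        int region = pos >= 0 ? pos : -pos - 2;
        if (region < 0) {
            throw new IllegalArgumentException("No region for node " + nodeNr);
        }
        return regionBase[region] + relativeOffset;
    }
}
```

In this sketch a document that fits in one region takes the fast path, which is the "no penalty for ordinary users" property; larger documents pay a binary search over the (typically very small) list of regions.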