Scaleability of Chain sequence
The class Chain in 9.5 replaces ShareableSequence in 9.4, and is designed to handle a wider variety of use cases that involve incremental construction of a sequence, usually by means of a recursive function that prepends or appends one item on each call to the function. A major aim is that the cost of constructing the sequence as a whole should have linear cost; another aim is that iterating over the sequence should have linear cost. While the implementation achieves these objectives, there are some drawbacks, notably that in a typical use case iterating over a long sequence can fail with a stack overflow because it will often involve one recursive call for each item in the sequence.
Additional problems that have been found with Chain sequences are that subscripting operations do not have constant performance, and (more seriously) it is possible to construct a Chain that is lazily evaluated, where lazy evaluation depends on parts of the context which may change before the evaluation takes place.
A redesigned implementation of Chain is therefore being substituted on the 9.5 branch. The key changes are:
(a) entries in the chain are "chunked" to reduce the total number of objects maintained
(b) iteration is no longer recursive (this is achieved by an algorithm which maintains its own stack (on the Java heap) rather than relying on the Java stack
(c) the component subsequences of a Chain are always "grounded" - which eliminates lazy evaluation, other than through Closures which retain all aspects of the context on which they depend; and the Chain itself is now classified as a grounded sequence
(d) operations such as subscripting or subsequence trigger a reorganisation of the data structure which takes linear time and ensures that subsequent calls on such operations can be performed in constant time. So in a typical scenario where there is a phase of building the sequence, and another stage of accessing it, the structure is reorganized at the end of the construction phase. This may not be ideal where operations such as subscripting are used on the sequence while it is still being constructed.
#3 Updated by Michael Kay about 3 years ago
- Status changed from Closed to Resolved
A further patch is being submitted (post - 126.96.36.199) to ensure that generated bytecode constructing a Chain (in LetExpressionCompiler) only adds GroundedValue's to the chain. (Test case -s:prod-FunctionDecl -t:K2-FunctionProlog-31 fails).