Project

Profile

Help

Bug #1752

closed

Scaleability of Chain sequence

Added by Michael Kay almost 11 years ago. Updated almost 11 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Performance
Sprint/Milestone:
-
Start date:
2013-04-29
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

The class Chain in 9.5 replaces ShareableSequence in 9.4, and is designed to handle a wider variety of use cases that involve incremental construction of a sequence, usually by means of a recursive function that prepends or appends one item on each call to the function. A major aim is that the cost of constructing the sequence as a whole should have linear cost; another aim is that iterating over the sequence should have linear cost. While the implementation achieves these objectives, there are some drawbacks, notably that in a typical use case iterating over a long sequence can fail with a stack overflow because it will often involve one recursive call for each item in the sequence.

Additional problems that have been found with Chain sequences are that subscripting operations do not have constant performance, and (more seriously) it is possible to construct a Chain that is lazily evaluated, where lazy evaluation depends on parts of the context which may change before the evaluation takes place.

A redesigned implementation of Chain is therefore being substituted on the 9.5 branch. The key changes are:

(a) entries in the chain are "chunked" to reduce the total number of objects maintained

(b) iteration is no longer recursive (this is achieved by an algorithm which maintains its own stack (on the Java heap) rather than relying on the Java stack

(c) the component subsequences of a Chain are always "grounded" - which eliminates lazy evaluation, other than through Closures which retain all aspects of the context on which they depend; and the Chain itself is now classified as a grounded sequence

(d) operations such as subscripting or subsequence trigger a reorganisation of the data structure which takes linear time and ensures that subsequent calls on such operations can be performed in constant time. So in a typical scenario where there is a phase of building the sequence, and another stage of accessing it, the structure is reorganized at the end of the construction phase. This may not be ideal where operations such as subscripting are used on the sequence while it is still being constructed.

Actions #1

Updated by Michael Kay almost 11 years ago

  • Status changed from In Progress to Resolved
Actions #2

Updated by O'Neil Delpratt almost 11 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in version set to 9.5.0.2
Actions #3

Updated by Michael Kay almost 11 years ago

  • Status changed from Closed to Resolved

A further patch is being submitted (post - 9.5.0.2) to ensure that generated bytecode constructing a Chain (in LetExpressionCompiler) only adds GroundedValue's to the chain. (Test case -s:prod-FunctionDecl -t:K2-FunctionProlog-31 fails).

Actions #4

Updated by Michael Kay almost 11 years ago

A further patch is being committed (post-9.5.0.2) to correct the bytecode generated for an expression such as $seq[ 3 ] where the sequence is a MemoClosure and the subscript is an integer known at compile time.

Actions #5

Updated by O'Neil Delpratt almost 11 years ago

  • Status changed from Resolved to Closed
  • Fixed in version changed from 9.5.0.2 to 9.5.1.1

Please register to edit this issue

Also available in: Atom PDF