Bug #1844

Out-of-memory due to non-synchronisation of TinyTree statistics

Added by Michael Kay almost 11 years ago. Updated over 10 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Performance
Sprint/Milestone: -
Start date: 2013-07-16
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:
Description

Wolfgang Hoschek reports:

At company X we would sometimes (perhaps after a week or so of heavy-duty data crunching) see a seemingly random OutOfMemoryError with Saxon in production. After studying the heap dumps I suspected a race condition in TinyTree memory allocation, and examination of the source code confirmed the problem.

The underlying issue is that TinyTree.updateStatistics() updates these static fields (four of which are 64-bit doubles) without any synchronisation:

private static int treesCreated = 5;
private static double averageNodes = 4000.0;
private static double averageAttributes = 100.0;
private static double averageNamespaces = 20.0;
private static double averageCharacters = 4000.0;

If multiple threads run (unrelated) TinyTree.updateStatistics() at the same time, averageNodes and its cousins can behave somewhat like random numbers. Java guarantees atomic reads and writes only for 32-bit variables; for non-volatile 64-bit variables such as long and double it does not (JLS §17.7). averageNodes is a double, i.e. 64 bits wide. So every once in a while a parallel thread sees the "old" low 32 bits together with the "new" high 32 bits, or the other way round, and its resulting view of averageNodes is correspondingly off. In this way averageNodes may become "visible" as a huge number, which makes nodes a huge number in the TinyTree constructor, which in turn causes an OOM (next = new int[nodes]).
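To make the failure mode concrete, here is a small self-contained sketch (illustration only, not Saxon code; the two values are arbitrary) that deterministically reconstructs the kind of torn bit pattern a racing reader is permitted to observe under JLS §17.7:

// TornDoubleDemo.java -- illustration only, not Saxon code.
// JLS 17.7 allows a non-volatile long/double read to return 32 bits
// from one write and 32 bits from another. This rebuilds such a
// "torn" bit pattern by hand.
public class TornDoubleDemo {
    public static void main(String[] args) {
        double before = 1.0 / 3.0; // value the field held before the update
        double after = 1.0e9;      // value a writer thread is storing

        long beforeBits = Double.doubleToRawLongBits(before);
        long afterBits = Double.doubleToRawLongBits(after);

        // A reader that sees the new high word combined with the old low word:
        long torn = (afterBits & 0xFFFFFFFF00000000L)
                  | (beforeBits & 0x00000000FFFFFFFFL);

        System.out.println("before: " + before);
        System.out.println("after:  " + after);
        System.out.println("torn:   " + Double.longBitsToDouble(torn));
    }
}

The "torn" line prints a double that neither thread ever wrote; a statistic corrupted this way is what the TinyTree constructor then uses to size its arrays.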

A workaround is to make TinyTree.updateStatistics() synchronized on the class, to use an AtomicDouble, or similar. Back then we applied that fix and our production OOMs disappeared completely.
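For illustration, here is a minimal sketch of the synchronized workaround (hypothetical class and method shapes, and a placeholder averaging formula; not Saxon's actual code). Note that AtomicDouble is not part of the JDK but comes from libraries such as Guava; a JDK-only alternative is an AtomicLong holding Double.doubleToLongBits(value).

// Sketch of the suggested workaround -- illustrative only.
// A static synchronized method locks the Class object, so the whole
// read-modify-write of the shared statistics becomes atomic and the
// full 64-bit values are published safely to other threads.
public class TinyTreeStatistics {
    private static int treesCreated = 5;
    private static double averageNodes = 4000.0;
    // ... averageAttributes, averageNamespaces, averageCharacters likewise

    public static synchronized void updateStatistics(int nodes) {
        // Placeholder running-average formula; the point is that the
        // entire update happens under one lock.
        averageNodes = (averageNodes * treesCreated + nodes) / (treesCreated + 1);
        treesCreated++;
    }

    // Readers (e.g. the TinyTree constructor sizing its arrays) must
    // take the same lock, or they can still see a torn 64-bit value.
    public static synchronized double getAverageNodes() {
        return averageNodes;
    }
}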

I'm considering using Saxon again at company Y, but I'm concerned about this race condition leading to random blow-ups in production. I looked at the source code of 8.5.1.1 [9.5.1.1? - MK] and the same problem is still present there. Any chance this could be fixed for good, e.g. with a bit of synchronization, an AtomicDouble, or similar?

