Project

Profile

Help

Bug #3287

closed

Performance bottleneck in SequenceType

Added by William McCusker about 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Performance
Sprint/Milestone:
-
Start date:
2017-06-16
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
9.7, 9.8
Fix Committed on Branch:
9.7, 9.8
Fixed in Maintenance Release:
Platforms:

Description

Saxon-HE 9.7.0-18

When processing a large number of paths in parallel with many threads (150+) our application was hitting a bottle neck in net.sf.saxon.value.SequenceType.makeSequenceType(ItemType, int) from a static synchronizedMap.

Would it be possible to instead use a ConcurrentHashMap for the "pool" field in SequenceType? This will improve throughput when accessing the "pool" to get the cached array. Right now because it a synchronizedMap every call to get on the "pool" is paying the cost of synchronizing.

Actions #1

Updated by Michael Kay about 7 years ago

Thanks: we're very reliant on users to tell us where bottlenecks occur in real production workloads.

I'm wondering whether this pool mechanism actually adds value at all. Object creation in Java is much less of an overhead than it used to be.

If calls on SequenceType.makeSequenceType() are so common as to cause a bottleneck, it would be nice to know where these calls are coming from, because perhaps there are other measures that would be more effective. At a first glance nearly all the calls on the method are during stylesheet/query compilation which we have usually assumed is less performance-critical than run-time execution, though we're starting to take it much more seriously.

Actions #2

Updated by Michael Kay about 7 years ago

We're only actually using the pool for built-in atomic types, and I'm inclined to replace this mechanism by giving each BuiltInAtomicType a set of four fields containing the SequenceType's one, oneOrMore, zeroOrMore, zeroOrOne, with corresponding memo-getters. These don't need to be synchronized, or at worst they would be synchronized on the BuiltInAtomicType object. In fact we could put these getter methods on ItemType. In the vast majority of cases when we call SequenceType.makeSequenceType() we know the atomic type statically, so this must be better than a hash table lookup.

Actions #3

Updated by Michael Kay about 7 years ago

I have implemented this change on the 9.8 branch (that is, the pool of SequenceTypes for BuiltInAtomicTypes is abolished, and replaced with four lazily-evaluated fields in the BuiltInAtomicType object).

Actions #4

Updated by Michael Kay about 7 years ago

  • Status changed from New to Resolved
  • Applies to branch 9.8 added
  • Fix Committed on Branch 9.7, 9.8 added

I have extended the 9.8 fix to apply to other commonly used item types (other than BuiltInAtomicType).

For 9.7 I have implemented the simpler and less disruptive fix of using a ConcurrentHashMap for the cache, as suggested.

Actions #5

Updated by William McCusker about 7 years ago

Great, thanks! Where can I find information about when the next maintenance release of 9.7 will be?

Actions #6

Updated by Michael Kay about 7 years ago

Sorry, we don't announce (or plan!) maintenance releases more than a few days in advance. We produce them when there's a critical bug with no workaround, or when the level of bugs gets to a point where a release is worthwhile, or when it's a few months since the last one.

Actions #7

Updated by William McCusker about 7 years ago

I'll keep watch for next release then. Thanks!

Actions #8

Updated by O'Neil Delpratt about 7 years ago

  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.8.0.2 added

Patch applied in the Saxon 9.8.0.2 maintenance release. Leaving bug issue marked as resolved until applied in the next Saxon 9.7 maintenance release.

Actions #9

Updated by O'Neil Delpratt about 7 years ago

  • Status changed from Resolved to Closed
  • Fixed in Maintenance Release 9.7.0.19 added

Bug fix applied in the 9.7.0.19 maintenance release.

Actions #10

Updated by Joseph Benken almost 7 years ago

William and I were looking for the fix in 9.7.0-19 but don't see it, at least not the change we expected. SequenceType.pool is still using Collections.synchronizedMap(). After profiling we still see the same performance bottleneck here in SequenceType.

Actions #11

Updated by Michael Kay almost 7 years ago

  • Status changed from Closed to In Progress

Sorry, it looks as if I failed to get this patch into the main 9.7 branch and it therefore missed the maintenance build.

Actions #12

Updated by Michael Kay almost 7 years ago

  • Category set to Performance
  • Status changed from In Progress to Resolved
  • Assignee set to Michael Kay

Patch is now correctly committed on the 9.7 branch.

Actions #13

Updated by O'Neil Delpratt almost 7 years ago

  • Fixed in Maintenance Release 9.7.0.20 added
  • Fixed in Maintenance Release deleted (9.8.0.2, 9.7.0.19)

Bug fix applied in the Saxon 9.7.0.20 maintenance release.

Actions #14

Updated by O'Neil Delpratt almost 7 years ago

  • Status changed from Resolved to Closed
  • Fixed in Maintenance Release 9.8.0.4 added

Bug fix applied in the 9.8.0.4 maintenance release.

Please register to edit this issue

Also available in: Atom PDF