Bug #3287
closed
Performance bottleneck in SequenceType
Applies to branch:
9.7, 9.8
Fix Committed on Branch:
9.7, 9.8
Fixed in Maintenance Release:
Description
Saxon-HE 9.7.0-18
When processing a large number of paths in parallel with many threads (150+) our application was hitting a bottle neck in net.sf.saxon.value.SequenceType.makeSequenceType(ItemType, int) from a static synchronizedMap.
Would it be possible to instead use a ConcurrentHashMap for the "pool" field in SequenceType? This will improve throughput when accessing the "pool" to get the cached array. Right now because it a synchronizedMap every call to get on the "pool" is paying the cost of synchronizing.
Thanks: we're very reliant on users to tell us where bottlenecks occur in real production workloads.
I'm wondering whether this pool mechanism actually adds value at all. Object creation in Java is much less of an overhead than it used to be.
If calls on SequenceType.makeSequenceType() are so common as to cause a bottleneck, it would be nice to know where these calls are coming from, because perhaps there are other measures that would be more effective. At a first glance nearly all the calls on the method are during stylesheet/query compilation which we have usually assumed is less performance-critical than run-time execution, though we're starting to take it much more seriously.
We're only actually using the pool for built-in atomic types, and I'm inclined to replace this mechanism by giving each BuiltInAtomicType a set of four fields containing the SequenceType's one, oneOrMore, zeroOrMore, zeroOrOne, with corresponding memo-getters. These don't need to be synchronized, or at worst they would be synchronized on the BuiltInAtomicType object. In fact we could put these getter methods on ItemType. In the vast majority of cases when we call SequenceType.makeSequenceType() we know the atomic type statically, so this must be better than a hash table lookup.
I have implemented this change on the 9.8 branch (that is, the pool of SequenceTypes for BuiltInAtomicTypes is abolished, and replaced with four lazily-evaluated fields in the BuiltInAtomicType object).
- Status changed from New to Resolved
- Applies to branch 9.8 added
- Fix Committed on Branch 9.7, 9.8 added
I have extended the 9.8 fix to apply to other commonly used item types (other than BuiltInAtomicType).
For 9.7 I have implemented the simpler and less disruptive fix of using a ConcurrentHashMap for the cache, as suggested.
Great, thanks! Where can I find information about when the next maintenance release of 9.7 will be?
Sorry, we don't announce (or plan!) maintenance releases more than a few days in advance. We produce them when there's a critical bug with no workaround, or when the level of bugs gets to a point where a release is worthwhile, or when it's a few months since the last one.
I'll keep watch for next release then. Thanks!
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.8.0.2 added
Patch applied in the Saxon 9.8.0.2 maintenance release. Leaving bug issue marked as resolved until applied in the next Saxon 9.7 maintenance release.
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.7.0.19 added
Bug fix applied in the 9.7.0.19 maintenance release.
William and I were looking for the fix in 9.7.0-19 but don't see it, at least not the change we expected. SequenceType.pool is still using Collections.synchronizedMap(). After profiling we still see the same performance bottleneck here in SequenceType.
- Status changed from Closed to In Progress
Sorry, it looks as if I failed to get this patch into the main 9.7 branch and it therefore missed the maintenance build.
- Category set to Performance
- Status changed from In Progress to Resolved
- Assignee set to Michael Kay
Patch is now correctly committed on the 9.7 branch.
- Fixed in Maintenance Release 9.7.0.20 added
- Fixed in Maintenance Release deleted (
9.8.0.2, 9.7.0.19)
Bug fix applied in the Saxon 9.7.0.20 maintenance release.
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.8.0.4 added
Bug fix applied in the 9.8.0.4 maintenance release.
Please register to edit this issue
Also available in: Atom
PDF