Patch #6540
closedNamePool - Concurrency/Locking behavior improvements
0%
Description
Context:
- Java 21, Eclipse Temurin/Adoptium
- SaxonJ 12.5 HE, JAXP/Saxon API, some custom extension functions
- lots of big documents without any namespaces
- web application doing various transformations for each request (concurrency depends on users)
- several transformations using
StAXSource
- using Woodstox 6.7 for StAX
- using StAX since we actually have some
javax.xml.stream.StreamFilter
s removing elements not needed for some transformations reducing size of TinyTree and speeding up transformation
Locking behavior¶
I saw net.sf.saxon.om.NamePool#allocateFingerprint
on and off in the past years appearing in different profiles especially around lock usage, now finally dug deeper into it and made a patch trying to improve behavior.
Ran lock profiling using async profiler ( https://github.com/async-profiler/async-profiler ) like:
asprof -d 60 -e lock -f locksprof.html PID
Lock stacks around NamePool mostly look like this:
net.sf.saxon.om.NamePool
net/sf/saxon/om/NamePool.allocateFingerprint
net/sf/saxon/om/FingerprintedQName.<init>
net/sf/saxon/pull/StaxBridge.next
net/sf/saxon/pull/PullFilter.next
net/sf/saxon/pull/PullPushTee.next
net/sf/saxon/pull/PullConsumer.consume
net/sf/saxon/pull/PullPushCopier.copy
net/sf/saxon/pull/PullSource.deliver
net/sf/saxon/pull/ActiveStAXSource.deliver
net/sf/saxon/event/Sender.send
net/sf/saxon/Controller.makeSourceTree
net/sf/saxon/s9api/XsltTransformer.transform
net/sf/saxon/jaxp/TransformerImpl.transform
Actual performance impact is a bit hard to measure since it depends on concurrency and there are some timing variations from other application code. So far I ran tests with 7 threads doing continuous requests calling the same transformation on same large document but I'll also try to set up a bigger test with a lot more threads.
I guess one of the issues is also from https://saxonica.plan.io/issues/3366 if that's still valid ?:
Note: on the push() path, the ReceivingContentHandler maintains a local cache of allocated name codes, but there is no equivalent on the pull path.
Another specialty is lack of namespaces so it's going through net.sf.saxon.om.NoNamespaceName#obtainFingerprint
which calls allocateFingerprint with NamespaceUri.NULL uri.
All measurements/observations are after stylesheets were compiled and a few transformations already ran on same documents, so the NamePool already contains all names that will be seen.
Patch summary/performance¶
Attached patch NamePool_locking_behavior_improvements.patch
that I'm testing now in my environment.
Hard to measure exact performance impact, I'll try to see if I manage to test in production with a lot of requests but so far results from a test environment being called continuously by 7 threads doing transformations:
-
NamePool
completely disappeared from async profiler lock profiling traces- funnily enough now I see
Expression.getSlotsUsed
onFilterExpression
in lock monitoring...it's also synchronized, maybe something for another time...
- funnily enough now I see
- ran async profiler in CPU profiling mode (
asprof -d 60 -e cpu -f cpuflame.html PID
before/after and thereFingerprintedQName.<init>
went from 3.9% to 1.3% of total traces going throughTransformerImpl.transform
- no significant differences in total transformation time (might be same 1-2% but hard to measure reliably at that point, maybe more visible when I can bump up the threads a lot more in later tests)
Patch details¶
Removed synchronized
from allocateFingerprint method and moved to a synchronized block.
The reserved/Saxon/StandardNames checks do not need synchronization since they're simple checks or StandardNames where the lookup map is pre-populated in a static block.
From a memory point of view there's an extra allocation of StructuredQName since that's now outside synchronized block but that's just a wrapper on existing parameters doesn't seem to allocate anything else in construction.
Added fingerprintGuard
as a private guard object on which to lock instead of the NamePool instance. NamePool can be retrieved from Configuration or other places so this avoids external code accidentally locking on it and causing other problems.
There are various places in Saxon where allocateFingerprint is called with NamespaceUri.NULL as parameter and while NamespaceUri#isReserved
has a check against null
uri it doesn't check against NamespaceUri.NULL so I added it as condition to the if block.
Duplicated the initial qNameToInteger.getOrDefault
also inside the synchronized block. In the happy path if the fingerprint is already allocated for the qName this completely avoids any locking (as per ConcurrentHashMap javadoc retrieval operations do not entail locking
). If a new fingerprint needs to be allocated there is an extra concurrent map get but that seems like it could be worth it. If new fingerprint creation is a hot path it's probably going to hit max fingerprint.
Files
Please register to edit this issue