Bug #3294
closedThread contention on LRUCache.get
100%
Description
Running XQuery on Saxon-PE 9.7.0-18 with 4 concurrent threads running the same query, I am seeing a lot of contentions between threads with the following stack trace:
pool-1-thread-3 [BLOCKED]
java.util.Collections$SynchronizedMap.get(Object)
net.sf.saxon.expr.sort.LRUCache.get(Object) LRUCache.java:58
net.sf.saxon.lib.ConversionRules.getConverter(AtomicType, AtomicType) ConversionRules.java:196
net.sf.saxon.expr.CastExpression.doCast(AtomicValue, XPathContext) CastExpression.java:366
net.sf.saxon.expr.CastExpression.evaluateItem(XPathContext) CastExpression.java:405
net.sf.saxon.expr.CastExpression.evaluateItem(XPathContext) CastExpression.java:31
net.sf.saxon.expr.GeneralComparison.effectiveBooleanValue(XPathContext) GeneralComparison.java:616
net.sf.saxon.expr.instruct.Choose.choose(XPathContext) Choose.java:865
net.sf.saxon.expr.instruct.Choose.evaluateItem(XPathContext) Choose.java:893
net.sf.saxon.expr.LetExpression.evaluateItem(XPathContext) LetExpression.java:521
net.sf.saxon.expr.parser.ExpressionTool.evaluate(Expression, int, XPathContext, int) ExpressionTool.java:314
net.sf.saxon.expr.LetExpression.eval(XPathContext) LetExpression.java:500
net.sf.saxon.expr.LetExpression.processLeavingTail(XPathContext) LetExpression.java:702
net.sf.saxon.expr.instruct.Block.processLeavingTail(XPathContext) Block.java:653
net.sf.saxon.expr.instruct.Instruction.process(XPathContext) Instruction.java:149
net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(XPathContext, NodeInfo) ElementCreator.java:364
net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(XPathContext) ElementCreator.java:311
net.sf.saxon.expr.instruct.Instruction.process(XPathContext) Instruction.java:149
net.sf.saxon.expr.flwor.ReturnClausePush.processTuple(XPathContext) ReturnClausePush.java:37
net.sf.saxon.expr.flwor.ForClausePush.processTuple(XPathContext) ForClausePush.java:44
net.sf.saxon.expr.flwor.FLWORExpression.process(XPathContext) FLWORExpression.java:864
net.sf.saxon.expr.instruct.UserFunction.process(Sequence[], XPathContextMajor) UserFunction.java:626
net.sf.saxon.expr.UserFunctionCall.process(XPathContext) UserFunctionCall.java:549
net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(XPathContext, NodeInfo) ElementCreator.java:364
net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(XPathContext) ElementCreator.java:311
net.sf.saxon.expr.instruct.Instruction.process(XPathContext) Instruction.java:149
net.sf.saxon.expr.flwor.ReturnClausePush.processTuple(XPathContext) ReturnClausePush.java:37
net.sf.saxon.expr.flwor.ForClausePush.processTuple(XPathContext) ForClausePush.java:44
net.sf.saxon.expr.flwor.FLWORExpression.process(XPathContext) FLWORExpression.java:864
net.sf.saxon.expr.instruct.UserFunction.process(Sequence[], XPathContextMajor) UserFunction.java:626
net.sf.saxon.expr.UserFunctionCall.process(XPathContext) UserFunctionCall.java:549
net.sf.saxon.query.XQueryExpression.run(DynamicQueryContext, Result, Properties) XQueryExpression.java:414
net.sf.saxon.s9api.XQueryEvaluator.run(Destination) XQueryEvaluator.java:370
org.talend.transform.xquery.SaxonProcessor.run(Source, ContentHandler) SaxonProcessor.java:48
org.talend.transform.engine.xquery.XQueryConcurTest$MyCallable.call() XQueryConcurTest.java:114
org.talend.transform.engine.xquery.XQueryConcurTest$MyCallable.call() XQueryConcurTest.java:1
java.lang.Thread.run()
forum link: https://saxonica.plan.io/boards/4/topics/6819
Attached the xquery we are running.
Thank you
Files
Updated by Michael Kay over 7 years ago
See also the discussion a couple of years ago at https://saxonica.plan.io/boards/3/topics/6004,
Updated by Michael Kay over 7 years ago
Looking first at whether the query could be changed to avoid the problem:
Firstly, I think most of the casts are unnecessary. Consider the first let clause:
let $unconvertedValue := (
if (fn:count($out_AVRO_root/AVRO:CUSTOMER-ID[1]) = 0) then ('_789odtMissing') else (
if ($out_AVRO_root/AVRO:CUSTOMER-ID[1]/@xsi:nil['true']) then ('_osdt987Null') else ($out_AVRO_root/AVRO:CUSTOMER-ID[1]/text())
we can see that on all branches the variable is bound either to a string or to a text node. Because it can be either, Saxon isn't going to be able to infer a very useful type for the variable, and therefore isn't going to be able to infer that a cast is unnecessary. But the effect of comparing $unconvertedValue to a string literal is exactly the same without the cast. I suspect that if these variables were declared (using "as xs:string") then most of the cast operations might be optimized away; alternatively, the cast operations could be explicitly removed. I haven't checked the whole query, and I don't fully understand the logic, but that's the way it appears at first glance.
Even if the cast expressions aren't optimized away, if Saxon can infer an explicit type for the operand of the cast, it will allocate the converter statically, avoiding run-time access to the cache that is acting as the contention spot.
Simplifying some of the expressions in the query would make it easier for Saxon to do this kind of type inferencing. For example
if (if (fn:count($unconvertedValue) = 1) then $unconvertedValue else $unconvertedValue[1]) cast as xs:string? = '_osdt987Null') then...
could be rewritten as
if ($unconvertedValue[1] cast as xs:string? = '_osdt987Null') then ...
Updated by Michael Kay over 7 years ago
Now looking at what we could do about it in Saxon. Here are some ideas, not necessarily mutually exclusive:
(A) replace the shared cache with a ThreadLocal cache. There's a lot of advice against using ThreadLocal, but we use it successfully for the cache of validated URIs.
(B) we currently special-case conversion FROM string to other types, bypassing the cache to allocate a StringConverter. We could do the same for conversion TO string, which is also a very common case.
(C) hold the cache in a ConcurrentHashMap, which is what we do for the TypeHierarchy cache. This wouldn't be an LRU cache; but if we can get away with an ever-growing cache for the TypeHierarchy, then we can probably get away with it here too.
(D) cache the converter from X to Y within the type object for Y, rather than globally.
(E) use statically-allocated converters for conversions between primitive types, and dynamically-allocated converters for non-primitive types, with no caching.
One complication is that the converter being allocated depends not only on the source and target type, but also on the ConversionRules in force. Handling of year zero and "+INF" depends on whether XSD 1.0 or 1.1 is in use, and handling of xs:NOTATION depends on the specific schema in use. This is why the converters are currently cached as part of the ConversionRules object rather than being held as a property of the source or target types.
Updated by Michael Kay over 7 years ago
- Category set to Performance
- Assignee set to Michael Kay
I have implemented what is essentially (B): the method ConversionRules.getConverter(), before it looks in the cache, handles some common cases directly:
if (source == target) {
return Converter.IDENTITY_CONVERTER;
} else if (source == BuiltInAtomicType.STRING || source == BuiltInAtomicType.UNTYPED_ATOMIC) {
return target.getStringConverter(this);
} else if (target == BuiltInAtomicType.STRING) {
return Converter.TO_STRING;
} else if (target == BuiltInAtomicType.UNTYPED_ATOMIC) {
return Converter.TO_UNTYPED_ATOMIC;
}
Hopefully this will reduce the pressure on the cache; it certainly should do so for this particular query.
Updated by Michael Kay over 7 years ago
- Status changed from New to Resolved
- Priority changed from Low to Normal
- Applies to branch 9.8 added
- Fix Committed on Branch 9.7, 9.8 added
Patch committed on the 9.7 and 9.8 branches
Updated by O'Neil Delpratt over 7 years ago
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.8.0.2 added
Patch applied in the Saxon 9.8.0.2 maintenance release. Leaving bug issue marked as resolved until applied in the next Saxon 9.7 maintenance release.
Updated by O'Neil Delpratt over 7 years ago
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.7.0.19 added
Bug fix applied in the 9.7.0.19 maintenance release.
Please register to edit this issue