Bug #3022
closed
Multithreading error: ClassCastException at net.sf.saxon.expr.parser.ExpressionTool.evaluate(ExpressionTool.java:276)
Fix Committed on Branch:
9.7, trunk
Fixed in Maintenance Release:
Description
Hello,
We have a very strange issue in our application, which runs many XSLT transformer instances in separate threads.
When we enabled our application's own multi-threading option (unrelated to the Saxonica multi-threading option), we got the error shown in the stack trace below:
Caused by: java.lang.ClassCastException: net.sf.saxon.expr.LocalVariableReference cannot be cast to net.sf.saxon.expr.Literal
at net.sf.saxon.expr.parser.ExpressionTool.evaluate(ExpressionTool.java:276)
at net.sf.saxon.expr.UserFunctionCall.evaluateArguments(UserFunctionCall.java:624)
at net.sf.saxon.expr.UserFunctionCall.callFunction(UserFunctionCall.java:495)
at net.sf.saxon.expr.UserFunctionCall.evaluateItem(UserFunctionCall.java:456)
at net.sf.saxon.expr.SimpleStepExpression.iterate(SimpleStepExpression.java:110)
at net.sf.saxon.expr.Expression.process(Expression.java:896)
at net.sf.saxon.expr.instruct.Block.processLeavingTail(Block.java:658)
at net.sf.saxon.expr.instruct.Choose.processLeavingTail(Choose.java:838)
at net.sf.saxon.expr.instruct.Block.processLeavingTail(Block.java:656)
at net.sf.saxon.expr.LetExpression.processLeavingTail(LetExpression.java:732)
at net.sf.saxon.expr.instruct.Block.processLeavingTail(Block.java:656)
at net.sf.saxon.expr.instruct.Instruction.process(Instruction.java:149)
at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(ElementCreator.java:366)
at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(ElementCreator.java:313)
at net.sf.saxon.expr.instruct.Block.processLeavingTail(Block.java:656)
at net.sf.saxon.expr.instruct.Instruction.process(Instruction.java:149)
at net.sf.saxon.expr.ItemChecker.process(ItemChecker.java:249)
at net.sf.saxon.expr.instruct.TemplateRule.applyLeavingTail(TemplateRule.java:358)
However, in single-threaded mode there is no error.
Sometimes, when we refactored the code by splitting the main function into several smaller functions, the error disappeared.
Thank you in advance for your reply.
Regards,
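For context, the usage pattern described in this report (one compiled stylesheet shared by many threads, each running its own transformer) typically looks like the sketch below. It uses only the standard JAXP API; the class name, the inline stylesheet and the inputs are hypothetical and are not the reporter's code. Because the compiled stylesheet is shared, any run-time update to its expression tree is visible to every thread.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration: one shared compiled stylesheet, one Transformer per task.
final class SharedTemplatesSketch {
    // Trivial stylesheet (an assumption for the example): copies the text of <doc> to the output.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='/doc'/></xsl:template>"
      + "</xsl:stylesheet>";

    static List<String> runAll(int tasks) throws Exception {
        // Compile once; the Templates object is the part shared between threads.
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSL)));
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int n = i;
                futures.add(pool.submit(() -> {
                    // A Transformer is not thread-safe, so each task creates its own.
                    Transformer transformer = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(
                            new StreamSource(new StringReader("<doc>" + n + "</doc>")),
                            new StreamResult(out));
                    return out.toString();
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```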
Thanks for reporting it. Multi-threading errors (indeed, many errors) are difficult to track down unless we can reproduce them. If you can send the stylesheet that causes this, we'll be happy to take a look at it.
In the absence of that, I think I can make a guess. When calling a user-defined function, there are a number of different strategies (varieties of eager and lazy evaluation) that we use for evaluating the arguments. Normally the strategy for each argument is decided statically (at compile time). However, if we find at run-time that for some reason the strategy hasn't yet been computed, then we compute it on first evaluation. Because this updates the expression tree, it should be synchronized, but it isn't. We can certainly add synchronization to this method, but I would like to reproduce the error first so that I have confidence in the fix; and since we're already on a recovery path here, I would like to see why it is that the work wasn't done at compile time.
- Subject changed from Random Saxonica Error to Multithreading error: ClassCastException at net.sf.saxon.expr.parser.ExpressionTool.evaluate(ExpressionTool.java:276)
- Status changed from New to In Progress
Hi Mr. Michael,
I sent you an email, and I hope its content will help you identify the root cause of the issue.
Regards,
Oussama
- Category set to Multithreading
- Status changed from In Progress to Resolved
- Applies to branch 9.8 added
- Fix Committed on Branch 9.8 added
Because this is a schema-aware transformation and I don't have the schema, and because you only supplied a fragment of the stylesheet code, I'm not going to be able to reproduce the problem and test my solution. I will apply the fix proposed in comment #1 and we will have to hope that I have identified the cause correctly.
Hello Mr. Michael,
Could you please provide us with the Saxonica package containing the fix, so that we can test it in our environment?
The problem keeps recurring in our production environment and we hope to fix it as soon as possible.
Kind regards,
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.7.0.12 added
Bug fix applied in the Saxon 9.7.0.12 maintenance release.
Client reported that the initial patch didn't resolve the issue.
Attached file eej.zip is a custom build 9.7.3022.12 to generate diagnostics.
There are two problems here.
The first is that the argument evaluation modes for a stylesheet function call are not being computed at compile time as they should be. Or rather, they are being computed, but they are then being reset. After a tree rewrite, resetLocalProperties() is called to unset cached expression properties, in case the rewrite has invalidated them. In this example the final phase of optimization is to extract global variables, and this triggers a call on resetLocalProperties(); the evaluation modes are therefore unset when execution commences.
The second problem is that the recovery action, which computes argument evaluation modes at run-time if they are not available in the expression tree, is not thread safe. The code does "if (evaluationModes == null) computeEvaluationModes()", where the computeEvaluationModes() method is synchronised; but if thread A is computing the evaluation modes, thread B will see a non-null value for evaluationModes, and use the data even though it has not been fully computed.
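The hazard described above can be reproduced deterministically in a small sketch. Everything here is hypothetical illustration rather than Saxon's code: the class name, the int-array representation of the modes, the `LAZY` constant and the two latches (which exist only to freeze thread A in the middle of its computation). The point is that the null check happens outside the lock, so a second thread can read the array after the field is assigned but before it is filled, and will see the default value 0.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a function call node on a shared expression tree.
final class UnsafeLazyModes {
    static final int LAZY = 3;                 // assumed non-zero mode code, for illustration

    private int[] evaluationModes;             // null until first computed

    // Test instrumentation only: lets the demo pause thread A mid-computation.
    private final CountDownLatch published = new CountDownLatch(1);
    private final CountDownLatch resume = new CountDownLatch(1);

    int[] getModes() {
        if (evaluationModes == null) {         // unsynchronized check
            computeEvaluationModes();
        }
        return evaluationModes;
    }

    private synchronized void computeEvaluationModes() {
        evaluationModes = new int[2];          // field is non-null from here on, but holds zeros
        published.countDown();
        try {
            resume.await();                    // thread A is "still computing"
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        evaluationModes[0] = LAZY;
        evaluationModes[1] = LAZY;
    }

    /** Returns the mode a second thread observes for argument 0 while the first is mid-computation. */
    static int observeDuringComputation() throws InterruptedException {
        UnsafeLazyModes call = new UnsafeLazyModes();
        Thread a = new Thread(() -> call.getModes());
        a.start();
        call.published.await();                // wait until A has assigned the field
        int seen = call.getModes()[0];         // null check passes; the lock is never taken
        call.resume.countDown();               // let A finish
        a.join();
        return seen;
    }
}
```

In this sketch the second thread reads mode 0 for an argument whose real mode is different, which is consistent with the cast to Literal failing for a non-literal argument.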
Let's address the compile-time story first.
The argument evaluation modes are needed only at run-time, unlike most of the properties we store on the expression tree which are used primarily at compile time. So it makes sense to compute them only at the end of optimization. Fortunately we have a mechanism for doing this in ExpressionTool.optimizeComponentBody().
I have implemented this change and it appears to be OK (I've got a number of test failures that need to be investigated but they seem unrelated.)
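The ordering problem and the compile-time change can be illustrated with a toy model. This is only a sketch under assumptions: the `FunctionCall` class, the int-array representation and the `compile` method are hypothetical, and only the method names resetLocalProperties() and computeEvaluationModes() come from the discussion above. It shows why a value computed before the last rewrite is lost, while one computed in a final pass survives.

```java
import java.util.Arrays;

final class FinalPassSketch {
    // Hypothetical expression node that caches per-argument evaluation modes.
    static final class FunctionCall {
        final int argCount;
        int[] evaluationModes;                  // cached property; null means "not computed"

        FunctionCall(int argCount) {
            this.argCount = argCount;
        }

        void computeEvaluationModes() {
            int[] modes = new int[argCount];
            Arrays.fill(modes, 1);              // placeholder mode code, an assumption
            evaluationModes = modes;
        }

        // Called after a tree rewrite, in case the rewrite invalidated cached properties.
        void resetLocalProperties() {
            evaluationModes = null;
        }
    }

    /** Simulates compilation; the last rewrite stands in for global variable extraction. */
    static FunctionCall compile(boolean computeInFinalPass) {
        FunctionCall call = new FunctionCall(2);
        if (!computeInFinalPass) {
            call.computeEvaluationModes();      // computed early...
        }
        call.resetLocalProperties();            // ...and wiped by the final rewrite
        if (computeInFinalPass) {
            call.computeEvaluationModes();      // computed after all rewrites are done
        }
        return call;
    }
}
```

With the early computation the modes are null when execution starts, so the run-time fallback is needed; with the final pass they are already present.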
- Status changed from In Progress to Resolved
On the run-time issue, we still have the fallback path which computes evaluation modes if this hasn't been done statically -- though hopefully this should no longer happen. The code is now properly synchronized, so the multithreading errors should no longer occur. I have also retained the code that recovers from the ClassCastException (with a warning) if the evaluation mode 0 is used with an argument that is not a literal.
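One common way to make such a fallback thread-safe is sketched below; it is not necessarily how the Saxon code was changed. The class name, the int-array representation and the `LAZY` constant are hypothetical. The array is built in a local variable and assigned to a volatile field only once it is complete, so no thread can observe a half-filled value, and the second check inside the lock ensures the computation runs only once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a function call node on a shared expression tree.
final class SafeLazyModes {
    static final int LAZY = 3;                  // assumed non-zero mode code, for illustration

    private final int argCount;
    private volatile int[] evaluationModes;     // published only when fully populated

    SafeLazyModes(int argCount) {
        this.argCount = argCount;
    }

    int[] getModes() {
        int[] modes = evaluationModes;
        if (modes == null) {
            synchronized (this) {
                modes = evaluationModes;        // re-check under the lock
                if (modes == null) {
                    modes = new int[argCount];
                    Arrays.fill(modes, LAZY);   // fill the local array first
                    evaluationModes = modes;    // publish last
                }
            }
        }
        return modes;
    }

    /** Runs several concurrent readers and reports whether each saw a complete array. */
    static boolean allThreadsSeeCompleteModes(int threads) throws Exception {
        SafeLazyModes call = new SafeLazyModes(4);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(() -> {
                    for (int m : call.getModes()) {
                        if (m != LAZY) {
                            return false;
                        }
                    }
                    return true;
                }));
            }
            for (Future<Boolean> f : results) {
                if (!f.get()) {
                    return false;
                }
            }
            return true;
        } finally {
            pool.shutdown();
        }
    }
}
```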
The bug is therefore resolved.
Further testing revealed that the fix is incorrect for XQuery - committing further changes for that case.
- Fixed in Maintenance Release deleted (9.7.0.12)
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.7.0.13 added
Bug fix applied in the Saxon 9.7.0.13 maintenance release.
- Applies to branch deleted (9.8)
- Fix Committed on Branch trunk added
- Fix Committed on Branch deleted (9.8)