Bug #4405
closed
Internal error: local variable encountered whose binding has been deleted
100%
Description
Hello Team,
Today I encountered "local variable encountered whose binding has been deleted" internal error with SaxonPE9-9-1-6J.
I attached the log file: log2019-12-09.log
You can see the stylesheet file from following repository:
https://github.com/AntennaHouse/pdf5-ml/tree/master/com.antennahouse.pdf5.ml
https://github.com/AntennaHouse/pdf5-ml/blob/master/com.antennahouse.pdf5.ml/xsl/dita2fo_thumbindexmap.xsl
This phenomenon does not occur with SaxonPE9-9-1-5J.
If you need the steps to reproduce this error, please let me know.
Regards,
Updated by Toshihiko Makita almost 5 years ago
- File log-2019-12-09.txt log-2019-12-09.txt added
Update the log file.
Updated by Toshihiko Makita almost 5 years ago
Today I encountered "local variable encountered whose binding has been deleted" internal error with SaxonPE9-9-1-6J.
I have noticed that I am using Open JDK because I installed oXygen 21.1 with Open JDK 21.1.
Microsoft Windows [Version 10.0.18362.476]
(c) 2019 Microsoft Corporation. All rights reserved.
D:\DITA-OT\dita-ot-3.4-tm>java -version
openjdk version "12.0.2" 2019-07-16
OpenJDK Runtime Environment (build 12.0.2+10)
OpenJDK 64-Bit Server VM (build 12.0.2+10, mixed mode, sharing)
D:\DITA-OT\dita-ot-3.4-tm>
Does using Open JDK affect this problem?
Regards,
Updated by Michael Kay almost 5 years ago
- Category set to Internals
- Assignee set to Michael Kay
- Priority changed from Low to Normal
I'd be grateful if you could give me simple instructions for reproducing it so I can concentrate my efforts on diagnosing the problem rather than creating the test scenario.
I think it's very unlikely to depend on the Java version, though it's always possible. This error message usually suggests a bug in the Saxon optimizer, since it's essentially reporting an inconsistency in the Saxon expression tree.
You can test this hypothesis by running with optimizations disabled (-opt:0 on the command line).
Updated by Toshihiko Makita almost 5 years ago
It seems that memory capacity affects this problem.
- Windows 10 + JDK 12.0.2 + 8GB memory desktop: This error occurs.
- Windows 10 + JDK 12.0.2 + 12GB memory hand-held: This error does not occur.
- Windows 10 + JDK 12.0.2 + 32GB memory hand-held: This error does not occur.
I will send the test data tomorrow, after confirming once more that the error occurs on my desktop.
Regards,
Updated by Toshihiko Makita almost 5 years ago
- File 2019-12-11.png 2019-12-11.png added
- File 2019-12-11-1.png 2019-12-11-1.png added
- File 2019-12-11-log.txt 2019-12-11-log.txt added
- File test-sample-en-pdf5-ml.bat test-sample-en-pdf5-ml.bat added
Here are the steps to reproduce this problem on Windows 10.
- Download DITA-OT 3.4 from dita-ot.org: https://www.dita-ot.org/download
- Unzip the archive to D:\DITA-OT\dita-ot-3.4
- Download PDF5-ML plugin: https://github.com/AntennaHouse/pdf5-ml
- Unzip the plugin-archive and copy "com.antennahouse.pdf5.ml" to D:\DITA-OT\dita-ot-3.4\plugins folder.
- Also copy sample file folder "sample_en" to your temporary folder (D:\My_Documents\Temp\sample_en)
- Copy SaxonPE9-9-1-6J jar file to D:\DITA-OT\dita-ot-3.4\lib folder.
- Open command prompt at D:\DITA-OT\dita-ot-3.4 and type "bin\dita --install"
- Double click D:\DITA-OT\dita-ot-3.4\startcmd.bat and enter return.
- Copy attached "test-sample-en-pdf5-ml.bat" to D:\DITA-OT\dita-ot-3.4 folder.
- Invoke the DITA-OT transformation by entering the following command.
test-sample-en-pdf5-ml.bat > 2019-12-11-log.txt 2>&1
The log file is generated as attached 2019-12-11-log.txt.
Hope this helps your debugging.
Regards,
Updated by Michael Kay almost 5 years ago
I'm afraid I'm not making much progress on this. I've been trying to do it on a Mac and adapt it so that I can see what's going on in a debugger, and that's not easy when the actual transformation is buried deep within shell scripts and Ant scripts.
Is there no way of isolating the transformation that actually fails and supplying its input files so I can run it directly in the IDE? The log file seems to suggest what's needed:
Processing D:\My_Documents\Temp\sample_en\temp\sample_en_CONVERTED.xml to D:\My_Documents\Temp\sample_en\temp\sample_en_psmi.fo
524
[xslt] Loading stylesheet D:\DITA-OT\dita-ot-3.4\plugins\com.antennahouse.pdf5.ml\xsl\dita2fo_shell.xsl
I've got the stylesheet; all I need is the source file sample_en_CONVERTED.xml, assuming there aren't any other dependencies that I haven't identified.
Updated by Michael Kay almost 5 years ago
I guess the other way of doing it is to attack it bottom-up from the crash information. The diagnostics point to a problem with the variable $partCount here:
<xsl:variable name="label" as="xs:string">
    <xsl:variable name="partCount" as="xs:integer" select="count(preceding-sibling::*[contains(@class,' bookmap/part ')]|.)"/>
    <xsl:variable name="partCountFormat" as="xs:string" select="ahf:getVarValue('Part_Count_Format')"/>
    <xsl:number format="{$partCountFormat}" value="$partCount"/>
</xsl:variable>
and the error means that for some reason the optimizer thought that $partCount was an unused variable. In the dump of the expression tree, numSeqFmt refers to a NumberSequenceFormatter, which is the number-formatting part of xsl:number (as distinct from the node-counting part).
Looking at the code for NumberSequenceFormatter, a common weak point during tree rewrites is the copy() method, which makes a copy of an expression (used, for example, when doing function inlining). One reason this is a common source of problems is that it's not a very frequently executed path; inlining of functions or loop-lifting of expressions that call xsl:number is likely to be a rare event. There does seem to be a problem in the NumberSequenceFormatter.copy() method: it copies its subexpressions without passing on the rebinding map, which is a list of variables that need to be rebound in the new copy. So I can see a potential problem with a fairly straightforward fix; the challenge is testing it, which can really only be done by first reproducing the failure. It does however create the possibility of testing it by doing a product build with the (speculative) patch and testing it outside the IDE.
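To illustrate the kind of failure described above, here is a minimal toy model in Java. It is only a sketch: the classes Binding, RebindingMap, VarRef and Formatter are hypothetical stand-ins invented for this example and are not Saxon's actual classes or API. It assumes that copying an expression tree is meant to redirect each variable reference to a new binding via a map, and it shows what happens when a parent expression copies its child without passing that map on.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Toy model only: every class here is hypothetical, not Saxon's real API.
public class RebindingSketch {

    /** A variable declaration, e.g. the xsl:variable named partCount. */
    static final class Binding {
        final String name;
        Binding(String name) { this.name = name; }
    }

    /** Maps each old binding to its replacement in the copied tree. */
    static final class RebindingMap {
        private final Map<Binding, Binding> map = new IdentityHashMap<>();
        void put(Binding oldBinding, Binding newBinding) { map.put(oldBinding, newBinding); }
        Binding resolve(Binding b) { return map.getOrDefault(b, b); }
    }

    interface Expr {
        Expr copy(RebindingMap rebindings);
    }

    /** A variable reference such as $partCount. */
    static final class VarRef implements Expr {
        final Binding binding;
        VarRef(Binding binding) { this.binding = binding; }
        public Expr copy(RebindingMap rebindings) {
            // Redirect the reference if its binding has been replaced.
            return new VarRef(rebindings.resolve(binding));
        }
    }

    /** Stand-in for a formatter with one subexpression (the value to format). */
    static final class Formatter implements Expr {
        final Expr value;
        final boolean passMapToChildren; // false models the suspected bug
        Formatter(Expr value, boolean passMapToChildren) {
            this.value = value;
            this.passMapToChildren = passMapToChildren;
        }
        public Expr copy(RebindingMap rebindings) {
            // Buggy variant: the child is copied with an empty map,
            // so its variable reference keeps pointing at the old binding.
            RebindingMap forChild = passMapToChildren ? rebindings : new RebindingMap();
            return new Formatter(value.copy(forChild), passMapToChildren);
        }
    }

    /** Copies a formatter over $partCount and reports whether the copy uses the new binding. */
    static boolean copyRefersToNewBinding(boolean passMapToChildren) {
        Binding oldBinding = new Binding("partCount");
        Binding newBinding = new Binding("partCount");
        RebindingMap rebindings = new RebindingMap();
        rebindings.put(oldBinding, newBinding);

        Formatter original = new Formatter(new VarRef(oldBinding), passMapToChildren);
        Formatter copied = (Formatter) original.copy(rebindings);
        return ((VarRef) copied.value).binding == newBinding;
    }

    public static void main(String[] args) {
        System.out.println("map dropped:   " + copyRefersToNewBinding(false));
        System.out.println("map passed on: " + copyRefersToNewBinding(true));
    }
}
```

In this model, the variant that drops the map leaves the copied reference attached to the old binding. If that old binding is later discarded as unused, the reference is left dangling, which is at least consistent with the "binding has been deleted" message. Passing the map through to the child removes the inconsistency in the toy model; whether this matches the real failure is exactly what the thread goes on to investigate.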
Updated by Michael Kay almost 5 years ago
I've attempted to mock up a stylesheet containing the failing template rule with mocked-up dependencies. On my first attempt it ran successfully. Then I ran it with -explain and I got a crash with a NullPointerException, which could well be the outcome of an incorrect tree rewrite; although the symptoms are not the same, the underlying cause could be.
Unfortunately it's clear that my mocking of dependencies is too simplistic; a lot of constant expressions are being simplified which would not happen in real life; I'll have to see if I can make it more realistic. But before I do that, I'll explore the NPE to see if it shows up anything interesting. But if I fix that, we'll never really know whether it's the same bug until we can reproduce the original failure.
Updated by Michael Kay almost 5 years ago
I have established that in my mocked example, NumberSequenceFormatter.copy() does indeed get called with a non-empty rebinding map; and fixing the code to pass on the rebinding map to its subexpressions makes the NPE go away. So I'm gaining confidence that this could be the actual cause.
Updated by Michael Kay almost 5 years ago
- Status changed from New to Resolved
- Applies to branch 9.9, trunk added
- Fix Committed on Branch 9.9, trunk added
I think I have sufficient confidence to apply this patch and mark the bug as resolved. If it's possible to produce a better test case to test the patch against, however, then I'd be happy to do so.
Updated by O'Neil Delpratt over 4 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.9.1.7 added
Patch applied in the 9.9.1.7 maintenance release.