Bug #4405

closed

Internal error: local variable encountered whose binding has been deleted

Added by Toshihiko Makita almost 5 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Internals
Sprint/Milestone: -
Start date: 2019-12-09
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 9.9, trunk
Fix Committed on Branch: 9.9, trunk
Fixed in Maintenance Release:
Platforms:

Description

Hello Team,

Today I encountered the internal error "local variable encountered whose binding has been deleted" with SaxonPE9-9-1-6J.

I attached the log file: log2019-12-09.log

You can find the stylesheet file in the following repository:

https://github.com/AntennaHouse/pdf5-ml/tree/master/com.antennahouse.pdf5.ml
https://github.com/AntennaHouse/pdf5-ml/blob/master/com.antennahouse.pdf5.ml/xsl/dita2fo_thumbindexmap.xsl

This phenomenon does not occur with SaxonPE9-9-1-5J.

If you need the steps to reproduce this error, please let me know.

Regards,


Files

log2019-12-09.log (56.2 KB) Toshihiko Makita, 2019-12-09 03:08
log-2019-12-09.txt (63 KB) Toshihiko Makita, 2019-12-09 05:23
2019-12-11.png (5.42 KB) Toshihiko Makita, 2019-12-11 02:46
2019-12-11-1.png (7.04 KB) Toshihiko Makita, 2019-12-11 02:54
2019-12-11-log.txt (50.7 KB) Toshihiko Makita, 2019-12-11 03:01
test-sample-en-pdf5-ml.bat (203 Bytes) Toshihiko Makita, 2019-12-11 03:01

Actions #1

Updated by Toshihiko Makita almost 5 years ago

Update the log file.

Actions #2

Updated by Toshihiko Makita almost 5 years ago

Today I encountered the internal error "local variable encountered whose binding has been deleted" with SaxonPE9-9-1-6J.

I have noticed that I am using Open JDK because I installed oXygen 21.1 with Open JDK 21.1.

Microsoft Windows [Version 10.0.18362.476]
(c) 2019 Microsoft Corporation. All rights reserved.

D:\DITA-OT\dita-ot-3.4-tm>java -version
openjdk version "12.0.2" 2019-07-16
OpenJDK Runtime Environment (build 12.0.2+10)
OpenJDK 64-Bit Server VM (build 12.0.2+10, mixed mode, sharing)

D:\DITA-OT\dita-ot-3.4-tm>

Does using Open JDK affect this problem?

Regards,

Actions #3

Updated by Michael Kay almost 5 years ago

  • Category set to Internals
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal

I'd be grateful if you could give me simple instructions for reproducing it so I can concentrate my efforts on diagnosing the problem rather than creating the test scenario.

I think it's very unlikely to depend on the Java version, though it's always possible. This error message usually suggests a bug in the Saxon optimizer, since it's essentially reporting an inconsistency in the Saxon expression tree.

You can test this hypothesis by running with optimizations disabled (-opt:0 on the command line).

Actions #4

Updated by Toshihiko Makita almost 5 years ago

It seems that memory capacity affects this problem.

  • Windows 10 + JDK 12.02 + 8GB memory desktop: This error occurred.
  • Windows 10 + JDK 12.02 + 12GB memory hand-held: This error does not occur.
  • Windows 10 + JDK 12.02 + 32GB memory hand-held: This error does not occur.

I will send the test data tomorrow, after confirming once more that the error occurs on my desktop.

Regards,

Actions #5

Updated by Toshihiko Makita almost 5 years ago

Here are the steps to reproduce this problem on Windows 10.

  • Download DITA-OT 3.4 from dita-ot.org: https://www.dita-ot.org/download
  • Unzip the archive to D:\DITA-OT\dita-ot-3.4
  • Download PDF5-ML plugin: https://github.com/AntennaHouse/pdf5-ml
  • Unzip the plugin-archive and copy "com.antennahouse.pdf5.ml" to D:\DITA-OT\dita-ot-3.4\plugins folder.
  • Also copy sample file folder "sample_en" to your temporary folder (D:\My_Documents\Temp\sample_en)
  • Copy SaxonPE9-9-1-6J jar file to D:\DITA-OT\dita-ot-3.4\lib folder.
  • Open command prompt at D:\DITA-OT\dita-ot-3.4 and type "bin\dita --install"

  • Double-click D:\DITA-OT\dita-ot-3.4\startcmd.bat and press Enter.
  • Copy the attached "test-sample-en-pdf5-ml.bat" to the D:\DITA-OT\dita-ot-3.4 folder.
  • Invoke the DITA-OT transformation by entering the following command:
test-sample-en-pdf5-ml.bat > 2019-12-11-log.txt 2>&1

The resulting log file is attached as 2019-12-11-log.txt.

Hope this helps your debugging.

Regards,

Actions #6

Updated by Michael Kay almost 5 years ago

I'm afraid I'm not making much progress on this. I've been trying to do it on a Mac and adapt it so that I can see what's going on in a debugger, and that's not easy when the actual transformation is buried deep within shell scripts and Ant scripts.

Is there no way of isolating the transformation that actually fails and supplying its input files so I can run it directly in the IDE? The log file seems to suggest what's needed:

Processing D:\My_Documents\Temp\sample_en\temp\sample_en_CONVERTED.xml to D:\My_Documents\Temp\sample_en\temp\sample_en_psmi.fo
     [xslt] Loading stylesheet D:\DITA-OT\dita-ot-3.4\plugins\com.antennahouse.pdf5.ml\xsl\dita2fo_shell.xsl

I've got the stylesheet; all I need is the source file sample_en_CONVERTED.xml assuming there aren't any other dependencies that I haven't identified.

Actions #7

Updated by Michael Kay almost 5 years ago

I guess the other way of doing it is to attack it bottom-up from the crash information. The diagnostics point to a problem with the variable $partCount here:

                <xsl:variable name="label" as="xs:string">
                    <xsl:variable name="partCount" as="xs:integer" select="count(preceding-sibling::*[contains(@class,' bookmap/part ')]|.)"/>
                    <xsl:variable name="partCountFormat" as="xs:string" select="ahf:getVarValue('Part_Count_Format')"/>
                    <xsl:number format="{$partCountFormat}" value="$partCount"/>
                </xsl:variable>

and the error means that for some reason the optimizer thought that $partCount was an unused variable. In the dump of the expression tree, numSeqFmt refers to a NumberSequenceFormatter, which is the number-formatting part of xsl:number (as distinct from the node-counting part).

Looking at the code for NumberSequenceFormatter, a common weak point during tree rewrites is the copy() method, which makes a copy of an expression (used, for example, when doing function inlining). One reason this is a common source of problems is that it's not a very frequently executed path; inlining of functions or loop-lifting of expressions that call xsl:number is likely to be a rare event. There does seem to be a problem in the NumberSequenceFormatter.copy() method: it copies its subexpressions without passing on the rebinding map, which is a list of variables that need to be rebound in the new copy. So I can see a potential problem with a fairly straightforward fix; the challenge is testing it, which can really only be done by first reproducing the failure. It does however create the possibility of testing it by doing a product build with the (speculative) patch and testing it outside the IDE.
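
For readers unfamiliar with this kind of failure, here is a minimal Java sketch of how dropping a rebinding map during copy() can leave a variable reference pointing at a declaration that no longer belongs to the copied tree. It is not Saxon's code: the class names (Binding, VarRef, Formatter, Let) and the passMap switch are hypothetical stand-ins invented for illustration, and the real Saxon expression classes and rebinding map are considerably more involved.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Illustrative sketch only; none of these types are Saxon classes.
public class RebindingSketch {

    /** A variable declaration (stand-in for something like $partCount). */
    static final class Binding {
        final String name;
        Binding(String name) { this.name = name; }
    }

    /** Minimal expression node: copy() receives the map of old-to-new bindings. */
    interface Expr {
        Expr copy(Map<Binding, Binding> rebindings);
    }

    /** A reference to a variable. */
    static final class VarRef implements Expr {
        final Binding binding;
        VarRef(Binding binding) { this.binding = binding; }
        public Expr copy(Map<Binding, Binding> rebindings) {
            // Use the copied declaration if the map knows one, else keep the old one.
            return new VarRef(rebindings.getOrDefault(binding, binding));
        }
    }

    /** Hypothetical formatter with one operand; passMap toggles the buggy/fixed copy. */
    static final class Formatter implements Expr {
        final Expr value;
        final boolean passMap;
        Formatter(Expr value, boolean passMap) { this.value = value; this.passMap = passMap; }
        public Expr copy(Map<Binding, Binding> rebindings) {
            // Buggy variant: the operand is copied with an empty map, so it is never rebound.
            Map<Binding, Binding> forOperand =
                    passMap ? rebindings : new IdentityHashMap<Binding, Binding>();
            return new Formatter(value.copy(forOperand), passMap);
        }
    }

    /** A variable declaration with a body that may refer to it. */
    static final class Let implements Expr {
        final Binding binding;
        final Expr body;
        Let(Binding binding, Expr body) { this.binding = binding; this.body = body; }
        public Expr copy(Map<Binding, Binding> rebindings) {
            // Copying a declaration creates a fresh binding and records it in the map.
            Binding fresh = new Binding(binding.name);
            rebindings.put(binding, fresh);
            return new Let(fresh, body.copy(rebindings));
        }
    }

    /** Returns true if the reference inside the copy points at the copy's own declaration. */
    public static boolean copyIsConsistent(boolean passMap) {
        Binding partCount = new Binding("partCount");
        Let original = new Let(partCount, new Formatter(new VarRef(partCount), passMap));
        Let copy = (Let) original.copy(new IdentityHashMap<Binding, Binding>());
        VarRef ref = (VarRef) ((Formatter) copy.body).value;
        return ref.binding == copy.binding;
    }

    public static void main(String[] args) {
        System.out.println("map dropped:   consistent = " + copyIsConsistent(false));
        System.out.println("map passed on: consistent = " + copyIsConsistent(true));
    }
}
```

In this toy model, the copy made with the map dropped still refers to the original declaration, so a later pass that inspects only the copied tree would see a declaration with no references and a reference with no live declaration. That mirrors, in simplified form, the inconsistency described above.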

Actions #8

Updated by Michael Kay almost 5 years ago

I've attempted to mock up a stylesheet containing the failing template rule with mocked-up dependencies. On my first attempt it ran successfully. Then I ran it with -explain and I got a crash with a NullPointerException, which could well be the outcome of an incorrect tree rewrite. The symptoms are not the same, but the underlying cause could be.

Unfortunately it's clear that my mocking of dependencies is too simplistic; a lot of constant expressions are being simplified which would not happen in real life; I'll have to see if I can make it more realistic. But before I do that, I'll explore the NPE to see if it shows up anything interesting. But if I fix that, we'll never really know whether it's the same bug until we can reproduce the original failure.

Actions #9

Updated by Michael Kay almost 5 years ago

I have established that in my mocked example, NumberSequenceFormatter.copy() does indeed get called with a non-empty rebinding map; and fixing the code to pass on the rebinding map to its subexpressions makes the NPE go away. So I'm gaining confidence that this could be the actual cause.

Actions #10

Updated by Michael Kay almost 5 years ago

  • Status changed from New to Resolved
  • Applies to branch 9.9, trunk added
  • Fix Committed on Branch 9.9, trunk added

I think I have sufficient confidence to apply this patch and mark the bug as resolved. If it's possible to produce a better test case to test the patch against, however, then I'd be happy to do so.

Actions #11

Updated by Toshihiko Makita almost 5 years ago

Dr Kay,

Thank you very much.

Actions #12

Updated by O'Neil Delpratt over 4 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.9.1.7 added

Patch applied in the 9.9.1.7 maintenance release.
