Bug #3401
closedFile Handle Leaks when a stylesheet module is included/imported more than once
Added by Harry Davidson over 7 years ago. Updated almost 7 years ago.
100%
Description
In the attached example, when opening a.xsl, two instances of common.xsl are opened but only one instance is closed causing a file handle leak.
Files
MultipleInclude.zip (1.53 KB) MultipleInclude.zip | Harry Davidson, 2017-08-18 10:02 | ||
UnclosedFilesExample.txt (10 KB) UnclosedFilesExample.txt | Harry Davidson, 2017-08-20 11:18 | ||
StackTraces.txt (12.4 KB) StackTraces.txt | Harry Davidson, 2017-08-21 10:11 | ||
StackTraces2.txt (12.4 KB) StackTraces2.txt | Harry Davidson, 2017-08-23 17:30 |
Updated by Michael Kay over 7 years ago
- Subject changed from File Handle Leaks to File Handle Leaks when a stylesheet module is included/imported more than once
- Status changed from New to In Progress
I can't reproduce the problem.
-
Are you on Java or .NET?
-
How were you running it - command line or via API?
-
Any custom URIResolver or EntityResolver involved?
-
Which maintenance release of Saxon?
-
How did you detect the problem?
-
Do you see the warning message: Warning at xsl:transform on line 4 column 66 of common.xsl:
Stylesheet module file:/Users/mike/bugs/2017/davidson/Bug3401/common.xsl is included or
imported more than once. This is permitted, but may lead to errors or unexpected behavior
Running with the tool at http://file-leak-detector.kohsuke.org, I see common.xsl being opened once and closed once, which is what I would expect from looking at the source code and running under a debugger.
Updated by Harry Davidson over 7 years ago
- File UnclosedFilesExample.txt UnclosedFilesExample.txt added
Sorry, please see attached Saxon .Net example code. Using this produces the trace shown below...
_Note that the second common.xsl is never closed or disposed of.
It would get cleaned up in the stream finalize eventually, but you can't count on that happening in a timely fasion, and with the addition of the static in this example, it is more clearly left unclosed.
_
===== XsltSimpleMultipleInclude =======
Tracking Stream Created 31201899 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/data/books.xml
Tracking Stream Created 28316044 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/data/books.dtd
Tracking Stream Closed 28316044 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/data/books.dtd
Tracking Stream Closed 31201899 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/data/books.xml
Tracking Stream Closed 31201899 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/data/books.xml
Tracking Stream Created 16745860 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/a.xsl
Tracking Stream Created 43749873 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/b.xsl
Tracking Stream Created 58204539 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/common.xsl
Tracking Stream Closed 58204539 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/common.xsl
Tracking Stream Closed 43749873 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/b.xsl
Tracking Stream Created 41816592 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/common.xsl
Tracking Stream Closed 16745860 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/a.xsl
Tracking Stream Closed 16745860 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/a.xsl
<BOOKLIST xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-- SNIP --
The stream should have been closed 41816592 = file:///C:/Temp/SaxonIssue/SaxonBug/SaxonSamples/multipleInclude/common.xsl
Updated by Michael Kay over 7 years ago
Thanks for the further details. I'll investigate when I get access to a Windows machine. The curious thing is not that the common.xsl stream isn't closed the second time, but that it is opened at all: once a stylesheet module has been read, its absolute URI should go in a cache and subsequent includes/imports for the same URI shouldn't get as far as the URIResolver (which is where GetEntity() is called from).
It's also slightly curious that a.xsl is closed twice, though I'm not greatly concerned about that.
Updated by Michael Kay over 7 years ago
Although I don't have a Windows machine with me for hands-on debugging, I've been looking at the source code.
The code in XSLGeneralIncorporate.getIncludedStylesheet() calls DocumentFn.computeDocumentKey() to establish an absolute URI to use for looking up the cache of known stylesheet modules. DocumentFn.computeDocumentKey() has two different paths depending on whether the URIResolver implements the interface RelativeURIResolver. The default URIResolver used on .NET does implement this interface; the one used on Java does not. The key difference is that if the URIResolver is a RelativeURIResolver, then it takes responsibility for resolving a relative URI against an absolute URI, as well as dereferencing the absolute URI to retrieve a resource. This reflects the fact that the Microsoft XmlResolver interface has two methods for these two operations (ResolveURI and GetEntity), while the Java URIResolver interface fails to make this distinction.
The fact that GetEntity is called twice for the common.xsl module suggests that for some reason the document keys computed for the two instances do not compare equal. I can't immediately see why this should be: I'll need to step into it with the Visual Studio debugger (I'm going to have to relearn how to do this - O'Neil is the expert on this and he's on holiday).
The other thing I notice is that the DotNetURIResolver calls the GetEntity method of the underlying XmlResolver on two distinct paths. This reflects the fact that the DotNetURIResolver is used both as a URIResolver (used by the XSLT engine to deference URIs) and as an EntityResolver (used by the XML parser to dereference URIs). The "dereference" method is called when the object is used as a URIResolver, and takes care to call AugmentedSource.setPleaseCloseAfterUse(true). By contrast, the resolveEntity() method, used when the object acts as an EntityResolver, doesn't do anything like this. When Saxon invokes an XML parser, supplying an EntityResolver, it has no control over whether the XML parser closes input files or not.
We attempt to resolve xsl:include references during the "use-when" preprocessing pass over the stylesheet. This code loads a module using StylesheetModule.loadStylesheetModule(). This code calls URIResolver.resolve() and checks whether the returned Source is an AugmentedSource with the property pleaseCloseAfterUse set (which, in the .NET case, it is), and on completion if this property is set, it closes the stream. On this path we pass a Stream to the XML parser (not a URI) so the EntityResolver path shouldn't be invoked. But the fact that a stream is being opened and not subsequently closed does appear to suggest that it was on the EntityResolver path that this happened.
Updated by Michael Kay over 7 years ago
Since you have this running and I don't (yet), it might be revealing to take a stack trace at each call of GetEntity to see where it is called from.
Updated by Harry Davidson over 7 years ago
- File StackTraces.txt StackTraces.txt added
Here is the stack trace (v9.8.0.4).
Updated by Michael Kay over 7 years ago
Many thanks. Any idea why there is no stack trace for the final call where a stream for common.xsl is created? Because that's the most interesting one...
Updated by Harry Davidson over 7 years ago
- File StackTraces2.txt StackTraces2.txt added
Hi, please see attached StackTraces2.txt. I changed the order so it makes more sense.
Updated by Michael Kay over 7 years ago
Thanks. The second trace makes it clear what's going on.
The UseWhenFilter, when it encounters an xsl:include or xsl:import, does the following
-
Compute a document key (essentially an absolute URI, with some normalization) from the relative URI and base URI.
-
Call the resolve() method of the URIResolver to get a Source
-
if the document key computed at (1) is not in the known set of stylesheet modules, load the stylesheet module from the source computed in (2)
On the java side, by default the URIResolver.resolve() method merely wraps the absolute URI in a SAXSource object and returns. But on .NET, the default URIResolver actually deferences the URI, calling the underlying GetEntity method of the XmlResolver. So we are opening a stream to this resource in step (2); and step (3) then causes this stream to be ignored (not read, and not closed) if the module is already known.
The obvious fix is to skip step 2 if the document key is known, since the Source object obtained is never used.
Updated by Michael Kay over 7 years ago
- Category set to Internals
- Status changed from In Progress to Resolved
- Assignee set to Michael Kay
- Applies to branch 9.7 added
- Fix Committed on Branch 9.7, 9.8 added
Patch applied on both the 9.7 and 9.8 branches.
Updated by O'Neil Delpratt about 7 years ago
- Fixed in Maintenance Release 9.8.0.5 added
Bug fix applied in the Saxon 9.8.0.5 maintenance release. Leave open until the bug fix is applied in 9.7 branch.
Updated by O'Neil Delpratt about 7 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
Bug fix applied in the Saxon 9.8.0.5 maintenance release. Leave open until fix applied in the 9.7 release
Updated by O'Neil Delpratt about 7 years ago
- Status changed from Closed to Resolved
Bug fix applied in the Saxon 9.8.0.5 maintenance release. Leave open until fix applied in the 9.7 release
Updated by O'Neil Delpratt almost 7 years ago
- Status changed from Resolved to Closed
- Fixed in Maintenance Release 9.7.0.21 added
Bug fix applied in the Saxon 9.7.0.21 maintenance release.
Please register to edit this issue