Bug #2754

Memory leak of an instance model from a schema containing QName valued elements

Added by Stuart Barker over 8 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Performance
Sprint/Milestone: -
Start date: 2016-05-23
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 9.6, 9.7, trunk
Fix Committed on Branch: 9.6, 9.7, trunk
Fixed in Maintenance Release:
Platforms:

Description

We have been profiling our code after upgrading to Saxon 9.7, and there appears to be a leak of the model of an instance document from the schema grammar where the latter contains elements whose type is derived from xs:QName.

Examining the heap of the attached code sample should demonstrate that the model (in this case a TinyTree) for "instance.xml" is still referenced from the processor's schema model even after processing is finished.

In our case the leak is significant. We are attempting to process XML instances with an on-disk size of up to 400 MB, and we are using an implementation of NodeInfo backed by a DOM-based model, which results in a leak of several GB. In our workflow several schema grammars will typically be retained for a long period and have multiple instance documents processed against them. This means that multiple leaked models will exist in memory at the same time as the model for the latest instance.

The profiler results show the following chain of references: UserAtomicType -> StringConverter -> InscopeNamespaceResolver -> impl of NodeInfo. No such chain of references existed in Saxon 9.3 but the problem does seem to affect 9.6 as well.
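The shape of that reference chain can be illustrated with a minimal sketch. Every class name below is a hypothetical stand-in invented for illustration, not a Saxon class; the sketch only assumes that a long-lived type object shares one mutable converter whose resolver holds a strong reference to the instance document.

```java
// Hypothetical stand-ins for illustration only; none of these are Saxon classes.
final class InstanceDocument {
    final String name;
    InstanceDocument(String name) { this.name = name; }
}

final class DocumentBackedResolver {
    // Strong reference to the whole instance tree.
    final InstanceDocument document;
    DocumentBackedResolver(InstanceDocument document) { this.document = document; }
}

final class MutableConverter {
    // Overwritten on each validation and never cleared afterwards.
    DocumentBackedResolver resolver;
}

final class LongLivedType {
    // One converter shared by every validation against this type.
    final MutableConverter converter = new MutableConverter();

    // Validating a QName needs the instance's in-scope namespaces, so the
    // shared converter is pointed at the current document.
    void validate(InstanceDocument doc) {
        converter.resolver = new DocumentBackedResolver(doc);
    }
}

final class LeakDemo {
    // The caller drops its own reference to the document, yet the document
    // is still reachable through type -> converter -> resolver -> document.
    static InstanceDocument retainedAfterValidation() {
        LongLivedType type = new LongLivedType();
        type.validate(new InstanceDocument("instance.xml"));
        return type.converter.resolver.document;
    }
}
```

As long as the type object (and the schema that owns it) stays alive, the last validated document cannot be garbage collected.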


Files

instanceModelLeak.zip (1.52 KB), Stuart Barker, 2016-05-23 16:59

Actions #1

Updated by Michael Kay over 8 years ago

  • Category set to Performance
  • Assignee set to Michael Kay
  • Priority changed from High to Normal

Thanks for your detective work on this and for providing such a clear repro.

I think there is in fact a functionality problem as well as just a performance problem: the fact that the StringConverter for the type has been updated to point to the instance indicates a multi-threading problem if the same schema is used to validate two instances concurrently.

Fixing it is non-trivial, but I can see an approach that looks viable. The most obvious solution is to use a namespaceContext that copies the namespace bindings from the source document rather than referencing the source document itself. That would be a one-line change, but it would solve only the garbage collection problem, not the multithreading problem.
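A minimal sketch of the "copy the bindings" idea, assuming the in-scope namespaces can be represented as a prefix-to-URI map. `SnapshotResolver` and `uriForPrefix` are hypothetical names, not part of the Saxon API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver that snapshots prefix-to-URI bindings instead of
// keeping a reference to the node (and hence the document) they came from.
final class SnapshotResolver {
    private final Map<String, String> bindings;

    SnapshotResolver(Map<String, String> inScopeBindings) {
        // Defensive copy: nothing from the source document is retained
        // beyond the strings themselves.
        this.bindings = new HashMap<>(inScopeBindings);
    }

    // Returns the namespace URI bound to the prefix, or null if unbound.
    String uriForPrefix(String prefix) {
        return bindings.get(prefix);
    }
}
```

A resolver like this lets the document be collected, but if it is still stored in a converter shared between threads, two concurrent validations can overwrite each other's bindings.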

Actions #2

Updated by Michael Kay over 8 years ago

Prepared a patch whereby Converter.setNamespaceResolver() creates a new Converter rather than modifying the original one. Currently working OK on 9.7 with bytecode=off (passes all QT3 tests); needs further work for bytecode generation and retrofitting to 9.6 and 9.8.
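The general pattern, returning a fresh converter instead of mutating the shared one, might look like the sketch below. This is not the actual patch: `QNameConverter`, `withNamespaceBindings`, and `expand` are hypothetical names, and the prefix-to-URI map is an assumed simplification of a namespace resolver.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable converter: binding a namespace context yields a new
// instance, so the shared instance held by the schema type is never modified.
final class QNameConverter {
    private final Map<String, String> bindings;

    QNameConverter() {
        this(new HashMap<>());
    }

    private QNameConverter(Map<String, String> bindings) {
        this.bindings = new HashMap<>(bindings); // private copy, never mutated
    }

    // Returns a new converter; 'this' is left untouched.
    QNameConverter withNamespaceBindings(Map<String, String> inScope) {
        return new QNameConverter(inScope);
    }

    // Expands a lexical QName such as "p:local" to "Q{uri}local".
    String expand(String lexical) {
        int colon = lexical.indexOf(':');
        String prefix = colon < 0 ? "" : lexical.substring(0, colon);
        String local = lexical.substring(colon + 1);
        String uri = bindings.get(prefix);
        if (uri == null) {
            if (!prefix.isEmpty()) {
                throw new IllegalArgumentException("Unbound prefix: " + prefix);
            }
            uri = ""; // no default namespace in scope
        }
        return "Q{" + uri + "}" + local;
    }
}
```

Because each validation works with its own converter, two threads validating different instances against the same schema no longer share mutable state, and nothing long-lived points back at an instance document.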

Actions #3

Updated by Michael Kay over 8 years ago

In running QT3 tests with bytecode enabled I hit a bug which turned out to have nothing to do directly with this change: probably a side-effect of the fix for bug #2707. Specifically, bytecode classes generated for mapping functions used by function sequence coercion can in some circumstances have names which are not unique. Fixed this (without generating a separate bug entry) and the tests now run. Reran the supplied repro to check that the UserAtomicType's StringConverter no longer contains a reference to the instance document.

Still need to port the patch to the 9.6 and 9.8 branches.

Actions #4

Updated by Michael Kay over 8 years ago

  • Status changed from New to Resolved
  • Applies to branch 9.8 added
  • Fix Committed on Branch 9.6, 9.7, 9.8 added

Patch now committed on the 9.6, 9.7, and 9.8 branches.

Actions #5

Updated by O'Neil Delpratt over 8 years ago

  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.6.0.9 added

Bug fix applied in the Saxon 9.6.0.9 maintenance release. Leave open until the fix is applied in the 9.7 maintenance release.

Actions #6

Updated by O'Neil Delpratt over 8 years ago

  • Status changed from Resolved to Closed
  • Fixed in Maintenance Release 9.7.0.6 added

Bug fix applied in the Saxon 9.7.0.6 maintenance release.

Actions #7

Updated by O'Neil Delpratt over 7 years ago

  • Applies to branch trunk added
  • Applies to branch deleted (9.8)

Actions #8

Updated by O'Neil Delpratt over 7 years ago

  • Fix Committed on Branch trunk added
  • Fix Committed on Branch deleted (9.8)
