Bug #6612
open
Memory leak under sustained load (12.5)
Description
We're using Saxonche in a Python application (running in Docker on AWS ECS) to process many XML messages under sustained load. Currently there are about 3-6 events per second, and this might increase to 100-200 events per second.
We're observing that the Saxonche PySaxonProcessor allocates 0.5 - 4.0 MB per run and does not release the memory as long as the load is sustained. If we pause the event processing, we see the memory being released after a few minutes. Unfortunately this is not workable for us, because the container runs out of memory.
Our code looks like this:
@profile
def transform_with_xslt(
    self,
    input: bytes,
    xslt_path: str | Path,
    xpath_to_root: str | None = None,
    mapping_input_parameter_name: str | None = None,
    namespace_of_inputmessageschema: str | None = None,
) -> str:
    with PySaxonProcessor(license=False) as processor:
        xml_doc = processor.parse_xml(xml_text=input.decode())
        xslt_processor = processor.new_xslt30_processor()

        # Compile the enhanced template and cache it
        xslt = self._compiled_xslt(xslt_processor, xslt_path)

        # Use the configured XPath expression
        xpath_proc = processor.new_xpath_processor()

        if namespace_of_inputmessageschema:
            # If the namespaces is provided, set it as the default namespace. This allows us to define
            # XPath expressions without specifying the namespace. The system will default to the namespace
            # provided here. Also see:
            # https://www.saxonica.com/saxon-c/doc12/html/saxonc.html#PyXPathProcessor-declare_namespace
            xpath_proc.declare_namespace("", namespace_of_inputmessageschema)

        xpath_proc.set_context(xdm_item=xml_doc)
        root_node = xpath_proc.evaluate_single(f"/{xpath_to_root}")
        xslt.set_parameter(mapping_input_parameter_name, root_node)

        result = xslt.apply_templates_returning_string(xdm_node=xml_doc)

        if result is None:
            raise MappingError("XSLT transformation produced no output")

        return result
If we send a single event, the memory profiler (memory_profiler) shows us the following output:
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    37    147.3 MiB    147.3 MiB           1   @profile
    38                                         def transform_with_xslt(
    39                                             self,
    40                                             input: bytes,
    41                                             xslt_path: str | Path,
    42                                             xpath_to_root: str | None = None,
    43                                             mapping_input_parameter_name: str | None = None,
    44                                             namespace_of_inputmessageschema: str | None = None,
    45                                         ) -> str:
    46    149.2 MiB      0.2 MiB           2       with PySaxonProcessor(license=False) as processor:
    47    148.0 MiB      0.5 MiB           1           xml_doc = processor.parse_xml(xml_text=input.decode())
    48    148.0 MiB      0.0 MiB           1           xslt_processor = processor.new_xslt30_processor()
    49
    50                                                 # Compile the enhanced template and cache it
    51    148.2 MiB      0.2 MiB           1           xslt = self._compiled_xslt(xslt_processor, xslt_path)
    52
    53                                                 # Use the configured XPath expression
    54    148.2 MiB      0.0 MiB           1           xpath_proc = processor.new_xpath_processor()
    55
    56    148.2 MiB      0.0 MiB           1           if namespace_of_inputmessageschema:
    57                                                     # If the namespaces is provided, set it as the default namespace. This allows us to define
    58                                                     # XPath expressions without specifying the namespace. The system will default to the namespace
    59                                                     # provided here. Also see:
    60                                                     # https://www.saxonica.com/saxon-c/doc12/html/saxonc.html#PyXPathProcessor-declare_namespace
    61                                                     xpath_proc.declare_namespace("", namespace_of_inputmessageschema)
    62
    63    148.2 MiB      0.0 MiB           1           xpath_proc.set_context(xdm_item=xml_doc)
    64    148.3 MiB      0.0 MiB           1           root_node = xpath_proc.evaluate_single(f"/{xpath_to_root}")
    65    148.3 MiB      0.0 MiB           1           xslt.set_parameter(mapping_input_parameter_name, root_node)
    66
    67    149.2 MiB      0.9 MiB           1           result = xslt.apply_templates_returning_string(xdm_node=xml_doc)
    68
    69    149.2 MiB      0.0 MiB           1           if result is None:
    70                                                     raise MappingError("XSLT transformation produced no output")
    71
    72    149.2 MiB      0.0 MiB           1           return result
If we run it again, this is the output (please note the 2.5MiB increase in memory usage):
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    37    149.8 MiB    149.8 MiB           1   @profile
    38                                         def transform_with_xslt(
    39                                             self,
    40                                             input: bytes,
    41                                             xslt_path: str | Path,
    42                                             xpath_to_root: str | None = None,
    43                                             mapping_input_parameter_name: str | None = None,
    44                                             namespace_of_inputmessageschema: str | None = None,
    45                                         ) -> str:
    46    151.8 MiB      0.2 MiB           2       with PySaxonProcessor(license=False) as processor:
    47    150.6 MiB      0.6 MiB           1           xml_doc = processor.parse_xml(xml_text=input.decode())
    48    150.6 MiB      0.0 MiB           1           xslt_processor = processor.new_xslt30_processor()
    49
    50                                                 # Compile the enhanced template and cache it
    51    150.8 MiB      0.2 MiB           1           xslt = self._compiled_xslt(xslt_processor, xslt_path)
    52
    53                                                 # Use the configured XPath expression
    54    150.8 MiB      0.0 MiB           1           xpath_proc = processor.new_xpath_processor()
    55
    56    150.8 MiB      0.0 MiB           1           if namespace_of_inputmessageschema:
    57                                                     # If the namespaces is provided, set it as the default namespace. This allows us to define
    58                                                     # XPath expressions without specifying the namespace. The system will default to the namespace
    59                                                     # provided here. Also see:
    60                                                     # https://www.saxonica.com/saxon-c/doc12/html/saxonc.html#PyXPathProcessor-declare_namespace
    61                                                     xpath_proc.declare_namespace("", namespace_of_inputmessageschema)
    62
    63    150.8 MiB      0.0 MiB           1           xpath_proc.set_context(xdm_item=xml_doc)
    64    150.8 MiB      0.0 MiB           1           root_node = xpath_proc.evaluate_single(f"/{xpath_to_root}")
    65    150.8 MiB      0.0 MiB           1           xslt.set_parameter(mapping_input_parameter_name, root_node)
    66
    67    151.8 MiB      0.9 MiB           1           result = xslt.apply_templates_returning_string(xdm_node=xml_doc)
    68
    69    151.8 MiB      0.0 MiB           1           if result is None:
    70                                                     raise MappingError("XSLT transformation produced no output")
    71
    72    151.8 MiB      0.0 MiB           1           return result
When we run a thousand invocations, the number just keeps increasing:
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    37    420.6 MiB    420.6 MiB           1   @profile
    38                                         def transform_with_xslt(
    39                                             self,
    40                                             input: bytes,
    41                                             xslt_path: str | Path,
    42                                             xpath_to_root: str | None = None,
    43                                             mapping_input_parameter_name: str | None = None,
    44                                             namespace_of_inputmessageschema: str | None = None,
    45                                         ) -> str:
    46    421.5 MiB      0.0 MiB           2       with PySaxonProcessor(license=False) as processor:
    47    421.5 MiB      0.9 MiB           1           xml_doc = processor.parse_xml(xml_text=input.decode())
    48    421.5 MiB      0.0 MiB           1           xslt_processor = processor.new_xslt30_processor()
    49
    50                                                 # Compile the enhanced template and cache it
    51    421.5 MiB      0.0 MiB           1           xslt = self._compiled_xslt(xslt_processor, xslt_path)
    52
    53                                                 # Use the configured XPath expression
    54    421.5 MiB      0.0 MiB           1           xpath_proc = processor.new_xpath_processor()
    55
    56    421.5 MiB      0.0 MiB           1           if namespace_of_inputmessageschema:
    57                                                     # If the namespaces is provided, set it as the default namespace. This allows us to define
    58                                                     # XPath expressions without specifying the namespace. The system will default to the namespace
    59                                                     # provided here. Also see:
    60                                                     # https://www.saxonica.com/saxon-c/doc12/html/saxonc.html#PyXPathProcessor-declare_namespace
    61                                                     xpath_proc.declare_namespace("", namespace_of_inputmessageschema)
    62
    63    421.5 MiB      0.0 MiB           1           xpath_proc.set_context(xdm_item=xml_doc)
    64    421.5 MiB      0.0 MiB           1           root_node = xpath_proc.evaluate_single(f"/{xpath_to_root}")
    65    421.5 MiB      0.0 MiB           1           xslt.set_parameter(mapping_input_parameter_name, root_node)
    66
    67    421.5 MiB      0.0 MiB           1           result = xslt.apply_templates_returning_string(xdm_node=xml_doc)
    68
    69    421.5 MiB      0.0 MiB           1           if result is None:
    70                                                     raise MappingError("XSLT transformation produced no output")
    71
    72    421.5 MiB      0.0 MiB           1           return result
Considering the sustained load, this is a major problem for us. It seems that garbage collection takes place when the event stream pauses, but this is outside of our control. We would like the memory to be released as soon as the system is done processing the event. Can you help us resolve this issue?
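For anyone who wants to reproduce the growth outside the application, here is a rough sketch of a measurement loop. It is not the harness used for the numbers above; `peak_rss_mib` and `sample_peak_rss` are hypothetical helper names, and `run_once` stands in for whatever callable performs one transformation. It reads the process's peak resident set size from the standard-library `resource` module (Unix only), which should include native allocations as well as Python objects.

```python
import resource
from typing import Callable, List


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB.

    Assumption: ru_maxrss is reported in KiB, as on Linux.
    (On macOS it is reported in bytes, so the divisor would differ.)
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def sample_peak_rss(
    run_once: Callable[[], object],
    invocations: int,
    every: int = 100,
) -> List[float]:
    """Call run_once repeatedly and record peak RSS every `every` calls.

    The first sample is taken before any call, so the list has
    1 + invocations // every entries.
    """
    samples = [peak_rss_mib()]
    for i in range(1, invocations + 1):
        run_once()
        if i % every == 0:
            samples.append(peak_rss_mib())
    return samples
```

Passing something like `lambda: handler.transform_with_xslt(payload, xslt_path)` as `run_once` (with `handler`, `payload` and `xslt_path` supplied by the caller) gives a list whose last entry minus its first is the growth in peak memory over the run.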
Updated by Matt Patterson 2 days ago
Okay, with the caveat that without knowing much more about your setup and the way you're processing events I might be completely off target, there seems to be one major bottleneck in your code.
First, assumptions:
- The code you provided is indicative of how your processing code is structured in the real application.
- You're processing events by invoking this code once for every event, probably from an ASGI- or WSGI-style handler, so the handler class itself is instantiated only once.
The PySaxonProcessor object is designed to be a long-lived object, and it has a lot of data attached to it that will help if you're reusing it, but just be deadweight if you're not.
All the calls to the _compiled_xslt method are using a new PySaxonProcessor and a new Xslt30Processor, so the compiled stylesheet objects that they create will all maintain references to the PySaxonProcessor that created them. Without seeing your cache code, at best it's resulting in lots of extra work because of the fresh PySaxonProcessor instances, and at worst it's the major cause of your GC bottleneck.
If you can move the creation of your PySaxonProcessor out of your event handler so that it happens once at initialisation time you ought to see a benefit.
That may be all you need, but without knowing more about the caching mechanism you're using, I can't say how big a benefit it would be.
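A minimal sketch of what "once at initialisation time" could look like, under the assumption that the handler class is instantiated only once. `ProcessorHolder` and `saxon_factory` are hypothetical names, not part of saxonche; only `PySaxonProcessor(license=False)` is taken from the code in the original post. The factory is injected so the holder can be exercised without SaxonC installed.

```python
import threading
from typing import Any, Callable, Optional


class ProcessorHolder:
    """Creates the processor lazily, exactly once, and always returns that instance.

    Hypothetical helper: the name and shape are illustrative only.
    """

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._processor: Optional[Any] = None

    def get(self) -> Any:
        # The lock ensures that concurrent first calls still create one processor.
        with self._lock:
            if self._processor is None:
                self._processor = self._factory()
            return self._processor


def saxon_factory() -> Any:
    # Imported lazily so this module loads even where saxonche is absent.
    from saxonche import PySaxonProcessor

    return PySaxonProcessor(license=False)


# Module-level holder: nothing is created until the first call to get().
holder = ProcessorHolder(saxon_factory)
```

Inside the event handler, `processor = holder.get()` would then replace the `with PySaxonProcessor(license=False) as processor:` block, so every event shares one processor instead of creating and tearing one down per call.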
If you're still experiencing problems after that, please update us and we'll be better able to see if there's something deeper going on.
Thanks,
Matt
Updated by Luc van Donkersgoed 2 days ago
Hey Matt,
Thanks for the quick reply, it's much appreciated. Your assumptions are correct.
Funny thing is we tried to use the PySaxonProcessor as a long-lived object first, but ran into the stack overflow exceptions mentioned elsewhere on this forum. Moving the context into the invoke path solved that problem - and introduced this one.
But your guidance helps. We will investigate how we can successfully convert the PySaxonProcessor to a singleton in our code and report back.
Luc
Updated by Matt Patterson 2 days ago
Do you mean crashes like the one reported in https://saxonica.plan.io/issues/6564?
We're working on fixing that, but it's not fixed in 12.5. If it was not that, please open a new issue if you encounter it again, and feel free to add comments to that issue if you saw something we didn't there.
We haven't had much use of the SaxonC product in threaded network server applications before, so anything you do encounter and can share will help us ensure we're testing these use cases properly.
Matt
Updated by Michael Kay 2 days ago
> We haven't had much use
A caveat on that comment - we only know a tiny fraction of what users are doing with our product!
Updated by Michael Kay 1 day ago
Note also that if you're aiming at 100 transformations per second, then it's pretty much essential that you compile the stylesheets once and then use them repeatedly, because the compile cost will often be much higher than the execution cost. And if you're going to reuse the compiled stylesheets then you also need to reuse the Saxon processor: that's because the compiled stylesheets and source documents that participate in a transformation must use the same NamePool, and the NamePool is owned by the processor.
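As a rough illustration of the point above, the sketch below keeps compiled stylesheets in a dictionary that belongs to a single processor, and parses source documents through that same processor so both sides share its NamePool. `StylesheetCache` is a hypothetical name. `parse_xml(xml_text=...)` and `new_xslt30_processor()` are the calls used in the original post; `compile_stylesheet(stylesheet_file=...)` is an assumption about how the stylesheet gets compiled, since the poster's `_compiled_xslt` method is not shown.

```python
from pathlib import Path
from typing import Any, Dict, Union


class StylesheetCache:
    """Compiled stylesheets tied to one long-lived processor (illustrative only)."""

    def __init__(self, processor: Any) -> None:
        # One processor owns the NamePool shared by stylesheets and documents.
        self.processor = processor
        self._xslt_processor = processor.new_xslt30_processor()
        self._compiled: Dict[str, Any] = {}

    def get(self, xslt_path: Union[str, Path]) -> Any:
        """Return the compiled stylesheet for xslt_path, compiling on first use."""
        key = str(xslt_path)
        if key not in self._compiled:
            # Assumed saxonche call; compilation happens once per path.
            self._compiled[key] = self._xslt_processor.compile_stylesheet(
                stylesheet_file=key
            )
        return self._compiled[key]

    def parse(self, xml_text: str) -> Any:
        """Parse a source document with the same processor that compiled the stylesheets."""
        return self.processor.parse_xml(xml_text=xml_text)
```

With this shape, the per-event work reduces to `cache.parse(...)` plus a transformation on `cache.get(xslt_path)`, and the compile cost is paid only the first time each stylesheet path is seen. The sketch does no locking, so a threaded server would need to add its own.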