Bug #6025

Python Saxon not releasing memory

Added by Mark Pierce over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Category: Python
Start date: 2023-05-09
Due date:
% Done: 100%
Estimated time:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Found in version:
Fixed in version: 12.3
SaxonC Languages:
SaxonC Platforms:
SaxonC Architecture:

Description

I've recently built an AWS Lambda running Python 3.10 that takes a request, transforms it and returns the result.

This all seems fine and works quite well.

The issue is that with each invocation the memory usage and runtime grow until the memory hits the limit and the Lambda instance is killed off.

I've knocked up a very simple program that I can run locally which demonstrates this behaviour.

import psutil
from saxoncpe import *

process = psutil.Process()
start_memory = process.memory_info().rss

with PySaxonProcessor(license=True) as saxon_processor:
    xslt_proc = saxon_processor.new_xslt30_processor()
    document = saxon_processor.parse_xml(xml_text="<request></request>")
    executable = xslt_proc.compile_stylesheet(stylesheet_file="transforms/QuoteRequest-sample.xslt")

    values = range(1000)
    for i in values:
        executable.transform_to_string(xdm_node=document)
        print(process.memory_info().rss)

end_process = psutil.Process()
print("Start memory:")
print(start_memory)
print("End memory:")
print(process.memory_info().rss)

If you comment out all of the Saxon code, you will see that the memory does not increase. If you move most of the Saxon calls out of the for loop, so they are only called once, and keep only the transform_to_string call in the loop, the memory still increases.

What is causing the memory to be locked? How can I release it so this can be used in an AWS Lambda handling hundreds or thousands of requests?
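
For reference, one obvious variation is to drop the returned string explicitly and force a Python garbage collection pass inside the loop; a minimal sketch of that (same stylesheet path as above) follows, although if the retained memory lives in the native GraalVM heap rather than in Python objects this would not be expected to help.

import gc
import psutil
from saxoncpe import *

process = psutil.Process()

with PySaxonProcessor(license=True) as saxon_processor:
    xslt_proc = saxon_processor.new_xslt30_processor()
    document = saxon_processor.parse_xml(xml_text="<request></request>")
    executable = xslt_proc.compile_stylesheet(stylesheet_file="transforms/QuoteRequest-sample.xslt")

    for i in range(1000):
        result = executable.transform_to_string(xdm_node=document)
        del result    # drop the Python reference to the returned string
        gc.collect()  # force a Python-level collection pass
        print(process.memory_info().rss)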

Actions #1

Updated by Michael Kay over 1 year ago

  • Project changed from Saxon to SaxonC
  • Category set to Python
  • Assignee set to O'Neil Delpratt
  • Priority changed from Low to Normal
  • Applies to branch deleted (12)
Actions #2

Updated by O'Neil Delpratt over 1 year ago

  • Status changed from New to In Progress

Thanks for reporting this problem and sending us your repro. I managed to reproduce the problem and am currently investigating it.

Actions #3

Updated by Mark Pierce over 1 year ago

Thank you, that’s good to hear!

Hopefully, it’s not too difficult to find the cause.

If you need anything else, please let me know.

Actions #4

Updated by O'Neil Delpratt over 1 year ago

Hi Mark,

We have been investigating this issue, specifically looking at the heap dump produced by GraalVM. By default, GraalVM sets the maximum available heap size to 25% of the total machine memory. It is possible that garbage collection is not being triggered on the Lambda server. I have looked at configuring the maxHeapSize on a standalone machine, which gave some improvement in memory usage.

I would like to understand more about your environment. What is your memory configuration, for example the maximum memory available?
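
As a rough illustration of what that 25% default implies (purely indicative, reusing psutil from your test program), you can estimate the heap ceiling GraalVM would assume on a given machine:

import psutil

# GraalVM's default maximum heap is 25% of the machine's physical memory.
total_bytes = psutil.virtual_memory().total
default_heap_cap = total_bytes // 4

print("Total physical memory (MB):", total_bytes // (1024 * 1024))
print("Approx. default GraalVM heap cap (MB):", default_heap_cap // (1024 * 1024))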

Actions #5

Updated by Mark Pierce over 1 year ago

Hi.

Currently we are running the AWS lambda with 1024MB RAM.

However, I am running the example I included locally, where I have 16GB RAM and am running macOS 13.3.1 (a) on an M1 Pro.

As this occurs when running locally, I do not believe this is a Lambda issue.

Thanks,

Mark.

Actions #6

Updated by O'Neil Delpratt over 1 year ago

Hi,

Yes, you are right, the problem is not specific to AWS Lambda. After some more investigation work, I found that we have a memory leak involving the string that is returned from transform_to_string.

I have applied the fix and the memory no longer goes up past the heap size limit.

Actions #7

Updated by Mark Pierce over 1 year ago

Great!

If you'd like me to do any testing please let me know.

Actions #8

Updated by O'Neil Delpratt over 1 year ago

Thanks Mark. I have sent you an email where you can download the Python wheels for testing.

Actions #9

Updated by O'Neil Delpratt over 1 year ago

I am interested in more strenuous tests, to learn how memory usage behaves as it approaches the limits.

Actions #10

Updated by O'Neil Delpratt over 1 year ago

I have applied another fix for this issue, which will be available in the next maintenance release.

I am also hoping to send out some test Python wheels soon.

Actions #11

Updated by Mark Pierce over 1 year ago

I still see the issue with the test version... though I may have installed it wrong I suppose.

Actions #12

Updated by O'Neil Delpratt over 1 year ago

Hi Mark,

Thanks for your feedback. We have built a new set of Python wheels (see the link sent via email) for you to test.

I do see a memory increase, but for me, under a large load, it levels off and does not keep growing within the for loop.

Is it possible we can plan a screen share or even a visit to your site?

Actions #13

Updated by Mark Pierce over 1 year ago

Ah I see now. I extended my test loop and can now see the memory coming back down. Thank you!
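
For anyone else checking the patched wheels: an extended check of this kind is essentially the original loop run much longer, with the RSS sampled periodically. A rough sketch (stylesheet path as in the original report):

import psutil
from saxoncpe import *

process = psutil.Process()

with PySaxonProcessor(license=True) as saxon_processor:
    xslt_proc = saxon_processor.new_xslt30_processor()
    document = saxon_processor.parse_xml(xml_text="<request></request>")
    executable = xslt_proc.compile_stylesheet(stylesheet_file="transforms/QuoteRequest-sample.xslt")

    # With the patched wheels the RSS climbs at first, then levels off
    # and comes back down instead of growing without bound.
    for i in range(20000):
        executable.transform_to_string(xdm_node=document)
        if i % 1000 == 0:
            print(i, process.memory_info().rss // (1024 * 1024), "MB")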

Actions #14

Updated by O'Neil Delpratt over 1 year ago

Thanks for confirming that the patch has worked. It would be good to know how the Python script behaves on your AWS server.

Actions #15

Updated by O'Neil Delpratt over 1 year ago

A further bug fix has been applied in SaxonProcessor.h to delete a string array that is no longer used by the Python code.

Actions #16

Updated by Mark Pierce over 1 year ago

Sorry for the late reply. Due to the original issue we have pivoted back to using the Java version, coupled with SnapStart, which performs well. I will try the fixed Python version, but business priorities mean it won't be for a while.

The code we were running, however, was pretty much the following (a rough handler sketch is at the end of this comment):

  • Read in the request transform file
  • Call Saxon
  • Call the HTTP service
  • Read in the response transform file
  • Call Saxon
  • Return the result

And under heavy load you could see the memory creeping up until the Lambda killed itself.

We ran the lambda with 1024MB RAM.
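
In rough outline the handler shape was something like the sketch below; the transform file names, the endpoint URL, the event field and the urllib-based HTTP call are placeholders rather than our real code.

import urllib.request
from saxoncpe import *

# Created once per Lambda execution environment and reused across warm invocations.
saxon_processor = PySaxonProcessor(license=True)
xslt_proc = saxon_processor.new_xslt30_processor()

SERVICE_URL = "https://example.internal/quotes"  # placeholder endpoint


def handler(event, context):
    # 1. Read in the request transform file and call Saxon on the incoming body.
    request_xform = xslt_proc.compile_stylesheet(stylesheet_file="transforms/request.xslt")  # placeholder path
    incoming = saxon_processor.parse_xml(xml_text=event["body"])  # placeholder event field
    outgoing = request_xform.transform_to_string(xdm_node=incoming)

    # 2. Call the downstream HTTP service with the transformed request.
    req = urllib.request.Request(SERVICE_URL, data=outgoing.encode("utf-8"),
                                 headers={"Content-Type": "application/xml"})
    with urllib.request.urlopen(req) as resp:
        service_response = resp.read().decode("utf-8")

    # 3. Read in the response transform file, call Saxon again and return the result.
    response_xform = xslt_proc.compile_stylesheet(stylesheet_file="transforms/response.xslt")  # placeholder path
    response_doc = saxon_processor.parse_xml(xml_text=service_response)
    result = response_xform.transform_to_string(xdm_node=response_doc)
    return {"statusCode": 200, "body": result}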

Actions #17

Updated by O'Neil Delpratt over 1 year ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

I have applied a number of fixes in this area to clear memory created by SaxonC, but I will continue to monitor this issue.

Actions #18

Updated by O'Neil Delpratt over 1 year ago

  • Status changed from Resolved to Closed
  • Fixed in version set to 12.3

Bug fix applied in the SaxonC 12.3 maintenance release.
