Is there some way to use ThreadPoolExecutor with Python API and ensure that detach_current_thread is properly called?
Added by Martin Honnen over 2 years ago
I am looking into multi-threading with Python and SaxonC (testing with 11.3 HE) again, and I am wondering whether there is a way to use a ThreadPoolExecutor and ensure that detach_current_thread is called properly.
My test code is:

from saxonc import *
import glob
from concurrent.futures import ThreadPoolExecutor

xsltExecutable = None

def run_transform(input_file):
    result_file = input_file.replace('input-samples', 'python-output-samples')
    print('Transforming file', input_file, 'to file', result_file)
    xsltExecutable.transform_to_file(source_file = input_file, output_file = result_file)

with PySaxonProcessor(license = False) as saxon:
    print(saxon.version)
    input_files = glob.glob('input-samples/*')
    print(input_files)
    xslt30Processor = saxon.new_xslt30_processor()
    xsltExecutable = xslt30Processor.compile_stylesheet(stylesheet_file = 'transform-file.xsl')
    xsltExecutable.set_cwd('.')
    with ThreadPoolExecutor(max_workers = 4) as executor:
        executor.map(run_transform, input_files)
This seems to process and transform all files just fine, but then Python/SaxonC crashes and dumps core with, e.g.:
JET RUNTIME HAS DETECTED UNRECOVERABLE ERROR: runtime error
Thread 3128 ["Thread-1"] is terminated without notifying the JVM. Probably, "DetachCurrentThread" function was not called
At which point would I need to inject the saxon.detach_current_thread call to avoid the crash and core dump?
Replies (2)
RE: Is there some way to use ThreadPoolExecutor with Python API and ensure that detach_current_thread is properly called? - Added by O'Neil Delpratt over 2 years ago
I think you need to find a way to call detach_current_thread() at the end of run_transform(). Is it possible to pass the PySaxonProcessor as an argument?
The following works for me:
from saxonc import *
import glob
from concurrent.futures import ThreadPoolExecutor

xsltExecutable = None
saxon = None

def run_transform(input_file):
    result_file = input_file.replace('input-samples', 'python-output-samples')
    print('Transforming file', input_file, 'to file', result_file)
    xsltExecutable.transform_to_file(source_file = input_file, output_file = result_file)
    saxon.detach_current_thread()

saxon = PySaxonProcessor(license = False)
print(saxon.version)
input_files = glob.glob('../../samples/data/*')
print(input_files)
xslt30Processor = saxon.new_xslt30_processor()
xsltExecutable = xslt30Processor.compile_stylesheet(stylesheet_file = 'transform-file.xsl')
xsltExecutable.set_cwd('.')
with ThreadPoolExecutor(max_workers = 4) as executor:
    executor.map(run_transform, input_files)
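One thing to watch with a detach call at the end of run_transform(): if the transformation raises, that last line is never reached. A minimal sketch of guarding it with try/finally follows. It runs without SaxonC, so StubProcessor and transform() are hypothetical stand-ins for the PySaxonProcessor and the transform_to_file call; only the method name detach_current_thread is taken from this thread. Whether a failed transformation leaves the thread attached in exactly the same way has not been checked here.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class StubProcessor:
    """Hypothetical stand-in for PySaxonProcessor; it only counts detach calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self.detach_calls = 0

    def detach_current_thread(self):
        with self._lock:
            self.detach_calls += 1


saxon = StubProcessor()


def transform(input_file):
    # Placeholder for xsltExecutable.transform_to_file(...); fails on '.bad' files
    if input_file.endswith('.bad'):
        raise ValueError('cannot transform ' + input_file)
    return input_file.replace('input-samples', 'python-output-samples')


def run_transform(input_file):
    try:
        return transform(input_file)
    finally:
        # Executed whether transform() returned normally or raised
        saxon.detach_current_thread()


input_files = ['input-samples/a.xml', 'input-samples/b.bad', 'input-samples/c.xml']

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(run_transform, f) for f in input_files]

# Leaving the with block waits for all tasks, so every future is done here
results = [f.result() for f in futures if f.exception() is None]
errors = [f.exception() for f in futures if f.exception() is not None]
print(results, errors, saxon.detach_calls)
```

The sketch uses submit() rather than map() so that failures can be inspected per file. With executor.map, an exception from a worker only surfaces when the returned iterator is consumed, so code that discards that iterator never sees it.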
RE: Is there some way to use ThreadPoolExecutor with Python API and ensure that detach_current_thread is properly called? - Added by Martin Honnen over 2 years ago
Thanks for the suggestion. Indeed, a global variable for the Saxon processor helps, and then it is just a matter of calling detach_current_thread each time in run_transform. I had thought that, because a thread pool reuses its threads, this would somehow detach threads too often, but it at least doesn't give any errors and doesn't crash JET, so it seems to be a workable approach.
As for passing arguments to map, I will need to check whether Python allows some kind of closure to do that.
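For what it's worth, Python does allow this: functools.partial or a lambda closure can bind the processor as an extra argument before the function is handed to map. The following is only a sketch that runs without SaxonC. StubProcessor is a hypothetical stand-in for the PySaxonProcessor, the two-argument run_transform signature is an assumption, and the body just computes the output file name instead of running a transformation.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
import threading


class StubProcessor:
    """Hypothetical stand-in for PySaxonProcessor; it only counts detach calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self.detach_calls = 0

    def detach_current_thread(self):
        with self._lock:
            self.detach_calls += 1


def run_transform(processor, input_file):
    # Assumed signature: the processor is passed in rather than read from a global
    try:
        # Placeholder for the real transform_to_file(...) call
        return input_file.replace('input-samples', 'python-output-samples')
    finally:
        processor.detach_current_thread()


processor = StubProcessor()
input_files = ['input-samples/a.xml', 'input-samples/b.xml']

with ThreadPoolExecutor(max_workers=4) as executor:
    # Option 1: functools.partial fixes the first positional argument
    results_partial = list(executor.map(partial(run_transform, processor), input_files))
    # Option 2: a lambda closes over the processor variable
    results_lambda = list(executor.map(lambda f: run_transform(processor, f), input_files))

print(results_partial)
print(results_lambda)
```

Wrapping the map call in list() consumes the results, which come back in input order, and makes any exception raised in a worker visible in the calling thread.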