Is PyXdmNode meant to be thread-safe?
Added by Martin Honnen over 2 years ago
I am wondering whether I can build a single PyXdmNode once with SaxonC and parse_xml, to be used as the same input to various XsltExecutables in different threads.
I tried the code sample below and don't get any errors, but also no output files:
import threading
from saxonc import *

class myThread (threading.Thread):
    def __init__(self, threadID, name, counter, node, saxon_proc):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.counter = counter
        self.name = name
        self.node = node
        self.saxon_proc = saxon_proc
        self.xslt30_processor = saxon_proc.new_xslt30_processor()
        self.xslt30_processor.set_cwd('.')

    def run(self):
        print ("Starting " + self.name)
        run_transform(self.name, self.counter, self.node, self.xslt30_processor, self.saxon_proc)
        print ("Exiting " + self.name)

def run_transform(threadName, counter, node, xslt30_processor, saxon_proc):
    sheet_file = "sheet-samples/sheet{}.xsl".format(counter)
    result_file = "threading-example/result-{}.xml".format(counter)
    print('Transforming with', sheet_file, 'to', result_file)
    xslt30_processor.transform_to_file(xdm_node = node, stylesheet_file = sheet_file, output_file = result_file)
    print(xslt30_processor.error_message)
    saxon_proc.detach_current_thread

with PySaxonProcessor(license = False) as saxon_proc:
    xdm_node = saxon_proc.parse_xml(xml_file_name = 'input-samples/sample-1.xml')

    # Create new threads
    thread1 = myThread(1, "Thread-1", 1, xdm_node, saxon_proc)
    thread2 = myThread(2, "Thread-2", 2, xdm_node, saxon_proc)
    thread3 = myThread(3, "Thread-3", 3, xdm_node, saxon_proc)

    # Start new Threads
    thread1.start()
    thread2.start()
    thread3.start()
    thread1.join()
    thread2.join()
    thread3.join()
    print ("Exiting Main Thread")
Replies (9)
RE: Is PyXdmNode meant to be thread-safe? - Added by Martin Honnen over 2 years ago
So my previous attempt using transform_to_file seems to have been doomed by that method ignoring xdm_node.
Therefore I have changed to use apply_templates_returning_file(xdm_value=..., e.g.
import threading
from saxonc import *

class myThread (threading.Thread):
    def __init__(self, threadID, name, counter, node, saxon_proc):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.counter = counter
        self.name = name
        self.node = node
        self.saxon_proc = saxon_proc
        self.xslt30_processor = saxon_proc.new_xslt30_processor()
        self.xslt30_processor.set_cwd('.')

    def run(self):
        print ("Starting " + self.name)
        run_transform(self.name, self.counter, self.node, self.xslt30_processor, self.saxon_proc)
        print ("Exiting " + self.name)

def run_transform(threadName, counter, node, xslt30_processor, saxon_proc):
    sheet_file = "sheet-samples/sheet{}.xsl".format(counter)
    result_file = "threading-example/result-{}.xml".format(counter)
    print('Transforming with', sheet_file, 'to', result_file)
    xslt_executable = xslt30_processor.compile_stylesheet(stylesheet_file = sheet_file)
    xslt_executable.apply_templates_returning_file(xdm_value = node, output_file = result_file)
    print(xslt_executable.error_message)
    saxon_proc.detach_current_thread

with PySaxonProcessor(license = False) as saxon_proc:
    xdm_node = saxon_proc.parse_xml(xml_file_name = 'input-samples/sample-1.xml')

    # Create new threads
    thread1 = myThread(1, "Thread-1", 1, xdm_node, saxon_proc)
    thread2 = myThread(2, "Thread-2", 2, xdm_node, saxon_proc)
    thread3 = myThread(3, "Thread-3", 3, xdm_node, saxon_proc)

    # Start new Threads
    thread1.start()
    thread2.start()
    thread3.start()
    thread1.join()
    thread2.join()
    thread3.join()
    print ("Exiting Main Thread")
This seems to run the first transformation fine but then dies on the second with a core dump:
Starting Thread-1
Transforming with sheet-samples/sheet1.xsl to threading-example/result-1.xml
Starting Thread-2
Transforming with sheet-samples/sheet2.xsl to threading-example/result-2.xml
JET RUNTIME HAS DETECTED UNRECOVERABLE ERROR: system exception at 0x0000000000a6730e
JET RUNTIME HAS DETECTED UNRECOVERABLE ERROR: system exception at 0x0000000000a6730e
Please, contact the vendor of the application.
Crash dump will be written to "C:\SomePath\SomeDir\jet_dump_31972.dmp"
Exception 0xC0000005 (EXCEPTION_ACCESS_VIOLATION) at 0x0000000000a6730e (C:\Program Files\Saxonica\SaxonC HE 11.3\libsaxonhec.dll+0x66730e)
Failed to read memory at 0x000000440fbf0000
Is that due to using the same PyXdmNode in different threads (Mike says that on the Java side, with the default tiny tree, XdmNode is thread-safe) or due to other reasons?
RE: Is PyXdmNode meant to be thread-safe? - Added by O'Neil Delpratt over 2 years ago
Hi Martin,
Please can you send me the data files you are using. Thanks
RE: Is PyXdmNode meant to be thread-safe? - Added by Martin Honnen over 2 years ago
I simply tried any input fed to some identity transformation adding some comment about sheet and time, see the attached zip.
Attachment: sheet-samples.zip (2.18 KB)
RE: Is PyXdmNode meant to be thread-safe? - Added by O'Neil Delpratt over 2 years ago
Thanks for sending the zip file. For some strange reason I am getting the following error:
AttributeError: 'NoneType' object has no attribute 'apply_templates_returning_file'
This means the stylesheet compilation is failing, but I don't know why.
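The AttributeError above arises because the value returned by compile_stylesheet was None by the time apply_templates_returning_file was called on it. A minimal sketch of a guard that fails early with a clearer message follows. The names compile_or_fail, StylesheetCompileError and FakeProcessor are hypothetical and exist only for illustration; the only things taken from the code in this thread are the compile_stylesheet(stylesheet_file=...) call and the error_message attribute.

```python
class StylesheetCompileError(RuntimeError):
    """Raised when compilation yields no executable (hypothetical helper type)."""


def compile_or_fail(xslt_processor, sheet_file):
    """Compile a stylesheet and raise instead of silently passing None along.

    `xslt_processor` is assumed to expose compile_stylesheet(stylesheet_file=...)
    and an `error_message` attribute, as used in the posts above.
    """
    executable = xslt_processor.compile_stylesheet(stylesheet_file=sheet_file)
    if executable is None:
        detail = getattr(xslt_processor, "error_message", None)
        raise StylesheetCompileError(
            "could not compile {!r}: {}".format(sheet_file, detail)
        )
    return executable


class FakeProcessor:
    """Stand-in for a real processor so the sketch runs without SaxonC."""

    error_message = "simulated failure"

    def compile_stylesheet(self, stylesheet_file):
        return None  # mimic a compilation that produced no executable


if __name__ == "__main__":
    try:
        compile_or_fail(FakeProcessor(), "sheet1.xsl")
    except StylesheetCompileError as exc:
        print(exc)
```

With a guard like this, a failed compilation is reported at the point where it happens rather than one call later.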
RE: Is PyXdmNode meant to be thread-safe? - Added by Martin Honnen over 2 years ago
Perhaps the unzipping went wrong and the XSLT to be compiled is not there or corrupted?
But I don't know; perhaps it is also a Windows versus Linux problem. The further simplified code in the newly attached zip does run and produce output on Windows before crashing JET, while under Linux I get an error I don't understand either, similar to yours:
Starting Thread-1
Transforming with sheet1.xsl to threading-example/result-1.xml
Starting Thread-2
Transforming with sheet2.xsl to threading-example/result-2.xml
None
Exception in thread Thread-1:
Starting Thread-3
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
Transforming with sheet3.xsl to threading-example/result-3.xml
File "./test1.py", line 18, in run
None
run_transform(self.name, self.counter, self.node, self.xslt30_processor, self.saxon_proc)
File "./test1.py", line 28, in run_transform
None
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
Exception in thread Thread-3:
Traceback (most recent call last):
xslt_executable.apply_templates_returning_file(xdm_value = node, output_file = result_file)
self.run()
File "./test1.py", line 18, in run
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
AttributeError: 'NoneType' object has no attribute 'apply_templates_returning_file'
self.run()
run_transform(self.name, self.counter, self.node, self.xslt30_processor, self.saxon_proc)
File "./test1.py", line 18, in run
File "./test1.py", line 28, in run_transform
run_transform(self.name, self.counter, self.node, self.xslt30_processor, self.saxon_proc)
File "./test1.py", line 28, in run_transform
xslt_executable.apply_templates_returning_file(xdm_value = node, output_file = result_file)
xslt_executable.apply_templates_returning_file(xdm_value = node, output_file = result_file)
AttributeError: 'NoneType' object has no attribute 'apply_templates_returning_file'
AttributeError: 'NoneType' object has no attribute 'apply_templates_returning_file'
Exiting Main Thread
JET RUNTIME HAS DETECTED UNRECOVERABLE ERROR: runtime error
Thread 2786 ["Thread-1"] is terminated without notifying the JVM. Probably, "DetachCurrentThread" function was not called
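In the traceback above, the AttributeError is raised on the apply_templates_returning_file line, so the detach line that follows it in run_transform is never reached in the failing threads. A defensive pattern is to put the cleanup step in a finally block so that it runs even when the work raises. The sketch below shows only the Python pattern: DetachingThread is a hypothetical class, and `detach` is a plain callable that the caller would point at whatever detach call the installed SaxonC version provides (an assumption, not checked against the SaxonC API here).

```python
import threading


class DetachingThread(threading.Thread):
    """Thread that always runs a cleanup callable when its work ends."""

    def __init__(self, work, detach):
        super().__init__()
        self._work = work
        self._detach = detach
        self.error = None

    def run(self):
        try:
            self._work()
        except Exception as exc:
            # Keep the failure so the caller can inspect it after join().
            self.error = exc
        finally:
            # Reached whether or not work() raised.
            self._detach()


if __name__ == "__main__":
    events = []

    def failing_work():
        raise AttributeError("simulated: executable is None")

    worker = DetachingThread(failing_work, lambda: events.append("detached"))
    worker.start()
    worker.join()
    print(events, repr(worker.error))
```

Whether this removes the JET message in the real scripts has not been verified in this thread; it only guarantees that the cleanup call is attempted.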
RE: Is PyXdmNode meant to be thread-safe? - Added by Martin Honnen over 2 years ago
As far as I can see, even on Linux one transformation runs through and creates a result, while the other two give that exception, as if the execution of one thread corrupted the data of the others. Why that happens on Linux and not in the same way on Windows is something I can't explain.
RE: Is PyXdmNode meant to be thread-safe? - Added by Martin Honnen over 2 years ago
As a sanity check that the code is doing the right thing without threading, I ran
#import threading
from saxonc import *

class myThread ():
    def __init__(self, threadID, name, counter, node, saxon_proc):
        #threading.Thread.__init__(self)
        self.threadID = threadID
        self.counter = counter
        self.name = name
        self.node = node
        self.saxon_proc = saxon_proc
        self.xslt30_processor = saxon_proc.new_xslt30_processor()
        self.xslt30_processor.set_cwd('.')

    def start(self):
        self.run()

    def run(self):
        print ("Starting " + self.name)
        run_transform(self.name, self.counter, self.node, self.xslt30_processor, self.saxon_proc)
        print ("Exiting " + self.name)

    def join(self):
        return

def run_transform(threadName, counter, node, xslt30_processor, saxon_proc):
    sheet_file = "sheet{}.xsl".format(counter)
    result_file = "threading-example/result-{}.xml".format(counter)
    print('Transforming with', sheet_file, 'to', result_file)
    xslt30_processor.set_cwd('.')
    xslt_executable = xslt30_processor.compile_stylesheet(stylesheet_file = sheet_file)
    print('Error after compiling:', xslt30_processor.error_message)
    xslt_executable.apply_templates_returning_file(xdm_value = node, output_file = result_file)
    print('Error after applying templates', xslt_executable.error_message)
    saxon_proc.detach_current_thread

with PySaxonProcessor(license = False) as saxon_proc:
    xdm_node = saxon_proc.parse_xml(xml_file_name = 'sample-1.xml')

    # Create new threads
    thread1 = myThread(1, "Thread-1", 1, xdm_node, saxon_proc)
    thread2 = myThread(2, "Thread-2", 2, xdm_node, saxon_proc)
    thread3 = myThread(3, "Thread-3", 3, xdm_node, saxon_proc)

    # Start new Threads
    thread1.start()
    thread2.start()
    thread3.start()
    thread1.join()
    thread2.join()
    thread3.join()
    print ("Exiting Main Thread")
and then indeed all transformations run through fine.
So it looks like using threading messes up the Saxon state of the supposedly thread-separated variables.
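One detail of the threaded scripts above that may be worth keeping in mind: in Python, the __init__ of a threading.Thread subclass runs in the thread that constructs the object, and only run() executes in the newly started thread. The new_xslt30_processor() call in myThread.__init__ is therefore made from the main thread, not from the worker. Whether that matters to SaxonC is not established in this thread; the sketch below only demonstrates the Python behaviour, using a hypothetical Probe class.

```python
import threading


class Probe(threading.Thread):
    """Records which thread executes __init__ and which executes run()."""

    def __init__(self):
        super().__init__()
        # Executed by whichever thread constructs the object.
        self.init_thread_id = threading.get_ident()
        self.run_thread_id = None

    def run(self):
        # Executed by the newly started thread.
        self.run_thread_id = threading.get_ident()


if __name__ == "__main__":
    main_id = threading.get_ident()
    probe = Probe()
    probe.start()
    probe.join()
    print("constructed in main thread:", probe.init_thread_id == main_id)
    print("ran in main thread:", probe.run_thread_id == main_id)
```

If per-thread objects are wanted, creating them inside run() (or inside the function that run() calls) is the way to make sure they are created by the worker thread itself.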
RE: Is PyXdmNode meant to be thread-safe? - Added by Michael Lisitsa about 2 years ago
I was able to create a temporary XML file from a string and refer to its path using source_file as a workaround for the apparently non-thread-safe PyXdmNode. Here is a GitHub issue that describes the approach of using the Python tempfile package: https://github.com/PyFilesystem/pyfilesystem2/issues/402
Has there been any progress on a way to use the Xdm_node as an argument in multiple threads to a single PyXsltExecutable?
import os
import tempfile
import saxonc

# Stylesheet compiled once at startup of a FastAPI server:
proc = saxonc.PySaxonProcessor(license = False)
xsltproc = proc.new_xslt30_processor()
executable = xsltproc.compile_stylesheet(...)

def request_handler_runs_in_multiple_threads(xml_string):
    # delete=False ensures the file persists after .close() is called. Closing is
    # necessary because Python may still be internally buffering the data
    # (per the GitHub issue linked above).
    tmp_file = tempfile.NamedTemporaryFile(mode="w", suffix=".xml", prefix="myname", delete=False)
    tmp_file.write(xml_string)  # write the incoming XML document
    tmp_file.close()
    output = executable.apply_templates_returning_string(source_file=tmp_file.name)
    os.unlink(tmp_file.name)
    return output
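The workaround above can be made a little more robust by removing the temporary file in a finally block, so that it is cleaned up even if the transformation raises. A minimal sketch, under these assumptions: transform_via_temp_file is a hypothetical helper, and `transform` is any callable that accepts a file path, for example a lambda wrapping the apply_templates_returning_string(source_file=...) call shown above. The UTF-8 encoding is also an assumption and should match the encoding the XML document declares.

```python
import os
import tempfile


def transform_via_temp_file(xml_string, transform):
    """Write xml_string to a temporary file, pass its path to transform, clean up."""
    # delete=False keeps the file on disk after it is closed, so that another
    # component can open it by name.
    tmp_file = tempfile.NamedTemporaryFile(
        mode="w", suffix=".xml", delete=False, encoding="utf-8"
    )
    try:
        with tmp_file:
            # Leaving the with block closes the file and flushes buffered data.
            tmp_file.write(xml_string)
        return transform(tmp_file.name)
    finally:
        # Runs on success and on failure, so no temporary files are left behind.
        os.unlink(tmp_file.name)


if __name__ == "__main__":
    def read_back(path):
        # Stand-in for a real transformation: just return the file contents.
        with open(path, encoding="utf-8") as handle:
            return handle.read()

    print(transform_via_temp_file("<doc/>", read_back))
```

Each call uses its own file, so concurrent request handlers do not share any input object.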
RE: Is PyXdmNode meant to be thread-safe? - Added by O'Neil Delpratt about 2 years ago
Apologies, I have dropped the ball on this forum post. I will create a bug issue and investigate this further.