Feature #6316


Make PyXslt30Processor and PySaxonProcessor serializable/pickleable

Added by Youssef Bettayeb 6 months ago. Updated 6 months ago.

Python API
Start date:
Due date:
% Done:


Estimated time:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Found in version:
Fixed in version:
SaxonC Languages:
SaxonC Platforms:
SaxonC Architecture:


Hello, I am working with saxonche in a Databricks/PySpark environment. When applying transformations to XMLs using saxonche, I cannot make use of the parallel processing capabilities, because PyXslt30Processor objects are not "serializable" and thus it is impossible to use a UDF etc. to mass-transform XMLs (I have multiple XMLs and one single XSLT to apply to all of them). The error I get when doing so is: "TypeError: no default __reduce__ due to non-trivial __cinit__".

Would it be possible to make those objects serializable?

Thanks a lot for your time.
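Since the processor cannot cross the pickle boundary, a common workaround until the classes are pickleable is to create the processor lazily inside each worker process and ship only plain functions to the executors. The following is a minimal, pure-Python sketch of that pattern; `_UnpicklableProcessor` is a hypothetical stand-in for the real saxonche processor (which is not imported here), and the real calls are indicated in comments.

```python
import pickle

class _UnpicklableProcessor:
    """Hypothetical stand-in for saxonche's processors, which raise
    'TypeError: no default __reduce__ due to non-trivial __cinit__'
    when pickled."""
    def __reduce__(self):
        raise TypeError("no default __reduce__ due to non-trivial __cinit__")

_PROC = None  # one processor per worker process, created on first use

def get_processor():
    # Build the processor inside the worker instead of serializing it;
    # only plain, picklable functions cross the process boundary.
    global _PROC
    if _PROC is None:
        _PROC = _UnpicklableProcessor()  # real code: PySaxonProcessor(...)
    return _PROC

def transform(xml_text: str) -> str:
    proc = get_processor()
    # Real code would compile the stylesheet once per worker and call
    # transform_to_string(...); here we just fake a transformation.
    return xml_text.upper()
```

In PySpark, `transform` (not the processor) would be the function wrapped in a UDF, so pickling the processor is never attempted.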

Actions #1

Updated by O'Neil Delpratt 6 months ago

  • Category set to Python API


It should be possible to make the SaxonC classes, which are defined in Cython, pickleable. From my reading, given that we have redefined the __cinit__ method for each class, we would need to define implementations of the __reduce__, __getstate__ and __setstate__ methods to make them work. We will discuss this feature request with the team.
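The __reduce__ approach described above can be illustrated in plain Python. The `Handle` class below is a hypothetical stand-in, not SaxonC's actual implementation: an object holding unpicklable native state is rebuilt from its constructor arguments on the receiving side, rather than serialized field by field.

```python
import pickle

class Handle:
    """Sketch of the pickling pattern: instances hold unpicklable
    state (a lambda here, standing in for Cython-allocated memory),
    but __reduce__ makes them picklable anyway."""
    def __init__(self, config: str):
        self.config = config
        self._native = lambda: None  # unpicklable by default

    def __reduce__(self):
        # Serialize only the constructor arguments; the native state
        # is recreated in __init__ when the pickle is loaded.
        return (Handle, (self.config,))

h = Handle("license=false")
clone = pickle.loads(pickle.dumps(h))  # works despite the lambda
```

Without `__reduce__`, `pickle.dumps(h)` would fail on the lambda attribute; the same idea applies to a Cython class whose __cinit__ allocates native resources.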

I have never used a Databricks/PySpark environment, but I am wondering if you have a simple repo which we can test?

Actions #2

Updated by Youssef Bettayeb 6 months ago

Not sure what you mean by a repo for a Databricks/PySpark environment, as those are cloud solutions for which you need either to set up the environment on a cloud provider or perhaps use PySpark in a Jupyter environment on a local machine. This link is a nice starting point. As for a code sample of how the SaxonC classes would be used once pickleable, I published this snippet.

If you need anything else, feel free to ask.


