Feature #6316

open

Make PyXslt30Processor and PySaxonProcessor serializable/pickleable

Added by Youssef Bettayeb 10 months ago. Updated 10 months ago.

Status:
New
Priority:
Low
Assignee:
-
Category:
Python API
Start date:
2024-01-11
Due date:
% Done:

0%

Estimated time:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Found in version:
Fixed in version:
SaxonC Languages:
SaxonC Platforms:
SaxonC Architecture:

Description

Hello, I am working with saxonche in a Databricks/PySpark environment. When applying transformations to XML documents with saxonche, I cannot make use of Spark's parallel processing capabilities because PyXslt30Processor objects are not "serializable", so it is impossible to use a UDF or similar to mass-transform XML documents (I have multiple XML documents and a single XSLT to apply to all of them). The error I get when I try is: Python process TypeError: no default __reduce__ due to non-trivial __cinit__

Would it be possible to make those objects serializable?

Thanks a lot for your time.

Actions #1

Updated by O'Neil Delpratt 10 months ago

  • Category set to Python API

Hi,

It should be possible to make the SaxonC classes, which are defined in Cython, pickleable. From my reading, given that we have defined our own __cinit__ method for each class, we would need to provide implementations of the methods __reduce__, __getstate__ and __setstate__ to make pickling work. We will discuss this feature request with the team.

I have never used a Databricks/PySpark environment, but I am wondering if you have a simple repo which we can test?
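As a rough illustration of the __reduce__ approach, here is a minimal pure-Python sketch. The Processor class, its stylesheet_text argument and its _handle attribute are hypothetical stand-ins and are not part of the SaxonC API; a threading.Lock plays the role of the native handle because it cannot be pickled either.

```python
import pickle
import threading


class Processor:
    """Hypothetical wrapper around a handle that cannot be pickled."""

    def __init__(self, stylesheet_text):
        # Plain, picklable configuration needed to rebuild the object.
        self.stylesheet_text = stylesheet_text
        # Stand-in for a native pointer: locks are not picklable.
        self._handle = threading.Lock()

    def __reduce__(self):
        # Tell pickle to recreate the object by calling
        # Processor(stylesheet_text) instead of copying its attributes,
        # so the unpicklable handle is never serialized.
        return (Processor, (self.stylesheet_text,))


original = Processor("<xsl:stylesheet/>")
clone = pickle.loads(pickle.dumps(original))
```

The clone carries the same configuration but owns a freshly created handle, which is presumably what a worker process would need. Whether the real classes can be rebuilt from constructor arguments alone is an assumption that would have to be checked against the Cython sources.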

Actions #2

Updated by Youssef Bettayeb 10 months ago

I am not sure what you mean by a repo for a Databricks/PySpark environment, as those are cloud solutions: you need either to set up the environment on a cloud provider or to use PySpark in a Jupyter environment on a local machine. This link is a nice starting point: https://towardsdatascience.com/how-to-use-pyspark-on-your-computer-9c7180075617

As for a code sample showing how the SaxonC classes would be used once they are pickleable, I published this snippet: https://github.com/ybettayeb/sample-udf-saxonche/blob/main/sample.py

If you need anything else, feel free to ask.

Sincerely,

Youssef.
