Bug #4428

closed

Multi-threading support of Python bindings Saxon-C

Added by Andreas Jung about 4 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Category: Saxon-C Internals
Start date: 2020-01-14
Due date:
% Done: 100%
Estimated time:
Found in version: 1.2.1
Fixed in version: 11.3
Platforms:

Description

Question regarding the Python bindings: are the Python bindings thread-safe?


Related issues

Is duplicate of SaxonC - Bug #5373: Python multithreading code crashes (Closed, O'Neil Delpratt, 2022-03-07)

Actions #1

Updated by Michael Kay about 4 years ago

  • Project changed from Saxon to SaxonC
  • Assignee set to O'Neil Delpratt
Actions #2

Updated by O'Neil Delpratt about 4 years ago

  • Status changed from New to AwaitingInfo

Saxon/C has been cross compiled using Excelsior JET. The runtime maps Java threads directly onto native operating system threads.

Saxon/C is designed to support common scenarios like compiling a stylesheet once and then using it repeatedly, in multiple threads, to perform transformations.

Would you be able to give more detail on what you mean by "thread-safe"? For instance, in Saxon the xsl:result-document instruction can be executed in multiple threads. See the configuration feature ALLOW_MULTITHREADING and its use in the XSLT instruction xsl:result-document.

Actions #3

Updated by Andreas Jung about 4 years ago

The question is about thread safety within Python. For example, Python web applications and web frameworks commonly use threads as the worker model for processing requests. The question is whether it is safe to use Saxon-C through its Python bindings in a multi-threaded Python application. With concurrent web requests, two threads may process different XML data at the same time. This boils down to whether there is global state somewhere in Saxon-C or its bindings that would require locking.

Actions #4

Updated by O'Neil Delpratt about 4 years ago

  • Status changed from AwaitingInfo to In Progress
  • Priority changed from Low to Normal

There is no global state in Saxon/C or its bindings.

Reading documents is thread safe.

However, the processors for XSLT, XQuery and XPath are not thread safe, as they currently hold internal state. We are treating this as a bug and will be looking at changes to their design to make them thread safe.
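While a processor still holds internal state, one conservative option is to serialize calls with a lock, which is the kind of locking the original question asks about. The following is a minimal sketch only: StatefulProcessor is a hypothetical stand-in for any object with per-call state and is not a SaxonC class, and LockedProcessor and transform_concurrently are illustrative names.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a processor that keeps per-call state in a
// member field. It is NOT a SaxonC class.
class StatefulProcessor {
public:
    std::string transform(const std::string &input) {
        source_ = input;  // internal state overwritten on every call
        return "<out>" + source_ + "</out>";
    }

private:
    std::string source_;
};

// Wrapper that lets only one thread at a time into the stateful object.
class LockedProcessor {
public:
    std::string transform(const std::string &input) {
        std::lock_guard<std::mutex> guard(mutex_);
        return inner_.transform(input);
    }

private:
    std::mutex mutex_;
    StatefulProcessor inner_;
};

// Runs `count` threads against one shared LockedProcessor and returns
// each thread's result, indexed by thread number.
std::vector<std::string> transform_concurrently(int count) {
    LockedProcessor processor;
    std::vector<std::string> results(count);
    std::vector<std::thread> workers;
    for (int i = 0; i < count; ++i) {
        workers.emplace_back([&processor, &results, i] {
            // Each thread writes only to its own slot of `results`.
            results[i] = processor.transform("doc" + std::to_string(i));
        });
    }
    for (std::thread &worker : workers) {
        worker.join();
    }
    return results;
}
```

The lock trades throughput for safety: transformations run one at a time, so this is a stopgap rather than a substitute for stateless processors.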

Is it possible to send us a sample multi-threaded Python web application that we can use for testing, please?

Actions #5

Updated by Andreas Jung about 4 years ago

Thanks for the information.

My own XML CMS platform, xml-director.info, is based on Python 3 and the Plone 5.2 CMS, and the typical out-of-the-box configuration of the web server stack is based on Python threads. Nowadays there are also other options, such as a fork-worker model.

My interest in Saxon-C comes from the decent XML processing capabilities it now makes available, which we have been missing in Python 2 and 3 for many, many years due to the limitations of libxml2. So I am happy to see state-of-the-art XML processing capabilities for upcoming projects.

Case closed...

Actions #6

Updated by O'Neil Delpratt about 4 years ago

Update:

I have made changes to the Java code to make the XSLT and XQuery processors thread-safe; specifically, I have made them stateless. In the C++ code we have added the methods 'attachThread' and 'detachThread' to the class SaxonProcessor for JNI purposes. A new C++ thread has to call attachThread when it starts and detachThread when it ends. This is a work in progress, so the API design might change.
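Because every attachThread call needs a matching detachThread call, the pairing could be tied to a scope so that detaching happens even when the thread body exits early. The following is a minimal sketch under stated assumptions: FakeProcessor is a hypothetical stand-in that only counts calls and is not the SaxonC class, and ThreadAttachment and attached_after_worker are illustrative names that are not part of the SaxonC API.

```cpp
#include <atomic>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in exposing the two method names mentioned above.
// It only counts attachments; it does NOT call into JNI or SaxonC.
class FakeProcessor {
public:
    void attachThread() { attached_.fetch_add(1); }
    void detachThread() { attached_.fetch_sub(1); }
    int attachedCount() const { return attached_.load(); }

private:
    std::atomic<int> attached_{0};
};

// Scope guard: attaches in the constructor, detaches in the destructor.
template <typename Processor>
class ThreadAttachment {
public:
    explicit ThreadAttachment(Processor &processor) : processor_(processor) {
        processor_.attachThread();
    }
    ~ThreadAttachment() { processor_.detachThread(); }

    ThreadAttachment(const ThreadAttachment &) = delete;
    ThreadAttachment &operator=(const ThreadAttachment &) = delete;

private:
    Processor &processor_;
};

// Runs one worker thread that attaches and optionally fails part-way.
// Returns the number of attachments still open after the thread ends.
int attached_after_worker(bool fail) {
    FakeProcessor processor;
    std::thread worker([&processor, fail] {
        try {
            ThreadAttachment<FakeProcessor> attachment(processor);
            if (fail) {
                throw std::runtime_error("simulated failure");
            }
        } catch (const std::runtime_error &) {
            // The guard's destructor has already detached the thread.
        }
    });
    worker.join();
    return processor.attachedCount();
}
```

With this shape, attached_after_worker returns 0 for both the normal and the failing path, which is the property the manual attach/detach pairing has to maintain by hand.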

C++ test code below. Here we compile the stylesheet once and reuse it in a number of threads to execute it against a source document concurrently:

// Thread entry point: runs the shared, precompiled stylesheet against a source document.
void *RunThread(void *args) {

    struct arg_struct *argsi = (struct arg_struct *)args;
    long tid = (long)argsi->id;
    Xslt30Processor *trans = argsi->trans;

    // Attach this native thread for JNI before calling into Saxon/C.
    trans->attachThread();

    trans->setInitialMatchSelectionAsFile("../xml/foo.xml");

    const char *result = trans->applyTemplatesReturningString();
    cout << " Result from THREAD ID: " << tid << ", " << result << endl;
    delete result;
    trans->detachThread();
    return NULL;  // a function declared void * must return a value
}

void testThreads(SaxonProcessor *processor) {
    pthread_t threads[NUM_THREADS];
    // One argument struct per thread, so no thread sees another thread's id.
    struct arg_struct args[NUM_THREADS];
    int rc;
    int i;

    Xslt30Processor *trans = processor->newXslt30Processor();

    // Compile the stylesheet once; every thread reuses it.
    trans->compileFromFile("../xsl/foo.xsl");

    for (i = 0; i < NUM_THREADS; i++) {
        cout << "main() : creating thread, " << i << endl;
        args[i].id = i;
        args[i].trans = trans;
        rc = pthread_create(&threads[i], NULL, RunThread, (void *)&args[i]);

        if (rc) {
            cout << "Error:unable to create thread," << rc << endl;
            exit(-1);
        }
    }
}

The C++ code is crashing with segmentation faults that differ between runs, so I am currently investigating these errors.

For Python, I have set up a Django web framework in Anaconda, which I will use next for some multi-threading testing.

Actions #7

Updated by O'Neil Delpratt about 4 years ago

Added a join call to prevent the main thread from ending before the other threads:

(void) pthread_join(threads[i], NULL);

The C++ multithreading test case is now working without any errors. I still need to do some more experiments and then move on to some Python testing.
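The effect of the join can be tried in isolation, without any SaxonC calls. The following is a minimal sketch: WorkerArgs, run_worker and run_and_join are illustrative names, and the squaring step is a placeholder for real work. It also gives each thread its own argument struct, which is an assumption about good practice rather than something stated in this thread.

```cpp
#include <pthread.h>
#include <vector>

// Hypothetical per-thread argument block; each thread gets its own copy.
struct WorkerArgs {
    int id;
    int result;
};

// Thread entry point: stores a placeholder result in its own argument block.
static void *run_worker(void *raw) {
    WorkerArgs *args = static_cast<WorkerArgs *>(raw);
    args->result = args->id * args->id;  // placeholder for real work
    return nullptr;
}

// Starts `count` threads, waits for all of them, and returns the sum of
// their results, or -1 if a thread could not be created.
int run_and_join(int count) {
    std::vector<pthread_t> threads(count);
    std::vector<WorkerArgs> args(count);
    int started = 0;
    int status = 0;

    for (int i = 0; i < count; ++i) {
        args[i].id = i;
        args[i].result = 0;
        status = pthread_create(&threads[i], nullptr, run_worker, &args[i]);
        if (status != 0) {
            break;
        }
        ++started;
    }

    // Join every started thread before `args` goes out of scope;
    // otherwise a worker could write into freed memory.
    for (int i = 0; i < started; ++i) {
        (void) pthread_join(threads[i], nullptr);
    }
    if (status != 0) {
        return -1;
    }

    int sum = 0;
    for (int i = 0; i < count; ++i) {
        sum += args[i].result;
    }
    return sum;
}
```

Joining before the function returns keeps the argument blocks alive for as long as any worker can touch them, which is why the results can be read safely afterwards.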

Actions #8

Updated by Anton Shchetikhin over 3 years ago

Mr. Delpratt, I faced the same problem as Andreas Jung. From the comments above, I see that changes have been made to the C++ codebase. How is testing of the Python bindings going, and when can we expect the release of multi-threading support?

Actions #9

Updated by O'Neil Delpratt about 2 years ago

Hi, sorry for the delays on this bug. The new release of SaxonC will be out next week.

Actions #10

Updated by O'Neil Delpratt about 2 years ago

  • Is duplicate of Bug #5373: Python multithreading code crashes added
Actions #11

Updated by O'Neil Delpratt about 2 years ago

  • Tracker changed from Support to Bug
  • Category set to Saxon-C Internals
  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100
  • Found in version set to 1.2.1

Hi, I have made some progress on SaxonC multithreading in Python applications. The problem was JNI misuse in the internal Saxon/C code. See bug #5373 for more details.

I am marking this bug as resolved. This will be available in the next maintenance release of SaxonC, which we hope will be out in the next week or two.

Actions #12

Updated by O'Neil Delpratt about 2 years ago

  • Status changed from Resolved to Closed
  • Fixed in version set to 11.3

Bug fix applied in the SaxonC 11.3 maintenance release.
