Configuration's ErrorListener not thread safe?

Added by Anna Benton over 9 years ago

When we get an error in DocumentBuilder.build() it's using the ErrorListener of another thread to report it.

Details:

Document builder part: We create a SAXSource, passing in an XMLReader on which we have set an ErrorHandler. There's a problem with the source file (it is empty), and the ErrorHandler is used to output an error, as expected. At the same time, however, an error also goes out to an ErrorListener that belongs to a completely different thread. I poked around the DocumentBuilder source a little and found that DocumentBuilder.build() calls buildDocument() on the Configuration, and that method has a note saying the exception is thrown "if any errors occur during document parsing or validation. Detailed errors occurring during schema validation will be written to the ErrorListener associated with the AugmentedSource, if supplied, or with the Configuration otherwise."
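To make the document builder setup concrete, here is a minimal sketch of the SAXSource side only, using just the JDK's built-in parser (no Saxon classes). The class and method names (EmptySourceDemo, CollectingHandler, sourceFor, parseAndCollect) are hypothetical and are not our production code; the point is only to show that an empty input reaches the ErrorHandler set on the XMLReader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

class EmptySourceDemo {

    /** Hypothetical handler: records what the parser reports instead of printing it. */
    static class CollectingHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        @Override public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }
        @Override public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }
        @Override public void fatalError(SAXParseException e) {
            messages.add("fatal: " + e.getMessage());
        }
    }

    /** Builds a SAXSource whose XMLReader already carries our ErrorHandler. */
    static SAXSource sourceFor(String xml, ErrorHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setErrorHandler(handler);
        return new SAXSource(reader, new InputSource(new StringReader(xml)));
    }

    /** Parses the text directly with the source's reader and returns the reports. */
    static List<String> parseAndCollect(String xml) throws Exception {
        CollectingHandler handler = new CollectingHandler();
        SAXSource source = sourceFor(xml, handler);
        try {
            source.getXMLReader().parse(source.getInputSource());
        } catch (SAXException e) {
            // The parser still throws after reporting a fatal error; the
            // report itself has already been delivered to the handler.
        }
        return handler.messages;
    }

    public static void main(String[] args) throws Exception {
        // An empty document, like our Source #2.
        System.out.println(parseAndCollect(""));
    }
}
```

In the real application the SAXSource is handed to DocumentBuilder.build() rather than parsed directly; the direct parse here just isolates the ErrorHandler behaviour from anything Saxon does.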

The transform part: While we are failing to build a document in one thread, we are successfully transforming some XML in another. For each transformation we create a new XsltTransformer from a cached XsltExecutable and attach an ErrorListener that is specific to that transformation. Note that we call load() on this single XsltExecutable from many threads at once to get an XsltTransformer for each XML file in a batch, but we do not re-use the XsltTransformers. (To isolate this problem I limited our inputs to a single XML file being transformed and a single XML file going through DocumentBuilder.)

When we actually do our transform() the XsltTransformer's ErrorListener becomes the ErrorListener of the Configuration. I've tracked this by outputting the ERROR_LISTENER_CLASS configuration property from the Processor right before and right after the transform() call.

A clearer example:

Thread 1:

  1. Configuration ErrorListener originally: net.sf.saxon.lib.StandardErrorListener
  2. Source #1 runs a transform() (with one of our ErrorListeners set on the XsltTransformer).
  3. Configuration ErrorListener is now set to one of our ErrorListeners (it is using our LogWriter class).

Thread 2: Source #2 is an empty file. When we go to build a document off of it, an error is emitted to two places:

  1. The ErrorHandler we set up on the XMLReader which we pass in when we build the SAXSource (this is great, it's what we want).
  2. The Configuration's ErrorListener, which, thanks to thread #1, is now Source #1's LogWriter.
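The sequence above can be reproduced with a toy model that uses only the JDK. This is not Saxon code: SharedConfig is a hypothetical stand-in for any object that holds a single mutable listener shared by all threads, and LogWriter here is a simplified stand-in for our own class. It only illustrates why a listener installed by one thread ends up receiving another thread's error.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerException;

class SharedListenerModel {

    /** Simplified stand-in for a per-transformation log writer. */
    static class LogWriter implements ErrorListener {
        final String owner;
        final List<String> lines = new CopyOnWriteArrayList<>();

        LogWriter(String owner) { this.owner = owner; }

        @Override public void warning(TransformerException e) { lines.add(e.getMessage()); }
        @Override public void error(TransformerException e) { lines.add(e.getMessage()); }
        @Override public void fatalError(TransformerException e) { lines.add(e.getMessage()); }
    }

    /** Toy model (an assumption, not Saxon's implementation): one listener for everyone. */
    static class SharedConfig {
        private volatile ErrorListener listener;

        SharedConfig(ErrorListener initial) { this.listener = initial; }

        void setErrorListener(ErrorListener l) { this.listener = l; }
        ErrorListener getErrorListener() { return listener; }
    }

    /** Returns { defaultLog, source1Log } after replaying the two-thread scenario. */
    static LogWriter[] run() throws InterruptedException {
        LogWriter defaultLog = new LogWriter("default");
        LogWriter source1Log = new LogWriter("source #1");
        SharedConfig config = new SharedConfig(defaultLog);

        // Thread 1: models transform() replacing the shared listener.
        Thread t1 = new Thread(() -> config.setErrorListener(source1Log));
        t1.start();
        t1.join();

        // Thread 2: models the failed build of the empty Source #2.
        Thread t2 = new Thread(() -> {
            try {
                config.getErrorListener()
                      .fatalError(new TransformerException("source #2 is empty"));
            } catch (TransformerException e) {
                throw new IllegalStateException(e);
            }
        });
        t2.start();
        t2.join();

        return new LogWriter[] { defaultLog, source1Log };
    }

    public static void main(String[] args) throws InterruptedException {
        LogWriter[] logs = run();
        System.out.println(logs[0].owner + " log: " + logs[0].lines);
        System.out.println(logs[1].owner + " log: " + logs[1].lines);
    }
}
```

The threads are joined one after the other so the outcome is deterministic; in our real batch the two steps simply overlap in time.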

We depend on our ErrorListeners to write out our logs for our transformations, which we then parse and use.

Are we not using ErrorListeners as intended here? Is there a way to keep the ErrorListener we set on the XsltTransformer from being set on the Configuration? That would be our first choice, since we're not sure what other circumstances might trigger its use at the Configuration level. I haven't read very far into the Saxon code itself, so I'm not sure of the exact point at which a transformation sets the Configuration's ErrorListener. Note that we're using the EnterpriseConfiguration with Saxon-EE 9.5.1.8.
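One possible workaround, sketched here with JDK classes only, would be a single ErrorListener that routes each report to a listener bound to the calling thread. The names (ThreadRoutingErrorListener, Collector, bind, unbind, reportOnNewThread) are hypothetical. It rests on two untested assumptions: that Saxon reports each error on the thread that triggered it, and that the same router instance is set everywhere a listener is accepted (XsltTransformer and Configuration), so that whichever one ends up on the Configuration is still the router.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerException;

class ThreadRoutingErrorListener implements ErrorListener {

    /** Hypothetical listener that just records messages. */
    static class Collector implements ErrorListener {
        final List<String> messages = new CopyOnWriteArrayList<>();

        @Override public void warning(TransformerException e) { messages.add(e.getMessage()); }
        @Override public void error(TransformerException e) { messages.add(e.getMessage()); }
        @Override public void fatalError(TransformerException e) { messages.add(e.getMessage()); }
    }

    private final ErrorListener fallback;
    private final ThreadLocal<ErrorListener> current = new ThreadLocal<>();

    ThreadRoutingErrorListener(ErrorListener fallback) {
        this.fallback = fallback;
    }

    /** Binds a listener to the calling thread for the duration of one job. */
    void bind(ErrorListener listener) { current.set(listener); }

    /** Removes the calling thread's binding; later reports go to the fallback. */
    void unbind() { current.remove(); }

    private ErrorListener target() {
        ErrorListener l = current.get();
        return l != null ? l : fallback;
    }

    @Override public void warning(TransformerException e) throws TransformerException {
        target().warning(e);
    }
    @Override public void error(TransformerException e) throws TransformerException {
        target().error(e);
    }
    @Override public void fatalError(TransformerException e) throws TransformerException {
        target().fatalError(e);
    }

    /** Demo helper: reports one error from a fresh thread, optionally bound to a listener. */
    static void reportOnNewThread(ThreadRoutingErrorListener router,
                                  ErrorListener perThread,
                                  String message) throws InterruptedException {
        Thread t = new Thread(() -> {
            if (perThread != null) {
                router.bind(perThread);
            }
            try {
                router.error(new TransformerException(message));
            } catch (TransformerException e) {
                throw new IllegalStateException(e);
            } finally {
                router.unbind();
            }
        });
        t.start();
        t.join();
    }
}
```

With pooled worker threads the unbind() in a finally block matters, because a stale binding would otherwise leak into the next job that runs on the same thread.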

