Bug #2338 » Bug #4118 - 2015-03-26T15_01_03Z.eml

Anna Benton, 2015-03-26 16:01

 
Return-Path: <anna.benton@gmail.com>
Received: from mi008.mc1.hosteurope.de ([80.237.138.247]) by wp245.webpack.hosteurope.de running ExIM with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) id 1Yb9H4-0002MA-S9; Thu, 26 Mar 2015 16:00:46 +0100
Received: from mail-wi0-f171.google.com ([209.85.212.171]) by mx0.webpack.hosteurope.de (mi008.mc1.hosteurope.de) with esmtps (TLSv1.2:AES128-GCM-SHA256:128) id 1Yb9H3-0008Gw-L3 for dropbox+saxonica+f38e@plan.io; Thu, 26 Mar 2015 16:00:46 +0100
Received: by wibg7 with SMTP id g7so151858164wib.1 for <dropbox+saxonica+f38e@plan.io>; Thu, 26 Mar 2015 08:00:45 -0700
Received: by 10.194.43.105 with HTTP; Thu, 26 Mar 2015 08:00:44 -0700
Date: Thu, 26 Mar 2015 08:00:44 -0700
From: Anna Benton <anna.benton@gmail.com>
To: Saxonica Developer Community <dropbox+saxonica+f38e@plan.io>
Message-ID: <CALoy=bjCLxU0tQT3U4HmGv_CTHSLeToa187kp4KBef50w4AbZg@mail.gmail.com>
In-Reply-To: <redmine.journal-4116.20150324180940.3c37b7a5f3fe5965@plan.io>
References: <redmine.issue-2338.20150324172731@plan.io>
<redmine.journal-4116.20150324180940.3c37b7a5f3fe5965@plan.io>
Subject: Re: [Saxon - Bug #2338] Configuration's ErrorListener not thread safe
Mime-Version: 1.0
Content-Type: multipart/alternative;
boundary=001a11c26baab29a8d05123247a3;
charset=UTF-8
Content-Transfer-Encoding: 7bit
Delivery-date: Thu, 26 Mar 2015 16:00:46 +0100
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
h=mime-version:in-reply-to:references:date:message-id:subject:from:to
:content-type; bh=0pf14x+M0PhqfSqoWi18mdQgtTNsa5/V6e23Cp6PJ3w=;
b=gYj5d6CUf4ma9bm0HfCfLBd91ZixQ0KPi0Ly5CFXiBOwiHR5VsmL5nZB2nYlhp0Pgx
Vg4rl8COEmqOpcbkN1vnf6gYOsCyMqgLmvcFsAXGyUrytixT0IkOw14LrxGvU9rlMU3x
wQWiFRRQlzGUPTyPDxk6DAAAre8m37hD2yEitx4nZEcBj1hMjVUm/4R9A1EPI6nM11F3
qGLjiXmFqkQwH9dvyIZ60fD/W3WL3f7T+2AGaifxCfx2lTtHDiJPEA787TTiK2UkBrn3
wBOSuHUr8gitcYT2fnft6/mn89Cg2tQWjEA2KupPeH+VpPOawp6VXCdcGENtJOFNeA+K k17A==
X-Received: by 10.180.187.200 with SMTP id fu8mr48054785wic.2.1427382044103;
Thu, 26 Mar 2015 08:00:44 -0700 (PDT)
X-HE-Spam-Level: +
X-HE-Spam-Score: 1.8
X-HE-Spam-Report: Content analysis details: (1.8 points)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
 -0.7 RCVD_IN_DNSWL_LOW      RBL: Sender listed at http://www.dnswl.org/, low trust
                             [209.85.212.171 listed in list.dnswl.org]
  2.5 RCVD_IN_SORBS_HTTP     RBL: SORBS: sender is open HTTP proxy server
                             [209.85.212.171 listed in dnsbl.sorbs.net]
  0.0 FREEMAIL_FROM          Sender email is commonly abused enduser mail provider
                             (anna.benton[at]gmail.com)
  0.1 HTML_MESSAGE           BODY: HTML included in message
 -0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from author's domain
 -0.1 DKIM_VALID             Message has at least one valid DKIM or DK signature
  0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily valid
X-HE-SPF: PASSED
Envelope-to: dropbox+saxonica+f38e@plan.io


--001a11c26baab29a8d05123247a3
Content-Type: text/plain;
charset=UTF-8
Content-Transfer-Encoding: 7bit

Dear Dr. Kay,

I've sent you a link to a zip file containing our test suite (took us a
while to simplify things) via Google Docs.

Please let me know if you have any questions or if it doesn't work for you!

I forgot to mention in the note on the Google Doc: we've been running it
with Java 1.6.

Thanks,
Anna Benton

On Tue, Mar 24, 2015 at 11:09 AM, Saxonica Developer Community <
dropbox+saxonica+f38e@plan.io> wrote:

> --- In your reply, please do not write below this line ---
> Issue #2338 has been updated by Michael Kay.
>
>
> Yes, this area is pretty messy. Given a free choice, we wouldn't have an
> ErrorListener at the Processor/Configuration level, but we need it because
> our Configuration corresponds to JAXP's TransformerFactory, and JAXP puts
> the ErrorListener on the TransformerFactory.
>
> The intention is that setting an ErrorListener on the Configuration should
> implicitly set the ErrorListener on queries and transforms run under that
> Configuration, but not vice versa. You seem to suggest that it's happening
> the other way, and I don't immediately see how that can happen.
>
> Complicating this further is that if the ErrorListener implements Saxon's
> StandardErrorListener interface, then we can clone it using its
> getAnother() method, so that different instances can be used in different
> threads.
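
The idea of giving each thread its own listener instance can be sketched with the JDK alone. This is an illustration only: it uses `ThreadLocal` as a swapped-in technique, not Saxon's cloning mechanism, and `PerThreadListeners` and `CountingListener` are hypothetical names that do not appear in Saxon or in the report.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerException;

public class PerThreadListeners {

    /** Hypothetical listener that only counts what it is told about. */
    static class CountingListener implements ErrorListener {
        int count;
        public void warning(TransformerException e) { count++; }
        public void error(TransformerException e) { count++; }
        public void fatalError(TransformerException e) { count++; }
    }

    /** Each thread that calls LISTENER.get() lazily receives its own instance. */
    static final ThreadLocal<CountingListener> LISTENER =
        new ThreadLocal<CountingListener>() {
            @Override
            protected CountingListener initialValue() {
                return new CountingListener();
            }
        };

    /** Returns true if a worker thread sees a different listener than the caller. */
    static boolean distinctAcrossThreads() {
        final CountingListener mine = LISTENER.get();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            CountingListener other = pool.submit(new Callable<CountingListener>() {
                public CountingListener call() {
                    return LISTENER.get(); // resolved on the worker thread
                }
            }).get();
            return other != mine;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(distinctAcrossThreads());
    }
}
```

With separate instances, an error reported on one thread cannot land in another thread's listener, which is the property the bug report says is missing.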
>
> I'd be grateful if you could put together a repro that illustrates the
> problem more specifically.
> ------------------------------
> Bug #2338: Configuration's ErrorListener not thread safe
> <https://saxonica.plan.io/issues/2338#change-4116>
>
> - Author: Anna Benton
> - Status: New
> - Priority: Normal
> - Assignee:
> - Category:
> - Sprint/Milestone:
> - Legacy ID:
> - Found in version: 9.5.1.8
> - Fixed in version:
>
> When we get an error in DocumentBuilder.build() it's using the
> ErrorListener of another thread to report it.
>
> Details:
>
> Document builder part:
> We create a SAXSource, passing in an XMLReader on which we have set an
> ErrorHandler.
> There's a problem with the source file (it is empty) and the ErrorHandler
> is used to output an error. At the same time an error goes out to an
> ErrorListener for a completely different thread. I poked around the
> DocumentBuilder source a little and found that DocumentBuilder.build()
> calls buildDocument from the Configuration, and that has a note that says:
> "if any errors occur during document parsing or validation. Detailed
> errors occurring during schema validation will be written to the
> ErrorListener associated with the AugmentedSource, if supplied, or with the
> Configuration otherwise."
>
> The transform part:
> While we are failing to build a document with one thread we are
> successfully transforming some XML in another. When we do this we create a
> new XsltTransformer off of a cached XsltExecutable and we attach an
> ErrorListener that is specific to that transformation to it. Note that we
> call load() on this single XsltExecutable across many threads at once to
> get an XsltTransformer for each XML file in a batch, but we do not re-use
> the XsltTransformers (although to isolate this problem I limited our inputs
> to a single XML file being transformed and a single XML file going through
> DocumentBuilder).
>
> When we actually do our transform() the XsltTransformer's ErrorListener
> becomes the ErrorListener of the Configuration. I've tracked this by
> outputting the ERROR_LISTENER_CLASS configuration property from the
> Processor right before and right after the transform() call.
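
The before-and-after check described above can be mirrored at the JAXP level, where the factory plays the role the report assigns to the Configuration. The sketch below is only an analogy: it runs against whatever `TransformerFactory` the JDK supplies, not against Saxon's s9api, and `ListenerLeakCheck`, `JobListener`, and the inline identity stylesheet are hypothetical. It compares the factory-level listener before and after a transform that has its own listener attached.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ListenerLeakCheck {

    /** Hypothetical per-job listener; a real one would write to that job's log. */
    static class JobListener implements ErrorListener {
        public void warning(TransformerException e) { }
        public void error(TransformerException e) { }
        public void fatalError(TransformerException e) throws TransformerException {
            throw e;
        }
    }

    static final String IDENTITY_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /**
     * Returns true if the factory-level listener is the same object before and
     * after a transform, and the transformer still holds its own listener.
     */
    static boolean factoryListenerUnchanged() {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Compile once; this is the analogue of a cached executable.
            Templates templates = factory.newTemplates(
                new StreamSource(new StringReader(IDENTITY_XSL)));
            ErrorListener before = factory.getErrorListener();

            // One transformer per job, each with a job-specific listener.
            Transformer transformer = templates.newTransformer();
            JobListener jobListener = new JobListener();
            transformer.setErrorListener(jobListener);
            transformer.transform(
                new StreamSource(new StringReader("<doc/>")),
                new StreamResult(new StringWriter()));

            ErrorListener after = factory.getErrorListener();
            return before == after
                && transformer.getErrorListener() == jobListener;
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(factoryListenerUnchanged());
    }
}
```

A result of false from a check like this would correspond to the behaviour reported here, where the transformer's listener replaces the shared one.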
>
> A clearer example:
>
> Thread 1:
> Configuration ErrorListener Originally:
> net.sf.saxon.lib.StandardErrorListener
> Source #1 runs a transform() (with one of our ErrorListeners set on the
> XsltTransformer)
> Configuration ErrorListener is now set to one of our ErrorListeners (it is
> using our LogWriter class)
>
> Thread 2:
> Source #2 is an empty file. When we go to build a document off of it, an
> error is emitted to two places:
>
> 1. The ErrorHandler we set up on the XMLReader which we pass in when we
> build the SAXSource (this is great, it's what we want).
> 2. The Configuration's ErrorListener, which, thanks to thread #1, is now
> Source #1's LogWriter.
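
The two-thread sequence above can be modelled in a few lines of plain Java. This is a toy model of the reported symptom, not Saxon code: `SharedConfig`, `CollectingListener`, and both methods are hypothetical stand-ins, and the two "threads" are run one after the other so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedListenerLeak {

    /** Hypothetical stand-in for an error listener: it just collects messages. */
    static class CollectingListener {
        final String name;
        final List<String> messages = new ArrayList<String>();
        CollectingListener(String name) { this.name = name; }
        void report(String message) { messages.add(message); }
    }

    /** Hypothetical shared configuration with a single mutable listener slot. */
    static class SharedConfig {
        private volatile CollectingListener listener =
            new CollectingListener("default");
        CollectingListener getListener() { return listener; }
        void setListener(CollectingListener l) { listener = l; }
    }

    /** Models the reported symptom: a transform overwrites the shared slot. */
    static void transform(SharedConfig config, CollectingListener own) {
        config.setListener(own);
    }

    /** Models a document build that reports its error to the shared listener. */
    static void buildEmptyDocument(SharedConfig config) {
        config.getListener().report("source #2 is empty");
    }

    /** Returns the name of the listener that received the build error. */
    static String demo() {
        SharedConfig config = new SharedConfig();
        CollectingListener defaultListener = config.getListener();
        CollectingListener logWriter1 = new CollectingListener("source1-log");

        transform(config, logWriter1);   // "thread 1": transform of source #1
        buildEmptyDocument(config);      // "thread 2": build of source #2

        return defaultListener.messages.isEmpty()
            ? logWriter1.name
            : defaultListener.name;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the build error ends up in the `source1-log` listener, which is the cross-thread leak described in item 2.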
>
> We depend on our ErrorListeners to write out our logs for our
> transformations, which we then parse and use.
>
> Are we not using ErrorListeners as intended here? Is there a way to keep
> the ErrorListener we set on the XsltTransformer from being set on the
> Configuration? That would be our first choice since we're not sure what
> other circumstances might trigger its use at the Configuration level. I
> haven't run through the Saxon code itself very far, so I'm not sure of the
> exact point at which transforming sets the Configuration's ErrorListener.
> Note that we're using the EnterpriseConfiguration with Saxon-EE 9.5.1.8.
> ------------------------------
>
> You have received this notification because you have either subscribed to
> or are involved in a project on Saxonica Developer Community site.
> To change your notification preferences, please click here:
> https://saxonica.plan.io/my/account?tour=mail_preferences
>
> This notification was cheerfully delivered by <https://plan.io/>
> [image: Planio] <https://plan.io/>
>

--001a11c26baab29a8d05123247a3--