|
Return-Path: <anna.benton@gmail.com>
|
|
Received: from mi008.mc1.hosteurope.de ([80.237.138.247]) by wp245.webpack.hosteurope.de running ExIM with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) id 1Yb9H4-0002MA-S9; Thu, 26 Mar 2015 16:00:46 +0100
|
|
Received: from mail-wi0-f171.google.com ([209.85.212.171]) by mx0.webpack.hosteurope.de (mi008.mc1.hosteurope.de) with esmtps (TLSv1.2:AES128-GCM-SHA256:128) id 1Yb9H3-0008Gw-L3 for dropbox+saxonica+f38e@plan.io; Thu, 26 Mar 2015 16:00:46 +0100
|
|
Received: by wibg7 with SMTP id g7so151858164wib.1 for <dropbox+saxonica+f38e@plan.io>; Thu, 26 Mar 2015 08:00:45 -0700
|
|
Received: by 10.194.43.105 with HTTP; Thu, 26 Mar 2015 08:00:44 -0700
|
|
Date: Thu, 26 Mar 2015 08:00:44 -0700
|
|
From: Anna Benton <anna.benton@gmail.com>
|
|
To: Saxonica Developer Community <dropbox+saxonica+f38e@plan.io>
|
|
Message-ID: <CALoy=bjCLxU0tQT3U4HmGv_CTHSLeToa187kp4KBef50w4AbZg@mail.gmail.com>
|
|
In-Reply-To: <redmine.journal-4116.20150324180940.3c37b7a5f3fe5965@plan.io>
|
|
References: <redmine.issue-2338.20150324172731@plan.io>
 <redmine.journal-4116.20150324180940.3c37b7a5f3fe5965@plan.io>
|
|
Subject: Re: [Saxon - Bug #2338] Configuration's ErrorListener not thread safe
|
|
Mime-Version: 1.0
|
|
Content-Type: multipart/alternative;
 boundary=001a11c26baab29a8d05123247a3;
 charset=UTF-8
|
|
Content-Transfer-Encoding: 7bit
|
|
Delivery-date: Thu, 26 Mar 2015 16:00:46 +0100
|
|
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :content-type; bh=0pf14x+M0PhqfSqoWi18mdQgtTNsa5/V6e23Cp6PJ3w=;
 b=gYj5d6CUf4ma9bm0HfCfLBd91ZixQ0KPi0Ly5CFXiBOwiHR5VsmL5nZB2nYlhp0Pgx
 Vg4rl8COEmqOpcbkN1vnf6gYOsCyMqgLmvcFsAXGyUrytixT0IkOw14LrxGvU9rlMU3x
 wQWiFRRQlzGUPTyPDxk6DAAAre8m37hD2yEitx4nZEcBj1hMjVUm/4R9A1EPI6nM11F3
 qGLjiXmFqkQwH9dvyIZ60fD/W3WL3f7T+2AGaifxCfx2lTtHDiJPEA787TTiK2UkBrn3
 wBOSuHUr8gitcYT2fnft6/mn89Cg2tQWjEA2KupPeH+VpPOawp6VXCdcGENtJOFNeA+K
 k17A==
|
|
X-Received: by 10.180.187.200 with SMTP id fu8mr48054785wic.2.1427382044103;
 Thu, 26 Mar 2015 08:00:44 -0700 (PDT)
|
|
X-HE-Spam-Level: +
|
|
X-HE-Spam-Score: 1.8
|
|
X-HE-Spam-Report: Content analysis details: (1.8 points)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
 -0.7 RCVD_IN_DNSWL_LOW      RBL: Sender listed at http://www.dnswl.org/, low
                             trust [209.85.212.171 listed in list.dnswl.org]
  2.5 RCVD_IN_SORBS_HTTP     RBL: SORBS: sender is open HTTP proxy server
                             [209.85.212.171 listed in dnsbl.sorbs.net]
  0.0 FREEMAIL_FROM          Sender email is commonly abused enduser mail
                             provider (anna.benton[at]gmail.com)
  0.1 HTML_MESSAGE           BODY: HTML included in message
 -0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                             author's domain
 -0.1 DKIM_VALID             Message has at least one valid DKIM or DK signature
  0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                             valid
|
|
X-HE-SPF: PASSED
|
|
Envelope-to: dropbox+saxonica+f38e@plan.io
|
|
|
|
|
|
--001a11c26baab29a8d05123247a3
|
|
Content-Type: text/plain;
 charset=UTF-8
|
|
Content-Transfer-Encoding: 7bit
|
|
|
|
Dear Dr. Kay,
|
|
|
|
I've sent you a link to a zip file containing our test suite (took us a
|
|
while to simplify things) via Google Docs.
|
|
|
|
Please let me know if you have any questions or if it doesn't work for you!
|
|
|
|
I forgot to mention in the note on the Google Doc that we've been running it
with Java 1.6.
|
|
|
|
Thanks,
|
|
Anna Benton
|
|
|
|
On Tue, Mar 24, 2015 at 11:09 AM, Saxonica Developer Community <
|
|
dropbox+saxonica+f38e@plan.io> wrote:
|
|
|
|
> --- In your reply, please do not write below this line ---
|
|
> Issue #2338 has been updated by Michael Kay.
|
|
>
|
|
>
|
|
> Yes, this area is pretty messy. Given a free choice, we wouldn't have an
|
|
> ErrorListener at the Processor/Configuration level, but we need it because
|
|
> our Configuration corresponds to JAXP's TransformerFactory, and JAXP puts
|
|
> the ErrorListener on the TransformerFactory.
|
|
>
|
|
> The intention is that setting an ErrorListener on the Configuration should
|
|
> implicitly set the ErrorListener on queries and transforms run under that
|
|
> Configuration, but not vice versa. You seem to suggest that it's happening
|
|
> the other way, and I don't immediately see how that can happen.
|
|
>
|
|
> Complicating this further is that if the ErrorListener implements Saxon's
|
|
> StandardErrorListener interface, then we can clone it using its
|
|
> getAnother() method, so that different instances can be used in different
|
|
> threads.
|
|
>
|
|
> I'd be grateful if you could put together a repro that illustrates the
|
|
> problem more specifically.
|
|
> ------------------------------
|
|
> Bug #2338: Configuration's ErrorListener not thread safe
|
|
> <https://saxonica.plan.io/issues/2338#change-4116>
|
|
>
|
|
> - Author: Anna Benton
|
|
> - Status: New
|
|
> - Priority: Normal
|
|
> - Assignee:
|
|
> - Category:
|
|
> - Sprint/Milestone:
|
|
> - Legacy ID:
|
|
> - Found in version: 9.5.1.8
|
|
> - Fixed in version:
|
|
>
|
|
> When we get an error in DocumentBuilder.build() it's using the
|
|
> ErrorListener of another thread to report it.
|
|
>
|
|
> Details:
|
|
>
|
|
> Document builder part:
|
|
> We create a SAXSource, passing in an XMLReader on which we have set an
|
|
> ErrorHandler.
|
|
> There's a problem with the source file (it is empty) and the ErrorHandler
|
|
> is used to output an error. At the same time an error goes out to an
|
|
> ErrorListener for a completely different thread. I poked around the
|
|
> DocumentBuilder source a little and found that DocumentBuilder.build()
|
|
> calls buildDocument from the Configuration, and that has a note that says:
|
|
> "if any errors occur during document parsing or validation. Detailed
|
|
> errors occurring during schema validation will be written to the
|
|
> ErrorListener associated with the AugmentedSource, if supplied, or with the
|
|
> Configuration otherwise."
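A minimal sketch of the document-builder setup described above, using only the JDK's SAX classes rather than Saxon's DocumentBuilder. It therefore shows only the ErrorHandler side of the report, not the Configuration's ErrorListener. The names EmptySourceDemo, RecordingHandler and parseEmpty are illustrative assumptions and do not come from the test suite.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

final class EmptySourceDemo {

    /** Hypothetical handler: records what the parser reports instead of printing it. */
    static final class RecordingHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<String>();

        @Override
        public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) {
            messages.add("fatal: " + e.getMessage());
        }
    }

    /** Parses an empty input through a SAXSource whose XMLReader carries our ErrorHandler. */
    static List<String> parseEmpty() {
        RecordingHandler handler = new RecordingHandler();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            // Same shape as in the report: the reader is handed over inside a SAXSource.
            SAXSource source = new SAXSource(reader, new InputSource(new StringReader("")));
            source.getXMLReader().parse(source.getInputSource());
        } catch (SAXException reported) {
            // The parser stops after a fatal error; the handler has already recorded it.
        } catch (Exception e) {
            throw new IllegalStateException("could not set up or run the parser", e);
        }
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println(parseEmpty());
    }
}
```

With the parser bundled in the JDK, the empty input should be reported to fatalError; the exact message text depends on the parser in use.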
|
|
>
|
|
> The transform part:
|
|
> While we are failing to build a document with one thread we are
|
|
> successfully transforming some xml in another. When we do this we create a
|
|
> new XsltTransformer off of a cached XsltExecutable and we attach an
|
|
> ErrorListener that is specific to that transformation to it. Note that we
|
|
> call load() on this single XsltExecutable across many threads at once to
|
|
> get an XsltTransformer for each xml file in a batch, but we do not re-use
|
|
> the XsltTransformers (although to isolate this problem I limited our inputs
|
|
> to a single xml file being transformed and a single xml file going through
|
|
> DocumentBuilder).
|
|
>
|
|
> When we actually do our transform() the XsltTransformer's ErrorListener
|
|
> becomes the ErrorListener of the Configuration. I've tracked this by
|
|
> outputting the ERROR_LISTENER_CLASS configuration property from the
|
|
> Processor right before and right after the transform() call.
|
|
>
|
|
> A clearer example:
|
|
>
|
|
> Thread 1:
|
|
> Configuration ErrorListener Originally:
|
|
> net.sf.saxon.lib.StandardErrorListener
|
|
> Source #1 runs a transform() (with one of our ErrorListeners set on the
|
|
> XsltTransformer)
|
|
> Configuration ErrorListener is now set to one of our ErrorListeners (it is
|
|
> using our LogWriter class)
|
|
>
|
|
> Thread 2:
|
|
> Source #2 is an empty file. When we go to build a document off of it, an
|
|
> error is emitted to two places:
|
|
>
|
|
> 1. The ErrorHandler we set up on the XMLReader which we pass in when we
|
|
> build the SAXSource (this is great, it's what we want).
|
|
> 2. The Configuration's ErrorListener, which, thanks to thread #1, is now
|
|
> Source #1's LogWriter.
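The two-thread sequence above can be modelled with a toy example. This is not Saxon code and makes no claim about where, or whether, Saxon performs such an assignment: SharedConfig, LogWriter and simulate are hypothetical stand-ins that only reproduce the reported symptom, namely a per-transform listener ending up in a slot shared by all threads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ListenerLeakDemo {

    /** Hypothetical stand-in for a per-source error listener that collects log lines. */
    static final class LogWriter {
        final List<String> lines = Collections.synchronizedList(new ArrayList<String>());

        void report(String message) {
            lines.add(message);
        }
    }

    /** Hypothetical shared configuration with a single mutable listener slot. */
    static final class SharedConfig {
        volatile LogWriter listener = new LogWriter();
    }

    /** Replays the scenario and returns what ended up in source #1's log. */
    static List<String> simulate() {
        final SharedConfig config = new SharedConfig();
        final LogWriter source1Log = new LogWriter();

        Thread transform = new Thread(new Runnable() {
            @Override
            public void run() {
                // Modelled symptom: the transform's own listener lands in the shared slot.
                config.listener = source1Log;
            }
        });
        Thread build = new Thread(new Runnable() {
            @Override
            public void run() {
                // The failing build reports through whatever listener the config holds now.
                config.listener.report("source #2: empty input");
            }
        });

        try {
            transform.start();
            transform.join(); // force the order from the example: thread 1 runs first
            build.start();
            build.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<String>(source1Log.lines);
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this model the message about source #2 ends up in source #1's log, which matches the behaviour described in the report. Passing the listener explicitly to each operation, instead of reading it from shared state, would avoid the cross-talk in the model.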
|
|
>
|
|
> We depend on our ErrorListeners to write out our logs for our
|
|
> transformations, which we then parse and use.
|
|
>
|
|
> Are we not using ErrorListeners as intended here? Is there a way to keep
|
|
> the ErrorListener we set on the XsltTransformer from being set on the
|
|
> Configuration? That would be our first choice since we're not sure what
|
|
> other circumstances might trigger its use at the Configuration level. I
|
|
> haven't run through the Saxon code itself very far, so I'm not sure of the
|
|
> exact point at which transforming sets the Configuration's ErrorListener.
|
|
> Note that we're using the EnterpriseConfiguration with saxonEE 9.5.1.8.
|
|
> ------------------------------
|
|
>
|
|
> You have received this notification because you have either subscribed to
|
|
> or are involved in a project on Saxonica Developer Community site.
|
|
> To change your notification preferences, please click here:
|
|
> https://saxonica.plan.io/my/account?tour=mail_preferences
|
|
>
|
|
> This notification was cheerfully delivered by <https://plan.io/>
|
|
> [image: Planio] <https://plan.io/>
|
|
>
|
|
|
|
|
--001a11c26baab29a8d05123247a3--
|