Bug #4446 » Re_ [Saxon - Bug #4446] Schema-Aware Transformation_ wrong node set - 2020-01-31T04_37_45Z.eml

Frank Steimke, 2020-01-31 05:37

 
Subject: Re: [Saxon - Bug #4446] Schema-Aware Transformation: wrong node set
To: Saxonica Developer Community <inbox+saxonica+f38e+saxon@plan.io>
References: <redmine.issue-4446.20200129102034@plan.io>
<redmine.journal-14847.20200131001337.62c407dd9e223c4d@plan.io>
From: Frank Steimke <f-steimke@berger-und-steimke.de>
Message-ID: <81b96bd1-d36d-3d5c-a9f4-7df223041aa6@berger-und-steimke.de>
Date: Fri, 31 Jan 2020 05:37:19 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101
Thunderbird/68.4.2
MIME-Version: 1.0
In-Reply-To: <redmine.journal-14847.20200131001337.62c407dd9e223c4d@plan.io>
Content-Type: multipart/alternative;
boundary="------------D5719F63CBBAA1D261A0B977"

This is a multi-part message in MIME format.
--------------D5719F63CBBAA1D261A0B977
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

You can reach me at

* frank.steimke@finanzen.bremen.de (office);
* f-steimke@berger-und-steimke.de (home) or
* fsteimke.de@gmail.com (also home)

I will also ask one or two of my colleagues whether they can reproduce it
on their machines in Oxygen. I'd like to know where the difference is
between the system at our office and the almost identical Oxygen
installation at home (where I can't reproduce the issue).
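
As a plain sanity check of the expectation described in the quoted report below (both paths should select the same base elements), here is a minimal Python sketch. It is not schema-aware and does not involve Saxon or Oxygen, so it cannot reproduce the bug; it only shows the invariant. The element name `entry`, the helper names, and the omission of the lc namespace are assumptions made for illustration and are not taken from latinchars.xsd.

```python
import xml.etree.ElementTree as ET


def build_character_set(n):
    """Build a toy characterSet with n entries, each holding extension/nfd/base."""
    root = ET.Element("characterSet")
    for i in range(n):
        entry = ET.SubElement(root, "entry")  # hypothetical element name
        nfd = ET.SubElement(ET.SubElement(entry, "extension"), "nfd")
        ET.SubElement(nfd, "base").text = f"char-{i}"
    return root


def count_both_ways(root):
    """Count base elements via the short path and via the spelled-out path."""
    short = len(root.findall(".//nfd/base"))
    full = len(root.findall(".//entry/extension/nfd/base"))
    return short, full


short, full = count_both_ways(build_character_set(924))
print(short, full)  # prints "924 924"
```

Any conforming path evaluation should report the same number for both expressions, which is why a result of 1 for the short path looks like a processor problem rather than a data problem.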

Frank

On 31.01.2020 at 01:13, Saxonica Developer Community wrote:
>
> --- In your reply, please do not write below this line ---
>
> Issue #4446 <https://saxonica.plan.io/issues/4446?pn=1#change-14847>
> has been updated by Michael Kay.
>
> I think it would be a good idea if I sent you a temporary license key
> so you can investigate whether the problem is reproducible outside
> oXygen. (I have an email address for you from Jan 2018, is that likely
> to still reach you?) The license that comes with oXygen only covers
> embedded use within oXygen itself.
>
> The fact that it fails with optimization on, and succeeds with
> optimization off, is certainly a useful data point, though until we
> can reproduce it, it doesn't help us much. With the free-standing
> product, as soon as we can reproduce the issue, we'll be able to
> investigate what rewrites are taking place during optimization.
>
> The fact that the nodes are parentless is very mysterious. I can't
> think of any mechanism that would exhibit that particular failure
> mode. We absolutely need to reproduce this "in the lab".
>
> ------------------------------------------------------------------------
>
>
> Bug #4446: Schema-Aware Transformation: wrong node set
> <https://saxonica.plan.io/issues/4446?pn=1#change-14847>
>
> * Author: Frank Steimke
> * Status: AwaitingInfo
> * Priority: Normal
> * Assignee: Michael Kay
> * Category: Schema-Aware processing
> * Sprint/Milestone:
> * Legacy ID:
> * Applies to branch:
> * Fix Committed on Branch:
> * Fixed in Maintenance Release:
>
> Hi, I have a medium-sized project dealing with Latin characters in
> Unicode. There is a database of Latin characters (latinchars.xml),
> which is an XML document valid with respect to an XML 1.1 schema
> latinchars.xsd. There is a schema-aware function library in XSLT 3.
> The overall goal is to produce DocBook documentation, which works
> fine. All this is done as an Oxygen project. The Oxygen version is 21.1
> (recent), which includes Saxon EE 8.8.0.1 on Windows 10.
>
> However, we want to analyze some aspects of NFD normalization. For
> this I added an extension element to the schema, which allows xs:any
> children. While reading the document from the database, we add an
> extension element with an nfd element as child. The nfd element has a
> mandatory base element as child, followed by an optional diacritical
> element. The enriched document is validated against the schema without
> any error.
>
> When I apply transformations to this document, there is a strange
> behaviour, which seems to be a bug. Unfortunately, I am unable to
> reproduce it with a small script. I can only describe what I see, and
> give you the attached project.
>
> Observation:
>
> The enriched database is held in a global variable $characterSet
> as='document-node(schema-element(lc:characterSet))'. There are 924
> child elements of type element(*, Entry). Each of these has an nfd
> element below its extension child, and every nfd element has a base
> child element. Counting the nfd/base elements, I would expect 924
> nodes.
>
> This expression gives the correct result:
>
> |<xsl:value-of select="count($characterSet//element(*,
> Entry)/extension/nfd/base)"/> |
>
> This expression, however, gives the incorrect number of 1 node only:
>
> |<xsl:value-of select="count($characterSet//nfd/base)"/> |
>
> So, when I count the number of nfd/base descendants of $characterSet I
> get only one, but when I count the number of element(*,
> Entry)/extension/nfd/base descendants, I get 924.
>
> I have tried to boil it down to a simple script, but failed. So I have
> attached the whole Oxygen project. The transformation which counts the
> number of nodes is called xsl/dia-matrix.xsl
>
> Sincerely, Frank
>
> Files latinchars.zip
> <https://saxonica.plan.io/attachments/download/48916/latinchars.zip>
> (1.31 MB)
> dia-matrix.xsl
> <https://saxonica.plan.io/attachments/download/48918/dia-matrix.xsl>
> (2.18 KB)
>

--------------D5719F63CBBAA1D261A0B977--