https://saxonica.plan.io/https://saxonica.plan.io/favicon.ico2020-11-12T10:23:10ZSaxonica Developer CommunitySaxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168392020-11-12T10:23:10ZMichael Kaymike@saxonica.com
<ul></ul><p>I haven't been able to reproduce the compile error (I'm running from the command line, I haven't tried running your Java code). What are the minimum steps to reproduce it? What is the exact error message, including line number?</p>
<p>As regards decimal precision, Saxon has an extension function to do decimal division with user-specified precision:</p>
<p><a href="https://saxonica.com/documentation/index.html#!functions/saxon/decimal-divide" class="external">https://saxonica.com/documentation/index.html#!functions/saxon/decimal-divide</a></p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168402020-11-12T10:30:27ZMichael Kaymike@saxonica.com
<ul></ul><p>Note also that the default precision for decimal divide is not directly the value of the static variable BigDecimalValue.DIVIDE_PRECISION, rather it is</p>
<pre><code>Math.max(BigDecimalValue.DIVIDE_PRECISION, A.scale() - B.scale() + BigDecimalValue.DIVIDE_PRECISION);
</code></pre>
<p>(see line 851).</p>
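<p>A minimal Java sketch of how this formula picks the result scale. The constant value 18 and the rounding mode are assumptions made for illustration only, and the class and method names are hypothetical; check <code>BigDecimalValue</code> in your Saxon version for the real values.</p>

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DivideScaleSketch {
    // Assumed value, for illustration only (see BigDecimalValue.DIVIDE_PRECISION).
    static final int DIVIDE_PRECISION = 18;

    // Result scale according to the formula quoted above.
    static int resultScale(BigDecimal a, BigDecimal b) {
        return Math.max(DIVIDE_PRECISION, a.scale() - b.scale() + DIVIDE_PRECISION);
    }

    // Divide with that scale; HALF_EVEN is an assumption of this sketch.
    static BigDecimal divide(BigDecimal a, BigDecimal b) {
        return a.divide(b, resultScale(a, b), RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0"); // scale 1
        BigDecimal b = new BigDecimal("3");   // scale 0
        System.out.println(resultScale(a, b));             // max(18, 1 - 0 + 18)
        System.out.println(divide(a, b).toPlainString());  // 1.0 div 3 at that scale
    }
}
```

<p>Under these assumptions the scale grows with the scale of the dividend and shrinks with the scale of the divisor, but never drops below the constant.</p>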
<p>I'm afraid I don't recall how this formula was arrived at - it's been like that for a long time.</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168412020-11-12T10:56:28ZMichael Kaymike@saxonica.com
<ul></ul><p>As regards the compile error, Saxon's error message for a missing attribute on an XSLT element does not take this form. The message it produces is more like "Element must have an @select attribute".</p>
<p>Also, I don't know if this is relevant, but there are several files in your collection named Sample121-new.xml. The one in src/test/resources/ubl21 starts</p>
<pre><code><Invoice xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
<cbc:CustomizationID>urn:cen.eu:en16931:2017</cbc:CustomizationID>
</code></pre>
<p>while the one in generated-resources/xml/xslt starts</p>
<pre><code><svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl"
xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
xmlns:iso="http://purl.oclc.org/dsdl/schematron"
</code></pre>
<p>which seems confusing.</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168422020-11-12T11:00:11ZSvante Schubert
<ul></ul><p>Thank you for your quick guidance.</p>
<p>Indeed, I was mistaken. The JAR works from the command line. It turned out that the problem does not lie in the JAR I receive from Maven; it only occurs when I compile the sources provided by Maven,
<a href="https://repo1.maven.org/maven2/net/sf/saxon/Saxon-HE/10.3/Saxon-HE-10.3-sources.jar" class="external">https://repo1.maven.org/maven2/net/sf/saxon/Saxon-HE/10.3/Saxon-HE-10.3-sources.jar</a>
available as
<a href="https://github.com/svanteschubert/Saxon-HE/tree/main/src/main/java" class="external">https://github.com/svanteschubert/Saxon-HE/tree/main/src/main/java</a>
Are they still in sync?
The error was:
'Syntax error in ' if (/ubl-invoice:Invoice) then (if (cbc:InvoicedQuantity) then xs:decimal(cbc:InvoicedQuantity) else 1) else (if (cbc:CreditedQuantity) then xs:decimal(cbc:CreditedQuantity) else 1)'.'
FATAL ERROR: 'file:/E:/GitHub/einvoice/xslt-decimal/src/main/resources/xsl/PEPPOL-EN16931-UBL-V2.xslt: line 200: Required attribute 'select' is missing.'</p>
<p>I am using the JAR source provided by Maven:
<a href="https://repo1.maven.org/maven2/net/sf/saxon/Saxon-HE/10.3/" class="external">https://repo1.maven.org/maven2/net/sf/saxon/Saxon-HE/10.3/</a></p>
<p>My main question should be solved: unless I need to debug, I do not need the sources.
I am nevertheless curious where the error lurks...</p>
<p>I will investigate further whether this function will do the trick.</p>
<p>Have a great day, Michael!
Svante</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168432020-11-12T11:14:27ZMichael Kaymike@saxonica.com
<ul></ul><p>This doesn't look like a Saxon error message. I'm wondering if it comes from Xalan? Perhaps when you rebuilt Saxon from source code, you didn't generate the MANIFEST file that causes JAXP to pick it up as the chosen XSLT transformer, and JAXP is running Xalan instead?</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168442020-11-12T12:08:35ZSvante Schubert
<ul></ul><p>Michael Kay wrote:</p>
<blockquote>
<p>This doesn't look like a Saxon error message. I'm wondering if it comes from Xalan? Perhaps when you rebuilt Saxon from source code, you didn't generate the MANIFEST file that causes JAXP to pick it up as the chosen XSLT transformer, and JAXP is running Xalan instead?</p>
</blockquote>
<p>That was exactly the problem! A thousand thanks! :-)
I copied the META-INF/services/javax.xml.transform.TransformerFactory file from the binary JAR, as it was missing from the META-INF of the sources JAR. Perhaps you would like to add the JAXP file to the sources JAR to avoid being bothered with the same problem again ;-)</p>
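<p>A small Java check can show which XSLT implementation JAXP actually selects on a given classpath. This is a sketch only; the class name <code>WhichTransformerFactory</code> is hypothetical, and the printed name depends on the JDK and on which service files are present.</p>

```java
import javax.xml.transform.TransformerFactory;

public class WhichTransformerFactory {
    // Returns the class name of the factory found by the JAXP lookup
    // (system property, META-INF/services entry, or the JDK's built-in default).
    static String defaultFactoryName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("JAXP picked: " + defaultFactoryName());
        // To bypass the lookup, the factory class can be named explicitly
        // (this requires Saxon on the classpath, so it is left commented out):
        // TransformerFactory f =
        //     TransformerFactory.newInstance("net.sf.saxon.TransformerFactoryImpl", null);
    }
}
```

<p>If the printed name is not a Saxon class, the stylesheet is being compiled by another processor, which would explain an unfamiliar error message.</p>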
<p>I tried your Saxon function
<a href="https://saxonica.com/documentation/index.html#!functions/saxon/decimal-divide" class="external">https://saxonica.com/documentation/index.html#!functions/saxon/decimal-divide</a>
but it does not work in HE, which I intended to use for our fully open-source EU e-invoice validation artefact.
Hopefully, decimal-based floating-point is somehow available for the open-source stack. ;-)</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168522020-11-14T23:00:48ZSvante Schubert
<ul></ul><p>Hello Michael,</p>
<p>There are some changes that you might want to adopt.</p>
<p>I updated the pom.xml to the recent library and added a simplified version of the prior XSL transformation as a JUnit test to be able to easily debug Saxon via IDE. I believe you are still using ANT - that's what the manifest claims.</p>
<p>There was this annoying e-commerce test case :</p>
<p>quantity = 1000000000.0
priceAmount = 1.0
baseQuantity = 3</p>
<p>($quantity * ($priceAmount div $baseQuantity)) =
(3 *(1.0 div 3 )) =
333333333.333333333333333333</p>
<p>($quantity * $priceAmount div $baseQuantity)
(3 * 1.0 div 3 ) =
333333333.3333333333333333333333333333333333
123456789 0123456789012345678901234567890123</p>
<p>Now the result always has 34 significant digits, following Java's DECIMAL128 (<code>MathContext.DECIMAL128</code>):</p>
<p>333333333.3333333333333333333333333
123456789 0123456789012345678901234</p>
<p>see <a href="https://en.wikipedia.org/wiki/Decimal128_floating-point_format" class="external">https://en.wikipedia.org/wiki/Decimal128_floating-point_format</a></p>
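<p>The effect can be reproduced in plain Java with <code>BigDecimal</code> and <code>MathContext.DECIMAL128</code>. This is a standalone sketch with hypothetical class and method names, not the code from the fork; it only shows that both evaluation orders of the test case agree once every operation is rounded to 34 significant digits.</p>

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class Decimal128Sketch {
    // 34 significant digits, as in the IEEE 754 decimal128 format.
    static final MathContext MC = MathContext.DECIMAL128;

    // quantity * (priceAmount div baseQuantity)
    static BigDecimal divideFirst(BigDecimal quantity, BigDecimal price, BigDecimal base) {
        return quantity.multiply(price.divide(base, MC), MC);
    }

    // (quantity * priceAmount) div baseQuantity
    static BigDecimal multiplyFirst(BigDecimal quantity, BigDecimal price, BigDecimal base) {
        return quantity.multiply(price, MC).divide(base, MC);
    }

    public static void main(String[] args) {
        BigDecimal quantity = new BigDecimal("1000000000.0");
        BigDecimal price = new BigDecimal("1.0");
        BigDecimal base = new BigDecimal("3");
        System.out.println(divideFirst(quantity, price, base).toPlainString());
        System.out.println(multiplyFirst(quantity, price, base).toPlainString());
    }
}
```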
<p>To enable this I did the following:</p>
<ol>
<li>
<p>Fixed the precision by using the highest precision (and by calling BigDecimal differently when multiplying and dividing).
See <a href="https://github.com/svanteschubert/Saxon-HE/commit/68c538a364e8bfd8aa5598077521ad87fb297e88" class="external">https://github.com/svanteschubert/Saxon-HE/commit/68c538a364e8bfd8aa5598077521ad87fb297e88</a></p>
</li>
<li>
<p>Disabled the usage of Double as inappropriate for e-commerce:
Information: <a href="https://github.com/svanteschubert/Saxon-HE#decimal-based-floating-point" class="external">https://github.com/svanteschubert/Saxon-HE#decimal-based-floating-point</a>
See <a href="https://github.com/svanteschubert/Saxon-HE/commit/fe8ca45c54622b467eb58fbaeae0d3edbe4461c7" class="external">https://github.com/svanteschubert/Saxon-HE/commit/fe8ca45c54622b467eb58fbaeae0d3edbe4461c7</a></p>
</li>
<li>
<p>With Double disabled, BigDecimal needs to take over its role and has to be enhanced (and can be simplified);
see <a href="https://github.com/svanteschubert/Saxon-HE/commit/68c538a364e8bfd8aa5598077521ad87fb297e88" class="external">https://github.com/svanteschubert/Saxon-HE/commit/68c538a364e8bfd8aa5598077521ad87fb297e88</a></p>
</li>
<li>
<p>Since I added to the EU e-invoice CEN norm (EN16931) not only the recommendation to use decimal-based arithmetic but also HALF-UP as the default rounding, and since Saxon, AFAIK, does not support user functions, I had to overwrite the XPath round() function. You should not adopt this hack ;-)</p>
</li>
</ol>
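<p>The difference between the rounding rules mentioned in the last item can be sketched in plain Java. The helper names are hypothetical; <code>xpathRound</code> imitates the tie-breaking of the XPath <code>round()</code> function (ties go towards positive infinity) and is not Saxon's implementation.</p>

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundingModesSketch {
    // Commercial rounding: ties go away from zero.
    static BigDecimal halfUp(String value) {
        return new BigDecimal(value).setScale(0, RoundingMode.HALF_UP);
    }

    // Banker's rounding: ties go to the even neighbour.
    static BigDecimal halfEven(String value) {
        return new BigDecimal(value).setScale(0, RoundingMode.HALF_EVEN);
    }

    // XPath round(): ties go towards positive infinity.
    static BigDecimal xpathRound(String value) {
        BigDecimal d = new BigDecimal(value);
        RoundingMode mode = d.signum() < 0 ? RoundingMode.HALF_DOWN : RoundingMode.HALF_UP;
        return d.setScale(0, mode);
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.5", "3.5", "-2.5"}) {
            System.out.println(v + " -> halfUp=" + halfUp(v)
                    + " halfEven=" + halfEven(v)
                    + " xpathRound=" + xpathRound(v));
        }
    }
}
```

<p>The three rules only disagree on exact ties, and HALF-UP differs from XPath <code>round()</code> only for negative ties such as -2.5.</p>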
<p>I have not tested the performance/memory penalty, but my primary goal is to add accuracy to the XSLT-based validation artifacts of the CEN e-invoice specification. In general, I would prefer accuracy over a 10% run-time penalty, which, as Mike Cowlishaw mentioned to me, is the usual rate.</p>
<p>Hope I could be of help and thank you again for your quick response, Mike!
Svante</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168532020-11-15T11:33:00ZMichael Kaymike@saxonica.com
<ul></ul><p>Thanks for doing this investigation.</p>
<p>I think that the changes to the precision of decimal arithmetic are probably conformant with the XPath specification, which leaves many details of decimal arithmetic implementation-defined. We would need to run all the tests to see if it has any adverse impacts on conformance. Testing for backwards compatibility effects is more difficult because we don't have many tests for aspects of the specification that are implementation-defined. In theory we can also make it configurable, though I'm reluctant because that adds a lot of complexity and a lot of test cases.</p>
<p>The change to using decimal rather than double for numeric literals is however non-conformant and also breaks backwards compatibility. That's not a change we can contemplate. If you want 1.5e0 treated as <code>xs:decimal</code>, you need to write <code>xs:decimal('1.5e0')</code>.</p>
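<p>Why the type of a numeric literal matters can be illustrated with a Java analogue (a sketch with hypothetical names; it shows binary <code>double</code> versus <code>BigDecimal</code> in Java, not XPath evaluation in Saxon).</p>

```java
import java.math.BigDecimal;

public class LiteralTypeSketch {
    // Binary floating point: 0.1 and 0.2 have no exact representation.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // Decimal arithmetic: the same sum is exact.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println("double:  " + doubleSumIsExact());
        System.out.println("decimal: " + decimalSumIsExact());
    }
}
```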
<p>If you want to customise functions like round() then I strongly recommend writing your own functions rather than modifying the standard functions. You can implement your own functions even in Saxon-HE by writing them as "integrated extension functions".</p>
<p>We have no plans to change the current policy of differentiating Saxon-PE from -HE, under which Saxon extensions and extensibility mechanisms are generally available only in the commercial product. This policy has proved highly successful in generating a revenue stream that enables us to continue development both for the 10% of users who pay for the product and for the 90% who use the free version. Everyone benefits.</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168542020-11-15T13:35:01ZSvante Schubert
<ul></ul><p>Thanks again, for your feedback and guidance.</p>
<p>I was aware neither of the "integrated extension functions" nor of the issue with numeric literals.
What you say is all reasonable, and I will investigate more in this area.</p>
<p>Have a nice weekend,
Svante</p>
<p>PS: Sorry, for the typos, for instance, the example should be of course the following:</p>
<pre><code>($quantity * ($priceAmount div $baseQuantity)) = (1000000000.0 * (1.0 div 3)) = 333333333.3333333333333333333333333
($quantity * $priceAmount div $baseQuantity)   = (1000000000.0 * 1.0 div 3)   = 333333333.3333333333333333333333333
</code></pre> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168552020-11-15T18:32:02ZSvante Schubert
<ul></ul><p>You might want to consider changing the default floating-point type depending on the XSLT version:</p>
<ul>
<li>XSLT 1 & 2 could keep binary-based floating-point, as XSLT 2.0 was released in 2007, a year before IEEE 754:2008 embraced decimal-based floating-point.</li>
<li>XSLT 3 could use decimal-based floating-point by default, as XSLT 3.0 was released in 2017 and refers to IEEE 754:2008.
Or the default could even be changed later, in XSLT 4.</li>
</ul>
<p>In any case, you should allow some configuration for switching the floating-point type to decimal-based (or away from the default, just in case).
It is an easy switch in NumericValue.java,
see <a href="https://github.com/svanteschubert/Saxon-HE/commit/fe8ca45c54622b467eb58fbaeae0d3edbe4461c7" class="external">https://github.com/svanteschubert/Saxon-HE/commit/fe8ca45c54622b467eb58fbaeae0d3edbe4461c7</a>
The entire e-commerce business would be better off using decimal-based arithmetic.</p>
<p>For Saxon, switching the default should be considered for a new major release.
This might be helpful, as the new accuracy will change results and some automated regression tests might be caught by surprise.
Such changes can be expected in a major release.</p>
<p>Again the former example, now better formatted:</p>
<pre><code>quantity = 1000000000.0
priceAmount = 1.0
baseQuantity = 3
</code></pre>
<p>Using binary floating-point:</p>
<pre><code>($quantity * ($priceAmount div $baseQuantity)) = (1000000000.0 * (1.0 div 3)) = 333333333.333333333333333333
($quantity * $priceAmount div $baseQuantity)   = (1000000000.0 * 1.0 div 3)   = 333333333.3333333333333333333333333333333333
</code></pre>
<p><strong>The above values should be the same</strong>, but differ by 0.0000000000000003333333333333333
In the energy & pharma sectors, prices with 6 to 9 decimal places are common; combined with high quantities, errors easily reach the cent level.</p>
<p>Using decimal-based floating-point (IEEE 754:2008 or later)</p>
<pre><code>($quantity * ($priceAmount div $baseQuantity)) = (1000000000.0 * (1.0 div 3)) = 333333333.3333333333333333333333333
($quantity * $priceAmount div $baseQuantity)   = (1000000000.0 * 1.0 div 3)   = 333333333.3333333333333333333333333
</code></pre>
<p>Question/Suggestion:
I added half-up to the "integrated extension functions". It also seems that round-half-to-even was not implemented consistently, so I fixed this according to the XSLT >=2 specification:
<a href="https://www.w3.org/TR/xquery-operators/#func-round-half-to-even" class="external">https://www.w3.org/TR/xquery-operators/#func-round-half-to-even</a>
or
<a href="https://www.w3.org/TR/xpath-functions-31/#func-round-half-to-even" class="external">https://www.w3.org/TR/xpath-functions-31/#func-round-half-to-even</a></p>
<p>The JavaDoc of the parent class "NumericValue" states:</p>
<pre><code> /**
* Implement the XPath 2.0 round-half-to-even() function
*
* @param scale the decimal position for rounding: e.g. 2 rounds to a
* multiple of 0.01, while -2 rounds to a multiple of 100
* @return a value, of the same type as the original, rounded towards the
* nearest multiple of 10**(-scale), with rounding towards the nearest
* even number if two values are equally near
*/
</code></pre>
<p>But some implementations change the JavaDoc and do not fully implement the XPath function, because they do not allow a positive scale:</p>
<pre><code> /**
* Implement the XPath round-to-half-even() function
*
* @param scale number of digits required after the decimal point; the
* value -2 (for example) means round to a multiple of 100
* @return if the scale is &gt;=0, return this value unchanged. Otherwise
* round it to a multiple of 10**-scale
*/
</code></pre>
<p><strong>if the scale is >=0, return this value unchanged. Otherwise round it to a multiple of 10**-scale</strong>
Not only is a positive scale ignored, but the existing object instance seems to be changed instead of a rounded copy being returned.
Either way is fine, but it should be consistent.</p>
<p>What do you prefer?</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168562020-11-15T19:23:14ZMichael Kaymike@saxonica.com
<ul></ul><p>The <code>xs:double</code> data type in XSD and XPath is based very firmly on 64-bit binary floating point.</p>
<p>Support for "xs:precisionDecimal" based on IEEE-754:2008 was proposed for XSD 1.1 but didn't make it into the final spec (it turned into something of a political battle between Oracle and IBM). When it was withdrawn, however, there was a concession that allowed implementors to add primitive data types beyond those in the standard. So at the XSD level Saxon could add precisionDecimal (decimal-based floating point) if we chose, as an extension.</p>
<p>However, one of the factors that led to its withdrawal from XSD was the amount of work that would be needed to support it in XPath, especially the complexity that two values can be numerically equal, but still different when scale is taken into account. Having observed those discussions from the sidelines, I would not be at all enthusiastic about taking on the task of defining the semantics for precisionDecimal support in XPath (especially the semantics of mixed-type operations). I think this belongs in an add-on function library, not in the core, at least until it becomes tried and tested.</p>
<p>As far as your comments on the Javadoc are concerned, the <code>round-half-to-even()</code> operation on an integer is a no-op if the scale is positive (round-half-to-even(23, 4) returns 23), and the Javadoc on the roundHalfToEven() methods on <code>Int64Value</code> and <code>IntegerValue</code> reflects this.</p>
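<p>The integer behaviour described here can be sketched as follows. This is an illustration with a hypothetical static helper, not the code of <code>Int64Value</code>: for a non-negative scale the integer is already a multiple of 10**(-scale), so it comes back unchanged.</p>

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class IntegerHalfEvenSketch {
    // Rounds an integer to a multiple of 10**(-scale), ties to the even neighbour.
    // For scale >= 0 there is nothing to round, so the value is returned as-is.
    // Note: longValueExact() throws if rounding pushes the value out of long range.
    static long roundHalfToEven(long value, int scale) {
        if (scale >= 0) {
            return value;
        }
        return BigDecimal.valueOf(value)
                .setScale(scale, RoundingMode.HALF_EVEN)
                .longValueExact();
    }

    public static void main(String[] args) {
        System.out.println(roundHalfToEven(23, 4));          // no-op for positive scale
        System.out.println(roundHalfToEven(123456789, -2));  // multiple of 100
        System.out.println(roundHalfToEven(250, -2));        // tie, rounds to even
    }
}
```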
<p>You seem to suggest that there's an implementation that's modifying existing values in situ rather than returning a copy; if that's the case then it would certainly be a bug, but I haven't found it from your description.</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168572020-11-16T17:08:55ZSvante Schubert
<ul><li><strong>File</strong> <a href="/attachments/49171">2020-11-16_17-17-25.png</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/49171/2020-11-16_17-17-25.png">2020-11-16_17-17-25.png</a> added</li></ul><a name="Can-we-fix-the-specs"></a>
<h2 >Can we fix the specs?<a href="#Can-we-fix-the-specs" class="wiki-anchor">¶</a></h2>
<p>Unfortunately, XSD is completely unaware of (or ignores) decimal-based floating-point.
By prohibiting scientific notation for <code>xs:decimal</code> and allowing it only for binary floating-point such as <code>xs:double</code>, it makes the use of pure decimal-based arithmetic very difficult, if not impossible, for users.
But science and especially the e-commerce sector desperately need the accuracy that decimal-based arithmetic offers.
Users should be able to switch easily from the binary to the decimal-based implementation.</p>
<p>Therefore, instead of ignoring decimal-based floating-point and providing solely float and double, there should probably be differentiated types such as:</p>
<ul>
<li>bFloat</li>
<li>bDouble</li>
<li>bQuadruple</li>
<li>dDouble</li>
<li>dQuadruple</li>
</ul>
<p>These follow the table of parameters defining basic format floating-point numbers from IEEE 754 (attached).</p>
<p>Regarding XPath, I do not see the problem yet. The Java implementation BigDecimal has an equals() method; the syntax of a number might differ, but the semantics stay the same. Aren't the multiple representations of the same number an implementation detail that can be hidden behind some normalization layer?
The user's need for accuracy outweighs the problems we might have with implementations ;-)</p>
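<p>For reference, Java's <code>BigDecimal</code> itself distinguishes between the representation and the numeric value: <code>equals()</code> takes the scale into account, while <code>compareTo()</code> does not, and <code>stripTrailingZeros()</code> can serve as a simple normalization step. A minimal sketch with hypothetical helper names:</p>

```java
import java.math.BigDecimal;

public class DecimalEqualitySketch {
    // True only if value AND scale match (2.0 and 2.00 differ here).
    static boolean sameRepresentation(BigDecimal a, BigDecimal b) {
        return a.equals(b);
    }

    // True if the numeric values match, whatever the scale.
    static boolean sameNumber(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    // One possible normalization: drop trailing zeros before comparing.
    static BigDecimal normalize(BigDecimal a) {
        return a.stripTrailingZeros();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("2.0");
        BigDecimal b = new BigDecimal("2.00");
        System.out.println(sameRepresentation(a, b));
        System.out.println(sameNumber(a, b));
        System.out.println(sameRepresentation(normalize(a), normalize(b)));
    }
}
```

<p>Whether such a normalization layer is acceptable depends on whether the scale is meant to carry information, which is the point of contention in the discussion above.</p>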
<a name="Can-users-rely-on-decimal-based-accuracy-via-Saxon"></a>
<h2 >Can users rely on decimal-based accuracy via Saxon?<a href="#Can-users-rely-on-decimal-based-accuracy-via-Saxon" class="wiki-anchor">¶</a></h2>
<p>So if we stick strictly to the old XSD spec, which ignores decimal-based floating-point, we cannot solve the problem.
How can a decimal-based extension work? Any suggestions? I bet NumericValue.parseNumber has to decide between the floating-point bases. If someone wants accuracy, it makes little sense to mix binary and decimal; one should stick to decimal.
That is the reason for my prototype/fork/test: to give the EU e-invoice validation artefacts a solid (decimal) XSLT Saxon base!</p>
<a name="CommentsQuestions-on-Saxon-Code"></a>
<h2 >Comments/Questions on Saxon Code<a href="#CommentsQuestions-on-Saxon-Code" class="wiki-anchor">¶</a></h2>
<ol>
<li>I agree with your comment: XPath <a href="https://www.w3.org/TR/xpath-functions-31/#func-round-half-to-even" class="external">https://www.w3.org/TR/xpath-functions-31/#func-round-half-to-even</a> specifies that the same (or a related) type has to be returned.
FYI: In Java, the following can be done:
Rounding half-even 123456789 with scale -2 to 123456800
Rounding half-even 123456789 with scale 2 to 123456789.00
Last line via: new BigDecimal("123456789").divide(BigDecimal.ONE, 2, RoundingMode.HALF_EVEN).toPlainString()</li>
<li>If rounding should always return a copy, is this correct?
<a href="https://github.com/svanteschubert/Saxon-HE/blob/main/src/main/java/net/sf/saxon/value/Int64Value.java#L495" class="external">https://github.com/svanteschubert/Saxon-HE/blob/main/src/main/java/net/sf/saxon/value/Int64Value.java#L495</a>
It is safer to return only copies; are there concurrent accesses, or what is the reason? Just curious.</li>
</ol>
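<p>The two Java roundings from the first item can be reproduced with <code>BigDecimal.setScale</code>, used here as an alternative to the <code>divide(BigDecimal.ONE, ...)</code> call quoted above. The helper name is hypothetical.</p>

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class HalfEvenScaleSketch {
    // Rounds to the given scale with ties to even and prints without exponent.
    static String roundHalfEven(String value, int scale) {
        return new BigDecimal(value)
                .setScale(scale, RoundingMode.HALF_EVEN)
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(roundHalfEven("123456789", -2)); // 123456800
        System.out.println(roundHalfEven("123456789", 2));  // 123456789.00
    }
}
```

<p>With a positive scale, <code>BigDecimal</code> keeps the value and only adds trailing zeros, so the numeric result is the same as the integer no-op.</p>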
<a name="Whats-next"></a>
<h2 >What's next?<a href="#Whats-next" class="wiki-anchor">¶</a></h2>
<p>In the next days, I will review my weekend work to provide you with a more solid "pull-request/suggestion". BTW, I grant you all the rights/agreements you need to merge my work into your codebase.
If you like, we could have a short joint tea break to discuss any further obstacles on our way to supporting decimal-based arithmetic :-)
We met briefly in Prague this year at the XML conference, where I asked you to support bidirectional XSL transformation... ;-)</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=168762020-11-19T12:02:13ZMichael Kaymike@saxonica.com
<ul><li><strong>Tracker</strong> changed from <i>Bug</i> to <i>Feature</i></li><li><strong>Subject</strong> changed from <i>Switching from 9.9.1-8 to 9.9.1-9 (also in 10.3) causes error: "Required attribute 'select' is missing"</i> to <i>Decimal Precision</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul><p>Changing the title to better reflect the topic, and recategorising from "Bug" to "Feature" since there appears to be no suggestion that the product isn't behaving to spec.</p> Saxon - Feature #4823: Decimal Precisionhttps://saxonica.plan.io/issues/4823?journal_id=170082020-12-11T18:07:30ZMichael Kaymike@saxonica.com
<ul><li><strong>Category</strong> set to <i>Saxon extensions</i></li><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Closed</i></li><li><strong>Assignee</strong> set to <i>Michael Kay</i></li></ul><p>I'm going to close this with no action, I'm afraid. I can't envisage circumstances in which we would be able to construct a business case for investing in this area. If someone produces a third-party library that provides such functionality, then we would consider integrating it.</p>