Bug #4081
XdmNode.toString() output changes between 9.8.0.14 to 9.9.0.2
Status: Closed
% Done: 100%
Description
With the following document:
<?xml version="1.0" encoding="UTF-8"?>
<Employees>
<Employee id="1">
<age:ag xmlns:age="http://www.w3.org/2003/01/geo/wgs84_pos#">29</age:ag>
<name>Pankaj</name>
<gender>Male</gender>
<role>Java Developer</role>
</Employee>
<Employee id="2">
<age:ag xmlns:age="http://www.w3.org/2003/01/geo/wgs84_pos#">35</age:ag>
<name>Lisa</name>
<gender>Female</gender>
<role>CEO</role>
</Employee>
<Employee id="3">
<age:ag xmlns:age="http://www.w3.org/2003/01/geo/wgs84_pos#">45</age:ag>
<name>Tom</name>
<gender>Male</gender>
<role>Manager</role>
</Employee>
<Employee id="4">
<age:ag xmlns:age="http://www.w3.org/2003/01/geo/wgs84_pos#">55</age:ag>
<name>Meghan</name>
<gender>Female</gender>
<role>Manager</role>
</Employee>
</Employees>
Using this XPath expression (with XPathExecutable and an XPath selector):
"//Employees/Employee[1]/age:ag"
With 9.8.0-14 the result is: "<age:ag xmlns:age=\"http://www.w3.org/2003/01/geo/wgs84_pos#\">29</age:ag>"
But with 9.9.0-2 it is: "<age:ag xmlns:age=\"http://www.w3.org/2003/01/geo/wgs84_pos#\">29</age:ag>\n"
A trailing newline is added.
I didn't see any mention of this in the release notes, or maybe I didn't understand them.
Regards
Updated by Michael Kay almost 6 years ago
This XPath expression returns an element node. The string value of the element node is "29", which is the result I get when I do
Processor p = new Processor(false);
XPathCompiler c = p.newXPathCompiler();
c.declareNamespace("age", "http://www.w3.org/2003/01/geo/wgs84_pos#");
XPathExecutable e = c.compile(xpath);
XPathSelector s = e.load();
s.setContextItem(p.newDocumentBuilder().build(new StreamSource(new StringReader(xml))));
XdmItem val = s.evaluateSingle();
assertEquals("result", "29", val.getStringValue());
The result you are displaying (in both cases) includes the name of the element node as well as its string value. It's not clear how you are getting this string; we need to know this because that would appear to be where the changed behaviour lies. So, could you please supply a simple repro that shows the effect?
Updated by Philippe Mouawad almost 6 years ago
Hello, Thanks for your very fast answer.
This is the test case that passes with 9.8.0-14 and fails with 9.9.0-2:
@Test
public void spaceNewLineBugReproducer() throws SaxonApiException {
    // xmlDoc holds the Employees document from the description.
    Processor p = new Processor(false);
    XPathCompiler c = p.newXPathCompiler();
    c.declareNamespace("age", "http://www.w3.org/2003/01/geo/wgs84_pos#");
    String xPathQuery = "//Employees/Employee[1]/age:ag";
    XPathExecutable e = c.compile(xPathQuery);
    XPathSelector selector = e.load();
    selector.setContextItem(p.newDocumentBuilder().build(new StreamSource(new StringReader(xmlDoc))));
    XdmValue nodes = selector.evaluate();
    XdmItem item = nodes.itemAt(0);
    // Passes on 9.8.0-14; fails on 9.9.0-2 because toString() appends "\n".
    assertEquals("<age:ag xmlns:age=\"http://www.w3.org/2003/01/geo/wgs84_pos#\">29</age:ag>", item.toString());
}
Regards
Updated by Michael Kay almost 6 years ago
The specification of XdmNode.toString() says
In the case of an element node, the result will be a well-formed
XML document serialized as defined in the W3C XSLT/XQuery serialization specification,
using options method="xml", indent="yes", omit-xml-declaration="yes".
Because this specifies indent="yes", the processor is free to add whitespace (in this case a newline character) after an element end tag.
I would need to do some digging to find out exactly why it was changed (addition of a newline at the end of a serialized file is always a debatable topic...) but both versions with and without the newline are within the spec. We did find some cases where the Saxon indentation algorithm wasn't conformant to the W3C specs, so there were changes in 9.9, but I'm pretty sure the W3C specs make the newline here optional.
If you want predictable and repeatable results you shouldn't be relying on indented serialization, because it's not sufficiently specified.
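As an illustration of that advice, here is a minimal sketch that serializes the selected element with indentation switched off explicitly, so no optional whitespace can be added. Note the assumptions: it uses the JDK's built-in JAXP identity Transformer and a DOM tree rather than Saxon's s9api, and the class and method names (UnindentedSerialization, serialize, firstAge) are made up for this example. The serialization parameters are the same three named in the toString() specification quoted above, with indent set to "no" instead of "yes".

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; not part of Saxon.
final class UnindentedSerialization {
    static final String AGE_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#";

    private UnindentedSerialization() {}

    // Serializes a node with method=xml, indent=no, omit-xml-declaration=yes.
    static String serialize(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.INDENT, "no");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Parses the document namespace-aware and returns the first age:ag element.
    static Node firstAge(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagNameNS(AGE_NS, "ag").item(0);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee id=\"1\">"
                + "<age:ag xmlns:age=\"" + AGE_NS + "\">29</age:ag>"
                + "</Employee></Employees>";
        System.out.println(serialize(firstAge(xml)));
    }
}
```

With indent set to "no", the serializer has no licence to insert whitespace around the element, which is what makes the result suitable for exact string comparison in a test.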
Updated by Michael Kay almost 6 years ago
More specifically, what has changed is that from 9.9 the serializer implements the process that the W3C spec calls "sequence normalization"; previously this was the responsibility of the client of the serializer. The practical implication of this is that in 9.9 the element node being serialized by the toString() call is wrapped in a document node; and the XML indenter always (in both 9.8 and 9.9) adds a final newline after the endDocument() event. The purpose of this is so that when XSLT or XQuery output is sent to the console, the XML result is not followed on the same line by the next thing sent to the console.
We could, and perhaps should, change XdmNode.toString() so this newline is somehow suppressed; but the fact that it is there isn't a bug.
Updated by Michael Kay almost 6 years ago
- Status changed from New to Resolved
- Applies to branch 9.9 added
- Fix Committed on Branch 9.9 added
I have changed XdmNode.toString() to remove the final newline in the output of the serializer.
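For code that still has to run against a 9.9 build that predates this fix, a comparison that tolerates the trailing newline can be sketched as follows. This is a hypothetical helper written for illustration (the class and method names are invented); it is not the change that was committed to Saxon, only a caller-side way of making the 9.8 and 9.9 strings compare equal.

```java
// Hypothetical caller-side helper; not Saxon code.
final class TrailingNewline {
    private TrailingNewline() {}

    // Removes any run of '\n' or '\r' at the end of s; other whitespace is kept.
    static String stripTrailingLineBreaks(String s) {
        int end = s.length();
        while (end > 0 && (s.charAt(end - 1) == '\n' || s.charAt(end - 1) == '\r')) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        // The two strings reported in the description of this issue.
        String from98 = "<age:ag xmlns:age=\"http://www.w3.org/2003/01/geo/wgs84_pos#\">29</age:ag>";
        String from99 = from98 + "\n";
        // prints "true"
        System.out.println(stripTrailingLineBreaks(from98).equals(stripTrailingLineBreaks(from99)));
    }
}
```

In the reproducer above, the assertion would then compare the expected string with stripTrailingLineBreaks(item.toString()), which passes whether or not the serializer appended a newline.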
Updated by Michael Kay almost 6 years ago
- Subject changed from Upgrading from 9.8.0-14 to 9.9.0-2 gives different results when using XPath selector and XPathExecutable to XdmNode.toString() output changes between 9.8.0.14 to 9.9.0.2
- Category changed from XPath conformance to Serialization
Updated by O'Neil Delpratt almost 6 years ago
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.9.1.1 added
Bug fix applied to the Saxon 9.9.1.1 maintenance release.
Updated by O'Neil Delpratt almost 6 years ago
- Status changed from Resolved to Closed
Updated by Philippe Mouawad almost 6 years ago
Thanks a lot for your great support! We will upgrade in the upcoming JMeter 5.1, and you'll be thanked.
Regards Philippe