Bug #5951

closed

XQuery with DOMResult - empty prefixed namespace bindings added to children elements

Added by Steven Dürrenmatt about 1 year ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
DOM Interface
Sprint/Milestone:
-
Start date:
2023-04-01
Due date:
% Done:

80%

Estimated time:
Legacy ID:
Applies to branch:
10, 11, 12
Fix Committed on Branch:
10, 11, 12
Fixed in Maintenance Release:
Platforms:
Java

Description

Hello,

Since Saxon 10, I have failing tests for XQuery transformations that deal with elements without a namespace attribute. Such elements can inherit their parent's namespace prefix, and the namespace binding is then reassigned the empty value.

Example of a basic XQuery transformation that does not need any context to be evaluated:

// Compile and run a query that needs no context item, sending the result to a DOM tree.
DOMResult result = new DOMResult();
Configuration config = new Configuration();
XQueryExpression expression = config.newStaticQueryContext().compileQuery("declare namespace b = 'test';<b:book><title/></b:book>");
expression.run(new DynamicQueryContext(config), result, new Properties());

// Serialize the resulting DOM document with an identity transformation.
Document doc = (Document) result.getNode();
StringWriter sw = new StringWriter();
Transformer transformer = new TransformerFactoryImpl().newTransformer();
transformer.transform(new DOMSource(doc), new StreamResult(sw));

System.out.println(sw);

That should output:

<?xml version="1.0" encoding="UTF-8"?><b:book xmlns:b="test"><title/></b:book>

But we have the following invalid XML instead:

<?xml version="1.0" encoding="UTF-8"?><b:book xmlns:b="test"><title xmlns:b=""/></b:book>

It works as expected if you use a StreamBuffer from a byte stream instead of a DOMResult. Another workaround is to redeclare the namespace inside the XML in the transformation; however, that does not work for more complex transformations where the XML may be split across different functions. Any suggestion is welcome.


Files

clipboard-202304011320-pjfuf.png (76.1 KB), added by Steven Dürrenmatt, 2023-04-01 13:20

Related issues

Has duplicate Saxon - Bug #6173: Namespace with empty uri when using Saxon-11.5 (Duplicate, Michael Kay, 2023-08-16)

Actions #1

Updated by Michael Kay about 1 year ago

Reproduced in 12.1 as unit test TestXQueryDOM/testBug5951.

As a matter of interest, is there any good reason why you are sending the query output to a DOMResult? There are always problems getting it right because of the mismatch between the DOM and XDM data models, particularly in areas like the different representation of namespaces.

Actions #2

Updated by Michael Kay about 1 year ago

  • Subject changed from xquery - empty prefixed namespace bindings added to children elements to XQuery with DOMResult - empty prefixed namespace bindings added to children elements
  • Status changed from New to In Progress
  • Priority changed from Low to Normal
Actions #3

Updated by Steven Dürrenmatt about 1 year ago

I am actually using the XQuery component from Apache Camel, which recently upgraded from Saxon 9 to Saxon 11. In Apache Camel, DOM is the default result format, but it can be configured per XQuery. They do not use s9api and XDM data models but lower-level code.

See https://github.com/apache/camel/blob/main/components/camel-saxon/src/main/java/org/apache/camel/component/xquery/XQueryBuilder.java

MHK: Thanks. Yes, we've had a lot of problems with the way Apache Camel uses Saxon interfaces.

Actions #4

Updated by Michael Kay about 1 year ago

The XQuery rules are rather precise and not necessarily intuitive.

Firstly, the rules at https://www.w3.org/TR/xquery-31/#id-ns-nodes-on-elements specify what namespace nodes should be present on the element produced by a direct element constructor, and these do not include namespaces declared in the query prolog (there are four rules saying when namespaces should be added, and none of them apply). So the title element, as originally constructed, does not have an in-scope namespace for the "b" namespace.

When the title element is then added to the book element, you might think that it would inherit the namespaces of its new parent. This process is described at https://www.w3.org/TR/xquery-31/#id-content. Rule 1(d) applies, and no namespace inheritance takes place. If the constructor for title were enclosed in curly braces, rule 1(e)(iv)(D) would kick in, defining whether and when namespaces on the parent are copied to the children; but without the curly braces, this does not happen.

So in the result tree, the title element has no in-scope namespace for the b namespace.

The next question is how the result is serialized. Adding the namespace undeclaration xmlns:b="" is definitely correct for XML 1.1 serialization, but wrong for XML 1.0 serialization: yet the serialized output specifies <?xml version="1.0"?>.
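The XML 1.0 versus XML 1.1 distinction can be checked independently of Saxon. The following is a minimal sketch that uses only the JDK's JAXP parser in namespace-aware mode; the class and method names are hypothetical, and it assumes the JDK's built-in parser, which supports XML 1.1. The document labelled XML 1.0 should be rejected because a prefixed namespace binding may not be empty there, while the same content labelled XML 1.1 should be accepted as a prefix undeclaration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of Saxon: checks whether a namespace-aware
// parser accepts a document containing the undeclaration xmlns:b="".
class PrefixUndeclarationCheck {

    // The serialized output reported in this issue (XML 1.0).
    static final String XML_10 =
        "<?xml version=\"1.0\"?><b:book xmlns:b=\"test\"><title xmlns:b=\"\"/></b:book>";

    // The same content, but labelled as XML 1.1.
    static final String XML_11 =
        "<?xml version=\"1.1\"?><b:book xmlns:b=\"test\"><title xmlns:b=\"\"/></b:book>";

    /** Returns true if the namespace-aware JAXP parser accepts the text without errors. */
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Treat errors and fatal errors alike as a rejection; ignore warnings.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("XML 1.0 accepted: " + parses(XML_10));
        System.out.println("XML 1.1 accepted: " + parses(XML_11));
    }
}
```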

Note that the code fragment you supplied isn't explicit about which TransformerFactoryImpl class you are instantiating. Both Saxon and Xalan have classes with that name. I'll assume it's the Saxon one, since that's what I'm using in my test, and the results are the same as yours.

It seems that the xmlns:b="" is actually being generated by the DOM tree walker and is sent to the serializer as an attribute node, not as a namespace node. The problem is in DOMSender.outputElement(), where the hasNamespaceDeclarations flag is set to false, and therefore all DOM attributes are output as attribute nodes without checking to see if they really represent namespace nodes; and because it arrives at the serializer as an attribute node, the serializer's attempt to eliminate unwanted namespace declarations has no effect.

Actions #5

Updated by Steven Dürrenmatt about 1 year ago

Thanks for the clarification. However, I am not sure I understand your point about the transformer. Yes, I use the Saxon implementation, but I get the same serialized output with other transformers. It looks like the namespace attribute xmlns:b="" is already present in the DOMResult, despite it never being explicitly undeclared.

MHK: yes, it turned out not to be a problem in the identity transformer. The DOM tree is a correct representation of the query result tree, in so far as any DOM can ever be an accurate representation of an XDM tree, but Saxon's identity transformer requires the DOM view of the world (where namespaces are attributes) to be translated back to the XDM view of the world (where namespaces are namespaces) and this translation has been done incorrectly.
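The DOM side of this mismatch can be illustrated with plain DOM calls, without Saxon. In the sketch below (hypothetical class and method names, JDK DOM implementation assumed), a child element in a DOM tree sees its parent's prefix bindings through lookupNamespaceURI, so the only way for a DOM to say "title has no binding for b" is to attach an xmlns:b="" attribute to it. That is consistent with the attribute appearing in the DOMResult even though the query never undeclares anything.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration, not Saxon code: shows how prefix bindings are
// visible to child elements in a DOM tree.
class DomPrefixInheritance {

    /**
     * Builds <b:book xmlns:b="test"><title/></b:book> and returns the title element.
     * If undeclarePrefix is true, title also carries the attribute xmlns:b="".
     */
    static Element buildTitle(boolean undeclarePrefix) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element book = doc.createElementNS("test", "b:book");
        book.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:b", "test");
        doc.appendChild(book);

        Element title = doc.createElementNS(null, "title");
        if (undeclarePrefix) {
            title.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:b", "");
        }
        book.appendChild(title);
        return title;
    }

    /** True if the prefix resolves to a non-empty namespace URI at this element. */
    static boolean isPrefixBound(Element element, String prefix) {
        String uri = element.lookupNamespaceURI(prefix);
        return uri != null && !uri.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("b bound on plain title: " + isPrefixBound(buildTitle(false), "b"));
        System.out.println("b bound on title with xmlns:b=\"\": " + isPrefixBound(buildTitle(true), "b"));
    }
}
```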

Actions #6

Updated by Michael Kay about 1 year ago

Fixed by removing the optimization in DOMSender.outputElement() - it now checks all DOM attributes to see if they are really namespace declarations, regardless.
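For readers unfamiliar with how namespaces appear in a DOM, the kind of per-attribute check described here can be sketched as follows. This is not Saxon's DOMSender code; the class, method names, and the sample id attribute are hypothetical, and only standard org.w3c.dom calls are used. Each attribute is examined and treated as a namespace declaration if it is in the xmlns namespace or is named xmlns or xmlns:prefix; everything else is passed on as an ordinary attribute.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical sketch of classifying DOM attributes; not Saxon's implementation.
class DomAttributeClassifier {

    /** True if the DOM attribute really represents a namespace (un)declaration. */
    static boolean isNamespaceDeclaration(Attr attr) {
        String name = attr.getName();
        return XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())
                || name.equals("xmlns")
                || name.startsWith("xmlns:");
    }

    /** Names of the attributes on the element that are namespace declarations. */
    static List<String> namespaceDeclarations(Element element) {
        return collect(element, true);
    }

    /** Names of the attributes on the element that are ordinary attributes. */
    static List<String> ordinaryAttributes(Element element) {
        return collect(element, false);
    }

    private static List<String> collect(Element element, boolean wantNamespaces) {
        List<String> names = new ArrayList<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Attr attr = (Attr) attributes.item(i);
            if (isNamespaceDeclaration(attr) == wantNamespaces) {
                names.add(attr.getName());
            }
        }
        return names;
    }

    /** Builds a title element carrying xmlns:b="" plus a made-up id attribute. */
    static Element sampleTitle() {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element title = doc.createElementNS(null, "title");
        title.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:b", "");
        title.setAttribute("id", "t1"); // illustrative only, not in the original query
        doc.appendChild(title);
        return title;
    }

    public static void main(String[] args) {
        Element title = sampleTitle();
        System.out.println("namespace declarations: " + namespaceDeclarations(title));
        System.out.println("ordinary attributes: " + ordinaryAttributes(title));
    }
}
```

A consumer that forwards the first list as namespace events rather than as attributes gives a serializer the chance to drop declarations that are not valid for the target XML version.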

(Cynical view: anyone using DOM by definition doesn't care very much about performance, so why bother optimising?)

Actions #7

Updated by Michael Kay about 1 year ago

  • Category set to DOM Interface
  • Status changed from In Progress to Resolved
  • Assignee set to Michael Kay
  • Fix Committed on Branch 10, 11, 12 added
  • Platforms Java added
Actions #8

Updated by O'Neil Delpratt 12 months ago

  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 12.2 added

Bug fix applied in the Saxon 12.2 maintenance release.

Actions #9

Updated by Debbie Lockett 8 months ago

  • Fixed in Maintenance Release 11.6 added

Bug fix applied in the Saxon 11.6 maintenance release.

Actions #10

Updated by Michael Kay 8 months ago

  • Has duplicate Bug #6173: Namespace with empty uri when using Saxon-11.5 added
Actions #11

Updated by O'Neil Delpratt 5 months ago

  • Status changed from Resolved to Closed
Actions #12

Updated by Debbie Lockett 5 months ago

  • Status changed from Closed to Resolved
  • % Done changed from 100 to 80

Leave as "resolved" rather than "closed" as fix has not gone out in a Saxon 10 maintenance release.
