Processing Empty Text Node with XPath

Added by Anonymous over 14 years ago

Legacy ID: #8203302 Legacy Poster: Rami (iirami)

When executing an XPath like this one with Saxon

[code]/Items/Item/ItemInfo3/text()[/code]

using [code]XPathExpression.evaluate(domDocument, XPathConstants.NODESET)[/code], the resulting NodeList does not include text nodes whose content is the empty string. However, the XPath implementation that ships with the Sun JRE does return such empty text nodes.

[code] 11 12 13 21 22 [/code]

When a DOM is created directly from that sample XML, both implementations return the same NodeList, but when I add an empty Text node with the following code:

[code]XPathExpression exp = xpath.compile("/Items/Item[2]/ItemInfo3"); NodeList list = (NodeList)exp.evaluate(doc, XPathConstants.NODESET); Node node = list.item(0); node.appendChild(doc.createTextNode(""));[/code]

the results differ. Which one is correct? Is there a way to configure Saxon to return the empty Text nodes like the Apache implementation does? I have Java code that reproduces this difference if needed. Thanks!

Replies (1)

RE: Processing Empty Text Node with XPath - Added by Anonymous over 14 years ago

Legacy ID: #8203379 Legacy Poster: Michael Kay (mhkay)

XPath semantics are defined against the XDM data model, not against the DOM, and in XDM, elements never have empty text nodes as children. An empty element has no children in the XDM view of the world. When you supply a DOM as input, Saxon has to perform a mapping from the DOM world-view to the XDM world-view - that's one of the reasons that querying a DOM is so slow compared with using the native Saxon tree model. It's also one of the reasons why I always advise people against using the construct A/B/C/text() - it's much better to write A/B/C/string().

the XPath implementation that ships with the Sun JRE returns such empty text nodes

That's an XPath 1.0 implementation. In XPath 1.0 the description of the data model was much briefer and less formal, but it clearly states in section 5.7 "A text node always has at least one character of data." So the JDK implementation is clearly wrong.
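
For readers who want a result that does not depend on how a processor treats empty DOM Text nodes, the following is a minimal sketch of the advice above, using only the JDK's XPath 1.0 API. Since the step form A/B/C/string() is not available in XPath 1.0, the sketch selects the ItemInfo3 elements and reads each one's string value with getTextContent(). The sample XML is an assumption reconstructed from the question (the element names ItemInfo1 and ItemInfo2 are hypothetical), as are the class and method names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmptyTextNodeDemo {
    // Assumed sample document; ItemInfo1 and ItemInfo2 are hypothetical names.
    static final String XML =
        "<Items>"
      + "<Item><ItemInfo1>11</ItemInfo1><ItemInfo2>12</ItemInfo2><ItemInfo3>13</ItemInfo3></Item>"
      + "<Item><ItemInfo1>21</ItemInfo1><ItemInfo2>22</ItemInfo2><ItemInfo3/></Item>"
      + "</Items>";

    // Returns the string value of every ItemInfo3 element, in document order.
    static List<String> itemInfo3Values() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Reproduce the question: append an empty Text node to the second ItemInfo3.
            Node second = (Node) xpath.evaluate(
                "/Items/Item[2]/ItemInfo3", doc, XPathConstants.NODE);
            second.appendChild(doc.createTextNode(""));

            // Select the elements rather than their text() children.
            NodeList elements = (NodeList) xpath.evaluate(
                "/Items/Item/ItemInfo3", doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < elements.getLength(); i++) {
                // getTextContent() concatenates descendant text, so an element
                // with no text, or only an empty Text node, yields "".
                values.add(elements.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(itemInfo3Values());
    }
}
```

Because the elements themselves are selected, the list has one entry per ItemInfo3 whether or not the empty Text node is present in the DOM. If the empty Text nodes need to be removed from the DOM itself, calling normalize() on the node or document is one option, since the DOM specification defines normalization as leaving no empty Text nodes.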
