Bug #2130
closed
XPath referencing attribute with namespace fails when using DOM
Description
Using JAXP to create an XPath with the Saxon-HE implementation fails if you run it on an org.w3c.dom.Document (using Xerces 2.7.1, but that shouldn't matter) where the XPath references an attribute with a namespace prefix, unless that namespace is declared on the same Element as the prefixed attribute.
For example, using the code below on Saxon-HE 9.5.1.6 (I can't find 9.5.1.7 in Maven, but I have confirmed by analysing the code that the issue should still exist there):
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class XPathAttrTest { // class name is arbitrary
    public static void main(String[] args) throws Exception {
        System.setProperty("javax.xml.xpath.XPathFactory", "net.sf.saxon.xpath.XPathFactoryImpl");
        System.setProperty("javax.xml.parsers.DocumentBuilderFactory", "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
        XPathFactory xpf = XPathFactory.newInstance();
        XPath xp = xpf.newXPath();
        // Note the reference to attribute @xsi:type!!!
        XPathExpression xpe = xp.compile("/root/child/item[@xsi:type=\"typeA\"]/info");
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(ClassLoader.getSystemResourceAsStream("test.xml"));
        System.out.println("XPath: " + xpe.evaluate(doc));
    }
}
If you run this on the document (named test.xml):
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <child>
    <item name="item1" xsi:type="typeA" >
      <info>1234</info>
      <more>abdc</more>
    </item>
  </child>
</root>
The output is "XPath: "
If you run the above code on this altered XML document:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <child>
    <item name="item1" xsi:type="typeA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
      <info>1234</info>
      <more>abdc</more>
    </item>
  </child>
</root>
The output is "XPath: 1234"
The culprit is net.sf.saxon.dom.DOMNodeWrapper line 501 (503 in 9.5.1.7):
Node node = attr.getOwnerElement();
do {
    String attVal = ((Element) node).getAttribute(attName); // this is the line!!!
    if (attVal != null) {
        return attVal;
    }
    node = node.getParentNode();
} while (node != null && node.getNodeType() == Node.ELEMENT_NODE);
According to the DOM API shipped with JAXP, Element#getAttribute(String name) returns "the empty string if that attribute does not have a specified or default value".
This means that attVal is never null: when the attribute is absent from the owner element it is an empty String, so the method returns an empty String on the first iteration without traversing the rest of the tree and finding the URI.
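To make the point concrete, here is a minimal sketch that only uses the JDK's built-in DOM parser (not Saxon or Xerces 2.7.1). The class name GetAttributeDemo and the helper parseRoot are made up for illustration. It shows why a null check on the result of getAttribute cannot detect a missing attribute:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Saxon.
public class GetAttributeDemo {

    // Hypothetical helper: parse an XML string and return its root element.
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // An element that does NOT declare xmlns:xsi itself.
        Element item = parseRoot("<item name=\"item1\"/>");

        String missing = item.getAttribute("xmlns:xsi");
        System.out.println("is null:      " + (missing == null));   // false
        System.out.println("is empty:     " + missing.isEmpty());   // true
        System.out.println("hasAttribute: " + item.hasAttribute("xmlns:xsi")); // false
    }
}
```

Only hasAttribute distinguishes "attribute absent" from "attribute present with an empty value".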
What this should do is first call Element#hasAttribute(String attrName) and return the attribute if this returns true (similar to DOMNodeWrapper#getElementURI(Element element)) i.e.:
if (((Element) node).hasAttribute(attName)) {
    return ((Element) node).getAttribute(attName);
}
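For reference, the proposed check can be tried outside Saxon. The following is a rough standalone sketch, not the actual DOMNodeWrapper code: the class NamespaceWalk and the methods findNamespaceUri and parse are hypothetical names, and it assumes the prefix is declared with a plain xmlns:prefix attribute somewhere on the ancestor axis.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical standalone class, not part of Saxon.
public class NamespaceWalk {

    // Walk up from the attribute's owner element until an element
    // declaring xmlns:<prefix> is found; return null if none does.
    static String findNamespaceUri(Attr attr, String prefix) {
        String attName = "xmlns:" + prefix;
        Node node = attr.getOwnerElement();
        while (node != null && node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            // hasAttribute tells "absent" apart from "present but empty".
            if (element.hasAttribute(attName)) {
                return element.getAttribute(attName);
            }
            node = node.getParentNode();
        }
        return null;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Same shape as the first test.xml: the prefix is declared on the root.
        Document doc = parse(
                "<root xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
              + "<child><item name=\"item1\" xsi:type=\"typeA\">"
              + "<info>1234</info></item></child></root>");
        Element item = (Element) doc.getElementsByTagName("item").item(0);
        Attr type = item.getAttributeNode("xsi:type");
        // prints http://www.w3.org/2001/XMLSchema-instance
        System.out.println(findNamespaceUri(type, "xsi"));
    }
}
```

With the original null check in place of hasAttribute, the same walk would stop at the item element and return an empty String.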
This is a simple fix, however I have a NamespaceContext declared that defines some namespaces that aren't declared in some XML documents I need to process. I notice DOMNodeWrapper traverses the tree to find the namespace declaration, and doesn't query the NamespaceContext. I have literally learnt the internals of Saxon today to solve this problem and I am by no means an expert on the API. Is there a reason why the NamespaceContext isn't queried, or is it just because the DOMNodeWrapper doesn't have access to the JAXPXPathStaticContext stored in XPathEvaluator? If it's the latter, what would be the best way to open this up?
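For context, this is roughly how a NamespaceContext is wired in through plain JAXP. It is only a sketch under assumptions: it runs against the JDK's default XPath implementation with a namespace-aware DOM, the class name NamespaceContextSketch and the helper evaluate are made up, and it says nothing about how Saxon's DOMNodeWrapper resolves namespaces.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Saxon.
public class NamespaceContextSketch {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    // Minimal context that only knows the xsi prefix.
    static final NamespaceContext XSI_ONLY = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix is null");
            }
            return "xsi".equals(prefix) ? XSI : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String uri) {
            return XSI.equals(uri) ? "xsi" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String uri) {
            return XSI.equals(uri)
                    ? Collections.singletonList("xsi").iterator()
                    : Collections.<String>emptyIterator();
        }
    };

    // Hypothetical helper: parse namespace-aware, then evaluate the expression.
    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xp = XPathFactory.newInstance().newXPath();
        xp.setNamespaceContext(XSI_ONLY); // resolves prefixes used in the expression
        return xp.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root xmlns:xsi=\"" + XSI + "\"><child>"
                + "<item name=\"item1\" xsi:type=\"typeA\"><info>1234</info></item>"
                + "</child></root>";
        System.out.println("XPath: "
                + evaluate(xml, "/root/child/item[@xsi:type=\"typeA\"]/info"));
    }
}
```

Here the NamespaceContext resolves the prefix written in the expression, while the namespace of the attribute in the document comes from the parsed DOM itself.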
I don't mind implementing the fix; however, I would like some feedback from someone on that last paragraph, as it's an important issue that I need to fix.
Thanks, Dan