Support #6554
Closed: The contains (text()) function does not return all results
Description
Version: 12.4. Gradle dependency: group: 'net.sf.saxon', name: 'Saxon-HE', version: '12.4'
XPath: //li[contains(text(), 'Text 3')]
In version 12.3 it brings 2 records, while in version 12.4 only 1 piece for the same xpath. I have attached my test file.
Updated by Michael Kay 3 months ago
- Tracker changed from Bug to Support
You haven't said how you are running the XPath expression.
The correct result, for XPath 2.0+, is an error, and this is the result I am getting: the contains() function does not allow a sequence of multiple items to be supplied as the first argument. That's because the second li element has multiple text node children.
If you want to tell if the string value of the li element contains a particular substring, the correct expression is //li[contains(., 'Text 3')]. If you want to tell whether any of the text node children contains a particular substring, the correct expression is //li[text()[contains(., 'Text 3')]].
It's possible that you were running in XPath 1.0 compatibility mode, in which case you wouldn't get an error, rather it would ignore all text nodes except the first.
I'm not sure what you mean in your question about a "record" or a "piece" - I think you're using non-technical terms here - and it would help to know more precisely exactly what you were doing and exactly what the results were.
I can't think of any change between 12.2 and 12.3 that would affect this, and indeed, I can't think of any scenario where your expression would deliver multiple results: it should deliver an error in 2.0+ mode or a single item (the third li element) in 1.0 mode.
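To make the difference between the three expressions concrete, here is a minimal sketch. It rests on several assumptions: it uses the JDK's built-in JAXP XPath engine (XPath 1.0 only), not Saxon, so it only illustrates the 1.0 behaviour described above and never raises the 2.0+ error. The sample document and the class name ContainsTextDemo are hypothetical; the test file attached to the issue is not reproduced here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ContainsTextDemo {
    // Hypothetical sample document (not the file attached to the issue).
    // The second li has two text node children; the fourth has none.
    static final String XML =
        "<ul>"
        + "<li>Text 1</li>"
        + "<li>Text 2<br/>Text 3</li>"
        + "<li>Text 3</li>"
        + "<li><b>Text 3</b></li>"
        + "</ul>";

    // Counts the nodes selected by the expression, using the JDK's
    // default XPath 1.0 implementation (an assumption: Saxon is not used).
    static int count(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // XPath 1.0 converts text() to a string using only the first text node.
        System.out.println(count("//li[contains(text(), 'Text 3')]"));   // prints 1
        // Tests the full string value of each li, including descendants.
        System.out.println(count("//li[contains(., 'Text 3')]"));        // prints 3
        // Tests each text node child of li individually.
        System.out.println(count("//li[text()[contains(., 'Text 3')]]")); // prints 2
    }
}
```

With this sample, the first form misses the second li because its first text node is "Text 2". Under an XPath 2.0+ processor the same first expression would instead fail on that li with the "sequence of more than one item" error.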
Updated by Ati Wolf 3 months ago
How can I switch between XPath 1.0 and XPath 2.0?
The goal is to process XHTML in a way that works similarly to JS libraries.
Here's how it's used:
net.sf.saxon.xpath.XPathFactoryImpl saxon = new net.sf.saxon.xpath.XPathFactoryImpl();
XPath newXPath = saxon.newXPath();
// Keep the compiled expression and evaluate it against the context node
XPathExpression compiled = newXPath.compile(expression);
NodeList nodeList = (NodeList) compiled.evaluate(contextNode, XPathConstants.NODESET);
12.3 - This version throws an error
net.sf.saxon.trans.UncheckedXPathException: A sequence of more than one item is not allowed as the first argument of fn:contains() ("
Text 1
", "
")
12.4 - In this version, the size of the nodeList is 1
Updated by Michael Kay 3 months ago
How can I switch between XPath 1.0 and XPath 2.0?
Using the JAXP XPath API, you can switch to XPath 1.0 compatibility mode as follows:
net.sf.saxon.xpath.XPathFactoryImpl saxon = new net.sf.saxon.xpath.XPathFactoryImpl();
XPath newXPath = saxon.newXPath();
((net.sf.saxon.xpath.XPathEvaluator)newXPath).getStaticContext().setBackwardsCompatibilityMode(true);
newXPath.compile(expression);
Updated by Michael Kay 3 months ago
I ran the following Java program against both 12.3 and 12.4:
public static void main(String[] args) throws InterruptedException {
    try {
        XPathFactoryImpl saxon = new XPathFactoryImpl();
        XPath newXPath = saxon.newXPath();
        XPathExpression exp = newXPath.compile("//li[contains(text(), 'Text 3')]");
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Node root = builder.parse(new File("/Users/mike/...../test.xml"));
        NodeList nodeList = (NodeList) exp.evaluate(root, XPathConstants.NODESET);
        System.out.println(nodeList.getLength());
    } catch (XPathExpressionException | ParserConfigurationException | SAXException | IOException e) {
        throw new RuntimeException(e);
    }
}
In both cases it failed, as expected, saying
Exception in thread "main" net.sf.saxon.trans.UncheckedXPathException
Caused by: net.sf.saxon.trans.XPathException: A sequence of more than one item is not allowed as the first argument of fn:contains() ("
Text 1
", "
")
If I change it to set 1.0 compatibility mode as described above, it outputs "1" under either 12.3 or 12.4.
These results are all correct according to the spec.
If you still think there is a problem, please supply a repro that shows exactly what you are doing so that I can reproduce your results.
Updated by Michael Kay 3 months ago
- Status changed from New to AwaitingInfo
- Assignee set to Michael Kay
- Applies to branch 12 added
Updated by Michael Kay about 1 month ago
- Status changed from AwaitingInfo to Closed
Closing this as it has gone dormant.