Project

Profile

Help

Bug #5867

closed

Unecessary warning due to out of order match pattern resolution

Added by Michael Ihde almost 2 years ago. Updated over 1 year ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Internals
Sprint/Milestone:
-
Start date:
2023-02-02
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
11, 12, trunk
Fix Committed on Branch:
11, 12, trunk
Fixed in Maintenance Release:
Platforms:
.NET, Java

Description

A template match of the form foo/bar[dangerousPredicate] can evaluate out of order (which is rather cool). But this means it can first evaluate the dangerous predicate against some bar, even if it's parent is not foo, and encounter some error in the dangerous predicate then emit a warning. I believe this warning should not be emitted in such circumstances unless the preceding checks also match (in this case that bar has parent foo). I feel like such out of order warning handling has already been dealt with before but perhaps something was just missed here.

Code to reproduce the warning running JDK 11:

import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.Xslt30Transformer;
public class SaxonTest {
	static final String TRANSFORM_DOC = "" //
	+ "<xsl:transform version=\"3.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" xmlns:map=\"http://www.w3.org/2005/xpath-functions/map\">" //
	+ "    <xsl:variable name=\"myMap\" as=\"map(xs:string, xs:string)\"><xsl:map/></xsl:variable>" //
	+ "    <xsl:template match=\"foo/bar[map:contains($myMap, @key)]\"/>" //
	+ "</xsl:transform>"; //

	static final String XML_DOC = "<bar/>";

	public static void main(String[] args) throws SaxonApiException {
		Processor processor = new Processor(false);
		try (StringReader xsltIn = new StringReader(TRANSFORM_DOC); StringReader xmlIn = new StringReader(XML_DOC);) {
			Xslt30Transformer transformer = processor.newXsltCompiler().compile(new StreamSource(xsltIn)).load30();
			transformer.applyTemplates(new StreamSource(xmlIn), processor.newSerializer(System.out));
		}
	}
}

Emits this warning:

Warning at char 7 in xsl:template/@match on line 1 column 328 
  XPTY0004  An error occurred matching pattern
  {element(Q{}bar)[Q{http://www.w3.org/2005/xpath-functions/map}contains($myMap,
  atomizeSingleton(attribute::attribute(Q{}key)))]}: An empty sequence is not allowed as the
  second argument of map:contains()
Actions #1

Updated by Michael Kay almost 2 years ago

The problem with this is that you can get match patterns like

chapter[count(.//word) > 50000]//para[starts-with(., 'The')]

where we have to try and make an intelligent decision on the best order of evaluation; there's nothing in the spec to say that the ancestor/parent conditions should be evaluated before the predicate conditions, and sometimes it can be much more efficient to evaluate the predicate first.

That leaves the question of whether to produce a warning if evaluation of the predicate fails, and I think it's best to do so because otherwise it's completely mystifying why the pattern didn't match. I'd suggest in your example rewriting it to avoid the failure, for example

match="foo/bar[exists(@key) and map:contains($myMap, @key)]

There's been a lot of discussion recently about order of evaluation of predicates or of terms in an "and" expression, for example if you write

match="bar[parent::foo and exists(@key) and map:contains($myMap, @key)]"

and in Saxon 12 we might still rearrange the order of terms, but we won't report an error if any one term fails and another term returns false. I guess that's essentially what you're asking for here...

Actions #2

Updated by Michael Ihde almost 2 years ago

My apologies, if there's nothing in the spec regarding emitting these warnings vs short circuiting them then I guess this is not a bug at all, it just seemed wrong at the moment I encountered it.

I realized the out of order evaluation was for optimization, but now I'm thinking that evaluating more of the expression just to know if a warning should be emitted could be a lot more checks...

You're certainly right that it's better to report warnings than not when something fails. And better to focus on the optimization than trying to stop an unnecessary warning from escaping, though there's the consideration that certain warning handling could be slower than another simple check (for me it can write to a log), but how could Saxon know...

I think the heart of this is that there's a difference in knowledge between me (the XSLT author) and Saxon here.

I know that the bars which are children of foo will always have a @key and can assume that map:contains($myMap, @key) is safe but Saxon can't know that without a schema (if one even exists) or by checking the "preceding" predicates to short circuit and avoid the warning.

I agree that your suggestion of match="foo/bar[exists(@key) and map:contains($myMap, @key)] is probably the right approach, I assume there's some implementation magic that makes exists(@key) evaluated first.

The discussion on how to weigh predicates and their terms to determine order of evaluation is certainly interesting though. I look forward to following any future developments.

Thanks!

Actions #3

Updated by Michael Kay almost 2 years ago

Although there is no non-conformance here, the optimizer isn't doing a great job. It is computing the expected cost of the two expressions parent::foo and self::bar[map:contains($map, @key)] and has assigned them both the default cost of 1. Clearly the second expression is more expensive and the calculation needs to be refined.

(Note, ideally the calculation would not only estimate the cost of the expression, but also estimate its effectiveness as a filter. This is a case where the learning strategy that we're starting to use in 12.0 could play a role.)

Actions #4

Updated by Michael Kay almost 2 years ago

No, that's a misdiagnosis. The optimizer's calculation is initially correct, but the pattern then gets copied as a result of a tree rearrangement, and in the course of this copying, the result of the calculation gets lost.

It's a one-line fix to retain the result through the copy operation (affects AncestorQualifiedPattern.copy())

Actions #5

Updated by Michael Kay almost 2 years ago

  • Category set to Internals
  • Status changed from New to Resolved
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal
  • Applies to branch 11, trunk added
  • Fix Committed on Branch 11, 12, trunk added
  • Platforms .NET added
Actions #6

Updated by Michael Ihde almost 2 years ago

Excellent! Thank you so much Michael.

Actions #7

Updated by O'Neil Delpratt over 1 year ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 12.1 added

Bug fix applied in the Saxon 12.1 maintenance release.

Please register to edit this issue

Also available in: Atom PDF