question on path expression

Added by Anonymous over 18 years ago

Legacy ID: #3390807 Legacy Poster: marcvc (marcvc)

Michael,

We're seeing some weird behaviour with a number of path expressions. It may well be that we are overlooking something.

Consider a test.xml document as follows:

    <a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>

And the following XQuery:

    declare variable $x1 := <a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>;
    let $x2 := <a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>
    let $x3 := doc("test.xml")
    return <result>{
        for $c in $x1//c return $c/d,
        for $c in $x2//c return $c/d,
        for $c in $x3//c return $c/d
    }</result>

It evaluates as follows. Note that the d elements produced by the three inner FLWOR expressions are not ordered the same way:

    <result>
        <d a="1"/> <d a="3"/> <d a="2"/>
        <d a="1"/> <d a="2"/> <d a="3"/>
        <d a="1"/> <d a="2"/> <d a="3"/>
    </result>

Thanks,
Marc


Replies (3)


RE: question on path expression - Added by Anonymous over 18 years ago

Legacy ID: #3390891 Legacy Poster: Michael Kay (mhkay)

Thanks - an interesting one. As I imagine you realize, the correct order is 1,3,2: the for loop visits the outer c first, which yields d 1 and d 3, and only then the inner c, which yields d 2.

Saxon is applying an optimization that rewrites

    for $i in a/b/c return $i/d

as

    a/b/c/d

and in this case the two aren't equivalent: the rewrite causes an incorrect sort into document order to be added.

I need to think (a) about whether the rewrite is actually useful (I think it is, as a step along the way to recognizing joins that can be optimized), and (b) about how the preconditions for the rewrite should be refined. As a first guess, the rewrite is only valid if it produces a path expression that is "naturally sorted", i.e. one that satisfies the tests Saxon applies to avoid doing an extra sort and deduplication step.

Alternatively, something I've thought about doing for a while is rewriting a/b as sort(a\b), where \ is an operator that's like / but without the sorting-and-deduplication semantics. Long term that would help the robustness of the code a lot. But in the short term, I'm a bit scared of destabilising path expressions, given that the optimizations are very sensitive to small changes in the expression tree.
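To make the difference concrete, here is a minimal sketch using only the JDK's built-in DOM API, not Saxon itself. All class and method names (PathOrderDemo, rawChildStep, sortAndDedup and so on) are hypothetical and exist only for illustration. rawChildStep plays the role of the for loop (or of the proposed \ operator): it concatenates the d children of each c in the order the c elements are visited. sortAndDedup then applies the document-order sort and deduplication that / implies, which is the step the rewrite wrongly introduced.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: plain JDK DOM, not Saxon code.
final class PathOrderDemo {
    // The test document from the original question.
    static final String XML =
        "<a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Stand-in for //name: all matching elements, in document order.
    static List<Element> descendantsNamed(Document doc, String name) {
        NodeList found = doc.getElementsByTagName(name);
        List<Element> out = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            out.add((Element) found.item(i));
        }
        return out;
    }

    // Stand-in for "for $s in starts return $s/name" (or starts\name):
    // results are concatenated in visiting order, with no sort and no dedup.
    static List<Element> rawChildStep(List<Element> starts, String name) {
        List<Element> out = new ArrayList<>();
        for (Element start : starts) {
            NodeList kids = start.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                Node kid = kids.item(i);
                if (kid instanceof Element && name.equals(((Element) kid).getTagName())) {
                    out.add((Element) kid);
                }
            }
        }
        return out;
    }

    // Stand-in for the extra step that "/" implies: remove duplicate nodes,
    // then sort into document order.
    static List<Element> sortAndDedup(List<Element> nodes) {
        List<Element> out = new ArrayList<>(new LinkedHashSet<>(nodes));
        out.sort((x, y) -> x == y ? 0
                : (x.compareDocumentPosition(y) & Node.DOCUMENT_POSITION_FOLLOWING) != 0 ? -1 : 1);
        return out;
    }

    static List<String> attrA(List<Element> nodes) {
        List<String> out = new ArrayList<>();
        for (Element e : nodes) {
            out.add(e.getAttribute("a"));
        }
        return out;
    }

    // Order produced by the for loop: for $c in //c return $c/d
    static List<String> forLoopOrder() throws Exception {
        Document doc = parse(XML);
        return attrA(rawChildStep(descendantsNamed(doc, "c"), "d"));
    }

    // Order produced by the single path expression: //c/d
    static List<String> pathOrder() throws Exception {
        Document doc = parse(XML);
        return attrA(sortAndDedup(rawChildStep(descendantsNamed(doc, "c"), "d")));
    }

    public static void main(String[] args) throws Exception {
        System.out.println("for loop: " + forLoopOrder()); // [1, 3, 2]
        System.out.println("path:     " + pathOrder());    // [1, 2, 3]
    }
}
```

The two results differ only because the inner c sits between d 1 and d 3. When the raw result already comes out in document order with no duplicates, the sort changes nothing, and that is the situation in which replacing the for loop with a single path expression is safe.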

RE: question on path expression - Added by Anonymous over 18 years ago

Legacy ID: #3390892 Legacy Poster: Michael Kay (mhkay)

For my own reference this is test case qxmp299.

RE: question on path expression - Added by Anonymous over 18 years ago

Legacy ID: #3390908 Legacy Poster: Michael Kay (mhkay)

The check that the new path expression is naturally sorted seems to do the trick: that is, changing the code at line 170 of ForExpression to:

    if (declaration != null && positionVariable==null &&
            sequence instanceof PathExpression && action instanceof PathExpression) {
        int count = declaration.getReferenceCount(this, env);
        PathExpression path2 = (PathExpression)action;
        Expression s2 = path2.getStartExpression();
        if (count == 1 && s2 instanceof VariableReference &&
                ((VariableReference)s2).getBinding() == this) {
            PathExpression newPath = new PathExpression(sequence, path2.getStepExpression());
            if ((newPath.getSpecialProperties() & StaticProperty.ORDERED_NODESET) != 0) {
                return newPath.simplify(env).typeCheck(env, contextItemType).optimize(opt, env, contextItemType);
            }
        }
    }

Not yet regression tested.
