
question on path expression

Added by Anonymous about 19 years ago

Legacy ID: #3390807 Legacy Poster: marcvc (marcvc)

Michael, we're seeing some weird behaviour with a number of path expressions. It may well be that we're overlooking something.

Consider a test.xml document as follows:

    <a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>

And the following XQuery:

    declare variable $x1 := <a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>;
    let $x2 := <a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>
    let $x3 := doc("test.xml")
    return <result>{
      for $c in $x1//c return $c/d,
      for $c in $x2//c return $c/d,
      for $c in $x3//c return $c/d
    }</result>

It evaluates as follows; note that the d elements of the three inner FLWOR expressions are not ordered the same way:

    <result>
      <d a="1"/>
      <d a="3"/>
      <d a="2"/>
      <d a="1"/>
      <d a="2"/>
      <d a="3"/>
      <d a="1"/>
      <d a="2"/>
      <d a="3"/>
    </result>

Thanks, Marc


Replies (3)


RE: question on path expression - Added by Anonymous about 19 years ago

Legacy ID: #3390891 Legacy Poster: Michael Kay (mhkay)

Thanks - an interesting one. As I imagine you realize, the correct order is 1,3,2. Saxon is applying an optimization that rewrites

    for $i in a/b/c return $i/d

as

    a/b/c/d

and in this case the two aren't equivalent: the rewrite causes an incorrect sort into document order to be added.

I need to think (a) about whether the rewrite is actually useful (I think it is, as a step along the way to recognizing joins that can be optimized), and (b) about how the preconditions for the rewrite should be refined. As a first guess, the rewrite is only valid if it produces a path expression that is "naturally sorted", i.e. one that satisfies the tests Saxon applies to avoid doing an extra sort and deduplication step.

Alternatively, something I've thought about doing for a while is rewriting a/b as sort(a\b), where \ is an operator that's like / except for the sorting-and-deduplication semantics. Long term that would help the robustness of the code a lot. But in the short term, I'm a bit scared of destabilising path expressions, given that the optimizations are very sensitive to small changes in the expression tree.
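To make the difference concrete, here is a minimal sketch in plain Java using the JDK's DOM API rather than Saxon. It is only an illustration of the semantics described above, not Saxon's implementation: the class PathOrderDemo and its methods stepWithoutSort and sortAndDedup are hypothetical names. stepWithoutSort plays the role of the for/return form (or the proposed \ operator), and sortAndDedup adds the document-order sort and deduplication that a path expression implies.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class; not part of Saxon.
class PathOrderDemo {
    // The test document from the original question.
    static final String XML =
        "<a><b><c><d a='1'/><c><d a='2'/></c><d a='3'/></c></b></a>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Mimics "for $c in //c return $c/d": visit each c in document order
    // and concatenate its d children, with no sort across iterations.
    static List<Element> stepWithoutSort(Document doc) {
        List<Element> out = new ArrayList<>();
        NodeList cs = doc.getElementsByTagName("c"); // preorder = document order
        for (int i = 0; i < cs.getLength(); i++) {
            for (Node n = cs.item(i).getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element && "d".equals(n.getNodeName())) {
                    out.add((Element) n);
                }
            }
        }
        return out;
    }

    // Mimics the extra step a path expression such as //c/d implies:
    // remove duplicate nodes, then sort into document order.
    static List<Element> sortAndDedup(List<Element> nodes) {
        List<Element> out = new ArrayList<>(new LinkedHashSet<>(nodes));
        out.sort((x, y) -> x == y ? 0
            : (x.compareDocumentPosition(y) & Node.DOCUMENT_POSITION_FOLLOWING) != 0 ? -1 : 1);
        return out;
    }

    // Extracts the "a" attribute of each element for easy comparison.
    static List<String> attrs(List<Element> nodes) {
        List<String> out = new ArrayList<>();
        for (Element e : nodes) {
            out.add(e.getAttribute("a"));
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        List<Element> raw = stepWithoutSort(doc);
        System.out.println(attrs(raw));               // [1, 3, 2]
        System.out.println(attrs(sortAndDedup(raw))); // [1, 2, 3]
    }
}
```

The first list matches the order the FLWOR expression should produce; the second is what results once the sort into document order is applied, which is why the rewrite is only safe when the combined path would already be in document order without that sort.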

RE: question on path expression - Added by Anonymous about 19 years ago

Legacy ID: #3390892 Legacy Poster: Michael Kay (mhkay)

For my own reference this is test case qxmp299.

RE: question on path expression - Added by Anonymous about 19 years ago

Legacy ID: #3390908 Legacy Poster: Michael Kay (mhkay)

The check that the new path expression is naturally sorted seems to do the trick: that is, changing the code at line 170 of ForExpression to:

    if (declaration != null && positionVariable==null &&
            sequence instanceof PathExpression && action instanceof PathExpression) {
        int count = declaration.getReferenceCount(this, env);
        PathExpression path2 = (PathExpression)action;
        Expression s2 = path2.getStartExpression();
        if (count == 1 && s2 instanceof VariableReference &&
                ((VariableReference)s2).getBinding() == this) {
            PathExpression newPath = new PathExpression(sequence, path2.getStepExpression());
            if ((newPath.getSpecialProperties() & StaticProperty.ORDERED_NODESET) != 0) {
                return newPath.simplify(env).typeCheck(env, contextItemType).optimize(opt, env, contextItemType);
            }
        }
    }

Not yet regression tested.
