Bug with Operator for Date Subtraction (...
Added by Anonymous almost 14 years ago
Legacy ID: #9062436 Legacy Poster: smootz (smootz)
In both the HE and EE versions of Saxon, using the subtraction operator ('-') on two dates results in the following error message:

XPTY0004: An error occurred matching pattern {pattern matching node()}: Arithmetic operator is not defined for arguments of types (xs:double, xs:date)

The template matching predicate that is generating this error is as follows:

someNode[days-from-duration(@date - current-date()) < 0]

(Note that @date is of type xs:date)

In the EE version (which was only downloaded and used with the 30 day trial license to verify the same behavior existed), I attempted to enforce a schema-aware transformation by using the -sa flag, but that produced the same results. Interestingly enough, however, using the -val option (with either lax or strict) produces the anticipated results...that is, a duration is returned to the days-from-duration function, which results in the integer value needed for comparison.

The point is that this should not be something that works as expected only when schema-awareness is turned on; rather, it should work as anticipated without schema-awareness.
Replies (13)
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous almost 14 years ago
Legacy ID: #9062650 Legacy Poster: Michael Kay (mhkay)
In the absence of a schema, the typed value of an attribute is always xs:untypedAtomic, regardless of the fact that its value might look like a date and its name might be "date". In a subtraction, an xs:untypedAtomic value is always treated as an xs:double (again, regardless of its lexical form, and regardless of the type of the other operand). Sorry, but that's what the spec says, and that's what Saxon does. You have two choices: create a typed document by validating against a schema, or convert the attribute to a date by hand (xs:date(@date)).
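The rule described above can be sketched as a toy model in Python. This is not Saxon code: the UntypedAtomic class, the xpath_type and subtract functions, and the fixed date standing in for current-date() are hypothetical names invented for illustration, and only the xs:date and xs:double cases are modelled.

```python
from datetime import date


class UntypedAtomic(str):
    """Stand-in for xs:untypedAtomic: a lexical form with no schema type."""


def xpath_type(value):
    """Type the '-' operator sees for an operand (toy subset of the rules)."""
    if isinstance(value, UntypedAtomic):
        # Untyped operands are treated as xs:double, whatever they look like.
        return "xs:double"
    if isinstance(value, date):
        return "xs:date"
    if isinstance(value, (int, float)):
        return "xs:double"
    raise TypeError(f"type not modelled: {type(value).__name__}")


def subtract(a, b):
    """Toy version of 'a - b' for the two operand types modelled here."""
    types = (xpath_type(a), xpath_type(b))
    if types == ("xs:date", "xs:date"):
        return a - b  # a timedelta, standing in for a duration
    if types == ("xs:double", "xs:double"):
        return float(a) - float(b)
    raise TypeError(
        "XPTY0004: Arithmetic operator is not defined for arguments "
        f"of types ({types[0]}, {types[1]})"
    )


today = date(2011, 6, 1)                 # hypothetical stand-in for current-date()
attr = UntypedAtomic("2011-05-20")       # @date read from an unvalidated document

try:
    subtract(attr, today)                # models @date - current-date()
except TypeError as err:
    print(err)

# Converting by hand, as with xs:date(@date), selects the date subtraction.
print(subtract(date.fromisoformat(str(attr)), today).days)  # -12
```

The first call fails because the untyped operand is classified as xs:double before its lexical form is ever looked at; the second succeeds because the operand has been given a date type explicitly.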
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous almost 14 years ago
Legacy ID: #9385828 Legacy Poster: smootz (smootz)
Understood...thank you for the explanation, it is greatly appreciated. I just wanted to follow up with you regarding one aspect from my original post that you didn't comment on (the -sa flag vs. the -val flag). I would expect the -sa flag to treat the value as an xs:date (because it's enforcing schema-awareness). This is not the case, as I am still receiving the XPTY0004 message. However, when using the -val flag (either strict or lax), which automatically switches on the -sa option, the transformation works as expected. So, what's the difference between these two flags, and why are things working as expected using -val but not by using -sa?
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous almost 14 years ago
Legacy ID: #9386034 Legacy Poster: Michael Kay (mhkay)
Historically, the -sa flag meant "Use Saxon-SA, ensure it is licensed, and enable its functionality". The meaning has shifted somewhat; it now means "compile this query or stylesheet so that it is capable of dealing with typed input documents as well as untyped documents". It does not actually say "validate the primary input document". Perhaps it would be better if it did, but I'm reluctant to make that change. (It's reasonable to validate some input documents and not others. It's possible to control which documents are validated by using xsl:type and xsl:validate within the stylesheet itself. For example, the stylesheet might have a preprocessing phase that it has to run before doing validation.)

The -val flag means "validate all input documents". This automatically switches on -sa, since if input documents are going to be typed, then the query/stylesheet had better be compiled to deal with this.

So -sa means that the query/stylesheet can handle typed data; -val says that the primary input, and documents read using doc(), should be validated. They're subtly different.
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10431517 Legacy Poster: smootz (smootz)
Since the time of my original post, we have purchased licensing for the EE version of Saxon and have come across another issue (maybe two) that are semi-related to this issue. Since we are operating using the schema awareness flag (-sa), I would expect that the stylesheets would be able to determine the type of a value from the schema. However, this does not appear to be the case (at least when using the abbreviated comparison operators).

Let me provide a quick example/test case. Let's say I have an element, TempRange, with attributes min and max, both defined in mySchema as a simpleType with a decimal restriction base (totalDigits is 6, fractionDigits is 2), and have the following XML:

[code]
...
<TempRange min="25" max="30.01"/>
...
[/code]

When attempting the following comparison in a template matching rule, the template is not triggered:

mySchema:TempRange[@min gt 30.0 or @max gt 30.0]

However, all else being equal, the following template matching rules are both triggered:

mySchema:TempRange[@min > 30.0 or @max > 30.0]
mySchema:TempRange[xs:decimal(@min) gt 30.0 or xs:decimal(@max) gt 30.0]

So I suppose that I can ask my question in two ways (or perhaps they are separate questions):

1. What's the difference between 'gt' and '>' that's causing the latter to compare correctly using the schema types that are defined and the former to require casting the attributes to xs:decimal explicitly?

2. Why doesn't operating with the schema awareness flag make the stylesheets implicitly use the type definitions from the schema? (Of course, this question assumes that there are no bugs with the abbreviated comparison operators - i.e., if the answer to #1 is that there's a bug with 'gt' because it should behave the same as '>' and use the schema types when the -sa flag is present, then this question becomes null and void.)

As always, thank you for your time and expertise :-)
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10432167 Legacy Poster: Michael Kay (mhkay)
1. What's the difference between 'gt' and '>' that's causing the latter to compare correctly

Answer: the > operator converts an untyped atomic operand to the type of the other operand, whereas gt treats untyped atomic as string. This is strong evidence that your input attribute is labelled as untypedAtomic, meaning that the source document has not been validated. Did you specify -val on the command line?

2. Why doesn't operating with the schema awareness flag make the stylesheets implicitly use the type definitions from the schema?

Answer: I think I explained this in my earlier replies. The -sa option forces the stylesheet to be compiled with schema-awareness, but it does not force validation of input documents; for that you need the -val flag. (There are cases where you want a schema-aware stylesheet to operate on unvalidated input, for example where the purpose of the stylesheet is to convert an invalid document to a valid one.)
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10434883 Legacy Poster: smootz (smootz)
I guess what I'm not understanding is why a document needs to be validated in order for the processor to treat the types as they are defined in the schema when schema awareness is being specified?
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10435017 Legacy Poster: smootz (smootz)
I think I found the explanation I needed in your documentation at [url]http://www.saxonica.com/documentation/expressions/comparisons.xml[/url]. Specifically, the comment "...Saxon currently uses its string value in the comparison, not its typed value as required by the XPath 2.0 specification." Are there any future plans to resolve this issue and make Saxon compliant with the XPath specification in this regard?
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10435274 Legacy Poster: Michael Kay (mhkay)
1. I guess what I'm not understanding is why a document needs to be validated in order for the processor to treat the types as they are defined in the schema when schema awareness is being specified?

Because validation is the process that associates nodes in the document with declarations in the schema. Without validation, Saxon has no idea whether there is any relationship between an attribute called "date" in your source document and a declaration of an attribute called "date" in your schema. Remember your schema might define lots of attributes called "date", all with different types...

2. I think I found the explanation I needed in your documentation...

No, sorry, that sentence is long obsolete (now fixed). Saxon is 100% conformant with the specs in this area.
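The point about validation linking nodes to declarations can be illustrated with a deliberately simplified Python sketch. Everything here is hypothetical: the SCHEMA table, the Invoice and Reading element names, and the typed_value function are invented for illustration. Real schema validation walks the document against the schema's declarations rather than doing a flat lookup.

```python
from datetime import date
from decimal import Decimal

# Hypothetical schema: two different attributes that are both called "date",
# declared with different types on different elements.
SCHEMA = {
    ("Invoice", "date"): date.fromisoformat,  # imagined as declared xs:date
    ("Reading", "date"): Decimal,             # imagined as declared xs:decimal
}


def typed_value(element, attribute, lexical, validated):
    """Return the value an expression would see for element/@attribute."""
    if not validated:
        # No validation pass has linked this node to any declaration, so the
        # value stays untyped (a plain str stands in for xs:untypedAtomic).
        return lexical
    # Validation is what associates the node with one specific declaration.
    return SCHEMA[(element, attribute)](lexical)


print(repr(typed_value("Invoice", "date", "2011-05-20", validated=False)))
print(repr(typed_value("Invoice", "date", "2011-05-20", validated=True)))
print(repr(typed_value("Reading", "date", "20110520", validated=True)))
```

The attribute name alone does not determine a type in this model; only the validation step, which knows which declaration applies to which node, does.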
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10435514 Legacy Poster: smootz (smootz)
Can you point me to a specification reference that details this should be the behavior of an XPath 2.0 compliant processor so that I may review it with my colleagues (either where it is documented that 'gt' and '>' should have different behaviors, or where it is documented that a schema aware processor should not use the type definitions of the referenced schema)? According to the specification documentation I've reviewed, it looks like Saxon should be using type promotion and subtype substitution to perform the appropriate comparison when using 'gt' with two numeric types (as it does when using '>').

For reference, here's the spec documentation that I've reviewed:

6.3 Comparison Operators on Numeric Values ([url]http://www.w3.org/TR/2007/REC-xpath-functions-20070123/#comp.numeric[/url]) states that "if the arguments are of different types, one argument is promoted to the type of the other as described above in 6.2 Operators on Numeric Values."

So following the reference to section 6.2 ([url]http://www.w3.org/TR/2007/REC-xpath-functions-20070123/#op.numeric[/url]), we find the statement "if the two operands are not of the same type, subtype substitution and numeric type promotion are used to obtain two operands of the same type...Section B.1 Type Promotion and Section B.2 Operator Mapping describe the semantics of these operations in detail."

Following the reference to section B.1, "type promotion is used in evaluating function calls and operators that accept numeric or string operands (see B.2 Operator Mapping)."

...and following the reference to B.2 ([url]http://www.w3.org/TR/xpath20/#mapping[/url])...

"A numeric operator may be validly applied to an operand of type AT if type AT can be converted to any of the four numeric types by a combination of type promotion and subtype substitution. If the result type of an operator is listed as numeric, it means "the first type in the ordered list (xs:integer, xs:decimal, xs:float, xs:double) into which all operands can be converted by subtype substitution and type promotion." As an example, suppose that the type hatsize is derived from xs:integer and the type shoesize is derived from xs:float. Then if the + operator is invoked with operands of type hatsize and shoesize, it returns a result of type xs:float. Similarly, if + is invoked with two operands of type hatsize it returns a result of type xs:integer."

...along with the following table snippet of interest...

Operator   Type(A)   Type(B)   Function                        Result type
...        ...       ...       ...                             ...
A gt B     numeric   numeric   op:numeric-greater-than(A, B)   xs:boolean
...        ...       ...       ...                             ...

I realize that the last section states that "if the result type of an operator is listed as numeric, it means the first type in the ordered list (xs:integer, xs:decimal, xs:float, xs:double) into which all operands can be converted by subtype substitution and type promotion" and the table clearly shows the result type as a boolean, but are you telling me that the same rules of type promotion and subtype substitution no longer apply when comparison operators are used?
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10435883 Legacy Poster: Michael Kay (mhkay)
The gt operator is a ValueComparison, so the rules are here: http://www.w3.org/TR/xpath20/#id-value-comparisons

The > operator is a GeneralComparison, so the rules are here: http://www.w3.org/TR/xpath20/#id-general-comparisons

Both steps perform Atomization, which gets the typed value of the node. The typed value depends (in spec-speak) on whether the XDM instance was constructed from an infoset or a PSVI (in real language, whether the source document was validated or not): it will be untypedAtomic in the first case, numeric (or whatever) in the second. The details of this are in the XDM data model specification.

Rule 4 of ValueComparisons says "If the atomized operand is of type xs:untypedAtomic, it is cast to xs:string."

While Rule 2b of General Comparisons (with 1.0 mode off) says "If exactly one of the atomic values is an instance of xs:untypedAtomic, it is cast to a type depending on the other value's dynamic type T according to the following rules..."
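The two casting rules quoted above can be sketched as a toy model in Python, using the TempRange values from earlier in the thread. This is not Saxon's implementation: UntypedAtomic, value_gt, general_gt and the helper functions are hypothetical names, only single items of string or numeric type are modelled, and the model assumes that an untyped operand facing a numeric one is cast to a double (the general-comparison rule for numeric T as understood here, not quoted in the thread).

```python
from decimal import Decimal

NUMERIC = (int, float, Decimal)


class UntypedAtomic(str):
    """Stand-in for xs:untypedAtomic: a lexical form with no schema type."""


def _compare_gt(a, b):
    """Compare two typed values; mixing string and numeric is a type error."""
    if isinstance(a, str) and isinstance(b, str):
        return a > b
    if isinstance(a, NUMERIC) and isinstance(b, NUMERIC):
        return float(a) > float(b)
    raise TypeError("XPTY0004: cannot compare a string with a number")


def value_gt(a, b):
    """Toy 'gt': every untyped operand is cast to a string first."""
    a = str(a) if isinstance(a, UntypedAtomic) else a
    b = str(b) if isinstance(b, UntypedAtomic) else b
    return _compare_gt(a, b)


def _cast_like(untyped, other):
    """Cast an untyped value according to the other operand's type."""
    return float(untyped) if isinstance(other, NUMERIC) else str(untyped)


def general_gt(a, b):
    """Toy '>' for single items: an untyped operand adapts to the other one."""
    a_untyped = isinstance(a, UntypedAtomic)
    b_untyped = isinstance(b, UntypedAtomic)
    if a_untyped and b_untyped:
        a, b = str(a), str(b)
    elif a_untyped:
        a = _cast_like(a, b)
    elif b_untyped:
        b = _cast_like(b, a)
    return _compare_gt(a, b)


limit = Decimal("30.0")
at_min, at_max = UntypedAtomic("25"), UntypedAtomic("30.01")  # unvalidated input

print(general_gt(at_min, limit) or general_gt(at_max, limit))  # True

try:
    value_gt(at_max, limit)
except TypeError as err:
    print(err)

# With typed values (validated input or an explicit xs:decimal cast) gt works.
print(value_gt(Decimal("30.01"), limit))  # True
```

In this model the same unvalidated attribute succeeds under the general comparison and fails under the value comparison, which mirrors the difference between the '>' and 'gt' templates reported earlier in the thread.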
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10436070 Legacy Poster: smootz (smootz)
I appreciate the references; however, that still seems a little hokey (the spec itself, not Saxon)...is there at least a good/logical reason for having differing implementations for gt and >? I get that they're different now, but I'm having a hard time understanding why there's such a difference.
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10436754 Legacy Poster: Michael Kay (mhkay)
"I'm having a hard time understanding why there's such a difference."

XPath 1.0 defined the "<" family of operators. It was designed for handling unstructured untyped documents for use in the same kind of environment as Javascript, so the philosophy was dynamic typing, avoiding run-time errors, and generally trying to do the right thing in the face of unpredictable and possibly invalid data.

Then XQuery came along with its background in databases and query languages, which is an environment where static analysis and optimization are all-important, and hence strict/static typing. The folks from this culture looked at the XPath comparison operators with horror because it's very hard to support them well with indexes - and so they invented a new set.

In practice of course, people use the operators like "=" and so implementors have to find a way of making them work. Using indexes to support "=" with its strange dynamic semantics is a tough challenge, but it's certainly possible and in my view adding the second set of operators was a mistake - but committees make lots of mistakes.
RE: Bug with Operator for Date Subtraction (... - Added by Anonymous over 13 years ago
Legacy ID: #10440240 Legacy Poster: smootz (smootz)
Okay, that clears it up for me. I appreciate the time you've taken to provide me with the background information I needed on these issues. Thank you.