to find out in general which line in the source xml causes the SaxonApiException

Added by luba zlatin over 7 years ago

How can I find out, in general, which line in the source XML file causes the SaxonApiException thrown by XsltTransformer.transform()?


Replies (10)

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by Michael Kay over 7 years ago

When a dynamic error occurs during the processing of a stylesheet, the exception will generally contain location information relating to the position in the stylesheet of the instruction that failed, which you can get by calling getSystemId() and getLineNumber() on the exception object. In some cases the XPathException wrapped by the SaxonApiException will also contain an XPathContext object, which contains information about the context item, so you can try something like:

Exception cause = saxonApiException.getCause();
if (cause instanceof XPathException) {
            XPathContext cxt = ((XPathException) cause).getContext();
            Item item = cxt == null ? null : cxt.getContextItem();
            if (item instanceof NodeInfo) {
               return ((NodeInfo)item).getLineNumber();
            }
}

You will need to enable line numbering (-l on the command line). And it's likely this will work for some errors and not others.

If there is an XPathContext object, then the static method

StandardErrorListener.printStackTrace(logger, context)

can also be useful (though if you use the standard error listener, Saxon will call it anyway).

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by luba zlatin over 7 years ago

Thank you for the fast reply. I got the following SaxonApiException: net.sf.saxon.s9api.SaxonApiException: Invalid date "ttt" (Year is less than four digits). getSystemId() returns an empty string and getLineNumber() returns 24, while the real problem is on line number 103. Its cause is a ValidationException. All the functions I tried on the cause return either null or an empty string.

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by Michael Kay over 7 years ago

If you can send us a concrete example then we can see whether any improvements to diagnostics work for that particular case.

With this kind of error it depends very much where it occurs: for example if you are initializing a global variable with a call to xs:date supplying a string, then we probably don't know where in the source document the string came from, if indeed it came from the source document at all.

Equally, tracking the line number in the stylesheet is an inexact science because of rewrites like variable and function inlining, though we try hard to get it right. But conversions and comparisons take place in code that doesn't have access to a specific stylesheet context, so we rely on catching the error as it bubbles up to a point where we do have that information, and adding the location information at that point - which is not always accurate, again because of effects like lazy evaluation of variables.

RE: to find out in general invalid line. May I rely on the fact that the last XSLTTraceListener line consists of the problematic line while SaxonApiException is thrown? - Added by luba zlatin over 7 years ago

I am waiting for approval from my boss to send you data as an example. Meanwhile I am trying to use the information received from StandardErrorListener.printStackTrace(logger, context).

First I erased the date and got the following message: empty sequence is not allowed as the first argument of mxf:format-date(). The path from StandardErrorListener.printStackTrace(logger, context) was: at xsl:apply-templates (#1) processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]/text()[1].

Next I put an invalid date at the same path and got the following message: Invalid date "ttt" (Year is less than four digits). The path from StandardErrorListener.printStackTrace(logger, context) was: at xsl:apply-templates (#1) processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1].

The absence of /text() from the path in the second case does not allow me to find the template with the matching match attribute in the XSLT. Unlike StandardErrorListener, XSLTTraceListener has the same path in its last line in both cases:

My question is : "May I rely on the fact that the last XSLTTraceListener line consists of the problematic line while SaxonApiException is thrown?"

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by luba zlatin over 7 years ago

Dear Michael, my example is attached. refProduct.xml is used as an xsltTransformer parameter:

    XdmNode paramNode = documentBuilder.build(new StreamSource(new StringReader(referenceProduct)));
    xsltTransformer.setParameter(new QName("refProdDoc"), paramNode);

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by Michael Kay over 7 years ago

Running this from the command line as

java net.sf.saxon.Transform -t -xsl:schema.xsl -s:source.xml +refProdDoc=refProduct.xml

I get:

Error at char 16 in xsl:value-of/@select on line 22 column 127 of schema.xsl:
  FORG0001: Invalid date "ttt" (Year is less than four digits)
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]/text()[1]
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]/issueDateMeta[1]
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#14)
     processing /product

Looking at line 22 of the stylesheet we see the xsl:value-of instruction in:


    

and the actual failure is in the mxf:format-date which does:


    
    
 

So this function takes a string as input, constructs a text node, atomizes the text node to create an untyped atomic value, and then implicitly converts the untyped atomic value to a date. A more direct way of writing this would be:


    
    
 

But the effect is the same: within the function, at the point of failure, there is no knowledge of where the invalid string came from. The stacktrace reports the context item in the source document in the hope that this might bear some relation to the origin of the invalid data, but in this case it does not.
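When the reported path does correspond to the offending node, it can be mapped back to a line in the source file without Saxon's help. The following is a minimal sketch, assuming a source document without namespaces and using only the JDK's SAX parser. The class name PathLineIndex and its methods are hypothetical, and the only path syntax handled is the /a/b[1]/c[1] form seen in the stack trace above (a trailing /text()[n] step is stripped).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: indexes every element of a document by a path of the
// form /root/child[1]/grandchild[2] and remembers the line it was seen on.
class PathLineIndex extends DefaultHandler {
    private final Map<String, Integer> lineByPath = new HashMap<>();
    private final Deque<String> paths = new ArrayDeque<>();
    private final Deque<Map<String, Integer>> siblingCounts = new ArrayDeque<>();
    private Locator locator;

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startDocument() {
        paths.push("");                       // path of the document node
        siblingCounts.push(new HashMap<>());  // counters for top-level children
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Position of this element among same-named siblings (1-based).
        int position = siblingCounts.peek().merge(qName, 1, Integer::sum);
        boolean isRoot = paths.size() == 1;
        // The stack trace shows the root element without a positional predicate.
        String path = paths.peek() + "/" + qName + (isRoot ? "" : "[" + position + "]");
        lineByPath.put(path, locator.getLineNumber());
        paths.push(path);
        siblingCounts.push(new HashMap<>());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        paths.pop();
        siblingCounts.pop();
    }

    // Parses the XML text and returns a map from element path to line number.
    static Map<String, Integer> index(String xml) {
        try {
            PathLineIndex handler = new PathLineIndex();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.lineByPath;
        } catch (Exception e) {
            throw new IllegalStateException("Could not index source document", e);
        }
    }

    // Looks up a path from the stack trace; returns null if it is not indexed.
    static Integer lineOf(Map<String, Integer> index, String path) {
        return index.get(path.replaceFirst("/text\\(\\)\\[\\d+\\]$", ""));
    }

    public static void main(String[] args) {
        String xml = "<product>\n"
                + "  <schedule>\n"
                + "    <unadjustedDate>ttt</unadjustedDate>\n"
                + "  </schedule>\n"
                + "</product>";
        Map<String, Integer> index = index(xml);
        System.out.println(lineOf(index, "/product/schedule[1]/unadjustedDate[1]/text()[1]"));
    }
}
```

A SAX Locator reports the position at which the start tag ends, so an element whose start tag is spread over several lines is indexed at the last of those lines. This only helps when the path in the trace really is the origin of the bad value, which, as explained above, is not guaranteed.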

The ideal would be to report the path or line number of the originDate element containing the invalid date. But that's pretty challenging. We could imagine annotating strings internally to say where in a source document they originated, but that's a pretty heavy overhead. Or we could try, when the function call fails, to report what its arguments were as paths to nodes, but again, that's not easy.

If invalid input is something that is going to happen routinely, then perhaps you should be doing a schema-aware transformation that validates the input XML files against a schema?
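As a lighter-weight variant of this idea, the source can be validated before it reaches the transformation, using the JDK's javax.xml.validation API rather than Saxon's schema-aware mode. What follows is a minimal sketch under stated assumptions: the schema is a hypothetical, heavily simplified one in which unadjustedDate is declared as xs:date, and the class name SourcePreValidator and its DEMO_XSD constant are illustrative only. The point is that the validator's SAXParseException carries a line number in the source document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validates source XML against an XSD and collects
// every validation error together with its line number in the source.
class SourcePreValidator {

    // Simplified schema, assumed for illustration only.
    static final String DEMO_XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "  <xs:element name='product'>"
          + "    <xs:complexType><xs:sequence>"
          + "      <xs:element name='unadjustedDate' type='xs:date'/>"
          + "    </xs:sequence></xs:complexType>"
          + "  </xs:element>"
          + "</xs:schema>";

    static List<String> validate(String xml, String xsd) {
        List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    // Recoverable validity error: record it and keep going.
                    problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // not well-formed: give up
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Validation could not be completed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String xml = "<product>\n"
                + "  <unadjustedDate>ttt</unadjustedDate>\n"
                + "</product>";
        for (String problem : validate(xml, DEMO_XSD)) {
            System.out.println(problem);
        }
    }
}
```

Running the validation as a separate step keeps the transformation unchanged, at the cost of parsing the source twice. A validator may report more than one error for the same element.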

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by Michael Kay over 7 years ago

As for the question:

"May I rely on the fact that the last XSLTTraceListener line consists of the problematic line while SaxonApiException is thrown?"

the answer is no.

Firstly, the tracelistener has startCurrentItem() and endCurrentItem() calls which indicate when a node becomes the context item, and when it reverts to the previous context item. If the most recent call was endCurrentItem() then you have to unwind the stack to know what the context item is now.

Secondly, these are not called on all changes of context item, but only on changes initiated by XSLT instructions like apply-templates, for-each, and iterate. In particular they are not called when the context item changes by virtue of the "/" operator. So xsl:value-of select="a/b/c/d" will not notify any changes of context item.

Finally, there is no necessary connection between what you call "the problematic line" and a node becoming the context item. For example, if you bind a variable like:


and then do a cast such as xs:date($data) then there is no way any casting error can be traced back to the path that was used when evaluating the variable - especially if the evaluation is more indirect, e.g. involving string operations on the values of several nodes.
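The first point, unwinding startCurrentItem() and endCurrentItem() notifications, can be illustrated with a small stand-alone sketch. It deliberately does not implement Saxon's TraceListener interface: ContextItemTracker is a hypothetical class, and its two methods take a plain string path where the real callbacks receive an item. It only shows why the last notification seen is not necessarily the current context item.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the bookkeeping a trace listener would need:
// a stack of context items that are currently "open".
class ContextItemTracker {
    private final Deque<String> open = new ArrayDeque<>();
    private String lastEvent = "none";

    // Called when a node becomes the context item.
    void startCurrentItem(String path) {
        open.push(path);
        lastEvent = "start " + path;
    }

    // Called when processing reverts to the previous context item.
    void endCurrentItem(String path) {
        open.pop();
        lastEvent = "end " + path;
    }

    // The context item in force now, or null if none is open.
    String current() {
        return open.peek();
    }

    // The most recent notification, which may refer to an item already closed.
    String lastEvent() {
        return lastEvent;
    }

    public static void main(String[] args) {
        ContextItemTracker tracker = new ContextItemTracker();
        tracker.startCurrentItem("/product");
        tracker.startCurrentItem("/product/schedule[1]");
        tracker.endCurrentItem("/product/schedule[1]");
        // The last line of a trace would mention schedule[1],
        // but the context item in force is /product again.
        System.out.println("last event: " + tracker.lastEvent());
        System.out.println("current:    " + tracker.current());
    }
}
```

Even with this bookkeeping, the second and third points still apply: context changes made by the "/" operator are never notified, and the failing value need not come from the context item at all.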

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by luba zlatin over 7 years ago

Thank you! While running the Java code:

    xsltCompiler = processor.newXsltCompiler();
    xsltExecutable = xsltCompiler.compile(new StreamSource(new StringReader(loadingScheme)));

    serializer = processor.newSerializer();
    xsltTransformer = xsltExecutable.load();
    xsltTransformer.setDestination(serializer);

    documentBuilder = processor.newDocumentBuilder();
    XdmNode paramNode = documentBuilder.build(new StreamSource(new StringReader(referenceProduct)));
    xsltTransformer.setParameter(new QName(REFERENCE_PRODUCT_PARAM_NAME), paramNode);

    serializer.setOutputWriter(stringWriter);
    XdmNode contextNode = documentBuilder.build(new StreamSource(inputStream));
    xsltTransformer.setInitialContextNode(contextNode);
    xsltTransformer.transform();

I get a different line number:

Validation error at function mxf:format-date on line 24
  FORG0001: Invalid date "ttt" (Year is less than four digits)
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]
  at xsl:apply-templates (#1)
     processing /product

Then I ran another example with an empty date at the same place (source_empty_date.xml is attached; schema.xslt and refProduct.xml are unchanged), and I get an error on line number 1:

Error on line 1
  XTTE0790: An empty sequence is not allowed as the first argument of mxf:format-date()
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]/text()[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]/issueDateMeta[1]
  at xsl:apply-templates (#1)
     processing /product/schedule[1]
  at xsl:apply-templates (#1)
     processing /product

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by Michael Kay over 7 years ago

The stack trace is slightly different for Saxon-HE and Saxon-EE, because Saxon-EE does function inlining. With Saxon-HE the trace (in my case) starts:

Validation error at function mxf:format-date on line 273 of schema.xsl:
  FORG0001: Invalid date "ttt" (Year is less than four digits)
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)

which I think is entirely reasonable. Line 273 is the function declaration:


I can't reproduce the case where you are seeing line number 1. I get:

Type error at char 16 in xsl:value-of/@select on line 22 column 127 of schema.xsl:
  XPTY0004: An empty sequence is not allowed as the first argument of mxf:format-date()
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]/text()[1]
  at xsl:apply-templates (file:/Users/mike/bugs/2017/zlatin/schema.xsl#10)
     processing /product/schedule[1]/issueDateMeta[1]/issueDate[1]/adjustedDate[1]/unadjustedDate[1]

which again seems perfectly reasonable.

It would help if you supplied a complete runnable program, rather than leaving me to fill in the gaps, which may well account for why we are getting different output.

RE: to find out in general which line in the source xml causes the SaxonApiException - Added by luba zlatin over 7 years ago

Dear Michael, Thank you for the detailed reply.

  1. Pointing to the function declaration does not help me, because there are many invocations of the function.
  2. The XSLT comes from a database and sometimes contains several instructions on the same line, which explains our different output. Excuse me, I should have paid attention to this earlier.
  3. I returned to your recommendation concerning the ErrorListener, and I am using StandardErrorListener.printStackTrace(logger, context). In the two examples I sent you earlier, I get some problematic paths. In the third example, with source_wrong_decimal.xml, I get an empty logger output stream. Can you explain this, please? The only way for me to get a problematic path in all these cases is through the TraceListener.

source_wrong_decimal.xml and the java code are attached.

Best regards.
