Schema validation: Saxonica able return XPATH or erroneous node?

Added by Mario Mueller almost 6 years ago

Good morning,

we are using the Enterprise Edition and need - in case of a validation error - the XPATH to the erroneous node. Is this possible with Saxonica?

Many thanks and regards, Mario Müller


Replies (9)

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Michael Kay almost 6 years ago

If you use the s9api SchemaValidator interface and register an InvalidityHandler, each validation error will be reported as an Invalidity object, which includes a path to the offending node, obtainable using the methods getPath() and getContextPath(). The path is returned as an AbsolutePath object, which provides various methods allowing it to be rendered as an XPath if that's the form in which you want it.

Looking at it, I see that the Javadoc doesn't actually explain the difference between getPath() and getContextPath(). I believe:

  • getPath() returns a path to the node in the document being validated. This is not necessarily the node that is invalid in the sense of the XSD specification: for example if a BOOK element requires the content model (AUTHOR, PUBLISHER) but the actual content is (PUBLISHER, AUTHOR) then the path will be to the PUBLISHER element, not to the BOOK.

  • getContextPath() is only applicable where validation is invoked from within XSLT or XQuery, and it gives you a path to the node in the source document that was being processed (i.e. that was the context item) at the time the validation error was detected. This is useful when you are validating the result document, because what you actually want to know is not where in the result document the error appears, but rather what you were doing that caused invalid output to be generated.

You can't rely on these properties being present with every error. For example, if there are IDREF values with no matching ID, we don't report the location of the unmatched IDREF - the space requirements for keeping this information until the end of the document would be excessive.

If you're not using the s9api interface and instead rely on the JAXP ErrorHandler, the same information is buried somewhere within the exception reported to the ErrorHandler, but you will need to do some casting to Saxon-specific classes to find it.
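For the JAXP route, a minimal sketch of registering an ErrorHandler is shown below. It uses only the JDK's standard javax.xml.validation API and the JDK's built-in validator, not Saxon, so it reports line numbers rather than paths; the Saxon-specific casting mentioned above is deliberately left out because the classes involved are not named here. The class name JaxpErrorCollector and the inline schema are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper: collects every validation error instead of stopping at the first.
public class JaxpErrorCollector {

    /** Validates xml against xsd and returns one message per reported error. */
    public static List<String> validate(String xsd, String xml) throws Exception {
        List<String> errors = new ArrayList<>();
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                // Warnings are ignored in this sketch.
            }
            @Override public void error(SAXParseException e) {
                // With a Saxon validator, this exception is where the
                // Saxon-specific cast would start (not shown).
                errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
            @Override public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // the document is not well-formed; give up
            }
        });
        validator.validate(new StreamSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // Assumed schema: a request element holding a single xs:date.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                + "<xs:element name='request'><xs:complexType><xs:sequence>"
                + "<xs:element name='date1' type='xs:date'/>"
                + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
        String xml = "<request><date1>a2009-01-01</date1></request>";
        validate(xsd, xml).forEach(System.out::println);
    }
}
```

Because the handler's error method returns normally, validation continues after each error and the caller gets the full list at the end.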

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Mario Mueller almost 6 years ago

Hi Michael,

many thanks for the prompt and detailed answer. Regards Mario

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Wayne Albury almost 6 years ago

I have exactly the same question, but for the .NET version. I am using the s9api SchemaValidator, but I cannot see any Invalidity object as a child of the InvalidityHandler object, only net.sf.saxon.lib.InvalidityHandlerWrappingErrorListener. Would it be possible to get some sample code demonstrating how this is done? I've got this far (please excuse the VB syntax):

Dim err As StaticError
For err In validator.ErrorList
Dim vf As object = err.UnderlyingException.locator '(full path present as .path in expanded debug in VS2010)

The compiler doesn't accept the above, nor failure.locator.path, although both are present in the debugger hierarchy, so any guidance is welcome. (Am I perhaps already looking at JAXP objects by this point? They are net.sf.saxon.type and so on, anyway.)

Also, I'm assuming from all of the above (especially "keeping this information until the end of the document") that there is no equivalent to the .NET XSD validation (SAX-parser style) functionality, only the ability to query result files/objects after the run command has been invoked. Is this correct? (If there is none, is one planned?) Thanks!

(Oh, and in case you're wondering why I don't use the .NET parser: I'd be more than happy to if it worked! It has always had issues with element/attribute-level form default qualified/unqualified local changes, and it still has some other (admittedly very obscure) bugs, e.g. when including schemas with no targetNamespace into multiple XSDs under one main XSD. Happy to share if anyone's curious.)

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Michael Kay almost 6 years ago

There are three ways of capturing validation errors with the .NET validation API (the Saxon.Api.SchemaValidator).

Firstly, you can call SetValidityReporting(XmlDestination destination). This will cause an XML validation report to be written to the supplied destination.

Secondly, you can call SetInvalidityHandler(IInvalidityHandler inHandler). This will cause each validity error to be reported to your supplied handler. The error is reported by way of a (poorly named) StaticError object. This wraps a Java ValidationFailure object which contains the path information, but you will have to do some digging to find it.

Finally, you can set an ErrorList. This again gives you a list of StaticError objects, which contain the required information, but again hide it fairly effectively.

We can do better than this, I think, but as it involves API changes it will probably need to be in a new major release.

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Wayne Albury almost 6 years ago

Thanks Michael. I had a bit of luck, as it happens: Visual Studio kept prompting me to add references to IKVM.OpenJDK.XML.API and IKVM.OpenJDK.Core and, somewhat surprisingly, it let me do so. After that, getting the path was only a matter of doing:

                ' err is a Saxon.Api.StaticError
                Dim ve As net.sf.saxon.type.ValidationException = DirectCast(err.UnderlyingException, net.sf.saxon.type.ValidationException)
                Dim xp As net.sf.saxon.om.AbsolutePath
                xp = ve.getAbsolutePath()

which gives me a somewhat clunky

 {/Q{http://example.org/ord}order[1]/Q{}items[1]/Q{http://example.org/prod}product[1]/Q{http://example.org/prod}validFor[1]/@Q{http://example.org/prod}attQualX}

but it will do! thanks again :)

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Wayne Albury almost 6 years ago

My luck hasn't been quite so good trying to get SetInvalidityHandler to play nice. Is there any possibility of posting some sample code showing how this works in C#/VB? I couldn't see it in ExamplesEE.cs, and both a search here and a Google search came up empty-handed. Thanks, Wayne

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Michael Kay almost 6 years ago

The AbsolutePath object has methods allowing you to get less clunky output, e.g. AbsolutePath.getPathUsingPrefixes(), but the default toString() method gives the full path with all namespaces expanded, which is useful when you want to evaluate the path without knowing a namespace context.
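To illustrate the difference between the two renderings, here is a small standalone sketch that rewrites a path in the expanded Q{uri}local form into a prefixed form, given a URI-to-prefix map. It is not Saxon's implementation of getPathUsingPrefixes(); the class name EQNamePathDemo, the method usePrefixes and the prefixes ord and prod are assumptions chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, independent of Saxon: shortens Q{uri}local names to prefix:local.
public class EQNamePathDemo {

    // One expanded name: Q{namespace-uri}local-name, where the local name
    // runs up to the next '/', '[' or '@' (or the end of the path).
    private static final Pattern EQNAME = Pattern.compile("Q\\{([^}]*)\\}([^/\\[@]+)");

    public static String usePrefixes(String path, Map<String, String> prefixByUri) {
        Matcher m = EQNAME.matcher(path);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String uri = m.group(1);
            String local = m.group(2);
            String name;
            if (uri.isEmpty()) {
                name = local;                              // no namespace: bare local name
            } else if (prefixByUri.containsKey(uri)) {
                name = prefixByUri.get(uri) + ":" + local; // known namespace: use its prefix
            } else {
                name = m.group();                          // unknown namespace: leave expanded
            }
            m.appendReplacement(out, Matcher.quoteReplacement(name));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("http://example.org/ord", "ord");   // assumed prefix
        prefixes.put("http://example.org/prod", "prod"); // assumed prefix
        String expanded = "/Q{http://example.org/ord}order[1]/Q{}items[1]"
                + "/Q{http://example.org/prod}product[1]"
                + "/Q{http://example.org/prod}validFor[1]"
                + "/@Q{http://example.org/prod}attQualX";
        System.out.println(usePrefixes(expanded, prefixes));
    }
}
```

The prefixed form is easier to read, but it is only meaningful together with the prefix bindings, which is why the expanded form is the safer default when the namespace context is unknown.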

I'm having trouble finding a usable example of SetInvalidityHandler; we'll get back to you on that.

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by O'Neil Delpratt almost 6 years ago

Example of SetInvalidityHandler in C#:

using System;
using System.Collections;
using System.IO;
using System.Xml;
using NUnit.Framework;
using Saxon.Api;

// The enclosing class name is hypothetical; the post showed only its members.
// ConfigTest.DATA_DIR comes from the surrounding test suite.
public class InvalidityHandlerTests
{
    [Test]
    public void TestInvalidityHandler()
    {
        XmlReader xsd = XmlReader.Create(Path.GetFullPath(ConfigTest.DATA_DIR + "books.xsd"));

        XmlReader source_xml = XmlReader.Create(new StringReader("<?xml version='1.0'?><request><user_name>ed</user_name><password>sdsd</password><date1>a2009-01-01</date1><date2>b2009-01-01</date2></request>"));

        UriBuilder ub = new UriBuilder();
        ub.Scheme = "file";
        ub.Host = "";
        ub.Path = @"C:\work\tests\";
        Uri baseUri = ub.Uri;

        // true requests a schema-aware (licensed) processor
        Processor saxon = new Processor(true);

        SchemaManager manager = saxon.SchemaManager;
        manager.ErrorList = new ArrayList();
        manager.XsdVersion = "1.0";

        // Compile the schema, reporting any compilation errors
        try
        {
            DocumentBuilder builder = saxon.NewDocumentBuilder();
            builder.BaseUri = new Uri("http://example.com");
            XdmNode xsdNode = builder.Build(xsd);
            manager.Compile(xsdNode);
        }
        catch (Exception)
        {
            Console.WriteLine("Schema compilation failed with " + manager.ErrorList.Count + " errors");
            String errors = "";
            foreach (StaticError error in manager.ErrorList)
            {
                Console.WriteLine("At line " + error.LineNumber + ": " + error.Message);
                errors += error.Message + "\n";
            }
            Assert.Fail("Failed in compile of xsd " + errors);
        }

        // Register the handler that will receive each validity error
        Saxon.Api.SchemaValidator validator = manager.NewSchemaValidator();
        validator.SetInvalidityHandler(new MyInvalidityHandler());

        validator.SetSource(source_xml);
        //            validator.ErrorList = new ArrayList();

        Console.WriteLine("\nValidating file...");

        try
        {
            validator.Run();
            Assert.True(true);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.StackTrace);
            Assert.Fail(ex.Message);
        }
    }

    public class MyInvalidityHandler : IInvalidityHandler
    {
        // Placeholder for a report built up during validation; never assigned in this example
        private XdmValue currentNode;

        public XdmValue endReporting()
        {
            return currentNode; // this could be the entire constructed report
        }

        public void reportInvalidity(StaticError i)
        {
            // TODO - Do something more useful here. Maybe write to a file or throw an exception
            Console.WriteLine("Invalid: " + i.Message);
        }

        public void startReporting(string systemId)
        {
            // no action, but can do setup of file
        }
    }
}

RE: Schema validation: Saxonica able return XPATH or erroneous node? - Added by Wayne Albury over 5 years ago

My apologies, I missed this email. I shall try it out ASAP. Thanks!

On Wed, Jul 18, 2018 at 10:04 PM, Saxonica Developer Community < > wrote: