XdmNode from MemoryStream without valid base URI

Added by Funkmaster roch over 6 years ago

Hi Guys,

I am using the Saxon saxon9he-api for C# (version 9.8.0.5) to evaluate XPath expressions. The XML files arrive as a MemoryStream, so I do not really have a valid base URI. The code below throws the following exception: "No base URI supplied". Is there a solution that does not require supplying a valid URI?

    static XdmValue XPathResult(MemoryStream stream, string xPathExpression)
    {
        // Create a Processor instance.
        Processor processor = new Processor();
        
        var newDocumentBuilder = processor.NewDocumentBuilder();

        // Reset the stream position to 0 so the whole document is read
        stream.Position = 0;

        // Load the source document from MemoryStream
        XdmNode xdmNode = newDocumentBuilder.Build(stream);

        // Create an XPath compiler
        XPathCompiler xPathCompiler = processor.NewXPathCompiler();

        // Enable caching, so each expression is only compiled once
        xPathCompiler.Caching = true;

        // Compile and evaluate an XPath expression
        var result = xPathCompiler.Evaluate(xPathExpression, xdmNode);

        return result;
    }

Many thanks in advance!


Replies (7)

RE: XdmNode from MemoryStream without valid base URI - Added by Michael Kay over 6 years ago

The theory is that failure to supply a base URI is only an error if you do something that needs a base URI, for example calling the doc() function.

However, it can be a little bit unpredictable knowing exactly what operations require a base URI, so my recommendation would be to always supply one. If you use something like http://dummy.base.uri/ then it will be very visible if you ever do anything that actually uses it.

If you want help knowing why you are getting the exception "no base URI supplied" then it would be useful for us to see (a) the stack trace showing where it is coming from, and (b) the actual XPath expression you are compiling.

RE: XdmNode from MemoryStream without valid base URI - Added by Funkmaster roch over 6 years ago

Sorry, maybe my description was confusing. I hope the picture with code and comments below makes it clearer. If I call the newDocumentBuilder.Build() method with a URI as the parameter, everything works fine. If I call the same method with a Stream as the parameter, I get an exception. The XPath part is never reached, so I guess the XPath expression is not the root cause.

RE: XdmNode from MemoryStream without valid base URI - Added by Michael Kay over 6 years ago

The documentation for DocumentBuilder.Build(Stream) is explicit that a base URI must be supplied. See

http://www.saxonica.com/documentation/index.html#!dotnetdoc/Saxon.Api/DocumentBuilder@Build

I'm not quite sure why we're imposing that constraint - probably because one or another of the parsers that we use crashes if we don't. But the code is doing what the spec says, so just supply a dummy base URI.
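For readers following along, here is a minimal sketch of what "supply a dummy base URI" could look like with the Saxon.Api DocumentBuilder. It assumes the BaseUri property is set before Build(Stream) is called; the class and method names (XPathFromStream, Evaluate) are made up for illustration, and the URI is the placeholder suggested earlier in this thread.

```csharp
using System;
using System.IO;
using Saxon.Api;

static class XPathFromStream
{
    // Placeholder base URI taken from the suggestion in this thread.
    // It only matters if something actually has to be resolved against it.
    static readonly Uri DummyBaseUri = new Uri("http://dummy.base.uri/");

    public static XdmValue Evaluate(Stream stream, string xPathExpression)
    {
        Processor processor = new Processor();

        DocumentBuilder builder = processor.NewDocumentBuilder();
        // Build(Stream) requires a base URI, so set one before building.
        builder.BaseUri = DummyBaseUri;

        // Rewind so the parser sees the document from its first byte.
        stream.Position = 0;
        XdmNode document = builder.Build(stream);

        XPathCompiler compiler = processor.NewXPathCompiler();
        return compiler.Evaluate(xPathExpression, document);
    }
}
```

If the document refers to external resources through relative URIs, they will be resolved against the dummy URI and will most likely fail to load, which at least makes the dependency visible.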

RE: XdmNode from MemoryStream without valid base URI - Added by Funkmaster roch over 6 years ago

I already tried a dummy URI, but it did not have the desired effect. With the dummy URI I get a different exception (see attachment below).

Thanks

RE: XdmNode from MemoryStream without valid base URI - Added by Michael Kay over 6 years ago

I think there are two possibilities here.

(a) the stream is not a valid XML document, and the parser is reporting this, using the dummy base URI purely as diagnostic information

(b) the stream is an XML document that contains a reference to some external entity, perhaps an external DTD, which can't be resolved because the base URI is a dummy and doesn't actually exist.

So at this stage I think we need to know what is in the document. Perhaps supplying a repro (a free-standing C# program that we can run to reproduce the problem) would make it easier.
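One way to check possibility (b) before involving Saxon at all is to look at the DOCTYPE with the plain .NET XmlReader. The sketch below is only an illustration: the names DoctypeProbe and FindExternalDtdSystemId are hypothetical, and it only reports the system identifier of an external DTD subset, not external entities declared in an internal subset.

```csharp
using System;
using System.IO;
using System.Xml;

static class DoctypeProbe
{
    // Returns the SYSTEM identifier of the DOCTYPE declaration, or null if
    // the document has no DOCTYPE or the DOCTYPE has no system identifier.
    public static string FindExternalDtdSystemId(Stream stream)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // accept a DOCTYPE instead of rejecting it
            XmlResolver = null                   // do not fetch any external resource
        };

        stream.Position = 0;
        try
        {
            using (XmlReader reader = XmlReader.Create(stream, settings))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.DocumentType)
                        return reader.GetAttribute("SYSTEM");
                    if (reader.NodeType == XmlNodeType.Element)
                        break; // a DOCTYPE can only appear before the root element
                }
            }
            return null;
        }
        finally
        {
            // Leave the stream rewound for whoever parses it next.
            stream.Position = 0;
        }
    }
}
```

A non-null result that is a relative reference would point towards case (b); an XmlException thrown while reading would point towards case (a).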

RE: XdmNode from MemoryStream without valid base URI - Added by Funkmaster roch over 6 years ago

It was partially (b). It looks like the XML file has an external entity, which causes this issue with the Apache Xerces parser. I switched to the .NET XML parser and now it works for me. But I still don't understand why it does not work when I provide a Stream instead of a URI as the parameter to the Build() method. The code in the attachment works for me. Thank you, Michael, for your support and your quick response.
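The attachment is not reproduced in this thread, so the following is only a guess at what switching to the .NET XML parser might look like: a sketch that wraps the stream in a System.Xml.XmlReader and hands it to DocumentBuilder.Build(XmlReader). The class name DotNetParserBuild, the parameter names, and the choice of reader settings are assumptions, not the poster's actual code.

```csharp
using System;
using System.IO;
using System.Xml;
using Saxon.Api;

static class DotNetParserBuild
{
    public static XdmNode Build(Processor processor, Stream stream, Uri baseUri)
    {
        var settings = new XmlReaderSettings
        {
            // Accept documents that carry a DOCTYPE declaration.
            DtdProcessing = DtdProcessing.Parse,
            // Resolve external DTDs and entities relative to the base URI.
            XmlResolver = new XmlUrlResolver()
        };

        stream.Position = 0;
        using (XmlReader reader = XmlReader.Create(stream, settings, baseUri.AbsoluteUri))
        {
            DocumentBuilder builder = processor.NewDocumentBuilder();
            // Set the base URI on the builder as well, so the resulting tree
            // has one regardless of how the reader reports it.
            builder.BaseUri = baseUri;
            return builder.Build(reader);
        }
    }
}
```

Setting XmlResolver to null instead would stop the .NET reader from fetching external resources at all, which may be preferable when the base URI is only a dummy.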

RE: XdmNode from MemoryStream without valid base URI - Added by Michael Kay over 6 years ago

I would be interested to see what kind of external entity reference exists in your XML.

If it's a relative URI, then there's no way it can be resolved without a base URI.

If it's an absolute URI, then in principle it could be resolved without a base URI, but I can well imagine that some parsers will allow this, while others fail.
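The distinction between the two cases can be illustrated with System.Uri alone. This is a sketch of the general resolution rule, not of what Xerces or the .NET parser do internally; EntityUriResolution and Resolve are hypothetical names.

```csharp
using System;

static class EntityUriResolution
{
    // Returns the absolute URI for a system identifier, or null when the
    // identifier is relative and there is no base URI to resolve it against.
    public static Uri Resolve(Uri baseUri, string systemId)
    {
        Uri absolute;
        if (Uri.TryCreate(systemId, UriKind.Absolute, out absolute))
            return absolute;            // already absolute: no base URI needed

        if (baseUri == null)
            return null;                // relative and nothing to resolve against

        return new Uri(baseUri, systemId);
    }
}
```

For example, Resolve(null, "entities/chars.ent") returns null, while the same relative identifier with the base http://dummy.base.uri/docs/ resolves to http://dummy.base.uri/docs/entities/chars.ent, a location that does not exist when the base is only a dummy.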
