[.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char

Added by Manuel F almost 8 years ago

Hey guys,

I'm currently seeing strange behavior with the Saxon API while building my own XSLT transformation application.

First off, I'm a .NET programmer and the API is a wrapped Java application, so that may be my problem here.

But to get to my problem:


//loads input xml file and returns XDM Node
private XdmNode GetXdmNode(Uri inputFile)
{
     return processor.NewDocumentBuilder().Build(inputFile);
}

This short snippet throws an internal API exception of type org.apache.xerces.util.URI.MalformedURIException if my input URI contains a space (see the attached Exception View Detail.png). This isn't a problem for .NET, which considers the URI valid, but the API seems to struggle with it.

My file path is "file:///C:/test test/Test.xml" (see the attached uri.png). The file exists and my application has full permissions on it (777). Still, I get this exception every time I load an input XML file.

So I looked “under the hood”, decompiled the API, and found the line in DocumentBuilder.cs that throws the exception (see the attached DocumentBuilder.png).

Then I dug a little deeper and found, inside the pullSource object, the systemId property, which contains the actual URI as a string (see the attached pullSource.png).

I’m afraid this property is my problem, because it stores the path to my file in a form that is not a valid URI: a URI can't contain unescaped spaces. In my case it’s “file:///C:/test test/Test.xml” but it should be "file:///C:/test%20test/Test.xml", with %20 instead of " ".

If I use my debugger to change the value of the systemId property to "file:///C:/test%20test/Test.xml", no exception is thrown and everything works smoothly.
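To make the two string forms concrete, here is a minimal sketch using only System.Uri. It assumes the behavior of the .NET Framework on Windows; the class and method names (UriFormsDemo, DisplayForm, EscapedForm) are made up for this illustration and are not part of Saxon.

```csharp
using System;

public static class UriFormsDemo
{
    // Human-readable form: ToString() unescapes %20 back into a space.
    public static string DisplayForm(Uri uri)
    {
        return uri.ToString();
    }

    // Escaped form: AbsoluteUri keeps the space encoded as %20.
    public static string EscapedForm(Uri uri)
    {
        return uri.AbsoluteUri;
    }

    public static void Main()
    {
        Uri uri = new Uri("file:///C:/test test/Test.xml");

        Console.WriteLine(DisplayForm(uri)); // contains a literal space
        Console.WriteLine(EscapedForm(uri)); // contains %20 instead of the space
    }
}
```

If this holds, the systemId value seen in the debugger matches the ToString() form rather than the AbsoluteUri form.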

Can someone reproduce this problem? And how can I fix it without patching the API myself?

I’m using Visual Studio 2013 Premium Update 5 with a .NET Framework 4.5 project and Saxon-HE version 9.7.0.4. The problem also occurs with Visual Studio 2015 Enterprise Update 2 and the very same project.

Regards Manuel


Replies (7)


RE: [.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char - Added by Manuel F almost 8 years ago

I think I found a workaround for my issue:

Please take a look at line 502 of my attached DocumentBuilder.png file.

Instead of:


pullSource = new StreamSource(new DotNetInputStream(input), baseUri.ToString());

use:


pullSource = new StreamSource(new DotNetInputStream(input), Uri.EscapeUriString(baseUri.ToString()));

That should do the trick.
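A possible variation on this workaround, sketched here as an assumption rather than as what Saxon actually does: System.Uri already exposes the escaped form through its AbsoluteUri property, so the string does not need a second escaping pass. SystemIdHelper and ToSystemId are hypothetical names introduced only for this example.

```csharp
using System;

public static class SystemIdHelper
{
    // Hypothetical helper (not part of the Saxon API): returns a string form of
    // the URI in which spaces are encoded as %20.
    public static string ToSystemId(Uri baseUri)
    {
        if (baseUri == null)
        {
            throw new ArgumentNullException("baseUri");
        }

        // AbsoluteUri is only defined for absolute URIs; a relative Uri
        // makes this property throw InvalidOperationException.
        return baseUri.AbsoluteUri;
    }

    public static void Main()
    {
        Uri baseUri = new Uri("file:///C:/test test/Test.xml");
        Console.WriteLine(ToSystemId(baseUri));
    }
}
```

In the decompiled line, SystemIdHelper.ToSystemId(baseUri) would take the place of Uri.EscapeUriString(baseUri.ToString()). Nothing in this thread confirms which approach the eventual fix uses.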

RE: [.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char - Added by Michael Kay almost 8 years ago

Sorry for the lack of response.

I have raised a bug for this to ensure that it receives attention:

https://saxonica.plan.io/issues/2744

RE: [.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char - Added by O'Neil Delpratt almost 8 years ago

Hi,

I am struggling to reproduce the MalformedURIException. Could you add a code snippet showing how you create the Uri? Looking at your uri.png image, I notice that HostType = 'IPv6HostType|UncHostType'. When I create a Uri object from a simple path containing a space character, the HostType for me is 'BasicHostType'.

I wonder if there is something special about the Uri which is causing the failure.

RE: [.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char - Added by Manuel F almost 8 years ago

Hello Mr. Delpratt,

Thanks for your time and effort.

The Uri object is originally created with this code:


Uri uri = new Uri(file);

where file is a string with this value: "C:\test test\Test.xml".

Right after its creation, this Uri object (uri) is added to a thread-safe collection of type SynchronizedCollection.

So there might be an issue with this collection, or with the fact that I'm processing this Uri in another thread, and I wanted to make sure that neither affects the issue I reported.

I created a few test uri objects and a few XdmNode objects based on these test uris.


        //loads input xml file and returns XDM Node
        private XdmNode GetXdmNode(Uri inputFile)
        {
            Uri storeInputFileForComparison = inputFile;

            //regular node creation with passed uri - throws MalformedURIException as already reported
            XdmNode inputXdmNode = processor.NewDocumentBuilder().Build(inputFile);
            
            inputFile = new Uri(@"C:\test test\Test.xml");
            
            //new node creation with overwritten uri object with the same value - also throws MalformedURIException
            XdmNode nodeWithOverwrittenInputFile = processor.NewDocumentBuilder().Build(inputFile);

            //compare new created uri with passed one:
            bool equal1 = storeInputFileForComparison == inputFile;         //returns true
            bool equal2 = storeInputFileForComparison.Equals(inputFile);    //returns also true

            Uri newInputUriEscaped = new Uri(@"C:\test test\Test.xml", true);
            
            bool equal3 = storeInputFileForComparison.Equals(newInputUriEscaped);    //returns true

            //new node creation with new uri object with the same value - also throws MalformedURIException
            XdmNode nodeWithUnescapedNewInputUri = processor.NewDocumentBuilder().Build(newInputUriEscaped);
                        
            Uri newInputUriUnescaped = new Uri(@"C:\test test\Test.xml", false);

            bool equal4 = storeInputFileForComparison.Equals(newInputUriUnescaped);    //returns true
            
            //new node creation with new uri object with the same value - also throws MalformedURIException
            XdmNode nodeWithNewInputUri = processor.NewDocumentBuilder().Build(newInputUriUnescaped);
            
            Uri twoSlashesInsteadOfAtSign = new Uri("C:\\test test\\Test.xml");

            bool equal5 = storeInputFileForComparison.Equals(twoSlashesInsteadOfAtSign);    //returns true

            //throws MalformedURIException
            XdmNode nodeWithTwoSlashesInsteadOfAtSign = processor.NewDocumentBuilder().Build(twoSlashesInsteadOfAtSign);

            Uri usingFileScheme = new Uri(@"file:\\\C:\test test\Test.xml");
            
            //throws MalformedURIException
            XdmNode nodeWithFileScheme = processor.NewDocumentBuilder().Build(usingFileScheme);

            Uri differentPath = new Uri(@"U:\Test Test.xml");
            
            bool equal6 = storeInputFileForComparison.Equals(differentPath);    //returns false due to different path

            //throws MalformedURIException
            XdmNode nodeWithDifferentPath = processor.NewDocumentBuilder().Build(differentPath);

            Uri letsCheckIfUriIsValidForDotNet;
            if (Uri.TryCreate(@"C:\test test\Test.xml", UriKind.Absolute, out letsCheckIfUriIsValidForDotNet)) //returns true
            {
                // the url is valid
            }
            
            Uri newUri = new Uri("C:\\Test.xml");   //HostType is also "IPv6HostType | UncHostType"

            //throws MalformedURIException
            return processor.NewDocumentBuilder().Build(inputFile);
        }

The HostType of every one of my Uri objects is "IPv6HostType | UncHostType".

Kind regards Manuel

RE: [.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char - Added by O'Neil Delpratt almost 8 years ago

Hi Manuel,

My environment: Windows 7, running Microsoft Visual Studio Community 2015. The Saxon .NET DLLs were built on a Windows machine.

I ran your code snippet and it works on my machine. As mentioned before, my HostType is 'BasicHostType'. It would be interesting to examine what happens before the GetXdmNode method is called, as something seems to be happening related to the SynchronizedCollection and what it returns. I would like to investigate further but would need the code for that part too.

RE: [.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char - Added by Manuel F almost 8 years ago

Hello O'Neil Delpratt,

That's very strange.

The transformation is performed in a separate class and is triggered by a controller class.

My GetXdmNode() is called from this method:


        //create transformer and load precompiled executable transformation rule
        private XsltTransformer LoadGeneratedXslTransformationRule(Uri inputFile)
        {
            XsltTransformer transformer = this.executable.Load();
            transformer.InitialContextNode = GetXdmNode(inputFile);

            //handle xsl:message output from the xsl file
            transformer.MessageListener = new UserMessageListener();

            return transformer;
        }

executable is of type XsltExecutable and stores the precompiled XSL rule.

executable is generated before my LoadGeneratedXslTransformationRule method is called:


        private void CompileXslRule(string pathToXslFile)
        {
            compiler.ErrorList = new List();

            executable = compiler.Compile(new Uri(pathToXslFile));
        }

LoadGeneratedXslTransformationRule() is called from this method:


        //trigger transformation
        private void RunTransformation(Uri inputFile, string outputXmlPath)
        {
            XsltTransformer transformer = LoadGeneratedXslTransformationRule(inputFile);

            //set output directory for output xml files
            transformer.BaseOutputUri = GenerateUri(outputXmlPath);

            SetOutputWriter();
            
            PerformTransformation(transformer);
        }

I changed my SynchronizedCollection to a regular List and switched my code from parallel to sequential transformation processing. I also got rid of all threads that might have interfered here.

My calling method for the whole transformation class:


       //trigger transformation
        private void ExecuteTransformation()
        {
            //test to rule out collection issues
            Uri filePath = new Uri(@"C:\test test\Test.xml");
            
            //create transformation object
            Transform transform = new Transform();

            //precompile xsl rule
            transform.InitializeTransformation(pathToXslFile);

            //trigger xsl transformation
            transform.RunTransformation(filePath, outputXmlPath);
        }

RE: [.NET Saxon-HE API] MalformedURIException thrown if Uri contains a space char - Added by O'Neil Delpratt over 7 years ago

Hi,

I have tried looking at this again on Saxon 9.7.0.11 but am still not able to reproduce the problem. Are you still hitting it? Have you tried installing Saxon 9.7.0.11?
