Failure in file:base-dir(): "URI has an authority component"
There is an issue when calling:
var transformer = _xsltCompiler.Compile(new Uri(stylesheetPath)).Load();
These are the lines in my stylesheet that deal with the file system:
<xsl:function name="pn:find-image-filepath">
  <xsl:param name="filename"/>
  <xsl:variable name="file-path" select="file:list(file:parent(file:base-dir()), true(), concat($filename, '*'))"/>
  <xsl:choose>
    <xsl:when test="$file-path">
      <xsl:value-of select="file:path-to-uri(concat(file:parent(file:base-dir()), $file-path))"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:message terminate="no">Could not find image file "<xsl:value-of select="$filename"/>" </xsl:message>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>
Googling this issue suggests it is specific to Windows machines: when working with files I should prefix the file path with file:/// (three slashes). When I look at the path (when processing this via the command line and not .NET) it is rendered as "file:/C:/data/data/...". However, is there a way to omit or include the extra slashes without doing a string replace?
#1 Updated by Kevon Hayes about 1 month ago
I'm beginning to think this is also tied to the licensing issue, because I ran the following from the command line: ~~~ xml ~~~ and Saxon was able to parse with no issues. However, from .NET I get the following exception:
Message: URI has an authority component
Exception: "
at java.io.File..ctor(URI uri)
at com.saxonica.functions.extfn.EXPathFile.toFile(String )
at com.saxonica.functions.extfn.EXPathFile.parent(String path)
at com.saxonica.functions.extfn.EXPathFileFunctionSet.BaseDir.makeResult(String )
at com.saxonica.functions.extfn.EXPathFileFunctionSet.BaseDir.makeFunctionCall(Expression arguments)
at net.sf.saxon.functions.registry.BuiltInFunctionSet.bind(F symbolicName, Expression staticArgs, StaticContext env, List reasons)
at net.sf.saxon.functions.FunctionLibraryList.bind(F functionName, Expression staticArgs, StaticContext env, List reasons)
at net.sf.saxon.functions.FunctionLibraryList.bind(F functionName, Expression staticArgs, StaticContext env, List reasons)
at net.sf.saxon.expr.parser.XPathParser.parseFunctionCall(Expression prefixArgument)
at net.sf.saxon.expr.parser.XPathParser.parseBasicStep(Boolean firstInPattern)
at net.sf.saxon.expr.parser.XPathParser.parseStepExpression(Boolean firstInPattern)
at net.sf.saxon.expr.parser.XPathParser.parseRelativePath()
at net.sf.saxon.expr.parser.XPathParser.parsePathExpression()
at net.sf.saxon.expr.parser.XPathParser.parseSimpleMappingExpression()
at net.sf.saxon.expr.parser.XPathParser.parseUnaryExpression()
at net.sf.saxon.expr.parser.XPathParser.parseExprSingle()
at net.sf.saxon.expr.parser.XPathParser.parseFunctionArgument()
at net.sf.saxon.expr.parser.XPathParser.parseFunctionCall(Expression prefixArgument)
at net.sf.saxon.expr.parser.XPathParser.parseBasicStep(Boolean firstInPattern)
at net.sf.saxon.expr.parser.XPathParser.parseStepExpression(Boolean firstInPattern)
at net.sf.saxon.expr.parser.XPathParser.parseRelativePath()
at net.sf.saxon.expr.parser.XPathParser.parsePathExpression()
at net.sf.saxon.expr.parser.XPathParser.parseSimpleMappingExpression()
at net.sf.saxon.expr.parser.XPathParser.parseUnaryExpression()
at net.sf.saxon.expr.parser.XPathParser.parseExprSingle()
at net.sf.saxon.expr.parser.XPathParser.parseFunctionArgument()
at net.sf.saxon.expr.parser.XPathParser.parseFunctionCall(Expression prefixArgument)
at net.sf.saxon.expr.parser.XPathParser.parseBasicStep(Boolean firstInPattern)
at net.sf.saxon.expr.parser.XPathParser.parseStepExpression(Boolean firstInPattern)
at net.sf.saxon.expr.parser.XPathParser.parseRelativePath()
at net.sf.saxon.expr.parser.XPathParser.parsePathExpression()
at net.sf.saxon.expr.parser.XPathParser.parseSimpleMappingExpression()
at net.sf.saxon.expr.parser.XPathParser.parseUnaryExpression()
at net.sf.saxon.expr.parser.XPathParser.parseExprSingle()
at net.sf.saxon.expr.parser.XPathParser.parseExpression()
at net.sf.saxon.expr.parser.XPathParser.parse(String expression, Int32 start, Int32 terminator, StaticContext env)
at net.sf.saxon.expr.parser.ExpressionTool.make(String expression, StaticContext env, Int32 start, Int32 terminator, CodeInjector codeInjector)
at net.sf.saxon.style.StyleElement.makeExpression(String expression, Int32 attIndex)
at net.sf.saxon.style.SourceBinding.prepareAttributes(Int32 permittedAttributes)
at net.sf.saxon.style.XSLLocalVariable.prepareAttributes()
at net.sf.saxon.style.StyleElement.processAttributes()
at net.sf.saxon.style.StyleElement.processAllAttributes()
at net.sf.saxon.style.StyleElement.lambda$processAllAttributes$1(NodeInfo )
at net.sf.saxon.style.StyleElement._<>Anon1.accept(Item )
at net.sf.saxon.om.SequenceIterator.forEachOrFail(SequenceIterator , ItemConsumer )
at net.sf.saxon.style.StyleElement.processAllAttributes()
at net.sf.saxon.style.PrincipalStylesheetModule.processAllAttributes()
at net.sf.saxon.style.PrincipalStylesheetModule.preprocess()
at net.sf.saxon.style.Compilation.compilePackage(Source source)
at net.sf.saxon.style.StylesheetModule.loadStylesheet(Source styleSource, Compilation compilation)
at net.sf.saxon.style.Compilation.compileSingletonPackage(Configuration config, CompilerInfo compilerInfo, Source source)
at net.sf.saxon.s9api.XsltCompiler.compile(Source source)
at Saxon.Api.XsltCompiler.Compile(Stream input, String theBaseUri, Boolean closeStream)
at Saxon.Api.XsltCompiler.Compile(Uri uri)
at Comply365.Business.Objects.XAT.Export.XslFoTransformer.Transform(String inputPath, String stylesheetPath, String outputPath)"
#4 Updated by Michael Kay about 1 month ago
I'm a little confused about exactly what the problem is.
Your problem description says "there is an issue" but it never actually says what the issue is. It's presumably the error message in the title of your post. But it's not clear what you are doing when you get this error message. You imply that you get the error when compiling the stylesheet, but unfortunately you don't tell us what's in the variable stylesheetPath.
The stack trace shows the error as occurring while compiling an XPath expression containing a call to file:base-dir(), which is evaluated statically (because the base URI is part of the static context), and the failure appears to be during some kind of normalization of the file name which involves converting the supplied string to a URI and then to a File object.
So I think the key bit of information that's missing is: what is the value of stylesheetPath?
The file: URI scheme is not particularly well standardized, and we know that Java and .NET have differences in interpretation of the rules. In this case we are using both the .NET Uri class (when you initially supply a URI to the compile() method), and the Java URI class (when we normalize/resolve the static base URI to return from file:base-dir()). It might well be that it's these differences of interpretation that are causing the problem.
As far as possible, we avoid manipulating filenames and URIs ourselves in Saxon, but rely on Java primitives to do it. There are a few exceptions, for example the EXPath file code sometimes treats backslashes in filenames as forwards slashes where the Java libraries would otherwise complain.
#6 Updated by Michael Kay about 1 month ago
Note, many of the threads discussing this error seem to resolve to some issue with UNC filenames, which are sometimes (incorrectly) converted to invalid URIs in which "file:" is followed by either two slashes or four. In a legal "file:" URI, there must be either one slash or three following the "file:" prefix. There appears to be no correct or universally-accepted way of representing UNC filenames as URIs.
#7 Updated by Kevon Hayes about 1 month ago
The issue is that the _xsltCompiler throws the above exception when attempting to compile source XML into an FO for PDF conversion. Yet when I do the same conversion via the Saxon CLI, using the following command, it parses the file without error:
C:\KH...>Transform -t -s:..\XML_NOC.xml -xsl:QRH.xsl -o:FOFile.fo
To answer your question: the value of the stylesheet path is "\\somefilefolder\folder000\folder00\folder0\Temp\folder\folder\XML_NOC.xml", which I attempted to prefix with:
"file:\" I still get the error: "URI has an authority component"
"file:\\\" ***I get "Could not find a part of the path 'c:*'" (so it thinks I'm referring to the C: drive for whatever reason)
#8 Updated by Kevon Hayes about 1 month ago
Not sure why the bold escaped one backslash, but when trying:
1 slash: file:\ I get the "URI has an authority component" error.
3 slashes: file:\\ I get "Could not find a part of the path 'c:\somefilefolder\folder000\folder00\folder0\Temp\folder\folder\XML_NOC.xml'", which of course does not exist.
#9 Updated by Michael Kay about 1 month ago
If it starts with two slashes (or backslashes) then it's a UNC filename so we get into that issue that UNC filenames can't be accurately represented as URIs.
The XsltCompiler.Compile() method expects a URI, and I suspect (need a Windows machine to check...) that when you call new Uri(stylesheetPath) it's giving you a .NET Uri object that doesn't actually correspond to a valid W3C URI value, which is why Saxon subsequently has difficulty with it. It might be worth checking what this Uri object actually looks like.
The .NET spec for the Uri class says:
Uri can also be used to represent local file system paths. These paths can be represented explicitly in URIs that begin with the file:// scheme, and implicitly in URIs that do not have the file:// scheme... These implicit file paths are not compliant with the URI specification and so should be avoided when possible.
So I strongly suspect you've passed a Uri that isn't compliant with the URI specification and this is why Java (and hence Saxon) have difficulty with it.
A couple of other points:
URIs always use forward slashes, not backslashes. So file:\\\ is a nonsense.
It's best to avoid trying to convert filenames to URIs by simple prefixing. It may sometimes work, but you're very unlikely to get it completely right. For example special characters such as spaces and percent-signs in filenames need special treatment. Use .NET or Java methods to do the conversion, not string manipulation.
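To illustrate the last point, here is a minimal Java sketch (not Saxon code) contrasting naive prefixing with a library conversion. The class name, method names and the path are made up for the example. It uses the multi-argument java.net.URI constructor so that the result does not depend on the platform; in everyday code File.toURI() or Path.toUri() would be the usual choice.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FileUriEncoding {

    // Naive approach: prefix the filename with "file://".
    // Returns null when the result is not a syntactically valid URI.
    static URI byPrefixing(String absolutePath) {
        try {
            return new URI("file://" + absolutePath);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Library approach: the multi-argument constructor percent-encodes
    // characters that are not legal in a URI path (spaces, '%', ...).
    static URI byConstructor(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/data/B747-8 QRH/100% done.xsl"; // hypothetical filename
        System.out.println(byPrefixing(path));   // null (the space is rejected)
        System.out.println(byConstructor(path)); // file:/data/B747-8%20QRH/100%25%20done.xsl
        System.out.println(byConstructor(path).getPath()); // the original filename again
    }
}
```

Note that this only covers escaping; it says nothing about UNC names, which are a separate problem.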
#12 Updated by Kevon Hayes about 1 month ago
I realize I never gave you the complete file path value of the stylesheet when I called the compile method as shown above. It is: file://dev-pri-files1/Root1/PTPDocs/devrelease/Temp/c006fa2b-baef-48ce-8dc6-75c14c73e185/B747-8 QRH SB 20190621/xsl-fo/QRH.xsl
If you view: https://www.ietf.org/rfc/rfc1738.txt the file resource URI matches what I have.
According to the RFC, the authority is correct and allowed. Also, my stylesheet format is correct. So can you tell me why I am getting this error? Are you telling me I cannot load a stylesheet from a network resource? Are you saying that I must use a local file system? That document also explains that the only time systems can use "file:///" is when you are on localhost: you can omit localhost and just use the third slash.
Please provide an example, or advise on how this should work for us on .NET when trying to compile a stylesheet that's located on a network resource. The C# samples don't go into the needed detail.
C# example:
~~~ cpp
var transformer = _xsltCompiler.Compile(new Uri(stylesheetPath)).Load();
// stylesheetPath is file://dev-pri-files1/Root1/PTPDocs/devrelease/Temp/c006fa2b-baef-48ce-8dc6-75c14c73e185/B747-8 QRH SB 20190621/xsl-fo/QRH.xsl
~~~
#13 Updated by Michael Kay about 1 month ago
Given a URI of the form file://dev-pri-files/x/y/z, this matches the syntax given in the RFC. But it's worth reading the rest of the section:
A file URL takes the form: file://<host>/<path> where <host> is the fully qualified domain name of the system on which the <path> is accessible, and <path> is a hierarchical directory path of the form <directory>/<directory>/.../<name>. ... As a special case, <host> can be the string "localhost" or the empty string; this is interpreted as `the machine from which the URL is being interpreted'. The file URL scheme is unusual in that it does not specify an Internet protocol or access method for such files; as such, its utility in network protocols between hosts is limited.
So what you have is a syntactically-valid URI, but (a) the value of <host> is not a "fully qualified domain name", and (b) even if it were, Java wouldn't know what to do with it, because it hasn't been told whether the file is accessible using http, ftp, webdav, Microsoft SMB, or something else. The phrase "its utility in network protocols between hosts is limited" is a polite way of saying "in general, the file: URI scheme isn't useful for accessing remote files on a different machine".
There is in fact a more recent (2017) RFC on the file URI scheme: https://tools.ietf.org/html/rfc8089 (which I hadn't come across until today). Section 3 says:
A file URI can be dependably dereferenced or translated to a local file path only if it is local. A file URI is considered "local" if it has no "file-auth", or the "file-auth" is the special string "localhost", or a fully qualified domain name that resolves to the machine from which the URI is being interpreted (Section 2).
(file-auth corresponds to <host> in the RFC 1738 syntax).
Appendix E.3 discusses representation of UNC filenames under the general heading of "Non-standard syntax variations". To the best of my knowledge, Java doesn't implement or recognize this variation.
Worse still, there seems to be a different convention on Java (also not universally supported), which is to represent the entire UNC filename in the path component of the URI, that is: file:////dev-pri-files/x/y/z (with four forward slashes).
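The difference between the two conventions can be seen by asking java.net.URI how it parses each form. This is only an illustrative sketch: the class and method names are invented here, and the third URI is a made-up local path added for comparison.

```java
import java.net.URI;

public class UncUriForms {

    // Reports the authority and path components as parsed by java.net.URI.
    static String describe(String uri) {
        URI u = URI.create(uri);
        return "authority=" + u.getAuthority() + " path=" + u.getPath();
    }

    public static void main(String[] args) {
        // 2-slash form: the server name becomes the authority component.
        System.out.println(describe("file://dev-pri-files/x/y/z"));
        // -> authority=dev-pri-files path=/x/y/z

        // 4-slash form: no authority; the whole UNC name sits in the path.
        System.out.println(describe("file:////dev-pri-files/x/y/z"));
        // -> authority=null path=//dev-pri-files/x/y/z

        // Ordinary local file (hypothetical): no authority either.
        System.out.println(describe("file:///C:/data/test.xsl"));
        // -> authority=null path=/C:/data/test.xsl
    }
}
```

So both spellings are accepted by the parser, but only the 2-slash one produces an authority component, which is what java.io.File objects to.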
#14 Updated by Michael Kay about 1 month ago
So, what's the way forward?
Compiling a stylesheet from a remote machine isn't too difficult, for example you could dereference the filename yourself and supply a Stream.
What's more difficult is supplying a base URI for that stylesheet that can be used to locate other resources using relative references: for example in xsl:include and xsl:import, and in calls to the doc() function. The only way you can do this reliably is to access the remote resources over HTTP. Alternatively, you could try setting a custom XmlResolver on the XsltCompiler. However, the XmlResolver is only going to be used when getting XML resources, and the stylesheet base URI is also used in some cases for non-XML resources, of which your use of EXPath File extensions is one example.
From our point of view, I guess we could try enhancing Saxon on .NET (or perhaps even Saxon on Java on Windows...) to be able to dereference file URIs that use the non-standard extensions in RFC 8089 for referencing UNC files. That's a fairly substantial project, I would think.
#15 Updated by Kevon Hayes about 1 month ago
I went down the filestream route yesterday and indeed the issue was with the xsl:include dependencies not being found. So I quickly understood that if I streamed a stylesheet I would need to stream all of the dependencies used by that stylesheet. This is not ideal since we reference dozens of files.
#21 Updated by Kevon Hayes about 1 month ago
According to https://stackoverflow.com/questions/18520972/converting-java-file-url-to-file-path-platform-independent-including-u it appears that Java erroneously reports an authority when parsing UNC paths. This is apparently fixed in Java 7 via the java.nio.file.Paths implementation.
Any plans to patch this so all your .NET paying customers could benefit from it?
#23 Updated by Kevon Hayes about 1 month ago
On line 258 of the XsltCompiler in your Saxon PE codebase, one too many slashes is removed. This can be fixed if you wrap the uri.ToString() call with java.nio.file.Paths.get().
I attached a screenshot of the offending line of code that causes the bad URI.
#24 Updated by Michael Kay about 1 month ago
Thanks for the citations; I had already looked at many of these when responding earlier on this thread. What this generally reveals is that Java handling of UNC filenames is a mess. One of my concerns is that a number of the attempts to solve the problem do it in a way that is incompatible with the way that RFC 8089 tackles it; we also need to establish whether it's consistent with the way the Microsoft .NET Uri class handles it.
Our ability to solve this is also restricted by the fact that there are a lot of third-party components involved, notably the XML parser and the OASIS catalog resolver (when used).
I've added a work item to our internal shopping list for future enhancements. It would be nice to offer improvements in this area and we will try to find room in the schedule for this work, but it's not a trivial bug-fix. It's also quite likely to get complicated by any changes we make over the coming year to take advantage of Microsoft's promised developments on .NET which appear to include some kind of improved Java interoperability, though the details are still very murky.
#26 Updated by Vladimir Nesterovsky 30 days ago
I think the following StackOverflow post should lead you in the right direction:
#27 Updated by Michael Kay 27 days ago
We are discussing internally what the best way forward is to support UNC filenames on Windows (whether running Saxon on Java or .NET). The recent RFC 8089 is helpful: it describes two "non-standard" ways of representing UNC filenames as URIs, and we should try and support either or both of these on critical interfaces. We suspect that some of the problems are due to .NET and Java disagreeing on which of the two representations to use, and we may have to do some mediation when we pass URIs between the two environments. We will need to do a significant amount of testing to get this working, and we can't guarantee that this can all be done in maintenance releases; some of it may need to wait until a major release.
The specific problems you have reported appear to be (and we need to confirm this) that .NET, when constructing a Uri from a UNC filename, uses the representation defined in E.3.1 of RFC 8089, and Java does not accept this form. Whether it fully accepts the alternative form described in E.3.2 remains to be seen. If this proves to be the case, then it might be possible to work around the problem by supplying (to the XsltCompiler.Compile method, or in XsltCompiler.BaseUri) a Uri in the format that Java expects: that is, one with no Authority component and with a path component of the form "//server/x/y/z/". It may be necessary to use the .NET UriBuilder class to construct such a URI.
#28 Updated by Michael Kay 21 days ago
We're back in the office today and able to do some systematic experiments in a variety of environments.
We're running on Windows, set up with a stylesheet //server/unctest/test.xsl that includes another stylesheet with
This runs successfully on four environments: Java and .NET from the command line, Java and .NET via the Saxon transformation API.
The result of static-base-uri() varies. When running on .NET from the API, the static base URI is file://server/unctest/test.xsl; in the other three environments it is file:////server/unctest/test.xsl.
This is telling us firstly, that .NET file-to-URI conversion is generating the 2-slash form (with an authority component), while Java file-to-URI conversion is generating the 4-slash form; and secondly, that Java is capable of doing URI resolution and URI dereferencing successfully with either form -- though Saxon is giving it some help, because we have special logic in ResolveURI.makeAbsolute() to handle 4-slash URIs (in fact, (a) we avoid calling URI.normalize(), and (b) we use the code in java.net.URL.resolve() in preference to
We don't get a failure until we attempt to do file:base-dir(). At this point we get different failures on different platforms.
With the 4-slash URI format, file:base-dir() gives us C:\server\unctest\test.xsl which is clearly useless. We can solve this within the code of file:base-dir() by changing a call of new File(uri.normalize()) to new File(uri.getPath()) -- it seems to be the call on normalize() that is doing the damage.
With the 2-slash URI format, we are getting the observed error "URI has an authority component". It is crashing on the call to normalize(), and the same fix seems to work here.
The significance of file:base-dir() is that it is one of the few places where we convert URIs back to filenames; and it's this conversion that seems to cause the trouble. There are a number of other places in the product where we have similar logic and we need to try and find them. Some of them already seem to have been addressed in the past: for example ResolveURI avoids calling uri.normalize() in the case of a "file:////" (4-slash) URI.
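For anyone wanting to reproduce the URI-level part of this outside Saxon, the following is a rough standalone Java sketch (not the actual file:base-dir() code; the class and method names are invented for illustration). It shows what normalize() does to the path of a 4-slash URI, and the exception java.io.File raises for a 2-slash URI. How the resulting path string is then turned into a filename depends on the operating system, so that step is left out.

```java
import java.io.File;
import java.net.URI;

public class BaseDirConversion {

    // Path component after URI.normalize(): the leading "//" is collapsed,
    // so the server name turns into an ordinary top-level directory.
    static String normalizedPath(String uri) {
        return URI.create(uri).normalize().getPath();
    }

    // Path component taken directly from the URI, without normalization.
    static String rawPath(String uri) {
        return URI.create(uri).getPath();
    }

    // Attempts new File(URI); returns the exception message, or null on success.
    static String fileFromUriError(String uri) {
        try {
            new File(URI.create(uri));
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String fourSlash = "file:////server/unctest/test.xsl";
        String twoSlash = "file://server/unctest/test.xsl";

        System.out.println(normalizedPath(fourSlash)); // /server/unctest/test.xsl
        System.out.println(rawPath(fourSlash));        // //server/unctest/test.xsl
        System.out.println(fileFromUriError(twoSlash)); // URI has an authority component
    }
}
```

The first two lines of output show why avoiding normalize() matters for the 4-slash form: only the raw path still starts with "//" and so can still be recognised as a UNC name.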
#29 Updated by Michael Kay 21 days ago
The incorrect behaviour of URI.normalize() is addressed in https://bugs.java.com/bugdatabase/view_bug.do?bug_id=4723726 (closed as "won't fix"). This pushes the blame onto (a) the IETF RFCs, where the process of normalization is defined, and (b) the fact that File.toURI() creates a 4-slash URI rather than a 2-slash URI; it suggests replacing File.toURI() with File.toPath().toUri() if you want a 2-slash URI "as developers expect".
The suggestion, then, is to use new File(path).toPath().toUri() to work around this JDK issue.