Bug #2397
closed
document-uri() returns bad URL on Windows but not Mac
100%
Description
Problem reported by Eliot Kimber on saxon-help list (2015-06-10).
Repo provided, and the issue was reproduced on a Windows machine. The problem is actually in the document() function: if the supplied document URI (href) contains spaces, it may not resolve correctly.
Mike's summary:
The document() function (and doc() for that matter) do two things:
(a) they check in a document pool to see if the URI is already known
(b) if not, they call the URIResolver to dereference the URI and fetch the resource
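The two steps above can be sketched roughly as follows. This is an illustrative sketch only, not Saxon's code: the class DocumentLookupSketch, the map used as a pool, and the BiFunction standing in for the URIResolver are all hypothetical, and documents are represented as plain strings. It also assumes the href is already a valid URI reference.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical sketch of the pool-then-resolver lookup; not Saxon's actual classes. */
final class DocumentLookupSketch {
    // Stand-in for the document pool: absolute URI -> loaded document (a String here).
    private final Map<String, String> pool = new HashMap<>();
    // Stand-in for the URIResolver: receives href and base URI as separate arguments.
    private final BiFunction<String, String, String> resolver;
    // Counts how often the resolver was actually invoked.
    int resolverCalls = 0;

    DocumentLookupSketch(BiFunction<String, String, String> resolver) {
        this.resolver = resolver;
    }

    String document(String href, String baseUri) {
        // (a) build an absolute URI and look it up in the pool
        String absolute = URI.create(baseUri).resolve(href).toString();
        String cached = pool.get(absolute);
        if (cached != null) {
            return cached;
        }
        // (b) not in the pool: let the resolver combine href and base itself
        resolverCalls++;
        String doc = resolver.apply(href, baseUri);
        pool.put(absolute, doc);
        return doc;
    }
}
```

Note that the absolute URI is computed once for the pool key here, while the resolver receives href and base separately and does its own combining. That split is where two different absolutization rules can creep in.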
The URIResolver is defined to take the href and baseURI as separate parameters, so the URIResolver has its own logic to combine them into an absolute URI. But before calling it, we need to see if the absolute URI is present in the document pool, so there’s separate logic to construct an absolute URI for that purpose. It turns out that the two bits of code to combine href and baseURI are doing it slightly differently: the URIResolver logic escapes any spaces in the href to %20, but the document pool logic does not; instead it gets an exception from URI.resolve() and uses a fallback algorithm to concatenate the base URI and relative URI with “/../” as a separator. It’s this fallback URI that you are seeing in the result from the document-uri() function.
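The divergence described above can be reproduced with java.net.URI alone. The sketch below is an assumption-laden illustration, not Saxon's code: the class and method names are hypothetical, and the fallback is simply the “/../” concatenation as described in the summary, which may differ in detail from what Saxon does.

```java
import java.net.URI;

/** Hypothetical illustration of the two absolutization paths; not Saxon's code. */
final class SpaceInHrefDemo {
    // Resolver-style path: escape spaces to %20 first, then resolve against the base.
    static String resolveEscaped(String href, String base) {
        return URI.create(base).resolve(href.replace(" ", "%20")).toString();
    }

    // Pool-style path: resolve the href as-is. URI.resolve(String) throws
    // IllegalArgumentException for an unescaped space, so fall back to
    // concatenating base and href with "/../" (assumed form of the fallback).
    static String resolveWithFallback(String href, String base) {
        try {
            return URI.create(base).resolve(href).toString();
        } catch (IllegalArgumentException e) {
            return base + "/../" + href;
        }
    }
}
```

For an href without spaces both methods return the same string. For an href such as "my doc.xml" they disagree, so a pool keyed on one form would not match a document registered under the other.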
According to the spec, the argument to document() should be a URI, which means it can’t contain unescaped spaces. We’re trying to be a bit friendlier than that, but we’re not entirely succeeding. We’ll fix this by doing the “escape spaces” logic on the document pool path as well as the URIResolver path. Meanwhile, please ensure that the value you pass to document() is a valid URI.
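Until a fix is available, a caller that builds hrefs in Java before handing them to a stylesheet could check and escape them along these lines. This is a minimal sketch under stated assumptions: HrefSanitizer and its methods are hypothetical helpers, and only spaces are escaped, mirroring the “escape spaces” logic mentioned above; other illegal characters would need fuller handling.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical caller-side helper; not part of Saxon. */
final class HrefSanitizer {
    // True if java.net.URI accepts the string as a URI reference.
    static boolean isValidUriReference(String href) {
        try {
            new URI(href);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Escape spaces only; every other character is left untouched.
    static String escapeSpaces(String href) {
        return href.replace(" ", "%20");
    }
}
```

A value that fails isValidUriReference only because of spaces should pass once escapeSpaces has been applied to it.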
We haven’t explored all the variations on why this bug is occurring on some paths but not others, and on some operating systems and not others.