Bug #4413

closed

resolve-uri() function also normalises consecutive spaces

Added by Matthew Hutchins over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Low
Assignee:
Category: -
Sprint/Milestone: -
Start date: 2019-12-18
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch: 9.8
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

Hi, I'm not sure whether this is a bug. I have an application that builds URLs from file system paths using the resolve-uri() function. If a file name contains two consecutive spaces, resolve-uri() collapses them to a single space, so the resulting URL does not work. I could not see anything in the documentation for this function that suggests this should happen. I have tested this using the XSLT processor bundled with Oxygen XML 21, which is Saxon-PE 9.8.0.12, but have seen it in other versions too. I have attached a sample XSL file that demonstrates the problem. Note that the problem goes away if the file name is percent-encoded first, so maybe my issue comes from the fact that the space character isn't valid in a URI anyway.
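The percent-encoding workaround mentioned above can be illustrated outside XSLT. The following is a minimal Python sketch, not Saxon code: the helper name `encode_then_resolve`, the sample file name, and the base URI are all hypothetical, and Python's `urljoin` only stands in for resolve-uri(). It shows that once each space is written as %20, no literal whitespace remains for a later step to collapse.

```python
from urllib.parse import quote, urljoin


def encode_then_resolve(file_name: str, base_uri: str) -> str:
    """Percent-encode a file name, then resolve it against a base URI.

    Hypothetical helper for illustration only; it is not part of Saxon.
    """
    # quote() rewrites each space as %20, so consecutive spaces become
    # "%20%20" and there is no literal whitespace left in the reference.
    encoded = quote(file_name)
    # urljoin() performs relative-reference resolution against the base.
    return urljoin(base_uri, encoded)


# Assumed sample values: a file name with two consecutive spaces.
resolved = encode_then_resolve("my  file.xml", "file:///data/docs/")
print(resolved)
```

In XSLT the analogous step would be to apply an encoding function to the file name before passing it to resolve-uri(); which function is appropriate depends on the path format, so treat that as something to verify rather than a recommendation.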


Files

resolve-uri.xsl (651 Bytes), Matthew Hutchins, 2019-12-18 00:21
Actions #1

Updated by Michael Kay over 4 years ago

I've reproduced this effect. Saxon is carefully escaping the spaces as %20 so that the Java JDK URI class doesn't barf on the URIs, and is then unescaping each %20 back to a space. The final step is to turn the resulting string into an instance of xs:anyURI, and it's at this stage that multiple spaces are being collapsed: the xs:anyURI type in XSD has the facet whiteSpace=collapse, which means that the value space does not allow strings with consecutive spaces.

(XSD 1.1 Part 2 is actually a bit inconsistent on this. It claims that the value space allows all sequences of characters, but by imposing the whiteSpace=collapse facet, it effectively constrains the value space. I've heard some XSD gurus say this is OK, the value space can include "ineffable" values that have no string representation; but this is hardly practical.)

Given the fact that resolve-uri() is defined to return an xs:anyURI and that casting a string to xs:anyURI collapses whitespace, I don't think we can treat this as a bug, however inconvenient it might be.
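The sequence described in this comment (escape spaces, resolve, unescape, then cast to xs:anyURI) can be mimicked with a rough Python sketch. This is an assumption-laden simulation, not Saxon's implementation: the function names `collapse_whitespace` and `resolve_like_thread` are hypothetical, `urljoin` stands in for the JDK URI class, and the collapse rule follows the usual reading of the XSD whiteSpace=collapse facet (tabs and line breaks become spaces, runs of spaces become one, leading and trailing spaces are dropped).

```python
import re
from urllib.parse import urljoin


def collapse_whitespace(value: str) -> str:
    """Approximate the XSD whiteSpace=collapse facet on a string."""
    # Step 1: tab, line feed and carriage return each become a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    # Step 2: runs of spaces shrink to one; leading/trailing spaces go.
    return re.sub(r" +", " ", replaced).strip(" ")


def resolve_like_thread(relative: str, base: str) -> str:
    """Hypothetical model of the steps described in the comment above."""
    escaped = relative.replace(" ", "%20")   # escape spaces before resolving
    joined = urljoin(base, escaped)          # resolve against the base URI
    unescaped = joined.replace("%20", " ")   # turn each %20 back into a space
    return collapse_whitespace(unescaped)    # the cast to xs:anyURI collapses


# Assumed sample values: the double space survives until the final step.
result = resolve_like_thread("my  file.xml", "file:///data/docs/")
print(result)
```

Under these assumptions the double space is only lost in the last step, which matches the explanation that the collapse comes from the xs:anyURI type rather than from the resolution itself.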

Actions #2

Updated by Matthew Hutchins over 4 years ago

Thanks for that. This is my first Saxonica bug report, so not sure if I am responsible for marking it as cancelled/resolved, and if so, how to do it.

Actions #3

Updated by Michael Kay over 4 years ago

  • Status changed from New to Closed
  • Assignee set to Michael Kay

Closed as invalid. The product is behaving as designed, even if it's surprising.
