encode-for-uri() and #
Added by Anonymous over 18 years ago
Legacy ID: #3683924 Legacy Poster: Kevin Rodgers (notorious_kev)
Sorry, but I'm confused: The change log for Saxon 8.7.1 says "The functions encode-for-uri() and iri-to-uri() have been modified according to the changes agreed in W3C Bugzilla 2457." I followed the link to the thread started by the esteemed Mr. Kay on Nov. 11, 2005, in which he says "Currently encode-for-uri() does NOT escape a "#" sign" and "The Saxon implementation currently does escape "#"." Everyone seems to agree with him that # should be escaped and that the XSLT spec should be changed to say so.

But the wording supplied by the equally esteemed Mr. Walsh, upon which the changed description of fn:encode-for-uri was based, includes this example:

* fn:encode-for-uri("http://www.example.com/00/Weather/CA/Los%20Angeles#ocean") returns "http%3A%2F%2Fwww.example.com%2F00%2FWeather%2FCA%2FLos%2520Angeles#ocean". This is probably not what the user intended because all of the delimiters have been encoded.

But the # delimiter was not encoded!

Also, one of my XSLT stylesheets has a CVS log entry dated 2005-11-22 that says "Since encode-for-uri() does not escape the NUMBER SIGN "#", do so with replace() to avoid this error: XTRE1160: Invalid fragment identifier in URI".

So if I upgrade to 8.7.1, can I remove the replace(..., '#', '%23') call around encode-for-uri() or not? My experience is that Saxon did not escape # in November, but Mr. Kay (who certainly knows better than I do) says it did. Has anything really changed?

Thanks,
-- Kevin
Replies (1)
RE: encode-for-uri() and # - Added by Anonymous over 18 years ago
Legacy ID: #3684413 Legacy Poster: Michael Kay (mhkay)
The history of this is indeed tortuous. As far as the W3C spec is concerned, I think there was simply an error in the example that Norm proposed, and I have reopened the bug to get this fixed. Thanks for pointing it out.

As far as Saxon is concerned, according to the change log I made a change in Saxon 8.6 (released on the same day as the CR specs, 3 Nov 2005) so that encode-for-uri() (in accordance with the spec) should not escape a "#" sign. Previously Saxon had escaped "#". IIRC I received immediate feedback from users that they didn't like this, so the following day I raised the W3C Bugzilla entry. The proposed change was accepted, and implemented in Saxon 8.7.1, so we're back to the situation where "#" is escaped (along with quite a few other characters, incidentally, that weren't escaped before).

Michael Kay
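For readers who want to check the expected result outside Saxon, here is a rough Python sketch of the behaviour described above, where "#" is escaped along with the other delimiters. It is not Saxon's code: it assumes the revised rule is "percent-encode everything except letters, digits, "-", "_", "." and "~"", and it uses the standard library's urllib.parse.quote as a stand-in. The helper names encode_for_uri_sketch and with_manual_hash_escape are hypothetical, made up for this illustration.

```python
from urllib.parse import quote


def encode_for_uri_sketch(value: str) -> str:
    """Approximate the revised encode-for-uri() behaviour (an assumption,
    not Saxon's implementation): escape everything except unreserved
    characters. safe="" stops quote() from leaving "/" unescaped."""
    return quote(value, safe="")


def with_manual_hash_escape(encoded: str) -> str:
    """The workaround from the question: replace(..., '#', '%23')."""
    return encoded.replace("#", "%23")


# The example string quoted in the question.
example = "http://www.example.com/00/Weather/CA/Los%20Angeles#ocean"
encoded = encode_for_uri_sketch(example)
print(encoded)
# http%3A%2F%2Fwww.example.com%2F00%2FWeather%2FCA%2FLos%2520Angeles%23ocean

# Once "#" is already escaped, the manual replace changes nothing.
print(with_manual_hash_escape(encoded) == encoded)  # True
```

Under that assumption the extra replace() becomes a no-op rather than a problem: the first pass leaves no literal "#" for it to match, so it should be safe either to keep it or to remove it after upgrading.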