Bug #4852

closed

SaxonJS incorrectly URI encodes 'value' attributes on 'input' elements

Added by Norm Tovey-Walsh almost 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Category: -
Sprint/Milestone: -
Start date: 2020-12-05
Due date:
% Done: 100%
Estimated time:
Applies to JS Branch: 2
Fix Committed on JS Branch: 2
Fixed in JS Release:
SEF Generated with:
Platforms:
Company: -
Contact person: -
Additional contact persons: -

Description

I have a SaxonJS stylesheet running in Node that includes the following HTML output:

            <input placeholder="search" name="q" value=""
                   size="40"/> <input value="&#x01f50d;" type="submit"/></p>

That appears in the principalResult output as:

<input placeholder="search" name="q" value="" size="40"> <input value="%F0%9F%94%8D" type="submit"></p>
Actions #1

Updated by Michael Kay almost 4 years ago

input/@value is a URI attribute according to https://www.w3.org/TR/xslt-xquery-serialization-31/#list-of-uri-attributes, and this appears to be the correct %HH encoding of this character.
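
As a quick check of the %HH claim, the following minimal sketch percent-encodes the UTF-8 bytes of U+1F50D in Node and compares the result with the built-in encodeURIComponent. The helper name percentEncodeUtf8 is hypothetical and is not part of SaxonJS. Unlike a serializer, it encodes every byte, which makes no difference here because the input contains no ASCII characters.

```javascript
// U+1F50D (LEFT-POINTING MAGNIFYING GLASS), the character used in the submit button.
const glass = "\u{1F50D}";

// Hypothetical helper: percent-encode every UTF-8 byte of the string as %HH.
function percentEncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const encoded = percentEncodeUtf8(glass);
console.log(encoded); // %F0%9F%94%8D
console.log(encoded === encodeURIComponent(glass)); // true
```

The four bytes F0 9F 94 8D are the UTF-8 form of the code point, so the value seen in the output is a faithful URI escape. The open question is only whether the attribute should be escaped at all.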

Saxon/J doesn't include input/@value in the list of URI attributes and doesn't %HH-encode it, but if you put <input src="&#x01f50d;"/> through the HTML serializer then it comes out as <input src="%F0%9F%94%8D"/>

So I guess the question is, why do Saxon/J and Saxon-JS differ in the way that the HTML serializer handles URI-escaping of this attribute? Given the decision to do URI escaping, it seems to be doing it correctly as far as I can see.

Actions #2

Updated by Michael Kay almost 4 years ago

According to the DTD at https://www.w3.org/TR/html401/sgml/dtd.html input/@value is a CDATA attribute. So perhaps its inclusion in the list at https://www.w3.org/TR/xslt-xquery-serialization-31/#list-of-uri-attributes is an error that we quietly fixed in the Java product?

Actions #3

Updated by Michael Kay almost 4 years ago

input/@value is not included in the list of URI attributes in the 1.0/2.0 Serialization spec at https://www.w3.org/TR/2007/REC-xslt-xquery-serialization-20070123/#list-of-uri-attributes

Very strange. Henry wouldn't have added it to the list without a good reason.

The only other differences between the two versions of the spec are that 3.0 has added input/@formaction, button/@formaction, and video/@poster. These aren't present in the list used by Saxon/J (see HTMLURIEscaper.java), so it looks as if Saxon/J never implemented any of these changes in the 3.0 spec.
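
The effect of the two lists can be sketched as a lookup that decides whether an attribute value gets escaped. This is an illustration only, not the code in HTMLURIEscaper.java or in SaxonJS. The names baseList, list30 and escapeAttribute are hypothetical, and baseList holds just one entry (input/@src, which the earlier comment shows being escaped) rather than the full list from the spec.

```javascript
// Hypothetical, partial tables keyed as "element/@attribute".
// baseList stands in for the 1.0/2.0 list; only input/@src is included here.
const baseList = new Set(["input/@src"]);

// Entries this thread says are present in the 3.0/3.1 list but not in the older one.
const list30 = new Set([
  ...baseList,
  "input/@formaction",
  "button/@formaction",
  "video/@poster",
  "input/@value",
]);

// Escape characters outside printable ASCII as %HH only when the
// element/attribute pair appears in the given list; otherwise pass through.
function escapeAttribute(uriAttributes, element, attribute, value) {
  if (!uriAttributes.has(`${element}/@${attribute}`)) {
    return value;
  }
  return value.replace(/[^\x20-\x7E]+/gu, (run) => encodeURIComponent(run));
}

const glass = "\u{1F50D}";
console.log(escapeAttribute(baseList, "input", "value", glass) === glass); // true
console.log(escapeAttribute(list30, "input", "value", glass)); // %F0%9F%94%8D
console.log(escapeAttribute(baseList, "input", "src", glass)); // %F0%9F%94%8D
```

With the older list the value attribute passes through untouched, which matches the Saxon/J behaviour described above. With the newer list it is escaped, which matches the output in the original report.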

Actions #4

Updated by Michael Kay almost 4 years ago

The change to the list of URI-valued attributes first appeared in the draft of 2013-01-08 (https://www.w3.org/TR/2013/WD-xslt-xquery-serialization-30-20130108/ ) and the change log attributes it to bug 6129, but this bug was a catch-all to extend the spec to enable HTML5 support, and there is nothing specific in the bug about changing the list of URI attributes. Perhaps (pure conjecture here) Henry was looking at a draft HTML5 spec that subsequently changed?

Actions #5

Updated by Norm Tovey-Walsh almost 4 years ago

  • Assignee set to Norm Tovey-Walsh
Actions #6

Updated by Norm Tovey-Walsh almost 4 years ago

  • Subject changed from "Unicode characters above the BMP don't pass through SaxonJS/Node correctly?" to "SaxonJS incorrectly URI encodes 'value' attributes on 'input' elements"
Actions #7

Updated by Norm Tovey-Walsh almost 4 years ago

  • Status changed from New to Resolved

Fixed.

Actions #8

Updated by Community Admin almost 4 years ago

  • Applies to JS Branch 2 added
  • Applies to JS Branch deleted (2.0)
Actions #9

Updated by Debbie Lockett over 3 years ago

  • Fix Committed on JS Branch 2 added
Actions #10

Updated by Debbie Lockett over 3 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in JS Release set to Saxon-JS 2.1

Bug fix applied in the Saxon-JS 2.1 maintenance release.
