he / src / userdoc / extensions / serialization-params.xml @ 276447a9

<?xml version="1.0" encoding="UTF-8"?>

<section id="serialization-parameters" title="Serialization parameters">
    <h1>Serialization parameters</h1>

    <p>Saxon provides a number of additional serialization parameters, with names in the Saxon
        namespace. These can be specified as attributes on the <a class="bodylink code"
            href="/xsl-elements/output">xsl:output</a> and <a class="bodylink code"
            href="/xsl-elements/result-document">xsl:result-document</a> elements (XSLT only), in
        the Query prolog (XQuery only), as parameters to the <a class="bodylink code"
            href="/functions/fn/serialize">fn:serialize()</a> function, or as extra parameters on
        the Query or Transform command line. They can also be specified in the query or
        transformation API.</p>

    <aside>Requires Saxon-PE or Saxon-EE.</aside>

    <element-syntax name="output-extras" xmlns="http://www.saxonica.com/ns/doc/elements">
        <attribute name="saxon:attribute-order" required="no">
            <desc>
                <p>Available with the XML, HTML, and XHTML output methods, to control the order in
                    which attributes appear within an element start tag (in the absence of the
                    property the order of attributes is unpredictable).
                    The value of the parameter is a list of tokens, each of which is either a QName or
                    the token "*" to match unspecified attributes. Attributes whose names are
                    listed before the "*" token appear first, in the order they are listed;
                    unlisted attributes follow, sorted first by namespace URI and then by local name; and finally any attributes
                    whose names appear after the "*" appear at the end.
                    For example, <code>saxon:attribute-order="a b c * xml:space"</code> will cause attributes to be output in the order
                    <code>a</code>, then <code>b</code>, then <code>c</code>, then everything else (sorted by URI and local name),
                    then <code>xml:space</code>.
                    </p>
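                <p>As an illustrative sketch only: assuming the prefix <code>saxon</code> is bound to the Saxon
                    namespace <code>http://saxon.sf.net/</code> on the enclosing stylesheet, and taking
                    <code>id</code> and <code>class</code> as hypothetical attribute names, a declaration such as
                    the following would be expected to output <code>id</code> and <code>class</code> first, any
                    other attributes next, and <code>xml:space</code> last:</p>
                <p>
                    <code><![CDATA[<xsl:output method="xml" saxon:attribute-order="id class * xml:space"/>]]></code>
                </p>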
            </desc>
            <data-type name="eqnames"/>
        </attribute>

        <attribute name="saxon:canonical" required="no">
            <desc>
                <p>Available with the XML output method, to request that the serialized XML conform to the W3C XML Canonicalization 1.1
                    specification (C14N). This can be useful, for example, when test results are to be compared.
                    Specifically, this option changes XML serialization as follows:</p>

                <ol>
                    <li>Empty elements are output as <code><![CDATA[<empty></empty>]]></code> rather than <code><![CDATA[<empty/>]]></code>.</li>
                    <li>Namespaces within a start tag are sorted in alphabetical order of prefix.</li>
                    <li>Attributes within a start tag are sorted first by namespace URI, then by local name.</li>
                    <li>Processing instructions and comments that appear as children of the document node are separated by newlines.</li>
                </ol>

                <p>Specifying <code>saxon:canonical="yes"</code> forces <code>omit-xml-declaration="yes"</code>,
                    <code>indent="no"</code>, and <code>encoding="utf-8"</code>, and causes <code>use-character-maps</code> and <code>cdata-section-elements</code>
                to be ignored. No DOCTYPE declaration is output. The option does not force Unicode normalization; if in doubt, set <code>normalization-form="NFC"</code>.</p>
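                <p>A minimal sketch of such a declaration, assuming the prefix <code>saxon</code> is bound to the
                    Saxon namespace <code>http://saxon.sf.net/</code>. The attributes <code>indent</code>,
                    <code>encoding</code>, and <code>omit-xml-declaration</code> are left out because, as described
                    above, their values are forced by the option; <code>normalization-form</code> is included
                    because normalization is not forced:</p>
                <p>
                    <code><![CDATA[<xsl:output method="xml" saxon:canonical="yes" normalization-form="NFC"/>]]></code>
                </p>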
            </desc>
            <data-type name="boolean"/>
        </attribute>

        <attribute name="saxon:character-representation" required="no">
            <desc>
                <p>Allows greater control over how non-ASCII characters will be represented on
                    output.</p>

                <p>When the output method is XML, two values are supported: <code>decimal</code> and
                        <code>hex</code>. These control whether numeric character references are
                    output in decimal or hexadecimal when the character is not available in the
                    selected encoding. </p>

                <p>When the output method is HTML, the value may hold two strings, separated by a
                    semicolon. The first string defines how non-ASCII characters within the
                    character encoding will be represented, the values being <code>native</code>,
                        <code>entity</code>, <code>decimal</code>, or <code>hex</code>. The second
                    string defines how characters outside the encoding will be represented, the
                    values being <code>entity</code>, <code>decimal</code>, or <code>hex</code>.
                    Here <code>native</code> means output the character as itself;
                        <code>entity</code> means use a defined entity reference (such as
                    "&amp;eacute;") if known; <code>decimal</code> and <code>hex</code> refer to
                    numeric character references. For example <code>entity;decimal</code> (the
                    default) means that with <code>encoding="iso-8859-1"</code>, characters in the
                    range 160-255 will be represented using standard HTML entity references, while
                    Unicode characters above 255 will be represented as decimal character
                    references.</p>
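                <p>As a sketch of a non-default setting (assuming the prefix <code>saxon</code> is bound to the
                    Saxon namespace <code>http://saxon.sf.net/</code>), the following declaration would be expected
                    to write characters that are present in ISO-8859-1 as themselves, and characters outside that
                    encoding as hexadecimal character references:</p>
                <p>
                    <code><![CDATA[<xsl:output method="html" encoding="iso-8859-1" saxon:character-representation="native;hex"/>]]></code>
                </p>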

                <!--<aside>This attribute is retained for the time being in the interests of backwards
                    compatibility. However, the XSLT 2.0 specification makes it technically a
                    non-conformance to provide attributes that change serialization behavior except
                    in cases where the behavior is implementation-defined; and this is not such a
                    case (the specification, at least in the case of the XML output method, does not
                    allow a character to be substituted with a character reference in cases where
                    the character is present in the chosen encoding). The best way of ensuring that
                    non-ASCII characters are output using character references is to use
                        <code>encoding="us-ascii"</code>.</aside>-->
            </desc>
            <constant value="native"/>
            <constant value="entity"/>
            <constant value="decimal"/>
            <constant value="hex"/>
        </attribute>

        <attribute name="saxon:double-space" required="no">
            <desc>
                <p>When the output method is XML with <code>indent="yes"</code>, the
                        <code>saxon:double-space</code> attribute may be used to generate an extra
                    blank line before selected elements. The value is a whitespace-separated list of
                    element names. The attribute follows the same conventions as
                        <code>cdata-section-elements</code>: values specified in separate <a
                        class="bodylink code" href="/xsl-elements/output">xsl:output</a> or <a
                        class="bodylink code" href="/xsl-elements/result-document"
                        >xsl:result-document</a> elements are cumulative, and if the value is
                    supplied programmatically via an API, or from the command line, then the element
                    names are given in Clark notation, namely <code>{uri}local</code>. The effect of
                    the attribute is to cause an extra blank line to be output before the start tag
                    of the specified elements.</p>
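                <p>For illustration, assuming the prefix <code>saxon</code> is bound to the Saxon namespace
                    <code>http://saxon.sf.net/</code> and that the result tree contains no-namespace elements with
                    the hypothetical names <code>chapter</code> and <code>section</code>, a declaration along these
                    lines would be expected to put a blank line before each of their start tags:</p>
                <p>
                    <code><![CDATA[<xsl:output method="xml" indent="yes" saxon:double-space="chapter section"/>]]></code>
                </p>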
            </desc>
            <data-type name="eqnames"/>
        </attribute>

        <attribute name="saxon:indent-spaces" required="no">
            <desc>
                <p>When the output method is XML, HTML, or XHTML with <code>indent="yes"</code>, the
                        <code>saxon:indent-spaces</code> attribute may be used to control the amount
                    of indentation. The default value in the absence of this attribute is 3.</p>
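                <p>A minimal sketch, assuming the prefix <code>saxon</code> is bound to the Saxon namespace
                    <code>http://saxon.sf.net/</code>; the value 2 is an arbitrary choice for the example:</p>
                <p>
                    <code><![CDATA[<xsl:output method="xml" indent="yes" saxon:indent-spaces="2"/>]]></code>
                </p>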
            </desc>
            <data-type name="integer"/>
        </attribute>

        <attribute name="saxon:line-length" required="no">
            <desc>
                <p>Default value 80. With the XML output method, attributes are output on a new line
                    if they would otherwise extend beyond this column position. With the HTML output
                    method, text lines are split at this line length when possible. In releases 9.2
                    and earlier, the HTML output method attempted to split lines that exceeded 120
                    characters in length.</p>
            </desc>
            <data-type name="integer"/>
        </attribute>

        <attribute name="saxon:newline" required="no">
            <desc>
                <p>Defines the string that is used by the text output method
                    to represent a newline. The default is a single newline character (x0A, codepoint 10).
                    The Windows line ending x0Dx0A (CRLF) may sometimes be preferred
                    to this default; it can be specified using
                    <code>saxon:newline="&amp;#x0D;&amp;#x0A;"</code>.</p>
            </desc>
            <data-type name="string"/>
        </attribute>

        <attribute name="saxon:next-in-chain" required="no">
            <desc>
                <p>XSLT only. Used to direct the output to another stylesheet. The value is the URL
                    of a stylesheet that should be used to process the output stream. In this case
                    the output stream must always be pure XML, and attributes that control the
                    format of the output (for example <code>method</code> or
                        <code>cdata-section-elements</code>) will have no effect. The output of
                    the second stylesheet will be directed to the destination that would have been
                    used for the first stylesheet if no <code>saxon:next-in-chain</code> attribute
                    were present.</p>

                <p>This serialization property is available only on <a class="bodylink code"
                    href="/xsl-elements/output">xsl:output</a> declarations and <a class="bodylink code"
                        href="/xsl-elements/result-document">xsl:result-document</a> instructions. It cannot be
                    supplied as an attribute to any of the various APIs that control serialization; nor can it be used
                    on the command line. It is not supported as an XQuery serialization parameter.</p>

                <p>Supplying a zero-length string is equivalent to omitting the attribute, except
                    that it can be used to override a previous setting.</p>

                <p>If the value is a relative URI, it is interpreted relative to the base URI of the
                stylesheet element (<code>xsl:output</code> or <code>xsl:result-document</code>)
                on which the attribute appears.</p>
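                <p>As a sketch, assuming the prefix <code>saxon</code> is bound to the Saxon namespace
                    <code>http://saxon.sf.net/</code> and that a second stylesheet with the hypothetical name
                    <code>phase2.xsl</code> sits alongside the first, the following declaration would pass the
                    result of the first stylesheet to <code>phase2.xsl</code>; the relative URI is resolved
                    against the base URI of the <code>xsl:output</code> element:</p>
                <p>
                    <code><![CDATA[<xsl:output saxon:next-in-chain="phase2.xsl"/>]]></code>
                </p>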

                <aside>This property is retained for backwards compatibility. A more standard way of
                    post-processing the output
                    of a transformation is to use the XPath 3.1 function
                    <a class="bodylink code" href="/functions/fn/transform"
                        >fn:transform()</a>. Alternatively, both the JAXP and s9api APIs provide
                mechanisms allowing the output of one transformation to be used as the input of another.</aside>
            </desc>
            <data-type name="uri"/>
        </attribute>

        <attribute name="saxon:property-order" required="no">
            <desc>
                <p>Available with the JSON output method, to control the order in
                    which properties appear within the serialization of a map/object (in the absence of the
                    property the order of properties is unpredictable).
                    The value of the parameter is a list of tokens, in which the token "*" is treated specially. Properties whose names are
                    listed before the "*" token appear first, in the order they are listed;
                    unlisted properties follow, sorted alphabetically; and finally any properties
                    whose names are listed after the "*" appear at the end.
                    For example, <code>saxon:property-order="@ a b c * $"</code> will cause properties to be output in the order
                    <code>@</code>, then <code>a</code>, then <code>b</code>, then <code>c</code>, then everything else (sorted alphabetically),
                    then <code>$</code>. Although JSON property names can include spaces, there is no provision for such names to be included in the list.
                </p>
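                <p>For illustration only, assuming the prefix <code>saxon</code> is bound to the Saxon namespace
                    <code>http://saxon.sf.net/</code> and taking <code>id</code>, <code>name</code>, and
                    <code>notes</code> as hypothetical property names, a declaration such as the following would
                    be expected to output <code>id</code> and <code>name</code> first, any other properties next,
                    and <code>notes</code> last:</p>
                <p>
                    <code><![CDATA[<xsl:output method="json" saxon:property-order="id name * notes"/>]]></code>
                </p>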
            </desc>
            <data-type name="eqnames"/>
        </attribute>

        <attribute name="saxon:recognize-binary" required="no">
            <desc>
                <p>Relevant only when using the text output method. If set to <code>yes</code>, the
                    processing instructions <code>&lt;?hex XXXX?&gt;</code> and <code>&lt;?b64
                        XXXX?&gt;</code> will be recognized; the value is taken as a hexBinary or
                    base64 representation of a character string, encoded using the encoding in use
                    by the serializer, and this character string will be output without validating
                    it to ensure it contains valid XML characters. Also recognized are
                        <code>&lt;?hex.EEEE XXXX?&gt;</code> and <code>&lt;?b64.EEEE
                        XXXX?&gt;</code>, where EEEE is the name of the encoding of the base64 or
                    hexBinary data: for example <code>hex.ascii</code> or <code>b64.utf8</code>.</p>

                <p>This enables non-XML characters, notably binary zero, to be output.</p>

                <p>For example, given <code>&lt;xsl:output method="text"
                        saxon:recognize-binary="yes"/&gt;</code>, the following instruction:</p>

                <p>
                    <code>&lt;xsl:processing-instruction name="hex.ascii" select="'00'"/&gt;</code>
                </p>

                <p>outputs the Unicode character with codepoint zero ("NUL"), while</p>

                <p>
                    <code>&lt;xsl:processing-instruction name="b64.utf8"
                        select="securityKey"/&gt;</code>
                </p>

                <p>outputs the value of the <code>securityKey</code> element, on the assumption that
                    this is base64-encoded UTF-8 text.</p>
            </desc>
            <data-type name="boolean"/>
        </attribute>

        <attribute name="saxon:require-well-formed" required="no">
            <desc>
                <p>Affects the handling of result documents that contain multiple top-level elements
                    or top-level text nodes. The W3C specifications allow such a result document,
                    even though it is not a well-formed XML document. It is, however, a well-formed
                        <i>external general parsed entity</i>, which means it can be incorporated
                    into a well-formed XML document by means of an entity reference.</p>

                <p>The default is <code>no</code>. If the value is set to <code>yes</code>, and a
                    SAX destination (for example a <code>SAXResult</code>, a
                    <code>JDOMResult</code>, or a user-written <code>ContentHandler</code>) is
                    supplied to receive the results of the transformation, then Saxon will report an
                    error rather than sending a non-well-formed stream of SAX events to the
                        <code>ContentHandler</code>. This attribute is useful when the output of the
                    stylesheet is sent to a component (for example an XSL-FO rendering engine) that
                    is not designed to accept non-well-formed XML result trees.</p>

                <p>Note also that namespace undeclarations of the form <code>xmlns:p=""</code> (as
                    permitted by XML Namespaces 1.1) are passed to the
                        <code>startPrefixMapping()</code> method of a user-defined
                        <code>ContentHandler</code> only if <code>undeclare-prefixes="yes"</code> is
                    specified on <a class="bodylink code" href="/xsl-elements/output"
                    >xsl:output</a>.</p>
            </desc>
            <data-type name="boolean"/>
        </attribute>

        <attribute name="saxon:single-quotes" required="no">
            <desc>
                <p>If set to <code>yes</code>, the XML, HTML, and XHTML output methods will generally use
                single quotes (apostrophes) rather than double quotes to delimit attribute values.</p>
                <p>This can be useful if the serialized XML/HTML is to be subsequently wrapped in double quotes,
                for example as part of a JSON text, or within a Java string literal. It does not eliminate the
                need to escape double quotes (using <code>\"</code>) in such a context, but it means that
                fewer characters will be affected, which improves the readability of the result.</p>
                <p>The property is ignored in the case of attributes affected by character maps, where
                single or double quotes are used intelligently based on the actual content of the attribute.</p>
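                <p>A minimal sketch, assuming the prefix <code>saxon</code> is bound to the Saxon namespace
                    <code>http://saxon.sf.net/</code>. With a declaration such as the following, an attribute
                    that would otherwise be written as <code>class="note"</code> would generally be expected to
                    appear as <code>class='note'</code> (the attribute name and value are hypothetical):</p>
                <p>
                    <code><![CDATA[<xsl:output method="xml" saxon:single-quotes="yes"/>]]></code>
                </p>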
            </desc>
            <data-type name="boolean"/>
        </attribute>

        <attribute name="saxon:supply-source-locator" required="no">
            <desc>
                <p>Relevant only when output is sent to a user-written <code>ContentHandler</code>,
                    that is, a <code>SAXResult</code>. It causes extra information to be maintained
                    and made available to the <code>ContentHandler</code> for diagnostic purposes:
                    specifically, the <code>Locator</code> that is passed to the
                        <code>ContentHandler</code> via the <code>setDocumentLocator</code> method
                    may be cast to a <code>ContentHandlerProxyLocator</code>, which exposes the
                    method <code>getContextItemStack()</code>. This returns a
                        <code>java.util.Stack</code>. The top item on the stack is the current
                    context item, and below this are previous context items. Each item is
                    represented by the interface <a class="javalink" href="net.sf.saxon.om.Item"
                        >net.sf.saxon.om.Item</a>. If the item is a node, and if the node is one
                    derived by parsing a source document with the line-numbering option enabled,
                    then it is possible to obtain the URI and line number of this node in the
                    original XML source.</p>

                <p>For this to work, the code must be compiled with tracing enabled. This can be
                    achieved by setting the option <code>config.setCompileWithTracing(true)</code>
                    on the <a class="javalink" href="net.sf.saxon.Configuration">Configuration</a>
                    object, or equivalently by setting the property
                        <code>Feature.COMPILE_WITH_TRACING.name</code> on the JAXP
                        <code>TransformerFactory</code>. Note that this compile-time option imposes
                    a substantial run-time overhead, even if tracing is not switched on at run-time
                    by providing a <code>TraceListener</code>.</p>
            </desc>
            <data-type name="boolean"/>
        </attribute>
    </element-syntax>
</section>