<?xml version="1.0" encoding="utf-8"?>
2
<article id="schema-processing" title="Using XML Schema (XSD)">
3
   <h1>Using XML Schema (XSD)</h1>
4

    
5

    
6
   <aside>Schema processing is available only with the Enterprise Edition of the product, Saxon-EE. </aside>
7

    
8
   <p>Saxon can be used as a free-standing schema processor in its own right, either from the
9
      command line or from a Java application. In addition, Saxon can be used as a schema-aware XSLT
10
      processor or as a schema-aware XQuery processor.</p>
11
   
12
   <p>This section covers the following topics:</p>
13
   
14
   <nav>
15
      <ul/>
16
   </nav>
17

    
18
   <p>Saxon-EE supports the schema validation APIs in JAXP 1.3, as well as its own native APIs.</p>
19
   <section id="commandline" title="Running Validation from the Command Line">
20
      <h1>Running Validation from the Command Line</h1>
21

    
22

    
23
      <p>The Java class <a class="javalink" href="com.saxonica.Validate">com.saxonica.Validate</a>
24
         allows you to validate a source XML document against a given schema, or simply to check a
25
         schema for internal correctness.</p>
26

    
27
      <p>To validate one or more source documents, using the Java platform, write:</p>
28
      <kbd>java  com.saxonica.Validate   [options]  source.xml...  </kbd>
29

    
30
      <p>The equivalent on the .NET platform is:</p>
31
      <kbd>Validate [options]  source.xml...  </kbd>
32

    
33
      <p>It is possible to use glob syntax to process multiple files, for example
         <code>Validate *.xml</code>.</p>
35

    
36
      <p>In the above form, the command relies on the use of <code>xsi:schemaLocation</code>
37
         attributes within the instance document to identify the schema to be loaded. As an
38
         alternative, the schema can be specified on the command line:</p>
39
      <kbd>[java com.saxonica.Validate | Validate] -xsd:schema.xsd -s:instance.xml</kbd>
40

    
41
      <p>In this form of the command, it is possible to specify multiple schema documents and/or
42
         multiple instance documents, in both cases as a semicolon-separated list. Glob syntax (such
43
         as <code>*.xml</code>) is available only if the <code>-s:</code> prefix is omitted, because
44
         the shell has to recognize the argument as a filename.</p>
45

    
46
      <p>Thus, source files to be validated can be listed either using the <code>-s</code> option,
47
         or in any argument that is not prefixed with "<code>-</code>". This allows the standard
48
         wildcard expansion facilities of the shell interpreter to be used, for example
49
            <code>*.xml</code> validates all files in the current directory with extension
50
         "xml".</p>
51

    
52
      <p>If no instance documents are supplied, the effect of the command is simply to check a
53
         schema for internal correctness. So a schema can be verified using the command:</p>
54
      <kbd>[java com.saxonica.Validate | Validate] -xsd:schema.xsd</kbd>
55

    
56
      <p>More generally the syntax of the command is:</p>
57
      <kbd>[java com.saxonica.Validate | Validate] [options] [params] [filenames] </kbd>
58

    
59
      <p>where options generally take the form <code>-code:value</code> and params take the form
60
            <code>keyword=value</code>.</p>      
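      <p>When the command is driven from another Java program, the argument list can be
         assembled from lists of file names. The following is a minimal sketch and not part of
         Saxon: the class <code>ValidateArgs</code> and its <code>build</code> method are
         hypothetical names, and only the <code>-xsd</code> and <code>-s</code> forms described
         above are used.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ValidateArgs {

    // Hypothetical helper: builds the option list for the Validate command.
    // Each non-empty list becomes one option whose value is a
    // semicolon-separated list of file names.
    static List<String> build(List<String> schemas, List<String> instances) {
        List<String> args = new ArrayList<>();
        if (!schemas.isEmpty()) {
            args.add("-xsd:" + String.join(";", schemas));
        }
        if (!instances.isEmpty()) {
            // Omitting -s entirely means "only check the schema".
            args.add("-s:" + String.join(";", instances));
        }
        return args;
    }

    public static void main(String[] argv) {
        List<String> args = build(
                Arrays.asList("a.xsd", "b.xsd"),
                Arrays.asList("one.xml", "two.xml"));
        System.out.println(String.join(" ", args));
    }
}
```

      <p>If such a list is passed to a process launcher as separate arguments rather than
         through a shell, the semicolons need no quoting.</p>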
61
      
62
      <h2 class="subtitle">Command line options</h2>
63

    
64
      <p>The options are as follows (in any order): </p>
65
      <table>
66
         <tr>
67
            <td>
68
               <p>-catalog:filenames</p>
69
            </td>
70
            <td>
71
               <p><i>filenames</i> is either a file name or a list of file names separated by
72
                  semicolons; the files are OASIS XML catalogs used to define how public identifiers
73
                  and system identifiers (URIs) used in a source document or schema are to be
74
                  redirected, typically to resources available locally. For more details see <a
75
                     class="bodylink" href="/sourcedocs/xml-catalogs">Using XML Catalogs</a>. </p>
76
            </td>
77
         </tr>
78
         <tr>
79
            <td>
80
               <p>-config:filename</p>
81
            </td>
82
            <td>
83
               <p>Loads options from a <a class="bodylink" href="/configuration/configuration-file"
84
                     >configuration file</a>. This must describe a schema-aware configuration. </p>
85
            </td>
86
         </tr>
87
         <tr>
88
            <td>
89
               <p>-dtd:(on|off|recover)</p>
90
            </td>
91
            <td>
92
               <p>Setting <code>-dtd:on</code> requests DTD-based validation of the source files;
                  this requires an XML parser that supports validation. The setting
                  <code>-dtd:off</code> (which is the
95
                  default) suppresses DTD validation. The setting <code>-dtd:recover</code> performs
96
                  DTD validation but treats the error as non-fatal if it fails. Note that any
97
                  external DTD is likely to be read even if not used for validation, because DTDs
98
                  can contain definitions of entities.</p>
99
            </td>
100
         </tr>
101
         <tr>
102
            <td>
103
               <p>-export:filename</p>
104
            </td>
105
            <td>
106
               <p>Makes a copy of the compiled schema (providing it is valid) as a schema component
107
                  model to the specified XML file. This file will contain schema components
108
                  corresponding to all the loaded schema documents. This option may be combined with
109
                  other options: the SCM file is written after all document instance validation has
110
                  been carried out.</p>
111
            </td>
112
         </tr>
113
         <tr>
114
            <td>
115
               <p>-ext:(on|off)</p>
116
            </td>
117
            <td>
118
               <p>If <code>-ext:off</code> is specified, suppress calls on dynamically-loaded
119
                  external Java functions. This does not affect calls on integrated extension
120
                  functions, including Saxon and EXSLT extension functions. This option is useful
121
                  when loading an untrusted schema, perhaps from a remote site using an
122
                  <code>http://</code> URL; it ensures that the schema cannot call arbitrary Java
123
                  methods and thereby gain privileged access to resources on your machine. </p>
124
            </td>
125
         </tr>
126
         <tr>
127
            <td>
128
               <p>-init:initializer</p>
129
            </td>
130
            <td>
131
               <p>The value is the name of a user-supplied class that implements the interface <a
132
                     class="javalink" href="net.sf.saxon.lib.Initializer"
133
                     >net.sf.saxon.lib.Initializer</a>; this initializer will be called during the
134
                  initialization process, and may be used to set any options required on the <a
135
                     class="javalink" href="net.sf.saxon.Configuration">Configuration</a>
136
                  programmatically. </p>
137
            </td>
138
         </tr>
139
         <tr>
140
            <td>
141
               <p>-limits:min,max</p>
142
            </td>
143
            <td>
144
               <p>Sets upper limits on the values of <code>minOccurs</code> and
145
                     <code>maxOccurs</code> allowed in a schema content model, in cases where Saxon
146
                  is not able to implement the rules using a finite state machine with counters. For
147
                  further details see <a class="bodylink" href="../min-and-maxoccurs">Handling
148
                        <code>minOccurs</code> and <code>maxOccurs</code></a>. </p>
149
            </td>
150
         </tr>
151
         <tr>
152
            <td>
153
               <p>-opt:0...10</p>
154
            </td>
155
            <td>
156
               <p>Set optimization level. The value is an integer in the range 0 (no optimization)
157
                  to 10 (full optimization); currently all values other than 0 result in full
158
                  optimization but this is likely to change in future. The default is full
159
                  optimization; this feature allows optimization to be suppressed in cases where
160
                  reducing compile time is important, or where optimization gets in the way of
161
                  debugging, or causes extension functions with side-effects to behave
162
                  unpredictably. (Note however, that even with no optimization, lazy evaluation may
163
                  still cause the evaluation order to be not as expected.) </p>
164
            </td>
165
         </tr>
166
         <tr>
167
            <td>
168
               <p>-quit:(on|off)</p>
169
            </td>
170
            <td>
171
               <p>With the default setting, <code>on</code>, the command will quit the Java VM and
172
                  return an exit code if a failure occurs. This is useful when running from an
173
                  operating system shell. With the setting <code>-quit:off</code> the command instead
                  throws a <code>RuntimeException</code>, which is more useful when the command is
175
                  invoked from another Java application such as Ant. </p>
176
            </td>
177
         </tr>
178
         <tr>
179
            <td>
180
               <p>-r:classname</p>
181
            </td>
182
            <td>
183
               <p>Use the specified <code>URIResolver</code> to process the URIs of all schema
184
                  documents and source documents. The <code>URIResolver</code> is a user-defined
                  class that implements the <code>URIResolver</code> interface defined in JAXP,
                  whose function is to take a URI supplied as a string and return a JAXP
                     <code>Source</code>. It is invoked to process URIs found in
188
                     <code>xs:include</code> and <code>xs:import</code>
189
                  <code>schemaLocation</code> attributes of schema documents, the URIs found in
190
                     <code>xsi:schemaLocation</code> and <code>xsi:noNamespaceSchemaLocation</code>
191
                  attributes in the source document, and (if <code>-u</code> is also specified) to
192
                  process the URI of the source file provided on the command line. Specifying
193
                     <code>-r:org.apache.xml.resolver.tools.CatalogResolver</code> selects the
194
                  Apache XML resolver (part of the Apache Commons project, which must be on the
195
                  classpath) and enables URIs to be resolved via a catalog, allowing references to
196
                  external websites to be redirected to local copies.</p>
197
            </td>
198
         </tr>
199
         <tr>
200
            <td>
201
                <p>-report:filename</p>
202
            </td>
203
            <td>
204
               <p>Switches on the capture of a validation report. Here <i>filename</i> specifies
                  the file to which the report is written. The report is in XML format, defined by
                  the schema <code>validation-reports.xsd</code>, which is available in the
                  <code>saxon-resources</code> download file.</p>
208
            </td>
209
         </tr>
210
         <tr>
211
            <td>
212
               <p>-s:file;file...</p>
213
            </td>
214
            <td>
215
               <p>Supplies a list of source documents to be validated. Each document is validated
216
                  using the same options. The value is a list of filenames separated by semicolons.
217
                  It is also possible to specify the names of source documents as arguments without
218
                  any preceding option flag; in this case shell wildcards can be used. A filename
219
                  can be specified as "<code>-</code>" to read the source document from standard
220
                  input, in which case the base URI is taken from that of the current directory.</p>
221
               <p>By default, multiple source documents are validated in parallel, using as many
                  threads as there are processors available on the machine. If the
                  <code>Configuration</code> option <code>Feature.ALLOW_MULTITHREADING</code> is set
                  to false, the source documents are instead validated one after another in a
                  single thread.</p>
227
            </td>
228
         </tr>
229
         <tr>
230
            <td>
231
               <p>-scmin:filename</p>
232
            </td>
233
            <td>
234
               <p>Loads a precompiled schema component model from the given file. The file should be
235
                  generated in a previous run using the <code>-export</code> option. When this
236
                  option is used, the <code>-xsd</code> option should not be present. Schemas loaded
237
                  from an SCM file are assumed to be valid, without checking.</p>
238
               <p><i>This option is retained for compatibility. From Saxon 9.7, SCM files can also be
239
               supplied in the <code>-xsd</code> option.</i></p>
240
            </td>
241
         </tr>
242
         <tr>
243
            <td>
244
               <p>-scmout:filename</p>
245
            </td>
246
            <td>
247
               <p>Synonym of <code>-export:filename</code>, retained for compatibility.</p>
248
                
249
            </td>
250
         </tr>
251
         <tr>
252
            <td>
253
               <p>-stats:filename</p>
254
            </td>
255
            <td>
256
               <p>Requests creation of an XML document containing statistics showing which schema
257
                  components were used during the validation episode, and how often (coverage data).
258
                  This data can be used as input to further processes to produce user-readable
259
                  reports; for example the data could be combined with the output of
260
                     <code>-scmout</code> to show which components were not used at all during the
261
                  validation.</p>
262
            </td>
263
         </tr>
264
         <tr>
265
            <td>
266
               <p>-t</p>
267
            </td>
268
            <td>
269
               <p>Requests display of version and timing information to the standard error output.
270
                  This also shows all the schema documents that have been loaded.</p>
271
            </td>
272
         </tr>
273
         <tr>
274
            <td>
275
               <p>-top:element-name</p>
276
            </td>
277
            <td>
278
               <p>Requires that the outermost element of the instance being validated has the
279
                  required name. This is written in Clark notation format
280
                  <code>{uri}local</code>.</p>
281
            </td>
282
         </tr>
283
         <tr>
284
            <td>
285
               <p>-u</p>
286
            </td>
287
            <td>
288
               <p>Indicates that the names of the source document and schema document are supplied as
                  URIs; otherwise they are taken as filenames, unless they start with "http:" or
                  "file:", in which case they are taken as URLs.</p>
291
            </td>
292
         </tr>
293
         <tr>
294
            <td>
295
               <p>-val:(strict|lax)</p>
296
            </td>
297
            <td>
298
               <p>Invokes strict or lax validation (default is <code>strict</code>). Lax validation
299
                  validates elements only if there is an element declaration to validate them
300
                  against, or if they have an <code>xsi:type</code> attribute.</p>
301
            </td>
302
         </tr>
303
         <tr>
304
            <td>
305
               <p>-x:classname</p>
306
            </td>
307
            <td>
308
               <p>Requests use of the specified SAX parser for parsing the source file. The
309
                  classname must be the fully-qualified name of a Java class that implements the
310
                     <code>org.xml.sax.XMLReader</code> interface. In the absence of this argument,
311
                  the standard JAXP facilities are used to locate an XML parser. Note that the XML
312
                  parser performs the raw XML parsing only; Saxon always does the schema validation
313
                  itself. Selecting <code>-x:org.apache.xml.resolver.tools.ResolvingXMLReader</code>
314
                  selects a parser configured to use the Apache entity resolver, so that DTD and
315
                  other external references in source documents are resolved via a catalog. The
316
                  parser (part of the Apache Commons project) must be on the classpath.</p>
317
            </td>
318
         </tr>
319
         <tr>
320
            <td>
321
               <p>-xi:(on|off)</p>
322
            </td>
323
            <td>
324
               <p>Apply XInclude processing to all source XML documents (but not to schema documents). 
325
                  This currently only works when documents are parsed using the
326
                  Xerces parser, which is the default in JDK 1.5 and later.</p>
327
            </td>
328
         </tr>
329
         <tr>
330
            <td>
331
               <p>-xmlversion:(1.0|1.1)</p>
332
            </td>
333
            <td>
334
               <p>If set to 1.1, allows XML 1.1 and XML Namespaces 1.1 constructs. This option must
335
                  be set if source documents using XML 1.1 are to be validated, or if the schema
336
                  itself is an XML 1.1 document. This option causes types such as
337
                     <code>xs:Name</code>, <code>xs:QName</code>, and <code>xs:ID</code> to use the
338
                  XML 1.1 definitions of these constructs.</p>
339
            </td>
340
         </tr>
341
         <tr>
342
            <td>
343
               <p>-xsd:file;file...</p>
344
            </td>
345
            <td>
346
               <p>Supplies a list of schema documents to be used for validation. The value is a list
347
                  of filenames separated by semicolons. If no source documents are supplied, the
348
                  schema documents will be processed and any errors in the schema will be notified.
349
                  This option must not be used when <code>-scmin</code> is specified. The option may
350
                  be omitted, in which case the schema to be used for validation will be located
351
                  using the <code>xsi:schemaLocation</code> and
352
                     <code>xsi:noNamespaceSchemaLocation</code> attributes in the source document. A
353
                  filename can be specified as "<code>-</code>" to read the schema from standard
354
                  input, in which case the base URI is taken from that of the current directory.</p>
355
               <p>The documents may either be source XSD schema documents, or compiled SCM files generated
356
               previously using the <code>-export</code> option. Loading precompiled schemas in SCM format
357
               is substantially faster. In addition, an SCM file may contain an embedded license key, in which
358
               case it is possible to use it for validation using a Saxon-EE configuration that does not have its
359
               own license.</p>
360
            </td>
361
         </tr>
362
         <tr>
363
            <td>
364
               <p>-xsdversion:(1.0|1.1)</p>
365
            </td>
366
            <td>
367
               <p>Indicates whether the schema processor is to act as an XSD 1.0 or XSD 1.1
368
                  processor. The default is XSD 1.1.</p>
369
            </td>
370
         </tr>
371
         <tr>
372
            <td>
373
               <p>-xsiloc:(on|off)</p>
374
            </td>
375
            <td>
376
               <p>If set to <code>on</code> (the default) the schema processor attempts to load any
377
                  schema documents referenced in <code>xsi:schemaLocation</code> and
378
                     <code>xsi:noNamespaceSchemaLocation</code> attributes in the instance document,
379
                  unless a schema for the specified namespace (or non-namespace) is already
380
                  available. If set to <code>off</code>, these attributes are ignored.</p>
381
            </td>
382
         </tr>
383
         <tr>
384
            <td>
385
               <p>-y:classname</p>
386
            </td>
387
            <td>
388
               <p>Use the specified SAX parser for schema documents. The supplied classname
389
                  must be the fully-qualified class name of a Java class that implements the
390
                  <code>org.xml.sax.XMLReader</code> or
391
                  <code>javax.xml.parsers.SAXParserFactory</code> interface, and it must be
392
                  instantiable using a zero-argument public constructor.</p>
393
            </td>
394
         </tr>
395
         <tr>
396
            <td>
397
               <p>--<i>feature</i>:value</p>
398
            </td>
399
            <td>
400
               <p>Set a feature defined in the <a class="javalink" href="net.sf.saxon.Configuration"
401
                     >Configuration</a> interface. The names of features are defined in the Javadoc
402
                  for class <a class="javalink" href="net.sf.saxon.lib.Feature">Feature</a>:
403
                  the value used here is the part of the name after the last "/", for example
404
                     <code>--allow-external-functions:off</code>. Only features accepting a string
405
                  or boolean may be set; for booleans the values
406
                     <code>true</code>/<code>false</code> or <code>on</code>/<code>off</code> are
407
                  recognized.</p>
408
            </td>
409
         </tr>
410
         <tr>
411
            <td>
412
               <p>-?</p>
413
            </td>
414
            <td>
415
               <p>Display command syntax.</p>
416
            </td>
417
         </tr>
418
         <tr>
419
            <td>
420
               <p>--?</p>
421
            </td>
422
            <td>
423
               <p>Display a list of features that are available using the <code>--feature:value</code> syntax.</p>
424
            </td>
425
         </tr>
426
      </table>      
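      <p>The Clark notation <code>{uri}local</code> used by the <code>-top</code> option can be
         illustrated with the standard <code>javax.xml.namespace.QName</code> class, whose
         <code>valueOf()</code> and <code>toString()</code> methods use the same notation. This is
         only a sketch of the notation: the class name <code>ClarkNameDemo</code> and the
         namespace <code>http://example.com/ns</code> are made up for the example, and nothing
         here is meant to describe how Saxon itself parses the option.</p>

```java
import javax.xml.namespace.QName;

public class ClarkNameDemo {

    // Parses a name written as "{uri}local"; a bare "local" gives a
    // name in no namespace (the namespace URI is then the empty string).
    static QName parse(String clark) {
        return QName.valueOf(clark);
    }

    public static void main(String[] args) {
        QName top = parse("{http://example.com/ns}invoice");
        System.out.println(top.getNamespaceURI());
        System.out.println(top.getLocalPart());
        // QName.toString() writes the name back in Clark notation,
        // so it can be appended directly to the option prefix.
        System.out.println("-top:" + top);
    }
}
```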
427
      
428
      <h2 class="subtitle">Command line parameters</h2>
429

    
430
      <p>Parameters on the command line can be used to supply values for any
431
            <code>saxon:param</code> declarations in the schema. See <a class="bodylink"
432
            href="../parameterizing-schemas">Parameterizing Schemas</a> for details. The format of
433
         parameters is the same as for the XSLT and XQuery command lines: <code>name=value</code> to
434
         supply a simple value; <code>+name=filename</code> to supply the contents of an XML
435
         document as the parameter value; or <code>?name=expression</code> to supply the result of
436
         evaluating an XPath expression (for example, <code>?date=current-date()</code>).</p>
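      <p>The three parameter forms differ only in an optional leading character. The sketch
         below classifies a parameter string accordingly. It is illustrative only: the class
         <code>ParamForm</code> and its methods are hypothetical and are not Saxon's own
         command-line parser.</p>

```java
public class ParamForm {

    enum Kind { SIMPLE_VALUE, DOCUMENT_FILE, XPATH_EXPRESSION }

    // Classifies a parameter by its leading character:
    // "+" means a document file, "?" an XPath expression, anything else a simple value.
    static Kind kindOf(String param) {
        if (param.startsWith("+")) return Kind.DOCUMENT_FILE;
        if (param.startsWith("?")) return Kind.XPATH_EXPRESSION;
        return Kind.SIMPLE_VALUE;
    }

    // Returns the parameter name: the text before the first "=",
    // without any "+" or "?" prefix.
    static String nameOf(String param) {
        int eq = param.indexOf('=');
        if (eq == -1) {
            throw new IllegalArgumentException("expected name=value: " + param);
        }
        int start = (kindOf(param) == Kind.SIMPLE_VALUE) ? 0 : 1;
        return param.substring(start, eq);
    }

    // Returns everything after the first "=".
    static String valuePart(String param) {
        int eq = param.indexOf('=');
        if (eq == -1) {
            throw new IllegalArgumentException("expected name=value: " + param);
        }
        return param.substring(eq + 1);
    }

    public static void main(String[] args) {
        for (String p : new String[] {"limit=10", "+codes=codes.xml", "?date=current-date()"}) {
            System.out.println(kindOf(p) + " " + nameOf(p) + " " + valuePart(p));
        }
    }
}
```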
437

    
438
      <p>The results of processing the schema, and of validating the source document against the
439
         schema, are written to the standard error output. Unless the <code>-t</code> option is
440
         used, successful processing of the source document and schema results in no output.</p>
441
   </section>
442
   <section id="scm" title="Importing and Exporting Schema Component Models">
443
      <h1>Importing and Exporting Schema Component Models</h1>
444
      
445
      
446
      <p>Saxon provides the ability to export or import a compiled schema. The export format is an
447
         XML file, known as an SCM file (for schema component model). Using SCM files has three benefits:</p>
448
      
449
      <ul>
450
         <li><p>An SCM file is much faster to load than the corresponding source schema documents.</p></li>
         <li><p>An SCM file is much easier for applications to process than the corresponding source documents:
            whether written in Java, XSLT, or XQuery, applications that need access to schema information can find
            it much more readily in an SCM document than in the source schema.</p></li>
454
         <li><p>An SCM file may contain an embedded license key, enabling it to be used for validating source documents on
455
         a Saxon-EE configuration that does not have its own license.</p></li>
456
      </ul>
457
      
458
      <p>The simplest way to create an SCM file is from the command line, using the
459
         <code>com.saxonica.Validate</code> command with the <code>-export</code> option. This is
460
         described in <a class="bodylink" href="../commandline">Running Validation from the Command
461
            Line</a>. Alternatively, an SCM file can be generated programmatically using the
462
         <code>exportComponents()</code> method of the <a class="javalink"
463
            href="com.saxonica.config.EnterpriseConfiguration"
464
            >com.saxonica.config.EnterpriseConfiguration</a> class, which is described in the
465
         Javadoc. The serializer is unselective: it will output an SCM containing all the schema
466
         components that have been loaded into the <code>Configuration</code>, other than built-in
467
         schema components.</p>
468
      
469
      <p>An SCM file is accepted by most interfaces that allow a source XSD file
470
      to be supplied, for example:</p>
471
      
472
      <ul>
473
         <li><p>The <code>-xsd</code> option of the command-line <code>com.saxonica.Validate</code> command</p></li>
474
         <li><p>The <code>load()</code> method of the s9api <code>SchemaManager</code></p></li>
475
         <li><p>The JAXP <code>SchemaFactory.newSchema()</code> method</p></li>
476
         <li><p>The <code>xsi:schemaLocation</code> and <code>xsi:noNamespaceSchemaLocation</code> attributes in
477
         an instance document (the SCM file contains components for multiple namespaces, and it should only be loaded once)</p></li>
478
         <li><p>An <code>xsl:import-schema</code> declaration in XSLT, or an <code>import schema</code> declaration in XQuery.</p></li>
479
      </ul>
480
      
481
      <p>A schema loaded in this way is then available for all tasks performed using this
482
         <code>Configuration</code>, including validation of source documents and compiling of
483
         schema-aware queries and stylesheets.</p>
484
      
485
      <p>Schema Component Models can also be imported and exported using the
486
         <code>importComponents()</code> and <code>exportComponents()</code> methods of the <a
487
            class="javalink" href="net.sf.saxon.s9api.SchemaManager">SchemaManager</a> in the s9api
488
         interface.</p>
489
      
490
      <p>An SCM file cannot be used as the target of <code>xs:include</code>, <code>xs:import</code>,
491
      <code>xs:redefine</code>, or <code>xs:override</code> declarations within a schema document. An 
492
      SCM file represents a complete schema, not an individual module.</p>
493
      
494
      <p>The schema components within an SCM file are <i>sealed</i>. This means it is not possible to
495
      change their effective meaning by adding new members to substitution groups, or deriving new types
496
      by extension. The components within an SCM file may be referenced from other components (loaded from a normal
497
      XSD document): for example types within an SCM file may be referred to from new element declarations.
498
      However, an SCM file will always be self-contained: it cannot contain external references to components
499
      loaded from elsewhere. It is possible to load two SCM files only if their components are non-overlapping, and 
500
      neither refers to components in the other.</p>
501
      
502
      <p>If the configuration used to generate an SCM file is licensed with a <i>developer master key</i>,
503
      then any exported SCM file will include an embedded license allowing it to be loaded and used for validation
504
      on a Saxon-EE configuration that does not have its own license. An SCM file containing an embedded license
505
      is protected from modification by checksums and digital signatures.</p>
506
      
507
      <p>The structure of an SCM file is defined in the schema <code>scmschema.xsd</code> which is
508
         available in the directory <code>samples/scm/</code> in the <code>saxon-resources</code>
509
         download file. This is annotated to explain the mappings between elements and attributes in
510
         the SCM file and components and properties as defined in the W3C XML Schema Specification.
511
         The same directory contains a file <code>scmschema.scm</code> which contains the schema for
512
         SCM in SCM format.</p>
513
      <aside>The SCM file includes a representation of the finite state machines used to validate
514
         instances against a complex type. This means that the FSM does not need to be regenerated
515
         when a schema is loaded from an SCM file, which saves a lot of time. However, it also means
516
         that the SCM format is not currently suitable as a target format for software-generated
517
         schemas. A variant of SCM in which the finite state machines can be omitted may be provided
518
         in a future release.</aside>
519
   </section>
520
   <section id="validation-api" title="Controlling Validation from Java">
521
      <h1>Controlling Validation from Java</h1>
522

    
523

    
524
      <p>Schema validation can be controlled either using the standard JAXP Java interface, or using
525
         Saxon's own <strong>s9api</strong> interface. The two approaches are described in the
526
         following sections. The main advantage of using JAXP is that it is portable; the main
527
         advantage of s9api is that it is better integrated across the range of Saxon XML processing
528
         interfaces.</p>
529
      <nav>
530
         <ul/>
531
      </nav>
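      <p>As an illustration of the portability of the JAXP route, the following sketch validates
         two small documents using only the standard <code>javax.xml.validation</code> classes.
         It is a minimal example under stated assumptions: the schema, the instance documents
         and the class name <code>JaxpValidationSketch</code> are invented for the purpose, and
         on a plain JDK <code>SchemaFactory.newInstance()</code> returns the JDK's built-in
         schema processor. How a Saxon-EE factory is selected is not shown here.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class JaxpValidationSketch {

    // Illustrative schema: a single element "count" holding an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:integer'/>"
      + "</xs:schema>";

    // Returns true if the instance is valid against the schema.
    // With no ErrorHandler set, a validation error is thrown as a SAXException.
    // For brevity, a faulty schema is also reported as false.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<count>42</count>"));
        System.out.println(isValid(XSD, "<count>many</count>"));
    }
}
```

      <p>The code refers only to JAXP interfaces, so it contains nothing specific to one schema
         processor.</p>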
532

    
533
      <section id="schema-s9api" title="Schema Processing using s9api">
534
         <h1>Schema Processing using s9api</h1>
535

    
536

    
537
         <p>The s9api interface allows schemas to be loaded into a <a class="javalink"
538
               href="net.sf.saxon.s9api.Processor">Processor</a>, and then to be used for validating
539
            instances, or for schema-aware XSLT and XQuery processing.</p>
540

    
541
         <p>The main steps are:</p>
542
         <ol>
543
            <li>
544
               <p>Create a <a class="javalink" href="net.sf.saxon.s9api.Processor"
545
                     >net.sf.saxon.s9api.Processor</a> and call its <code>getSchemaManager()</code>
546
                  method to get a <a class="javalink" href="net.sf.saxon.s9api.SchemaManager"
547
                     >SchemaManager</a>.</p>
548
            </li>
549
            <li>
550
               <p>If required, set options on the <a class="javalink"
551
                     href="net.sf.saxon.s9api.SchemaManager">SchemaManager</a> to control the way in
552
                  which schema documents will be loaded.</p>
553
            </li>
554
            <li>
555
               <p>Load a schema document by calling the <code>load()</code> method, which takes a
556
                  JAXP Source object as its argument. The resulting schema document is available to
557
                  all applications run within the containing <a class="javalink"
558
                     href="net.sf.saxon.s9api.Processor">Processor</a>.</p>
559
            </li>
560
            <li>
561
               <p>To validate an instance document, call the <code>newSchemaValidator()</code>
562
                  method on the <a class="javalink" href="net.sf.saxon.s9api.SchemaManager"
563
                     >SchemaManager</a> object. </p>
564
            </li>
565
            <li>
566
               <p>Set options on the <a class="javalink" href="net.sf.saxon.s9api.SchemaValidator"
567
                     >SchemaValidator</a> to control the way in which a particular validation
568
                  episode is performed, and then invoke its <code>validate()</code> method to
569
                  validate an instance document.</p>
570
            </li>
571
         </ol>
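         <p>The steps above can be sketched as follows. This is a minimal illustration rather than
            a definitive implementation: the file names <code>books.xsd</code> and
            <code>books.xml</code> are hypothetical, and the option settings shown (XSD version,
            strict rather than lax validation) are only examples of what might be set at each
            step.</p>
         <samp><![CDATA[import java.io.File;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.SchemaManager;
import net.sf.saxon.s9api.SchemaValidator;

public class S9apiValidationSketch {
    public static void main(String[] args) {
        // Hypothetical file names: substitute your own schema and instance
        File schemaFile = new File("books.xsd");
        File instanceFile = new File("books.xml");
        try {
            // Step 1: "true" requests a licensed (Saxon-EE) Processor
            Processor processor = new Processor(true);
            SchemaManager manager = processor.getSchemaManager();

            // Step 2 (optional): control how schema documents are loaded
            manager.setXsdVersion("1.1");

            // Step 3: load the schema; it is now available throughout the Processor
            manager.load(new StreamSource(schemaFile));

            // Step 4: create a validator
            SchemaValidator validator = manager.newSchemaValidator();

            // Step 5: set options for this validation episode, then validate
            validator.setLax(false);
            validator.validate(new StreamSource(instanceFile));
            System.out.println("Validation succeeded");
        } catch (SaxonApiException err) {
            System.err.println("Validation failed: " + err.getMessage());
        }
    }
}
]]></samp>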
572

    
573
         <p>Note that additional schemas referenced from the <code>xsi:schemaLocation</code>
574
            attributes within the source documents will be loaded as necessary. By default a target
575
            namespace is ignored if there is already a loaded schema for that namespace; Saxon makes
576
            no attempt to load multiple schemas for the same namespace and check them for
577
            consistency. This behaviour can be changed using the configuration option <a
578
               class="javalink" href="net.sf.saxon.lib.Feature#MULTIPLE_SCHEMA_IMPORTS"
579
               >MULTIPLE_SCHEMA_IMPORTS</a>.</p>
580

    
581
         <p>Although the API is defined in such a way that a <a class="javalink"
582
               href="net.sf.saxon.s9api.SchemaValidator">SchemaValidator</a> is created for a
583
            particular <a class="javalink" href="net.sf.saxon.s9api.SchemaManager"
584
            >SchemaManager</a>, in the Saxon implementation the schema components that are available
585
            to the validator are not only the components within that schema, but all the components
586
            that form part of any schema registered with the <code>Processor</code> (or indeed, with
587
            the underlying <a class="javalink" href="net.sf.saxon.Configuration"
588
            >Configuration</a>).</p>
589

    
590
         <p>The <a class="javalink" href="net.sf.saxon.s9api.SchemaValidator">SchemaValidator</a>
591
            implements the <a class="javalink" href="net.sf.saxon.s9api.Destination">Destination</a>
592
            interface, which means it can be used to receive input from any process that writes to a
593
               <code>Destination</code>, for example an XSLT transformation or an XQuery query. The
594
            result of validation can also be sent to any <code>Destination</code>, for example an
595
            XSLT transformer.</p>
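         <p>As an illustration of this pipelining, the following sketch sends the result of a
            transformation through a <code>SchemaValidator</code> and then on to a serializer. The
            file names (<code>report.xsd</code>, <code>report.xsl</code>, <code>input.xml</code>,
            <code>report.xml</code>) are hypothetical, and error handling is reduced to a single
            <code>throws</code> clause.</p>
         <samp><![CDATA[import java.io.File;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.SchemaManager;
import net.sf.saxon.s9api.SchemaValidator;
import net.sf.saxon.s9api.Serializer;
import net.sf.saxon.s9api.XsltCompiler;
import net.sf.saxon.s9api.XsltExecutable;
import net.sf.saxon.s9api.XsltTransformer;

public class ValidatingPipelineSketch {
    public static void main(String[] args) throws SaxonApiException {
        Processor processor = new Processor(true);

        // Load the schema that the transformation result must conform to
        SchemaManager manager = processor.getSchemaManager();
        manager.load(new StreamSource(new File("report.xsd")));

        // The validator passes validated output on to its own Destination
        SchemaValidator validator = manager.newSchemaValidator();
        Serializer serializer = processor.newSerializer(new File("report.xml"));
        validator.setDestination(serializer);

        // The transformer writes to the validator, which acts as a Destination
        XsltCompiler compiler = processor.newXsltCompiler();
        XsltExecutable executable =
                compiler.compile(new StreamSource(new File("report.xsl")));
        XsltTransformer transformer = executable.load();
        transformer.setSource(new StreamSource(new File("input.xml")));
        transformer.setDestination(validator);
        transformer.transform();
    }
}
]]></samp>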
596
      </section>
597

    
598
      <section id="schema-jaxp" title="Schema Processing using JAXP">
599
         <h1>Schema Processing using JAXP</h1>
600

    
601

    
602
         <p>Applications can invoke schema processing using the APIs provided in JAXP 1.3. This
603
            makes Saxon interchangeable with other schema processors implementing this interface.
604
            There is full information on these APIs in the Java documentation. The two main
605
            mechanisms are the <code>Validator</code> class, and the <code>ValidatorHandler</code>
606
            class. Sample applications using these interfaces are provided in the
607
               <code>samples/java</code> directory of the <code>saxon-resources</code> download (see
608
             <code>SchemaValidatorExample.java</code> and <code>SchemaValidatorHandlerExample.java</code>).
609
            Saxon also supplies the class <a class="javalink"
610
               href="com.saxonica.ee.jaxp.ValidatingReader"
611
               >com.saxonica.ee.jaxp.ValidatingReader</a>, which implements the SAX2
612
               <code>XMLReader</code> interface, allowing it to be used as a schema-validating XML
613
            parser.</p>
614

    
615
         <p>The main steps are:</p>
616
         <ol>
617
            <li>
618
               <p>Create a <code>SchemaFactory</code>, by calling
619
                     <code>SchemaFactory.newInstance()</code> with the argument
620
                     <code>"http://www.w3.org/2001/XMLSchema"</code>, and with the Java system
621
                  properties set up to ensure that Saxon is loaded as the chosen schema processor.
622
                  Saxon will normally be loaded as the default schema processor if Saxon-EE is
623
                  present on the classpath, but to make absolutely sure, set the system property
624
                     <code>javax.xml.validation.SchemaFactory:http://www.w3.org/2001/XMLSchema</code>
625
                  to the value <a class="javalink" href="com.saxonica.ee.jaxp.SchemaFactoryImpl"
626
                     >com.saxonica.ee.jaxp.SchemaFactoryImpl</a>. Note that if you set this property
627
                  using a property file, colons in the property name must be escaped as
628
                     "<code>\:</code>".</p>
629
            </li>
630
            <li>
631
               <p>Process a schema document by calling one of the several <code>newSchema()</code>
632
                  methods on the returned <code>SchemaFactory</code>.</p>
633
            </li>
634
            <li>
635
               <p>Create either a <code>Validator</code> or a <code>ValidatorHandler</code> from
636
                  this returned <code>Schema</code>.</p>
637
            </li>
638
            <li>
639
               <p>Use the <code>Validator</code> or <code>ValidatorHandler</code> to process one or
640
                  more source documents.</p>
641
            </li>
642
         </ol>
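         <p>The four steps can be sketched using only the standard JAXP classes. The schema and
            the instance documents below are invented for the illustration and are supplied as
            strings to keep the example self-contained. Because the code uses no Saxon-specific
            classes, it runs with whichever schema processor the JAXP search mechanism selects;
            set the system property described in step 1 if you need to be sure that Saxon-EE is
            used.</p>
         <samp><![CDATA[import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class JaxpValidationSketch {

    // A deliberately tiny schema, invented for this sketch
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='quantity' type='xs:positiveInteger'/>"
        + "</xs:schema>";

    // Returns true if the instance document is valid against the schema
    static boolean isValid(String xsd, String xml) throws IOException, SAXException {
        // Step 1: obtain a SchemaFactory for the W3C XML Schema language
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // Step 2: process the schema document (throws SAXException if it is in error)
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        // Step 3: create a Validator from the Schema
        Validator validator = schema.newValidator();
        try {
            // Step 4: validate a source document
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException err) {
            // With no ErrorHandler set, a validation error is thrown as an exception
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(XSD, "<quantity>3</quantity>"));
        System.out.println(isValid(XSD, "<quantity>none</quantity>"));
    }
}
]]></samp>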
643

    
644
         <p>Saxon also provides the class <code>SchemaFactory11</code> which automatically enables
645
            support for XSD 1.1. When the JAXP search mechanism is used, this schema factory will be
646
            selected if the schema language required is set to
647
               <code>http://www.w3.org/XML/XMLSchema/v1.1</code>. Saxon also recognizes the generic
648
            language identifier <code>http://www.w3.org/XML/XMLSchema</code> and the XSD 1.0
649
            identifier <code>http://www.w3.org/XML/XMLSchema/vX.Y</code> as requests for an XSD 1.0
650
            processor.</p>
651

    
652
         <p>Note that additional schemas referenced from the <code>xsi:schemaLocation</code>
653
            attributes within the source documents will be loaded as necessary. A target namespace
654
            is ignored if there is already a loaded schema for that namespace; Saxon makes no
655
            attempt to load multiple schemas for the same namespace and check them for
656
            consistency.</p>
657

    
658
         <p>Although the API is defined in such a way that a <code>Validator</code> or
659
               <code>ValidatorHandler</code> is created for a particular <code>Schema</code>, in the
660
            Saxon implementation the schema components that are available to the validator are not
661
            only the components within that schema, but all the components that form part of any
662
            schema registered with the <a class="javalink" href="net.sf.saxon.Configuration"
663
               >Configuration</a>.</p>
664

    
665
         <p>Another way to control validation from a Java application is to run a JAXP identity
666
            transformation, having first set the option to perform schema validation. The following
667
            code (from the sample application <code>QuickValidator.java</code>) illustrates
668
            this:</p>
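         <samp><![CDATA[// Imports needed by this fragment
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.lib.Feature;
import net.sf.saxon.lib.Validation;
import org.xml.sax.helpers.DefaultHandler;

try {
    // Select the Saxon-EE implementation of TransformerFactory
    System.setProperty(
            "javax.xml.transform.TransformerFactory",
            "com.saxonica.config.EnterpriseTransformerFactory");
    TransformerFactory factory =
            TransformerFactory.newInstance();
    // Request strict validation of source documents
    factory.setAttribute(Feature.SCHEMA_VALIDATION.name,
            Integer.valueOf(Validation.STRICT));
    factory.setAttribute(Feature.VALIDATION_WARNINGS.name,
            Boolean.TRUE);
    // An identity transformation: the validated output is discarded
    Transformer trans = factory.newTransformer();
    StreamSource source =
            new StreamSource(new File(args[0]).toURI().toString());
    SAXResult sink =
            new SAXResult(new DefaultHandler());
    trans.transform(source, sink);
} catch (TransformerException err) {
    System.err.println("Validation failed");
}
]]></samp>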
689

    
690
         <p>If you set an <code>ErrorListener</code> on the <code>TransformerFactory</code>, then
691
            you can control the way that error messages are output.</p>
692

    
693
         <p>If you want to validate against a schema without hard-coding the URI of the schema into
694
            the source document, you can do this by pre-loading the schema into the
695
               <code>TransformerFactory</code>. This extended example (again from the sample
696
            application <code>QuickValidator.java</code>) illustrates this:</p>
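         <samp><![CDATA[// Imports needed by this fragment
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import com.saxonica.config.EnterpriseTransformerFactory;
import net.sf.saxon.lib.Feature;
import net.sf.saxon.lib.Validation;
import org.xml.sax.helpers.DefaultHandler;

try {
    // Select the Saxon-EE implementation of TransformerFactory
    System.setProperty(
            "javax.xml.transform.TransformerFactory",
            "com.saxonica.config.EnterpriseTransformerFactory");
    TransformerFactory factory =
            TransformerFactory.newInstance();
    // Request strict validation of source documents
    factory.setAttribute(Feature.SCHEMA_VALIDATION.name,
            Integer.valueOf(Validation.STRICT));
    factory.setAttribute(Feature.VALIDATION_WARNINGS.name,
            Boolean.TRUE);
    // If a schema is named as the second argument, pre-load it
    if (args.length > 1) {
        StreamSource schema =
                new StreamSource(new File(args[1]).toURI().toString());
        ((EnterpriseTransformerFactory)factory).addSchema(schema);
    }
    // An identity transformation: the validated output is discarded
    Transformer trans = factory.newTransformer();
    StreamSource source =
            new StreamSource(new File(args[0]).toURI().toString());
    SAXResult sink =
            new SAXResult(new DefaultHandler());
    trans.transform(source, sink);
} catch (TransformerException err) {
    System.err.println("Validation failed");
}
]]></samp>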
722

    
723
         <p>You can preload as many schemas as you like using the <code>addSchema()</code> method.
724
            Such schemas are parsed, validated, and compiled once, and can be used as often as you
725
            like for validating multiple source documents. You cannot unload a schema once it has
726
            been loaded. If you want to remove or replace a schema, start afresh with a new
727
               <code>TransformerFactory</code>.</p>
728

    
729
         <p>Behind the scenes, the <code>TransformerFactory</code> uses a <code>Configuration</code>
730
            object to hold all the configuration information. The basic Saxon product (Saxon-HE and
731
            Saxon-PE) uses the class <a class="javalink" href="net.sf.saxon.TransformerFactoryImpl"
732
               >net.sf.saxon.TransformerFactoryImpl</a> for the <code>TransformerFactory</code>, and
733
               <a class="javalink" href="net.sf.saxon.Configuration">net.sf.saxon.Configuration</a>
734
            for the underlying configuration information. The schema-aware product (Saxon-EE)
735
            subclasses these with <a class="javalink"
736
               href="com.saxonica.config.EnterpriseTransformerFactory"
737
               >com.saxonica.config.EnterpriseTransformerFactory</a> and <a class="javalink"
738
               href="com.saxonica.config.EnterpriseConfiguration"
739
               >com.saxonica.config.EnterpriseConfiguration</a> respectively. You can get hold of
740
            the <code>Configuration</code> object by casting the <code>TransformerFactory</code> to
741
            a Saxon <code>TransformerFactoryImpl</code> and calling the
742
               <code>getConfiguration()</code> method. This gives you more precise control, for
743
            example it allows you to retrieve the <code>Schema</code> object containing the schema
744
            components for a given target namespace, and to inspect the compiled schema to establish
745
            its properties. See the JavaDoc documentation for further details.</p>
746

    
747
         <p>
748
            <i>Saxon currently implements its own API for access to the schema components. This API
749
               should be regarded as temporary. In the longer term, it is possible that Saxon will
750
               offer an API for schema access that has been proposed in a member submission to
751
               W3C.</i>
752
         </p>
753

    
754
         <p>The programming approach outlined above, of using an identity transformer, is suitable
755
            for a wide class of applications. For example, it enables you to insert a validation
756
            step into a SAX-based pipeline. However, for finer control, there are lower-level
757
            interfaces available in Saxon that you can also use. See for example the JavaDoc for the
758
               <a class="javalink" href="com.saxonica.config.EnterpriseConfiguration"
759
               >EnterpriseConfiguration</a> class, which includes methods such as
760
               <code>getElementValidator()</code>. This constructs a <a class="javalink"
761
               href="net.sf.saxon.event.Receiver">Receiver</a> which acts as a validating XML event
762
            filter. This can be inserted into a pipeline of <code>Receiver</code>s. Saxon also
            provides classes to bridge between SAX events and <code>Receiver</code> events: <a
               class="javalink" href="net.sf.saxon.event.ReceivingContentHandler"
               >ReceivingContentHandler</a> converts SAX events into <code>Receiver</code> events,
            and <a class="javalink"
               href="net.sf.saxon.event.ContentHandlerProxy">ContentHandlerProxy</a>
            converts <code>Receiver</code> events into SAX events.</p>
768
      </section>
769
   </section>
770
   <section id="parameterizing-schemas" title="Parameterizing Schemas">
771
      <h1>Parameterizing Schemas</h1>
772

    
773

    
774
      <p>Saxon provides an extension to the standard XSD syntax that allows a schema to be
775
         parameterized. This is only useful if XSD 1.1 is enabled. The facility allows a parameter
776
         to be declared in a top-level annotation in the schema document, for example:</p>
      <samp><![CDATA[<xs:annotation>
  <xs:appinfo>
    <saxon:param name="accepted-currencies" 
                 as="xs:string*" 
                 select="'USD', 'GBP', 'EUR'"
                 xmlns:saxon="http://saxon.sf.net/"/>
  </xs:appinfo>
</xs:annotation>]]></samp>
785

    
786
      <p>This declaration allows the variable <code>$accepted-currencies</code> to appear in any XPath
787
         expression appearing in the remainder of the same schema document. Typically it will be
788
         used in an assertion or in an expression controlling conditional type assignment, for
789
         example:</p>
790
      <samp><![CDATA[<xs:assert test="@currency = $accepted-currencies"/>]]></samp>
791

    
792
      <p>As with stylesheet parameters in XSLT, the <code>as</code> attribute defines the required
793
         type of the value (defaulting to <code>item()*</code>), and the <code>select</code>
794
         attribute supplies a default value. The expression determining the default value is
795
         evaluated during schema processing (that is, at "compile time"). The name of the parameter
796
         is a QName following the XSLT convention that no prefix means no namespace. The supplied
797
         value is converted to the required type using the function conversion rules, and validation
798
         fails if this is not possible.</p>
799

    
800
      <p>It is important to supply a sensible default value since it will not always be possible to
801
         supply a value for the parameter. For example, if the variable is used in the
802
            <code>assertion</code> facet of a simple type, then a cast expression initiated from
803
         XSLT or XQuery will always use the default value for the parameter.</p>
804

    
805
      <p>The scope of the declared variable is all XPath expressions appearing after the
806
            <code>saxon:param</code> element within the same schema document. All parameters within
807
         a schema must have distinct names. It is not at present possible to use one parameter
808
         across multiple schema documents (as a workaround, all types using the variable should
809
         appear in the same schema document).</p>
810

    
811
      <p>On the <code>Validate</code> command line the parameters can be supplied in the form
            <code>keyword=value</code>, for example <code>currency=EUR</code>. Values that are not
         simple strings can be supplied as XPath expressions by prefixing the keyword with
            <code>?</code>, for example <code>?accepted-currencies=('USD','GBP','EUR')</code>, or
         as the contents of an XML file by prefixing the keyword with <code>+</code>, for example
            <code>+lookup-table=lookup-doc.xml</code>.</p>
816

    
817
      <p>Using the s9api interface from Java, parameter values can be supplied using the
818
            <code>setParameter()</code> method on the <a class="javalink"
819
            href="net.sf.saxon.s9api.SchemaValidator">SchemaValidator</a> object.</p>
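      <p>For example, the following sketch supplies a value for the
         <code>accepted-currencies</code> parameter declared earlier in this section. The file
         names <code>invoice.xsd</code> and <code>invoice.xml</code> are hypothetical, and a single
         string is supplied so that the value is acceptable whether the parameter is declared to
         hold one string or a sequence of strings.</p>
      <samp><![CDATA[import java.io.File;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.QName;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.SchemaManager;
import net.sf.saxon.s9api.SchemaValidator;
import net.sf.saxon.s9api.XdmAtomicValue;

public class SchemaParameterSketch {
    public static void main(String[] args) throws SaxonApiException {
        Processor processor = new Processor(true);
        SchemaManager manager = processor.getSchemaManager();
        // Schema parameters are only useful when XSD 1.1 is enabled
        manager.setXsdVersion("1.1");
        manager.load(new StreamSource(new File("invoice.xsd")));

        SchemaValidator validator = manager.newSchemaValidator();
        // The parameter name has no prefix, so it is in no namespace
        validator.setParameter(new QName("accepted-currencies"),
                new XdmAtomicValue("EUR"));
        validator.validate(new StreamSource(new File("invoice.xml")));
    }
}
]]></samp>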
820

    
821
      <p>Using the Saxon.Api interface on .NET, parameter values can be supplied using the
822
            <code>SetParameter()</code> method on the <a class="javalink"
823
            href="Saxon.Api.SchemaValidator">SchemaValidator</a> object.</p>
824

    
825
      <p>It is not currently possible to supply parameter values when using the JAXP interfaces to
826
         run a validation episode, or when invoking validation using the standard mechanisms in XSLT
827
         or XQuery. In this situation the default value will always be used. In the absence of the
828
            <code>select</code> attribute the default value is an empty sequence (whether or not
829
         this is a legal value according to the required type).</p>
830

    
831
      <p>An extension function <a class="bodylink code" href="/functions/saxon/validate"
832
            >saxon:validate()</a> is available to allow parameterized validation to be invoked from
833
         XSLT or XQuery. The first argument is the document or element node to be validated; the
834
         second argument is a map giving validation options, and the third argument is a set of
835
         values for any validation parameters, also supplied as a map (the keys will be of type
836
            <code>xs:QName</code>). If the second and/or third arguments are omitted, the effect is
837
         the same as if empty maps were supplied for these arguments.</p>
838

    
839
   </section>
840
   <section id="validation-from-ant" title="Running Validation from Ant">
841
      <h1>Running Validation from Ant</h1>
842

    
843

    
844
      <p>It is possible to use the Saxon schema validator using the standard Ant tasks
845
            <code>xmlvalidate</code> and <code>schemavalidate</code>. To use Saxon rather than
846
         Xerces as the validation engine, specify the attribute
847
            <code>classname="com.saxonica.ee.jaxp.ValidatingReader"</code>, and make sure Saxon-EE
848
         is on the classpath.</p>
849

    
850
      <p>The schema to be used for validation can be specified using the
851
            <code>xsi:schemaLocation</code> and <code>xsi:noNamespaceSchemaLocation</code>
852
         attributes in the instance document, or (in the case of the <code>schemavalidate</code>
853
         task) using the <code>schemavalidate/schema</code> child element or the
854
            <code>schemavalidate/@noNamespaceFile</code> or
855
            <code>schemavalidate/@noNamespaceURL</code> attributes.</p>
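      <p>A build file target along the following lines illustrates the idea. It is a sketch only:
         the target name, the jar location <code>lib/saxon-ee.jar</code>, the schema file
         <code>schemas/orders.xsd</code> and the <code>data</code> directory are all hypothetical,
         and the nested <code>classpath</code> and <code>fileset</code> elements are the standard
         ones accepted by the Ant validation tasks.</p>
      <samp><![CDATA[<target name="validate">
  <!-- Use Saxon-EE rather than the default parser as the validation engine -->
  <schemavalidate classname="com.saxonica.ee.jaxp.ValidatingReader"
                  noNamespaceFile="schemas/orders.xsd">
    <classpath>
      <pathelement location="lib/saxon-ee.jar"/>
    </classpath>
    <!-- The instance documents to be validated -->
    <fileset dir="data" includes="*.xml"/>
  </schemavalidate>
</target>
]]></samp>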
856

    
857
      <p>The attributes <code>lenient</code> and <code>fullchecking</code> have no effect.</p>
858

    
859
      <p>The child element <code>schemavalidate/attribute</code> can be used to set options. Any
860
         option defined by the constants in class <a class="javalink"
861
            href="net.sf.saxon.lib.Feature">net.sf.saxon.lib.Feature</a> can be specified,
862
         provided the required value is expressible as a string (for boolean values, use "true" and
863
         "false"). Saxon also recognizes some property names defined by the Apache Xerces product,
864
         for compatibility.</p>
865

    
866
      <p>Properties of particular interest include the following:</p>
867
      <table>
868
         <tr>
869
            <td>
870
               <p>
871
                  <strong>Name</strong>
872
               </p>
873
            </td>
874
            <td>
875
               <p>
876
                  <strong>Value</strong>
877
               </p>
878
            </td>
879
         </tr>
880
         <tr>
881
            <td>
882
               <p>http://saxon.sf.net/feature/licenseFileLocation</p>
883
            </td>
884
            <td>
885
               <p>The filename where the Saxon-EE license file is found.</p>
886
            </td>
887
         </tr>
888
         <tr>
889
            <td>
890
               <p>http://saxon.sf.net/feature/schemaURIResolverClass</p>
891
            </td>
892
            <td>
893
               <p>Class used to resolve URIs of schema documents.</p>
894
            </td>
895
         </tr>
896
         <tr>
897
            <td>
898
               <p>http://saxon.sf.net/feature/schema-validation-mode</p>
899
            </td>
900
            <td>
901
               <p>
902
                  <code>strict</code> or <code>lax</code>: determines whether validation fails if no
903
                  element declaration can be found for the top-level element.</p>
904
            </td>
905
         </tr>
906
         <tr>
907
            <td>
908
               <p>http://saxon.sf.net/feature/standardErrorOutputFile</p>
909
            </td>
910
            <td>
911
               <p>Log file to capture validation errors.</p>
912
            </td>
913
         </tr>
914
         <tr>
915
            <td>
916
               <p>http://saxon.sf.net/feature/xsd-version</p>
917
            </td>
918
            <td>
919
               <p>
920
                  <code>1.0</code> or <code>1.1</code> depending on the version of the XML Schema
921
                  (XSD) Recommendation to be supported. Default is <code>1.1</code>. </p>
922
            </td>
923
         </tr>
924
      </table>
925
   </section>
926
   <section id="satransformcmd" title="Schema-Aware XSLT from the Command Line">
927
      <h1>Schema-Aware XSLT from the Command Line</h1>
928

    
929

    
930
      <p>To run a schema-aware transformation from the command line, use appropriate options on the 
931
         <code>net.sf.saxon.Transform</code> command, for example
932
            <code>-val:strict</code> to request strict validation of the source document, or
933
            <code>-val:lax</code> for lax validation. This applies not only to the principal source
934
         document loaded from the command line, but to all documents loaded via the <a
935
            class="bodylink code" href="/functions/fn/doc">doc()</a> and <a class="bodylink code"
936
            href="/functions/fn/document">document()</a> functions.</p>
937

    
938
      <p>The schemas to be used to validate these source documents can be specified either by using
939
         the <a class="bodylink code" href="/xsl-elements/import-schema">xsl:import-schema</a>
940
         declaration in the stylesheet, or using <code>xsi:schemaLocation</code> (or
941
            <code>xsi:noNamespaceSchemaLocation</code>) attributes within the source documents
942
         themselves, or by using the <code>-xsd</code> option on the command line.</p>
943

    
944
      <p>Validating the source document has several effects. Most obviously, it will cause the
945
         transformation to fail if the document is invalid. It will also cause default values for
946
         attributes and elements to be expanded, so they will appear to the stylesheet as if they
947
         were present on the source document. In addition, element and attribute nodes that have
948
         been validated will be annotated with a type. This enables operations to be performed in a
949
         type-safe way. This may cause error messages, for example if you try to use an
950
            <code>xs:decimal</code> value as an argument to a function that expects a string. It may
951
         also cause some operations to produce different results: for example when using elements or
952
         attributes that have been given a list type in the schema, the typed value of the node will
953
         appear in the stylesheet as a sequence rather than as a single string value.</p>
954

    
955
      <p>Saxon-EE also allows you to validate result documents (both final result documents and
956
         temporary trees), using the <code>validation</code> and <code>type</code>
957
         attributes.
958
         For details of these, refer to the XSLT 2.0 specification. Validation of result documents
959
         is done on-the-fly, so if the stylesheet attempts to produce invalid output, you will
960
         usually get an error message that identifies the offending instruction in the stylesheet.
961
         Type annotations on final result documents are lost if you send the output to a standard
962
         JAXP <code>Result</code> object (whether it's a <code>StreamResult</code>,
963
            <code>SAXResult</code>, or <code>DOMResult</code>), but they remain available if you
964
         capture the output in a Saxon <a class="javalink" href="net.sf.saxon.event.Receiver"
965
            >Receiver</a> or in a <code>DOMResult</code> that encapsulates a Saxon <a
966
            class="javalink" href="net.sf.saxon.om.NodeInfo">NodeInfo</a>. For details of the way in
967
         which type annotations are represented in the Saxon implementation of the data model, see
968
         the JavaDoc documentation. The <code>getSchemaType()</code> method on a <a class="javalink"
969
            href="net.sf.saxon.om.NodeInfo">NodeInfo</a> object returns a <a class="javalink"
970
            href="net.sf.saxon.type.SchemaType">SchemaType</a> object representing the type.</p>
971

    
972
      <p>The <code>-outval:recover</code> option on the command line causes validation errors encountered in
973
         processing a final result tree to be treated as warnings, allowing processing to continue.
974
         This allows more than one error to be reported in a single run. The result document is
975
         serialized as if validation were successful, but with XML comments inserted to show where
976
         the validation errors were found. This option does not necessarily recover from all
977
         validation errors, for example at present it does not recover from errors in uniqueness or
978
         referential constraints. It applies only to result trees validated using the
979
            <code>validation</code> attribute of <a class="bodylink code"
980
            href="/xsl-elements/result-document">xsl:result-document</a>.</p>
981
      
982
      <aside><p>When output is indented, the indentation takes account of schema information (whitespace
983
         is never added inside mixed-content elements). Validation failures, even if they are not fatal,
984
         may therefore affect the indentation of the output.</p></aside>
985

    
986
      <p>With the schema-aware version of Saxon, type declarations (the <code>as</code> attribute on
987
         elements such as <a class="bodylink code" href="/xsl-elements/function">xsl:function</a>,
988
            <a class="bodylink code" href="/xsl-elements/variable">xsl:variable</a>, and <a
989
            class="bodylink code" href="/xsl-elements/param">xsl:param</a>) can refer to
990
         schema-defined types, for example you can write <code>&lt;xsl:variable name="a"
991
            as="schema-element(ipo:invoice)"/&gt;</code>. You can also use the
992
            <code>element()</code> and <code>attribute()</code> tests to select nodes by their
993
         schema type in path expressions and match patterns.</p>
994

    
995
      <p>Saxon does a certain amount of static analysis of the XSLT and XPath code based on schema
996
         information. For example, if a template rule is defined with a match pattern such as
997
            <code>match="schema-element(invoice)"</code>, then it will check any path expressions
998
         used in the template rule to ensure that they are valid against the schema when starting
999
         from <code>invoice</code> as the context node. Similarly, if the result type of a template
1000
         rule or function is declared using an <code>as</code> attribute, then Saxon will check any
1001
         literal result elements in the body of the template or function to ensure that they are
1002
         consistent with this declared type. This analysis can reveal many simple user errors at
1003
         compile time that would otherwise result in run-time errors or simply in incorrect output.
1004
         But this is only possible if the source code explicitly declares the types of parameters,
1005
         template and function results, and match patterns.</p>
1006
   </section>
1007
   <section id="satransformapi" title="Schema-Aware XSLT from Java">
1008
      <h1>Schema-Aware XSLT from Java</h1>
1009

    
1010

    
1011
      <p>When transformations are controlled using the Java JAXP interfaces, the equivalent to the
1012
            <code>-val</code> option is to set the attribute
1013
         "http://saxon.sf.net/feature/schema-validation" on the <code>TransformerFactory</code> to
1014
         the value <a class="javalink" href="net.sf.saxon.lib.Validation#STRICT"
1015
            >net.sf.saxon.lib.Validation.STRICT</a>. Alternatively, you can set the value to <a
1016
            class="javalink" href="net.sf.saxon.lib.Validation#LAX">Validation.LAX</a>. This
1017
         attribute name is available as the constant <a class="javalink"
1018
            href="net.sf.saxon.lib.Feature#SCHEMA_VALIDATION"
1019
         >Feature.SCHEMA_VALIDATION.name</a>.</p>
1020

    
1021
      <p>This option switches validation on for all source documents used by any transformation
1022
         under the control of this <code>TransformerFactory</code>. If you want finer control, so
1023
         that some documents are validated and others are not, you can achieve this by using the <a
1024
            class="javalink" href="net.sf.saxon.lib.AugmentedSource">AugmentedSource</a> object. An
1025
            <code>AugmentedSource</code> is a wrapper around a normal JAXP <code>Source</code>
1026
         object, in which additional properties can be set: for example, a property to request
1027
         validation of the document. The <code>AugmentedSource</code> itself implements the JAXP
1028
            <code>Source</code> interface, so it can be used anywhere that an ordinary
1029
            <code>Source</code> object can be used, notably as the first argument to the
1030
            <code>transform</code> method of the <code>Transformer</code>, and as the return value
1031
         from a user-written <code>URIResolver</code>.</p>
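      <p>The following sketch shows one way of requesting validation for a single source document
         using an <code>AugmentedSource</code>, leaving the factory-wide setting untouched. The
         file names (<code>style.xsl</code>, <code>orders.xml</code>,
         <code>orders-out.xml</code>) are hypothetical.</p>
      <samp><![CDATA[import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import com.saxonica.config.EnterpriseTransformerFactory;
import net.sf.saxon.lib.AugmentedSource;
import net.sf.saxon.lib.Validation;

public class AugmentedSourceSketch {
    public static void main(String[] args) throws TransformerException {
        TransformerFactory factory = new EnterpriseTransformerFactory();
        Transformer transformer =
                factory.newTransformer(new StreamSource(new File("style.xsl")));

        // Wrap the ordinary Source and request strict validation for this document only
        AugmentedSource source = AugmentedSource.makeAugmentedSource(
                new StreamSource(new File("orders.xml")));
        source.setSchemaValidationMode(Validation.STRICT);

        // An AugmentedSource can be used wherever a JAXP Source is expected
        transformer.transform(source, new StreamResult(new File("orders-out.xml")));
    }
}
]]></samp>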
1032

    
1033
      <p>If the standard Saxon <code>URIResolver</code> is used, and recognition of query parameters
1034
         is enabled, it is also possible to control validation for each source document by means of
1035
         query parameters in the document URI. For example,
1036
            <code>document('source.xml?val=strict')</code> requests the loading of the file
1037
            <code>source.xml</code> with strict validation.</p>
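      <p>As a minimal sketch of the query-parameter approach, a stylesheet might request
         validation for one document and not for another. The file names and variable names here
         are illustrative only, and the sketch assumes that recognition of query parameters has
         been enabled:</p>
      <samp><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Strict validation is requested for this document only -->
  <xsl:variable name="orders" select="document('source.xml?val=strict')"/>

  <!-- No per-document request: the factory-wide setting applies -->
  <xsl:variable name="lookup" select="document('lookup.xml')"/>

</xsl:stylesheet>
]]></samp>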
1038

    
1039
      <p>The attribute <a class="javalink" href="net.sf.saxon.lib.Feature#VALIDATION_WARNINGS"
1040
            >Feature.VALIDATION_WARNINGS.name</a> has the same effect as the <code>-outval:recover</code> option
1041
         on the command line: validation errors encountered when processing the final result tree
1042
         are reported to the <code>ErrorListener</code> as warnings, not as fatal errors.</p>
1043

    
1044
      <p>Schemas can be loaded using either of the techniques used with the command-line interface:
1045
         that is, by specifying them in the <a class="bodylink code"
1046
            href="/xsl-elements/import-schema">xsl:import-schema</a> directive in the stylesheet, or
1047
         by including them in an <code>xsi:schemaLocation</code> attribute in a source document. In
1048
         addition, they can be loaded using the <code>addSchema()</code> method on the <a
1049
            class="javalink" href="com.saxonica.config.EnterpriseTransformerFactory"
1050
            >EnterpriseTransformerFactory</a> class.</p>
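      <p>For illustration, the first of these techniques might look as follows in a stylesheet.
         This is a sketch only: the namespace URI, the schema file name <code>ipo.xsd</code>, and
         the element name <code>ipo:invoice</code> are assumptions made for the example:</p>
      <samp><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ipo="http://example.com/ns/ipo">

  <!-- Load the schema for the (hypothetical) ipo namespace -->
  <xsl:import-schema namespace="http://example.com/ns/ipo"
                     schema-location="ipo.xsd"/>

  <!-- Matches only elements that have been validated against the ipo:invoice declaration -->
  <xsl:template match="schema-element(ipo:invoice)">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
]]></samp>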
1051

    
1052
      <p>All schemas that are loaded are cached as part of the <code>TransformerFactory</code> (or
1053
         more specifically, as part of the <a class="javalink" href="net.sf.saxon.Configuration"
1054
            >Configuration</a> object owned by the <code>TransformerFactory</code>). This is true
1055
         whether the schema is loaded explicitly using the Java API, whether it is loaded as a
1056
         result of <code>xsl:import-schema</code>, or whether it is referenced in an
1057
            <code>xsi:schemaLocation</code> attribute in a source document. There can only be one
1058
         schema document loaded for each namespace: any further attempts to load a schema for a
1059
         given target namespace will return the existing loaded schema, rather than loading a new
1060
         one. Note in particular that this means there can only be one loaded no-namespace schema
1061
         document. If you want to force loading of a different schema document for an existing
1062
         namespace, the only way to do it is to create a new <code>TransformerFactory</code>.</p>
1063

    
1064
      <p>If you are validating the result tree, and you want your application to have access to the
1065
         type annotations in the validated tree, then you should specify as the result of the
1066
         transformation either a user-written <code>Receiver</code>, or a <code>DOMResult</code>
1067
         that wraps a Saxon <a class="javalink" href="net.sf.saxon.om.NodeInfo">NodeInfo</a>
1068
         object. Note that type annotations are supported only with the TinyTree implementation.</p>
1069
   </section>
1070
   <section id="saquerycmd" title="Schema-Aware XQuery from the Command Line">
1071
      <h1>Schema-Aware XQuery from the Command Line</h1>
1072

    
1073

    
1074
      <p>To run a schema-aware query from the command line, use the usual command <a
            class="javalink" href="net.sf.saxon.Query">net.sf.saxon.Query</a>. This has an option
            <code>-val:strict</code> to request strict validation of the source document, or
            <code>-val:lax</code> to request lax validation. This applies not only to the principal
         source document loaded using the <code>-s</code> option on the command line, but to all
         documents loaded via the <a class="bodylink code" href="/functions/fn/doc">doc()</a>
         function, or supplied as additional command line parameters in the form
            <code>+param=doc.xml</code>.</p>
1082

    
1083
      <p>The schemas to be used to validate these source documents can be specified either by using
1084
         the <code>import schema</code> declaration in the query prolog, or using
1085
            <code>xsi:schemaLocation</code> (or <code>xsi:noNamespaceSchemaLocation</code>)
1086
         attributes within the source documents themselves, or by using the <code>-xsd</code> option
1087
         on the command line.</p>
1088

    
1089
      <p>Validating the source document has several effects. Most obviously, it will cause the query
         to fail if the document is invalid. It will also cause default values for attributes and
         elements to be expanded, so they will appear to the query as if they were present on the
         source document. In addition, element and attribute nodes that have been validated will be
         annotated with a type. This enables operations to be performed in a type-safe way. This may
         cause error messages, for example if you try to use an <code>xs:decimal</code> value as an
         argument to a function that expects a string. It may also cause some operations to produce
         different results: for example when using elements or attributes that have been given a
         list type in the schema, the typed value of the node will appear in the query as a
         sequence rather than as a single string value.</p>
1099

    
1100
      <p>The Enterprise Edition of Saxon also allows you to validate result documents (both final
1101
         result documents and intermediate results). By default, elements constructed by the query
1102
         are validated in lax mode, which means that they are validated if a schema declaration is
1103
         available, and are not validated otherwise. You can set a different initial validation mode
1104
         either using the <code>declare validation</code> declaration in the Query Prolog, or by
1105
         issuing a call such as <code>staticQueryContext.pushValidationMode(Validation.SKIP)</code>
1106
         in the calling API.</p>
1107

    
1108
      <p>The <code>-outval:recover</code> option on the command line causes validation errors encountered in
         processing a final result tree to be treated as warnings, allowing processing to continue.
         This allows more than one error to be reported in a single run. The result document is
         serialized as if validation were successful, but with XML comments inserted to show where
         the validation errors were found. This option does not necessarily recover from all
         validation errors: for example, at present it does not recover from errors in uniqueness or
         referential constraints.</p>
1115

    
1116
      <p>By default, the validation context for element constructors in the query depends on the
1117
         textual nesting of the element constructors as written in the query. You can change the
1118
         validation context (and the validation mode) if you need to, by using a
1119
            <code>validate{}</code> expression within the query. For details of this expression,
1120
         refer to the XQuery 1.0 specification. Validation of result documents is done on-the-fly,
1121
         so if the query attempts to produce invalid output, you will usually get an error message
1122
         that identifies the approximate location in the query where the error occurred.</p>
1123

    
1124
      <p>With the Enterprise Edition of Saxon, declarations of functions and variables can refer to
1125
         schema-defined types, for example you can write <code>let $a as
1126
            schema-element(ipo:invoice)* := //inv</code>. You can also use the
1127
            <code>element()</code> and <code>attribute()</code> tests to select nodes by their
1128
         schema type in path expressions.</p>
1129

    
1130
      <p>Saxon-EE does a certain amount of static analysis of the XQuery code based on schema
         information. For example, if a function parameter is declared with a type such as
            <code>as schema-element(invoice)</code>, then it will check any path expressions used
         in the function body to ensure that they are valid against the schema when starting from
            <code>invoice</code> as the context node. Similarly, if the result type of a function is
         declared using an <code>as</code> clause, then Saxon will check any direct element
         constructors in the body of the function to ensure that they are consistent with this
         declared type. This analysis can reveal many simple user errors at compile time that would
         otherwise result in run-time errors or simply in incorrect output. But this is only
         possible if the source code explicitly declares the types of variables and of function
         arguments and results.</p>
1141
   </section>
1142
   <section id="saqueryapi" title="Schema-Aware XQuery from Java">
1143
      <h1>Schema-Aware XQuery from Java</h1>
1144

    
1145

    
1146
      <p>When queries are controlled using the Java API, the equivalent to the <code>-val</code>
1147
         option is to create a <a class="javalink"
1148
            href="com.saxonica.config.EnterpriseConfiguration">EnterpriseConfiguration</a> instead
1149
         of a <code>Configuration</code> object, and then to call <a class="javalink"
1150
            href="net.sf.saxon.lib.Validation#STRICT"
1151
            >setSchemaValidationMode(net.sf.saxon.lib.Validation.STRICT)</a> on this object. The
1152
         value <a class="javalink" href="net.sf.saxon.lib.Validation#LAX">Validation.LAX</a> can
1153
         also be used.</p>
1154

    
1155
      <p>This option switches validation on for all source documents used by any query
         under the control of this <code>EnterpriseConfiguration</code>. If you want finer control,
         so that some documents are validated and others are not, you can achieve this by using the
            <a class="javalink" href="net.sf.saxon.lib.AugmentedSource">AugmentedSource</a> object.
         An <code>AugmentedSource</code> is a wrapper around a normal JAXP <code>Source</code>
         object, in which additional properties can be set: for example, a property to request
         validation of the document. The <code>AugmentedSource</code> itself implements the JAXP
            <code>Source</code> interface, so it can be used anywhere that an ordinary
            <code>Source</code> object can be used, for example as the first argument to the
            <code>buildDocument()</code> method of the <code>QueryProcessor</code>, and as the
         return value from a user-written <code>URIResolver</code>.</p>
1166

    
1167
      <p>If the standard Saxon <code>URIResolver</code> is used, and recognition of query parameters
1168
         is enabled, it is also possible to control validation for each source document by means of
1169
         query parameters in the document URI. For example,
1170
            <code>doc('source.xml?val=strict')</code> requests the loading of the file
1171
            <code>source.xml</code> with strict validation.</p>
1172

    
1173
      <p>The <a class="javalink" href="net.sf.saxon.Configuration">Configuration</a> method
1174
            <code>setValidationWarnings()</code> has the same effect as the <code>-outval:recover</code> option
1175
         on the command line: validation errors encountered when processing the final result tree
1176
         are reported to the <code>ErrorListener</code> as warnings, not as fatal errors. They are
1177
         also reported as XML comments in the result tree.</p>
1178

    
1179
      <p>Schemas can be loaded using either of the techniques used with the command-line interface:
1180
         that is, by specifying them in the <code>import schema</code> directive in the query
1181
         prolog, or by including them in an <code>xsi:schemaLocation</code> attribute in a source
1182
         document. In addition, they can be loaded using the <code>addSchemaSource()</code> method
1183
         on the <code>EnterpriseConfiguration</code> class.</p>
1184

    
1185
      <p>All schemas that are loaded are cached as part of the <a class="javalink"
1186
            href="com.saxonica.config.EnterpriseConfiguration">EnterpriseConfiguration</a>. This is
1187
         true whether the schema is loaded explicitly using the Java API, whether it is loaded as a
1188
         result of <code>import schema</code> in a query, or whether it is referenced in an
1189
            <code>xsi:schemaLocation</code> attribute in a source document. There can only be one
1190
         schema document loaded for each namespace: any further attempts to load a schema for a
1191
         given target namespace will return the existing loaded schema, rather than loading a new
1192
         one. Note in particular that this means there can only be one loaded no-namespace schema
1193
         document. If you want to force loading of a different schema document for an existing
1194
         namespace, the only way to do it is to create a new
1195
         <code>EnterpriseConfiguration</code>.</p>
1196
   </section>
1197
   <section id="schema11" title="XML Schema 1.1">
1198
      <h1>XML Schema 1.1</h1>
1199

    
1200

    
1201
      <p>From release 9.5, Saxon-EE includes full support for the XML Schema 1.1 specification,
1202
         which is a W3C Recommendation. The main changes between XSD 1.0 and XSD 1.1 are listed in
1203
         the following pages (see <a class="bodylink" href="/conformance/schema11">XML Schema 1.1
1204
            Conformance</a> for further information).</p>
1205

    
1206
      <p>From Saxon 9.8, use of XML Schema 1.1 features is enabled by default. To disable use of XML
1207
         Schema 1.1 features, set the command line flag <code>-xsdversion:1.0</code> or the
1208
        equivalent in the API (<a class="javalink" href="net.sf.saxon.lib.Feature#XSD_VERSION"
1209
          >Feature.XSD_VERSION</a>).</p>
1210
      <nav>
1211
         <ul/>
1212
      </nav>
1213

    
1214
      <section id="assertions" title="Assertions on Complex Types">
1215
         <h1>Assertions on Complex Types</h1>
1216

    
1217

    
1218
         <p>XSD 1.1 supports the definition of assertions on both simple and complex types.</p>
1219

    
1220
         <p>Assertions enable cross-validation of different elements or attributes within a complex
1221
            type. For example, specifying:</p>
1222
         <samp><![CDATA[<xs:assert test="xs:date(@date-of-birth) lt xs:date(@date-of-death)"/>
]]></samp>
1224

    
1225
         <p>will cause a run-time validation error if an instance document is validated in which the
1226
            relevant condition does not hold.</p>
1227

    
1228
         <p>Saxon allows any XPath 2.0 expression to be used in the <code>test</code> attribute.
1229
            This includes expressions that call Java or .NET extension functions. Support for XPath 3.0 or 3.1
1230
            can be configured using the configuration property <code>Feature.XPATH_VERSION_FOR_XSD</code>.</p>
1231

    
1232
         <p>For assertions on complex types, the context node supplied to the expression is the
1233
            element being validated. The element being validated is presented as type
1234
               <code>xs:anyType</code>, but its attributes and children, because they have already
1235
            been validated, are annotated with their respective types. The static context for the
1236
            expression comes from the containing schema document: any namespace prefixes used in the
1237
            expression must be declared using namespace declarations in the schema in the usual way.
1238
            The default namespace for elements and types may be set using the
1239
               <code>xpathDefaultNamespace</code> attribute either on the element containing the
1240
            XPath expression, or on the <code>xs:schema</code> element. It is not possible to use
1241
            any variables or user-defined functions within the expression. </p>
1242

    
1243
         <p>For the purpose of generating diagnostics, Saxon recognizes an assertion of the form
1244
               <code>empty(expr)</code> specially. For example, if you are validating an XSLT
1245
            stylesheet, you might write on the top-level complex type <code>&lt;xs:assert
1246
               test="empty(if (@version='1.0') then xsl:variable[@as] else ())"/&gt;</code>. If you
1247
            use this form of assertion, the validator will not only report that the assertion is
1248
            false for the top-level element, it will also report the location of all the
1249
               <code>xsl:variable</code> elements that caused the assertion to be false. This also
1250
            works for <code>not(expr)</code> provided that <code>expr</code> has a static item type
1251
            of <code>node()</code>.</p>
1252

    
1253
         <p>Another aid to diagnostics is the <code>saxon:message</code> attribute: if present on
1254
            the <code>xs:assert</code> element, this provides a message to be output when the
1255
            assertion is not satisfied: see <a class="bodylink code"
1256
               href="../../extensions11/saxon.message">saxon:message</a>.</p>
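         <p>A sketch of a complete complex type carrying an assertion with a message is shown
            below. The element, type, and attribute names are illustrative, and the message text
            is an assumption of the example. Because the attributes have already been validated
            as <code>xs:date</code>, they can be compared directly; the <code>empty()</code> guard
            keeps the assertion true when <code>date-of-death</code> is absent:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:saxon="http://saxon.sf.net/">

  <xs:element name="person" type="personType"/>

  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="date-of-birth" type="xs:date" use="required"/>
    <xs:attribute name="date-of-death" type="xs:date"/>
    <!-- The context node is the person element being validated -->
    <xs:assert test="empty(@date-of-death) or @date-of-birth lt @date-of-death"
               saxon:message="date-of-death must be later than date-of-birth"/>
  </xs:complexType>

</xs:schema>
]]></samp>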
1257

    
1258
         <p>The XPath expression is evaluated against a temporary document that contains the subtree
1259
            rooted at this element: more specifically, the subtree contains a document node with
1260
            this element as its only child. Validation succeeds if the effective boolean value (EBV)
1261
            of the expression is true, and fails if the EBV is false or if an error occurs during
1262
            the evaluation.</p>
1263

    
1264
         <p>If a complex type is derived by extension or by restriction, then the assertions
1265
            supplied on the base type must be satisfied as well as those supplied on the type
1266
            itself.</p>
1267

    
1268
         <p>Note that when assertions are defined on a complex type, the subtree representing an
1269
            element with that type will be built in memory. It is therefore advisable to exercise
1270
            care when applying this facility to elements that have very large subtrees.</p>
1271

    
1272
         <p>For assertions on simple types, <code>&lt;xs:assertion&gt;</code> is treated as a facet.
1273
            It may be applied to any variety of type, that is to a type derived by restriction from
1274
            an atomic type, a list type, or a union type. The value against which the assertion is
1275
            being tested is available to the expression as the value of variable
1276
            <code>$value</code>; this will be typed as an instance of the base type (the type being
1277
            restricted). There is no context node. The variable <code>$value</code> is also
1278
            available in the same way for complex types with simple content.</p>
1279
      </section>
1280

    
1281
      <section id="simpleassert" title="Assertions on Simple Types">
1282
         <h1>Assertions on Simple Types</h1>
1283

    
1284

    
1285
         <p>XSD 1.1 allows assertions on simple types to be defined. The mechanism is to define an
               <code>xs:assertion</code> element as a child of the <code>xs:restriction</code> child
            of the <code>xs:simpleType</code> element (that is, it acts as an additional facet). The
            type must be an atomic type. The value of the <code>test</code> attribute of
               <code>xs:assertion</code> is an XPath expression.</p>
1290

    
1291
         <p>The expression is evaluated with the value being validated supplied as the value of the
1292
            variable <code>$value</code>. This will be an instance of the base type: for example, if
1293
            you are restricting from <code>xs:string</code>, it will be a string; if you are
1294
            restricting from <code>xs:date</code>, it will be an <code>xs:date</code>; if you are
1295
            validating a list of integers, then <code>$value</code> will be a sequence of
1296
            integers.</p>
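         <p>The list case can be sketched as follows. The type names are hypothetical; the
            assertion treats <code>$value</code> as a sequence of <code>xs:integer</code> items
            and requires a non-empty list in ascending order:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="integerList">
    <xs:list itemType="xs:integer"/>
  </xs:simpleType>

  <!-- $value is the whole list, presented as a sequence of integers -->
  <xs:simpleType name="ascendingIntegerList">
    <xs:restriction base="integerList">
      <xs:assertion test="exists($value) and
                          (every $i in 1 to count($value) - 1
                           satisfies $value[$i] le $value[$i + 1])"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
]]></samp>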
1297

    
1298
         <p>If the effective boolean value of the expression is true, the value is valid. If the
1299
            effective boolean value is false, or if a dynamic error occurs while evaluating the
1300
            expression, the value is invalid. Currently no diagnostics are produced to indicate why
1301
            the value is deemed invalid, other than a statement that the <code>xs:assertion</code>
1302
            facet is violated. You can supply a message in a <code>saxon:message</code> attribute:
1303
            see <a class="bodylink code" href="../../extensions11/saxon.message"
1304
            >saxon:message</a>.</p>
1305

    
1306
         <p>The XPath expression has no access to any part of the document being validated, other
1307
            than the atomic value of the actual element or attribute node. So the validation cannot
1308
            be context-sensitive.</p>
1309

    
1310
         <p>The XPath expression may make calls on Java extension functions in the normal way: see
1311
               <a class="bodylink" href="/extensibility/functions">Writing extension functions
1312
               (Java)</a>. Allowing call-out to procedural programming languages means that you can
1313
            perform arbitrary procedural validation of element and attribute values. Take care to
1314
            disable use of extension functions if validating against a schema that is untrusted.</p>
1315

    
1316
         <p>The following example validates that a date is in the past:</p>
1317
         <samp><![CDATA[  <xs:element name="date">
    <xs:simpleType>
       <xs:restriction base="xs:date">
         <xs:assertion test="$value lt current-date()"/>
       </xs:restriction>
    </xs:simpleType>
  </xs:element>
]]></samp>
1325

    
1326
         <p>The following example validates that a string is a legal XPath expression. This relies
1327
            on the fact that a failure evaluating the assertion is treated as "false":</p>
1328
         <samp><![CDATA[  <xs:element name="xpath">
    <xs:simpleType>
       <xs:restriction base="xs:string">
         <xs:assertion test="exists(saxon:expression($value))" xmlns:saxon="http://saxon.sf.net/"/>
       </xs:restriction>
    </xs:simpleType>
  </xs:element>
]]></samp>
1336

    
1337
         <p>Note how the in-scope namespaces for the XPath expression are taken from the in-scope
            namespaces of the containing <code>xs:assertion</code> element.</p>
1339
      </section>
1340

    
1341
      <section id="cta" title="Conditional Type Assignment">
1342
         <h1>Conditional Type Assignment</h1>
1343

    
1344

    
1345
         <p>XSD 1.1 supports Conditional Type Assignment to allow the type of an element to depend
1346
            on the value of one of its attributes. For example the content model for
1347
               <code>&lt;product action="create"&gt;</code> might be different from the content
1348
            model for <code>&lt;product action="delete"&gt;</code>.</p>
1349

    
1350
         <p>The full syntax of XPath 2.0 can be used, but the expression is constrained to access
1351
            the element node and its attributes: it has no access to the descendants, siblings, or
1352
            ancestors of the element.</p>
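         <p>A minimal sketch of the <code>product</code> example, using the XSD 1.1
               <code>xs:alternative</code> element, might look like this. The type names and the
            content models are assumptions made for illustration. The final alternative assigns
               <code>xs:error</code>, so that any other value of <code>action</code> (or its
            absence) makes the element invalid:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- No declared type, so the declared type is xs:anyType -->
  <xs:element name="product">
    <xs:alternative test="@action = 'create'" type="createProductType"/>
    <xs:alternative test="@action = 'delete'" type="deleteProductType"/>
    <xs:alternative type="xs:error"/>
  </xs:element>

  <!-- Creating a product requires a description and a price -->
  <xs:complexType name="createProductType">
    <xs:sequence>
      <xs:element name="description" type="xs:string"/>
      <xs:element name="price" type="xs:decimal"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string" use="required"/>
    <xs:attribute name="action" type="xs:string" use="required"/>
  </xs:complexType>

  <!-- Deleting a product needs only its identifier -->
  <xs:complexType name="deleteProductType">
    <xs:attribute name="id" type="xs:string" use="required"/>
    <xs:attribute name="action" type="xs:string" use="required"/>
  </xs:complexType>

</xs:schema>
]]></samp>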
1353
      </section>
1354

    
1355
      <section id="allgroups" title="All Model Groups">
1356
         <h1>All Model Groups</h1>
1357

    
1358

    
1359
         <p>XSD 1.1 allows arbitrary values of <code>minOccurs</code> and <code>maxOccurs</code> on
1360
            the particles of a model group using the compositor <code>xs:all</code>.</p>
1361

    
1362
         <p>A complex type defined using <code>xs:all</code> may be derived by extension or
1363
            restriction from another type using <code>xs:all</code>.</p>
1364

    
1365
         <p>Element wildcards (<code>xs:any</code>) are allowed within <code>xs:all</code> content
1366
            models.</p>
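         <p>The following sketch combines these features in one content model. The type and
            element names are hypothetical. Under XSD 1.0 the occurrence values greater than one
            and the wildcard would not be permitted inside <code>xs:all</code>:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="recordType">
    <xs:all>
      <xs:element name="title" type="xs:string"/>
      <!-- Between one and five authors, in any position -->
      <xs:element name="author" type="xs:string" minOccurs="1" maxOccurs="5"/>
      <!-- Any number of notes -->
      <xs:element name="note" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      <!-- Elements from other namespaces may be interspersed -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:all>
  </xs:complexType>

</xs:schema>
]]></samp>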
1367
      </section>
1368

    
1369
      <section id="open-content" title="Open Content">
1370
         <h1>Open Content</h1>
1371

    
1372

    
1373
         <p>Open content is an XSD 1.1 feature that allows a schema document to declare that all types
1374
            defined in the schema are automatically extensible by the addition of child elements or
1375
            further attributes, typically in a different namespace from the
1376
               <code>targetNamespace</code> of the schema document.</p>
1377

    
1378
         <p>The facility allows a complex type to specify open content with mode "interleave" or
1379
            "suffix", allowing arbitrary elements (satisfying a wildcard) to be added either
1380
            anywhere in the content sequence, or at the end.</p>
1381

    
1382
         <p>At the level of a schema document, the <code>xs:defaultOpenContent</code> element defines
            the default open content mode for all types defined in the schema document (or, for all
            types except those with an empty content model).</p>

         <p>Similarly, the <code>defaultAttributes</code> attribute of the <code>xs:schema</code>
            element names an attribute group (which may contain an attribute wildcard) that is
            applied by default to all complex types defined in the schema document.</p>
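         <p>As an illustrative sketch, the schema below sets a schema-wide default of "suffix"
            mode and overrides it with "interleave" mode on one type. The target namespace and the
            type and element names are assumptions of the example:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/ns/address"
           elementFormDefault="qualified">

  <!-- Schema-wide default: extension elements may follow the declared content -->
  <xs:defaultOpenContent mode="suffix">
    <xs:any namespace="##other" processContents="lax"/>
  </xs:defaultOpenContent>

  <!-- Overrides the default: extension elements may appear anywhere -->
  <xs:complexType name="addressType">
    <xs:openContent mode="interleave">
      <xs:any namespace="##other" processContents="lax"/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Picks up the suffix-mode default -->
  <xs:complexType name="nameType">
    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:element name="family" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
]]></samp>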
1389
      </section>
1390

    
1391
      <section id="misc-xsd11" title="Miscellaneous XSD 1.1 Features">
1392
         <h1>Miscellaneous XSD 1.1 Features</h1>
1393

    
1394

    
1395
         <p>The <code>notNamespace</code> and <code>notQName</code> attributes are provided on
1396
               <code>xs:any</code> and <code>xs:anyAttribute</code> wildcards.</p>
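         <p>A sketch of both attributes in use follows. The namespace URI
               <code>http://example.com/ns/doc</code>, the prefix <code>d</code>, and the element
            and type names are hypothetical:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:d="http://example.com/ns/doc"
           targetNamespace="http://example.com/ns/doc"
           elementFormDefault="qualified">

  <xs:element name="secret" type="xs:string"/>

  <xs:complexType name="extensibleType">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <!-- Any element except those in the XHTML namespace, and except d:secret -->
      <xs:any notNamespace="http://www.w3.org/1999/xhtml"
              notQName="d:secret"
              processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- Any attribute that is not in the target namespace -->
    <xs:anyAttribute notNamespace="##targetNamespace" processContents="lax"/>
  </xs:complexType>

</xs:schema>
]]></samp>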
1397

    
1398
         <p>The <code>targetNamespace</code> attribute is available for use on local element and
1399
            attribute declarations appearing within the restriction of a complex type. </p>
1400

    
1401
         <p>XSD 1.1 allows conditional inclusion of elements in a schema document, using attributes
1402
            such as <code>vc:minVersion</code> and <code>vc:maxVersion</code>. Saxon allows use of
1403
            this feature whether the schema processor is run in 1.0 or 1.1 mode, allowing new 1.1
1404
            features such as assertions to be ignored when running in 1.0 mode. </p>
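         <p>For example, an assertion can be marked so that it is seen only by a processor running
            in 1.1 mode. This is a sketch with hypothetical type and attribute names; the
               <code>vc</code> prefix is bound to the XML Schema versioning namespace:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

  <xs:complexType name="rangeType">
    <xs:attribute name="min" type="xs:integer" use="required"/>
    <xs:attribute name="max" type="xs:integer" use="required"/>
    <!-- Ignored when the schema processor runs in 1.0 mode -->
    <xs:assert vc:minVersion="1.1" test="@min le @max"/>
  </xs:complexType>

</xs:schema>
]]></samp>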
1405

    
1406
         <p>The type <code>xs:error</code> is available, as a type with no instances.</p>
1407

    
1408
         <p>An element may now appear in more than one substitution group.</p>
1409

    
1410
         <p>A new facet <code>xs:explicitTimezone</code> is available with values <code>required</code>,
1411
            <code>optional</code>, or <code>prohibited</code>.</p>
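         <p>A short sketch of the facet in use, with a hypothetical type name, restricts
               <code>xs:dateTime</code> to values that carry no timezone:</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="localDateTime">
    <xs:restriction base="xs:dateTime">
      <!-- 2024-01-31T09:00:00 is valid; 2024-01-31T09:00:00Z is not -->
      <xs:explicitTimezone value="prohibited"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
]]></samp>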
1412

    
1413
         <p>A new built-in data type <code>xs:dateTimeStamp</code> (an <code>xs:dateTime</code>
1414
            with timezone required) is available.</p>
1415
      </section>
1416
   </section>
1417

    
1418
   <section id="min-and-maxoccurs" title="Handling minOccurs and maxOccurs">
1419
      <h1>Handling minOccurs and maxOccurs</h1>
1420

    
1421

    
1422
      <p>Prior to release 9.1, Saxon used the validation algorithm described in <a
1423
            href="http://www.ltg.ed.ac.uk/~ht/XML_Europe_2003.html" class="bodylink">Thompson and
1424
            Tobin 2003</a>. This algorithm can be very inefficient when large bounded values of
1425
            <code>minOccurs</code> and <code>maxOccurs</code> are used in a content model; indeed,
1426
         it can be so inefficient that the finite state machine is too large to fit in memory, and
1427
         an OutOfMemory exception occurs.</p>
1428

    
1429
      <p>Since Saxon 9.1, many common cases of <code>minOccurs</code> and <code>maxOccurs</code> are
1430
         handled using a finite state machine that makes use of counters at run-time. This
1431
         eliminates the need to have one state in the machine for each possible number of
1432
         occurrences of the repeating item. Instead, counters are maintained at run-time and
1433
         compared against the <code>minOccurs</code> and <code>maxOccurs</code> values.</p>
1434

    
1435
      <p>This technique is used under the following circumstances:</p>
1436
      <ul>
1437
         <li>
1438
            <p>Either <code>minOccurs</code> &gt; 1, or <code>maxOccurs</code> &gt; 1 (and is not
1439
               unbounded), or both.</p>
1440
         </li>
1441
         <li>
1442
            <p>The <code>minOccurs</code>/<code>maxOccurs</code> values must be defined on an
1443
               element (<code>xs:element</code>) or wildcard (<code>xs:any</code>) particle.</p>
1444
         </li>
1445
         <li>
1446
            <p>If the repeating particle is <i>vulnerable</i>, then it must not be part of a model
1447
               group that is itself repeatable. A particle is vulnerable if it is part of a choice
1448
               group, or if it is part of a sequence group in which all the other particles are
1449
               optional or emptiable, except in the case where <code>minOccurs</code> is equal to
1450
                  <code>maxOccurs</code>. The reason for this restriction is that in such situations
1451
               there are two nested repetitions, and it is ambiguous whether a new instance of the
1452
               repeating term should be treated as a repetition at the inner level or at the outer
1453
               level.</p>
1454
         </li>
1455
      </ul>
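      <p>The conditions above can be illustrated with two hypothetical content models. This is a
         sketch based only on the rules as stated here, not on inspection of the implementation.
         In the first type, the repeating <code>item</code> particle is an element particle with a
         bounded <code>maxOccurs</code>, and it sits alongside a mandatory <code>header</code> in a
         sequence that does not itself repeat, so it would be expected to qualify for counters. In
         the second type, the repeating particle is part of a choice group that is itself
         repeatable, and <code>minOccurs</code> differs from <code>maxOccurs</code>, so it would
         not:</p>
      <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Expected to be handled with run-time counters -->
  <xs:complexType name="batchType">
    <xs:sequence>
      <xs:element name="header" type="xs:string"/>
      <xs:element name="item" type="xs:string" minOccurs="2" maxOccurs="5000"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Vulnerable particle inside a repeatable group: counters are not expected to apply -->
  <xs:complexType name="mixedType">
    <xs:choice maxOccurs="unbounded">
      <xs:element name="a" type="xs:string" minOccurs="2" maxOccurs="300"/>
      <xs:element name="b" type="xs:string"/>
    </xs:choice>
  </xs:complexType>

</xs:schema>
]]></samp>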
1456

    
1457
      <p>In cases where counters cannot be used, Saxon will still attempt to compile a finite state
1458
         machine, but will use configuration-defined limits on <code>minOccurs</code> and
1459
            <code>maxOccurs</code> to approximate the values requested. If the values used in the
1460
         schema exceed these limits, Saxon will therefore approximate by generating a schema that does
1461
         not strictly enforce the specified <code>minOccurs</code> and <code>maxOccurs</code>. The
1462
         default limits are 100 and 250 respectively. Different limits can be set on the command
1463
         line or via the Java API on the <code>Configuration</code> object. Note however that when
1464
         several nested repeating groups are defined it is still possible for out-of-memory
1465
         conditions to occur, even with quite modest values of <code>minOccurs</code> and
1466
            <code>maxOccurs</code>.</p>
1467
   </section>
1468
  <section id="absent-components" title="Handling Absent Components">
1469
    <h1>Handling Absent Components</h1>
1470
    <p>The XSD 1.0 and 1.1 Recommendations both say that a schema should not be treated
1471
    as invalid merely because it contains unresolved references to absent schema components
1472
    (for example an <code>xs:attribute</code> declaration that refers to a named type which
1473
    is not declared in the schema). The specification suggests that such a schema should
1474
    be usable for validation provided that the missing components are never used.</p>
    <p>However, this strategy has complications:</p>
    <ul>
      <li>The effect of validation using such a schema is not very well defined,
      especially in XSD 1.0.</li>
      <li>The W3C test suite is written to treat such schemas as invalid (a processor
      that does what the spec says will fail over 500 tests).</li>
      <li>Users probably prefer to be told about the situation while the schema
      is under development, rather than when it is deployed in the field.</li>
      <li>It is much easier to produce clear diagnostics if the error is reported
      early.</li>
    </ul>
    <p>By default therefore, Saxon treats missing component references as a compile-time
    error in the schema.</p>
    <p>A configuration option, <a class="javalink" href="net.sf.saxon.lib.Feature#ALLOW_UNRESOLVED_SCHEMA_COMPONENTS"
      >Feature.ALLOW_UNRESOLVED_SCHEMA_COMPONENTS.name</a>, is available to change this
    behavior. If this option is set, the schema processor attempts to repair the schema
    to make it usable. For example, a reference to a missing type is replaced by a reference
    to <code>xs:error</code> (which has no valid instances); a reference to a missing
    attribute group or model group is replaced by a reference to an empty attribute
    group or model group; a reference to a missing element or attribute declaration is
    replaced by a reference to a local element or attribute declaration with a declared
    type of <code>xs:error</code>.</p>
    <p>One case where Saxon does not attempt a repair is where a type is derived from
    an absent base type. This is always a fatal error.</p>
    <p>It is not possible to generate an SCM file when this configuration option is set;
    SCM files cannot contain dangling references.</p>
1501
  </section>
1502
   <section id="extensions11" title="Saxon extensions to XML Schema 1.1">
1503
      <h1>Saxon extensions to XML Schema 1.1</h1>
1504

    
1505

    
1506
      <p>The XSD 1.1 Recommendation allows implementations to define their own primitive types and
1507
         facets.</p>
1508

    
1509
      <p>At present Saxon provides three additional facets, <a class="bodylink code" href="distinct"
1510
         >saxon:distinct</a>, <a class="bodylink code" href="order"
1511
            >saxon:order</a>, and <a class="bodylink code" href="preprocess"
1512
         >saxon:preprocess</a>. It also provides a number of additional attributes for various
1513
         elements, including: <a class="bodylink code" href="saxon.message">saxon:message</a> for
1514
         any facet; <a class="bodylink code" href="saxon.flags">saxon:flags</a> for the
1515
         <code>xs:pattern</code> facet; <a class="bodylink code" href="saxon.separator"
1516
            >saxon:separator</a> for the <code>xs:list</code> element, and <a class="bodylink code"
1517
               href="saxon.order-unique">saxon:order</a> for a <code>xs:unique/xs:field</code> element.</p>
1518

    
1519
      <p>Saxon extensions to the XML Schema Language are implemented in the Saxon namespace
1520
            <code>http://saxon.sf.net/</code>.</p>
1521
      <nav>
1522
         <ul/>
1523
      </nav>
1524

    
1525
      <section id="saxon.message" title="saxon:message - Customizing validation messages">
1526
         <h1>saxon:message - Customizing validation messages</h1>
1527

    
1528

    
1529
         <p>In assertions, and on all elements representing facets (for example
1530
            <code>pattern</code>), Saxon supports the attribute <code>saxon:message="<em>message
1531
               text</em>"</code>. This message text is used in error messages when the assertion or other
1532
            facet is not satisfied.</p>
1533

    
1534
         <p>For example:</p>
1535
         <samp><![CDATA[  <xs:element name="date">
1536
    <xs:simpleType>
1537
       <xs:restriction base="xs:date" xmlns:saxon="http://saxon.sf.net/">
1538
         <xs:assertion test=". lt current-date()"
1539
                    saxon:message="The date must not be in the future"/>
1540
         <xs:pattern value="[^Z:]*" 
1541
                    saxon:message="The date must not have a timezone"/>
1542
       </xs:restriction>   
1543
    </xs:simpleType>
1544
  </xs:element>
1545
]]></samp>
1546
      </section>
1547
      
1548
      <section id="saxon.order-unique" title="saxon:order - Ordered Uniqueness Constraints">
1549
         <h1>saxon:order - Ordered Uniqueness Constraints</h1>
1550
         
1551
         <p>Saxon allows the additional attribute <code>xs:unique/xs:field/@saxon:order</code>, with the permitted values
1552
            <code>ascending</code> and <code>descending</code>. If the attribute is present on at least one field of a uniqueness
1553
            constraint, then the constraint not only imposes uniqueness of values in the normal way, it
1554
            also requires the values to be ordered. If the attribute is present on at least one field of a uniqueness constraint,
1555
            then the value <code>saxon:order="ascending"</code> is assumed for any other fields of the constraint if not
1556
            explicitly specified.</p>
1557
         
1558
         <p>Not only does this provide an additional integrity constraint (one which is quite difficult to articulate
1559
            using assertions), it also makes checking of the uniqueness constraint much more efficient, since it only requires
1560
            the most recently-encountered value of the selected fields to be maintained.</p>
1561
         
1562
         <p>For example, the following (rather contrived) example indicates that employees in a data file are to be sorted
1563
            by ascending last name, then descending first name:</p>
1564
         
1565
         <samp><![CDATA[<xs:element name="employees">
1566
  <xs:unique>
1567
    <xs:selector xpath="employee"/>
1568
    <xs:field xpath="last" saxon:order="ascending"/>
1569
    <xs:field xpath="first" saxon:order="descending"/>
1570
  </xs:unique>
1571
</xs:element>]]></samp>
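
         <p>As a sketch of what such a constraint accepts, the following instance (with invented data) has
            <code>last</code> values in ascending order and, within the same <code>last</code> value,
            <code>first</code> values in descending order. Swapping the two <code>Jones</code> entries, or moving the
            <code>Smith</code> entry before them, would be expected to violate the constraint.</p>
         <samp><![CDATA[<employees>
  <employee><last>Jones</last><first>Zoe</first></employee>
  <employee><last>Jones</last><first>Adam</first></employee>
  <employee><last>Smith</last><first>Bob</first></employee>
</employees>]]></samp>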
1572
         
1573
      </section>
1574
      
1575
      <section id="saxon.separator" title="saxon:separator - Defining the separator for list values">
1576
         <h1>saxon:separator - Defining the separator for list values</h1>
1577
         
1578
         <p>Normally, in an element or attribute whose type is derived from <code>xs:list</code>,
1579
         the separator between items in the list must be whitespace. The <code>saxon:separator</code>
1580
         attribute can be specified on an <code>xs:list</code> element to define an alternative way
1581
            of separating items. The value is a regular expression. For example,
1582
         <code>saxon:separator=","</code> defines comma as the separator, <code>saxon:separator="\|"</code> uses a vertical bar,
1583
            and <code>saxon:separator=",\s*"</code> uses a comma followed by zero or more spaces.</p>
1584
         
1585
         <p>Tokenization of the supplied value is performed according to the rules of the XPath
1586
            <a class="bodylink code" href="/functions/fn/tokenize">fn:tokenize()</a>
1587
            function. Note this means that the regular expression must not be one that matches a zero-length string.</p>
1588
         
1589
         <p>If the item type of the list is one that collapses whitespace (for example <code>xs:integer</code> or <code>xs:date</code>)
1590
         then whitespace is automatically allowed before and after a separator; it does not need to be explicitly permitted
1591
         by the regular expression.</p>
1592
         
1593
         <p>If the input value starts or ends with a separator, then the result of tokenization will include a zero-length token.
1594
            With many item types (for example <code>xs:integer</code> or <code>xs:date</code>) a zero-length string is not a valid
1595
         value, so this will result in an error.</p>
1596
         
1597
         <p>An empty input string represents an empty list of values. An input string containing one or more whitespace characters
1598
         represents a list of length one whose only token comprises whitespace; for many item types, this is not a valid token.</p>
1599
         
1600
         <p>Note that the attribute has no impact on the way values are serialized. When constructing elements and attributes in the
1601
         result of a query or stylesheet, it will be necessary to insert the separators explicitly, typically by invoking the
1602
         <a class="bodylink code" href="/functions/fn/string-join">fn:string-join()</a> function.</p>
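
         <p>A minimal sketch of this in XSLT, assuming a comma separator, a variable <code>$numbers</code> holding the
            atomic values, and a result element named <code>values</code> (the variable and element names are
            invented for this illustration):</p>
         <samp><![CDATA[<values>
  <xsl:value-of select="string-join(for $n in $numbers return string($n), ',')"/>
</values>]]></samp>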
1603
         
1604
         <aside><p>The <code>saxon:separator</code> attribute has not been defined as a facet, because that would break
1605
            substitutability: it is not possible to define a restriction of a list type with a different separator,
1606
            because instances of the subtype would then not be valid instances of the base type.</p></aside>
1607
        
1608
         
1609
         <p>For example:</p>
1610
         <samp><![CDATA[  <xs:simpleType name="list-of-doubles" xmlns:saxon="http://saxon.sf.net/">
1611
    <xs:list item-type="xs:double" saxon:separator=",">
1612
  </xs:simpleType>]]></samp>
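
         <p>To illustrate the rules described in this section, suppose an element <code>values</code> (a name invented
            for this sketch) is declared with the type <code>list-of-doubles</code>. The comments indicate the outcome
            that those rules imply:</p>
         <samp><![CDATA[<!-- valid: three items -->
<values>1.5,2.5,3.5</values>

<!-- valid: xs:double collapses whitespace, so spaces around the separator are allowed -->
<values>1.5 , 2.5 , 3.5</values>

<!-- invalid: the leading separator yields a zero-length token, which is not a valid xs:double -->
<values>,1.5,2.5</values>

<!-- valid: an empty string represents an empty list -->
<values></values>]]></samp>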
1613
      </section>
1614
      
1615
      <section id="saxon.flags" title="saxon:flags - Regular expression flags on xs:pattern">
1616
         <h1>saxon:flags - Regular expression flags on xs:pattern</h1>
1617
         
1618
         <p>Saxon provides an additional attribute on the <code>xs:pattern</code> element,
1619
         namely <code>saxon:flags</code>. If present, the attribute is a string containing one or more
1620
         of the letters (i, m, s, x, q). The flags have the same meaning as in the flags argument of the
1621
         XPath <a class="bodylink code" href="/functions/fn/matches">fn:matches()</a> function; for
1622
            example <code>saxon:flags="i"</code> causes case-blind regular expression matching.</p>
1623
 
1624
         
1625
         <p>For example:</p>
1626
         <samp><![CDATA[  <xs:element name="code">
1627
    <xs:simpleType>
1628
       <xs:restriction base="xs:string" xmlns:saxon="http://saxon.sf.net/">
1629
         <xs:pattern value="[A-M][1-9]" saxon:flags="i"/>
1630
       </xs:restriction>   
1631
    </xs:simpleType>
1632
  </xs:element>
1633
]]></samp>
1634
         
1635
         <p>Valid values for the <code>code</code> element include "A3", "J2", and "m5".</p>
1636
      </section>
1637
      
1638
      <section id="distinct" title="The saxon:distinct facet">
1639
         <h1>The saxon:distinct facet</h1>
1640
         
1641
         <p>This facet can be used only on list types. If the facet is present, it constrains the values in the list
1642
         to be distinct. That is, no item in the list may be equal to a different item in the list. Equality
1643
         is defined by the schema equality rules (which are not the same as the XPath equality rules: for example an
1644
         <code>xs:double</code> is never equal to an <code>xs:integer</code>). Values that are not comparable (for example
1645
         an integer and a date) are considered distinct.</p>
1646
         
1647
         <p>The facet may be written either as <code>&lt;saxon:distinct value="true"/></code> or more simply
1648
            <code>&lt;saxon:distinct/></code>. The only permitted value for the <code>value</code> attribute is "true".</p>
1649
         
1650
         <p>If the facet is present on a type, then it is also implicitly present on all types derived by restriction.</p>
1651
         
1652
         <p>For example, a list of days-of-the-week, not permitting duplicates, might be written:</p>
1653
      
1654
         <samp><![CDATA[<xs:simpleType name="days">
1655
  <xs:restriction>
1656
    <xs:list>
1657
      <xs:simpleType>
1658
        <xs:restriction base="xs:string">
1659
          <xs:enumeration value="Mon"/>
1660
          <xs:enumeration value="Tues"/>
1661
          <xs:enumeration value="Weds"/>
1662
          <xs:enumeration value="Thurs"/>
1663
          <xs:enumeration value="Fri"/>
1664
          <xs:enumeration value="Sat"/>
1665
          <xs:enumeration value="Sun"/>
1666
        </xs:restriction>
1667
      </xs:simpleType>
1668
    </xs:list>
1669
    <saxon:distinct xmlns:saxon="http://saxon.sf.net/"/>
1670
  </xs:restriction>
1671
</xs:simpleType>]]></samp>
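
         <p>As a sketch, assuming an element <code>openDays</code> (a hypothetical name) declared with the type
            <code>days</code>, the first instance below would satisfy the facet and the second would not, because
            <code>Mon</code> appears twice:</p>
         <samp><![CDATA[<openDays>Mon Weds Fri</openDays>

<openDays>Mon Fri Mon</openDays>]]></samp>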
1672
         
1673
      </section>
1674
      
1675
      <section id="order" title="The saxon:order facet for list types">
1676
         <h1>The saxon:order facet for list types</h1>
1677
         
1678
         <p>This facet can be used only on list types. If the facet is present, it constrains the values in the list
1679
            to be in either ascending or descending order. Adjacent values in the list may be equal unless the
1680
            <code>saxon:distinct</code> facet is also present. Ordering
1681
            is defined by the schema ordering rules (which are not the same as the XPath ordering rules: for example the rules
1682
         for comparing <code>xs:dateTime</code> values with and without timezone are different). Each value in the list must
1683
         be strictly greater-than-or-equal (in the case of ascending order), or less-than-or-equal (in the case of descending
1684
         order) to the previous value in the list. If the values are not comparable, or if their ordering is indeterminate,
1685
         then the facet is not satisfied. Strings are ordered using codepoint collation.</p>
1686
         
1687
         <p>The facet may be written as <code>&lt;saxon:order value="ascending"/></code> or
1688
            <code>&lt;saxon:order value="descending"/></code>. The permitted values for the <code>value</code> attribute are "ascending"
1689
         and "descending".</p>
1690
         
1691
         <p>If the facet is present on a type, then it must also be present with the same value on all types derived by restriction.</p>
1692
         
1693
         <p>For example, a list of dates, in ascending order, with no timezone, might be written:</p>
1694
         
1695
         <samp><![CDATA[<xs:simpleType name="dates">
1696
  <xs:restriction>
1697
    <xs:list>
1698
      <xs:simpleType>
1699
        <xs:restriction base="xs:date">
1700
          <xs:explicitTimezone value="prohibited"/>
1701
        </xs:restriction>
1702
      </xs:simpleType>
1703
    </xs:list>
1704
    <saxon:order value="ascending" xmlns:saxon="http://saxon.sf.net/"/>
1705
  </xs:restriction>
1706
</xs:simpleType>]]></samp>
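
         <p>As a sketch, assuming an element <code>holidays</code> (a hypothetical name) declared with the type
            <code>dates</code>, the first instance below would satisfy the facet (adjacent equal values are allowed
            because <code>saxon:distinct</code> is not present), while the second would not, because the second date
            is earlier than the first:</p>
         <samp><![CDATA[<holidays>2011-01-01 2011-06-30 2011-06-30 2011-12-25</holidays>

<holidays>2011-06-30 2011-01-01</holidays>]]></samp>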
1707
         
1708
      </section>
1709
      
1710
      
1711

    
1712
      <section id="preprocess" title="The saxon:preprocess facet">
1713
         <h1>The saxon:preprocess facet</h1>
1714

    
1715
         <!--<p>Saxon provides the <code>saxon:preprocess</code> facet as an addition to the standard
1716
            facets defined in the XSD 1.1 specification. It is available only when XSD 1.1 support
1717
            is enabled.</p>-->
1718

    
1719
         <p>Like <code>xs:whiteSpace</code>, this is a pre-lexical facet. It is used to transform
1720
            the supplied lexical value of an element or attribute from the form as written (but
1721
            after whitespace normalization) to the lexical space of the base type. Constraining
1722
            facets such as <code>pattern</code>, <code>enumeration</code>, and
1723
               <code>minLength</code> apply to the value after the <code>saxon:preprocess</code>
1724
            facet has done its work. In addition, if the primitive type is say <code>xs:date</code>
1725
            or <code>xs:decimal</code>, the built-in lexical rules for parsing a date or a decimal
1726
            number are applied only after <code>saxon:preprocess</code> has transformed the value.
1727
            This makes it possible, for example, to accept <code>yes</code> and <code>no</code> as
1728
            values of an <code>xs:boolean</code>, <code>3,14159</code> as the value of an
1729
               <code>xs:decimal</code>, or <code>13DEC1987</code> as the value of an
1730
               <code>xs:date</code>.</p>
1731

    
1732
         <p>Like other facets, <code>saxon:preprocess</code> may be used as a child of
1733
               <code>xs:restriction</code> when restricting a simple type, or a complex type with
1734
            simple content.</p>
1735

    
1736
         <p>The attributes are:</p>
1737
         <table>
1738
            <tr>
1739
               <td>
1740
                  <p>
1741
                     <strong>Attribute</strong>
1742
                  </p>
1743
               </td>
1744
               <td>
1745
                  <p>
1746
                     <strong>Usage</strong>
1747
                  </p>
1748
               </td>
1749
            </tr>
1750
            <tr>
1751
               <td>
1752
                  <p>id</p>
1753
               </td>
1754
               <td>
1755
                  <p>Standard attribute.</p>
1756
               </td>
1757
            </tr>
1758
            <tr>
1759
               <td>
1760
                  <p>action</p>
1761
               </td>
1762
               <td>
1763
                  <p>Mandatory. An XPath expression. The rules for writing the XPath expression are
1764
                     generally the same as the rules for the <code>test</code> expression of
1765
                        <code>xs:assert</code>. The value to be transformed is supplied (as a
1766
                     string) as the value of the variable <code>$value</code>; the context item is
1767
                     undefined. The expression must return a single string. If evaluation of the
1768
                     expression fails with a dynamic error, this is interpreted as a validation
1769
                     failure.</p>
1770
               </td>
1771
            </tr>
1772
            <tr>
1773
               <td>
1774
                  <p>reverse</p>
1775
               </td>
1776
               <td>
1777
                  <p>Optional. An XPath expression used to reverse the transformation. Used (in
1778
                     XPath, XSLT, and XQuery) when a value of this type is converted to a string.
1779
                     When a value of this type is converted to a string, it is first converted
1780
                     according to the rules of the base type. The resulting string is then passed,
1781
                     as the value of variable <code>$value</code>, to the XPath expression, and the
1782
                     result of the XPath expression is used as the final output. This attribute does
1783
                     not affect the schema validation process itself.</p>
1784
               </td>
1785
            </tr>
1786
            <tr>
1787
               <td>
1788
                  <p>xpathDefaultNamespace</p>
1789
               </td>
1790
               <td>
1791
                  <p>The default namespace for element names (unlikely to appear in practice) and
1792
                     types.</p>
1793
               </td>
1794
            </tr>
1795
         </table>
1796

    
1797
         <p>The following example converts a string to upper-case before testing it against the
1798
            enumeration facet.</p>
1799
         <samp><![CDATA[<xs:simpleType name="currency">
1800
  <xs:restriction base="xs:string">
1801
    <saxon:preprocess action="upper-case($value)" xmlns:saxon="http://saxon.sf.net/"/>
1802
    <xs:enumeration value="USD"/>
1803
    <xs:enumeration value="EUR"/>
1804
    <xs:enumeration value="GBP"/>
1805
  </xs:restriction>
1806
</xs:simpleType>]]></samp>
1807

    
1808
         <p>Of course, it is not only the constraining facets that will see the preprocessed value
1809
            (in this case, the upper-case value), any XPath operation that makes use of the typed
1810
            value of an element or attribute node will also see the value after preprocessing.
1811
            However, the string value of the node is unchanged.</p>
1812

    
1813
         <p>The following example converts any commas appearing in the input to full stops, allowing
1814
            decimal numbers to be represented in Continental European style as <code>3,15</code>. On
1815
            output, the process is reversed, so that full stops are replaced by commas. (Note that
1816
            in this example, the user-defined type also accepts numbers written in the "standard"
1817
            style <code>3.15</code>.)</p>
1818
         <samp><![CDATA[<xs:simpleType name="euroDecimal">
1819
  <xs:restriction base="xs:decimal">
1820
    <saxon:preprocess action="translate($value, ',', '.')" 
1821
                      reverse="translate($value, '.', ',')"
1822
                      xmlns:saxon="http://saxon.sf.net/"/>
1823
  </xs:restriction>
1824
</xs:simpleType>]]></samp>
1825

    
1826
         <p>The following example allows an <code>xs:time</code> value to be written with the
1827
            seconds part omitted. Again, it also accepts the standard <code>hh:mm:ss</code>
1828
            notation:</p>
1829
         <samp><![CDATA[<xs:simpleType name="hoursAndMinutes">
1830
  <xs:restriction base="xs:time">
1831
    <saxon:preprocess action="concat($value, ':00'[string-length($value) = 5])" 
1832
                      xmlns:saxon="http://saxon.sf.net/"/>
1833
  </xs:restriction>
1834
</xs:simpleType>]]></samp>
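
         <p>In the same spirit, the following sketch shows one way that <code>yes</code> and <code>no</code> might be
            accepted as values of an <code>xs:boolean</code>. It is an illustration only: the type name
            <code>yesNo</code> is invented. Values other than <code>yes</code> and <code>no</code> are passed
            through unchanged, so the standard <code>xs:boolean</code> forms are still accepted.</p>
         <samp><![CDATA[<xs:simpleType name="yesNo">
  <xs:restriction base="xs:boolean">
    <saxon:preprocess action="if ($value = 'yes') then 'true'
                              else if ($value = 'no') then 'false'
                              else $value"
                      reverse="if ($value = 'true') then 'yes'
                               else if ($value = 'false') then 'no'
                               else $value"
                      xmlns:saxon="http://saxon.sf.net/"/>
  </xs:restriction>
</xs:simpleType>]]></samp>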
1835

    
1836
         <p>The following example uses extension function calls within the XPath expression to
1837
            support integers written in hexadecimal notation:</p>
1838
         <samp><![CDATA[<xs:simpleType name="hexInteger">
1839
  <xs:restriction base="xs:long">
1840
    <saxon:preprocess action="Long:parseLong($value, 16)" reverse="Long:toHexString(xs:long($value))"
1841
      xmlns:Long="java:java.lang.Long"
1842
      xmlns:saxon="http://saxon.sf.net/"/>
1843
  </xs:restriction>
1844
</xs:simpleType>]]></samp>
1845

    
1846
         <p>Given the input <code>&lt;val&gt;0040&lt;/val&gt;</code>, validated against this schema,
1847
            the query <code>(val*3) cast as hexInteger</code> will produce the output
1848
               <code>c0</code>.</p>
1849

    
1850
         <p>If the <code>xs:restriction</code> element defines facets other than
1851
               <code>saxon:preprocess</code>, for example <code>xs:enumeration</code> or
1852
               <code>xs:minInclusive</code>, then the values supplied in these other facets are
1853
            validated against the rules for the base type: that is, they are not subject to
1854
            preprocessing. So a facet that defines US date formats earlier than a certain date might
1855
            look like this:</p>
1856

    
1857
         <samp><![CDATA[<xs:simpleType name="us-date-before-2012">
1858
  <xs:restriction base="xs:date">
1859
    <saxon:preprocess action="concat(substring($value, 7, 4), '-', 
1860
                                     substring($value, 1, 2), '-', 
1861
                                     substring($value, 4, 2))"  
1862
                      xmlns:saxon="http://saxon.sf.net/"/>
1863
    <xs:maxInclusive value="2011-12-31"/>                  
1864
  </xs:restriction>
1865
</xs:simpleType>]]></samp>
1866

    
1867
         <p>However, if the type is further restricted, then facets for derived types will be
1868
            validated after preprocessing. So an alternative formulation of the above type would
1869
            be:</p>
1870

    
1871
         <samp><![CDATA[<xs:simpleType name="us-date">
1872
  <xs:restriction base="xs:date">
1873
    <saxon:preprocess action="concat(substring($value, 7, 4), '-', 
1874
                                     substring($value, 1, 2), '-', 
1875
                                     substring($value, 4, 2))"
1876
                      xmlns:saxon="http://saxon.sf.net/"/>                 
1877
  </xs:restriction>
1878
</xs:simpleType>
1879
<xs:simpleType name="us-date-before-2012">
1880
  <xs:restriction base="us-date">
1881
    <xs:maxInclusive value="12-31-2011"/>                  
1882
  </xs:restriction>
1883
</xs:simpleType>]]></samp>
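
         <p>As a worked illustration, assume an element <code>shipped</code> (a hypothetical name) declared with the type
            <code>us-date-before-2012</code>. The comments trace what the <code>action</code> expression would
            produce for each value and the outcome that follows from the <code>maxInclusive</code> limit:</p>
         <samp><![CDATA[<!-- the action yields 2010-12-25, which is not later than 2011-12-31: valid -->
<shipped>12-25-2010</shipped>

<!-- the action yields 2012-01-15, which is later than 2011-12-31: invalid -->
<shipped>01-15-2012</shipped>]]></samp>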
1884

    
1885
         <p>
1886
            <i>The preprocess facet is not currently implemented for list or union types.</i>
1887
         </p>
1888
      </section>
1889

    
1890
      <section id="extended-uniqueness-constraints"
1891
         title="Extended XPath expressions for XSD uniqueness and referential constraints">
1892
         <h1>Extended XPath expressions for XSD uniqueness and referential constraints</h1>
1893

    
1894

    
1895
         <p>This extension is only available if enabled by specifying the attribute
1896
               <code>saxon:extensions="id-xpath-syntax"</code> on the <code>xs:schema</code> element
1897
            of the containing schema document.</p>
1898
         
1899
         <p>The <code>saxon:extensions</code> attribute is a whitespace-separated list of keywords.</p>
1900

    
1901
         <p>With this extension enabled, restrictions are removed on the syntax allowed in the
1902
               <code>selector/@xpath</code> and <code>field/@xpath</code> attributes of the
1903
               <code>xs:unique</code>, <code>xs:key</code>, and <code>xs:keyref</code> elements.
1904
            Instead of the very limited XPath subset defined in the XSD 1.0 and XSD 1.1
1905
            specifications, Saxon will allow the same syntax as is permitted for streamable XPath
1906
            expressions in XSLT. Specifically, the syntax for both attributes is the same as allowed
1907
            in the <code>select</code> attribute of a streaming <a class="bodylink code"
1908
               href="/xsl-elements/apply-templates">xsl:apply-templates</a>, which is an extended
1909
            form of the XSLT 3.0 syntax for patterns. It permits, for example, any sequence of
1910
            downwards axes, arbitrary predicates on any step provided they do no downwards
1911
            selection, and conditional expressions.</p>
1912

    
1913
         <p>For example, the following is permitted, indicating that US-based employees must have a
1914
            unique social security number, but not imposing any such constraint on other
1915
            employees:</p>
1916
         <samp><![CDATA[<xs:element name="company">
1917
  <xs:unique>
1918
    <xs:selector xpath="employee[@location='us']"/>
1919
    <xs:field xpath="@ssid"/>
1920
  </xs:unique>
1921
</xs:element>]]></samp>
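
         <p>For context, a sketch of a complete schema document that enables the extension might look as follows. The
            constraint name <code>us-ssid</code> is invented for this illustration; the
            <code>saxon:extensions</code> attribute appears on the <code>xs:schema</code> element as described
            above.</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:saxon="http://saxon.sf.net/"
           saxon:extensions="id-xpath-syntax">
  <xs:element name="company">
    <!-- the name "us-ssid" is hypothetical -->
    <xs:unique name="us-ssid">
      <xs:selector xpath="employee[@location='us']"/>
      <xs:field xpath="@ssid"/>
    </xs:unique>
  </xs:element>
</xs:schema>]]></samp>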
1922
      </section>
1923
      
1924
      
1925
         
1926
      <section id="xpath-31-in-xsd-11"
1927
         title="Using XPath 3.1 in Assertions">
1928
         <h1>Using XPath 3.1 in Assertions</h1>
1929
         
1930
         <p>From Saxon 9.9 the XPath version used can be configured using the configuration 
1931
            property <code>Feature.XPATH_VERSION_FOR_XSD</code>.</p>
1932
         
1933
         <p>If the attribute <code>saxon:extensions="any-xpath-version"</code> is present on the <code>xs:schema</code>
1934
         element of a schema document, then XPath expressions used within assertions and conditional type assignments
1935
         within that schema document are allowed to use XPath 3.1 syntax. By default, only XPath 2.0 is allowed.
1936
         <em>This feature was new in Saxon 9.8.0.5.</em></p>
1937
         
1938
         <p>The <code>saxon:extensions</code> attribute is a whitespace-separated list of keywords.</p>
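
         <p>As a sketch, the following schema document enables the extension and uses an assertion written with the
            <code>||</code> and <code>=></code> operators, neither of which is available in XPath 2.0. The element and
            attribute names are invented for this illustration, and XSD 1.1 processing is assumed to be enabled so
            that <code>xs:assert</code> is recognized.</p>
         <samp><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:saxon="http://saxon.sf.net/"
           saxon:extensions="any-xpath-version">
  <xs:element name="person">
    <xs:complexType>
      <xs:attribute name="first" type="xs:string"/>
      <xs:attribute name="last" type="xs:string"/>
      <!-- the combined name must not exceed 40 characters -->
      <xs:assert test="(@first || ' ' || @last) => string-length() le 40"/>
    </xs:complexType>
  </xs:element>
</xs:schema>]]></samp>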
1939
      </section>
1940
         
1941
      <section id="saxon.param" title="Defining variables and parameters">
1942
         <h1>Defining variables and parameters for use in XPath expressions</h1>
1943
         
1944
         <p>Saxon allows a schema document to declare variables and parameters that can be
1945
         used in XPath expressions (for example, assertions) within the same schema document.
1946
         For details, see <a class="bodylink" href="/schema-processing/parameterizing-schemas">Parameterizing Schemas</a>.</p>
1947
      </section>
1948
   </section>
1949
</article>
(17-17/21)