<!--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-->
<!-- Copyright (c) 2014 Saxonica Limited. -->
<!-- This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. -->
<!-- If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/. -->
<!-- This Source Code Form is "Incompatible With Secondary Licenses", as defined by the Mozilla Public License, v. 2.0. -->
<!--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-->
<html>
<head>
    <title>Package overview for net.sf.saxon.serialize.charcode</title>
</head>
<body>
<p>This package provides classes for handling different character sets, especially
    when serializing the output of a query or transformation.</p>
<p>Most of the classes in this package are implementations of the interface <code>CharacterSet</code>.
    The sole function of these classes is to determine whether a particular character is present in the
    character set: if it is not, Saxon has to replace it with a character reference.</p>
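<p>The membership test and the fallback to a character reference can be sketched as follows.
    This is a minimal illustration, not Saxon's code: the <code>SimpleCharacterSet</code> interface,
    the <code>ASCII</code> instance and the <code>escape</code> method are hypothetical names
    introduced for this example, and decimal references are an arbitrary choice.</p>

```java
class CharRefSketch {

    /** Hypothetical stand-in for a character-set membership test; not Saxon's actual interface. */
    interface SimpleCharacterSet {
        boolean inCharset(int codePoint);
    }

    /** Illustrative character set: only code points 0..127 are representable. */
    static final SimpleCharacterSet ASCII = cp -> cp <= 0x7F;

    /** Copies text, replacing each code point outside the set with a decimal character reference. */
    static String escape(CharSequence text, SimpleCharacterSet charset) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            // Read a full code point so that surrogate pairs are handled as one character.
            int cp = Character.codePointAt(text, i);
            if (charset.inCharset(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("caf\u00e9", ASCII)); // prints caf&#233;
    }
}
```

<p>Working in code points rather than <code>char</code> values means that a character outside the
    Basic Multilingual Plane is emitted as a single reference rather than as two surrogate halves.</p>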
<p>The actual translation of Unicode characters to characters in the selected encoding
    is left to the Java run-time library. (Note that different versions of Java support
    different sets of encodings, and there is no easy way to find out which encodings
    are supported in a given installation.)</p>
<p>It is possible to configure Saxon to support additional character sets by writing an
    implementation of the <code>CharacterSet</code> interface and registering this class with the
    <code>Configuration</code> using the call <code>getCharacterSetFactory().setCharacterSetImplementation()</code>.</p>
<p>If an output encoding is requested that Saxon does not recognize, but which the Java
    platform does recognize, then Saxon attempts to determine which characters the encoding
    can represent, so that unsupported characters can be written as numeric character references.
    Saxon wraps the Java <code>Charset</code> object in a <code>JavaCharacterSet</code> object,
    and tests whether a character is encodable by calling the Java method
    <code>encoding.canEncode()</code>, caching the result locally. Since this mechanism
    appears to have become reliable in JDK 1.5, it is now used much more widely than before,
    and most character sets are now supported in Saxon by relying on this mechanism.</p>
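<p>The wrap-and-cache approach described above might look roughly like the sketch below. It is
    not the source of <code>JavaCharacterSet</code>: the class name <code>CachedEncodabilitySketch</code>
    and the use of a <code>HashMap</code> as the cache are assumptions made for illustration. Only
    the standard <code>java.nio.charset</code> calls are real API.</p>

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical wrapper around a Java Charset that caches encodability answers. */
class CachedEncodabilitySketch {

    private final CharsetEncoder encoder;
    // Cache of previous answers, keyed by code point (an assumed design, chosen for brevity).
    private final Map<Integer, Boolean> cache = new HashMap<>();

    CachedEncodabilitySketch(Charset charset) {
        this.encoder = charset.newEncoder();
    }

    /** Returns true if the code point can be encoded in the wrapped charset. */
    boolean inCharset(int codePoint) {
        Boolean known = cache.get(codePoint);
        if (known == null) {
            // canEncode(CharSequence) also covers supplementary characters (surrogate pairs).
            known = encoder.canEncode(new String(Character.toChars(codePoint)));
            cache.put(codePoint, known);
        }
        return known;
    }

    public static void main(String[] args) {
        CachedEncodabilitySketch latin1 =
                new CachedEncodabilitySketch(StandardCharsets.ISO_8859_1);
        System.out.println(latin1.inCharset(0xE9));   // e-acute is representable in ISO-8859-1
        System.out.println(latin1.inCharset(0x20AC)); // the euro sign is not
    }
}
```

<p>A serializer that tests every output character benefits from the cache because each repeated
    character costs only a map lookup after its first occurrence. A <code>CharsetEncoder</code> is not
    safe for concurrent use, so a wrapper like this one would need one instance per thread or
    external synchronization.</p>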
</body>
</html>