Project

Profile

Help

How to connect?
Download (4.96 KB) Statistics
| Branch: | Revision:

he / src / main / java / net / sf / saxon / tree / tiny / package.html @ d9cb5c62

1
<!--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-->
2
<!-- Copyright (c) 2014 Saxonica Limited. -->
3
<!-- This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. -->
4
<!-- If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/. -->
5
<!-- This Source Code Form is "Incompatible With Secondary Licenses", as defined by the Mozilla Public License, v. 2.0. -->
6
<!--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-->
7

    
8
<html>
9

    
10
<head>
11
    <title>Package overview for net.sf.saxon.tree.tiny</title>
12
</head>
13

    
14
<body>
15

    
16
<p>This package is an implementation of the Saxon internal tree structure,
17
    designed to minimize memory usage, and the costs of allocating and garbage-collecting
18
    Java objects.</p>
19

    
20
<p>The data structure consists of a set of arrays, held in the <code>TinyTree</code> object.
21
    A <code>TinyTree</code> represents one or more root document or element nodes, together with their
22
    subtrees. If there is more than one root node, these will often be members of a sequence, but
23
    this is not essential and is never assumed.
24
    The arrays are in three groups. </p>
25

    
26
<p>The principal group of arrays contain one entry for each node other
27
    than namespace and attribute nodes. These arrays are in document order. The following
28
    information is maintained for each node: the
29
    depth in the tree, the name code, the index of the next sibling, and two fields labelled "alpha" and "beta".
30
    The meaning of "alpha" and
31
    "beat" depends on the node type. For text nodes, comment nodes, and processing instructions
32
    these fields index into a StringBuffer holding the text. For element nodes, "alpha" is
33
    an index into the attributes table, and "beta" is an offset into the namespaces table.
34
    Either of these may be set to -1 if there are no attributes/namespaces.</p>
35

    
36
<p>A name code is an integer value that indexes into the NamePool object: it can be
37
    used to determine the prefix, local name, or namespace URI of an element or attribute name.</p>
38

    
39
<p>The attribute group holds the following information for each attribute node: parent
40
    element, prefix, name code, attribute type, and attribute value. Attributes
41
    for the same element are adjacent.</p>
42

    
43
<p>The namespace group holds one entry per namespace declaration (not one per namespace
44
    node). The following information is held: a pointer to the element on which the namespace
45
    was declared, and a namespace code. A namespace code is an integer, which the NamePool can
46
    resolve into a prefix and a namespace URI: the top 16 bits identify the prefix, the bottom
47
    16 bits the URI.</p>
48

    
49
<p>The data structure contains no Java object references: the links between elements and
50
    attributes/namespaces are all held as integer offsets. This reduces size, and also makes
51
    the whole structure relocatable (though this capability is not currently exploited).
52
    All navigation is done by serial traversal of the arrays, using
53
    the node depth as a guide. An array of pointers to the preceding sibling is created on
54
    demand, the first time that backwards navigation is attempted. There are no parent pointers;
55
    Saxon attempts to remember the parent while navigating down the tree, and where this is not
56
    possible it locates the parent by searching through the following siblings; the last sibling
57
    points back to the parent. The absence of the other pointers is a trade-off between tree-building time and
58
    transformation time: I found that in most cases, more time was spent creating these pointers
59
    than actually using them. Occasionally, however, in trees with a very large fan-out, locating
60
    ancestors can be slow.</p>
61

    
62
<p>When the tree is navigated, transient ("flyweight") nodes are created as Java objects.
63
    These disappear as soon as they are no longer needed. Note that to compare two nodes for
64
    identity, you can use either the isSameNode() method, or compare the results of
65
    generateId(). Comparing the Java objects using "==" is incorrect.</p>
66

    
67
<p>The tree structure implements the DOM interface as well as the Saxon NodeInfo interface.
68
    There are limitations in the DOM support, however: especially (a) the tree is immutable, so
69
    all updating methods throw an exception; (b) namespace declarations are not exposed as
70
    attributes, and (c) only the core DOM classes are provided.</p>
71

    
72
<p>The primary way of navigating the tree is through the XPath axes, accessible through the
73
    iterateAxis() method. The familiar DOM methods such as getNextSibling() and getFirstChild()
74
    are not provided as an intrinsic part of the <code>NodeInfo</code> interface: all navigation
75
    is done by iterating the axes, and each tree model provides its own implementations of the axes.
76
    However, there are helper methods in the shared <code>Navigator</code> class which many
77
    of these implementations choose to use.</p>
78

    
79

    
80
</body>
81
</html>
(33-33/33)