Bug #5691


Appending a node discards any location that it might have had

Added by Norm Tovey-Walsh 3 months ago. Updated 3 months ago.

Applies to branch: 10, 11, trunk


Outputter.append() is:

    public void append(Item item) throws XPathException {
        append(item, Loc.NONE, ReceiverOption.ALL_NAMESPACES);
    }

That means any item you append loses its location. I have some code that copies a tree, selectively modifying parts of it. I assumed that if I reached a node that I wasn't going to modify, I could simply call append() with node.getUnderlyingNode(). But if I do that, then the location is lost and consequently the base URI of the copied node(s) is lost.

I can work around this by doing the recursive descent myself and outputting each node encountered. But that seems like unnecessary work.
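
To make the difference between the two approaches concrete, here is a minimal sketch using a toy model. Everything in it is hypothetical: `Node`, `appendWithoutLocation`, `copyWithLocation` and the path `file:/tmp/main.xml` are invented for illustration and are not Saxon classes or methods. The first copy mimics a convenience overload that substitutes an empty location; the second is the recursive-descent workaround that passes each node's own location along.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these classes are hypothetical and are not Saxon's API. */
public class LocationCopyDemo {

    /** A tree node that remembers the system ID it was read from. */
    static final class Node {
        final String name;
        final String systemId;          // "" stands in for "no location"
        final List<Node> children = new ArrayList<>();

        Node(String name, String systemId) {
            this.name = name;
            this.systemId = systemId;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Mimics a convenience overload: every copied node gets the empty location. */
    static Node appendWithoutLocation(Node source) {
        Node copy = new Node(source.name, "");
        for (Node child : source.children) {
            copy.add(appendWithoutLocation(child));
        }
        return copy;
    }

    /** The workaround: walk the tree and carry each node's own location across. */
    static Node copyWithLocation(Node source) {
        Node copy = new Node(source.name, source.systemId);
        for (Node child : source.children) {
            copy.add(copyWithLocation(child));
        }
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical input tree; the path is made up for the example.
        Node module = new Node("module", "file:/tmp/main.xml")
                .add(new Node("foo", "file:/tmp/main.xml"))
                .add(new Node("bar", "file:/tmp/main.xml"));

        // prints "true": the location of bar was dropped
        System.out.println(appendWithoutLocation(module).children.get(1).systemId.isEmpty());
        // prints "file:/tmp/main.xml": the location of bar survived
        System.out.println(copyWithLocation(module).children.get(1).systemId);
    }
}
```

The sketch only shows why the explicit walk preserves information that a location-free append cannot; it says nothing about how Saxon's Outputter handles locations internally.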

Files (82.3 KB) Norm Tovey-Walsh, 2022-09-20 10:57
Actions #1

Updated by Norm Tovey-Walsh 3 months ago

  • Applies to branch 10, 11, trunk added
Actions #2

Updated by Michael Kay 3 months ago

This is a complicated area. The location passed to an Outputter is generally the location of the instruction that constructs the node, not the location of the node itself. The copy-of instruction takes special (somewhat devious) measures to ensure that the location of the node being copied is passed to the DocumentBuilder that constructs the copy of the node; in some cases (not all) the new node retains the location of the old node.

Also complicating this is that the location of a node isn't in all cases the same thing as its base URI.
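
One familiar way the two can diverge (not necessarily the case meant here) is an xml:base attribute: the node still lives in the same file, but its base URI is the xml:base value resolved against that file's URI. A small sketch with plain `java.net.URI`; the helper `effectiveBaseUri` and the path `file:/docs/main.xml` are assumptions made up for the example, and Saxon's own rules may involve more than this.

```java
import java.net.URI;

public class BaseUriDemo {

    /** Resolves an optional xml:base value against the URI of the containing entity. */
    static URI effectiveBaseUri(URI systemId, String xmlBase) {
        return xmlBase == null ? systemId : systemId.resolve(xmlBase);
    }

    public static void main(String[] args) {
        // Hypothetical system ID of the document the node was parsed from.
        URI systemId = URI.create("file:/docs/main.xml");

        // No xml:base: base URI and system ID coincide.
        System.out.println(effectiveBaseUri(systemId, null));        // file:/docs/main.xml

        // With xml:base="chapters/": same system ID, different base URI.
        System.out.println(effectiveBaseUri(systemId, "chapters/")); // file:/docs/chapters/
    }
}
```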

Need more detail on this one before making any changes.

Actions #3

Updated by Norm Tovey-Walsh 3 months ago

That's fair. Here's some context from a Slack chat:

Here's a weird one. I'm processing an XInclude. I construct a tree for the included content by calling various methods on the Receiver. Let's say the content I construct is . The baseURI (which comes from the underlying node's getSystemId()) for foo is fine, for ?pi? is "", and for bar is "". The presence of the PI makes the following element(s) not have a base URI. I thought this was caused by my failure to supply a base URI for the PI. But I've changed the code so that I do this:

    public void addPI(XdmNode node) {
        Location location = VoidLocation.instance();
        if (node.getBaseURI() != null) {
            location = new SysIdLocation(node.getBaseURI().toString());
        }
        addPI(node.getNodeName().getLocalName(), node.getStringValue(), location);
    }

where the three-argument form of addPI is:

    public void addPI(String target, String data, Location location) {
        try {
            receiver.processingInstruction(target, StringView.of(data), location, 0);
        } catch (XPathException e) {
            throw new XProcException(e);
        }
    }

The original PI that I'm effectively copying does have a base URI so the SysIdLocation is correct and is being passed to the receiver.

It looks like decompose() in ComplexContentOutputter is copying the PI again and passing a null location, because append in Outputter passes Loc.NONE.

I confess, I don't see where the PI comes into it. It would make more sense if all the elements lost their base URIs.

I'll see if I can extract out a test case.

Actions #4

Updated by Norm Tovey-Walsh 3 months ago

The attached test case demonstrates the problem. (I've done no more to investigate it beyond reproducing it.)

If you run the main class, it will print:

    module: file:/Volumes/Saxonica/src/saxonica/test-apps/iss5691/main.xml: file:/Volumes/Saxonica/src/saxonica/test-apps/iss5691/main.xml
      foo: file:/Volumes/Saxonica/src/saxonica/test-apps/iss5691/main.xml: file:/Volumes/Saxonica/src/saxonica/test-apps/iss5691/main.xml
      pi: :
      bar: :

For some reason, appending the inner module document loses all base URIs starting at the first PI.
