Copy DOM level 3 user data with xsl:copy / keeping track of nodes

Added by Christian Lück 12 days ago

Is it possible to get DOM nodes as underlying nodes from transformer.applyTemplates(xdmDoc), if underlying nodes in xdmDoc are DOM nodes? And, if so, will DOM level 3 user data be copied?

Here is what I tried. After wrapping a DOM document (parsed with Xerces' DOMParser) as a Saxon XdmNode with

XdmNode xdmDoc = processor.newDocumentBuilder().wrap(document);

it is possible to get and set user data for each node like so:

org.w3c.dom.Node node = xdmNode.getExternalNode();
node.setUserData("my-data", data, null);
...
T myData = (T) node.getUserData("my-data");

However, after running a transformation with

XdmValue result = transformer.applyTemplates(xdmDoc);

there is no DOM level 3 user data on the nodes contained in the result sequence. The underlying nodes aren't DOM nodes at all.

What I need this for:

I would like to keep track of nodes when they are copied by XSLT. The goal is to map Web Annotation Data Model's XPathSelectors in the source document to XPathSelectors in the result document. Especially, keeping track of text nodes is of interest.

Therefore, my DOM level 3 user data contains identifiers of a sort (traces), and I want to end up with mappings of source nodes to result nodes (forward) and vice versa (reverse):

Map<XdmNode, List<XdmNode>> forward;      // 1 -> 0..n
Map<XdmNode, Optional<XdmNode>> reverse;  // 1 -> 0..1

I also tried to keep track of nodes with xdmNode.hashCode(), but it turned out that copied nodes have a different hash. Maybe the copy's hash is derived deterministically, so that the hash of the original node can be inferred?

Using any other Java-level approach from the Saxon API would be fine, too.


Replies (5)

RE: Copy DOM level 3 user data with xsl:copy / keeping track of nodes - Added by Martin Honnen 12 days ago

If you want a W3C DOM Document result, use a DOMDestination (https://www.saxonica.com/html/documentation12/javadoc/net/sf/saxon/s9api/DOMDestination.html). I don't know offhand, however, whether any user data of DOM nodes would be copied to result nodes.

RE: Copy DOM level 3 user data with xsl:copy / keeping track of nodes - Added by Michael Kay 12 days ago

XSLT operates on the XDM data model, and when it copies nodes, it will only copy things that are defined in XDM. It doesn't know about any user data in the nodes so it won't copy it.

As Martin says, it's easy enough to get the transformation output in DOM form, but in effect the input DOM has been converted to XDM and then back to DOM, which isn't going to be lossless.
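
The loss can be illustrated without Saxon. The following is only a minimal sketch, assuming the JDK's built-in JAXP identity transformer as a stand-in for Saxon and a DOMResult in place of a DOMDestination; the class name UserDataRoundTrip and its helper methods are made up for illustration. It sets user data on a source node, copies the tree through a transformation, and then looks for the data on the rebuilt node:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses only JAXP from the JDK, not Saxon.
class UserDataRoundTrip {

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Runs the JAXP identity transformation from one DOM tree into a newly built one. */
    static Document identityCopy(Document source) throws Exception {
        DOMResult result = new DOMResult();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(source), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document source = parse("<doc><p>text</p></doc>");
        Element p = (Element) source.getElementsByTagName("p").item(0);

        // Attach DOM level 3 user data to the source node (no UserDataHandler).
        p.setUserData("my-data", "trace-1", null);

        Document copy = identityCopy(source);
        Element copiedP = (Element) copy.getElementsByTagName("p").item(0);

        // The data lives on the source node object, not in the data model
        // that the transformation sees, so the rebuilt node does not carry it.
        System.out.println("source: " + p.getUserData("my-data"));
        System.out.println("copy:   " + copiedP.getUserData("my-data"));
    }
}
```

With Saxon the details differ (the DOM is wrapped as XDM rather than walked by the JAXP identity transformer), but the principle described above is the same: the result nodes are new objects built from the data model, not clones of the DOM nodes.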

If you want external data to survive the transformation then the best approach is probably to package it up somehow as an attribute value.
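
A rough sketch of the attribute approach follows. It is not Saxon-specific: it assumes the JDK's built-in XSLT 1.0 processor via JAXP, and the attribute name trace-id, the class TraceAttributes and its methods are all hypothetical. Every source element is stamped with an identifier before the transformation, and the forward (1 -> 0..n) and reverse (1 -> 0..1) mappings are rebuilt afterwards from the attributes that survived:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are illustrative only.
class TraceAttributes {
    static final String TRACE_ATTR = "trace-id"; // assumed attribute name

    // Example stylesheet: identity copy (attributes included), dropping <skip>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='skip'/>"
      + "</xsl:stylesheet>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Stamps every element with a unique trace id and returns id -> source element. */
    static Map<String, Element> stamp(Document doc) {
        Map<String, Element> byId = new LinkedHashMap<>();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            String id = "t" + i;
            e.setAttribute(TRACE_ATTR, id);
            byId.put(id, e);
        }
        return byId;
    }

    static Document transform(Document source) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        DOMResult result = new DOMResult();
        t.transform(new DOMSource(source), result);
        return (Document) result.getNode();
    }

    /** Forward mapping: each source element -> its 0..n copies in the result. */
    static Map<Element, List<Element>> forward(Map<String, Element> sourceById, Document result) {
        Map<Element, List<Element>> forward = new IdentityHashMap<>(); // keyed by object identity
        for (Element src : sourceById.values()) {
            forward.put(src, new ArrayList<>());
        }
        NodeList all = result.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element copy = (Element) all.item(i);
            Element src = sourceById.get(copy.getAttribute(TRACE_ATTR));
            if (src != null) {
                forward.get(src).add(copy);
            }
        }
        return forward;
    }

    /** Reverse mapping: a result element -> the source element it was copied from, if any. */
    static Optional<Element> sourceOf(Map<String, Element> sourceById, Element resultElement) {
        return Optional.ofNullable(sourceById.get(resultElement.getAttribute(TRACE_ATTR)));
    }

    public static void main(String[] args) throws Exception {
        Document source = parse("<doc><p>a</p><skip>b</skip></doc>");
        Map<String, Element> byId = stamp(source);
        Map<Element, List<Element>> forward = forward(byId, transform(source));
        forward.forEach((src, copies) ->
                System.out.println(src.getTagName() + " -> " + copies.size() + " copies"));
    }
}
```

This only works as long as the stylesheet copies the attribute, and it covers elements only: text nodes cannot carry attributes, so they would have to be located relative to a stamped parent element.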

RE: Copy DOM level 3 user data with xsl:copy / keeping track of nodes - Added by Christian Lück 12 days ago

Dear Martin, dear Michael,

Thanks for your replies! In the hour between your answers, I constructed a DOM with a DOMDestination, and I can confirm that no user data is present after the transformation. That is also how I understood the Destination interface: it is simply a sink for the internal XdmValue.

Keeping track of nodes by storing paths in attributes was my first approach. While this is simple, the downside is that the stylesheets have to account for it. So I tried to pass data through the transformation on the Java level (the transformer level, not the transformation level), in order to keep track of nodes in arbitrary transformations.

What about xdmNode.hashCode()? Is there some kind of function that calculates the hash of a copied node on the basis of the original node's hash and that can be reversed?

RE: Copy DOM level 3 user data with xsl:copy / keeping track of nodes - Added by Christian Lück 12 days ago

PS. Keeping track of nodes on the transformer level would enable us to use Saxon not only to transform documents, but also to transform references to internals of documents (path expressions).

RE: Copy DOM level 3 user data with xsl:copy / keeping track of nodes - Added by Michael Kay 12 days ago

"Is there some kind of function that calculates the hash of a copied node on the basis of the original node's hash and that can be reversed?"

"Copying" a node usually means rebuilding it from scratch, because although it's described as a copy it often involves things like changing the namespace bindings and the base URI. There are a few cases where Saxon does a "virtual copy" but it's only possible in rather special circumstances.
