Is the type of nodes returned by parse-xml/parse-xml-fragment/parse-html dependent on the tree model being used?
Added by Martin Honnen almost 2 years ago
Saxon has its own tree models and can wrap various others, such as DOM and JDOM. If I start with a certain tree model, I guess the tree I build with DocumentBuilder has XdmNodes wrapping the underlying original nodes. But I wonder what happens if part of my XPath uses parse-xml, parse-xml-fragment or parse-html: for the result of those function calls, will I always get XdmNodes in the form of TinyTree nodes?
From looking at the SaxonJ code it appears, rather oddly, that parse-xml() always builds a TinyTree, whereas parse-xml-fragment and parse-html use whatever has been configured using Feature.TREE_MODEL.
On SaxonCS parse-html puts a wrapper around the HTML tree constructed by AngleSharp.
I have tried some sample code, testing with Saxon 12. If I use XPath over DOM via net.sf.saxon.xpath.XPathFactoryImpl, Saxon interestingly enough does return DOM nodes (i.e. net.sf.saxon.dom.DocumentOverNodeInfo) for expressions that construct nodes with parse-xml-fragment or saxon:parse-html.
Then I was wondering whether, if I use s9api with e.g. saxonDocBuilder.setTreeModel(DOMObjectModel.getInstance()); and evaluate such expressions with parse-xml-fragment or saxon:parse-html, I could expect such a net.sf.saxon.dom.DocumentOverNodeInfo to be available via getExternalNode(), but that seems to return null.
In the end I am looking at writing extension methods on the SaxonCS API and e.g. System.Xml.XmlNode to use Saxon's XPath implementation, and I wonder whether there is a way to get, or configure the API to return, an XdmNode for things like parse-xml-fragment whose GetUnderlyingXmlNode() returns the underlying node; so far that always seems to return null.
Any thoughts on whether and how that would be possible?
DocumentOverNodeInfo is an "inverse wrapper": we wrap Saxon's XDM nodes in a (read-only) DOM node. I think we use this only where we have to support an interface where DOM nodes are mandated, which occurs in XQJ and in the JAXP XPath API. There's no equivalent in SaxonCS.
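To illustrate what "DOM nodes are mandated" means for the JAXP XPath API, here is a minimal sketch that uses only the JDK. It runs the JDK's built-in XPath 1.0 engine rather than Saxon, so functions such as parse-xml-fragment are not available in it; the class name JaxpNodeResult and the helper selectFirst are made up for the example. The point is only that a result requested as XPathConstants.NODE has to come back as an org.w3c.dom.Node, which is why an implementation whose nodes live in a non-DOM tree needs some kind of DOM-facing wrapper.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class JaxpNodeResult {

    // Hypothetical helper: parses the XML text into a DOM and returns the
    // first node selected by the XPath expression (or null if none matches).
    public static Node selectFirst(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // JDK default factory; assumption: Saxon is not involved here.
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Asking for XPathConstants.NODE means the caller receives the
        // result as an org.w3c.dom.Node, whatever the engine uses internally.
        return (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Node item = selectFirst("<root><item>one</item></root>", "/root/item");
        // Print the node name and the concrete class that implements Node.
        System.out.println(item.getNodeName() + " -> " + item.getClass().getName());
    }
}
```

Swapping in net.sf.saxon.xpath.XPathFactoryImpl as the factory, as in the test described earlier in this thread, should be the place where the concrete class printed becomes one of Saxon's DOM wrappers, but that is not exercised by this sketch.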
In SaxonCS I think the only time we create DOM nodes (System.Xml.XmlNode) is when you send output to a DomDestination; and there it will be a true DOM node, not a wrapper. Indeed, it has to be, because System.Xml.XmlNode is a class, not an interface.
I guess the next step in integration with the .NET XML infrastructure would be to implement IXPathNavigable, or to accept an IXPathNavigable as input. That's certainly feasible; I don't know how valuable it would be.