Bug #6596
closed
Failure upgrading to Axiom 1.4.0 - CDATA nodes
Category:
Saxon extensions
Applies to branch:
12, trunk
Fixed in Maintenance Release:
Description
As I said in Slack:
I was trying to chase up a Maven failure in a shell script. I think org.apache.ws.commons.axiom:axiom:{version} doesn't exist. I don't know why the build doesn't fail. I think that should be org.apache.ws.commons.axiom:axiom-api:{version}. Also, we're loading 1.2.x and the latest is 1.4.0. I'm going to bump the dependencies.
We don't distribute these jars, so the risk is smaller, but we should document the versions we test against.
Casual experiments with upgrading to 1.4.0 were unsuccessful. It will still build if we reduce the dependencies to only axiom-dom and axiom-impl, but 1.4.0 appears to introduce a new node type for CDATA. Attempting to treat CDATA as text is only partially successful, as the resulting text nodes don't get merged.
Looking at the JDOM2 interface, I see that managing adjacent text nodes is spread across a few different methods. It's not immediately clear if the Axiom model can be approached in the same way.
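To make the "text nodes don't get merged" symptom concrete, here is a minimal sketch. The Kind and Leaf types are hypothetical stand-ins invented for illustration; they are not Axiom, JDOM2 or Saxon classes, and this is not how any of the tree wrappers is actually implemented. It only contrasts mapping CDATA to text one-for-one with additionally folding runs of adjacent text nodes into one.

```java
import java.util.ArrayList;
import java.util.List;

public class AdjacentTextSketch {
    // Hypothetical leaf kinds for illustration only (not Axiom or Saxon types).
    enum Kind { TEXT, CDATA, ELEMENT }

    // Hypothetical immutable child node: a kind plus its string value.
    static final class Leaf {
        final Kind kind;
        final String value;
        Leaf(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    // Present each CDATA node as a text node, one-for-one.
    // Adjacent text nodes remain separate nodes.
    static List<Leaf> cdataAsText(List<Leaf> children) {
        List<Leaf> out = new ArrayList<>();
        for (Leaf c : children) {
            out.add(c.kind == Kind.CDATA ? new Leaf(Kind.TEXT, c.value) : c);
        }
        return out;
    }

    // As above, but also fold every run of adjacent text nodes into a single node.
    static List<Leaf> mergeAdjacentText(List<Leaf> children) {
        List<Leaf> out = new ArrayList<>();
        for (Leaf c : cdataAsText(children)) {
            int last = out.size() - 1;
            if (c.kind == Kind.TEXT && last >= 0 && out.get(last).kind == Kind.TEXT) {
                out.set(last, new Leaf(Kind.TEXT, out.get(last).value + c.value));
            } else {
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Children of an element whose content is a<![CDATA[b]]>c
        List<Leaf> children = List.of(
                new Leaf(Kind.TEXT, "a"),
                new Leaf(Kind.CDATA, "b"),
                new Leaf(Kind.TEXT, "c"));
        System.out.println(cdataAsText(children).size());       // 3 text nodes
        System.out.println(mergeAdjacentText(children).size()); // 1 text node
    }
}
```

A test that expects merged text nodes would see one child here with the second method and three with the first, which matches the partial success described above.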
Curiously, 1.2.15 has an OMNode.CDATA_SECTION_NODE but doesn't use it? Or maybe the Axiom API has methods for merging adjacent nodes and those have changed in 1.4.0?
According to the Axiom docs,
Preserving CDATA sections during parsing
By default, StAXUtils creates StAX parsers in coalescing mode. In this mode, the parser will never return two character data events in sequence, while in non-coalescing mode, the parser is allowed to break up character data into smaller chunks and to return multiple consecutive character events, which may improve throughput for documents containing large text nodes. It should be noted that StAXUtils overrides the default settings mandated by the StAX specification, which specifies that by default, a StAX parser must be in non-coalescing mode. The primary reason is compatibility: older versions of Woodstox had coalescing switched on by default.
A side effect of the default settings chosen by Axiom is that by default, CDATA sections are not reported by parsers created by StAXUtils. The reason is that in coalescing mode, the parser will not only coalesce adjacent text nodes, but also CDATA sections. Applications that require correct reporting of CDATA sections should therefore disable coalescing. This can be achieved by creating an XMLInputFactory.properties file with the following content:
javax.xml.stream.isCoalescing=false
But using System.setProperty to change values of javax.xml.stream.isCoalescing doesn't seem to have any effect in either version of the API.
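For comparison, in the standard StAX API javax.xml.stream.isCoalescing is a property set on an XMLInputFactory instance (the constant XMLInputFactory.IS_COALESCING), and the passage quoted above describes a properties file rather than a JVM system property, which may be part of why System.setProperty appears to do nothing. The sketch below uses only the JDK's javax.xml.stream classes, not Axiom's StAXUtils, so it shows what the setting does at the parser level and says nothing about how Axiom reads its configuration. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CoalescingSketch {
    // Parse the document and return the text of each character-data event
    // (CHARACTERS or CDATA), in document order.
    static List<String> textEvents(String xml, boolean coalescing) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Set the property on the factory instance; this is not a system property.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
        List<String> events = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.CHARACTERS
                            || event == XMLStreamConstants.CDATA) {
                        events.add(reader.getText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<r>a<![CDATA[b]]>c</r>";
        // Coalescing: adjacent text and CDATA arrive as a single event.
        System.out.println(textEvents(xml, true));
        // Non-coalescing: the parser may report the pieces separately.
        System.out.println(textEvents(xml, false));
    }
}
```

How the non-coalescing output is chunked, and whether CDATA is reported as its own event type, depends on the StAX implementation on the classpath.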
- Subject changed from How is axiom used? to Failure upgrading to Axiom 1.4.0 - CDATA nodes
- Category set to Saxon extensions
- Assignee set to Michael Kay
- Priority changed from Low to Normal
- Applies to branch 12, trunk added
- Platforms Java added
Changed AxiomLeafNodeWrapper to treat a CDATA node on the child axis in the same way as a text node (i.e., wrapping it in a NodeInfo of type text).
Changed AxiomTreeTests to set mergesAdjacentTextNodes to false, so the test no longer expects adjacent text nodes to be merged (the Axiom tree wrapper, like the DOM4J tree wrapper, makes no attempt to do this).
- Private changed from Yes to No
- Status changed from New to Resolved