Bug #6596

closed

Failure upgrading to Axiom 1.4.0 - CDATA nodes

Added by Norm Tovey-Walsh 27 days ago. Updated 25 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Saxon extensions
Sprint/Milestone:
-
Start date:
2024-11-25
Due date:
% Done:

0%

Estimated time:
Legacy ID:
Applies to branch:
12, trunk
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:
Java

Description

As I said in Slack:

I was trying to chase up a Maven failure in a shell script. I think org.apache.ws.commons.axiom:axiom:{version} doesn't exist. I don't know why the build doesn't fail. I think that should be org.apache.ws.commons.axiom:axiom-api:{version}. Also, we're loading 1.2.x and the latest is 1.4.0. I'm going to bump the dependencies.

We don't distribute these jars, so the risk is smaller, but we should document the versions we test against.

Actions #1

Updated by Norm Tovey-Walsh 27 days ago

Casual experiments with upgrading to 1.4.0 were unsuccessful. It will still build if we reduce the dependencies to only axiom-dom and axiom-impl, but 1.4.0 appears to introduce a new node type for CDATA. Attempting to treat CDATA as text is only partially successful as the resulting text nodes don't get merged.

Actions #2

Updated by Norm Tovey-Walsh 27 days ago

Looking at the JDOM2 interface, I see that managing adjacent text nodes is spread across a few different methods. It's not immediately clear if the Axiom model can be approached in the same way.

Actions #3

Updated by Norm Tovey-Walsh 27 days ago

Curiously, 1.2.15 has an OMNode.CDATA_SECTION_NODE constant but doesn't seem to use it. Or maybe the Axiom API has methods for merging adjacent nodes and those have changed in 1.4.0?

Actions #4

Updated by Norm Tovey-Walsh 27 days ago

According to the Axiom docs,

Preserving CDATA sections during parsing

By default, StAXUtils creates StAX parsers in coalescing mode. In this mode, the parser will never return two character data events in sequence, while in non coalescing mode, the parser is allowed to break up character data into smaller chunks and to return multiple consecutive character events, which may improve throughput for documents containing large text nodes. It should be noted that StAXUtils overrides the default settings mandated by the StAX specification, which specifies that by default, a StAX parser must be in non coalescing mode. The primary reason is compatibility: older versions of Woodstox had coalescing switched on by default.

A side effect of the default settings chosen by Axiom is that by default, CDATA sections are not reported by parsers created by StAXUtils. The reason is that in coalescing mode, the parser will not only coalesce adjacent text nodes, but also CDATA sections. Applications that require correct reporting of CDATA sections should therefore disable coalescing. This can be achieved by creating an XMLInputFactory.properties file with the following content:

javax.xml.stream.isCoalescing=false

But using System.setProperty to change values of javax.xml.stream.isCoalescing doesn't seem to have any effect in either version of the API.
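
One possible explanation, offered as a guess rather than a diagnosis: in the standard StAX API, javax.xml.stream.isCoalescing is a property of an XMLInputFactory instance (the constant XMLInputFactory.IS_COALESCING), not a JVM system property, so a parser factory has no obligation to consult System.getProperty for it. The sketch below shows the setting applied directly to a factory. It uses only the JDK's built-in StAX parser, not Axiom's StAXUtils, and the class and method names (CoalescingDemo, textEvents) are made up for illustration.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class CoalescingDemo {

    /**
     * Parses the document and returns the text of every CHARACTERS or CDATA
     * event, in document order. (Hypothetical helper, not part of Axiom or Saxon.)
     */
    static List<String> textEvents(String xml, boolean coalescing) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // The coalescing switch is a factory property, set per factory instance.
            factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            List<String> chunks = new ArrayList<>();
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA) {
                    chunks.add(reader.getText());
                }
            }
            reader.close();
            return chunks;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc>a<![CDATA[b]]>c</doc>";
        System.out.println("coalescing:     " + textEvents(xml, true));
        System.out.println("non-coalescing: " + textEvents(xml, false));
    }
}
```

With coalescing on, the text and the CDATA section should come back as a single chunk. With coalescing off, the parser may return several chunks, and whether the CDATA section is reported as a CDATA event or as CHARACTERS depends on the parser implementation.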

Actions #5

Updated by Michael Kay 26 days ago

  • Subject changed from How is axiom used? to Failure upgrading to Axiom 1.4.0 - CDATA nodes
  • Category set to Saxon extensions
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal
  • Applies to branch 12, trunk added
  • Platforms Java added

Actions #6

Updated by Michael Kay 26 days ago

Changed AxiomLeafNodeWrapper to treat a CDATA node on the child axis in the same way as a text node (i.e., wrapping it in a NodeInfo of type text).

Changed AxiomTreeTests to set mergesAdjacentTextNodes to false, so the test no longer expects adjacent text nodes to be merged (the Axiom tree wrapper, like the DOM4J tree wrapper, makes no attempt to do this).
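
For readers unfamiliar with the distinction, here is a minimal sketch of what merging adjacent text nodes would involve, which is the step the wrapper deliberately skips. It is not the Saxon or Axiom code: Kind, Child and mergeAdjacentText are hypothetical names invented for this example, and the child list is a plain Java list rather than a real tree.

```java
import java.util.ArrayList;
import java.util.List;

public class TextMergeSketch {

    // Hypothetical node kinds; these are not Axiom or Saxon types.
    enum Kind { ELEMENT, TEXT, CDATA }

    // Hypothetical child node: a kind plus its string content
    // (element name for ELEMENT, character data otherwise).
    static final class Child {
        final Kind kind;
        final String value;

        Child(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    /** Treats CDATA as text and concatenates each run of adjacent text-like children. */
    static List<Child> mergeAdjacentText(List<Child> children) {
        List<Child> result = new ArrayList<>();
        StringBuilder pending = null; // text accumulated from the current run, if any
        for (Child c : children) {
            if (c.kind == Kind.TEXT || c.kind == Kind.CDATA) {
                if (pending == null) {
                    pending = new StringBuilder();
                }
                pending.append(c.value);
            } else {
                if (pending != null) {
                    result.add(new Child(Kind.TEXT, pending.toString()));
                    pending = null;
                }
                result.add(c);
            }
        }
        if (pending != null) {
            result.add(new Child(Kind.TEXT, pending.toString()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Child> children = new ArrayList<>();
        children.add(new Child(Kind.TEXT, "a"));
        children.add(new Child(Kind.CDATA, "b"));
        children.add(new Child(Kind.TEXT, "c"));
        children.add(new Child(Kind.ELEMENT, "e"));
        children.add(new Child(Kind.TEXT, "d"));
        for (Child c : mergeAdjacentText(children)) {
            System.out.println(c.kind + " " + c.value);
        }
    }
}
```

Without such a pass, the three children a, CDATA b and c would surface as three separate text nodes, which is why the test expectation had to change.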

Actions #7

Updated by Norm Tovey-Walsh 26 days ago

  • Private changed from Yes to No

Actions #8

Updated by Michael Kay 25 days ago

  • Status changed from New to Resolved
