Bug #4319

Namespace declaration no longer serialized with Saxon 9.9.1-5

Added by Radu Coravu about 1 year ago. Updated 7 months ago.

Status: Closed
Priority: Low
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2019-09-25
% Done: 0%

Description

If an XMLReader reports an attribute in a namespace to the content handler without first declaring the prefix, the Saxon identity transform no longer adds the necessary namespace declaration to the element. Sample code:

  public static void main(String[] args) throws TransformerFactoryConfigurationError, TransformerException, IOException {
	  Transformer transformer = TransformerFactory.newInstance().newTransformer();
	  XMLReader reader = new XMLReader() {
		private ContentHandler handler;
		@Override
		public void setProperty(String name, Object value) throws SAXNotRecognizedException, SAXNotSupportedException {
		}
		@Override
		public void setFeature(String name, boolean value) throws SAXNotRecognizedException, SAXNotSupportedException {
		}
		@Override
		public void setErrorHandler(ErrorHandler handler) {
		}
		@Override
		public void setEntityResolver(EntityResolver resolver) {
		}
		@Override
		public void setDTDHandler(DTDHandler handler) {
		}
		
		@Override
		public void setContentHandler(ContentHandler handler) {
			this.handler = handler;
		}
		
		@Override
		public void parse(String systemId) throws IOException, SAXException {
			parse((InputSource)null);
		}
		
		@Override
		public void parse(InputSource input) throws IOException, SAXException {
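			handler.startDocument();
			// The attribute is in a namespace, but no startPrefixMapping("oxy", ...) is
			// issued before startElement, so the "oxy" prefix is never declared as an event.
			AttributesImpl attrs = new AttributesImpl();
			attrs.addAttribute("http://www.oxygenxml.com/schematron/validation", "elementURI", "oxy:elementURI", "CDATA", "abc");
			// The element itself is in no namespace.
			handler.startElement("", "elem", "elem", attrs);
			handler.endElement("", "elem", "elem");
			handler.endDocument();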
		}
		@Override
		public Object getProperty(String name) throws SAXNotRecognizedException, SAXNotSupportedException {
			return null;
		}
		@Override
		public boolean getFeature(String name) throws SAXNotRecognizedException, SAXNotSupportedException {
			return false;
		}
		@Override
		public ErrorHandler getErrorHandler() {
			return null;
		}
		@Override
		public EntityResolver getEntityResolver() {
			return null;
		}
		@Override
		public DTDHandler getDTDHandler() {
			return null;
		}
		@Override
		public ContentHandler getContentHandler() {
			return handler;
		}
	};
	  SAXSource src = new SAXSource(reader, new InputSource("file://abc.txt"));
	  StringWriter sw = new StringWriter();
	  StreamResult res = new StreamResult(sw);
	  transformer.transform(src, res);
	  sw.close();
	  System.err.println(sw.toString());
  }

With Saxon 9.8 the code returns:

<?xml version="1.0" encoding="UTF-8"?><elem xmlns:oxy="http://www.oxygenxml.com/schematron/validation" oxy:elementURI="abc"/>

but with Saxon 9.9.1-5 it returns:

<?xml version="1.0" encoding="UTF-8"?><elem oxy:elementURI="abc"/>

History

#1 Updated by Radu Coravu about 1 year ago

To make this work I now also need to issue a "startPrefixMapping" callback before the start element. This was not required previously.
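A minimal sketch of the event sequence described above, using only the JDK's SAX classes (no Saxon). The class name PrefixMappingSketch and its helper methods are hypothetical; the events are the ones from the sample code, with startPrefixMapping and endPrefixMapping added around the element:

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

import java.util.ArrayList;
import java.util.List;

public class PrefixMappingSketch {
    static final String OXY_NS = "http://www.oxygenxml.com/schematron/validation";

    /** Sends the sample events, declaring the "oxy" prefix around the element. */
    static void emit(ContentHandler handler) throws SAXException {
        handler.startDocument();
        // Declare the prefix before the element that uses it.
        handler.startPrefixMapping("oxy", OXY_NS);
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute(OXY_NS, "elementURI", "oxy:elementURI", "CDATA", "abc");
        handler.startElement("", "elem", "elem", attrs);
        handler.endElement("", "elem", "elem");
        // The mapping goes out of scope after the matching endElement.
        handler.endPrefixMapping("oxy");
        handler.endDocument();
    }

    /** Records the order of the events, so it can be inspected without Saxon. */
    static List<String> recordEvents() {
        final List<String> events = new ArrayList<>();
        DefaultHandler recorder = new DefaultHandler() {
            @Override public void startDocument() { events.add("startDocument"); }
            @Override public void startPrefixMapping(String prefix, String uri) {
                events.add("startPrefixMapping " + prefix);
            }
            @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("startElement " + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }
            @Override public void endPrefixMapping(String prefix) {
                events.add("endPrefixMapping " + prefix);
            }
            @Override public void endDocument() { events.add("endDocument"); }
        };
        try {
            emit(recorder);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : recordEvents()) {
            System.out.println(event);
        }
    }
}
```

In the original sample, the body of parse(InputSource) would be replaced by the body of emit(handler).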

#2 Updated by Michael Kay about 1 year ago

It has always been the case that our ReceivingContentHandler has expectations on the stream of SAX events. The Javadoc for RCH states:

The {@code ReceivingContentHandler} is written on the assumption that it is receiving events
 from a parser configured with {@code http://xml.org/sax/features/namespaces} set to true
 and {@code http://xml.org/sax/features/namespace-prefixes} set to false.

(that is, namespaces must be notified using startPrefixMapping() and endPrefixMapping()).

There's a big hole here: the detail of what is sent over the interface to a ContentHandler depends on the configuration settings of the XMLReader that sends the events. But (a) the sender might not actually be an XMLReader, and (b) even if it is, we might not have access to it (e.g. in the TransformerHandler interface), in which case the ContentHandler has no way of discovering how it is configured. It would be very inefficient (and maybe impossible) to cater for all possible variations, so we just document our expectations.

I'm not sure exactly what has changed between releases - perhaps we removed a redundant NamespaceReducer from the input pipeline, somewhere between the ReceivingContentHandler and the builder. The NamespaceReducer basically performs namespace fixup.
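For client code that cannot easily change its event source, a small filter placed in front of the receiving ContentHandler can do a limited form of this fixup. The sketch below is not Saxon's NamespaceReducer; AttributeNamespaceFixup is a hypothetical class built on the JDK's XMLFilterImpl and NamespaceSupport, and it only handles prefixed attributes whose prefix has not been declared:

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.NamespaceSupport;
import org.xml.sax.helpers.XMLFilterImpl;

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class AttributeNamespaceFixup extends XMLFilterImpl {
    private final NamespaceSupport inScope = new NamespaceSupport();
    // Prefixes this filter declared itself, one list per open element.
    private final Deque<List<String>> addedPerElement = new ArrayDeque<>();
    private boolean contextPushed = false;

    @Override
    public void startDocument() throws SAXException {
        inScope.reset();
        addedPerElement.clear();
        contextPushed = false;
        super.startDocument();
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        // Mappings arrive before the startElement they belong to.
        if (!contextPushed) {
            inScope.pushContext();
            contextPushed = true;
        }
        inScope.declarePrefix(prefix, uri);
        super.startPrefixMapping(prefix, uri);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        if (!contextPushed) {
            inScope.pushContext();
        }
        contextPushed = false;
        List<String> added = new ArrayList<>();
        for (int i = 0; i < atts.getLength(); i++) {
            String attUri = atts.getURI(i);
            String attQName = atts.getQName(i);
            int colon = attQName.indexOf(':');
            if (attUri.isEmpty() || colon <= 0) {
                continue; // no namespace, or no prefix to declare
            }
            String prefix = attQName.substring(0, colon);
            if (!attUri.equals(inScope.getURI(prefix))) {
                // Prefix is undeclared (or bound elsewhere): declare it here.
                inScope.declarePrefix(prefix, attUri);
                super.startPrefixMapping(prefix, attUri);
                added.add(prefix);
            }
        }
        addedPerElement.push(added);
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, localName, qName);
        for (String prefix : addedPerElement.pop()) {
            super.endPrefixMapping(prefix);
        }
        inScope.popContext();
    }

    /** Runs the events from the bug report through the filter and records what comes out. */
    static List<String> demo() {
        final List<String> events = new ArrayList<>();
        AttributeNamespaceFixup filter = new AttributeNamespaceFixup();
        filter.setContentHandler(new DefaultHandler() {
            @Override public void startPrefixMapping(String prefix, String uri) {
                events.add("startPrefixMapping " + prefix + "=" + uri);
            }
            @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("startElement " + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }
            @Override public void endPrefixMapping(String prefix) {
                events.add("endPrefixMapping " + prefix);
            }
        });
        try {
            filter.startDocument();
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("http://www.oxygenxml.com/schematron/validation",
                    "elementURI", "oxy:elementURI", "CDATA", "abc");
            filter.startElement("", "elem", "elem", attrs);
            filter.endElement("", "elem", "elem");
            filter.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The filter would sit between the event source and the handler obtained from Saxon, via setContentHandler. It does not fix undeclared element prefixes or remove redundant declarations.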

#3 Updated by Michael Kay about 1 year ago

I would add that on some interfaces we do attempt to set the configuration of the XMLReader using setFeature/setProperty -- and there are some features/properties that a conformant XMLReader is obliged to recognize -- but your XMLReader is ignoring all requests to set features and properties, which is itself non-conformant...

We encounter that quite often with user-written XMLReader implementations.
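As an illustration of the point about features: the SAX XMLReader contract requires every reader to recognize the two namespace feature names. A hand-written reader that only produces events in the namespaces=true / namespace-prefixes=false style could implement setFeature and getFeature roughly as follows. FeatureSupportSketch and its probe helper are hypothetical names, and this is a sketch, not what Saxon or any particular parser does:

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

import java.util.HashMap;
import java.util.Map;

public class FeatureSupportSketch {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    // The only configuration this hypothetical reader can deliver.
    private final Map<String, Boolean> fixedFeatures = new HashMap<>();

    public FeatureSupportSketch() {
        fixedFeatures.put(NAMESPACES, Boolean.TRUE);
        fixedFeatures.put(NAMESPACE_PREFIXES, Boolean.FALSE);
    }

    /** Same signature as XMLReader.setFeature. */
    public void setFeature(String name, boolean value)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        Boolean fixed = fixedFeatures.get(name);
        if (fixed == null) {
            // Unknown feature: say so instead of silently ignoring the request.
            throw new SAXNotRecognizedException(name);
        }
        if (fixed.booleanValue() != value) {
            // Known feature, but this reader cannot honour the requested value.
            throw new SAXNotSupportedException(name + " cannot be set to " + value);
        }
    }

    /** Same signature as XMLReader.getFeature. */
    public boolean getFeature(String name)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        Boolean fixed = fixedFeatures.get(name);
        if (fixed == null) {
            throw new SAXNotRecognizedException(name);
        }
        return fixed.booleanValue();
    }

    /** Helper for trying a setting without handling checked exceptions at the call site. */
    static String probe(String name, boolean value) {
        try {
            new FeatureSupportSketch().setFeature(name, value);
            return "accepted";
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    public static void main(String[] args) {
        System.out.println(probe(NAMESPACES, true));
        System.out.println(probe(NAMESPACE_PREFIXES, true));
        System.out.println(probe("http://example.com/unknown-feature", true));
    }
}
```

With this approach a caller that asks for an unsupported configuration gets an exception rather than a reader that silently behaves differently from what was requested.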

#4 Updated by Radu Coravu about 1 year ago

I just created that very basic XMLReader implementation to show the problem. I can fix things in our code, I just wanted to signal this change between Saxon versions. So it's your choice if you want to make this work again or leave it as it is.

#5 Updated by Radu Coravu about 1 year ago

About the removal of that namespace reducer from the pipeline: as a related problem, some of our automated tests started failing. They apply XQuery updates to namespaced XML documents, and the output now has extra namespace declarations on the elements that were updated:

<p:personnel xmlns:p="http://www.oxygenxml.com/ns/samples/personal">
  <p:person id="harris.anderson" photo="personal-images/harris.anderson.jpg"><extra xmlns="otherNs">
    <p:url href="http://www.example.com" xmlns:p="http://www.oxygenxml.com/ns/samples/personal"/>
  </extra></p:person>
</p:personnel>

Notice the extra 'xmlns:p="http://www.oxygenxml.com/ns/samples/personal"' namespace declaration on the "p:url" element. I know the XML is equivalent, but we offer XML Refactoring actions based on XQuery Update, and people expect minimal changes.

#6 Updated by Michael Kay about 1 year ago

Yes, the removal of the NamespaceReducer has made us less resilient to errors in the client code that generates the events. But we are still exposed to other errors in the sequence of events.

#7 Updated by Michael Kay 7 months ago

  • Status changed from New to Closed

Decided to close this with no further action.
