errorhandler saxsource vs. transformerhandler

Added by Anonymous almost 16 years ago

Legacy ID: #6050097 Legacy Poster: kendall shaw (queshaw)

Should I expect my ErrorHandler to be invoked when using a SAXSource with Saxon, or am I expecting something that I shouldn't? If I use a SAXSource with JAXP and Saxon, it appears that my ErrorHandler is never invoked when there are parse errors regarding DTD validation. But if I use a TransformerHandler, it is invoked. For example:

    package test;

    import java.io.StringReader;

    import javax.xml.transform.ErrorListener;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.sax.SAXSource;
    import javax.xml.transform.sax.SAXTransformerFactory;
    import javax.xml.transform.sax.TransformerHandler;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    import net.sf.saxon.TransformerFactoryImpl;

    import org.xml.sax.ErrorHandler;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLReaderFactory;

    public class JaxpParseErrors {

        // Reports errors raised by the transformer.
        private static class MyErrorListener implements ErrorListener {
            public void error(TransformerException exception)
                    throws TransformerException {
                System.out.println("ErrorListener(error): " + exception);
            }

            public void fatalError(TransformerException exception)
                    throws TransformerException {
                System.out.println("ErrorListener(fatalError): " + exception);
                throw exception;
            }

            public void warning(TransformerException exception)
                    throws TransformerException {
                System.out.println("ErrorListener(warning): " + exception);
            }
        }

        // Reports errors raised by the SAX parser (including DTD validation).
        private static class MyErrorHandler implements ErrorHandler {
            public void error(SAXParseException exception) throws SAXException {
                System.out.println("ErrorHandler(error): " + exception);
            }

            public void fatalError(SAXParseException exception)
                    throws SAXException {
                System.out.println("ErrorHandler(fatalError): " + exception);
                throw exception;
            }

            public void warning(SAXParseException exception)
                    throws SAXException {
                System.out.println("ErrorHandler(warning): " + exception);
            }
        }

        public static void main(String[] args) throws Exception {
            JaxpParseErrors j = new JaxpParseErrors();
            j.go();
            //j.goSax();
        }

        // Variant 1: hand the pre-configured XMLReader to the Transformer
        // via a SAXSource.
        public void go() throws Exception {
            // The document is invalid: attribute 'a' is not declared.
            String xml = "<!DOCTYPE doc [\n"
                    + "<!ELEMENT doc EMPTY>\n"
                    + "]>\n"
                    + "<doc a='b'/>\n";
            StringReader ssr = new StringReader(xml);
            InputSource is = new InputSource(ssr);
            XMLReader xr = XMLReaderFactory.createXMLReader();
            xr.setErrorHandler(new MyErrorHandler());
            xr.setFeature("http://xml.org/sax/features/validation", true);
            SAXSource ss = new SAXSource(xr, is);

            String xsl = "<transform version='1.0'"
                    + " xmlns='http://www.w3.org/1999/XSL/Transform'>\n"
                    + "<template match='@*|node()'>\n"
                    + "<copy><apply-templates select='@*|node()'/></copy>\n"
                    + "</template>\n"
                    + "</transform>\n";
            StringReader xsr = new StringReader(xsl);
            StreamSource xs = new StreamSource(xsr);
            StreamResult sr = new StreamResult(System.out);

            ErrorListener el = new MyErrorListener();
            TransformerFactory tf = new TransformerFactoryImpl();
            tf.setErrorListener(el);
            Transformer trans = tf.newTransformer(xs);
            trans.setErrorListener(el);
            trans.transform(ss, sr);
        }

        // Variant 2: drive the parse myself and feed the events to a
        // TransformerHandler.
        public void goSax() throws Exception {
            String xml = "<!DOCTYPE doc [\n"
                    + "<!ELEMENT doc EMPTY>\n"
                    + "]>\n"
                    + "<doc a='b'/>\n";
            StringReader ssr = new StringReader(xml);
            InputSource is = new InputSource(ssr);
            XMLReader xr = XMLReaderFactory.createXMLReader();
            xr.setErrorHandler(new MyErrorHandler());
            xr.setFeature("http://xml.org/sax/features/validation", true);

            String xsl = "<transform version='1.0'"
                    + " xmlns='http://www.w3.org/1999/XSL/Transform'>\n"
                    + "<template match='@*|node()'>\n"
                    + "<copy><apply-templates select='@*|node()'/></copy>\n"
                    + "</template>\n"
                    + "</transform>\n";
            StringReader xsr = new StringReader(xsl);
            StreamSource xs = new StreamSource(xsr);
            StreamResult sr = new StreamResult(System.out);

            ErrorListener el = new MyErrorListener();
            SAXTransformerFactory tf = new TransformerFactoryImpl();
            tf.setErrorListener(el);
            TransformerHandler th = tf.newTransformerHandler(xs);
            th.setResult(sr);
            Transformer trans = th.getTransformer();
            trans.setErrorListener(el);

            xr.setContentHandler(th);
            xr.setProperty("http://xml.org/sax/properties/lexical-handler", th);
            xr.parse(is);
        }
    }

If I use j.go(), MyErrorHandler is never invoked, but if I use j.goSax() it is. This is on Linux with JDK 1.6.0_11 and saxonb 9.1.0.5. If I remove Saxon from the classpath and use TransformerFactory.newInstance(), the error handler is invoked for both methods.


Replies (4)

RE: errorhandler saxsource vs. transformerhandler - Added by Anonymous almost 16 years ago

Legacy ID: #6061603 Legacy Poster: Michael Kay (mhkay)

I think that Saxon is modifying the ErrorHandler associated with the user-supplied XMLReader, which it shouldn't really do. I'll take a look at this on my return from vacation next week.

RE: errorhandler saxsource vs. transformerhandler - Added by Anonymous almost 16 years ago

Legacy ID: #6130995 Legacy Poster: Michael Kay (mhkay)

This is a tricky one. It would be nice if one could say "If the user supplies an XMLReader, Saxon shouldn't mess with it". However, that's not possible: at the very least, Saxon has to set the ContentHandler and LexicalHandler for the parser so that it (Saxon) can see the parsing events. It then doesn't seem a major extension of this principle to say that Saxon should also set the ErrorHandler so that it can process the error events. There are other properties of the parser Saxon sets, for example it forces namespaces to be notified the way Saxon expects them, and it sets DTD validation and XInclude processing as requested in the Saxon configuration or parsing options.

One possible solution would be for Saxon to intercept the error events and then pass them on to the original ErrorHandler. However, that still wouldn't achieve the aim of "not messing with the XMLReader" as supplied.

The other thing I have to consider here is, would changing the code cause more problems than it solves? It's very hard to judge how many user applications would break (if only in subtle aspects of error handling) as a result of changing this. I'm going to do some experiments to see what happens - I haven't made a decision yet.

I've given up long ago trying to "just do what Xalan does". Chasing a moving target isn't in the interests of Saxon users. Even when the JAXP spec changes to rubber-stamp the Xalan way of doing things (as it often does), I have to think about the impact on Saxon users of implementing a semantic change.
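The "intercept the error events and then pass them on" option could look roughly like the sketch below. This is not Saxon's code, only an illustration using the standard SAX API; the class names `ForwardingErrorHandler`, `CountingHandler` and `ForwardingDemo` are hypothetical, and the sketch assumes the JDK's default validating parser is the one in use.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical wrapper: the library sees each error event first, then
// forwards it to whatever handler the user had installed (if any).
class ForwardingErrorHandler implements ErrorHandler {
    private final ErrorHandler original; // may be null
    int seen = 0;                        // events observed by the wrapper

    ForwardingErrorHandler(ErrorHandler original) {
        this.original = original;
    }

    public void warning(SAXParseException e) throws SAXException {
        seen++;
        if (original != null) original.warning(e);
    }

    public void error(SAXParseException e) throws SAXException {
        seen++;
        if (original != null) original.error(e);
    }

    public void fatalError(SAXParseException e) throws SAXException {
        seen++;
        if (original != null) original.fatalError(e);
        else throw e; // no user handler: stop the parse
    }
}

// Stand-in for the user's own handler; it just counts what it is told.
class CountingHandler extends DefaultHandler {
    int count = 0;
    @Override public void warning(SAXParseException e) { count++; }
    @Override public void error(SAXParseException e) { count++; }
    @Override public void fatalError(SAXParseException e) throws SAXException {
        count++;
        throw e;
    }
}

class ForwardingDemo {
    // Returns { events seen by the wrapper, events seen by the user handler }.
    static int[] run() throws Exception {
        // Same invalid document as in the original post.
        String xml = "<!DOCTYPE doc [\n<!ELEMENT doc EMPTY>\n]>\n<doc a='b'/>\n";
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setValidating(true);
        XMLReader xr = spf.newSAXParser().getXMLReader();

        CountingHandler user = new CountingHandler();
        xr.setErrorHandler(user); // what the application does

        // What a library could do instead of replacing the handler outright.
        ForwardingErrorHandler wrapper =
                new ForwardingErrorHandler(xr.getErrorHandler());
        xr.setErrorHandler(wrapper);

        xr.setContentHandler(new DefaultHandler());
        xr.parse(new InputSource(new StringReader(xml)));
        return new int[] { wrapper.seen, user.count };
    }

    public static void main(String[] args) throws Exception {
        int[] r = run();
        System.out.println("wrapper saw " + r[0] + ", user handler saw " + r[1]);
    }
}
```

Note that the wrapper still replaces the handler registered on the XMLReader, so a later call to `getErrorHandler()` by the application would return the wrapper rather than the user's object, which is the sense in which this approach still "messes with" the reader.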

RE: errorhandler saxsource vs. transformerhandler - Added by Anonymous almost 16 years ago

Legacy ID: #6147899 Legacy Poster: kendall shaw (queshaw)

I'm not really complaining that saxon behaves differently than xalan. It isn't obvious to me that the go and goSax methods should behave differently. In the javadoc for jdk 1.6.0_11 the SAXSource constructor says that the transformer or SAXTransformerFactory will set itself to be the content handler and invoke parse. I suppose it's not saying that it won't also set other things on the XMLReader. I think it would be convenient if the go and goSax methods behaved the same and invoked the errorhandler that was set.

RE: errorhandler saxsource vs. transformerhandler - Added by Anonymous almost 16 years ago

Legacy ID: #6151772 Legacy Poster: Michael Kay (mhkay)

You can never read the JAXP specs completely literally - if the Transformer tried to set itself as the ContentHandler, you would get a ClassCastException, because a Transformer doesn't (in general) implement the ContentHandler interface. And it certainly has to set the lexical handler as well, if it's going to conform to XSLT semantics.

I'm experimenting with not setting the ErrorHandler in the case of a user-created SAXSource whose XMLReader and its ErrorHandler have been preinitialized. So far, there seem to be no adverse consequences. But I've only tested with one SAX parser so far, and SAX parsers are notoriously variable in the detail of how they report errors (for example, whether or not an exception thrown by a ContentHandler callback method is then reported to the ErrorHandler). This is particularly true of user-written implementations of XMLReader (typically acting as SAX filters). So it's an area where making changes could easily cause more problems than it solves.
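One way to check the parser variability mentioned above is a small probe that throws from a ContentHandler callback and records whether the ErrorHandler hears about it. The sketch below is an assumption-laden illustration, not part of Saxon: `HandlerExceptionProbe` and `probe` are hypothetical names, and the result depends on whichever XMLReader is passed in.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class HandlerExceptionProbe {

    // Returns { exception propagated out of parse(), ErrorHandler was notified }.
    static boolean[] probe(XMLReader xr) throws Exception {
        final boolean[] notified = { false };

        // ContentHandler that fails on the first element it sees.
        xr.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts)
                    throws SAXException {
                throw new SAXException("thrown from ContentHandler");
            }
        });

        // ErrorHandler that only records whether it was called at all.
        xr.setErrorHandler(new DefaultHandler() {
            @Override public void warning(SAXParseException e) {
                notified[0] = true;
            }
            @Override public void error(SAXParseException e) {
                notified[0] = true;
            }
            @Override public void fatalError(SAXParseException e)
                    throws SAXException {
                notified[0] = true;
                throw e;
            }
        });

        boolean propagated = false;
        try {
            xr.parse(new InputSource(new StringReader("<doc/>")));
        } catch (SAXException e) {
            propagated = true;
        }
        return new boolean[] { propagated, notified[0] };
    }

    public static void main(String[] args) throws Exception {
        XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        boolean[] r = probe(xr);
        System.out.println("exception propagated from parse(): " + r[0]);
        System.out.println("ErrorHandler notified: " + r[1]);
    }
}
```

Running the same probe against a user-written XMLReader or SAX filter would show whether it behaves like the underlying parser in this respect.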
