Support #4628

Ignoring DTDs declared in XML messages with DTD declarations

Added by Pratheek K about 1 month ago. Updated 2 days ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2020-07-03
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:

Description

Hi, we are using Saxon EE version 9.9.1.6. We would like to know whether there is an option to ignore DTDs declared in XML messages while parsing.

We are trying to parse an XML message with a DTD declaration. While parsing, I get a SAXParseException that says: 'Failure converting a node of class javax.xml.transform.sax.SAXSource: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 56; External DTD: Failed to read external DTD 'accessExternalDtd' not set to 'file'. Is there any configuration available in Saxon 9.9.1.6 or 10.1 that overrides the parser's accessExternalDTD setting and allows these XML documents to be parsed?

History

#1 Updated by Michael Kay about 1 month ago

In general if you want control over the configuration options of the XML parser, then the best approach is to supply the XML parser yourself, rather than leaving Saxon to create one. Typically to do this, you supply input documents in the form of a SAXSource in which you have already initialised the XMLReader.
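As a rough illustration of supplying a pre-configured parser, the sketch below builds an XMLReader through JAXP, switches off loading of the external DTD, and wraps it in a SAXSource. The class name PreconfiguredSource and the method newSource are hypothetical. The feature URI is Xerces-specific, so this assumes the parser in use is Xerces or the Xerces-derived parser bundled with the JDK; other parsers may reject it.

```java
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class PreconfiguredSource {

    // Xerces-specific feature (assumption: the parser in use recognises it).
    public static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Returns a SAXSource whose XMLReader has been configured by the caller
    // rather than by Saxon.
    public static SAXSource newSource(InputSource input) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);

        XMLReader reader = factory.newSAXParser().getXMLReader();
        // Ask a non-validating parser not to fetch the external DTD subset.
        reader.setFeature(LOAD_EXTERNAL_DTD, false);

        return new SAXSource(reader, input);
    }
}
```

The resulting SAXSource can be handed to Saxon wherever a Source is accepted, for example the s9api DocumentBuilder.build(Source) method. Note that entity references declared only in the skipped DTD would remain unresolved, so this is only appropriate for documents that do not rely on them.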

That's not always possible of course if you are invoking Saxon via the command line interface or, for example, from Ant. In that case there are a number of ways of influencing how Saxon configures the parser.

DTDs can contain entity declarations, and if your document contains entity references, then in general failing to process the DTD will mean that the entity references can't be resolved, which means parsing will fail. However, if you know that you aren't using entity references, then you can configure the parser with an EntityResolver that substitutes a different DTD from the one requested. You can ask Saxon to set an EntityResolver on any XMLReader (parser) that Saxon instantiates by using the Saxon configuration option Feature.ENTITY_RESOLVER_CLASS. For example, you could set this on the command line using --entityResolverClass:my.entity.Resolver.
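A minimal sketch of such a resolver, assuming that an empty DTD is an acceptable substitute and that DTDs can be recognised by a ".dtd" suffix on the system identifier. The class name EmptyDtdResolver is hypothetical; it would play the role of my.entity.Resolver above.

```java
import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class EmptyDtdResolver implements EntityResolver {

    // A public no-argument constructor is implied, so the class can be
    // instantiated by name.

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Assumption: anything ending in ".dtd" is a DTD we want to skip.
        if (systemId != null && systemId.endsWith(".dtd")) {
            // Substitute an empty DTD for the one requested.
            InputSource empty = new InputSource(new StringReader(""));
            empty.setPublicId(publicId);
            empty.setSystemId(systemId);
            return empty;
        }
        // Returning null tells the parser to use its default resolution.
        return null;
    }
}
```

This only takes effect if the parser that reads the document actually has the resolver set on it; a parser supplied by another framework will not pick it up from the Saxon configuration.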

If accessExternalDTD doesn't include "file", that's a security setting someone has put in the system to prevent access to DTDs using the file protocol. Security people are pretty paranoid about access to DTDs and external entities, but overriding security settings is generally unwise. Basically, the security people are telling you not to process this kind of document, and you need to have a debate with them about why they are imposing that restriction, rather than trying to find a way around it.

#2 Updated by Pratheek K about 1 month ago

Hi Michael,

The workaround with a custom EntityResolver, though it seemed promising, resulted in the same error: 'Failed to read external DTD '***.dtd', because 'file' access is not allowed due to restriction set by the accessExternalDTD'. I checked the Saxon configuration and verified that my custom EntityResolver was registered, but it was never called.

So, it seems that the above error is thrown before the EntityResolver is hit.

#3 Updated by Pratheek K about 1 month ago

Hi Michael,

I did some debugging and figured out why my EntityResolver was not called. Saxon is called by Camel, and Camel supplies its own XMLReader, so none of the configuration in ParseOptions is applied to the parser. In that case, ENTITY_RESOLVER_CLASS is not suitable.

Regards, Pratheek

#4 Updated by Michael Kay 2 days ago

  • Status changed from New to Closed
