Bug #5957
Closed
DTD Validation disabling
100%
Description
Hi,
I am trying to transform an XML file with an XSLT 3.0 stylesheet. The XML file has a DOCTYPE declaration, but for this workflow it is not necessary to download the DTD or validate the source file. I tried a couple of things, but the error was always the same: the DTD file could not be downloaded.
Here is my code:
EnterpriseConfiguration configuration = new EnterpriseConfiguration();
// Register the resolver that is meant to swallow requests for the external DTD
configuration.setConfigurationProperty(
    Feature.ENTITY_RESOLVER_CLASS, IgnoreDoctypeEntityResolver.class.getCanonicalName());
// Switch off DTD validation
configuration.setConfigurationProperty(Feature.DTD_VALIDATION, false);
configuration.setValidation(false);
StreamingTransformerFactory streamingTransformerFactory =
    new StreamingTransformerFactory(configuration);
streamingTransformerFactory.setErrorListener(new StandardErrorListener());
Transformer strTransformer =
    streamingTransformerFactory.newTransformer(
        new StreamSource(xslFile));
StreamSource xml = new StreamSource(onixStream);
okFile = new File(targetDirectory.toString(), "dummy.xml");
strTransformer.transform(xml, new StreamResult(okFile));
Can you help me to solve this problem?
Updated by Mark Hansen over 1 year ago
public class IgnoreDoctypeEntityResolver implements EntityResolver {
    private static final Logger log = LoggerFactory.getLogger(IgnoreDoctypeEntityResolver.class);
    public static final String DOCTYPE_SUFFIX = ".dtd";

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId.endsWith(DOCTYPE_SUFFIX)) {
            // resolve to empty result to ignore
            log.info("Ignoring external xml entity: {}", systemId);
            return new InputSource(new StringReader(""));
        }
        // otherwise use default behaviour
        return null;
    }
}
Updated by Michael Kay over 1 year ago
You can ask Saxon to switch off DTD validation, but you can't prevent it from reading the external DTD, because that's needed for entity expansion. What you have to do is redirect it to use a different (perhaps dummy) DTD.
But you're already doing that - why isn't it working? Offhand, I don't know, I'll have to try your code in the debugger. One of the problems is that there are too many places you can set such options, and they sometimes interfere with each other.
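To illustrate the "redirect to a dummy DTD" idea in isolation, here is a minimal sketch that uses only the JDK's own SAX parser, without Saxon. The class and method names are made up for this illustration, and it assumes the JDK's built-in parser is the one returned by SAXParserFactory. It says nothing about how Saxon wires the resolver in, which is the open question in this thread.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Saxon or of the reporter's code.
public class DummyDtdDemo {
    // Sample document; the DOCTYPE system ID is the one quoted in this thread.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE Book SYSTEM \"http://doesnotexist.com/BookProduct_3.0_short.dtd\">\n"
        + "<Book release=\"3.0\"><Entry>...</Entry></Book>";

    /** Parses the XML, answering every external entity request with an empty stream. */
    static int countElementsWithDummyDtd(String xml) {
        final int[] count = {0};
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // The parser still asks for the DTD, but receives an empty one instead.
            reader.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElementsWithDummyDtd(XML));
    }
}
```

Because the resolver supplies the DTD content itself, the parser has no reason to contact the remote host, and the two elements of the sample document are reported normally.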
Updated by Martin Honnen over 1 year ago
Assuming Xerces is the underlying XML parser then for me
configuration.setParseOptions(configuration.getParseOptions().withParserFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false));
prevents loading of an external DTD. I think I have only used that once or twice, and probably only with Saxon HE, but you could give it a try for EE; I don't think the parser use is specific to the Saxon edition.
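The effect of that parser feature can be checked without Saxon. The following sketch sets the same feature URI directly on the JDK's SAX parser, assuming the JDK's built-in Xerces-derived parser is in use. The class name, method name, and the file: system ID are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; only the feature URI comes from the thread.
public class LoadExternalDtdDemo {
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // The DTD location is invented and assumed not to exist on the local machine.
    static final String XML =
        "<!DOCTYPE root SYSTEM \"file:///no-such-dir/example1.dtd\"><root/>";

    /** Returns true if the document parses without an exception. */
    static boolean parses(String xml, boolean loadExternalDtd) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(LOAD_EXTERNAL_DTD, loadExternalDtd);
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Covers the I/O failure for the missing DTD as well as an unrecognized feature.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("feature off: " + parses(XML, false));
        System.out.println("feature on:  " + parses(XML, true));
    }
}
```

With the feature switched off, the missing DTD is never opened. Note that a document relying on entities declared in the external DTD would then fail for a different reason, which is the caveat Michael raised above.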
Updated by Mark Hansen over 1 year ago
First of all, thanks for the quick responses.
The error changes from:
Error I/O error reported by XML parser processing null. Caused by java.io.FileNotFoundException: http://doesnotexist.com/BookProduct_3.0_short.dtd Error I/O error reported by XML parser processing null. Caused by java.io.IOException: net.sf.saxon.s9api.SaxonApiException: I/O error reported by XML parser processing null: I/O error reported by XML parser processing null: I/O error reported by XML parser processing null.
To:
Error I/O error reported by XML parser processing null. Caused by java.io.FileNotFoundException: D:\Projekte\service\import-metadaten\BookProduct_3.0_short.dtd (The system cannot find the file specified) Error I/O error reported by XML parser processing null. Caused by java.io.IOException: net.sf.saxon.s9api.SaxonApiException: I/O error reported by XML parser processing null: I/O error reported by XML parser processing null: I/O error reported by XML parser
Now the parser tries to find the DTD on my local system instead, but there is no such file there either.
Updated by Martin Honnen over 1 year ago
I get the following code to compile and run with Java 8, Saxon EE 12.1 without any errors:
import com.saxonica.config.EnterpriseConfiguration;
import com.saxonica.config.StreamingTransformerFactory;
import net.sf.saxon.lib.Feature;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Main {
    public static void main(String[] args) throws TransformerException {
        EnterpriseConfiguration configuration = new EnterpriseConfiguration();
        configuration.setConfigurationProperty(Feature.DTD_VALIDATION, false);
        configuration.setValidation(false);
        configuration.setParseOptions(configuration.getParseOptions().withParserFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false));
        StreamingTransformerFactory streamingTransformerFactory = new StreamingTransformerFactory(configuration);
        Templates streamingTemplates = streamingTransformerFactory.newTemplates(new StreamSource("xslt-test1.xsl"));
        streamingTemplates.newTransformer().transform(new StreamSource("input1.xml"), new StreamResult("result1.xml"));
        streamingTemplates.newTransformer().transform(new StreamSource("input2.xml"), new StreamResult("result2.xml"));
        streamingTemplates.newTransformer().transform(new StreamSource("input3.xml"), new StreamResult("result3.xml"));
    }
}
input2.xml has e.g. <!DOCTYPE root SYSTEM "http://example.com/example1.dtd">
but doesn't give any error, input3.xml has e.g. <!DOCTYPE root SYSTEM "example1.dtd">
and doesn't give any error either.
Updated by Mark Hansen over 1 year ago
It is still not working for me.
I also have Saxon-EE 12.1, but my Java version is 17.
I copied your code.
Here is my XML:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Book PUBLIC "-//MeineFirma Solutions//DTD EMail V 1.0//DE"
"http://doesnotexist.com/BookProduct_3.0_short.dtd">
<Book release="3.0">
<Entry>...</Entry>
</Book>
and here is my XSL:
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:so="http://www.book.org/3.0/short">
  <xsl:mode streamable="yes" on-no-match="shallow-copy"/>
  <xsl:template match="so:Book">
    <xsl:iterate select="so:Entry">
      <xsl:result-document href="{concat(position(),'.xml')}" method="xml">
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:iterate>
  </xsl:template>
  <xsl:template match="so:Entry">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
Updated by Martin Honnen over 1 year ago
That XSLT code, which tries to match Entry or Book elements in a namespace, doesn't seem to fit the XML sample, which doesn't use any namespace.
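The mismatch can be seen with a plain namespace-aware DOM lookup, used here as a stand-in for the XSLT match pattern so:Book. The sketch below is only an analogy and involves neither Saxon nor XSLT. The class and method names are invented, and the DOCTYPE is left out of the samples so that no DTD is involved.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class illustrating namespace-sensitive matching.
public class NamespaceMatchDemo {
    static final String NS = "http://www.book.org/3.0/short";

    // The reporter's sample: elements are in no namespace.
    static final String NO_NAMESPACE =
        "<Book release=\"3.0\"><Entry>...</Entry></Book>";

    // The adapted sample: a default namespace puts Book and Entry into NS.
    static final String WITH_NAMESPACE =
        "<Book xmlns=\"" + NS + "\" release=\"3.0\"><Entry>...</Entry></Book>";

    /** Counts elements with the given local name that are in namespace NS. */
    static int countInNamespace(String xml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(NS, localName).getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countInNamespace(NO_NAMESPACE, "Book"));
        System.out.println(countInNamespace(WITH_NAMESPACE, "Book"));
    }
}
```

An element name is the pair of namespace URI and local name, so Book in no namespace and Book in http://www.book.org/3.0/short are different names, and a pattern written for one never matches the other.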
Updated by Martin Honnen over 1 year ago
I tried a slight adaptation of your XML
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Book PUBLIC "-//MeineFirma Solutions//DTD EMail V 1.0//DE"
"http://doesnotexist.com/BookProduct_3.0_short.dtd">
<Book xmlns="http://www.book.org/3.0/short" release="3.0">
<Entry>...</Entry>
</Book>
and XSLT
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:so="http://www.book.org/3.0/short">
  <xsl:mode streamable="yes" on-no-match="shallow-copy"/>
  <xsl:template match="so:Book">
    <xsl:iterate select="so:Entry">
      <xsl:result-document href="Entry-{position()}.xml" method="xml">
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:iterate>
  </xsl:template>
  <xsl:template match="so:Entry">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
(I also had to change the Java code to use e.g. streamingTemplates.newTransformer().transform(new StreamSource("input4.xml"), new StreamResult(new File("result4.xml"))); as otherwise the xsl:result-document would throw an error about a wrong relative URI), but then, with Java 8, the DTD is ignored.
That looks as if the setting I suggested is not working with Java 17, although I have not tested whether that is the relevant difference.
I guess you will have to wait until Michael finds the time to test and tell you how to change your code to have the external DTD ignored.
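A likely reason for the relative URI error, offered here as an assumption rather than something confirmed in this thread, is that a relative href in xsl:result-document is resolved against the system ID of the principal result. The JAXP StreamResult constructors produce different system IDs, as this small sketch shows (class and method names are invented):

```java
import java.io.File;
import javax.xml.transform.stream.StreamResult;

// Hypothetical demo class comparing the two StreamResult constructors.
public class ResultSystemIdDemo {
    /** System ID when the result is created from a plain string. */
    static String fromString(String name) {
        return new StreamResult(name).getSystemId();
    }

    /** System ID when the result is created from a java.io.File. */
    static String fromFile(String name) {
        return new StreamResult(new File(name)).getSystemId();
    }

    public static void main(String[] args) {
        // The string is stored unchanged, so it stays a relative reference.
        System.out.println(fromString("result4.xml"));
        // The File is converted to an absolute file: URI.
        System.out.println(fromFile("result4.xml"));
    }
}
```

With the File-based constructor the system ID is an absolute file: URI, which gives a relative href such as Entry-1.xml something to be resolved against.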
Updated by Martin Honnen over 1 year ago
For a test, I have installed Adoptium/Eclipse Temurin openjdk 17.0.6 and have now run the program using that Java version but it behaves the same for me, no errors related to the DTDs are given, the XSLT transformations run through with any external DTDs being ignored.
Updated by Mark Hansen over 1 year ago
Hello Martin,
I made a mistake or had a misunderstanding.
I have chained two transformations, and the old one worked with the Xslt30Transformer and the configuration property ENTITY_RESOLVER_CLASS, which ignored the DTD. That worked with Saxon-HE version 10.5.
I have now added
configuration.setParseOptions(configuration.getParseOptions().withParserFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false));
and it worked fine.
Thanks. That helped. I would never have found that out with the configuration property alone.
Updated by Michael Kay over 1 year ago
I can confirm that the entity resolver supplied using Feature.ENTITY_RESOLVER_CLASS is not being used.
What seems to be happening is that Configuration.getSourceParser() returns an XMLReader whose entity resolver is an EntityResolverWrappingResourceResolver, which invokes the Configuration's CatalogResolver. On return, ActiveStreamSource.deliver() does
if (options.getEntityResolver() != null && parser.getEntityResolver() == null) {
    parser.setEntityResolver(options.getEntityResolver());
}
which has no effect because parser.getEntityResolver() is not null. This code is there to stop us stomping over an EntityResolver explicitly registered with the XMLReader, but it has the effect that the EntityResolver held in the ParseOptions is ignored (at least on this path).
If I take out the condition && parser.getEntityResolver() == null, the test case works, but I'm not convinced that is the right solution.
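The effect of that guard can be reproduced in isolation. The sketch below is not Saxon's code: it copies only the shape of the condition quoted above and uses org.xml.sax.helpers.XMLFilterImpl as a stand-in for the parser, with invented class and method names.

```java
import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class; XMLFilterImpl stands in for the real XMLReader.
public class ResolverGuardDemo {
    /** Same shape as the quoted condition: install only if the parser has no resolver. */
    static void installIfAbsent(XMLReader parser, EntityResolver fromOptions) {
        if (fromOptions != null && parser.getEntityResolver() == null) {
            parser.setEntityResolver(fromOptions);
        }
    }

    /** Returns true if the resolver from the options ends up on the parser. */
    static boolean optionsResolverWins(boolean parserHasResolverAlready) {
        XMLReader parser = new XMLFilterImpl();
        if (parserHasResolverAlready) {
            // Plays the role of the resolver the parser already carries.
            parser.setEntityResolver((publicId, systemId) -> null);
        }
        EntityResolver fromOptions =
            (publicId, systemId) -> new InputSource(new StringReader(""));
        installIfAbsent(parser, fromOptions);
        return parser.getEntityResolver() == fromOptions;
    }

    public static void main(String[] args) {
        System.out.println("fresh parser:       " + optionsResolverWins(false));
        System.out.println("pre-set resolver:   " + optionsResolverWins(true));
    }
}
```

As soon as the parser arrives with any resolver already attached, the one from the options is silently dropped, which matches the behaviour described for this code path.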
Updated by Michael Kay over 1 year ago
- Tracker changed from Support to Bug
- Category set to JAXP Java API
- Status changed from New to In Progress
- Assignee set to Michael Kay
Updated by Michael Kay over 1 year ago
I've committed this change as a partial/provisional solution that handles this test case.
Updated by Michael Kay over 1 year ago
- Status changed from In Progress to Resolved
- Applies to branch trunk added
- Fix Committed on Branch trunk added
- Platforms Java added
Marking this resolved because we produced a patch that fixes the immediate problem, although there are wider issues still to consider.
Updated by O'Neil Delpratt over 1 year ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 12.2 added
Bug fix applied in the Saxon 12.2 maintenance release