collection(): failed to parse XML file

Added by O'Neil Delpratt 5 months ago.

10, trunk
Reported by user on saxon mailing list:

When the collection() function is used in an XSLT stylesheet, the following error occurs:

collection(): failed to parse XML file file:/D:/Test/sample1.xml: I/O error reported by XML parser processing file:/D:/Test/sample1.xml: Could not find file 'D:\Test\doctype.dtd'.

It appears the default CollectionFinder does not use XsltTransformer.InputXmlResolver.

The problem is that the C# .NET code does not have its own CollectionFinder, as the Java product does.


Updated by Emanuel Wlaschitz 5 months ago

Thanks for logging this!

As far as I can tell, the CollectionFinder does exist and we can set one using processor.SetProperty(Feature<CollectionFinder>.COLLECTION_FINDER, new CustomCollectionFinder()) (or using transformer.Implementation.setCollectionFinder(new CustomCollectionFinder()) to be more localized) and its findCollection method will be called.

I just don't see how we could affect how documents are loaded, as the documentation seems to suggest it only returns URIs and not loaded documents.

Updated by Michael Kay 5 months ago

Sorry to confuse. What I meant to say was that the default in the .NET product is to use the Java CollectionFinder, which of course has no knowledge of .NET-specifics like the XmlResolver. (I'm not even sure if that statement is correct, it needs further investigation).

Updated by O'Neil Delpratt 3 months ago

I have added the @CollectionFinder@ feature to .NET, which will be available in the next maintenance release.

Users can now define their own @ICollectionFinder@ and set it on the Processor object to be used in the XQuery, XPath and XSLT APIs.

As in Java, we now have an @IResourceCollection@ interface to map the URI of a collection to a sequence of Resource objects. A number of implementations of @IResourceCollection@ are available to users: @CatalogCollection@, @JarCollection@ and @DirectoryCollection@.

NUnit tests added.

I am leaving this bug issue open until API doc is complete.

Updated by O'Neil Delpratt 3 months ago

Bug fixed and committed on Saxon10 and trunk branches.

Updated by Emanuel Wlaschitz 3 months ago

Just checking, this CollectionFinder will allow us to set a custom .NET XmlResolver to be used when loading individual entries of the collection, right?

Updated by O'Neil Delpratt 3 months ago

Yes, the CollectionFinder should use your custom XsltTransformer.InputXmlResolver. If you have a sample application, I will be happy to test your setup with this new feature.

Updated by Emanuel Wlaschitz 3 months ago

We don't have a ready-to-use sample application, but we can make one.

Since we don't have the change yet, we ran the following with transform.exe collection.xsl -it:main - but I'm confident you can turn this into a .NET testcase:

collection.xsl (which simply looks at all XML files in the same folder and prints their root element name)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">

    <xsl:template name="main">
        <xsl:apply-templates select="collection('./?select=*.xml')" mode="print-root" />
    </xsl:template>

    <xsl:template match="/" mode="print-root">
        <xsl:value-of select="name(/*)" />
    </xsl:template>

</xsl:stylesheet>

a.xml in same folder as collection.xsl (the DTD does not really matter, but removing it allows the XSLT to run)

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE test SYSTEM "does-not-exist.dtd">
<test />

You can add more XML files if you want, both with and without DOCTYPE, where the ones with a System ID trigger the exception from this issue.

In C#, we'd use a custom XmlResolver like this:

using System;
using System.IO;
using System.Net;
using System.Xml;

public class DtdIgnoringResolver : XmlResolver
{
    private readonly XmlResolver _innerResolver;

    public DtdIgnoringResolver(XmlResolver innerResolver)
    {
        _innerResolver = innerResolver ?? throw new ArgumentNullException(nameof(innerResolver));
    }

    public override ICredentials Credentials
    {
        set { _innerResolver.Credentials = value; }
    }

    // Return an empty stream for DTDs and schemas so the parser never tries to open them.
    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        if (!string.IsNullOrEmpty(absoluteUri?.OriginalString) && IsDtdOrSchema(absoluteUri.OriginalString))
            return Stream.Null;
        return _innerResolver.GetEntity(absoluteUri, role, ofObjectToReturn);
    }

    private static bool IsDtdOrSchema(string filePath)
    {
        return filePath.EndsWith(".dtd", StringComparison.InvariantCultureIgnoreCase) ||
               filePath.EndsWith(".xsd", StringComparison.InvariantCultureIgnoreCase);
    }
}

The piece of code we use is a bit more involved and usually applies the XSLT to a file (rather than running a named template), but I'm sure this suffices. If not, let me know.
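To check the resolver in isolation before involving collection(), something like the following could be used. It is only a sketch: it parses the a.xml content from a string with a plain System.Xml XmlReader rather than with Saxon, so it says nothing about whether Saxon passes the resolver on to its CollectionFinder. The names ResolverCheck and RootName are made up for this example, and the resolver is repeated in compact form so the snippet compiles on its own.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

// Compact copy of the resolver from this comment, repeated so the sketch is self-contained.
public class DtdIgnoringResolver : XmlResolver
{
    private readonly XmlResolver _inner;

    public DtdIgnoringResolver(XmlResolver inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public override ICredentials Credentials
    {
        set { _inner.Credentials = value; }
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        string path = absoluteUri?.OriginalString;
        if (!string.IsNullOrEmpty(path) &&
            (path.EndsWith(".dtd", StringComparison.InvariantCultureIgnoreCase) ||
             path.EndsWith(".xsd", StringComparison.InvariantCultureIgnoreCase)))
        {
            return Stream.Null; // pretend the DTD or schema is empty
        }
        return _inner.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

// Hypothetical helper, not part of Saxon or of our production code.
public static class ResolverCheck
{
    // Parses the XML text with the given resolver and returns the root element name.
    public static string RootName(string xml, XmlResolver resolver)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // the default (Prohibit) rejects any DOCTYPE
            XmlResolver = resolver
        };
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent(); // reads past the DOCTYPE, which triggers the resolver
            return reader.LocalName;
        }
    }

    public static void Main()
    {
        const string xml =
            "<?xml version=\"1.0\"?>" +
            "<!DOCTYPE test SYSTEM \"does-not-exist.dtd\">" +
            "<test/>";
        Console.WriteLine(RootName(xml, new DtdIgnoringResolver(new XmlUrlResolver())));
    }
}
```

If this runs cleanly while the same file still fails inside collection(), the resolver itself is fine and the problem lies in how it is handed to the collection loading code.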

Thanks again.

Updated by O'Neil Delpratt 3 months ago


Thanks for sending this example, which I used to create an NUnit test case. I can confirm it now works as expected: the exception no longer occurs.

Updated by O'Neil Delpratt 3 months ago

I am reopening this bug issue because we are still seeing the exception described in the initial report.

There was a bug in the test case in comment #9. I have set the InputXmlResolver to the DtdIgnoringResolver on the Xslt30Transformer, but this is not being filtered through to the CollectionFinder.

Further investigation required.

Updated by Michael Kay 22 days ago

Updated by O'Neil Delpratt 7 days ago

  • Status changed from In Progress to Resolved

Bug fixed and a set of NUnit tests created.

Updated by O'Neil Delpratt 7 days ago

Bug fix applied to Saxon 10.5 maintenance release.

