Feature #4839

collection(): failed to parse XML file

Added by O'Neil Delpratt about 2 months ago. Updated 6 days ago.

Status: In Progress
Priority: Normal
Category: .NET API
Sprint/Milestone: -
Start date: 2020-11-25
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 10, trunk
Fix Committed on Branch: 10, trunk
Fixed in Maintenance Release:

Description

Reported by a user on the Saxon mailing list:

When the collection() function is used in an XSLT stylesheet, the following error occurs:

collection(): failed to parse XML file file:/D:/Test/sample1.xml: I/O error reported by XML parser processing file:/D:/Test/sample1.xml: Could not find file 'D:\Test\doctype.dtd'.

It appears that the default CollectionFinder does not use XsltTransformer.InputXmlResolver.

The problem is that the C# .NET code does not have its own CollectionFinder, as the Java product does.

History

#1 Updated by Emanuel Wlaschitz about 2 months ago

Thanks for logging this!

As far as I can tell, the CollectionFinder does exist and we can set one using processor.SetProperty(Feature<CollectionFinder>.COLLECTION_FINDER, new CustomCollectionFinder()) (or using transformer.Implementation.setCollectionFinder(new CustomCollectionFinder()) to be more localized) and its findCollection method will be called.

I just don't see how we could affect how documents are loaded, as the documentation seems to suggest it only returns URIs and not loaded documents.

#2 Updated by Michael Kay about 2 months ago

Sorry to confuse. What I meant to say was that the default in the .NET product is to use the Java CollectionFinder, which of course has no knowledge of .NET-specifics like the XmlResolver. (I'm not even sure if that statement is correct, it needs further investigation).

#3 Updated by O'Neil Delpratt about 2 months ago

  • Tracker changed from Bug to Feature

#4 Updated by O'Neil Delpratt 13 days ago

  • Status changed from New to In Progress
  • Applies to branch deleted (9.9)

I have added the @CollectionFinder@ feature to .NET, which will be available in the next maintenance release.

Users can now define their own @ICollectionFinder@ and set it on the Processor object to be used in the XQuery, XPath and XSLT APIs.

As in Java, we now have an @IResourceCollection@ interface to map the URI of a collection to a sequence of Resource objects. A number of implementations of @IResourceCollection@ are available for users: @CatalogCollection@, @JarCollection@ and @DirectoryCollection@.

NUnit tests added.

I am leaving this issue open until the API documentation is complete.

#5 Updated by O'Neil Delpratt 12 days ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100
  • Fix Committed on Branch 10, trunk added

Bug fixed and committed on Saxon10 and trunk branches.

#6 Updated by Emanuel Wlaschitz 8 days ago

Just checking, this CollectionFinder will allow us to set a custom .NET XmlResolver to be used when loading individual entries of the collection, right?

#7 Updated by O'Neil Delpratt 8 days ago

Yes, the CollectionFinder should use your custom XsltTransformer.InputXmlResolver. If you have a sample application, I will be happy to test your setup with this new feature.

#8 Updated by Emanuel Wlaschitz 8 days ago

We don't have a ready-to-use sample application, but we can make one.

Since we don't have the change yet, we ran the following with transform.exe collection.xsl -it:main - but I'm confident you can turn this into a .NET testcase:

collection.xsl (which simply looks at all XML files in the same folder and prints their root element name)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">

    <xsl:template name="main">
        <xsl:apply-templates select="collection('./?select=*.xml')" mode="print-root" />
    </xsl:template>

    <xsl:template match="/" mode="print-root">
        <xsl:text>&#xA;</xsl:text>
        <xsl:value-of select="name(/*)" />
    </xsl:template>

</xsl:stylesheet>

a.xml in same folder as collection.xsl (the DTD does not really matter, but removing it allows the XSLT to run)

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE test SYSTEM "does-not-exist.dtd">
<a/>

You can add more XML files if you want, both with and without DOCTYPE, where the ones with a System ID trigger the exception from this issue.

In C#, we'd use a custom XmlResolver like this:

using System;
using System.IO;
using System.Net;
using System.Xml;

// Wraps another XmlResolver and returns empty content for DTD and schema requests.
public class DtdIgnoringResolver : XmlResolver
{
    private readonly XmlResolver _innerResolver;

    public DtdIgnoringResolver(XmlResolver innerResolver)
    {
        _innerResolver = innerResolver ?? throw new ArgumentNullException(nameof(innerResolver));
    }

    // Credentials are passed straight through to the wrapped resolver.
    public override ICredentials Credentials
    {
        set { _innerResolver.Credentials = value; }
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // An empty stream stands in for any .dtd or .xsd file, so a missing file is never opened.
        if (!string.IsNullOrEmpty(absoluteUri?.OriginalString) && IsDtdOrSchema(absoluteUri.OriginalString))
            return Stream.Null;
        return _innerResolver.GetEntity(absoluteUri, role, ofObjectToReturn);
    }

    // Matches on the file extension only, ignoring case.
    private static bool IsDtdOrSchema(string filePath)
    {
        return
            filePath.EndsWith(".dtd", StringComparison.InvariantCultureIgnoreCase) ||
            filePath.EndsWith(".xsd", StringComparison.InvariantCultureIgnoreCase);
    }
}

The piece of code we use is a bit more involved and usually applies the XSLT to a file (rather than running a named template), but I'm sure this suffices. If not, let me know.
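To check the resolver idea independently of Saxon, a minimal sketch using only System.Xml might look like the following. The names EmptyDtdResolver and ResolverDemo are hypothetical and not part of the code above or of any Saxon API. The resolver is repeated in condensed form (DTD check only) so that the sketch compiles on its own, and it assumes that an empty stream is acceptable to the parser as an external DTD subset.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

// Hypothetical, condensed variant of DtdIgnoringResolver (DTD check only).
public sealed class EmptyDtdResolver : XmlResolver
{
    private readonly XmlResolver _inner = new XmlUrlResolver();

    public override ICredentials Credentials
    {
        set { _inner.Credentials = value; }
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // Serve an empty stream for any .dtd request; delegate everything else.
        if (absoluteUri != null &&
            absoluteUri.OriginalString.EndsWith(".dtd", StringComparison.OrdinalIgnoreCase))
            return Stream.Null;
        return _inner.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public static class ResolverDemo
{
    // Parses the XML text with DTD processing enabled and returns the root element name.
    public static string ReadRootName(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,   // external DTDs are requested via the resolver
            XmlResolver = new EmptyDtdResolver()
        };
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();                // skips the DOCTYPE and stops at the root element
            return reader.Name;
        }
    }
}
```

Calling ResolverDemo.ReadRootName with the content of a.xml exercises the same DOCTYPE lookup that fails in the collection() case, but through a plain XmlReader rather than through Saxon.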

Thanks again.

#9 Updated by O'Neil Delpratt 7 days ago

Hi,

Thanks for sending this example, which I used to create an NUnit test case. I can confirm it now works as expected: we are no longer seeing the exception.

#10 Updated by O'Neil Delpratt 6 days ago

  • Status changed from Resolved to In Progress

I am reopening this issue because we are still seeing the exception described in the initial report.

There was a bug in the test case from comment #9. I set the InputXmlResolver on the Xslt30Transformer to the DtdIgnoringResolver, but it is not being passed through to the CollectionFinder.

Further investigation required.
