Support #4797

Unknown collection error FODC0002

Added by Nik Osvalds about 1 month ago. Updated about 1 month ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2020-10-19
Due date:
% Done: 0%
Estimated time:
Applies to JS Branch:
Fix Committed on JS Branch:
Fixed in JS Release:
SEF Generated with:
Company: -
Contact person: -
Additional contact persons: -

Description

I'm attempting to run saxon-js in NodeJS to apply XSL 3 transformations currently working in another system with Saxon HE, because saxon-js looks like it can offer a lot more versatility.

I am essentially brand new to XSL so the learning curve is steep.

I'm trying to run the iati.xslt transform (attached) on a sample IATI file using saxon-js for Node.js. I've converted the .xslt file on the command line to .sef.json and am calling the transformation in Node. I'm receiving the following error, which seems to indicate that the collection(...) calls on line 90 of the .xslt cannot be resolved. I've ensured that the files specified in the collection() calls are available in my project files.

// Applying the XSLT3 Ruleset to IATI Files Using SaxonJS
let results = await SaxonJS.transform({
    stylesheetFileName: "./rules/iati.sef.json",
    sourceFileName: filePath,
    destination: "file",
    baseOutputURI: "./file_storage/validated/" + xmlIn.md5 + '.xml',
}, "async")

message: 'Unknown collection (no collectionFinder supplied)',
code: 'FODC0002',
xsltLineNr: '90',
xsltModule: 'iati.xslt',

<xsl:variable name="iati-codelists">
  <codes version="2.03">
    <xsl:apply-templates select="collection('../lib/schemata/2.03/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
  </codes>
</xsl:variable>

I see that you can use the collectionFinder parameter in the transform function but it's unclear how this should be implemented and I've only found documentation on implementing a collectionFinder in Java.

Is this something that is implemented in Saxon HE (which is presently doing the work) but not yet in Saxon-JS?

Attachment: iati.xslt (5.26 KB), Nik Osvalds, 2020-10-19 13:56

History

#1 Updated by Michael Kay about 1 month ago

I'm afraid that support for the collection() function in Saxon-JS is minimal: essentially, the function only works if you supply a CollectionFinder as a parameter, and the CollectionFinder has to do all the work. This restriction was in the browser version of Saxon-JS because there's not much else you can usefully do in the browser, and we didn't get around to lifting the restriction for Node.js. It's on our list for a future release.

The CollectionFinder callback is simply a JavaScript function that is called supplying the (absolutized) collection URI, and that returns any value (the JavaScript value is converted to an XDM value in the usual way); this value is then returned as the result of the collection() function. There's no real advantage of doing this rather than calling an external JavaScript function, other than making your XSLT code a bit more portable.
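
As a minimal sketch of the callback shape described above: the finder is a plain synchronous function from a URI string to a JavaScript value. The helper name `makeCollectionFinder`, the example URI, and the placeholder string items are illustrative assumptions, not part of the Saxon-JS API; in real use the items would be whatever collection() should return.

```javascript
// Hypothetical helper: builds a synchronous collectionFinder from a
// pre-populated Map of collection URI -> array of items.
function makeCollectionFinder(collections) {
    // Unknown URIs yield an empty array, i.e. an empty collection.
    return (uri) => (collections.has(uri) ? collections.get(uri) : []);
}

// Placeholder items stand in for values loaded before the transform runs.
const finder = makeCollectionFinder(new Map([
    ['http://example.com/col1', ['placeholder item 1', 'placeholder item 2']]
]));

console.log(finder('http://example.com/col1').length);    // 2
console.log(finder('http://example.com/unknown').length); // 0
```

The resulting function would then be supplied as the collectionFinder property of the options object passed to SaxonJS.transform.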

#2 Updated by Nik Osvalds about 1 month ago

Thanks for the swift response Michael. I've started to write my own collectionFinder but I'm struggling to determine what it should return.

Taking the example XSLT shared above, the original collection() call finds a directory containing several .xml codelist files. So I've written this as a JS function that's passed to the transform, and it returns an Array of those files' contents as utf8 strings. After doing this I'm not seeing any effect of these files being loaded on the transform. Do I need to join these files together and return one object instead?

const collectionFinder = async (uri) => {
    let loaded = [];
    let path = uri.split('file://')[1].split('?')[0]; // get the right filepath (remove file:// and after the ?
    let files = await fsPromises.readdir(path)
    if (files) {
        loaded = files.map(async (file) => {
            return await fsPromises.readFile(path + file,'utf8')
        })
    }
    return await Promise.all(loaded)
}
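
The string splitting in the function above can be made a little more robust with Node's URL utilities. This is only a sketch, assuming the URI handed to the finder is a file: URI that still carries the `?select=...;recurse=...` part (as the split('?') above suggests); `parseCollectionUri` and the example path are hypothetical.

```javascript
const { fileURLToPath } = require('node:url');

// Hypothetical helper: splits an absolutized collection URI such as
// file:///.../codelist/?select=*.xml;recurse=yes into its parts.
function parseCollectionUri(uri) {
    const url = new URL(uri);
    const params = {};
    // The query in the stylesheet separates its parameters with ';'.
    for (const pair of url.search.replace(/^\?/, '').split(';')) {
        if (!pair) continue;
        const [key, value = ''] = pair.split('=');
        params[key] = decodeURIComponent(value);
    }
    url.search = ''; // drop the query before converting to a file path
    return {
        dir: fileURLToPath(url),
        select: params.select || null,
        recurse: params.recurse === 'yes'
    };
}

const parsed = parseCollectionUri(
    'file:///tmp/lib/schemata/2.03/codelist/?select=*.xml;recurse=yes');
console.log(parsed);
```

Keeping the parsing separate from the file loading means the finder itself can stay synchronous and only needs to look up data that was prepared earlier.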

#3 Updated by Martin Honnen about 1 month ago

I am not sure the collectionFinder can work asynchronously. As for having a collection of XML documents, I think you need to make sure you pass in nodes, meaning in the context of Saxon-JS 2 it looks like you need to use SaxonJS.getResource to build nodes. Short sample that worked for me:

const SaxonJS = require('saxon-js');

(async () => {
    try {
        let documents = [ await SaxonJS.getResource({ type : 'xml', file : 'files/input1.xml' }), await SaxonJS.getResource({ type : 'xml', file : 'files/input2.xml' })];

        let result = await SaxonJS.transform({
            stylesheetFileName: 'sheet.sef.json',
            destination : 'serialized',
            collectionFinder: (url) => {
                if (url === 'http://example.com/col1')
                {
                    return documents;
                }
                else {
                    return [];
                }
            }
        }, 'async');
        console.log('Result: ' + result.principalResult);
    }
    catch (e) {
        console.log(e);
    }
})();

Of course given that I build the collection/array of nodes "before" running the transformation it might be easier to just pass them in as a global parameter instead of using the collection function.
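
A rough sketch of that alternative, under two assumptions that should be checked against the Saxon-JS documentation: that global parameters are supplied through the `stylesheetParams` option of SaxonJS.transform, and that a no-namespace parameter can be keyed by its local name. The helper `buildTransformOptions` and the parameter name `documents` are hypothetical, and the stylesheet would need a matching `<xsl:param name="documents"/>` declaration.

```javascript
// Hypothetical helper: builds SaxonJS.transform options that hand the
// pre-loaded documents to the stylesheet as a global parameter.
function buildTransformOptions(stylesheetFileName, documents) {
    return {
        stylesheetFileName,
        destination: 'serialized',
        // Assumption: the stylesheet declares <xsl:param name="documents"/>
        stylesheetParams: { documents }
    };
}

// An empty array stands in for the nodes built with SaxonJS.getResource.
const options = buildTransformOptions('sheet.sef.json', []);
console.log(Object.keys(options.stylesheetParams));
```

The options object would then be passed as in the sample above, for instance `await SaxonJS.transform(options, 'async')`, and the stylesheet would read `$documents` instead of calling collection().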

#4 Updated by Nik Osvalds about 1 month ago

That did the trick. Thank you so much! I was not able to get the async version of collectionFinder to work as you said, but following your example I was able to get it to pull all the collection files I needed. This is just a proof of concept for us to transition from Saxon HE to Saxon JS so I'm okay with it being a bit "hardcoded" for now.

// Load the codelists up front, since collectionFinder can't be async.
// Assumes lodash is available as `_` and fsPromises is require('fs').promises.
let codelistPaths = [
    "non-embedded-codelist/",
    "2.03/codelist/",
    "2.02/codelist/",
    "2.01/codelist/",
    "1.05/codelist/",
    "1.04/codelist/",
    "1.03/codelist/"
];

// Build an object keyed by codelist path; each value is the array of
// documents loaded with SaxonJS.getResource from that directory.
let resources = _.zipObject(codelistPaths, await Promise.all(_.map(codelistPaths, async (path) => {
    let files = await fsPromises.readdir("./IATI-Rulesets/lib/schemata/" + path);
    return await Promise.all(files.map(async (file) => {
        return await SaxonJS.getResource({ type : 'xml', file : "./IATI-Rulesets/lib/schemata/" + path + file })
    }))
})))

// Look up the pre-loaded documents for the requested collection URI.
const collectionFinder = (url) => {
    if (url.includes("codelist")) {
        // The key is the part after 'schemata/' and before the '?' query.
        let path = url.split('schemata/')[1].split('?')[0];
        return resources[path]
    } else {
        return []
    }
}

// results filepath
let resultsPath = __dirname + "/file_storage/validated/" + xmlIn.md5 + '_results.xml'
// Applying the XSLT3 Ruleset to IATI Files Using SaxonJS
let results = await SaxonJS.transform({
    stylesheetFileName: "./IATI-Rulesets/rules/iati.sef.json",
    sourceFileName: filePath,
    destination: "serialized",
    collectionFinder: collectionFinder
}, "async")

<xsl:variable name="iati-codelists">
    <codes version="2.03">
      <xsl:apply-templates select="collection('../lib/schemata/2.03/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
      <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    </codes>
    <codes version="2.02">
      <xsl:apply-templates select="collection('../lib/schemata/2.02/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
      <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    </codes>
    <codes version="2.01">
      <xsl:apply-templates select="collection('../lib/schemata/2.01/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
      <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    </codes>
    <codes version="1.05">
      <xsl:apply-templates select="collection('../lib/schemata/1.05/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    </codes>
    <codes version="1.04">
      <xsl:apply-templates select="collection('../lib/schemata/1.04/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    </codes>
    <codes version="1.03">
      <xsl:apply-templates select="collection('../lib/schemata/1.03/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    </codes>
  </xsl:variable>
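
One detail worth noting: the readdir calls in the loading code above pick up every entry in each directory, whereas the collection() URIs ask for `select=*.xml;recurse=yes`. A minimal sketch of gathering only the matching file paths before loading them is given here; `listXmlFiles` is a hypothetical helper, and it approximates the `*.xml` pattern with a simple extension check.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: lists *.xml files under dir, descending into
// subdirectories when recurse is true (mirrors select=*.xml;recurse=yes).
function listXmlFiles(dir, recurse = true) {
    const found = [];
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
        const full = path.join(dir, entry.name);
        if (entry.isDirectory()) {
            if (recurse) found.push(...listXmlFiles(full, true));
        } else if (entry.name.toLowerCase().endsWith('.xml')) {
            found.push(full);
        }
    }
    // Sort so the resulting order is stable across runs.
    return found.sort();
}
```

Each returned path could then be loaded with SaxonJS.getResource, as in the code above, before the transformation starts.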
