Aggregating results of several XQueries or XSLT transformation into one file: shortcoming of SaxonC API?

Added by Martin Honnen almost 2 years ago

Sometimes with XSLT or XQuery you want to run the same transformation or query over a series of input files and aggregate the results into one file, e.g. a single .csv file. With SaxonJ or SaxonCS/Saxon.NET I think the API allows that easily. Here is one example for SaxonJ that opens a single FileOutputStream and has each transformation create a Serializer over that same FileOutputStream:

import net.sf.saxon.s9api.*;

import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class Main {
    public static void main(String[] args) throws SaxonApiException, IOException {
        String sampleDir = "C:\\Users\\marti\\OneDrive\\Documents\\xslt\\blog-xslt-3-by-example\\aggregate-sample-data";
        String[] xmlSamples = new String[] { "sample1.xml", "sample2.xml", "sample3.xml" };

        Processor processor = new Processor(true);

        // Compile the stylesheet once and reuse it for every input file.
        XsltCompiler xsltCompiler = processor.newXsltCompiler();

        XsltExecutable xsltExecutable = xsltCompiler.compile(new StreamSource(new File(sampleDir, "filter-data.xsl")));

        // One output stream shared by all transformations; try-with-resources
        // closes it even if a transformation throws.
        try (FileOutputStream fos = new FileOutputStream(new File(sampleDir, "saxonj-result.csv"))) {
            for (String sample : xmlSamples) {
                Xslt30Transformer xslt30Transformer = xsltExecutable.load30();
                // Each run gets a fresh Serializer over the same stream, so the results are written one after another.
                xslt30Transformer.applyTemplates(new StreamSource(new File(sampleDir, sample)), processor.newSerializer(fos));
            }
        }
    }
}

Now I am wondering whether the SaxonC API (I am mainly looking at Python currently) would allow something similar. Given that the only file-oriented method is something like apply_templates_returning_file, there doesn't seem to be a way as simple as what the Java or .NET/CS APIs allow.

Am I left with using apply_templates_returning_string and then using Python to write the string results to the same opened file?

For huge results, that would mean holding huge intermediate strings in memory.
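A minimal sketch of that string-based workaround, to make the memory concern concrete. The helper name aggregate_as_strings and the transform_to_string callable are hypothetical; the saxonche wiring shown in the trailing comments is an assumption about keyword names and has not been run here.

```python
from typing import Callable, Iterable


def aggregate_as_strings(
    sources: Iterable[str],
    result_path: str,
    transform_to_string: Callable[[str], str],
) -> None:
    """Run one transformation per source and write every result to one file."""
    # newline="" stops Python from translating line endings produced by the serializer.
    with open(result_path, "w", encoding="utf-8", newline="") as out:
        for source in sources:
            # The complete result of each transformation exists as a single
            # Python string before it is written; this is the memory cost.
            out.write(transform_to_string(source))


# Hypothetical wiring with saxonche (assumed keyword names, not executed here):
#
# from saxonche import PySaxonProcessor
# with PySaxonProcessor(license=False) as proc:
#     executable = proc.new_xslt30_processor().compile_stylesheet(
#         stylesheet_file="filter-data.xsl")
#     aggregate_as_strings(
#         ["sample1.xml", "sample2.xml", "sample3.xml"],
#         "saxonc-result.csv",
#         lambda src: executable.apply_templates_returning_string(source_file=src),
#     )
```

Only one result string is alive at a time, so the peak memory is bounded by the largest single result rather than by the aggregate, but that can still be large.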

I understand that I could instead write a wrapper stylesheet that processes all input files and creates a single result, but that forces me to change my approach and could also run into memory problems if all sample files are built as trees by the same stylesheet.

So I wonder, does SaxonC need some way to write results to a stream or append to a file? Of course I am just wondering whether it would be desirable; I am not sure the architecture and the various target platforms (e.g. C/C++, Python, PHP) would allow that in a consistent and usable way.
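In the absence of an append option, one way to approximate it might be to let each transformation write to a temporary file and then copy that file into the aggregate in fixed-size chunks. This is only a sketch: aggregate_via_temp_files and the transform_to_file callable are hypothetical names, and the saxonche call in the trailing comments assumes that apply_templates_returning_file accepts source_file and output_file keywords.

```python
import os
import shutil
import tempfile
from typing import Callable, Iterable


def aggregate_via_temp_files(
    sources: Iterable[str],
    result_path: str,
    transform_to_file: Callable[[str, str], None],
    chunk_size: int = 1024 * 1024,
) -> None:
    """Transform each source into a temporary file, then append it to one result file."""
    with open(result_path, "wb") as out:
        for source in sources:
            # Reserve a temporary file name and release the handle so the
            # transformation can open the path itself.
            fd, tmp_path = tempfile.mkstemp(suffix=".part")
            os.close(fd)
            try:
                transform_to_file(source, tmp_path)
                with open(tmp_path, "rb") as part:
                    # Copy in chunks so only chunk_size bytes are buffered at once.
                    shutil.copyfileobj(part, out, chunk_size)
            finally:
                os.remove(tmp_path)


# Hypothetical wiring with saxonche (assumed keyword names, not executed here):
#
# aggregate_via_temp_files(
#     ["sample1.xml", "sample2.xml", "sample3.xml"],
#     "saxonc-result.csv",
#     lambda src, tmp: executable.apply_templates_returning_file(
#         source_file=src, output_file=tmp),
# )
```

The copy is byte-level concatenation, so it suits text output such as CSV; it costs extra disk I/O per input but keeps the Python-side memory use at roughly one chunk.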


Replies (2)

RE: Aggregating results of several XQueries or XSLT transformation into one file: shortcoming of SaxonC API? - Added by O'Neil Delpratt almost 2 years ago

Thank you Martin for the feature request. This feature might be too disruptive for SaxonC 12. So it might be best to push it back to a SaxonC 13 release.

RE: Aggregating results of several XQueries or XSLT transformation into one file: shortcoming of SaxonC API? - Added by Michael Kay almost 2 years ago

It sounds as if it would be quite complex to achieve; when crossing language boundaries it's best to stick to simple data types like strings and booleans rather than trying to pass complex things like streams. But an option to append the result to a file might be feasible, at the risk of adding complexity to the API.
