


XSLT 3.0 non XML result documents: should the Python API allow to capture them serialized?

Added by Martin Honnen almost 2 years ago

I am continuing to explore the SaxonC 12 Python API. For XSLT, I wonder whether there is a way to capture non-XML result documents containing XDM maps/arrays (i.e. JSON) in serialized form, e.g. as a string.

While the primary result of e.g.

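from saxonche import *

with PySaxonProcessor(license=True) as proc:
    print(proc.version)

    xslt = '''<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="yes">
    <xsl:output method="json" indent="yes"/>
    <xsl:template name="xsl:initial-template">
      <xsl:sequence select="map { 'name' : 'Example 1', 'data' : array { 1 to 5 } }"/>
    </xsl:template>
    </xsl:stylesheet>'''

    xslt_proc = proc.new_xslt30_processor()

    # Compile the stylesheet from the in-memory string
    xslt_exe = xslt_proc.compile_stylesheet(stylesheet_text=xslt)

    # Run xsl:initial-template; the primary result is serialized per xsl:output
    result = xslt_exe.call_template_returning_string()

    print(result)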


is returned serialized as a JSON string fine, i.e. the example code outputs

SaxonC-HE 12.0 from Saxonica
{
  "data": [ 1, 2, 3, 4, 5 ],
  "name": "Example 1"
}

when I try to do the same for secondary result documents created with xsl:result-document, I run into an error. The only alternative I have found is to capture the raw result, but then the Python API gives me no way to serialize it as e.g. JSON.

So the following code

from saxonche import *

with PySaxonProcessor(license=True) as proc:
    print(proc.version)

    xslt = '''<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="yes">
        <xsl:output method="json" indent="yes"/>
        <xsl:template name="xsl:initial-template">
          <xsl:sequence select="map { 'name' : 'Example 1', 'data' : array { 1 to 5 } }"/>
          <xsl:for-each select="1 to 5">
          <xsl:result-document href="json-result-{.}.json">
              <xsl:sequence
                select="array { (1 to .) ! map { 'name' : 'item ' || ., 'data' : array { 1 to . } } }"/>
            </xsl:result-document>
          </xsl:for-each>
        </xsl:template>
    </xsl:stylesheet>'''

    xslt_proc = proc.new_xslt30_processor()

    xslt_exe = xslt_proc.compile_stylesheet(stylesheet_text=xslt)

    # Capture secondary results in memory; second argument False = not raw
    xslt_exe.set_capture_result_documents(True, False)

    result = xslt_exe.call_template_returning_string()

    print(result)

    # Dictionary of result-document URI -> captured value
    result_docs = xslt_exe.get_result_documents()

    for key in result_docs:
        print(key, result_docs[key])

neither returns the primary result in result nor gives me any serialized secondary result documents; instead it outputs an error

SaxonC-HE 12.0 from Saxonica
Error in xsl:result-document/@href on line 6 column 60 
  SENR0001  Cannot serialize a map using this output method
at template xsl:initial-template on line 3 column 51

Now when I switch to "raw" results by changing e.g. xslt_exe.set_capture_result_documents(True, False) to xslt_exe.set_capture_result_documents(True, True), I get output where the primary result is serialized as JSON but the secondary result documents are returned as arrays, and I lack a way in the Python API to serialize them as JSON (their toString() representation of course uses the adaptive output method):

SaxonC-HE 12.0 from Saxonica
{
  "data": [ 1, 2, 3, 4, 5 ],
  "name": "Example 1"
}
json-result-1.json [map{"data":[1],"name":"item 1"}]
json-result-2.json [map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"}]
json-result-3.json [map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"},map{"data":[1,2,3],"name":"item 3"}]
json-result-4.json [map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"},map{"data":[1,2,3],"name":"item 3"},map{"data":[1,2,3,4],"name":"item 4"}]
json-result-5.json [map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"},map{"data":[1,2,3],"name":"item 3"},map{"data":[1,2,3,4],"name":"item 4"},map{"data":[1,2,3,4,5],"name":"item 5"}]

I am not sure whether the error I get ("SENR0001 Cannot serialize a map using this output method") is a shortcoming of SaxonC as the result of the interaction between Java and C++ and Python or whether it is a quirk/bug in the current implementation.

RE: XSLT 3.0 non XML result documents: should the Python API allow to capture them serialized? - Added by O'Neil Delpratt almost 2 years ago

Thank you Martin for your experiments on this issue. Maybe a new method, something like serializeAsJson(XdmMap), would do. Ideally we should be in line with the Java API, so I will investigate this further.

RE: XSLT 3.0 non XML result documents: should the Python API allow to capture them serialized? - Added by O'Neil Delpratt almost 2 years ago

Hi, taking a step back here. See below the code ported to Java:

        Processor processor = new Processor(false);

        StringReader reader = new StringReader("<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"3.0\" expand-text=\"yes\">\n" +
                "        <xsl:output method=\"json\" indent=\"yes\"/>\n" +
                "        <xsl:template name=\"xsl:initial-template\">\n" +
                "          <xsl:sequence select=\"map { 'name' : 'Example 1', 'data' : array { 1 to 5 } }\"/>\n" +
                "          <xsl:for-each select=\"1 to 5\">\n" +
                "          <xsl:result-document href=\"json-result-{.}.json\">\n" +
                "              <xsl:sequence\n" +
                "                select=\"array { (1 to .) ! map { 'name' : 'item ' || ., 'data' : array { 1 to . } } }\"/>\n" +
                "            </xsl:result-document>\n" +
                "          </xsl:for-each>\n" +
                "        </xsl:template>\n" +
                "    </xsl:stylesheet>");

        XsltCompiler compiler = processor.newXsltCompiler();

        Xslt30Transformer trans = compiler.compile(new StreamSource(reader)).load30();
        ResultHandler resultHandler = new ResultHandler(true);

        // Passing null invokes the default xsl:initial-template
        XdmValue value = trans.callTemplate(null);

        System.err.println("Primary doc: " + value.toString());

        XdmValue[] rawResults = resultHandler.getRawResults();

        // Print each captured secondary result with its index
        int i = 0;
        for (XdmValue valuei : rawResults) {
            System.err.println("Secondary results[" + i + "] :" + valuei.toString());
            i++;
        }

We get the following output:

Primary doc: map{"data":[1,2,3,4,5],"name":"Example 1"}
Secondary results[0] :[map{"data":[1],"name":"item 1"}]
Secondary results[1] :[map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"},map{"data":[1,2,3],"name":"item 3"},map{"data":[1,2,3,4],"name":"item 4"}]
Secondary results[2] :[map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"},map{"data":[1,2,3],"name":"item 3"}]
Secondary results[3] :[map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"},map{"data":[1,2,3],"name":"item 3"},map{"data":[1,2,3,4],"name":"item 4"},map{"data":[1,2,3,4,5],"name":"item 5"}]
Secondary results[4] :[map{"data":[1],"name":"item 1"},map{"data":[1,2],"name":"item 2"}]

RE: XSLT 3.0 non XML result documents: should the Python API allow to capture them serialized? - Added by O'Neil Delpratt almost 2 years ago

So Java is doing the same thing as the Python code. In Java I guess I could create another map for each secondary result and serialize it to string which is the JSON you want. I think you can do the same in Python too.

RE: XSLT 3.0 non XML result documents: should the Python API allow to capture them serialized? - Added by O'Neil Delpratt almost 2 years ago

Or if you want the individual XdmArray item I can just traverse them in a loop and call toString on the XdmMap objects, which would be JSON.

RE: XSLT 3.0 non XML result documents: should the Python API allow to capture them serialized? - Added by Martin Honnen almost 2 years ago

The question I pondered was how to get the secondary results as a serialized string. On the Java side I can easily use a Serializer for the secondary results/the result handler. On the Python side I don't have that option, and I don't have much of an API to serialize the raw XDM results with various options other than, as I have now figured out, calling the XPath 3.1 serialize function. For the time being, instead of using the Python API for XSLT, I have delegated the task to XPath and the fn:transform function with 'delivery-format' : 'serialized'. That gives me a map with all result documents as strings if I need/want that, or I could use 'raw' to get XDM values.
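The XPath serialize route mentioned above might look roughly like the following sketch. It assumes that the raw values returned by get_result_documents() can be bound with set_parameter on the XPath processor and then referenced as $value, and that the evaluated item offers get_string_value(). The helper names json_serialize_expr and serialize_raw_results are made up for illustration and are not part of saxonche.

```python
def json_serialize_expr(var_name="value", indent=True):
    """Build an XPath 3.1 call that serializes $var_name with the JSON output method."""
    indent_flag = "true()" if indent else "false()"
    return (
        f"serialize(${var_name}, "
        f"map {{ 'method' : 'json', 'indent' : {indent_flag} }})"
    )


def serialize_raw_results(proc, result_docs, indent=True):
    """Return {uri: JSON string} for raw result documents.

    proc is a PySaxonProcessor; result_docs is the dictionary returned by
    get_result_documents() after set_capture_result_documents(True, True).
    Assumption: set_parameter makes the bound value available as $value.
    """
    xpath_proc = proc.new_xpath_processor()
    expr = json_serialize_expr("value", indent)
    serialized = {}
    for uri, xdm_value in result_docs.items():
        # Rebind $value for each captured document, then serialize it
        xpath_proc.set_parameter("value", xdm_value)
        serialized[uri] = xpath_proc.evaluate_single(expr).get_string_value()
    return serialized
```

In the second example of this thread it would be called as serialize_raw_results(proc, xslt_exe.get_result_documents()). Note that the serialization options are fixed in the XPath expression here, so attributes on xsl:result-document or xsl:output are not taken into account.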

I can't currently tell whether it is feasible to give the C++/Python API something like a Serializer as the result handler; I guess that might not be possible.

RE: XSLT 3.0 non XML result documents: should the Python API allow to capture them serialized? - Added by Martin Honnen almost 2 years ago

I can live with the double True for capturing result documents and storing raw results, which gives me an XDM value. The question is whether the API can be improved, extended or perhaps changed for the case of capturing the results without wanting a raw value, but rather a result serialized according to the attributes on the xsl:result-document. In that case, as I stated in my post, if a secondary result is an XDM map or array, the current API (I think the code (for 11) is in ) uses an XdmDestination that throws an error for XDM map or array results. On the Java side that can be changed by supplying a Serializer, but with SaxonC/Python I am kind of stuck with the error because under the hood the API uses either RawDestination (which can handle any result but ignores serialization attributes/properties) or XdmDestination (which also ignores serialization attributes/properties but can't handle results that are arrays or maps and throws an error on them).

So at that point I wonder whether, for all the returning_string methods in the Python API, it would make sense to have a third argument to set_capture_result_documents that lets me say serialize or serialized, so that under the hood the C++/Java code would not use an XdmDestination but a Serializer over a StringWriter and return serialized result documents.
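To make the suggestion concrete, here is a purely hypothetical sketch of the three capture modes as they are described in this thread. Nothing in it is part of saxonche: CaptureMode, ModeBehaviour and can_capture_json_results are invented names, and the table only restates what is said above about XdmDestination, RawDestination and a Serializer over a StringWriter.

```python
from dataclasses import dataclass
from enum import Enum


class CaptureMode(Enum):
    """Hypothetical modes for capturing secondary result documents."""
    XDM = "xdm"                # today: set_capture_result_documents(True, False)
    RAW = "raw"                # today: set_capture_result_documents(True, True)
    SERIALIZED = "serialized"  # proposed third option, not in the current API


@dataclass(frozen=True)
class ModeBehaviour:
    destination: str            # Java-side destination named in the thread
    honours_serialization: bool  # applies xsl:output / xsl:result-document attributes
    accepts_maps_arrays: bool    # can hold XDM maps and arrays without an error


# Behaviour per mode as described in this thread, not taken from the SaxonC source.
MODE_BEHAVIOUR = {
    CaptureMode.XDM: ModeBehaviour("XdmDestination", False, False),
    CaptureMode.RAW: ModeBehaviour("RawDestination", False, True),
    CaptureMode.SERIALIZED: ModeBehaviour("Serializer over StringWriter", True, True),
}


def can_capture_json_results(mode):
    """True if the mode can capture map/array results without raising an error."""
    return MODE_BEHAVIOUR[mode].accepts_maps_arrays
```

Only the hypothetical SERIALIZED row both accepts maps and arrays and honours the serialization attributes, which is the gap the proposed third argument is meant to close.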

As I said, I can't judge how feasible that is and I am kind of just exploring what can currently be done and what can't and commenting on what can't be done.

And as I said, I have found a workaround: simply rely on fn:transform and its delivery-format: serialized. If some day the system function call gives me a nice error/exception when something goes wrong there (is there a bug for ), I perhaps don't need the serialized option for the capture result documents API of the XSLT API and can live with fn:transform.
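A minimal sketch of that fn:transform workaround, under some assumptions: the stylesheet text and a base output URI are bound as $xslt and $base_uri via set_parameter, and the whole result map is passed through serialize with the JSON method so that the Python side only needs json.loads. The function names are illustrative and not part of saxonche.

```python
import json

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"


def transform_expr(delivery_format="serialized"):
    """XPath 3.1 expression: run fn:transform and return its result map as JSON text."""
    return (
        "serialize(transform(map { "
        "'stylesheet-text' : $xslt, "
        "'base-output-uri' : $base_uri, "
        f"'initial-template' : QName('{XSLT_NS}', 'initial-template'), "
        f"'delivery-format' : '{delivery_format}' "
        "}), map { 'method' : 'json' })"
    )


def decode_result_map(json_text):
    """Turn the JSON text into a dict of result URI -> serialized document."""
    return json.loads(json_text)


def run_transform(proc, xslt_text, base_uri):
    """Evaluate the expression with a PySaxonProcessor (requires saxonche).

    Assumption: set_parameter makes the values available as $xslt and $base_uri.
    """
    xpath_proc = proc.new_xpath_processor()
    xpath_proc.set_parameter("xslt", proc.make_string_value(xslt_text))
    xpath_proc.set_parameter("base_uri", proc.make_string_value(base_uri))
    item = xpath_proc.evaluate_single(transform_expr())
    return decode_result_map(item.get_string_value())
```

With delivery-format 'serialized' every entry in the returned dictionary is a string, serialized according to xsl:output and xsl:result-document. Because a base output URI is supplied, the primary result should appear under that URI rather than under the key "output".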


