Bug #6069

closed

"Unexpected XSLT error" "Cannot add item to tree: (object)"

Added by Conal Tuohy over 1 year ago. Updated 6 months ago.

Status: Rejected
Priority: Low
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2023-06-09
Due date:
% Done: 0%
Estimated time:
Applies to JS Branch:
Fix Committed on JS Branch:
Fixed in JS Release:
SEF Generated with:
Platforms:
Company: -
Contact person: -
Additional contact persons: -

Description

I've encountered a strange error running a compiled stylesheet under SaxonJS 2.5 in NodeJS v18.16.0. The stylesheet is compiled on the same platform with the xslt3 utility.

This all used to work at some point in the past and I'm not sure what has changed, but I have managed to cut the stylesheet down to a fairly minimal example that reproduces the problem.

My JS code invokes the stylesheet using SaxonJS.transform, and passes parameters to it using stylesheetParams. One of the parameters ($source-uri) is an absolute file: URI, which the stylesheet passes to the doc function; it then applies templates to the result.
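For reference, here is a minimal sketch of one way an absolute file: URI such as the source-uri value could be built from a filesystem path in Node.js. The helper name toFileURI is hypothetical and is not part of the code in this report; pathToFileURL is the standard function from Node's url module.

```javascript
const { pathToFileURL } = require("node:url");

// Hypothetical helper (not from the original code): convert a filesystem
// path into an absolute file: URI, percent-encoding characters as needed.
function toFileURI(fsPath) {
	return pathToFileURL(fsPath).href;
}

// The path below is the test file mentioned later in this report.
const sourceURI = toFileURI(
	"/srv/tasks/src/test-data/Succeeds-word_doc_upload/fake-msword-example/msword_example-tei.xml"
);
console.log(sourceURI);
```

A value built this way could then be passed as the "source-uri" entry of stylesheetParams.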

The stylesheet implements a pipeline of several transformations by applying templates to a document, capturing the result as a variable, then applying templates to the variable, capturing the result as another variable, applying templates to that, etc. Without that pipeline the stylesheet works OK (i.e. if in the stylesheet below I replace <xsl:copy-of select="$phase-1"/> with <xsl:copy-of select="$source"/>, then I get the result I expect; a copy of the source document).

<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<!--
	Test of weird SaxonJS regression
	-->
	<xsl:param name="source-uri"/>
	
	<xsl:template name="xsl:initial-template">
		<!-- read the document -->
		<xsl:variable name="source" select="doc($source-uri)"/>
		<!-- transform the document with an identity mode -->
		<xsl:variable name="phase-1">
			<xsl:apply-templates mode="identity" select="$source"/>
		</xsl:variable>
		<!-- write the document -->
		<xsl:result-document href="/tmp/regression-test-output.xml" method="xml" indent="no">
			<xsl:copy-of select="$phase-1"/>
		</xsl:result-document>
	</xsl:template>
	
	<xsl:mode name="identity" on-no-match="shallow-copy"/>
</xsl:transform>

Here's the JS API call I use to invoke the stylesheet (my minimal example uses only the source-uri parameter, but the real stylesheet needs those other two):

const transformationResults = await SaxonJS.transform(
	{
		stylesheetFileName: "/srv/tasks/src/xslt/document-function-regression-test.xsl.sef.json",
		//stylesheetFileName: "/srv/tasks/src/xslt/process-tei-to-page-files.xsl.sef.json",
		stylesheetParams: {
			identifier: identifier,
			"source-uri": sourceURI,
			"page-identifier-regex": configuration.ui.filename.checkNameStructure,
		},
		baseOutputURI: output ? output : sourceURI, // output into the same folder as the source data file
	},
	"async"
);

Logging the error thrown by SaxonJS.transform produces:

  console.error
    L {
      message: 'An unexpected error occurred in XSLT code. Please report the following information: \n' +
        'Error SXJS0004 in document-function-regression-test.xsl line 11:\n' +
        'Internal error: Cannot add item to tree (object) <TEI xmlns="http://www.tei-c.org/ns/1.0">\n' +
        '    <teiHeader>\n' +
        '        <fileDesc>\n' +
        '            <titleStmt>\n' +
        '                <title/>\n' +
        '                <author/>\n' +
        '            </titleStmt>\n' +
        '            <editionStmt>\n' +
        '                <edition>\n' +
        '                    <date>2022-04-04</date>\n' +
        '                </edition>\n' +
        '            </editionStmt>\n' +
        '            <publicationStmt>\n' +
        '                <p>unknown</p>\n' +
        '            </publicationStmt>\n' +
        '            <sourceDesc>\n' +
        '                <p>Converted from a Word document</p>\n' +
        '            </sourceDesc>\n' +
        '        </fileDesc>\n' +
        '        <encodingDesc>\n' +
        '            <appInfo>\n' +
        '                <application xml:id="docxtotei" ident="TEI_fromDOCX" version="2.15.0">\n' +
        '                    <label>DOCX to TEI</label>\n' +
        '                </application>\n' +
        '            </appInfo>\n' +
        '        </encodingDesc>\n' +
        '        <revisionDesc>\n' +
        '            <listChange>\n' +
        '                <change>\n' +
        '                    <date>2022-04-13T04:31:35Z</date>\n' +
        '                    <name/>\n' +
        '                </change>\n' +
        '            </listChange>\n' +
        '        </revisionDesc>\n' +
        '    </teiHeader>\n' +
        '    <text>\n' +
        '        <body>\n' +
        '            <p rend="Normal" style="text-align: left; ">\n' +
        '                <hi rend="Page">msword_example-001</hi> Text of page 1</p>\n' +
        '            <p rend="Normal" style="text-align: left; ">\n' +
        '                <pb/>\n' +
        '            </p>\n' +
        '            <p rend="Normal" style="text-align: left; ">\n' +
        '                <hi rend="Page">msword_example-002</hi> Text of page 2 starts here</p>\n' +
        '            <p rend="Normal" style="text-align: left; ">More page 2 text.<pb/>\n' +
        '            </p>\n' +
        '            <p rend="Normal" style="text-align: left; ">\n' +
        '                <hi rend="Page">msword_example-003</hi> Finally this we have this, the transcription of page 3; the third page of the document, i.e. the page which follows immediately after the second page, also known as “page 2”.</p>\n' +
        '            <p rend="Normal" style="text-align: left; ">This is more of the third page.<pb/>\n' +
        '            </p>\n' +
        '            <p rend="Normal" style="text-align: left; ">\n' +
        '                <hi rend="Page">msword_example_2-001</hi> This is the first page of transcript of a different item; an item called msword_example_2; this page should not be extracted from the file.</p>\n' +
        '        </body>\n' +
        '    </text>\n' +
        '</TEI>',
      stack: 'Error: \n' +
        '    at new L (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4109:549)\n' +
        '    at Object.a [as ga] (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4110:349)\n' +
        '    at a.append (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4372:52)\n' +
        '    at Object.q [as gh] (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4362:243)\n' +
        '    at SC (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4971:310)\n' +
        '    at e.hf (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4976:85)\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4973:289\n' +
        '    at wc.Object.<anonymous>.ca.forEachItem (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4206:583)\n' +
        '    at e.Bb (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4973:256)\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4376:8\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4358:33\n' +
        '    at Array.forEach (<anonymous>)\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4358:6\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4379:17\n' +
        '    at Object.push (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4390:143)\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4600:290\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4386:258\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4386:309\n' +
        '    at Object.push (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4390:143)\n' +
        '    at e (/srv/tasks/node_modules/saxon-js/SaxonJS2N.js:4988:69)\n' +
        '    at /srv/tasks/node_modules/saxon-js/SaxonJS2N.js:5015:342',
      name: 'UnexpectedXSLTError',
      code: 'SXJS0004',
      xsltLineNr: '11',
      xsltModule: 'document-function-regression-test.xsl'
    }

The TEI XML content in that error message is indeed the content of the file identified by the $source-uri parameter.
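Because the message embeds the whole source document, the identifying fields are easy to miss. The following is a minimal sketch of a logging helper that keeps only those fields and the first few lines of the message. The function name summarizeTransformError and the sampleError stand-in are hypothetical; the property names (name, code, xsltModule, xsltLineNr, message) are the ones visible in the logged error above.

```javascript
// Hypothetical helper: reduce an error thrown by SaxonJS.transform to its
// identifying fields plus the first few lines of its message.
function summarizeTransformError(err, maxMessageLines = 3) {
	const lines = String(err.message ?? "").split("\n");
	return {
		name: err.name,
		code: err.code,
		xsltModule: err.xsltModule,
		xsltLineNr: err.xsltLineNr,
		message: lines.slice(0, maxMessageLines).join("\n"),
		// Number of message lines left out of the summary
		truncatedLines: Math.max(0, lines.length - maxMessageLines),
	};
}

// Stand-in object shaped like the error logged above (message shortened).
const sampleError = {
	name: "UnexpectedXSLTError",
	code: "SXJS0004",
	xsltLineNr: "11",
	xsltModule: "document-function-regression-test.xsl",
	message: [
		"An unexpected error occurred in XSLT code. Please report the following information: ",
		"Error SXJS0004 in document-function-regression-test.xsl line 11:",
		'Internal error: Cannot add item to tree (object) <TEI xmlns="http://www.tei-c.org/ns/1.0">',
		"    <teiHeader>",
		"        <fileDesc>",
	].join("\n"),
};

const summary = summarizeTransformError(sampleError);
console.error(summary);
```

In real use the helper would be called in the catch block around the SaxonJS.transform call, in place of logging the raw error.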

The stylesheet succeeds when launched from the command line:

node /srv/tasks/node_modules/xslt3 -it -t -xsl:/srv/tasks/src/xslt/document-function-regression-test.xsl source-uri=file:///srv/tasks/src/test-data/Succeeds-word_doc_upload/fake-msword-example/msword_example-tei.xml
SaxonJS 2.5 from Saxonica 
Node.js version v18.16.0
Compiling stylesheet /srv/tasks/src/xslt/document-function-regression-test.xsl
Stylesheet compilation time: 0.284s
Initial template: Q{http://www.w3.org/1999/XSL/Transform}initial-template
Asynchronous transform with options: stylesheetText={"N":"package","version":"30",(string), stylesheetBaseURI=file:///srv/tasks/src/xslt/doc(string), stylesheetParams=[object Object](string), outputProperties=[object Object](string), extraOptions=[object Object](string), destination=stdout(string), baseOutputURI=file:///srv/tasks/(string), logLevel=2(string), initialTemplate=Q{http://www.w3.org/1999/XSL/T(string), 
SEF generated by SaxonJS 2.5 at 2023-06-09T05:22:48.208Z
Promising to write to file:///tmp/regression-test-output.xml
<?xml version="1.0" encoding="UTF-8"?>
Execution time: 0.147s
Memory used: 29.21Mb
Transformation complete 

The output file file:///tmp/regression-test-output.xml is there and is a copy of the input file file:///srv/tasks/src/test-data/Succeeds-word_doc_upload/fake-msword-example/msword_example-tei.xml.

Naturally I can also run the stylesheet successfully under SaxonJ.


Actions #1

Updated by Martin Honnen over 1 year ago

Hi Conal,

does

node /srv/tasks/node_modules/xslt3 -it -t -xsl:/srv/tasks/src/xslt/document-function-regression-test.xsl.sef.json source-uri=file:///srv/tasks/src/test-data/Succeeds-word_doc_upload/fake-msword-example/msword_example-tei.xml

also work?

Or do you get an error there? Does it help if you compile with -relocate:on?

Actions #2

Updated by Conal Tuohy over 1 year ago

Good question, Martin!

If I run the stylesheet using the xslt3 node command line interface, then it always succeeds, whether I specify the source .xsl file or a compiled .xsl.sef.json file, and it makes no difference whether or not the compiled stylesheet is compiled with -relocate:on.

If I invoke the compiled stylesheet using the transform function of the SaxonJS JS API from within NodeJS, then it fails. Again, it makes no difference whether the stylesheet was compiled with -relocate:on or not.
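Since the command line run with -t prints the options it hands to the transform, one way to compare the two invocation paths might be to log the API options in a similar single-line form. This is only a sketch: describeTransformOptions is a hypothetical helper, the baseOutputURI value is an illustrative assumption, and the output format only loosely resembles the CLI trace.

```javascript
// Hypothetical helper: render a transform options object as one trace line
// of key=value pairs, serialising nested objects as JSON.
function describeTransformOptions(options) {
	return Object.entries(options)
		.map(([key, value]) => {
			const text =
				typeof value === "object" && value !== null
					? JSON.stringify(value)
					: String(value);
			return `${key}=${text}`;
		})
		.join(", ");
}

// Sample options modelled on the API call in the description; the
// baseOutputURI value is an assumption made for this illustration.
const options = {
	stylesheetFileName: "/srv/tasks/src/xslt/document-function-regression-test.xsl.sef.json",
	stylesheetParams: {
		"source-uri":
			"file:///srv/tasks/src/test-data/Succeeds-word_doc_upload/fake-msword-example/msword_example-tei.xml",
	},
	baseOutputURI: "file:///tmp/",
};

const traceLine = describeTransformOptions(options);
console.log(traceLine);
```

Logging this line immediately before calling SaxonJS.transform would show exactly which values the failing invocation receives.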

Actions #3

Updated by Norm Tovey-Walsh over 1 year ago

I've failed to reproduce this with the current head of development or the SaxonJS2 branch under either Node.js 12 or 18. I copied document-function-regression-test.xsl to issue.xsl and I'm running this transformation, compiled with the XX compiler:

return SaxonJS.transform(
	{
		stylesheetFileName: "test/nodejs/iss6069/issue.sef.json",
		stylesheetParams: {
			identifier: "test",
			"source-uri": "input.xml",
			"page-identifier-regex": ".*"
		},
		baseOutputURI: "file:///tmp/"
	},
	"async"
).then(result => {
	fs.readFile("/tmp/regression-test-output.xml", "utf8", (err, text) => {
		console.log(text);
	});
});

The test succeeds and logs the contents of input.xml to the console...

Actions #4

Updated by Norm Tovey-Walsh over 1 year ago

  • Status changed from New to AwaitingInfo

I've tried again, with the current head of development on the saxonjs2 branch and with the saxon-js-2-5-0 commit. I can't reproduce it. I'm going to mark this AwaitingInfo for the time being in case you want to add more.

Actions #5

Updated by Norm Tovey-Walsh 6 months ago

  • Status changed from AwaitingInfo to Rejected

I couldn't reproduce it and we haven't heard back. I'm going to close this for now. Feel free to reopen it if you can provide more details.
