Bug #5562

open

strange error when calling transform_to_file in a loop

Added by Lou Burnard almost 2 years ago. Updated almost 2 years ago.

Status:
New
Priority:
Low
Category:
-
Start date:
2022-06-11
Due date:
% Done:

0%

Estimated time:
Found in version:
11.3
Fixed in version:
Platforms:

Description

I need to run a sequence of transforms, one after another, each one operating on the output of the preceding one. Here's the code I am currently trying:

import saxonc

# scriptRoot, tempRoot and FILE are set earlier in the script (not shown)
with saxonc.PySaxonProcessor(license=False) as proc:
    print(proc.version)
    # Initialize the XSLT 3.0 processor
    xsltproc = proc.new_xslt30_processor()
    for i in range(5):
        print(i)
        SCRIPT = scriptRoot + "pt" + str(i) + ".xsl"
        TEMP = tempRoot + "temp-" + str(i + 1) + ".xml"
        print("Running " + SCRIPT + " on " + FILE + " producing " + TEMP)
        xsltproc.transform_to_file(source_file=FILE, stylesheet_file=SCRIPT, output_file=TEMP)
        # The next pass reads the file this pass has just written
        FILE = TEMP

The first time round the loop this behaves as expected, but on the second pass I get an I/O error on the output file I am trying to create:

SaxonC-HE 11.3 from Saxonica
0
Running /home/lou/Public/pdf2tei/pt0.xsl on /home/lou/Desktop/LacyWork/outgoing/0101Time/temp.xml producing /home/lou/Desktop/LacyWork/outgoing/0101Time/temp/temp-1.xml
1
Running /home/lou/Public/pdf2tei/pt1.xsl on /home/lou/Desktop/LacyWork/outgoing/0101Time/temp/temp-1.xml producing /home/lou/Desktop/LacyWork/outgoing/0101Time/temp/temp-2.xml
Error 
   I/O error reported by XML parser processing
  /home/lou/Desktop/LacyWork/outgoing/0101Time/temp/temp-2.xml: No such file or directory.
  Caused by java.io.FileNotFoundException: No such file or directory
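
When debugging a loop like the one above, it can help to compute every input and output path up front and to confirm that the stylesheets and the output directory exist before any transform runs. The following is a minimal sketch in plain Python. The helper names plan_chain and check_plan and their parameters are hypothetical and not part of SaxonC, and the check does not by itself explain the error reported here.

```python
from pathlib import Path


def plan_chain(script_root, temp_root, first_input, steps):
    """Return a list of (stylesheet, source, output) path triples.

    Hypothetical helper: mirrors the naming scheme of the loop above
    (pt<i>.xsl reads the previous output and writes temp-<i+1>.xml).
    """
    script_root, temp_root = Path(script_root), Path(temp_root)
    source = Path(first_input)
    plan = []
    for i in range(steps):
        stylesheet = script_root / f"pt{i}.xsl"
        output = temp_root / f"temp-{i + 1}.xml"
        plan.append((stylesheet, source, output))
        source = output  # the next step reads what this step wrote
    return plan


def check_plan(plan):
    """Return a list of problems that can be detected before running anything."""
    problems = []
    for stylesheet, _source, output in plan:
        if not stylesheet.is_file():
            problems.append(f"missing stylesheet: {stylesheet}")
        if not output.parent.is_dir():
            problems.append(f"missing output directory: {output.parent}")
    # Only the first source must exist already; later ones are created by the chain.
    if plan and not plan[0][1].is_file():
        problems.append(f"missing initial source: {plan[0][1]}")
    return problems
```

Each triple could then be passed to transform_to_file as stylesheet_file, source_file and output_file, after check_plan returns an empty list.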
Actions #1

Updated by Martin Honnen almost 2 years ago

Is that Linux or macOS where that happens?

Actions #2

Updated by Lou Burnard almost 2 years ago

Linux. Ubuntu 18.04, to be exact.

Actions #3

Updated by Martin Honnen almost 2 years ago

I am sure O'Neil will look into this soon.

As a workaround, if you need to chain stylesheets now, the following code has allowed me to chain stylesheets from Python with SaxonC 11.3 HE on both Windows and Linux. It compiles each stylesheet explicitly and uses the apply_templates_returning_value method:

from saxonc import *

with PySaxonProcessor(license=False) as proc:
   print(proc.version)

   xsltproc = proc.new_xslt30_processor()

   source_doc = proc.parse_xml(xml_file_name = 'sample1.xml')

   result_doc = None

   for i in (1,2,3,4):
       print(i)
       sheet = "sheet" +str(i) + ".xsl"
       print('Running sheet ', i)
       print('source_doc is: ', source_doc)

       transform = xsltproc.compile_stylesheet(stylesheet_file = sheet)

       transform.set_global_context_item(xdm_item = source_doc.head)
       result_doc = transform.apply_templates_returning_value(xdm_value = source_doc)
       print('result', i, result_doc)
       
       source_doc = result_doc

   print('Final result:', result_doc)

I understand that workaround only helps if you don't need intermediary results as files on the file system. It was not quite clear whether that was just part of your chosen approach or a requirement for your use case.
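
If intermediate results are wanted only for debugging, the in-memory chain can write each step's result to a file as it goes. The sketch below is plain Python and independent of SaxonC: the name run_chain, its parameters and the step-<n>.xml naming are all hypothetical, and it assumes that calling str() on a result yields its serialized form.

```python
from pathlib import Path


def run_chain(source, steps, dump_dir=None, serialize=str):
    """Feed `source` through each callable in `steps`, in order.

    If `dump_dir` is given, every intermediate result is also written to
    dump_dir/step-<n>.xml for debugging. `serialize` turns a result into
    text; `str` is assumed to be suitable here.
    """
    current = source
    for n, step in enumerate(steps, start=1):
        current = step(current)
        if dump_dir is not None:
            out = Path(dump_dir) / f"step-{n}.xml"
            out.write_text(serialize(current), encoding="utf-8")
    return current
```

With the workaround above, each step might be a small function that wraps one compiled stylesheet and calls its apply_templates_returning_value method on the document it receives.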

Another option, of course, is to chain in XSLT using fn:transform. I will try to post an example later.

Actions #4

Updated by Martin Honnen almost 2 years ago

Using fold-left plus transform from XSLT works fine for me with SaxonC 11.3 HE and the following Python code:

from saxonc import *

with PySaxonProcessor(license=False) as proc:
   print(proc.version)

   xsltproc = proc.new_xslt30_processor()

   source_doc = proc.parse_xml(xml_file_name = 'sample1.xml')

   transform = xsltproc.compile_stylesheet(stylesheet_file = 'chain-using-fn-transform-fold-left1.xsl')

   result = transform.apply_templates_returning_value(source_file = 'sample1.xml')

   print(result)

together with a stylesheet such as:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">
  
  <xsl:param name="sheet-uris" as="xs:string*" select="(1 to 4) ! ('sheet' || . || '.xsl')"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="/" name="xsl:initial-template">
    <xsl:sequence
      select="fold-left(
                $sheet-uris,
                .,
                function($source-node, $sheet) {
                  transform(map {
                    'source-node' : $source-node,
                    'stylesheet-location' : $sheet
                  })?output
                }
              )"/>
  </xsl:template>
  
</xsl:stylesheet>
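
For readers more at home in Python than XPath, the fold-left call in the stylesheet above behaves like functools.reduce: the accumulator starts as the source document and each stylesheet URI is applied to it in turn. The sketch below only illustrates that control flow. The transform argument is a hypothetical stand-in for fn:transform, and no real transformation takes place.

```python
from functools import reduce


def fold_left(items, zero, func):
    """Python counterpart of XPath fold-left($items, $zero, $func)."""
    return reduce(func, items, zero)


def chain(source, sheet_uris, transform):
    """Apply `transform(source, sheet)` once per sheet, left to right.

    `transform` is a hypothetical stand-in for the fn:transform call in
    the stylesheet above; it takes the current document and a stylesheet
    URI and returns the transformed document.
    """
    return fold_left(sheet_uris, source, transform)


# Stand-in transform that only records which sheet was applied.
sheets = [f"sheet{i}.xsl" for i in range(1, 5)]
trace = chain("doc", sheets, lambda doc, sheet: f"{sheet}({doc})")
print(trace)
```

With the stand-in lambda, the printed trace nests sheet1.xsl innermost and sheet4.xsl outermost, which is the order in which the stylesheet applies them.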
Actions #5

Updated by Martin Honnen almost 2 years ago

Writing the result to a file works too:

from saxonc import *

with PySaxonProcessor(license=False) as proc:
   print(proc.version)

   xsltproc = proc.new_xslt30_processor()

   source_doc = proc.parse_xml(xml_file_name = 'sample1.xml')

   transform = xsltproc.compile_stylesheet(stylesheet_file = 'chain-using-fn-transform-fold-left1.xsl')

   transform.set_cwd('.')

   transform.apply_templates_returning_file(source_file = 'sample1.xml', output_file = 'chaining-result1.xml')
Actions #6

Updated by Lou Burnard almost 2 years ago

Many thanks for the swift and helpful response! I can confirm that your first suggested workaround above works fine. (If I need the intermediate results for debugging, it's easy enough to write them to file.)

It would be nice to know why my original simple-minded attempt did not work, though.

Actions #7

Updated by Martin Honnen almost 2 years ago

Lou Burnard wrote in #note-6:

It would be nice to know why my original simple-minded attempt did not work though

I hope O'Neil investigates that during the week and lets you know.

Actions #8

Updated by O'Neil Delpratt almost 2 years ago

  • Assignee set to O'Neil Delpratt
  • Found in version set to 11.3
Actions #9

Updated by O'Neil Delpratt almost 2 years ago

Hi,

In the original example, what is the value of the variable FILE the first time around the loop?

Actions #10

Updated by Lou Burnard almost 2 years ago

Sorry, I forgot this query. The TEMP variable is explicitly set at the beginning of the stylesheet I copied. It has an anodyne but genuine value like "temp.xml".


From: Saxonica Developer Community
Sent: Monday, June 13, 2022 11:19 AM
Subject: [SaxonC - Bug #5562] strange error when calling transform_to_file in a loop
