Support #3100: Processing many files - Saxon - Saxonica Developer Community

Support #3100

closed

Processing many files

Added by Erik Wilde over 7 years ago. Updated over 7 years ago.

Status: Closed

Priority: Normal

Assignee: Michael Kay

Start date: 2017-01-10

Description

As mentioned in an email conversation, I am trying to process many files (~2000), all of them moderate in size (1-2MB), but it seems I am not using discard-document correctly. I always get Java heap space errors, so I assume I must be doing something wrong. My starting point was the code shown on http://www.saxonica.com/html/documentation/functions/saxon/discard-document.html, but the problem may be in the way I use it.

Files

training.xsl (696 Bytes)

Erik Wilde, 2017-01-10 19:02

Updated by Michael Kay over 7 years ago

Because you load the files in the form of a collection, and the collection is in a global variable, there is a link to the documents for the duration of the transformation even though they have been discarded from the document pool.

saxon:discard-document() is designed for documents read using the document() or doc() function rather than using collection(). Your best approach might be to use the uri-collection() function to get a set of URIs, and then to fetch individual documents using doc() based on those URIs.
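As a rough illustration of the uri-collection() approach, a stylesheet along the following lines might work. This is only a sketch under assumptions: the `input-dir` parameter, its placeholder path, the `main` template name, and the `results`/`file` output elements are all hypothetical, and it assumes an XSLT 3.0-enabled Saxon edition in which the saxon: extension functions are available.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Hypothetical parameter: a directory URI; replace the placeholder path.
       The ?select=*.xml pattern is Saxon's directory-collection filter. -->
  <xsl:param name="input-dir" select="'file:///path/to/input?select=*.xml'"/>

  <!-- Hypothetical entry point, e.g. invoked with -it:main -->
  <xsl:template name="main">
    <results>
      <!-- uri-collection() yields URIs only, so nothing is parsed at this point -->
      <xsl:for-each select="uri-collection($input-dir)">
        <!-- Parse one document; saxon:discard-document() returns it and
             removes it from the document pool so it can be garbage-collected
             once this iteration no longer references it -->
        <xsl:variable name="doc" select="saxon:discard-document(doc(.))"/>
        <!-- Placeholder processing: record the URI and an element count -->
        <file uri="{.}" elements="{count($doc//*)}"/>
      </xsl:for-each>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

The point of the design is that no global variable holds the parsed documents: each one is referenced only by a local variable inside the loop body, so nothing should keep it alive after its iteration ends.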

But you may be able to get away with simply dropping the global variable and doing

<xsl:for-each select="collection(...)!saxon:discard-document(.)">

Note that 9.7 introduced significant changes to the way collections work internally -- I'm not sure which release you are on. The new CollectionFinder interface in 9.7 opens the door to a lot of Java API capability for controlling how the resources used by collections are managed.
