Bug #2631
Closed
OutOfMemory running large Schematron file
Description
Hi,
I run Saxon 9.6.0.7J through oXygen 17.1. Regardless of the edition of Saxon, I run out of memory at various stages. The first is when looking up the phases that my Schematron has; the second is when running it. I asked the oXygen guys and they felt there was nothing more they could do and referred me to you.
Note that we generate our Schematron files. Not all of them are as huge as the one you'll find attached in the reproduction zip; most run without a hitch.
I'd like to know if there's anything you could do to keep the memory footprint down, or whether we could, for example, generate more efficient paths on our side.
Environment: MacBook Pro early 2013, 2.6 GHz i7, 16 GB RAM, with oXygen running on max 4 GB RAM.
Thanks in advance.
Updated by Alexander Henket almost 9 years ago
Note: I run out of memory when I choose phase #ALL, not when I choose any other (smaller) phase. The problem is that #ALL contains things that none of the other phases have and there are just too many phases to try one by one.
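For anyone hitting the same wall while just trying to see which phases a large generated schema declares, the listing can be done outside the editor with a streaming parse. What follows is a minimal sketch, not anything oXygen or Saxon does internally. It assumes the schema uses the ISO Schematron namespace, and the function name `list_phases` and the sample schema are made up for illustration. Note that `#ALL` is a reserved phase name, not a declared phase, so it will not show up in the result.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the schema uses the ISO Schematron namespace.
SCH_NS = "http://purl.oclc.org/dsdl/schematron"
PHASE = f"{{{SCH_NS}}}phase"
ACTIVE = f"{{{SCH_NS}}}active"


def list_phases(source):
    """Return {phase id: number of active patterns} without keeping the whole tree."""
    phases = {}
    depth = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if elem.tag == PHASE:
            phases[elem.get("id")] = len(elem.findall(ACTIVE))
        if depth == 1:
            # A direct child of the root (phase, pattern, ...) is complete:
            # drop its subtree so memory use stays roughly flat.
            elem.clear()
    return phases


# Hypothetical miniature schema, for illustration only.
SAMPLE = b"""<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <phase id="small"><active pattern="p1"/></phase>
  <phase id="large"><active pattern="p1"/><active pattern="p2"/></phase>
  <pattern id="p1"><rule context="/*"><assert test="true()">ok</assert></rule></pattern>
  <pattern id="p2"><rule context="/*"><assert test="true()">ok</assert></rule></pattern>
</schema>"""

phases = list_phases(io.BytesIO(SAMPLE))
print(phases)
```

Passing a file path instead of the `io.BytesIO` object works the same way. Clearing each completed top-level element is what keeps this usable on a file of tens of megabytes.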
Updated by Michael Kay almost 9 years ago
- Subject changed from OutOfMemory to OutOfMemory running large Schematron file
First, please note that we aren't Schematron experts. Our general rule is that we don't offer support where third-party products are involved, unless they are part of our standard test platform. Where things are reasonably simple open source applications, we're prepared to relax this rule a little, but you need to provide us with detailed instructions as to what we need to download and how you want us to run it. I think there is more than one Schematron processor out there and you haven't even told us which one you are running. Presumably it is one of the processors that converts schematron files to XSLT and then processes the XSLT using Saxon.
Secondly, one of your Schematron files is 45 MB in size. That's a big program in any language: it's about the same size as the source code of Saxon, all in one module. What's the size of the XSLT code that this produces? I'm afraid I'm not in the least surprised that it blows up: Saxon is not designed to handle such enormous stylesheets. I've no idea what the limits are, but I think you need to re-think your design.
Updated by Michael Kay almost 9 years ago
- Status changed from New to AwaitingInfo
Updated by Alexander Henket almost 9 years ago
Your reply got me thinking, so I ran the Schematron.com standard conversion from SCH to SVRL to get the stylesheet. To answer that question: the XSLT is 145MB.
I then ran it outside of oXygen, on the command line, using the Saxon9EE.jar 9.6.0.7J from oXygen. It took a while, but sure enough 2MB worth of SVRL came out. No out of memory, but the output is rather useless as it stands: I now need to evaluate every single reported path against the base instance again to get back to the problematic fragment.
(java -Xmx4096m -jar /Applications/oxygen/lib/saxon9ee.jar -o:svrlresult.xml xml-JGZ/REPC_EX902120NL_03.xml schematron_svrl/jgz-versturenDossieroverdrachtbericht-02.xsl)
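As a stopgap for the disconnected SVRL, the report can at least be condensed into one entry per location. The sketch below assumes standard SVRL output (namespace `http://purl.oclc.org/dsdl/svrl`, `svrl:failed-assert` elements with a `location` attribute and an `svrl:text` child). The helper names and the sample report are hypothetical. It does not resolve a location back to a fragment of the instance; that would need an XPath engine that understands the location syntax the stylesheet emits.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

SVRL_NS = "http://purl.oclc.org/dsdl/svrl"
FAILED = f"{{{SVRL_NS}}}failed-assert"
TEXT = f"{{{SVRL_NS}}}text"


def failed_asserts(source):
    """Yield (location, message) for every svrl:failed-assert in the report."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == FAILED:
            text = elem.find(TEXT)
            message = "".join(text.itertext()).strip() if text is not None else ""
            yield elem.get("location"), message
            elem.clear()  # release the processed element's content


def group_by_location(source):
    """Collect the messages reported for each location, in document order."""
    grouped = defaultdict(list)
    for location, message in failed_asserts(source):
        grouped[location].append(message)
    return dict(grouped)


# Hypothetical miniature report, for illustration only.
SAMPLE = b"""<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <svrl:fired-rule context="/*"/>
  <svrl:failed-assert test="@id" location="/doc/item[1]"><svrl:text>id is required</svrl:text></svrl:failed-assert>
  <svrl:failed-assert test="@code" location="/doc/item[1]"><svrl:text>code is required</svrl:text></svrl:failed-assert>
  <svrl:failed-assert test="@id" location="/doc/item[2]"><svrl:text>id is required</svrl:text></svrl:failed-assert>
</svrl:schematron-output>"""

by_location = group_by_location(io.BytesIO(SAMPLE))
for location, messages in by_location.items():
    print(location, messages)
```

For the real report, pass the path of the SVRL file (for example the `svrlresult.xml` written by the command above) to `group_by_location`.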
The second thing I'm working on is a small rewrite of the schematron-engine we have created to build these SCH files so new phases become available to facilitate the process in oXygen and so I don't 'have' to go to #ALL.
Lastly: I've pointed the oXygen guys to this ticket and asked them to re-evaluate what could be done.
So in summary: I think I have enough to work with to make things more manageable. The goal is to stay inside oXygen as disconnected SVRL is not a feasible solution.
It would seem this ticket here may be closed. If the oXygen-guys feel it should be reopened based on their renewed investigation I'm sure we'll get back to it. Thanks.
Updated by Michael Kay almost 9 years ago
- Status changed from AwaitingInfo to Closed
- Assignee set to Michael Kay
I'm closing this. Running a 145 MB stylesheet is beyond Saxon's capabilities.
Updated by Alexander Henket almost 9 years ago
Just for the record: running that 145MB XSL on the command line actually does work more or less reasonably well. It's oXygen, as the intermediary, that chokes. Saxon may not have been designed for it, but it does work.