Support #5959

closed

XSLT 3.0 streaming results in OutOfMemoryError

Added by Mark Hansen about 1 year ago. Updated 10 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
XSLT 3.0 packages
Sprint/Milestone:
-
Start date:
2023-04-06
Due date:
% Done:
0%

Estimated time:
Legacy ID:
Applies to branch:
12
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

Hello,

I have a large XML file and tried to split it into smaller files for processing, using an XSLT 3.0 stylesheet with streaming.

Here is my Java code:

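 // Create the streaming-capable transformer factory.
 StreamingTransformerFactory streamingTransformerFactory =
     XmlTransformerHelper.createStreamingTransformerFactory();

 // Compile the splitting stylesheet, loaded from the classpath.
 Templates streamingTemplates =
     XmlTransformerHelper.createTemplate(
         streamingTransformerFactory,
         getClass()
             .getClassLoader()
             .getResourceAsStream(RESOURCE_URL_XSL_SPLIT_BIG_ONIX_STREAM));

 // Principal result file; the split files are written by xsl:result-document.
 okFile = new File(targetDirectory.toString(), "dummy.xml");

 // Run the transformation over the input stream.
 streamingTemplates
     .newTransformer()
     .transform(new StreamSource(bookStream), new StreamResult(okFile));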

Here is my XSL:

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                exclude-result-prefixes="#all"
                xmlns:s="http://www.book.org/book/3.0/short"
                xmlns:saxon="http://saxon.sf.net/"
>
    <xsl:mode streamable="yes" on-no-match="shallow-copy" use-accumulators="#all"/>

    <xsl:output indent="yes"/>

    <xsl:accumulator name="header" as="element()?" initial-value="()" streamable="yes">
        <xsl:accumulator-rule match="s:header" phase="end" saxon:capture="yes" select="."/>
    </xsl:accumulator>

    <xsl:template match="s:Entry">
        <xsl:result-document href="{position()}.xml" method="xml">
            <xsl:element name="{name(ancestor::*[last()])}" namespace="{namespace-uri(ancestor::*[last()])}">
                <xsl:copy-of select="accumulator-before('header')"/>
                <xsl:copy-of select="."/>
            </xsl:element>
        </xsl:result-document>
    </xsl:template>

</xsl:stylesheet>

It works fine for smaller XML files (around 200 MB), but for a 2 GB XML file I get an OutOfMemoryError (see the attachment).

I started my Spring Boot application with the VM arguments "-Xms1G -Xmx3G".
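
One way to confirm that these heap settings actually reached the JVM running the transformation is to log the heap limits at startup. The following is only a minimal sketch using the standard java.lang.Runtime API; the class name HeapCheck and its helper methods are hypothetical and not part of the application above.

```java
public class HeapCheck {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Maximum amount of heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the effective upper heap limit (e.g. from -Xmx).
        System.out.println("max heap (MiB):   " + maxHeapMiB());
        // totalMemory() is the heap currently reserved by the JVM.
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

If the reported maximum is far below the expected 3 GB, the arguments are probably not being passed to the process that runs the transformation.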

I expected that streaming would keep memory consumption low.

Is that assumption wrong, or did I do something wrong?


Files

clipboard-202304061011-zsmsr.png (41.2 KB), Mark Hansen, 2023-04-06 10:11
S3ChunkedStream.java (1.79 KB), Mark Hansen, 2023-04-06 13:01
