OOM with streamsource and streamresult

Added by Marco Laponder over 11 years ago

Hi All,

I have a problem transforming a large HTML file to a file. The input is quite large (over 100MB), and when I transform it I get an OutOfMemoryError. I am using a StreamSource (inputFile) as the input and a StreamResult as the result, expecting that this would need the least memory. At first I blamed my stylesheet for preventing streaming, so I have now stripped it down to:

<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" omit-xml-declaration="yes"/>
  <xsl:template match="html">
    <xsl:element name="{name()}">
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

so of the input HTML only an empty html element remains. It seems to me this should not need much memory, but the OOM still occurs. I dumped the heap on OOM, and when I inspect the dump file with JProfiler and go to 'Biggest Objects', net.sf.saxon.tree.TinyTree is the biggest object and occupies 236MB.

Am I doing something wrong here?

Kind regards, Marco


Replies (1)

RE: OOM with streamsource and streamresult - Added by Michael Kay over 11 years ago

By default an XSLT processor will build an in-memory tree representing the source document. This is what is consuming the memory. It doesn't matter what form the input comes in, the tree will still be constructed.

For a source document of 100MB this should be viable provided you allocate enough memory, e.g. use -Xmx1024m on the command line.
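As a rough illustration of the point above, here is a minimal sketch using only the standard JAXP API (javax.xml.transform), not any Saxon-specific call. The class name StreamTransformSketch, the STYLESHEET constant and the transform helper are hypothetical names made up for this example, and the stylesheet is declared as version 1.0 (an assumption, so that the JDK's built-in processor also accepts it). Even though both ends are streams, a non-streaming processor will normally build the whole source tree in memory before applying templates, so the heap limit printed by main is what matters for a large input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StreamTransformSketch {

    // Hypothetical stand-in for the stripped-down stylesheet in the question
    // (declared as version 1.0 here, which is an assumption of this sketch).
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" indent=\"no\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"html\"><xsl:element name=\"{name()}\"/></xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the transformation with a StreamSource as input and a StreamResult
    // as output. The stream classes only describe where data comes from and
    // goes to; they do not by themselves make the processing streamed.
    static String transform(String xml, String xslt) throws TransformerException {
        // Picks whichever JAXP implementation is configured or on the classpath.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Reports the heap ceiling that -Xmx controls.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap (MB): " + maxHeapMb);
        System.out.println(
            transform("<html><body><p>text</p></body></html>", STYLESHEET));
    }
}
```

Running it as java -Xmx1024m StreamTransformSketch raises the heap ceiling that the first line of output reports; for a real 100MB input the sources would be files rather than strings.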

With Saxon-EE, provided you are careful in the way you write your transformation, it is possible to perform a streaming transformation that avoids the need to construct the tree. Information on this can be found here:

http://www.saxonica.com/documentation/sourcedocs/streaming.xml
