Support #4918

closed

OutOfMemoryError for 48MB XML Input XdmNode.toString()

Added by Nik Osvalds about 3 years ago. Updated about 3 years ago.

Status: Closed
Priority: Low
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2021-02-24
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch: 10
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

Hello,

I'm creating a "serverless" Azure Functions Web API that runs XSLT transforms. It is used by an application I'm developing to validate XML documents.

I'm getting an OutOfMemoryError: Java heap space when running Xslt30Transformer.applyTemplates() on a 48 MB XML source document. It occurs in the Azure Function running on the cloud platform, which is supposed to have 1.5 GB of memory available. I can't reproduce it locally because of a request-size limit in the local Azure Functions runtime.

I'm using the XdmNode.toString() method because this is an API and I want to return the result of .applyTemplates() in a JSON object. However, this seems to be where the function runs out of memory: at net.sf.saxon.s9api.XdmNode.toString(XdmNode.java:487)

Is there a way to consume less memory when taking the output from .applyTemplates() and turning it into a string to send in my response object?

My Transform class:

package com.transform;

import javax.xml.transform.stream.StreamSource;

import com.microsoft.azure.functions.ExecutionContext;

import java.io.*;
import net.sf.saxon.s9api.*;

public class Transform {
    public static String transform(String body, ExecutionContext context) throws Exception {
        try {                        

            Processor processor = new Processor(false);
            XsltCompiler compiler = processor.newXsltCompiler();
            XsltExecutable stylesheet = compiler.compile(new StreamSource(new File(System.getenv("StorageDir")+"/data-quality/rules/iati.xslt")));
            Xslt30Transformer transformer = stylesheet.load30();

                
            String result = transformer.applyTemplates(new StreamSource(new StringReader(body))).toString();

            
            return result;
        } catch (Exception e) {
            context.getLogger().severe("Error in transform: " + e);
            throw e;
        }
    }
}

Error Dump:

Exception while executing function: Functions.transform Result: Failure
Exception: OutOfMemoryError: Java heap space
Stack: java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at com.microsoft.azure.functions.worker.broker.JavaMethodInvokeInfo.invoke(JavaMethodInvokeInfo.java:22)
	at com.microsoft.azure.functions.worker.broker.EnhancedJavaMethodExecutorImpl.execute(EnhancedJavaMethodExecutorImpl.java:55)
	at com.microsoft.azure.functions.worker.broker.JavaFunctionBroker.invokeMethod(JavaFunctionBroker.java:57)
	at com.microsoft.azure.functions.worker.handler.InvocationRequestHandler.execute(InvocationRequestHandler.java:33)
	at com.microsoft.azure.functions.worker.handler.InvocationRequestHandler.execute(InvocationRequestHandler.java:10)
	at com.microsoft.azure.functions.worker.handler.MessageHandler.handle(MessageHandler.java:45)
	at com.microsoft.azure.functions.worker.JavaWorkerClient$StreamingMessagePeer.lambda$onNext$0(JavaWorkerClient.java:92)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Unknown Source)
	at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
	at java.base/java.lang.AbstractStringBuilder.append(Unknown Source)
	at java.base/java.lang.StringBuffer.append(Unknown Source)
	at java.base/java.io.StringWriter.write(Unknown Source)
	at net.sf.saxon.serialize.XMLEmitter.startElement(XMLEmitter.java:409)
	at net.sf.saxon.serialize.XMLIndenter.startElement(XMLIndenter.java:172)
	at net.sf.saxon.event.NamespaceDifferencer.startElement(NamespaceDifferencer.java:71)
	at net.sf.saxon.event.ProxyReceiver.startElement(ProxyReceiver.java:139)
	at net.sf.saxon.event.SequenceNormalizer.startElement(SequenceNormalizer.java:84)
	at net.sf.saxon.tree.tiny.TinyElementImpl.copy(TinyElementImpl.java:389)
	at net.sf.saxon.event.SequenceReceiver.decompose(SequenceReceiver.java:214)
	at net.sf.saxon.event.SequenceNormalizerWithSpaceSeparator.append(SequenceNormalizerWithSpaceSeparator.java:38)
	at net.sf.saxon.event.SequenceReceiver.append(SequenceReceiver.java:133)
	at net.sf.saxon.event.SequenceCopier$$Lambda$205/0x0000000100407040.accept(Unknown Source)
	at net.sf.saxon.om.SequenceIterator.forEachOrFail(SequenceIterator.java:136)
	at net.sf.saxon.event.SequenceCopier.copySequence(SequenceCopier.java:33)
	at net.sf.saxon.query.QueryResult.serializeSequence(QueryResult.java:199)
	at net.sf.saxon.query.QueryResult.serialize(QueryResult.java:115)
	at net.sf.saxon.query.QueryResult.serialize(QueryResult.java:58)
	at net.sf.saxon.s9api.XdmNode.toString(XdmNode.java:487)
	at com.transform.Transform.transform(Transform.java:20)
	at com.transform.Function.run(Function.java:41)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at com.microsoft.azure.functions.worker.broker.JavaMethodInvokeInfo.invoke(JavaMethodInvokeInfo.java:22)
	at com.microsoft.azure.functions.worker.broker.EnhancedJavaMethodExecutorImpl.execute(EnhancedJavaMethodExecutorImpl.java:55)
	at com.microsoft.azure.functions.worker.broker.JavaFunctionBroker.invokeMethod(JavaFunctionBroker.java:57)
	at com.microsoft.azure.functions.worker.handler.InvocationRequestHandler.execute(InvocationRequestHandler.java:33)
	at com.microsoft.azure.functions.worker.handler.InvocationRequestHandler.execute(InvocationRequestHandler.java:10)

Thanks, Nik

Actions #1

Updated by Michael Kay about 3 years ago

I don't know if it will solve the problem for you, but what you're doing is certainly inefficient. Rather than constructing the transformation result as an in-memory tree and then serializing it, you should send the output directly to a serializer. That also has other benefits, for example the serialization parameters are then controlled by the xsl:output declarations in the stylesheet.

Use the two-argument version of transformer.applyTemplates, supplying a Serializer as the second argument; the Serializer can be initialised with a StringWriter that writes the serialised output directly to a string.

I don't know what the Web API you are using looks like, but it might even be possible to send the response directly to the network without holding it as a string in memory at all.
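
A minimal sketch of the pattern described above: serialize the transformation result straight to a Writer or OutputStream instead of building a result tree and calling toString() on it. To stay runnable without Saxon on the classpath, this sketch uses the JDK's built-in JAXP API rather than s9api; in s9api the corresponding step is the two-argument applyTemplates call with a Serializer as its destination. The class name, method names, and the tiny stylesheet are hypothetical and exist only for illustration.

```java
import java.io.OutputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DirectSerialization {
    // Hypothetical stylesheet: emits <report> carrying the name of the root element.
    // Serialization is controlled by the xsl:output declaration.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'><report root='{name(*)}'/></xsl:template>"
      + "</xsl:stylesheet>";

    static Transformer newTransformer() throws TransformerException {
        return TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
    }

    // Serializes directly into a StringWriter; no intermediate result tree is
    // kept and then serialized in a second step.
    static String transformToString(String xml) throws TransformerException {
        StringWriter out = new StringWriter();
        newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    // Serializes directly to an OutputStream (for example a response stream,
    // if the hosting API exposes one), so the output is never held as a String.
    static void transformToStream(String xml, OutputStream out) throws TransformerException {
        newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
    }
}
```

The String variant still holds the whole serialized output in memory once; only the OutputStream variant avoids that, and it only helps if the hosting platform offers a stream to write to.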

Actions #2

Updated by Michael Kay about 3 years ago

  • Status changed from New to AwaitingInfo

Did my suggestion on this help?

I don't think there's any more advice we can offer unless you have further information. I'll class it as "awaiting info" for now.

Actions #3

Updated by Nik Osvalds about 3 years ago

Hello,

Yes, this did help: the memory error was no longer triggered at this point in the execution. It did pop up again when I was building the HTTP response body, but I was able to fix that as well. Unfortunately, Azure Functions doesn't currently support any type of streaming.

Thanks, Nik

Actions #4

Updated by Michael Kay about 3 years ago

  • Status changed from AwaitingInfo to Closed
