I have some difficulty in developing, need help

Added by Anonymous over 15 years ago

Legacy ID: #5915634 Legacy Poster: xu yang (xuyang5580136)

These days I want to develop something with XQJ, so I downloaded saxonsa9-0-0-8j.zip and started studying how to execute XQuery in Java. But when I process a large input XML document, the error Exception in thread "main" java.lang.OutOfMemoryError: Java heap space occurs, so I want to know how to resolve this problem. Can't the input be supplied as an InputStream? And how would I use it? I would really appreciate your help.


Replies (8)

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5915666 Legacy Poster: David Lee (daldei)

XQuery doesn't do streaming mode to my knowledge, only XSLT. XQuery loads the entire document into memory. You can pick which tree model it uses, but I believe the default is TinyTree, which is the most memory-compact. To get over this problem you need to raise your Java VM's maximum heap size. From the command line I use the java flag -Xmx##### to set the max heap size. The default size is way too low. For example, I typically run java -Xmx500m .... to set the max at 500 MB. Java doesn't immediately eat up this amount; it just sets the ceiling the heap is allowed to grow to. You can also set the minimum and other memory parameters, but I find just setting the max is the easiest. A ROUGH estimate may be about 4x the size of your largest XML file, but you may need more. -David
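As a rough illustration of the sizing advice above, the sketch below compares the JVM's maximum heap (what -Xmx sets) against 4x the size of an input file. The class name HeapCheck and its helper methods are made up for this example, and the factor of 4 is only the rough estimate from this post, not a figure guaranteed by Saxon.

```java
import java.io.File;

// Hypothetical helper: checks whether the current -Xmx ceiling covers the
// "about 4x the file size" rule of thumb quoted in the post above.
class HeapCheck {
    // Rough multiplier only; real usage depends on the document and tree model.
    static final long ROUGH_FACTOR = 4;

    /** Estimated heap (in bytes) needed to hold an XML file of the given size in memory. */
    static long estimatedHeapBytes(long fileSizeBytes) {
        return fileSizeBytes * ROUGH_FACTOR;
    }

    /** True if the given max heap is at least the rough estimate for the file. */
    static boolean heapLooksSufficient(long fileSizeBytes, long maxHeapBytes) {
        return maxHeapBytes >= estimatedHeapBytes(fileSizeBytes);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java HeapCheck <file.xml>");
            return;
        }
        long size = new File(args[0]).length();
        // maxMemory() reports the ceiling the heap may grow to (the -Xmx value, approximately).
        long maxHeap = Runtime.getRuntime().maxMemory();
        long mb = 1024L * 1024L;
        System.out.printf("file: %d MB, rough estimate: %d MB, max heap: %d MB%n",
                size / mb, estimatedHeapBytes(size) / mb, maxHeap / mb);
        if (!heapLooksSufficient(size, maxHeap)) {
            System.out.println("Heap is probably too small; try a larger -Xmx value.");
        }
    }
}
```

Running it as java -Xmx500m HeapCheck big.xml (big.xml being a placeholder file name) shows whether a 500 MB ceiling is likely to be enough before attempting the query.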

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5915742 Legacy Poster: xu yang (xuyang5580136)

Does that mean that when I have a large XML input of about 100 MB, I can't use XQuery in Java to process it?

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5916496 Legacy Poster: David Lee (daldei)

There will be an upper limit to the size of document you can process using XQuery or any XML technology that builds an in-memory model of the document. In practice I find I can process 100 MB XML files but not 500 MB XML files. Your experience may vary. For larger files I split them up into smaller files. Most large XML files I've had to process are really collections of lots of smaller documents. A program using a streaming API can split these up fairly easily. Such a program can be written fairly simply using the SAX or StAX APIs. As an example, you could look at "xsplit" in xmlsh (www.xmlsh.org), which can split XML files of essentially unlimited size into smaller XML files. -David
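A minimal sketch of the splitting idea described above, using the StAX event API that ships with the JDK. It assumes the large file is a root element wrapping many record elements and writes each direct child of the root to its own file. The class name XmlSplitter and the part-N.xml naming are made up for this example; this is not how xsplit itself is implemented. One known limitation: namespace declarations made on the root element are not copied into the parts.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical splitter: streams through the input once, never holding
// more than one event in memory at a time.
class XmlSplitter {
    /** Writes each direct child of the root element to outDir/part-N.xml; returns the part count. */
    static int split(InputStream in, Path outDir) throws XMLStreamException, IOException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        int depth = 0;              // 1 = root element, 2 = a record to split out
        int count = 0;
        XMLEventWriter writer = null;
        OutputStream out = null;
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;
                    if (depth == 2) {   // a new record begins: open the next part file
                        count++;
                        out = Files.newOutputStream(outDir.resolve("part-" + count + ".xml"));
                        writer = outFactory.createXMLEventWriter(out, "UTF-8");
                    }
                }
                if (writer != null) {
                    writer.add(event);  // copy every event that belongs to the current record
                }
                if (event.isEndElement()) {
                    if (depth == 2 && writer != null) {  // record finished: close its file
                        writer.flush();
                        writer.close();
                        out.close();
                        writer = null;
                        out = null;
                    }
                    depth--;
                }
            }
        } finally {
            reader.close();
            if (out != null) {
                out.close();            // only reached if parsing failed mid-record
            }
        }
        return count;
    }
}
```

Each resulting part is small enough to be loaded and queried on its own, for example by running the same XQuery over every part-N.xml in turn.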

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5920594 Legacy Poster: xu yang (xuyang5580136)

Could you give me an example of processing 100 MB XML files? I want to know how to do this.

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5923899 Legacy Poster: David Lee (daldei)

Using the java command flag -Xmx1024m I was able to run an XQuery using Saxon on a 90 MB file with no problem. You will have to experiment with your own files to see how much or how little memory you can get away with.

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5926650 Legacy Poster: Michael Kay (mhkay)

Firstly, the current release is 9.1.0.5. This is significant because 9.1 added some streaming facilities that were not present in 9.0. There are two ways you can process a large document. You can load it into memory, in which case you will need to make sure that the Java VM has enough memory available: to do this, use the option (e.g.) -Xmx1024m on the Java command line. Or you can run it in streaming mode: for details see http://www.saxonica.com/documentation/sourcedocs/serial/saxonstream.html

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5926738 Legacy Poster: David Lee (daldei)

I looked at this page again, and for the first time I saw an XQuery example for streaming. How new is XQuery support for streaming? Can the streaming indications be set programmatically (for the entire context?) or only for the doc() function? Thanks!

RE: I have some difficulty in developing, need help - Added by Anonymous over 15 years ago

Legacy ID: #5926799 Legacy Poster: Michael Kay (mhkay)

XQuery support for streaming was added in 9.1. At present it can only be done by calling the doc() function within the scope of the saxon:stream pragma or the saxon:stream() extension function. Michael Kay http://www.saxonica.com/
