about Large document

Added by Anonymous about 15 years ago

Legacy ID: #6951752 Legacy Poster: xu yang (xuyang5580136)

I have some questions about processing a large document in an XQuery statement. When the result of an XQuery statement is too large, how can I get the result? As a String, a Stream, or some other data structure?


Replies (3)

RE: about Large document - Added by Anonymous about 15 years ago

Legacy ID: #6951876 Legacy Poster: Michael Kay (mhkay)

It depends rather on what you want to do with the result. If you want to output the result (rather than processing it further in your application), I would recommend serializing the result to a stream. If the result is XML you could pipe it to another process using a SAXResult or SAXDestination.
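To illustrate the SAXResult suggestion, here is a minimal sketch using only JDK classes. The JDK's identity Transformer stands in for the query engine (an assumption made so the example is self-contained; with Saxon the query result would be sent to the SAXResult instead), and the class names SaxPipe and ElementCounter are made up for the example. The consumer receives the result as SAX events and never needs the serialized document as a String.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxPipe {

    /** Hypothetical downstream consumer: counts elements as the events arrive. */
    static class ElementCounter extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            count++;
        }
    }

    /** Pushes the XML through a SAXResult and returns the number of elements seen. */
    static int countElements(String xml) {
        ElementCounter handler = new ElementCounter();
        try {
            // Identity transformer: a stand-in for whatever produces the result.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new SAXResult(handler));
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not process the XML", e);
        }
        return handler.count;
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```

In a real application the handler would do whatever the next stage needs (write to a database, feed another transformation, and so on) instead of counting.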

RE: about Large document - Added by Anonymous about 15 years ago

Legacy ID: #6976530 Legacy Poster: xu yang (xuyang5580136)

Thanks for your reply, but I can find only two methods in the XQJ interface XQSequence: one is getSequenceAsString, the other is getSequenceAsStream. When I process a large document as the input of an XQuery statement with the Saxon-SA jar file, I just get a Java heap out-of-memory error. What should I do? Does this mean that when the result of a query is too large, Saxon-SA can't process it either?

RE: about Large document - Added by Anonymous about 15 years ago

Legacy ID: #6980628 Legacy Poster: Michael Kay (mhkay)

How large is large? Some people when they say "large" mean 1Mb, some people mean 50Gb.

The API that you use for running the query makes very little difference to the space requirements. Out of memory conditions are usually caused by the size of the input file, not by the size of the output file. The output file can usually be streamed - that is, written to its destination without ever holding the whole file in memory.

The first thing to check is that you are allocating enough memory to the Java VM. Use the -Xmx settings on the command line. The default amount of memory allocated is very small. The amount of memory needed is typically around 3-5 times the size of the input file, so if your input is 200Mb, use -Xmx1024m to allocate a gigabyte, or more if it is available.

Saxon-SA allows some simple queries to be executed using streaming, where the input file is processed as it is read. This makes it possible to handle source documents of unlimited size. Streaming is never used automatically - you have to request it using the saxon:stream extension. See http://www.saxonica.com/documentation/sourcedocs/serial.html.

Another technique that can be useful for large documents is document projection, described at http://www.saxonica.com/documentation/sourcedocs/projection.html
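The memory rule of thumb in this reply (roughly 3-5 times the input size) can be turned into a quick check. The following is only a sketch: the class HeapEstimate and its methods are made up for the example, and the multiplier is the rough figure quoted above, not a guarantee.

```java
import java.io.File;

class HeapEstimate {

    private static final long MB = 1024L * 1024L;

    /** Rough heap estimate in megabytes: input size (rounded up) times the multiplier. */
    static long suggestedHeapMb(long inputBytes, int multiplier) {
        long inputMb = (inputBytes + MB - 1) / MB;
        return inputMb * multiplier;
    }

    /** Formats the estimate as a JVM command-line option. */
    static String xmxFlag(long inputBytes, int multiplier) {
        return "-Xmx" + suggestedHeapMb(inputBytes, multiplier) + "m";
    }

    public static void main(String[] args) {
        // Use the file given on the command line, or assume a 200 MB input.
        long inputBytes = args.length > 0 ? new File(args[0]).length() : 200L * MB;

        // The heap limit the current JVM was actually started with.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / MB;

        System.out.println("Current max heap: " + maxHeapMb + " MB");
        System.out.println("Suggested option: " + xmxFlag(inputBytes, 5));
    }
}
```

For a 200 MB input and a multiplier of 5 this suggests -Xmx1000m, in line with the -Xmx1024m mentioned above. Comparing it with the reported current maximum shows whether the JVM is still running with a small default heap.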
