get progress of executeQuery?
Added by Anonymous over 16 years ago
Legacy ID: #5052685 Legacy Poster: y10k (y10k)
I have a huge XQuery in the following format:

<item type="type1"> { for $x in doc('doc.xml')/..... } </item>
<item type="type2"> { for $x in doc('doc.xml')/..... } </item>
...
<item type="typeN"> { for $x in doc('doc.xml')/..... } </item>

I have a Java program that uses Saxon XQJ to run this XQuery against doc.xml, which is potentially huge (e.g. 100MB), so processing the whole query could take a while. Is there any way to report progress (such as how many items have been processed) while the query executes, without breaking it up into one query per item node? Thanks a lot!
Replies (5)
RE: get progress of executeQuery? - Added by Anonymous over 16 years ago
Legacy ID: #5053143 Legacy Poster: Michael Kay (mhkay)
It's quite likely that the elapsed time for a query like this will be dominated by the cost of parsing the input document and building its tree in memory. You can monitor when that starts and finishes by using a URIResolver that delegates to the system URIResolver. If you want finer-grained monitoring while the parsing is taking place, you could insert a SAX filter between the XML parser and the Saxon tree builder. Once the query is actually running, you could use a TraceListener to monitor progress, but this could make it take longer. A better approach might be to intercept the query output as it is produced. Michael Kay
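A minimal sketch of the SAX-filter idea mentioned above, using only JDK classes. The class name ParseProgressFilter, the reportEvery parameter and the countElements helper are made up for illustration; the filter counts element start events as they pass from the parser to whatever consumes them and prints a line every so often.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical helper: sits between the XML parser and the downstream
// consumer and reports how many elements have been parsed so far.
class ParseProgressFilter extends XMLFilterImpl {
    private final int reportEvery;   // assumption: report every N elements
    private long elementCount = 0;

    ParseProgressFilter(XMLReader parent, int reportEvery) {
        super(parent);
        this.reportEvery = reportEvery;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        elementCount++;
        if (elementCount % reportEvery == 0) {
            System.err.println("parsed " + elementCount + " elements");
        }
        // Pass the event on unchanged to the downstream handler, if any.
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endDocument() throws SAXException {
        System.err.println("parsing finished after " + elementCount + " elements");
        super.endDocument();
    }

    long getElementCount() {
        return elementCount;
    }

    // Small driver: parses a string through the filter and returns the count.
    static long countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();
        ParseProgressFilter filter = new ParseProgressFilter(parser, 2);
        filter.parse(new InputSource(new StringReader(xml)));
        return filter.getElementCount();
    }
}
```

To use something like this with a real document, the filter could be wrapped in a javax.xml.transform.sax.SAXSource together with the InputSource for doc.xml and supplied wherever the application hands the source document to the query processor. How that hookup looks in a particular Saxon/XQJ setup is not shown here and would need checking against the Saxon documentation.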
RE: get progress of executeQuery? - Added by Anonymous over 16 years ago
Legacy ID: #5053696 Legacy Poster: y10k (y10k)
Thanks, Michael. I will look into that. If I eventually have to break the big XQuery into small queries and call executeQuery N times to run them, are there any good ways to reduce the cost of building the tree for every query?
RE: get progress of executeQuery? - Added by Anonymous over 16 years ago
Legacy ID: #5053717 Legacy Poster: Michael Kay (mhkay)
>are there any good ways to reduce the cost of building the tree for every query?

Only the obvious ones really - keep them as small as possible. And don't parse and reserialize between steps - I've seen cases where that completely swamped the cost of the logic within each step.

Michael Kay
RE: get progress of executeQuery? - Added by Anonymous over 16 years ago
Legacy ID: #5053817 Legacy Poster: y10k (y10k)
Sorry, a couple more detailed questions about how the processing works...

1. Say I have two queries:

First query:
for $i in doc('doc.xml')//node where $i/honda return $i

Second query:
for $i in doc('doc.xml')//node where $i/toyota return $i

XQSequence rs1 = expr.executeQuery(query1);
XQSequence rs2 = expr.executeQuery(query2);
rs1.writeSequence(out1, prop);
rs2.writeSequence(out2, prop);

In this case, does Saxon parse doc.xml and build the tree twice? (If so, is there any way to "cache" the tree to reduce this cost?)

2. This time I have one single executeQuery(query), where the query is:

<items>
<item type="type1"> { for $i in doc('doc.xml')//node where $i/honda return $i } </item>
<item type="type2"> { for $i in doc('doc.xml')//node where $i/toyota return $i } </item>
</items>

XQSequence rs = expr.executeQuery(query);
rs.writeSequence(out, prop);

I thought executeQuery processes the query and then writeSequence simply writes the result to out. Michael, you mentioned "A better approach might be to intercept the query output as it is produced." Does this mean there is a way to generate the result while the query is being processed? Thanks again.
RE: get progress of executeQuery? - Added by Anonymous over 16 years ago
Legacy ID: #5053861 Legacy Poster: David Lee (daldei)
To answer #1 ... you can pre-parse the document, bind it to an external variable, and reuse that same document in both queries. If you're using s9api, then XQueryEvaluator.setExternalVariable() does this.
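The parse-once, query-many pattern can be sketched with plain JDK classes. Note that this deliberately swaps in the JDK's DOM and XPath APIs rather than Saxon's s9api or XQJ, so it only illustrates the principle; the class and method names are made up for the example. With s9api, the analogous step would be to build the document once and pass it to each query through an external variable, as described above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration: the document tree is built once and the same
// in-memory tree is handed to every query, instead of each query calling
// doc('doc.xml') and triggering its own parse.
class ParseOnceQueryTwice {

    // Parse the XML text into a tree exactly once.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Run one query against an already-built tree and count the matches.
    static int count(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return hits.getLength();
    }
}
```

Here the two queries from the question correspond roughly to count(doc, "//node[honda]") and count(doc, "//node[toyota]"), both evaluated against the same doc object.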