How to know which documents got into resulting sequence and which were filtered out

Added by Denis Sukhoroslov almost 6 years ago

I'm processing XML/JSON documents by running XQuery expressions on them. The documents are provided via a custom CollectionFinder class. The common processing scheme is:

    String query;
    StaticQueryContext sqc;
    DynamicQueryContext dqc;
    XQueryExpression xqe = sqc.compileQuery(query);
    SequenceIterator itr = xqe.iterator(dqc);
    // then iterate over the resulting sequence..

Is it possible to be notified, while iterating over the result, which documents were filtered out?

Thanks, Denis.


Replies (5)

RE: How to know which documents got into resulting sequence and which were filtered out - Added by Michael Kay almost 6 years ago

Firstly, I'd suggest you use the interface classes at the s9api level: XQueryCompiler, XQueryExecutable, XQueryEvaluator, because these are likely to be more stable over time than the lower-level interfaces such as StaticQueryContext.

But I'm having trouble understanding your question. I don't see any "filtering out" of any documents in this code. If your CollectionFinder is filtering out documents, then there's nothing Saxon can do about it (you could change your CollectionFinder to return some indicator of documents that it has filtered out). If the XQueryExpression xqe is returning no results (an empty sequence) in respect of some documents that it reads from the CollectionFinder, then you should change it so it returns something other than an empty sequence in respect of those documents.

A technique that is sometimes useful is to use the uri-collection() function rather than collection(). The query itself can then decide which URIs returned by the CollectionFinder should be passed to the XML parser and turned into document trees.

RE: How to know which documents got into resulting sequence and which were filtered out - Added by Denis Sukhoroslov almost 6 years ago

Hi Michael,

The “filtering” code is in the XQuery expression. Simple where comparisons are indeed handled by the custom CollectionFinder. But for a complex query with aggregates or limits it is not so easy, so the CollectionFinder provides just an initial rough sequence of documents, which is then processed further by the Saxon XQuery engine. So I’m looking for some kind of listener that could be notified when a document enters or exits the query processing chain. It should be possible to achieve this by adding a call to some custom function in the XQuery itself, but I’d rather not complicate all my queries with this functionality.

Regarding the s9api: thanks, I will have a look at it. But I’m not sure it will help, as I do use other low-level APIs: ModuleURIResolver, SourceURIResolver, ResourceCollection, custom Resources, etc.

Thanks, Denis.

RE: How to know which documents got into resulting sequence and which were filtered out - Added by Michael Kay almost 6 years ago

What do you actually want this "listener" to do? Is it producing output that finds its way into the query results?

XQuery being functional, it goes against the grain for a filtering operation to have side-effects.

RE: How to know which documents got into resulting sequence and which were filtered out - Added by Denis Sukhoroslov almost 6 years ago

No, the listener will not cause any side effects. It should track which documents participated in the query result. Another part of the system caches query results, so when a document has been updated I need to invalidate its cached results.
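
To make the caching requirement concrete, here is a minimal sketch of the bookkeeping such a listener would feed. It is not part of Saxon: the class DocumentDependencyIndex and its methods recordUse and invalidate are hypothetical names, and it assumes each cached result is identified by a string key and each document by its URI.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical reverse index: document URI -> keys of cached query results built from it. */
final class DocumentDependencyIndex {
    private final Map<String, Set<String>> queriesByDocument = new HashMap<>();

    /** Records that the cached result identified by queryKey was built from documentUri. */
    void recordUse(String queryKey, String documentUri) {
        queriesByDocument.computeIfAbsent(documentUri, k -> new HashSet<>()).add(queryKey);
    }

    /** Removes the entry for documentUri and returns the query keys whose cached results are now stale. */
    Set<String> invalidate(String documentUri) {
        Set<String> stale = queriesByDocument.remove(documentUri);
        return stale == null ? Collections.emptySet() : stale;
    }
}
```

When a document is updated, the caller would pass its URI to invalidate and evict the returned keys from the result cache. The sketch is not thread-safe, and it leaves a stale key registered under any other documents the same query used, which only costs a redundant eviction later.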

RE: How to know which documents got into resulting sequence and which were filtered out - Added by Denis Sukhoroslov almost 6 years ago

Maybe a custom CodeInjector would help? What do you think: is it possible to inject a call to some custom function that would track processed documents?
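
For illustration, the tracking function itself could be as small as an identity function that records what passes through it. The sketch below uses plain Java only: DocumentTracker, track and runFilter are hypothetical names, and the stream pipeline is merely a stand-in for a query, not Saxon code. Whether a CodeInjector can insert such a call is left open in this thread.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical pass-through tracker: returns its argument unchanged and remembers the URI. */
final class DocumentTracker {
    private final Set<String> seen = new LinkedHashSet<>();

    /** Identity function with a recording side effect. */
    String track(String documentUri) {
        seen.add(documentUri);
        return documentUri;
    }

    /** Returns a copy of the URIs recorded so far, in first-seen order. */
    Set<String> seen() {
        return new LinkedHashSet<>(seen);
    }

    /** Stand-in for a query: keeps URIs ending in ".xml" and tracks only the survivors. */
    static List<String> runFilter(List<String> candidates, DocumentTracker tracker) {
        return candidates.stream()
                .filter(uri -> uri.endsWith(".xml"))
                .map(tracker::track)
                .collect(Collectors.toList());
    }
}
```

Because track is called after the filter step, only documents that reach the result are recorded. In a real XQuery engine, lazy evaluation and optimization may change whether and when such a side-effecting call is evaluated, so this should be read as a sketch of the idea rather than a reliable mechanism.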
