Bug in Saxon .NET (maybe)

Added by Anonymous about 15 years ago

Legacy ID: #6836588 Legacy Poster: Vadim Arasev (varasev)

Hi, please excuse my bad English (I'm Russian). I have downloaded Saxon 9.1.0.6 and tried to use it to run the following XQuery from C#:

    for $offer in //offer,
        $category in //category,
        $price in round(number($offer/price))
    where $offer/categoryId = $category/@id
      and $category = 'Techno'
      and $price > 250
    order by $price descending
    return
      <offer>
        <price>{string($offer/price)}</price>
        <title>{string($offer/title)}</title>
      </offer>

The C# code was:

    Processor processor = new Processor();
    XmlTextReader reader = new XmlTextReader(
        new FileStream("file.xml", FileMode.Open, FileAccess.Read, FileShare.Read));
    reader.Normalization = true;
    XdmNode m_doc = processor.NewDocumentBuilder().Build(reader);
    reader.Close();

    XQueryCompiler m_compiler = processor.NewXQueryCompiler();
    MemoryStream ms = new MemoryStream();
    XQueryExecutable exp = m_compiler.Compile(strQuery); // strQuery contains the query above
    XQueryEvaluator eval = exp.Load();
    eval.ContextItem = m_doc;

    Serializer qout = new Serializer();
    qout.SetOutputProperty(Serializer.METHOD, "html");
    qout.SetOutputProperty(Serializer.INDENT, "no");
    qout.SetOutputStream(ms);
    eval.Run(qout); // never returns

The problem is that the last line of code never finishes. I use Visual Studio 2008. I tried Query.exe, and it works normally with the same file.xml and XQuery, but my code does not. The size of "file.xml" is 35 Mb.

I have this problem with the latest version of Saxon-B, 9.1.0.6. An old version (9.0.0.2) works fine with the same query and XML file.

Please help. Is this a bug, or am I doing something wrong? I can send you "file.rar" (~2.5 Mb) with "file.xml" inside if you need it for experiments.


Replies (9)

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6836787 Legacy Poster: Michael Kay (mhkay)

It sounds as if you had better send me the data file so I can see what's happening. You can send it to mike at saxonica.com.

I would expect a join query like this against a 35Mb input file to take quite a long time using Saxon-B, and to be much faster using Saxon-SA. How long did it take using 9.0.0.2?

Michael Kay
http://www.saxonica.com/

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6851558 Legacy Poster: Vadim Arasev (varasev)

I've sent you an e-mail with the data file.

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6852126 Legacy Poster: Michael Kay (mhkay)

Thanks for sending the file. I'm afraid I can't reproduce the problem. The query runs fine for me on all versions of Saxon, typically in 3.5 to 4 seconds on 9.1.0.6/Java, 4.5 seconds on 9.1.0.6/.net, and 5.5 seconds on 9.0.0.2/.net.

Can you tell me more about your operating environment? Does the problem occur consistently? Any idea of the state of the machine when it hangs, e.g. is the CPU busy, is it thrashing for memory?

Michael Kay
http://www.saxonica.com/

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6852336 Legacy Poster: Michael Kay (mhkay)

OK, I've now reproduced it by running via Saxon.Api. Very peculiar - it does seem to depend both on the actual query that's executed, and on the way in which it's run, which are normally quite independent of each other. It could prove a bit complex to debug this one.

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6855155 Legacy Poster: Vadim Arasev (varasev)

So, can you fix this?

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6855560 Legacy Poster: Michael Kay (mhkay)

>So, can you fix this?

The reason I gave you a progress report was to tell you that it's not going to be easy and you may need to be a little patient. At present I think I have reproduced the problem in an environment where I can step through the execution with a debugger, which is a big step forward. That took me all morning.

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6856904 Legacy Poster: Vadim Arasev (varasev)

Ok, thanks. It's not a time-critical problem for me. I'll wait.

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6857364 Legacy Poster: Michael Kay (mhkay)

OK, got it. It's not actually looping or hung, it's just taking a long time. Basically, you struck lucky with the optimizer with 9.0, and you didn't with 9.1. The Saxon-B optimizer is not supposed to optimize joins over a 35Mb file, but occasionally it succeeds anyway. You'll get much more robust optimization for such queries if you switch to Saxon-SA.

The actual difference between the command line and the API here is that the command line sets a flag saying that all data is untyped (i.e., not schema-validated), and the API front-end doesn't. It's more efficient in many cases to use schema-validated data, but if you're going to use untyped data, it helps when the optimizer knows this in advance. You can set the flag manually by changing your code to start:

    Processor processor = new Processor();
    ((net.sf.saxon.Configuration)processor.Implementation).setAllNodesUntyped(true);

You'll need to add references to saxon9.dll and IKVM.OpenJDK.ClassLibrary.dll when compiling your code.

Why does this flag make such a huge difference? Well, usually it doesn't. In this particular case, it just simplifies the code a tiny fraction, enough for the optimizer to spot a pattern that allows it to make a further optimization, that happens to be a big win. Unfortunately, that tends to be the way with optimizers - they are very sensitive to small variations in the input conditions. This means that even with this flag set, another very similar query might also run very slowly.
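To give a feel for why a join optimization can be such a big win, here is a minimal, self-contained C# sketch. It is only an illustration and not Saxon's actual implementation, which this thread does not describe. It evaluates the same filter as the query in two ways: a nested loop over every (offer, category) pair, as the query is literally written, and a single pass over the offers using a hash lookup on category id. The `Offer` and `Category` classes, the `JoinSketch` name and the sample data are hypothetical stand-ins for the elements in file.xml, and category ids are assumed to be unique.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the <offer> and <category> elements.
sealed class Offer
{
    public readonly string CategoryId;
    public readonly double Price;
    public readonly string Title;
    public Offer(string categoryId, double price, string title)
    {
        CategoryId = categoryId; Price = price; Title = title;
    }
}

sealed class Category
{
    public readonly string Id;
    public readonly string Name;
    public Category(string id, string name) { Id = id; Name = name; }
}

static class JoinSketch
{
    // XQuery round() rounds halves upwards, unlike the default Math.Round.
    static double Rounded(double price) { return Math.Floor(price + 0.5); }

    static void SortByPriceDescending(List<Offer> offers)
    {
        offers.Sort((a, b) => Rounded(b.Price).CompareTo(Rounded(a.Price)));
    }

    // Literal evaluation: tests offers.Count * categories.Count pairs.
    public static List<Offer> NestedLoop(List<Offer> offers, List<Category> categories)
    {
        var result = new List<Offer>();
        foreach (var o in offers)
            foreach (var c in categories)
                if (o.CategoryId == c.Id && c.Name == "Techno" && Rounded(o.Price) > 250)
                    result.Add(o);
        SortByPriceDescending(result);
        return result;
    }

    // Join via hash lookup: one pass over categories, then one pass over offers.
    // Assumes category ids are unique, so each offer matches at most once.
    public static List<Offer> HashLookup(List<Offer> offers, List<Category> categories)
    {
        var technoIds = new HashSet<string>();
        foreach (var c in categories)
            if (c.Name == "Techno")
                technoIds.Add(c.Id);

        var result = new List<Offer>();
        foreach (var o in offers)
            if (Rounded(o.Price) > 250 && technoIds.Contains(o.CategoryId))
                result.Add(o);
        SortByPriceDescending(result);
        return result;
    }

    public static string Titles(List<Offer> offers)
    {
        return string.Join(", ", offers.ConvertAll(o => o.Title));
    }
}

static class Program
{
    static void Main()
    {
        var categories = new List<Category> {
            new Category("1", "Techno"), new Category("2", "Books")
        };
        var offers = new List<Offer> {
            new Offer("1", 300.0, "Camera"), new Offer("1", 100.0, "Cable"),
            new Offer("2", 900.0, "Atlas"),  new Offer("1", 999.5, "Laptop")
        };
        Console.WriteLine(JoinSketch.Titles(JoinSketch.NestedLoop(offers, categories)));
        Console.WriteLine(JoinSketch.Titles(JoinSketch.HashLookup(offers, categories)));
    }
}
```

Both calls return the same offers in the same order. The difference is only in the amount of work: the nested loop grows with the product of the two element counts, while the hash lookup grows with their sum, which is why a large input can appear to hang when the optimizer fails to recognize the join.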

RE: Bug in Saxon .NET (maybe) - Added by Anonymous about 15 years ago

Legacy ID: #6936419 Legacy Poster: Vadim Arasev (varasev)

Thanks a lot! It works now.
