Benchmarking Saxon & .NET
Added by Anonymous over 18 years ago
Legacy ID: #3719571 Legacy Poster: Don Burden (donburden)
I've been trying to benchmark, translating a .NET XMLDocument with both the .NET 1.1 Transform method and the Saxon XsltTransformer. Using the cs samples as a guide, this line where InitialContextNode is set seems to take a huge amount of time compared to the actual translation:

    xslTrans.InitialContextNode = xslProc.NewDocumentBuilder().Build(xmlDoc);

Am I doing this right, or is there a faster way to do this?

Here is some code that simulates something I need to do. I'm declaring the translations globally, then loading XML documents in sequence, then applying the same translation to each.

Benchmarking .NET 1.1:

Defining globally:

    xmlDoc = new XmlDocument();
    XslTransform xsl = new XslTransform();
    // this is about a 20K XSL file, compatible with XSLT 2.0, no jscript or other extensions
    xsl.Load("Trans.xsl");
    // this is about a 10K XML string
    sXml = .....

.NET 1.1 benchmark routine:

    XmlReader xmlr;
    for (int i = 1; i <= 5000; i++) {
        // make believe each XML string is a unique XML document
        xmlDoc.LoadXml(sXml);
        xmlr = xsl.Transform(xmlDoc.CreateNavigator(), null);
        xmlDoc.Load(xmlr);
    }

Above takes 8 seconds. Throughput is 613 XML documents per second.

Benchmarking Saxon:

Defining globally:

    xmlDoc = new XmlDocument();
    xslProc = new Processor();
    // "..." is really full path to above xsl file
    xslTrans = xslProc.NewXsltCompiler().Compile(new Uri("...Trans.xsl")).Load();

Saxon benchmark routine:

    for (int i = 1; i <= 5000; i++) {
        // make believe each XML string is a unique XML document
        xmlDoc.LoadXml(sXml);
        DomDestination result = new DomDestination();
        xslTrans.InitialContextNode = xslProc.NewDocumentBuilder().Build(xmlDoc);
        xslTrans.Run(result);
        xmlDoc = result.XmlDocument;
    }

This takes 22 seconds. Throughput is only 227 XML documents per second.

InitialContextNode was detected as the problem by taking that line and moving it outside (above) the loop. Benchmark then runs without error, and the results are more in line with the .NET version, loop takes 7 seconds, and throughput goes up to 673 XML documents per second.

Does the .Build method really need to be called for each new XMLDocument loaded, or is there a faster and more efficient way to set InitialContextNode?
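One way to confirm where the time goes, without moving the line out of the loop, is to time the tree-building step and the transformation step separately. The following is only a rough sketch under stated assumptions: it uses System.Diagnostics.Stopwatch and File.ReadAllText, which need .NET 2.0 or later (not the 1.1 runtime mentioned above), and the class name and the two command-line arguments (stylesheet path, XML file path) are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml;
using Saxon.Api;

// Hypothetical harness: splits the Saxon loop into "build" and "run" timings.
class BuildVersusRunTiming
{
    static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("usage: BuildVersusRunTiming <stylesheet path> <xml file path>");
            return;
        }

        // Assumption: args[0] is the full path to the stylesheet, args[1] a sample XML file.
        Uri stylesheetUri = new Uri(Path.GetFullPath(args[0]));
        string sXml = File.ReadAllText(args[1]);

        Processor xslProc = new Processor();
        XsltTransformer xslTrans = xslProc.NewXsltCompiler().Compile(stylesheetUri).Load();
        DocumentBuilder builder = xslProc.NewDocumentBuilder(); // created once and reused

        XmlDocument xmlDoc = new XmlDocument();
        Stopwatch buildTime = new Stopwatch();
        Stopwatch runTime = new Stopwatch();

        for (int i = 1; i <= 5000; i++)
        {
            xmlDoc.LoadXml(sXml);

            // Time only the conversion of the .NET DOM into a Saxon tree.
            buildTime.Start();
            XdmNode source = builder.Build(xmlDoc);
            buildTime.Stop();

            // Time only the transformation itself.
            runTime.Start();
            DomDestination result = new DomDestination();
            xslTrans.InitialContextNode = source;
            xslTrans.Run(result);
            runTime.Stop();
        }

        Console.WriteLine("Build: {0} ms, Run: {1} ms",
            buildTime.ElapsedMilliseconds, runTime.ElapsedMilliseconds);
    }
}
```

If the build figure dominates, that points at the DOM-to-Saxon-tree conversion rather than the stylesheet execution.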
Replies (3)
RE: Benchmarking Saxon & .NET - Added by Anonymous over 18 years ago
Legacy ID: #3719719 Legacy Poster: Michael Kay (mhkay)
Thanks for this input. I've done no detailed performance studies of the .NET product - just some basic sanity checks - so any data points are useful.

What you're doing here is to build a DOM tree in memory:

    xmlDoc.LoadXml(sXml);

and then convert the DOM tree into a Saxon tree:

    initialContextNode = xslProc.NewDocumentBuilder().Build(xmlDoc);

It would be much better to build the Saxon tree directly from the raw XML input (the string is wrapped in a StringReader because XmlTextReader treats a bare string argument as a URL):

    initialContextNode = xslProc.NewDocumentBuilder().Build(new XmlTextReader(new StringReader(sXml)));

I'd be interested to see how that compares: it should be a lot better.

Saxon on Java has the ability to take a DOM directly as the input: it's still a lot less efficient than using a native Saxon tree, but doesn't incur the overhead of rebuilding the tree in memory, which is what's currently happening on the .NET product.

Michael Kay
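The suggestion above, building the Saxon tree straight from the XML text, might look like the following. This is a minimal sketch, not a tested recipe: the class name, the BuildSaxonTree helper and the sample string are hypothetical, and the XML string is wrapped in a StringReader because the XmlTextReader(string) constructor interprets its argument as a URL rather than as XML content.

```csharp
using System;
using System.IO;
using System.Xml;
using Saxon.Api;

class BuildFromString
{
    // Hypothetical helper: parses the XML text directly into a Saxon tree,
    // skipping the intermediate .NET XmlDocument.
    static XdmNode BuildSaxonTree(Processor processor, string xml)
    {
        DocumentBuilder builder = processor.NewDocumentBuilder();
        using (StringReader text = new StringReader(xml))
        {
            XmlReader reader = new XmlTextReader(text);
            return builder.Build(reader);
        }
    }

    static void Main()
    {
        Processor xslProc = new Processor();
        // Placeholder input; in the benchmark this would be the 10K sXml string.
        string sXml = "<doc><item>example</item></doc>";
        XdmNode initialContextNode = BuildSaxonTree(xslProc, sXml);
        Console.WriteLine(initialContextNode.NodeKind);
    }
}
```

The returned node can then be assigned to xslTrans.InitialContextNode in place of the result of Build(xmlDoc).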
RE: Benchmarking Saxon & .NET - Added by Anonymous over 18 years ago
Legacy ID: #3720402 Legacy Poster: Don Burden (donburden)
Can't really do it that way. The idea is that each time through the loop, an XmlDocument will get loaded from a string (into a .NET object), then a number of manipulations will be performed directly on the XmlDocument in C# (reading/setting attributes, loading/saving data from the database based on attributes set in the XML, etc). These manipulations all use the C# .NET XML classes, and represent a large amount of code that is already written. The XSL translation will be performed after that, then more C# manipulations. What's most important is to get the XSL translation part to execute as quickly as possible.
RE: Benchmarking Saxon & .NET - Added by Anonymous over 18 years ago
Legacy ID: #3720439 Legacy Poster: Michael Kay (mhkay)
In that case I'm afraid you'll have to take the performance hit until such time as there's a wrapper for the .NET DOM implemented in the same way as the similar code for the Java DOM. It shouldn't be difficult to do - the main effort is testing - and it doesn't have to be done by me - but until it's done, converting the DOM to a Saxon tree incurs a cost. How efficient the "wrapper" approach will prove to be is an open question. It depends on things like support for sorting nodes into document order.