Bug #4341

Performance degraded after moving from Saxon 8.9 to (use case CXML)

Added by Stefan Fritz over 1 year ago. Updated 3 months ago.

Start date:
Due date:
% Done:


Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:


Customer upgraded to the latest version of our product which uses Saxon EE. They came from an older version which used Saxon They process thousands of documents per day with hundreds of XSLs. For two use cases (see other Saxon issue reported for the second use case) they see dramatic performance decrease which impacts their business.

Below you will find test results from our product(Sonic), from Saxon commandline and also from a Java test utility that simulates how our product uses Saxon. The ZIP contains sonicTest_CXML.cmd and saxonTest_CXML.cmd which allow you to reproduce the tests that I did. Further it contains the source code of the SonicXSLCmdLine.

We have to stick to 9.8 for now as we still have Java 7 support. Therefore we would appreciate a patch release for 9.8 once the cause has been identified.

I sent you a download link for the ZIP with the reproducible to the support email address. The ZIP contains sensible customer data and therefore password protected. (please do not attach here!)

"CXML" use case summary:

With Saxon 8.9 as the baseline the processing time is 2x inside Sonic with Saxon Saxon commandline processing time is 3x. SaxonXSLCmdLine is showing similar results to Sonic product.

Test results:

Inside Sonic product: 17min = 1020 sec with Saxon 8.9 41min = 2460 sec with Saxon 9.8 34min = 2040 sec with Saxon 9.9

CXML XSL results form Saxon commandline: Saxon 763290 ms Saxon 813310 ms Saxon 2120170 ms Saxon 2021810 ms

SaxonXSLCmdLine RESULTS: XSL: EDW_Adapters\XSLs\CXMLTOEDW.xsl JVM arguments: -Dxsl.useTracing=false -Dsaxon.optimizationLevel=-c Saxon 2112930 ms Saxon 2016450 ms

SaxonXSLCmdLine RESULTS: XSL: EDW_Adapters\XSLs\CXMLTOEDW.xsl JVM arguments: -Dxsl.useTracing=false Saxon 2156000 ms Saxon 1959450 ms


#1 Updated by Stefan Fritz about 1 year ago

This is causing a huge business impact. Would appreciate an update or ETA.

Thanks Stefan

#2 Updated by Michael Kay about 1 year ago

I have completed a profiling run (with -TP:profile.html) using the large source file. This took 37m. The template with match="COLUMN" in CXMLTOEDW.xsl (line 232) was completely dominant. This template makes a number of calls on saxon:evaluate(), which are likely to be expensive. However, JProfiler monitoring (only run for a short time) did not suggest a lot of activity in the XPath parser, instead the focus was on evaluation of *:localname tests.

An obvious thing to try would be to replace the calls on saxon:evaluate() with use of xsl:evaluate, which caches compiled XPath expressions.

It's certainly plausible that XPath parsing might be slower than in earlier releases, as the XPath grammar has grown and as we attempt more optimization.

#3 Updated by Michael Kay about 1 year ago

Tracing the actual expressions executed using saxon:evaluate, we see a pattern like this (7 is the current document number), indicating that these are mainly fairly simple relative path expressions making heavy use of local name tests, which may account for the high occurrence of local name tests in the JProfiler output:

7 - *:CoverageRelationships/*:Groups/*:Group[*:Category/@*:ReferenceListCode = 'Employee Group'][*:Category/@*:SourceKey= 'RATEABLE' or *:Category/@*:MDMKey= 'RATEABLE']/*:Domestic/*:Count
7 - *:ClaimsMade/*:ExtendedReportingPeriod/*:Supplemental/*:Reason
7 - *:Details/*:PIP/*:ExcessPIPMedicalExpense/@*:Indicator
7 - *:Details/*:PIP/*:ExcessPIPWorkLoss/@*:Indicator
7 - *:Details/*:KidnapAndRansomExcludedCountries
7 - *:Details/*:PIP/*:ExcludeLossOfIncomeBenefits/@*:Indicator
7 - *:CoverageRelationships/*:Groups/*:Group[*:Category/@*:ReferenceListCode = 'Location Group'][*:Category/@*:SourceKey= 'TOTAL' or *:Category/@*:MDMKey= 'TOTAL']/*:Foreign/*:Count
7 - *:CoverageRelationships/*:Groups/*:Group[*:Category/@*:ReferenceListCode = 'Employee Group'][*:Category/@*:SourceKey= 'RATEABLE' or *:Category/@*:MDMKey= 'RATEABLE']/*:Foreign/*:Count
7 - *:Details/*:HartfordSteamBoilerCoverageCode
7 - *:Details/*:ISOCoverageCode
7 - *:Details/*:ISODeductibleCode
7 - *:Details/*:ISOLiabilityLimitCode
7 - *:ClaimsMade/*:OriginalEntryDate
7 - *:PremiumLevel
7 - *:Details/*:ProductPricingPlan
7 - *:QuotaShareParticipation/*:Category
7 - *:Details/*:ISORatingCode
7 - *:ClaimsMade/*:ReinstateAggregateLimit/@*:Indicator
7 - *:Reporting/*:Frequency
7 - *:ClaimsMade/*:ExtendedReportingPeriod/*:Supplemental/*:Duration/*:EndDate
7 - *:ClaimsMade/*:ExtendedReportingPeriod/*:Supplemental/*:Duration/*:Count
7 - *:ClaimsMade/*:ExtendedReportingPeriod/*:Supplemental/*:Duration/*:Unit
7 - /*:Transaction/@*:ID

#4 Updated by Michael Kay about 1 year ago

I put in some counters in saxon:evaluate. These show the following kind of pattern:

Calls 1000 exprs 102 nodes 23
Calls 2000 exprs 102 nodes 44
Calls 3000 exprs 102 nodes 65
Calls 4000 exprs 102 nodes 86
Calls 5000 exprs 102 nodes 108
Calls 6000 exprs 102 nodes 129
Calls 7000 exprs 102 nodes 150
Calls 8000 exprs 102 nodes 171
Calls 9000 exprs 102 nodes 193
Calls 10000 exprs 102 nodes 214

meaning that after 10,000 calls on saxon:evaluate(), we've seen 102 distinct XPath expressions, with 214 distinct nodes used as the context node. the same pattern continues: after 200,000 calls we've still only seen 102 distinct XPath expressions (applied to 4257 distinct context nodes). This would certainly suggest that caching compiled expressions is going to be useful. (Though the absence of XPath parsing in the JProfiler output remains puzzling).

#5 Updated by Michael Kay about 1 year ago

We will need to be careful with caching because some of the dynamic XPath expressions call the function fns:GetID(), in module Eagle.xsl, and this function uses <saxon:assign> to update a global variable.

#6 Updated by Michael Kay about 1 year ago

I've changed the calls on saxon:evaluate to use xsl:evaluate, to take advantage of caching.

In the first phase of processing the calls come thick and fast and we get a high hit rate on the cache (here it's 100%):

calls 340000 exprs 270 cache 100 hits 339723
calls 341000 exprs 270 cache 100 hits 340723
calls 342000 exprs 270 cache 100 hits 341723
calls 343000 exprs 270 cache 100 hits 342723
calls 344000 exprs 270 cache 100 hits 343723
calls 345000 exprs 270 cache 100 hits 344723

But at around this point the pattern changes, and we start to see the number of distinct expressions steadily increasing, and the hit rate dropping to 25% or so:

calls 346000 exprs 270 cache 100 hits 345723
calls 347000 exprs 453 cache 100 hits 346540
calls 348000 exprs 1222 cache 100 hits 346771
calls 349000 exprs 1992 cache 100 hits 347001
calls 350000 exprs 2760 cache 100 hits 347233
calls 351000 exprs 3531 cache 100 hits 347462

At the same time the time interval between messages increases substantially, suggesting that calls on evaluate() are no longer quite as dominant.

Looking more closely, the XPath expressions now being evaluated are of the form:

//*:Coverage[@*:ID='941b5203-07a2-4d62-e053-1428330a829a.1']/*:AdditionalAttributes/*:Attribute[*:Name[. = 'Coverage_Family']]/*:Value

That is, they are absolute path expressions. I thought that the low hit rate might simply be because the cache is too small, but looking more carefully, of every 1000 calls to evaluate, about 750 of the expressions are new. What's happening is that they cycle: the @*:ID value changes, and then the same subordinate paths are evaluated for a new ID. There are a couple of ways we could tackle this. First, we could make the ID value a parameter, so the cached expression becomes //*:Coverage[@*:ID=$p]/*:CoverageText and we evaluate it with different values of $p. Secondly, we could evaluate //*:Coverage[@*:ID='941b5203-07a2-4d62-e053-1428330a829a.1']/ outside the saxon:evaluate() call, and use the result as the context item for evaluating the relative path *:BulkCoding/@*:Indicator (etc). Evaluating the common subexpression once might end up giving as big a saving as the caching itself; in fact it's entirely possible that it's this repeated whole-document searching for Coverage elements by ID that accounts for most of the cost. Certainly that would be consistent with the JProfiler observations.

#7 Updated by Michael Kay about 1 year ago

By adding comments to the start of the dynamic XPath expressions, I have established that the //*:Coverage expressions originate from line 277 in the XSLT source:

<xsl:for-each select="fns:evaluate(.,'(:D:)' || concat('//*:',$currentMapping/SrcChild,'[@*:ID=',$getID,']/',$currentMapping/SrcDataPath))">

Note that the paths we are evaluating appear to come from the data files MappingFile.xml or MappingFile_Enterprise.xml. The fact that we evaluate a sequence of paths having the same values for $currentMapping/SrcChild and $getID appears to be an accident of the data file, not something the code can rely on.

I'm thinking that (A) we could replace the


part of this by code to generate

fns:findID(., 'Coverage', $getID)/

where fns:findID uses a key to locate the required element.

and then (B) we could make the two variable parts here ("Coverage" and $getID) into xsl:evaluate parameters in order to improve the caching.

#8 Updated by Michael Kay about 1 year ago

  • Category set to Performance
  • Assignee set to Michael Kay

With these changes, the execution time (on the large source document) is reduced from 37m to 21s.

Specifically, we've made three changes.

(a) We've switched from saxon:evaluate to xsl:evaluate to take advantage of compiled expression caching.

(b) In one particular case, we've used a parameterized expression where the variable part is supplied using with-param, in order to reduce the number of different XPath expressions and thus benefit further from caching of compiled expressions

(c) Rather than generate an expression starting //*:XXX[ID='1234'], we've generated one that starts fns:findById(., 'XXX', '1234'), where fns:findById is a stylesheet function that uses key() to find the relevant element.

I haven't tried to quantify which of these changes contributes most to the improvement.

I haven't attempted to investigate the performance regression reported in the original bug report. I suspect it's caused by increased XPath compile time, caused by increased complexity in the XPath grammar in XPath 3.1 and/or more vigorous optimization attempts in Saxon.

Oh, and I haven't actually verified that the revised code is generating the correct output.

#9 Updated by Michael Kay 3 months ago

  • Status changed from New to Closed
  • Priority changed from High to Normal

Closing this old bug report with no further action.

We never got to the bottom of why there was a performance regression in Saxon, but we did identify changes to the user stylesheet code that gave a couple of orders of magnitude performance improvement.

Please register to edit this issue

Also available in: Atom PDF