Project

Profile

Help

Bug #4554

Extremely slow grammar loading depending on ordering.

Added by Chris Simmons 3 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Performance
Sprint/Milestone:
-
Start date:
2020-05-14
Due date:
% Done:

0%

Estimated time:
Legacy ID:
Applies to branch:
10
Fix Committed on Branch:
10
Fixed in Maintenance Release:

Description

My company, CoreFiling, are currently using Saxon EE 9.7.0.7.

We've discovered a performance problem which seems to stem from the order in which grammars are loaded. The difference is stark, on my machine its ~5s vs ~90s. I've attached some code and lots of schemas demonstrating the problem.

TestSchemaLoading loads the same grammars in three different ways. If you diff filing vs filing-copy you'll see we've added a single import. Loading 'filing-copy' is fast whereas loading 'filing' then loading a second namespace is slow. If you reverse the order of the imports then its fast again.

We may have worked around this by producing a synthetic schema that imports all the schema grammars that we want to load. I'm not sure however whether it really works around the problem since I don't know what's really going wrong under the hood.

To run it you'll need to add a ActivateSaxon.java and put saxon ee jar and junit 4 on the classpath.

saxon-schema-bug.zip (13.7 MB) saxon-schema-bug.zip Chris Simmons, 2020-05-14 09:15

History

#1 Updated by Michael Kay 3 months ago

Running this from the command line with Saxon 10, I'm seeing

Loading schema document http://www.esma.europa.eu/taxonomy/2019-03-27/esef_cor.xsd
Error 
   Error reported by XML parser: Premature end of file.
Error 
   Error reported by XML parser processing
  http://www.esma.europa.eu/taxonomy/2019-03-27/esef_cor.xsd: Premature end of file.:
  Premature end of file.

I can't see why that should be - the file looks perfectly OK in the browser.

I get the same error when loading from filing-copy; and I get the same errors under 9.7.

So I guess I'll have to run it from the Java code (which sets a SchemaURIResolver)

#2 Updated by Michael Kay 3 months ago

  • Status changed from New to In Progress

Reproduced under Saxon 10: the fast case takes 3.116s, the slow case 32.956s.

Note that the slow case produces warnings:

The schema document at http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on
The schema document at http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on
The schema document at http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on
The schema document at http://www.xbrl.org/2005/xbrldt-2005.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on

I suspect we're going to find that there are schema namespaces that are mapped to multiple schema locations depending on the import hierarchy. By default we're only going to load the first schema document imported for each namespace, and if that happens to be one with a slow network connection, it's just bad luck. I'm going to look out to see if any schema documents are fetched from the W3C site, because they are automatically subject to a 30s delay, which corresponds closely to the difference between the fast and slow cases.

#3 Updated by Michael Kay 3 months ago

The "fast" case gives us this schema loading trace:
Loading schema document filing-copy/taxonomy/ext.xsd
Loading schema document http://xbrl.ifrs.org/taxonomy/2019-03-27/full_ifrs/full_ifrs-cor_2019-03-27.xsd
Loading schema document http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd
Loading schema document http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd
Loading schema document http://www.xbrl.org/2003/xl-2003-12-31.xsd
Loading schema document http://www.xbrl.org/2003/xlink-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xlink-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xl-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd
Loading schema document http://www.xbrl.org/dtr/type/nonNumeric-2009-12-16.xsd
Finished loading schema document http://www.xbrl.org/dtr/type/nonNumeric-2009-12-16.xsd
Loading schema document http://www.xbrl.org/dtr/type/numeric-2009-12-16.xsd
Finished loading schema document http://www.xbrl.org/dtr/type/numeric-2009-12-16.xsd
Loading schema document http://www.xbrl.org/2005/xbrldt-2005.xsd
Finished loading schema document http://www.xbrl.org/2005/xbrldt-2005.xsd
Finished loading schema document http://xbrl.ifrs.org/taxonomy/2019-03-27/full_ifrs/full_ifrs-cor_2019-03-27.xsd
Loading schema document http://www.xbrl.org/lrr/arcrole/esma-arcrole-2018-11-21.xsd
Finished loading schema document http://www.xbrl.org/lrr/arcrole/esma-arcrole-2018-11-21.xsd
Loading schema document http://www.esma.europa.eu/taxonomy/2019-03-27/esef_cor.xsd
Loading schema document http://www.esma.europa.eu/taxonomy/ext/technical.xsd
Finished loading schema document http://www.esma.europa.eu/taxonomy/ext/technical.xsd
Loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei-required.xsd
Loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei.xsd
Finished loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei.xsd
Finished loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei-required.xsd
Finished loading schema document http://www.esma.europa.eu/taxonomy/2019-03-27/esef_cor.xsd
URIResolver.resolve href="ext-roles.xsd" base="file:///Users/mike/bugs/2020/4554-Simmons/saxon-schema-bug/filing-copy/taxonomy/ext.xsd"
Loading schema document file:/Users/mike/bugs/2020/4554-Simmons/saxon-schema-bug/filing-copy/taxonomy/ext-roles.xsd
Finished loading schema document file:/Users/mike/bugs/2020/4554-Simmons/saxon-schema-bug/filing-copy/taxonomy/ext-roles.xsd
Finished loading schema document filing-copy/taxonomy/ext.xsd
Time in ms: 3137

The slow case gives us:

Loading schema document filing/taxonomy/ext.xsd
Loading schema document http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd
Loading schema document http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd
Loading schema document http://www.xbrl.org/2003/xl-2003-12-31.xsd
Loading schema document http://www.xbrl.org/2003/xlink-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xlink-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xl-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd
Finished loading schema document http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd
Loading schema document http://www.xbrl.org/lrr/arcrole/esma-arcrole-2018-11-21.xsd
Finished loading schema document http://www.xbrl.org/lrr/arcrole/esma-arcrole-2018-11-21.xsd
Loading schema document http://www.esma.europa.eu/taxonomy/2019-03-27/esef_cor.xsd
Loading schema document http://www.esma.europa.eu/taxonomy/ext/technical.xsd
Loading schema document http://www.xbrl.org/2005/xbrldt-2005.xsd
Finished loading schema document http://www.xbrl.org/2005/xbrldt-2005.xsd
Finished loading schema document http://www.esma.europa.eu/taxonomy/ext/technical.xsd
Loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei-required.xsd
Loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei.xsd
Finished loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei.xsd
Finished loading schema document http://www.xbrl.org/taxonomy/int/lei/CR/2018-11-01/lei-required.xsd
Finished loading schema document http://www.esma.europa.eu/taxonomy/2019-03-27/esef_cor.xsd
URIResolver.resolve href="ext-roles.xsd" base="file:///Users/mike/bugs/2020/4554-Simmons/saxon-schema-bug/filing/taxonomy/ext.xsd"
Loading schema document file:/Users/mike/bugs/2020/4554-Simmons/saxon-schema-bug/filing/taxonomy/ext-roles.xsd
Finished loading schema document file:/Users/mike/bugs/2020/4554-Simmons/saxon-schema-bug/filing/taxonomy/ext-roles.xsd
Finished loading schema document filing/taxonomy/ext.xsd
Loading schema document http://xbrl.ifrs.org/taxonomy/2019-03-27/full_ifrs/full_ifrs-cor_2019-03-27.xsd
The schema document at http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on
Loading schema document http://www.xbrl.org/dtr/type/nonNumeric-2009-12-16.xsd
The schema document at http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on
Finished loading schema document http://www.xbrl.org/dtr/type/nonNumeric-2009-12-16.xsd
Loading schema document http://www.xbrl.org/dtr/type/numeric-2009-12-16.xsd
The schema document at http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on
Finished loading schema document http://www.xbrl.org/dtr/type/numeric-2009-12-16.xsd
The schema document at http://www.xbrl.org/2005/xbrldt-2005.xsd is not being read because schema components for this namespace are already available. To force the schema document to be read, set --multipleSchemaImports:on
Finished loading schema document http://xbrl.ifrs.org/taxonomy/2019-03-27/full_ifrs/full_ifrs-cor_2019-03-27.xsd
Time in ms: 33718

I added timing information to this trace, and the excess time does not appear to be spent between the "Loading" and "Finished loading" messages.

#4 Updated by Michael Kay 3 months ago

Java profiling shows that in the slow case, we are spending a lot of time in the code that determinizes the finite state machine for some complex type.

Switching on some (long commented out!) instrumentation, the timings for determinization in the fast case (in microseconds) are:

Time to determinize complex type Q{http://www.xbrl.org/2003/instance}fractionItemType - 521
Time to determinize complex type Q{http://www.xbrl.org/2003/XLink}locatorType - 32
Time to determinize complex type Q{http://www.xbrl.org/2003/XLink}arcType - 25
Time to determinize complex type Q{http://www.xbrl.org/2003/XLink}extendedType - 755
Time to determinize complex type Q{http://www.xbrl.org/2003/instance}contextPeriodType - 39
Time to determinize complex type of element segment - 43
Time to determinize complex type Q{http://www.xbrl.org/2003/instance}contextEntityType - 28
Time to determinize complex type Q{http://www.xbrl.org/2003/instance}contextScenarioType - 25
Time to determinize complex type Q{http://www.xbrl.org/2003/instance}measuresType - 64
Time to determinize complex type of element context - 28
Time to determinize complex type of element divide - 34
Time to determinize complex type of element unit - 49
Time to determinize complex type of element footnote - 19
Time to determinize complex type of element footnoteLink - 125
Time to determinize complex type of element xbrl - 33001
Time to determinize complex type of element label - 81
Time to determinize complex type of element calculationArc - 34
Time to determinize complex type of element calculationLink - 70
Time to determinize complex type of element linkbase - 245
Time to determinize complex type of element reference - 15
Time to determinize complex type of element roleType - 23
Time to determinize complex type of element arcroleType - 24
Time to determinize complex type of element referenceLink - 130
Time to determinize complex type of element labelLink - 61
Time to determinize complex type of element definitionLink - 26
Time to determinize complex type of element presentationArc - 21
Time to determinize complex type of element presentationLink - 53

but in the slow case we see this:

Time to determinize complex type of element xbrl - 2234
Time to determinize complex type of element xbrl - 2099
Time to determinize complex type of element xbrl - 2253
Time to determinize complex type of element xbrl - 2146
Time to determinize complex type of element xbrl - 2229
Time to determinize complex type of element xbrl - 2040
Time to determinize complex type of element xbrl - 2438
Time to determinize complex type of element xbrl - 2110
... etc etc

That is, we are determinizing the type repeatedly, many many times, and moreover the cost is gradually increasing each time.

Note that in the fast case, determinizing this type was expensive, but we only did it once.

I would speculate that we are incrementally extending the type, and recompiling it each time.

#5 Updated by Chris Simmons 3 months ago

I was wondering whether it was something like that, perhaps an API where we can submit multiple schemas to load at the same time so it can defer this work until its loaded all the schema documents.

#6 Updated by Michael Kay 3 months ago

Running in the debugger confirms that the xbrl type is recompiled repeatedly as new members are added to the substitution group of the xbrl element.

There's a comment in the code:

// TODO: delay this until the compiled type definition is actually needed, to avoid repeated compilations.

So it looks like we've been here before (many years ago - the TODO is in the source code for Saxon 9.2, which is the oldest one I have handy) but it's never created quite such a pathological problem as in this case.

I'll think about how to improve the design, but we're talking about a design change here so it will require considerable care, and if we do find a fix it will almost certainly be on the 10.0 branch only -- we avoid performance fixes in stable releases because of the risk of introducing new bugs.

#7 Updated by Michael Kay 3 months ago

  • Category set to Performance
  • Assignee set to Michael Kay

I think the main difficulty with deferring the creation of the FSM until the end of the compilation episode is that we use the FSM to check whether one complex type is a valid restriction of another. So we would have to defer this checking as well. But the logic that does this checking is intertwined with the logic that (for example) expands attribute groups based on definitions in the base type.

This looks like a fairly substantial project, so after this initial investigation, I'm going to prioritise it after some other current activities.

Notes: I attempted to change calls to UserComplexType.compile() to add the type to a list maintained by the SchemaCompiler, and to process this list at the end of EnterpriseConfiguration.addSchemaSource(). I also attempted to defer the calls on UserComplexType.isValidRestriction() to occur after all the types were compiled. This is currently failing saying that a type is an invalid restriction because it has an attribute wildcard and the base type does not. Shelved these changes for now.

#8 Updated by Michael Kay 3 months ago

I've made another attempt to implement this - deferring building the finite state machines and checking subsumption until the end. Currently getting 260 failures in the XSD test suite, which I now need to investigate.

#9 Updated by Michael Kay 3 months ago

All W3C XSD tests now passing. (There were two main issues. Firstly, the subsumption check wasn't taking place in the case of schemas loaded using xsi:schemaLocation. Secondly, it was being conducted on the finite state machines prior to determinization.)

Now need to run API tests, Saxon extension tests, etc.

In the test case submitted with the bug, the "fast" case is now running in 2175ms, the "slow" case in 5063ms.

#10 Updated by Michael Kay 3 months ago

  • Status changed from In Progress to Resolved
  • Applies to branch 10 added
  • Applies to branch deleted (9.7)
  • Fix Committed on Branch 10 added

No further problems found. Committing the changes on the 10 branch (only).

#11 Updated by Chris Simmons 3 months ago

Excellent thank you very much. I'll raise an issue on our end to upgrade our Saxon version.

#12 Updated by Michael Kay 3 months ago

  • Status changed from Resolved to In Progress

Reopening; the change appears to cause regression in one QT3 test

-s:prod-SchemaImport -t:qischema083

#13 Updated by Michael Kay 3 months ago

  • Status changed from In Progress to Resolved

Fixed this test case. The path Configuration.readMultipleSchemas(), used when an XQuery has multiple import schema declarations, had not been updated to perform the completion actions at the end of schema compilation.

Please register to edit this issue

Also available in: Atom PDF