Support #6547
Schema loading performance regressions
Description
Loading schemas in Saxon takes a lot of time, and it seems to get slower with each version (I am currently migrating from 9.9 to 12.1). I took some performance measurements on 12.1 and would like to share my findings.
The following method consumes a lot of CPU time: com.saxonica.ee.schema.ElementDecl.addSubstitutionGroupMember. The cost is specifically caused by the following:
    this.substitutionGroupMembers.add(member);
    if (this.substitutionGroupMembers.size() > 10) {
        this.substitutionGroupMembers = new HashSet(this.substitutionGroupMembers);
    }
I fail to see why the HashSet must be replaced by a new HashSet every time an item is added once the set size exceeds 10. HashSet#iterate, HashSet#putVal and ElementDecl#hashCode are quite costly. Is there any particular reason for this copying? I would be very happy if this behaviour could be changed.
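To illustrate why this pattern is costly, here is a minimal sketch that counts hashCode() calls when the set is re-copied on every add above the threshold, compared with plain adds. CountingMember and SubstitutionGroupDemo are hypothetical stand-ins, not Saxon classes, and the sketch assumes the collection is already a HashSet when the copy happens.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a schema component; counts hashCode() calls. */
final class CountingMember {
    static long hashCalls = 0;
    private final int id;

    CountingMember(int id) { this.id = id; }

    @Override public int hashCode() { hashCalls++; return Integer.hashCode(id); }

    @Override public boolean equals(Object o) {
        return o instanceof CountingMember && ((CountingMember) o).id == id;
    }
}

final class SubstitutionGroupDemo {
    /** Pattern from the report: copy the whole set after every add once size > 10. */
    static long addWithCopy(int n) {
        CountingMember.hashCalls = 0;
        Set<CountingMember> members = new HashSet<>();
        for (int i = 0; i < n; i++) {
            members.add(new CountingMember(i));
            if (members.size() > 10) {
                members = new HashSet<>(members); // rehashes every existing element
            }
        }
        return CountingMember.hashCalls;
    }

    /** Alternative: keep adding to the same set, no copying. */
    static long addWithoutCopy(int n) {
        CountingMember.hashCalls = 0;
        Set<CountingMember> members = new HashSet<>();
        for (int i = 0; i < n; i++) {
            members.add(new CountingMember(i));
        }
        return CountingMember.hashCalls;
    }

    public static void main(String[] args) {
        System.out.println("with copy:    " + addWithCopy(1000));
        System.out.println("without copy: " + addWithoutCopy(1000));
    }
}
```

With copying, the number of hash computations grows quadratically with the number of members, which would match the profile above for large substitution groups.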
Other performance hotspots:
EnterpriseConfiguration.getNamespaceStatus (also called from ElementDecl.addSubstitutionGroupMember via config.isSealedNamespace). Here, I suspect that hashCode() on NamespaceUri is not particularly performant. Perhaps implementing it explicitly on that class, e.g. by forwarding to the underlying String, could help.
net.sf.saxon.tree.util.Navigator.getPath (via SchemaReader.read -> SchemaElement.processAllAttributes -> XSDElement.prepareAttributes). Specifically, XSDElement.prepareAttributes tends to call Navigator.getPath three times: by calling new ElementDecl, by calling new TypedReference, and directly as the argument for this.element.setGeneratedId. It would be nice if this expensive path for 'this' (being used as locator) were reused, and not recomputed inside the constructors of ElementDecl and TypedReference (via an overloaded constructor maybe?).
AutomationState.getSpecificTransition, where 62% of the time is spent in Edge.getTerm(). This is strange, as it is just a simple getter.
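As a sketch of the hashCode() suggestion for the namespace lookup above: a key class that forwards hashCode() to its underlying String benefits from the hash caching that java.lang.String already does. NamespaceKey is a hypothetical class for illustration only; I do not know how NamespaceUri is actually implemented in Saxon.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical namespace key; NOT Saxon's NamespaceUri. */
final class NamespaceKey {
    private final String uri;

    NamespaceKey(String uri) { this.uri = Objects.requireNonNull(uri); }

    /** Forward to String.hashCode(), which caches its result after the first call. */
    @Override public int hashCode() { return uri.hashCode(); }

    @Override public boolean equals(Object o) {
        return o instanceof NamespaceKey && ((NamespaceKey) o).uri.equals(uri);
    }

    /** Small usage example: look up a status by an equal but distinct key. */
    static String lookupDemo() {
        Map<NamespaceKey, String> status = new HashMap<>();
        status.put(new NamespaceKey("http://example.com/ns"), "sealed");
        return status.get(new NamespaceKey("http://example.com/ns"));
    }
}
```

Whether this helps in practice depends on what the current hashCode() does, which only a look at the real class can tell.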
Some numbers, for a program that spends 490.000 ms (about 490 s) in SchemaCompiler.readSchema:
- 281.000 ms in ElementDecl.lookForCycles (of which 93.000 ms in EnterpriseConfiguration.getNamespaceStatus and 140.000 ms copying HashSets)
- 195.000 ms in XSDElement.processAllAttributes (roughly 3 x 63.000 ms, one for each repeated Navigator.getPath call)
I believe the total could easily be brought under 100 s with some optimizations.
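The repeated Navigator.getPath cost could be avoided along the following lines: compute the path once and hand it to overloaded constructors. All classes here (DemoNode, DemoDecl, DemoReference, PathReuseDemo) are hypothetical stand-ins; the real Saxon constructors have different signatures, and the overloads are an assumption about what could be added.

```java
/** Hypothetical node whose path is expensive to compute; counts computations. */
final class DemoNode {
    static int pathComputations = 0;
    private final String name;

    DemoNode(String name) { this.name = name; }

    String computePath() { pathComputations++; return "/schema/" + name; }
}

/** Hypothetical declaration with an extra constructor taking a precomputed path. */
final class DemoDecl {
    final String path;
    DemoDecl(DemoNode locator) { this(locator.computePath()); }
    DemoDecl(String precomputedPath) { this.path = precomputedPath; }
}

/** Hypothetical reference with the same kind of overload. */
final class DemoReference {
    final String path;
    DemoReference(DemoNode locator) { this(locator.computePath()); }
    DemoReference(String precomputedPath) { this.path = precomputedPath; }
}

final class PathReuseDemo {
    /** Current shape as described: each consumer recomputes the path. */
    static int recomputeEachTime(DemoNode node) {
        DemoNode.pathComputations = 0;
        new DemoDecl(node);
        new DemoReference(node);
        String generatedId = node.computePath();
        return DemoNode.pathComputations;
    }

    /** Suggested shape: compute once, pass the result to the overloads. */
    static int computeOnce(DemoNode node) {
        DemoNode.pathComputations = 0;
        String path = node.computePath();
        new DemoDecl(path);
        new DemoReference(path);
        String generatedId = path;
        return DemoNode.pathComputations;
    }
}
```

In this sketch the first variant computes the path three times and the second only once, which corresponds to the roughly 3x cost measured above.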
Note that this is not a synthetic test: this involves reading 30 schemas, each consisting of dozens of schema files, all of which are used in production.
Also note that I applied a 'hack' to disable calling SchemaCompiler.makeAllAutomata() after each and every file (by subclassing SchemaCompiler and overriding that method to a no-op). Without this, the processing time for loading all schema files becomes extreme (the list scheduledForAutomaton seems to only ever grow and is never reset). Even so, calling makeAllAutomata() once for each schema (after reading all of its schema files individually) takes another 224.000 ms due to getSpecificTransition, on top of the aforementioned 490.000 ms.
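The effect of deferring that step can be modelled with a small sketch. DeferredBuildDemo is a hypothetical model, not Saxon code: it assumes each file schedules a fixed number of declarations and that every build pass walks the whole, never-cleared list, as observed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a compiler with a pending list that is never cleared. */
final class DeferredBuildDemo {
    private final List<String> scheduled = new ArrayList<>();
    private long itemsProcessed = 0;

    /** Reading a file schedules some declarations for later processing. */
    void readFile(String file, int declsPerFile) {
        for (int i = 0; i < declsPerFile; i++) {
            scheduled.add(file + "#" + i);
        }
    }

    /** A build pass visits every scheduled item; the list keeps growing. */
    void buildAll() {
        itemsProcessed += scheduled.size();
    }

    /** Build after every file: total work grows quadratically with file count. */
    static long eager(int files, int declsPerFile) {
        DeferredBuildDemo c = new DeferredBuildDemo();
        for (int f = 0; f < files; f++) {
            c.readFile("file" + f + ".xsd", declsPerFile);
            c.buildAll();
        }
        return c.itemsProcessed;
    }

    /** Build once after all files: total work is linear. */
    static long deferred(int files, int declsPerFile) {
        DeferredBuildDemo c = new DeferredBuildDemo();
        for (int f = 0; f < files; f++) {
            c.readFile("file" + f + ".xsd", declsPerFile);
        }
        c.buildAll();
        return c.itemsProcessed;
    }
}
```

For 30 files with 10 declarations each, this model visits 4650 items when building after every file and 300 when building once at the end.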
It does not help to use SchemaCompiler.readMultipleSchemas for two reasons:
- it forwards to SchemaReader.getSources, which expects targetNamespace not to be null. I pass null as there is no particular expected namespace. readSchema() handles this null properly.
- it assumes all schemas contribute to the same targetNamespace, which is not the case, so things break
In the past we used setDeferredValidationMode, which helped a lot, until it was deprecated and eventually removed. For this reason we stayed on 9.9 for a long time.
Are any of these issues fixed in versions after 12.1? Is there a chance (some of) these issues could be addressed in a next release? Are there other workarounds I could try to speed up processing?
Kind regards,
Johan Walters