Bug #4982: Loading xml schemas which are stored inside a zip archive is very slow compared to Xerces - Saxon - Saxonica Developer Community

#1

Updated by Michael Kay over 3 years ago

Thanks, we'll take a look at this.

From a very quick first glance, my immediate reactions are:

(a) do you really need to set the MULTIPLE_SCHEMA_IMPORTS option? Because this is going to do what it says: read the same schema document multiple times. You should only need it if you have several schema documents with the same target namespace, and that's not really good practice.

(b) there are a number of instances of maxOccurs="99", or "999", or even "9999". The classic algorithm for building a finite state machine with such rules is very expensive (it's exponential in both time and space). Saxon tries to optimise it when it can by using counters, but it's not always possible and I will check to see how these cases are being handled. (Xerces has the same problem, and I think that it sometimes gives up and treats the constraint as if it were maxOccurs="unbounded"). If we do find a problem here, the best solution might be to replace the maxOccurs with an xs:assert.
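
To make the cost in point (b) concrete, here is a rough Java sketch. It is not Saxon's or Xerces's actual algorithm, and the class and method names (`OccurrenceCost`, `Particle`, `unrolledPositions`, `counterPositions`) are hypothetical. It only counts how many element positions a content model needs when every bounded repetition is copied out in full, compared with keeping one position per particle and tracking repetitions with a counter. It does not build or determinize an automaton, so it understates the real cost of the classic construction.

```java
import java.util.List;

public class OccurrenceCost {

    /** Hypothetical content-model particle: an element (no children) or a sequence of particles. */
    static final class Particle {
        final String name;
        final int maxOccurs;
        final List<Particle> children;

        Particle(String name, int maxOccurs, List<Particle> children) {
            this.name = name;
            this.maxOccurs = maxOccurs;
            this.children = children;
        }

        static Particle element(String name, int maxOccurs) {
            return new Particle(name, maxOccurs, List.of());
        }

        static Particle sequence(int maxOccurs, Particle... children) {
            return new Particle("(sequence)", maxOccurs, List.of(children));
        }
    }

    /** Positions needed if each repetition is written out as a separate copy. */
    static long unrolledPositions(Particle p) {
        long body = p.children.isEmpty() ? 1 : 0;
        for (Particle child : p.children) {
            body += unrolledPositions(child);
        }
        // Every copy of the body is repeated maxOccurs times; nesting multiplies.
        return Math.multiplyExact(body, (long) p.maxOccurs);
    }

    /** Positions needed if a counter tracks each bounded repetition instead. */
    static long counterPositions(Particle p) {
        long body = p.children.isEmpty() ? 1 : 0;
        for (Particle child : p.children) {
            body += counterPositions(child);
        }
        // maxOccurs no longer affects the size; it only bounds a counter value.
        return body;
    }

    public static void main(String[] args) {
        // Assumed example model: (header, line{1,99}){1,999}
        Particle model = Particle.sequence(999,
                Particle.element("header", 1),
                Particle.element("line", 99));
        System.out.println("unrolled positions: " + unrolledPositions(model));
        System.out.println("counter positions:  " + counterPositions(model));
    }
}
```

In this assumed model the unrolled form needs 99900 positions and the counter form needs 2, which is why nested bounds such as maxOccurs="999" are worth checking first.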

SaxonBugDemo.java (8.69 KB) - small demo app - Tomas Vanhala, 2021-05-04 10:01
SchemaValidatorImplTest_zipped_schemas.zip (31.9 KB) - schemas used by the demo app - Tomas Vanhala, 2021-05-04 10:02
SaxonBugDemo.zip (29.8 MB) - the ZIP archive containing refined demo code with all dependencies included - Tomas Vanhala, 2021-06-07 16:45
SaxonStack.pdf (493 KB) - Tomas Vanhala, 2021-07-29 16:16
SaxonBugDemov2.zip (29.9 MB) - demo code with improved packaging - Tomas Vanhala, 2021-08-02 16:36
demo.log (44.9 KB) - log created by running the demo code - Tomas Vanhala, 2021-08-02 16:45

