The problem is in the handling of this type, defined in solution.xsd:
<complexType name="Reagents_Type">
  <sequence minOccurs="1" maxOccurs="100">
    <element name="Solute" type="solution:Solute_Type" minOccurs="0" maxOccurs="100"/>
    <element name="Solvent" type="solution:Solvent_Type" minOccurs="0" maxOccurs="10"/>
  </sequence>
</complexType>
Because of the two nested maxOccurs values, this is a very demanding content model, and the classic textbook algorithms perform very poorly on it (they use a great deal of time and memory). Saxon uses a smarter approach with counters for simple, non-nested maxOccurs models, but that doesn't work in a case like this, which is "weakly ambiguous": for example, the model allows a sequence of 50 Solute elements followed by a Solvent element followed by 350 Solute elements, and matching this involves potential back-tracking. Henry Thompson has published an algorithm for content models with nested counters which handles more cases efficiently than Saxon does, but it still breaks down on some weakly ambiguous models - I'm not sure whether it would handle this one.
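To see the back-tracking risk concretely, the content model can be written as a regular expression over single-letter element symbols and matched with an ordinary back-tracking engine. This is just an illustrative sketch (Python's re module standing in for a schema validator's automaton), not how Saxon actually matches content models:

```python
import re

# The Reagents_Type content model over single-letter symbols:
# S = Solute, V = Solvent. Each repetition of the outer group
# corresponds to one iteration of the <sequence>.
reagents = re.compile(r"(?:S{0,100}V{0,10}){1,100}")

# 50 Solutes, a Solvent, then 350 more Solutes: valid, but only because
# the 350 trailing Solutes can be split across several further iterations
# of the outer sequence (100 + 100 + 100 + 50). A matcher that commits to
# one split too early may have to back-track and try another.
sample = "S" * 50 + "V" + "S" * 350
print(reagents.fullmatch(sample) is not None)  # True
```

The weak ambiguity shows up in the fact that the iteration boundaries of the outer sequence are not determined by the input: many different splits of the same element sequence are legal parses.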
I've no idea how Xerces handles this case!
I think the biggest improvement we could make for such schemas is to compile the finite state machine for a complex type lazily: at present the cost is incurred whether or not the type is actually used in a given validation episode. One disadvantage of lazy compilation is that UPA ambiguities are detected as part of building the finite state machine, so a UPA violation in a complex type would go unreported unless the type is actually used. We could perhaps get around this with a switch, similar to the one in Xerces, whereby complete validation of the schema is performed only on request.
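The lazy-compilation idea might look something like the following sketch. All the names here are hypothetical (this is not Saxon's internal API); content models are represented as regexes over element-name symbols purely for illustration, and the check_all_types flag plays the role of a Xerces-style "full checking" switch:

```python
import re

class SchemaValidator:
    """Illustrative sketch only: compile each complex type's finite state
    machine lazily, on first use, caching the result per type."""

    def __init__(self, content_models, check_all_types=False):
        # content_models: type name -> content model (regex over symbols)
        self.content_models = content_models
        self._compiled = {}  # cache, populated lazily
        if check_all_types:
            # Eager mode: compile every type up front, so that errors
            # found during FSM construction (e.g. UPA violations)
            # surface even for types never used in this validation run.
            for name in content_models:
                self._fsm(name)

    def _fsm(self, type_name):
        fsm = self._compiled.get(type_name)
        if fsm is None:
            # Compilation cost is paid only when the type is first used.
            fsm = re.compile(self.content_models[type_name])
            self._compiled[type_name] = fsm
        return fsm

    def validate(self, type_name, children):
        # children: sequence of element symbols for the type's content
        return self._fsm(type_name).fullmatch("".join(children)) is not None
```

With the default (lazy) setting, a broken content model in an unused type would never be compiled and so never reported, which is exactly the trade-off described above.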