Bug #4445

Problem with validation [ArrayIndexOutOfBoundsException in TinyTree]

Added by Mathieu Bergonzini 2 months ago. Updated about 1 month ago.

Status: Closed
Priority: Normal
Assignee:
Category: Schema conformance
Sprint/Milestone: -
Start date: 2020-01-29
Due date:
% Done: 100%
Legacy ID:
Applies to branch: 9.8, 9.9, trunk
Fix Committed on Branch: 9.8, 9.9, trunk
Fixed in Maintenance Release:

Description

Hello, I have a problem validating XML against an XSD schema. Some files crash the validator, and I get this error message:

java.lang.ArrayIndexOutOfBoundsException: 32008
	at net.sf.saxon.tree.tiny.TinyParentNodeImpl.getStringValueCS(TinyParentNodeImpl.java:75)
	at net.sf.saxon.tree.tiny.TinyParentNodeImpl.getStringValueCS(TinyParentNodeImpl.java:49)
	at com.saxonica.ee.schema.UserSimpleType.atomize(UserSimpleType.java:490)
	at net.sf.saxon.tree.tiny.TinyTree.getTypedValueOfElement(TinyTree.java:548)
	at net.sf.saxon.tree.tiny.TinyElementImpl.atomize(TinyElementImpl.java:91)
	at net.sf.saxon.tree.wrapper.VirtualCopy.atomize(VirtualCopy.java:644)
	at net.sf.saxon.tree.iter.AtomizingIterator.next(AtomizingIterator.java:56)
	at net.sf.saxon.tree.iter.AtomizingIterator.next(AtomizingIterator.java:27)
	at com.saxonica.ee.optim.GeneralComparisonEE.effectiveBooleanValue(GeneralComparisonEE.java:101)
	at net.sf.saxon.expr.OrExpression.effectiveBooleanValue(OrExpression.java:134)
	at net.sf.saxon.expr.instruct.Choose.choose(Choose.java:901)
	at net.sf.saxon.expr.instruct.Choose.iterate(Choose.java:952)
	at net.sf.saxon.expr.Expression.effectiveBooleanValue(Expression.java:886)
	at net.sf.saxon.functions.NotFn$1.effectiveBooleanValue(NotFn.java:70)
	at net.sf.saxon.expr.FilterIterator$NonNumeric.matches(FilterIterator.java:186)
	at net.sf.saxon.expr.FilterIterator.getNextMatchingItem(FilterIterator.java:78)
	at net.sf.saxon.expr.FilterIterator.next(FilterIterator.java:64)
	at com.saxonica.ee.schema.Assertion.testComplex(Assertion.java:239)
	at com.saxonica.ee.validate.ValidationStack.testAssertions(ValidationStack.java:491)
	at com.saxonica.ee.validate.ValidationStack.endElement(ValidationStack.java:430)
	at net.sf.saxon.event.ProxyReceiver.endElement(ProxyReceiver.java:182)
	at net.sf.saxon.event.StartTagBuffer.endElement(StartTagBuffer.java:290)
	at com.saxonica.ee.validate.StartTagBufferEE.endElement(StartTagBufferEE.java:58)
	at net.sf.saxon.event.PathMaintainer.endElement(PathMaintainer.java:62)
	at net.sf.saxon.event.DocumentValidator.endElement(DocumentValidator.java:68)
	at net.sf.saxon.event.ReceivingContentHandler.endElement(ReceivingContentHandler.java:459)
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2967)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
	at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:427)
	at net.sf.saxon.event.Sender.send(Sender.java:164)
	at com.saxonica.ee.s9api.SchemaValidatorImpl.validate(SchemaValidatorImpl.java:587)
	at fr.insee.test.saxonee.util.SaxonUtilTest.validateTestBug(SaxonUtilTest.java:71)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:89)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:41)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:542)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:770)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:464)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:210)

Here is the code I use for validation:

        File schema = new File("./src/test/resources/schema.xsd");
        File file = new File("./src/test/resources/file-bug.xml");
        Processor proc = new Processor(true);
        proc.setConfigurationProperty(FeatureKeys.LICENSE_FILE_LOCATION, "./src/test/resources/saxon-license.lic");
        SchemaManager sm = proc.getSchemaManager();
        sm.setXsdVersion("1.1");
        sm.load(new StreamSource(schema));
        SchemaValidator sv = sm.newSchemaValidator();
        sv.setErrorListener(getErrorListener());
        sv.validate(new StreamSource(new FileInputStream(file)));

The test files are attached:

  • file-bug.xml (1.07 MB): file test (Mathieu Bergonzini, 2020-01-29 10:02)
  • schema.xsd (65 KB): schema xsd (Mathieu Bergonzini, 2020-01-29 10:02)
  • xhtml.xsd (63.9 KB) (Mathieu Bergonzini, 2020-01-29 10:02)
  • xml.xsd (5.57 KB) (Mathieu Bergonzini, 2020-01-29 10:02)

History

#1 Updated by Michael Kay 2 months ago

Thanks for reporting it. From the stack trace, I can make a guess: we're evaluating an assertion against a subtree of the document, and we've used the "fast copy" mechanism to create a copy of that subtree, and the fast copy mechanism is dodgy. In fact, in resolving bug #4433, I decided to scrap this optimisation because it had given too many reliability problems.

I'll look at the repro first, however, before jumping to conclusions.

#2 Updated by Michael Kay 2 months ago

  • Category set to Schema conformance
  • Status changed from New to In Progress
  • Assignee set to Michael Kay
  • Applies to branch 9.8 added

Reproduced on 9.8 (from the command line); runs without failure on 9.9.

#3 Updated by Michael Kay 2 months ago

Seems to be in the same general area as bug #3665, though it's not the same because that involved construction of prior-pointers.

I'm not going to close it immediately as "fixed in 9.9" because I think the reason it's working in 9.9 might be coincidence (the TinyTree in 9.9 has been reduced in size by the introduction of TextualElement nodes, and that might mean the bug is still there on a particular boundary condition, but not activated by this test case). So I'm going to try and debug it on 9.8.

#4 Updated by Michael Kay 2 months ago

Stepping through what's happening in the debugger, it's evaluating the assertion on the Adresse element at line 24284, and while atomizing the LibellePays element (which is the last child of Adresse, and happens to be empty), it follows a "next" pointer which points off the end of the underlying TinyTree.

So I looked at how the construction of tree fragments for assertions works, and I noticed a relevant change between 9.8 and 9.9, which was made in response to the (apparently asymptomatic) bug #4124.

I tried experimentally reversing that change in the 9.9 code, and it doesn't cause the validation to crash, so we have no evidence that this code change is responsible for fixing the bug. In fact, looking at it more carefully, I think the patch only affects what happens when we finish tree construction for an assertion on the root element of the entire document, which suggests it's not relevant.

I do find the code a little puzzling, because we seem to do the testing of assertions before sending an endElement() event to the builder; what's more, the intuitive code to send the endElement() event has been commented out, suggesting a deliberate change.

#5 Updated by Michael Kay 2 months ago

  • Subject changed from Problem with validation to Problem with validation [ArrayIndexOutOfBoundsException in TinyTree]

Yes, it's a bug, and it's still there in 9.9; it's just very unlikely to be triggered.

When we get the string value of a node in the tinytree (TinyParentNodeImpl.getStringValueCS) we take a quick look to see if the depth of the next node is <= the current depth; if so, that implies that the node has no children, so the string value is a zero-length string. But on this occasion the "next node" is off the end of the TinyTree array.
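To make the failure mode concrete, here is a minimal sketch of that look-ahead check. It is not Saxon's code: the class DepthArraySketch, its fields, and its methods are all hypothetical, and the model assumes only that nodes are stored in document order with one depth entry per node.

```java
import java.util.Arrays;

/** Hypothetical, heavily simplified stand-in for a TinyTree (not Saxon's implementation). */
final class DepthArraySketch {
    private short[] depth;      // depth[i] = nesting depth of node i, in document order
    private int numberOfNodes;  // number of slots actually written so far

    DepthArraySketch(int capacity) {
        depth = new short[capacity]; // unused slots are zero-filled by Java
    }

    /** Appends a node, growing the array only when it is already full. */
    int addNode(int nodeDepth) {
        if (numberOfNodes == depth.length) {
            depth = Arrays.copyOf(depth, depth.length * 2);
        }
        depth[numberOfNodes] = (short) nodeDepth;
        return numberOfNodes++;
    }

    /** Unguarded shortcut: reads depth[nodeNr + 1] without checking that the slot exists. */
    boolean isChildlessUnguarded(int nodeNr) {
        return depth[nodeNr + 1] <= depth[nodeNr];
    }

    public static void main(String[] args) {
        // Spare capacity: the slot after the last node holds 0, so the answer is right by luck.
        DepthArraySketch roomy = new DepthArraySketch(4);
        roomy.addNode(1);            // parent element
        roomy.addNode(2);            // first child
        int last = roomy.addNode(2); // empty last child
        System.out.println(roomy.isChildlessUnguarded(last)); // true

        // Exactly full: the same look-ahead now indexes past the end of the array.
        DepthArraySketch exact = new DepthArraySketch(3);
        exact.addNode(1);
        exact.addNode(2);
        int lastExact = exact.addNode(2);
        try {
            exact.isChildlessUnguarded(lastExact);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("look-ahead ran off the end of the array");
        }
    }
}
```

In this model the only difference between the two runs is the array capacity, which is why the same document can pass or fail depending on how the space happens to be allocated.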

This will never happen when accessing a normal tree, because we always add a "stopper" node at the end to guard against such things. And on this special path for assertion handling, where we're accessing a tree that's under construction, there will normally be a zero entry in the uninitialised part of the array, which happens to give the correct results. It's failed this time because of a rare combination of circumstances: the last child of the element containing the assertion is empty; we're accessing its string value; and the algorithm for allocating space in the TinyTree is such that there was exactly the right amount of room for this particular element. Since we're typically allocating space in chunks of at least 4K bytes, this has a very low probability of happening.
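Two ways of defending against this can be sketched as follows. This is an illustration only and not necessarily the patch that was committed: GuardedLookaheadSketch and its methods are hypothetical names, and the stopper is modelled simply as an extra depth-0 entry, which is an assumption about its representation.

```java
import java.util.Arrays;

/** Hypothetical sketch of two defences against the look-ahead overrun (not Saxon's code). */
final class GuardedLookaheadSketch {

    /** Bounds-checked look-ahead: a node with no successor yet is treated as childless. */
    static boolean isChildless(short[] depth, int numberOfNodes, int nodeNr) {
        int next = nodeNr + 1;
        return next >= numberOfNodes || depth[next] <= depth[nodeNr];
    }

    /**
     * Stopper approach: returns a copy with one extra entry after the last node.
     * Arrays.copyOf zero-fills the new slot, so an unguarded look-ahead from the
     * last node always finds a depth that is <= its own.
     */
    static short[] withStopper(short[] depth, int numberOfNodes) {
        return Arrays.copyOf(depth, numberOfNodes + 1);
    }

    public static void main(String[] args) {
        short[] full = {1, 2, 2}; // parent, first child, empty last child; no spare room

        System.out.println(isChildless(full, 3, 2)); // true: no successor, no exception
        System.out.println(isChildless(full, 3, 0)); // false: the next node is deeper

        short[] guarded = withStopper(full, 3);
        System.out.println(guarded[3] <= guarded[2]); // true: the stopper ends the subtree
    }
}
```

The bounds check costs one comparison per call, while the stopper costs one array slot per tree; either removes the dependence on what happens to lie beyond the last node written.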

#6 Updated by Michael Kay 2 months ago

  • Status changed from In Progress to Resolved
  • Applies to branch 9.9, trunk added
  • Fix Committed on Branch 9.8, 9.9, trunk added

Patch applied on the 9.8, 9.9, and development branches. Tested on 9.8; regression tested only on 9.9 and 10.0, as it is impossible to construct a test case for this unlikely event.

#7 Updated by O'Neil Delpratt about 1 month ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.9.1.7 added
  • Fixed in Maintenance Release deleted (9.8.0.14)

Patch applied in the 9.9.1.7 maintenance release.
