Bug #3925

closed

Validation that succeeds under 9.8 fails under 9.9

Added by Michael Kay over 5 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Schema-Aware processing
Sprint/Milestone: -
Start date: 2018-09-29
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 9.8, 9.9
Fix Committed on Branch: 9.8, 9.9
Fixed in Maintenance Release:
Platforms:

Description

With the attached files, Saxon 9.8.0.14 generates the output successfully:

   C:\test>java -jar SaxonEE9-8-0-14J\saxon9ee.jar -config:config.xml -s:test.xml -t -xsl:test.xsl
   Saxon-EE 9.8.0.14J from Saxonica
...
   <?xml version="1.0" encoding="UTF-8"?><title xmlns="test">test</title>Execution time: 387.1993ms
   Memory used: 10,353,184

But 9.9 fails:

   C:\test>java -jar SaxonEE9-9-0-1J\saxon9ee.jar -config:config.xml -s:test.xml -t -xsl:test.xsl
   Saxon-EE 9.9.0.1J from Saxonica
...
   Validation error in test.xml:
     FORG0001: Character data is not allowed: element <title> has an empty content model
     Validating /title[1]
     Currently processing  in file:/C:/test/test.xml
     See http://www.w3.org/TR/xmlschema11-1/#cvc-complex-type clause 2.1
   Error evaluating (xsl:copy-of) in xsl:copy-of/@select on line 8 column 49 of test.xsl:
     XTTE1510: One validation error was reported: Character data is not allowed: element
     <title> has an empty content model
   One validation error was reported: Character data is not allowed: element <title> has an empty content model

Actions #1

Updated by Michael Kay over 5 years ago

This appears to be closely related to bug #3919 (a type derived by extension from one with mixed="true"), though in that case the failure occurred under 9.8.

Actions #2

Updated by Michael Kay over 5 years ago

  • Description updated (diff)
Actions #3

Updated by Michael Kay over 5 years ago

I observe the following sequence of events under 9.9:

(A) During the processing of config.xml

  • titleType (UserComplexType#1270) correctly has variety set to MIXED

(B) During processing of xsl:import-schema, the schema is processed again. A new UserComplexType#1611 is created to represent titleType. At some stage, though, it is recognized as a duplicate and so further processing is abandoned; at the point it is abandoned, the #1611 object has variety=EMPTY.

But processing continues with SchemaCompiler.processSchemaDocument(), eventually producing another copy of the compiled schema that is incompletely validated and has variety=EMPTY.

On completion of xsl:import-schema processing, the schema namespace is sealed.

At this point (and I missed exactly where it happened), the version of UserComplexType for titleType in the configuration is the incomplete #1611 version, rather than the correct #1270 version.

(Also, though this is perhaps not directly relevant, the final version of UserComplexType for baseType in the configuration has an empty set of extendedTypes, rather than referencing titleType as it should.)

I think we can identify three issues here.

(1) we should detect far earlier on that the schema being processed using xsl:import-schema is one that we already know about, to avoid wasted effort

(2) handling a schema that adds extended types or substitution group members to a type/declaration that already exists in the Configuration is always problematic - see also the unresolved issue #3531

(3) if we're going to process the schema twice (and there will be cases where we can't avoid it, e.g. if the schema location is different) then we shouldn't be distracted by the presence in the configuration of other components with the same name.

Note that all of this applies equally to 9.8 and 9.9. There may be some minor glitch that causes 9.8 to get it right in this case where 9.9 gets it wrong, but that's basically by accident rather than by design.

Actions #4

Updated by Michael Kay over 5 years ago

The significant difference between 9.8 and 9.9 is in PreparedSchema.addType(), where 9.8 has

if (existing == null || existing.getRedefinitionLevel() < type.getRedefinitionLevel()) {

and 9.9 has

if (existing == null || existing.getRedefinitionLevel() <= type.getRedefinitionLevel()) {

The effect is that when the redefinition levels are the same, as here, in 9.9 the new type overwrites the old, whereas in 9.8 the old type is retained. In this case the old type is correct (variety=MIXED) while the new one is wrong (variety=EMPTY).

There are several levels at which we could address this. We could simply revert to the 9.8 code and see if anything breaks. We could examine the history to see if we can discover why the change was made. We could probe deeper and ask why the new type is wrong in the first place. And then we could look at why it is we are processing the schema twice and doing all this unnecessary work.
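The effect of the two comparisons can be illustrated with a minimal sketch. This is not Saxon's code: AddTypeSketch, TypeDef and the replaceOnEqualLevel flag are hypothetical names invented for the illustration, and the only thing taken from the source is the shape of the condition in PreparedSchema.addType().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the type registry; none of these names are Saxon classes.
final class AddTypeSketch {

    /** Minimal stand-in for a compiled complex type. */
    static final class TypeDef {
        final String name;
        final String variety;        // "MIXED" or "EMPTY" in this sketch
        final int redefinitionLevel;

        TypeDef(String name, String variety, int redefinitionLevel) {
            this.name = name;
            this.variety = variety;
            this.redefinitionLevel = redefinitionLevel;
        }
    }

    private final Map<String, TypeDef> types = new HashMap<>();
    private final boolean replaceOnEqualLevel; // true models "<=", false models "<"

    AddTypeSketch(boolean replaceOnEqualLevel) {
        this.replaceOnEqualLevel = replaceOnEqualLevel;
    }

    /** Registers a type, replacing an existing entry only if the comparison allows it. */
    void addType(TypeDef type) {
        TypeDef existing = types.get(type.name);
        boolean replace = existing == null
                || (replaceOnEqualLevel
                        ? existing.redefinitionLevel <= type.redefinitionLevel
                        : existing.redefinitionLevel < type.redefinitionLevel);
        if (replace) {
            types.put(type.name, type);
        }
    }

    String varietyOf(String name) {
        return types.get(name).variety;
    }

    /** Registers a complete copy, then an incomplete copy at the same redefinition level. */
    static String run(boolean replaceOnEqualLevel) {
        AddTypeSketch registry = new AddTypeSketch(replaceOnEqualLevel);
        registry.addType(new TypeDef("titleType", "MIXED", 0)); // first, complete copy
        registry.addType(new TypeDef("titleType", "EMPTY", 0)); // second, incomplete copy
        return registry.varietyOf("titleType");
    }

    public static void main(String[] args) {
        System.out.println("with <  : " + run(false));
        System.out.println("with <= : " + run(true));
    }
}
```

With equal redefinition levels, the "<" variant keeps the first registration and the "<=" variant lets the later, incomplete one win, which matches the behaviour described for 9.8 and 9.9 respectively.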

Actions #5

Updated by Michael Kay over 5 years ago

The code used to have <= in this line; it was changed on the 9.8 branch to < in commit 1834, in response to bug #3563, which doesn't look particularly relevant; a more likely candidate is bug #3125. Either way, though, it seems the code was changed on the 9.8 branch and the change wasn't copied into the development branch, and 9.9 has reverted to how it stood before this patch.

Actions #6

Updated by Michael Kay over 5 years ago

So, for clarity, I'm looking at three separate questions here:

(A) The use of < versus <= in PreparedSchema.addType()

(B) The early exit from UserComplexType.validateExtension() if the type already exists in the Configuration

(C) The question of whether we can avoid processing the schema in xsl:import-schema because it has already been processed while loading the configuration file.

At the moment I'm looking at (B).

If I remove the following lines from UserComplexType.validateExtension() then the test case works:

        SchemaType existing = getConfiguration().getSchemaType(getStructuredQName());
        if (existing != null && existing.isSameType(this)) {
            return true;
        }

The question is whether this has any adverse side-effects.

After making this change (and making some fixes to my test environment), I get no failures (out of 41535 tests) in the XSD test suite; no failures (out of 11435 tests) on the XSLT3 test suite; and no unexpected failures in the JAXP or S9API unit tests. So the change seems OK. I suspect the code was there to try and skip subsequent steps for performance reasons.

Moreover, removing these lines on the 9.8 branch appears to fix bug #3919.
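A rough sketch of why the early exit matters, under the assumption (mine, not confirmed by the source beyond the observed variety=EMPTY) that the variety of the derived type is only settled by a step that the early return skips. EarlyExitSketch, ComplexType and the earlyExit flag are hypothetical names; the real UserComplexType.validateExtension() does considerably more.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the early return; not Saxon's UserComplexType.
final class EarlyExitSketch {

    enum Variety { EMPTY, MIXED }

    static final class ComplexType {
        final String name;
        Variety variety = Variety.EMPTY; // default until validation completes

        ComplexType(String name) {
            this.name = name;
        }

        /** Hypothetical equivalent of isSameType(): same name means same type here. */
        boolean isSameType(ComplexType other) {
            return name.equals(other.name);
        }

        boolean validateExtension(Map<String, ComplexType> known,
                                  Variety baseVariety,
                                  boolean earlyExit) {
            ComplexType existing = known.get(name);
            if (earlyExit && existing != null && existing.isSameType(this)) {
                return true; // returns before the variety is settled
            }
            // Assumed later step: the extension inherits the variety of its base type.
            variety = baseVariety;
            return true;
        }
    }

    /** Compiles the same type twice and reports the variety of the second copy. */
    static Variety compileSecondCopy(boolean earlyExit) {
        Map<String, ComplexType> known = new HashMap<>();

        ComplexType first = new ComplexType("titleType");
        first.validateExtension(known, Variety.MIXED, earlyExit);
        known.put(first.name, first);

        ComplexType second = new ComplexType("titleType");
        second.validateExtension(known, Variety.MIXED, earlyExit);
        return second.variety;
    }

    public static void main(String[] args) {
        System.out.println("early exit kept   : " + compileSecondCopy(true));
        System.out.println("early exit removed: " + compileSecondCopy(false));
    }
}
```

In this model the second copy is left at EMPTY when the early exit is present and reaches MIXED once it is removed, which is the symptom and the fix reported above.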

Actions #7

Updated by Michael Kay over 5 years ago

Looking now at issue (C).

The reason xsl:import-schema doesn't recognize the schema as being already loaded is the absence of a "namespace" attribute on the xsl:import-schema instruction. (Note that the referenced schema has targetNamespace="test")

This leads me to a careful reading of the spec for xsl:import-schema. It's surprisingly fuzzy. In non-normative notes it says "The namespace attribute may be omitted to indicate that a schema for names in no namespace is being imported. The zero-length string is not a valid namespace URI, and is therefore not a valid value for the namespace attribute."

My XSLT 2.0 book similarly says "omitting the namespace attribute indicates that you want to import a schema that has no target namespace (that is, a schema for elements that are in no namespace)".

If we change the code to say <xsl:import-schema namespace="flamingo" schema-location="test.xsd"/> then Saxon complains XTSE0220: Schema at location test.xsd has target namespace "test" but requested namespace was "flamingo". This is essentially following the rules of xs:import.

But the actual rules of xs:import, in XSD 1.1 part 1 §4.2.6.2, say:

3 If (the schema document) D2 exists, that is, clause 2.1 or clause 2.2 above were satisfied, then the appropriate case among the following must be true:
3.1 If there is a namespace [attribute], then its ·actual value· is identical to the ·actual value· of the targetNamespace [attribute] of D2.
3.2 If there is no namespace [attribute], then D2 has no targetNamespace [attribute]

So I think this xsl:import-schema declaration is invalid, and Saxon is failing to detect the fact.
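Clauses 3.1 and 3.2 quoted above reduce to a simple consistency check, sketched here. ImportNamespaceCheck and isConsistent are hypothetical names, not part of Saxon; null stands for an absent attribute.

```java
// Hypothetical helper illustrating clauses 3.1 and 3.2; not a Saxon API.
final class ImportNamespaceCheck {

    /**
     * @param requestedNamespace value of the namespace attribute on the import, or null if absent
     * @param targetNamespace    value of targetNamespace on the schema document, or null if absent
     * @return true if the import is consistent with the schema document
     */
    static boolean isConsistent(String requestedNamespace, String targetNamespace) {
        if (requestedNamespace != null) {
            // Clause 3.1: the two values must be identical.
            return requestedNamespace.equals(targetNamespace);
        }
        // Clause 3.2: no namespace attribute, so the schema must have no targetNamespace.
        return targetNamespace == null;
    }

    public static void main(String[] args) {
        System.out.println(isConsistent("test", "test"));     // namespace given and matching
        System.out.println(isConsistent("flamingo", "test")); // namespace given but different
        System.out.println(isConsistent(null, "test"));       // the case in this bug
    }
}
```

Under this reading, the test case's xsl:import-schema (no namespace attribute, schema with targetNamespace="test") fails the check, just as the namespace="flamingo" variant does.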

Actions #8

Updated by Michael Kay over 5 years ago

The reason we don't detect an error here is that the code which checks for consistency of the targetNamespace of the referenced schema is written to handle the case of "chameleon include", where xs:include projects a schema document into the target namespace of its caller. But this case shouldn't arise for xs:import and therefore for xsl:import-schema which bases its rules on those for xs:import.

I have raised this as a separate issue #3928. Meanwhile, I have changed the xsl:import-schema in this test case to say namespace="test". This has the effect that by default there is a message

Warning at xsl:import-schema on line 5 column 67 of test.xsl:
  SXWN9006: The schema document at test.xsd is ignored because a schema for this namespace
  is already loaded

Setting --multipleSchemaImports:on suppresses the warning but it does not actually cause the schema to be processed. I've raised this as bug #3929.

Actions #9

Updated by Michael Kay over 5 years ago

  • Status changed from New to Resolved
  • Applies to branch 9.8 added
  • Fix Committed on Branch 9.8, 9.9 added

Marking this resolved after applying patches (A) and (B) on the 9.9 branch, and (B) only on the 9.8 branch. Issue (C) has been transferred to other bug entries.

Actions #10

Updated by O'Neil Delpratt over 5 years ago

  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.8.0.15 added

Bug fix applied in the Saxon 9.8.0.15 maintenance release. Leaving open until the Saxon 9.9 maintenance release.

Actions #11

Updated by O'Neil Delpratt over 5 years ago

  • Status changed from Resolved to Closed
  • Fixed in Maintenance Release 9.9.0.2 added

Bug fix applied in the Saxon 9.9.0.2 maintenance release.
