Support #5336

open

How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?

Added by Martin Honnen 5 months ago. Updated 5 months ago.

Status: New
Priority: Normal
Assignee: -
Category: s9api API
Sprint/Milestone:
Start date: 2022-02-17
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch: 11
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms: .NET

Description

With SaxonJ EE 11.1 I can run

    Processor processor = new Processor(true);
    processor.getSchemaManager().load(new StreamSource("https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd"));

without problems, and it takes such a short amount of time that I am sure the schema is loaded through the XmlResolver from its XmlResolverData.

With SaxonCS 11.1, however, the similar code

            Processor processor = new Processor(true);
            processor.SchemaManager.Compile(new Uri("https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd"));

gives

Saxon.Api.SaxonApiException
  HResult=0x80131500
  Message=: Unable to retrieve URI https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd
  Source=SaxonCS
  StackTrace:
   at Saxon.Eej.ee.s9api.SchemaManagerImpl.load(Source source)
   at Saxon.Api.SchemaManager.Compile(Uri uri)
   at SaxonCSCompileXSLT3SchemaTest.Program.Main(String[] args) in C:\SomeDir\SaxonCSCompileXSLT3SchemaTest\Program.cs:line 14

and takes a long time to report that error, so I assume it might be trying to download the file from the W3C server.

Even when trying to set

            Processor processor = new Processor(true);
            (processor.Implementation.getConfigurationProperty(FeatureKeys.RESOURCE_RESOLVER) as CatalogResourceResolver).setFeature(ResolverFeature.URI_FOR_SYSTEM, true);
            processor.SchemaManager.Compile(new Uri("https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd"));

it takes a lot of time and gives the same error:

Saxon.Api.SaxonApiException
  HResult=0x80131500
  Message=: Unable to retrieve URI https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd
  Source=SaxonCS
  StackTrace:
   at Saxon.Eej.ee.s9api.SchemaManagerImpl.load(Source source)
   at Saxon.Api.SchemaManager.Compile(Uri uri)
   at SaxonCSCompileXSLT3SchemaTest.Program.Main(String[] args) in C:\SomeDir\SaxonCSCompileXSLT3SchemaTest\Program.cs:line 16

So how do I get SaxonCS 11.1 to load/compile the schema for XSLT 3.0, ideally loading the XSD through the catalog and its data DLL?

Actions #1

Updated by Michael Kay 5 months ago

I've added this as a unit test.

The method SchemaManager.Compile(Uri) hasn't been updated to use the new Resolver infrastructure.

Also it would be more consistent for SchemaManager to have a ResourceResolver property that can be set, rather than relying on the legacy SchemaResolver.

Actions #2

Updated by Michael Kay 5 months ago

I tried a workaround using DocumentBuilder.Build(uri) followed by SchemaManager.Compile(XdmNode), but DocumentBuilder.Build(uri) suffers the same problem.

I can retrieve the document from the local cache using

                XQueryCompiler compiler = proc.NewXQueryCompiler();
                XQueryEvaluator eval = compiler.Compile("doc('https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd')").Load();
                XdmNode schema = (XdmNode)eval.Evaluate();

But I still get some kind of failure when doing Compile(XdmNode) on the result.

Actions #3

Updated by Michael Kay 5 months ago

This latest problem is caused by the XmlResolverWrappingResourceResolver class throwing "invalid URI" when the XmlReader passes it the public ID -//W3C//DTD XSD 1.1//EN. This path doesn't seem to include the usual hack of recognising this as a public ID rather than a URI.
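
To illustrate the kind of check this path appears to lack, here is a minimal sketch in plain Java. It is not the code Saxon uses: the class `PublicIdHeuristic` and its method are hypothetical names, and the rule applied (a formal public identifier begins with `-//` or `+//`) is an assumed heuristic, not necessarily the "usual hack" mentioned above.

```java
// Hypothetical helper, for illustration only; not part of Saxon or the XML Resolver.
final class PublicIdHeuristic {

    private PublicIdHeuristic() {
    }

    // Assumption: formal public identifiers start with "-//" (unregistered owner)
    // or "+//" (registered owner). Anything else is treated as a possible URI.
    static boolean looksLikePublicId(String identifier) {
        if (identifier == null) {
            return false;
        }
        return identifier.startsWith("-//") || identifier.startsWith("+//");
    }

    public static void main(String[] args) {
        String[] samples = {
            "-//W3C//DTD XSD 1.1//EN",
            "http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd"
        };
        for (String sample : samples) {
            System.out.println(sample + " : " + looksLikePublicId(sample));
        }
    }
}
```

A resolver wrapper could apply a test like this before attempting to construct a URI, and route public IDs to a catalog lookup instead of failing with "invalid URI".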

Actions #4

Updated by Michael Kay 5 months ago

I've added unit tests on the Java side that use all three mechanisms (directly supplying a URI in a StreamSource, or getting an XdmNode via a DocumentBuilder or via an XQuery), and all three work correctly.

So how does the Java code differ from the C# code? They diverge at SchemaReader.sendSchemaSource(), which in SaxonCS is converted to a call on Sender.send(). But the divergent Java code is only setting up an XMLReader.

The real difference is that the StreamSource (containing only a URI) is passed to Platform.resolveSource(), which has different implementations for SaxonJ and SaxonCS. On the SaxonCS side, it directly resolves the URI using WebClient.OpenRead(). On Java, it constructs an ActiveStreamSource that wraps the StreamSource, and passes this to the XMLReader. I'm finding it difficult to work out how the XMLReader handles it; I don't see any callback to the CatalogResolver, and yet it's coming back too quickly to have been out to the web. HTTP monitoring using Charles suggests that there is a call out to www.w3.org, but I can't see that it's requesting this URL.

Back over on the .NET side, it's very clear in Charles that we're making a real HTTP request to retrieve http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd - not the original schema document, but the DTD that it references.
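
For illustration, the request for the referenced DTD can be observed and intercepted at the parser level. The sketch below uses only the JDK's SAX API, not Saxon or the XML Resolver. The class name `LocalDtdResolverSketch`, the `SAMPLE` document and the empty stream standing in for a locally cached DTD are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, for illustration only.
final class LocalDtdResolverSketch {

    // Assumed stand-in for a schema document that references the XSD 1.1 DTD.
    static final String SAMPLE =
        "<!DOCTYPE xs:schema PUBLIC \"-//W3C//DTD XSD 1.1//EN\" "
        + "\"http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd\">"
        + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"/>";

    // Parses the document and returns every external entity the parser asked for.
    static List<String> parseAndRecordRequests(String xml) {
        List<String> requests = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver((publicId, systemId) -> {
                requests.add(publicId + " -> " + systemId);
                // An empty stream stands in for a locally cached copy of the DTD,
                // so the parser never opens an HTTP connection for it.
                return new InputSource(new StringReader(""));
            });
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return requests;
    }

    public static void main(String[] args) {
        parseAndRecordRequests(SAMPLE).forEach(System.out::println);
    }
}
```

The resolver receives both the public ID and the system ID of the DTD, which is the point at which a catalog-based resolver would normally substitute a local copy. If no resolver answers, the parser falls back to fetching the system ID itself.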
