https://saxonica.plan.io/https://saxonica.plan.io/favicon.ico2022-02-18T11:54:50ZSaxonica Developer CommunitySaxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=195532022-02-18T11:54:50ZMichael Kaymike@saxonica.com
<ul></ul><p>I've added this as a unit test.</p>
<p>The method <code>SchemaManager.Compile(Uri)</code> hasn't been updated to use the new Resolver infrastructure.</p>
<p>Also it would be more consistent for SchemaManager to have a ResourceResolver property that can be set, rather than relying on the legacy <code>SchemaResolver</code>.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=195542022-02-18T12:48:16ZMichael Kaymike@saxonica.com
<ul></ul><p>I tried a workaround using <code>DocumentBuilder.Build(uri)</code> followed by <code>SchemaManager.Compile(XdmNode)</code>, but <code>DocumentBuilder.Build(uri)</code> suffers the same problem.</p>
<p>I can retrieve the document from the local cache using</p>
<pre><code> XQueryCompiler compiler = proc.NewXQueryCompiler();
XQueryEvaluator eval = compiler.Compile("doc('https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd')").Load();
XdmNode schema = (XdmNode)eval.Evaluate();
</code></pre>
<p>But I still get some kind of failure when doing <code>Compile(XdmNode)</code> on the result.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=195552022-02-18T12:56:33ZMichael Kaymike@saxonica.com
<ul></ul><p>This latest problem is the <code>XmlResolverWrappingResourceResolver</code> class throwing "invalid URI" when the XmlReader throws it the public ID <code>-//W3C//DTD XSD 1.1//EN</code>. This path doesn't seem to include the usual hack of recognising this as a Public ID rather than a URI.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=195932022-02-18T15:41:21ZMichael Kaymike@saxonica.com
<ul></ul><p>I've added unit tests on the Java side that use all three mechanisms (directly supplying a URI in a StreamSource, or getting an XdmNode via a DocumentBuilder or via an XQuery), and all three work correctly.</p>
<p>So how does the Java code differ from the C# code? They diverge at <code>SchemaReader.sendSchemaSource()</code>, which in SaxonCS is converted to a call on <code>Sender.send()</code>. But the divergent Java code is only setting up an XMLReader.</p>
<p>The real difference is that the StreamSource (containing only a URI) is passed to Platform.resolveSource(), which has different implementations for SaxonJ and SaxonCS. On the SaxonCS side, it directly resolves the URI using <code>WebClient.OpenRead()</code>. On Java, it constructs an <code>ActiveStreamSource</code> that wraps the <code>StreamSource</code>, and passes this to the XMLReader. I'm finding it difficult to work out how the XMLReader handles it; I don't see any callback to the CatalogResolver, and yet it's coming back too quickly to have been out to the web. HTTP monitoring using Charles suggests that there is a call out to <a href="http://www.w3.org" class="external">www.w3.org</a>, but I can't see that it's requesting this URL.</p>
<p>Back over on the .NET side, it's very clear in Charles that we're making a real HTTP request to retrieve <a href="http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd" class="external">http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd</a> - not the original schema document, but the DTD that it references.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=212412022-07-12T08:40:17ZMartin Honnenmartin.honnen@gmx.de
<ul></ul><p>So what is the final resolution here for the .NET/C# side? Shouldn't Saxon get that <a href="http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd" class="external">http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd</a> from its XmlResolverData.dll cache instead of trying to pull from W3C?</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=212422022-07-12T10:10:41ZMartin Honnenmartin.honnen@gmx.de
<ul></ul><p>Probably related, using SaxonCS from the command line to try to validate the XSD 1.1 schema against the XSD 1.1 schema fails:</p>
<pre><code> & 'C:\Program Files\Saxonica\SaxonCS-11.3\SaxonCS.exe' validate -t -s:https://www.w3.org/TR/xmlschema11-1/XMLSchema.xsd -xsd:https://www.w3.org/TR/xmlschema11-1/XMLSchema.xsd
SaxonCS-EE 11.3 from Saxonica
.NET 5.0.9 on Windows 10.0.22000.0
Using license serial number ...
URIResolver for schema file must return a Source
Exiting with code 2
</code></pre>
<p>When I use a .NET 6 wrapper command tool that does nothing more than calling <code>Saxon.Cmd.Command.Main(args.Prepend("validate").ToArray());</code> I interestingly enough get a different error:</p>
<pre><code>saxonvalidate -s:https://www.w3.org/TR/xmlschema11-1/XMLSchema.xsd -xsd:https://www.w3.org/TR/xmlschema11-1/XMLSchema.xsd -t
SaxonCS-EE 11.3 from Saxonica
.NET 6.0.6 on Windows 10.0.22000.0
Using license serial number ...
Loading schema document https://www.w3.org/TR/xmlschema11-1/XMLSchema.xsd
: Cannot resolve external DTD subset - public ID = '-//W3C//DTD XSD 1.1//EN', system ID = 'XMLSchema.dtd'.
Fatal error during validation: : Cannot resolve external DTD subset - public ID = '-//W3C//DTD XSD 1.1//EN', system ID = 'XMLSchema.dtd'.
Exiting with code 2
</code></pre>
<p>So it seems the XML resolver / resolver cache fails. <a class="user active" href="https://saxonica.plan.io/users/3760">Norm Tovey-Walsh</a>, any idea on that?</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213002022-07-15T09:00:24ZMichael Kaymike@saxonica.com
<ul><li><strong>Tracker</strong> changed from <i>Support</i> to <i>Bug</i></li></ul><p>I'm elevating this to a Bug. We have a SaxonCS unit test <code>TestSchemaValidator.testSchemaForXslt30</code> which is failing (or running extremely slowly) as a result of this problem.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213012022-07-15T10:28:44ZMichael Kaymike@saxonica.com
<ul></ul><p>I'm investigating the unit test <code>TestSchemaValidator.testSchemaForXslt30</code>, which attempts to load the XSLT30 schema using a doc() call issued from an XQuery (and fails).</p>
<p>The XSLT30 schema URI is being successfully resolved by the catalog resolver, and returns with a ManifestResourceStream delivering the content from the XmlResolverData. We then attempt to parse this stream, using an XmlTextReader in which the XmlResolver is set to an instance of XmlResolverWrappingResourceResolver.</p>
<p>We see a callback from the XmlTextReader to this XmlResolver - it's calling ResolveUri() with a baseUri of null and a relativeUri of <code>.../XMLSchema.xsd</code> originating internally from DtdParserProxy.get_DtdParserProxy_BaseUri(). This successfully returns the URI unchanged.</p>
<p>Next we see a call on ResolveUri with baseUri equal to the <code>.../XMLSchema.xsd</code> URI and relativeUri being the public ID <code>-//W3C//DTD XSD 1.1//EN</code>. We detect this as a Public ID and pass it back unchanged as a Uri object.</p>
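<p>A minimal sketch of the kind of check described above, written in Java for illustration only. The class name <code>PublicIdCheck</code>, the method names, and the heuristic itself (treating strings that start with <code>-//</code> or <code>+//</code> as formal public identifiers) are assumptions, not the actual Saxon code.</p>

```java
import java.net.URI;

public class PublicIdCheck {

    // Heuristic (an assumption, not Saxon's implementation): formal public
    // identifiers conventionally begin with "-//" or "+//".
    static boolean looksLikePublicId(String s) {
        return s.startsWith("-//") || s.startsWith("+//");
    }

    // Returns a public ID unchanged; otherwise resolves the relative
    // reference against the base URI (or returns it as-is if base is null).
    static String resolve(String base, String relative) {
        if (looksLikePublicId(relative)) {
            return relative;
        }
        if (base == null) {
            return relative;
        }
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.w3.org/TR/xmlschema11-1/XMLSchema.xsd";
        // Public ID: passed back unchanged rather than parsed as a URI.
        System.out.println(resolve(base, "-//W3C//DTD XSD 1.1//EN"));
        // System ID: resolved against the base URI of the schema document.
        System.out.println(resolve(base, "XMLSchema.dtd"));
    }
}
```

<p>Without such a check, a string containing spaces, as this public ID does, cannot be parsed as a URI at all, which is consistent with the "invalid URI" failure reported earlier in this thread.</p>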
<p>Next we see a call on GetEntity with absoluteUri being the public ID <code>-//W3C//DTD XSD 1.1//EN</code>. We call the catalog resolver which looks this up and sets uri2=<code>pack://application:,,,XmlResolverData;0.2.0.0;component/www_w3_org.2009.XMLSchema.XMLSchema.dtd</code>. The resolvedResourceImpl is null, so we return null from the GetEntity() call.</p>
<p>Next we see a call on ResolveUri with baseUri being <code>http://www.w3.org/TR/xmlschema11-1/XMLSchema.xsd</code> and relativeUri being <code>XMLSchema.dtd</code>. This correctly returns <code>http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd</code>. It appears the XmlTextReader attempts to fetch this URI itself without a further call to <code>GetEntity()</code>.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213022022-07-15T10:38:45ZNorm Tovey-Walsh
<ul></ul><p>If I'm understanding correctly,</p>
<blockquote>
<p>Next we see a call on GetEntity with absoluteUri being the public ID -//W3C//DTD XSD 1.1//EN. We call the catalog resolver which looks this up and sets uri2=pack://application:,,,XmlResolverData;0.2.0.0;component/www_w3_org.2009.XMLSchema.XMLSchema.dtd. The resolvedResourceImpl is null, so we return null from the GetEntity() call.</p>
</blockquote>
<p>This is the part that sounds like a bug. If we've worked out that the dtd is in a <code>pack://</code> URI, then why is <code>resolvedResourceImpl</code> returned null, I wonder? I'll have to build CS and see if I can get it running under the debugger...</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213032022-07-15T11:16:00ZNorm Tovey-Walsh
<ul></ul><p>Yes, it appears that <code>http://www.w3.org/TR/xmlschema11-1/XMLSchema.dtd</code> isn't in the data assembly.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213042022-07-15T11:27:43ZNorm Tovey-Walsh
<ul></ul><p>Wow, the W3C site seems to be very confused about what is available and where wrt schema validation.</p>
<pre><code>https://www.w3.org/TR/xmlschema-2/datatypes.xsd
</code></pre>
<p>Isn't an XSD file, it's something wrapped in a <code>pre</code>. Even if you unwrapped the <code>pre</code> it would be wrong because the XML declaration is after the DTD fragment.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213082022-07-15T13:16:28ZMichael Kaymike@saxonica.com
<ul></ul><p>That's how it appears in Safari. But if you look at it using curl, it starts</p>
<pre><code><?xml version='1.0'?>
<!DOCTYPE xs:schema PUBLIC "-//W3C//DTD XSD 1.1//EN" "XMLSchema.dtd" [
<!-- provide ID type information even for parsers which only read the
internal subset -->
<!ATTLIST xs:schema id ID #IMPLIED>
<!ATTLIST xs:complexType id ID #IMPLIED>
</code></pre> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213092022-07-15T13:20:26ZMichael Kaymike@saxonica.com
<ul></ul><p>Oh, sorry, that was</p>
<p><a href="https://www.w3.org/TR/xmlschema11-2/XMLSchema.xsd" class="external">https://www.w3.org/TR/xmlschema11-2/XMLSchema.xsd</a></p>
<p>But the 1.0 version is very similar in curl:</p>
<pre><code><?xml version='1.0' encoding='UTF-8'?>
<!-- XML Schema schema for XML Schemas: Part 1: Structures -->
<!-- Note this schema is NOT the normative structures schema. -->
<!-- The prose copy in the structures REC is the normative -->
<!-- version (which shouldn't differ from this one except for -->
<!-- this comment and entity expansions, but just in case -->
<!DOCTYPE xs:schema PUBLIC "-//W3C//DTD XMLSCHEMA 200102//EN" "XMLSchema.dtd" [
<!-- provide ID type information even for parsers which only read the
internal subset -->
<!ATTLIST xs:schema id ID #IMPLIED>
<!ATTLIST xs:complexType id ID #IMPLIED>
<!ATTLIST xs:complexContent id ID #IMPLIED>
</code></pre> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213122022-07-15T14:00:34ZNorm Tovey-Walsh
<ul></ul><p><code>XMLSchema.xsd</code> is fine, but <code>datatypes.xsd</code>, that's another story:</p>
<pre><code>$ curl -s https://www.w3.org/TR/xmlschema-2/datatypes.xsd | head
<pre><![CDATA[<!DOCTYPE xs:schema PUBLIC "-//W3C//DTD XMLSCHEMA 200102//EN" "XMLSchema.dtd" [
<!--
keep this schema XML1.0 DTD valid
-->
<!ENTITY % schemaAttrs 'xmlns:hfp CDATA #IMPLIED'>
<!ELEMENT hfp:hasFacet EMPTY>
<!ATTLIST hfp:hasFacet
name NMTOKEN #REQUIRED>
...
</code></pre>
<p>Note the initial <code><pre></code>.</p>
<p>FYI: I have reported this to <code>webreq@w3.org</code>.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213132022-07-15T14:03:18ZNorm Tovey-Walsh
<ul></ul><p>The root cause of the slowness was the fact that the W3C XML Schema DTDs are being retrieved from an unofficial location that I didn't know existed. I've updated the XML Resolver data jar and assembly to correct the oversight. If you update the NuGet dependency for <code>XmlResolverData</code> to <code>1.2.0</code>, I believe the DTD will be accessed without dereferencing from the W3C site.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213142022-07-15T14:45:34ZMartin Honnenmartin.honnen@gmx.de
<ul></ul><p>Hi Norm,</p>
<p>for me, with SaxonCS 11.3, .NET 6 and XmlResolverData 1.2 the code</p>
<pre><code class="csharp syntaxhl" data-language="csharp"><span class="k">using</span> <span class="nn">Saxon.Api</span><span class="p">;</span>
<span class="n">Processor</span> <span class="n">processor</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">Processor</span><span class="p">(</span><span class="k">true</span><span class="p">);</span>
<span class="n">processor</span><span class="p">.</span><span class="n">SchemaManager</span><span class="p">.</span><span class="nf">Compile</span><span class="p">(</span><span class="k">new</span> <span class="nf">Uri</span><span class="p">(</span><span class="s">"https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd"</span><span class="p">));</span>
</code></pre>
<p>still hangs for quite some time and then throws an exception</p>
<pre><code>Saxon.Api.SaxonApiException
HResult=0x80131500
Message = : Unable to retrieve URI https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd
Source = SaxonCS
StackTrace:
at Saxon.Eej.ee.s9api.SchemaManagerImpl.load(Source source)
at Program.<Main>$(String[] args) in C:\SomePath\SaxonCSCompileXSLT30Schema\SaxonCSCompileXSLT30Schema\Program.cs:line 5
</code></pre> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213152022-07-15T15:04:36ZNorm Tovey-Walsh
<ul></ul><p>Okay. I'll investigate. I was following Mike's lead on an existing test case which definitely did get faster.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213162022-07-15T15:29:30ZNorm Tovey-Walsh
<ul></ul><p>AFAICT (and that might not be very far),</p>
<pre><code>processor.SchemaManager.Compile(someURI)
</code></pre>
<p>makes no effort to resolve the resource through the resolver.</p>
<ol>
<li>
<code>Compile()</code> creates a <code>StreamSource</code> directly from the URI</li>
<li>We go on a long journey: <code>SchemaManagerImpl.load()</code>, <code>EnterpriseConfiguration.addSchemaSource()</code>, <code>SchemaReader.read()</code>, <code>SchemaReader.buildSchemaDocument()</code>, <code>SchemaReader.sendSchemaSource()</code>, <code>Sender.send()</code>, <code>ProfessionalConfiguration.resolveSource()</code>, <code>Configuration.resolveSource()</code>, <code>DotNetPlatform.resolveSource()</code>
</li>
<li>In <code>DotNetPlatform.resolveSource()</code>, if the source is a <code>StreamSource</code>, we call <code>getInputStream()</code> on it.</li>
</ol>
<p>So this code path, unlike the code path in <code>TestSchemaValidator.testSchemaForXslt30</code> just never tries to resolve it through the catalog resolver.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213462022-07-19T10:14:21ZNorm Tovey-Walsh
<ul><li><strong>Assignee</strong> set to <i>Michael Kay</i></li></ul> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213502022-07-19T14:39:49ZMichael Kaymike@saxonica.com
<ul></ul><p>I've created a unit test that does</p>
<pre><code> Processor proc = new Processor(true);
SchemaManager schemaManager = proc.getSchemaManager();
schemaManager.load(new StreamSource("https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd"));
SchemaValidator validator = schemaManager.newSchemaValidator();
validator.validate(new StreamSource(new File(configTest.getDataDir(), "books.xsl")));
</code></pre>
<p>in both Java and C# versions.</p>
<p>In both cases I can't see any attempt to resolve the initial URI using the catalog resolver. I'm monitoring using Charles and in both cases there appears to be a request to <a href="http://www.w3.org" class="external">www.w3.org</a>, though the path doesn't seem to be shown (limitation of trial version perhaps?). The difference is that in the Java case the request comes back with 114.9Kb after 1400ms, while in the C# case it comes back with 5.85Kb after 99985ms.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213512022-07-19T15:12:52ZMichael Kaymike@saxonica.com
<ul></ul><p>In SaxonJ, following the call to <code>Platform.resolveSource()</code>, we eventually end up with an ActiveSAXSource, containing an InputSource containing only the URI <code>https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd</code>. We pass this InputSource to parser.parse(), and it comes back with the result after 1.5 seconds.</p>
<p>The xs:import of the second schema document, <code>http://www.w3.org/2001/XMLSchema</code>, is handled by the StandardSchemaResolver, and this successfully invokes the catalog resolver.</p>
<p>In SaxonCS, the call to <code>Platform.resolveSource()</code> invokes <code>new WebClient().OpenRead(new Uri("https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd"))</code> and this times out. I can't see why there should be a difference.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213522022-07-19T16:12:48ZMichael Kaymike@saxonica.com
<ul></ul><p>In SaxonJ I have changed SchemaManager.load() so if the supplied Source is a StreamSource with no InputStream or Reader, then the systemId is resolved using the configuration-level resource resolver (which by default uses the catalog). It doesn't use the SchemaURIResolver.</p>
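<p>The behaviour described above might be sketched roughly as follows. This is an illustration under stated assumptions, not the committed fix: the class <code>SourceResolution</code>, the method <code>resolveIfBare</code>, and the use of a <code>Function&lt;String, InputStream&gt;</code> as a stand-in for the configuration-level resource resolver are all hypothetical. Only <code>javax.xml.transform.stream.StreamSource</code> is a real API here.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.function.Function;
import javax.xml.transform.stream.StreamSource;

public class SourceResolution {

    // If the source carries only a systemId (no InputStream, no Reader),
    // ask the resolver for a local copy before anything goes to the network.
    // The resolver is a hypothetical stand-in for a catalog-backed resolver;
    // it returns null when it has no local copy.
    static StreamSource resolveIfBare(StreamSource source,
                                      Function<String, InputStream> resolver) {
        if (source.getInputStream() != null || source.getReader() != null) {
            return source; // caller already supplied the content
        }
        InputStream local = resolver.apply(source.getSystemId());
        if (local == null) {
            return source; // nothing cached: leave the source untouched
        }
        // Keep the original systemId so relative references still resolve.
        return new StreamSource(local, source.getSystemId());
    }

    public static void main(String[] args) {
        String uri = "https://www.w3.org/TR/xslt-30/schema-for-xslt30.xsd";
        // Placeholder bytes standing in for a cached copy of the schema.
        Map<String, byte[]> cache =
            Map.of(uri, "<placeholder/>".getBytes(StandardCharsets.UTF_8));
        Function<String, InputStream> resolver = id -> {
            byte[] bytes = cache.get(id);
            return bytes == null ? null : new ByteArrayInputStream(bytes);
        };
        StreamSource resolved = resolveIfBare(new StreamSource(uri), resolver);
        System.out.println(resolved.getInputStream() != null);
    }
}
```

<p>Returning the original source when the resolver has nothing to offer keeps the earlier behaviour as a fallback, so a URI that is not in the catalog would still be fetched in the usual way.</p>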
<p>In consequence, in SaxonCS, SchemaManager.Compile(Uri) does the same.</p> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=213532022-07-19T16:21:11ZMichael Kaymike@saxonica.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Resolved</i></li><li><strong>Applies to branch</strong> <i>trunk</i> added</li><li><strong>Fix Committed on Branch</strong> <i>11, trunk</i> added</li><li><strong>Platforms</strong> <i>Java</i> added</li></ul> Saxon - Bug #5336: How to load/compile XSD 1.1 schema for XSLT 3 with SaxonCS 11.1?https://saxonica.plan.io/issues/5336?journal_id=214312022-07-28T15:48:58ZDebbie Lockettdebbie@saxonica.com
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Closed</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li><li><strong>Fixed in Maintenance Release</strong> <i>11.4</i> added</li></ul><p>Bug fix applied in the Saxon 11.4 maintenance release.</p>