Bug #4270

closed

Query doc('http://www.w3.org/2001/XMLSchema') fails "unknown protocol: classpath"

Added by Michael Kay over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: XQuery conformance
Sprint/Milestone: -
Start date: 2019-08-05
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 9.9, trunk
Fix Committed on Branch: 9.9, trunk
Fixed in Maintenance Release:
Platforms:

Description

Running XQuery from the command line with the query

doc('http://www.w3.org/2001/XMLSchema')

fails with the following output:

Saxon-EE 9.9.1.4J from Saxonica
Java version 1.8.0_121
Using license serial number K007537
Analyzing query from {doc('http://www.w3.org/2001/XMLSchema')}
Analysis time: 395.469638 milliseconds
URIResolver.resolve href="http://www.w3.org/2001/XMLSchema" base="file:/Users/mike/repo/samples/"
Using parser org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for http://www.w3.org/2001/XMLSchema using class net.sf.saxon.tree.tiny.TinyBuilder
Fetching Saxon copy of w3c/rddl/rddl-xhtml.dtd
Fetching Saxon copy of w3c/xhtml11/xhtml-framework-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-datatypes-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-qname-1.mod
Fetching Saxon copy of w3c/rddl/rddl-qname-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-datatypes-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-attribs-1.mod
Fetching Saxon copy of w3c/rddl/xhtml-rddl-model-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-charent-1.mod
Fetching Saxon copy of w3c/xhtml-lat1.ent
Fetching Saxon copy of w3c/xhtml-symbol.ent
Fetching Saxon copy of w3c/xhtml-special.ent
Fetching Saxon copy of w3c/xhtml11/xhtml-text-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-inlstruct-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-inlphras-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-blkstruct-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-blkphras-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-hypertext-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-list-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-image-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-basic-table-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-basic-form-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-link-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-meta-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-base-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-param-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-object-1.mod
Fetching Saxon copy of w3c/xhtml11/xhtml-struct-1.mod
Error evaluating (fn:doc(...)) on line 1 column 5 of file:/Users/mike/repo/samples/:
  FODC0002: I/O error reported by XML parser processing http://www.w3.org/2001/XMLSchema:
  unknown protocol: classpath

Actions #1

Updated by Michael Kay over 4 years ago

Note that the document at that location is an XHTML document with an external DTD:

<?xml version='1.0'?>
<!DOCTYPE html PUBLIC "-//XML-DEV//DTD XHTML RDDL 1.0//EN" "http://www.w3.org/2001/rddl/rddl-xhtml.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:rddl="http://www.rddl.org/" xml:lang="en">
<head>
<title>XML Schema</title>
</head>
<body>
...

Internal tracing at the level of the StandardEntityResolver shows that the failing request is

Requesting PUBLIC '-//XML-DEV//ENTITIES XLink Module 1.0//EN' at 'classpath:w3c/rddl/xlink-module-1.mod'

We don't appear to have xlink-module-1.mod in the set of well-known W3C entities.

The "classpath" URI scheme seems to be something Saxon has added to establish a base URI. We are primarily accessing known entities using the public ID, not the system ID; we only use the system ID as a fallback.

In fact the StandardEntityResolver fails to find the entity using either the public or system ID; instead it returns null to fall back to the parser's default entity resolution, and it is this that cannot handle the "classpath" URI scheme.
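
To illustrate the failure mode described above, here is a minimal sketch of a SAX EntityResolver that looks up the public ID first, then the system ID, and returns null when neither is known. It is not Saxon's StandardEntityResolver: the class name ResolverFallbackSketch, its registration methods and its in-memory tables are assumptions made for illustration. The helper defaultResolutionError only approximates what a parser's default resolution might do with the system ID, namely try to turn it into a java.net.URL, which is rejected for a scheme with no registered handler.

```java
import java.io.StringReader;
import java.net.MalformedURLException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not the Saxon implementation.
class ResolverFallbackSketch implements EntityResolver {
    // Hypothetical tables mapping known IDs to entity text.
    private final Map<String, String> byPublicId = new HashMap<>();
    private final Map<String, String> bySystemId = new HashMap<>();

    void registerPublicId(String publicId, String entityText) {
        byPublicId.put(publicId, entityText);
    }

    void registerSystemId(String systemId, String entityText) {
        bySystemId.put(systemId, entityText);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Primary lookup is by public ID; the system ID is only a fallback.
        String text = byPublicId.get(publicId);
        if (text == null) {
            text = bySystemId.get(systemId);
        }
        if (text == null) {
            // Unknown entity: returning null asks the parser to use its
            // default resolution, which will try to open the system ID itself.
            return null;
        }
        InputSource source = new InputSource(new StringReader(text));
        source.setSystemId(systemId);
        return source;
    }

    // Approximates default resolution: convert the system ID to a URL and
    // report the error message if the scheme has no handler.
    static String defaultResolutionError(String systemId) {
        try {
            URI.create(systemId).toURL();
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        ResolverFallbackSketch resolver = new ResolverFallbackSketch();
        String publicId = "-//XML-DEV//ENTITIES XLink Module 1.0//EN";
        String systemId = "classpath:w3c/rddl/xlink-module-1.mod";
        // Nothing is registered under either ID, so the resolver gives up.
        System.out.println("resolver returned: " + resolver.resolveEntity(publicId, systemId));
        System.out.println("default resolution error: " + defaultResolutionError(systemId));
    }
}
```

With nothing registered for the requested IDs, the resolver returns null, and the URL conversion then fails because no handler exists for the "classpath" scheme.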

Actions #2

Updated by Michael Kay over 4 years ago

Problem solved by registering a couple of extra Public IDs in StandardEntityResolver. The relevant modules are already present in the product, but there are inconsistencies in the RDDL/XHTML DTDs over use of Public IDs, so we need to register them under more than one ID.
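
The idea of registering one module under more than one Public ID can be sketched as follows. This is only an illustration of the technique, not the actual change: the class PublicIdAliasSketch, the resource path and both Public IDs are hypothetical placeholders, since the issue does not list the IDs that were added.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names and IDs are placeholders.
class PublicIdAliasSketch {
    // Maps each known public ID to the resource path of the bundled module.
    private final Map<String, String> resourceByPublicId = new HashMap<>();

    // Registers the same resource under every public ID that DTDs use for it.
    void register(String resourcePath, String... publicIds) {
        for (String publicId : publicIds) {
            resourceByPublicId.put(publicId, resourcePath);
        }
    }

    // Returns the resource path, or null if the public ID is not registered.
    String lookup(String publicId) {
        return resourceByPublicId.get(publicId);
    }

    public static void main(String[] args) {
        PublicIdAliasSketch registry = new PublicIdAliasSketch();
        // Two DTDs refer to the same module by different public IDs.
        registry.register("example/module-a.mod",
                "-//EXAMPLE//ENTITIES Module A 1.0//EN",
                "-//EXAMPLE//ELEMENTS Module A 1.0//EN");
        System.out.println(registry.lookup("-//EXAMPLE//ENTITIES Module A 1.0//EN"));
        System.out.println(registry.lookup("-//EXAMPLE//ELEMENTS Module A 1.0//EN"));
    }
}
```

Both lookups return the same resource path, so a DTD that uses either Public ID is served from the bundled copy and the resolver never has to fall back to the system ID.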

Actions #3

Updated by Michael Kay over 4 years ago

  • Status changed from New to Resolved
  • Fix Committed on Branch 9.9, trunk added

Actions #4

Updated by O'Neil Delpratt over 4 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.9.1.5 added

Bug fix applied in the Saxon 9.9.1.5 maintenance release.
