Bug #5970
closedDirectResourceResolver - unresolved XML classpath resources
100%
Description
Seems identical to https://saxonica.plan.io/issues/5271 but for XML resources.
From Saxon 11, when trying to load an XML document from the classpath, java.net.MalformedURLException: unknown protocol: classpath
exception can be thrown.
StringWriter sw = new StringWriter();
Configuration config = new Configuration();
XQueryExpression expression = config.newStaticQueryContext().compileQuery("doc('classpath:books.xml')");
expression.run(new DynamicQueryContext(config), new StreamResult(sw), new Properties());
System.out.println(sw);
In this example the resource is lastly resolved by the DirectResourceResolver
, which relies on ResourceLoader
.
For XSLT resources we have the following patch (could it be applied to all natures of resources?)
if (request.uri.startsWith("classpath:")) {
String s = request.uri.substring(10);
if (s.startsWith("/")) {
s = s.substring(1);
}
stream = config.getDynamicLoader().getResourceAsStream(s);
}
Updated by Norm Tovey-Walsh over 1 year ago
Indeed. I'm looking at that code in DirectResourceResolver
that only applies the classpath:
sniffing to XSLT resources. Right now, it has three branches:
if (XSLT) {
if (classpath:) {
handle that
} else {
use ResourceLoader
}
} else if (XML or XSD) {
use ResourceLoader
} else {
use new StreamSource()
}
It sure feels like we could check for classpath:
and use ResourceLoader
otherwise in all cases...
Updated by Norm Tovey-Walsh over 1 year ago
In fact, why not move the classpath:
checking into ResourceLoader.urlStream()
and just use that?
Updated by Michael Kay over 1 year ago
Indeed, why not? There would be consequences, but they seem generally positive, like making query module locations and unparsed text locations work with classpath URIs.
Updated by Norm Tovey-Walsh over 1 year ago
Alas, not every context where ResourceLoader.urlStream()
is called has access to the Configuration
and we need the Configuration
to call getDynamicLoader()
.
Could we just use the default loader if we don't have the configuration, or would that be a problem?
Do we need a flavor of urlStream()
that handles classpath:
if it has access to the Configuration?
Updated by Norm Tovey-Walsh over 1 year ago
There appear to be two places where we call urlStream()
when there's no configuration available. In QueryReader
and in UnparsedTextResource
. Seems like it would be nice to be able to support classpath:
in those places if we're going to try to support it in more places... :-(
Updated by Michael Kay over 1 year ago
I think that most callers can be made Configuration-aware with a bit of tweaking, e.g. QueryReader.readSourceQuery()
and UnparsedTextResource
.
UnparsedTextResource
doesn't know the Configuration
, but it's always instantiated via a Factory object which is supplied with the context and currently ignores it.
Updated by Norm Tovey-Walsh over 1 year ago
I started down that road. I don't see how UnparsedTextResource
is always created by a factory though? I see that there is a factory but it has a public constructor and, for example, DataURIScheme
constructs one directly.
Updated by Michael Kay over 1 year ago
I missed that one. But the callers of DataURIScheme.decode()
have access to context?
It's probably one of those cases where you need a fallback if no configuration is available, but where most paths can be made to supply one.
Updated by Norm Tovey-Walsh over 1 year ago
- Assignee changed from Norm Tovey-Walsh to Michael Kay
- Priority changed from Low to Normal
- Applies to branch trunk added
- Fix Committed on Branch 12 added
This turned out to require making quite a few changes, but I think they all make sense:
-
ResourceLoader.urlStream()
andResourceLoader.urlReader()
now expect aConfiguration
. That configuration is used to resolveclasspath:
URIs. They’ve been refactored a little bit to avoid some code duplication. -
DirectResourceResolver.resolve()
has been simplified to rely onResourceLoader
to get the input stream. -
StandardModuleURIResolver.toStreamSource()
has been modified to accept anAugmentedSource
. If it’s passed anAugmentedSource
, it uses the contained source to get the stream. -
StandardURIResolver.setSAXInputSource
has been simplified to rely onResourceLoader
to get the input stream. -
StandardUnparsedTextResolver.resolve()
has been modified to accept anAugmentedSource
. If it’s passed anAugmentedSource
, it uses the contained source to get the stream. To make that work, I’ve added agetReaderFromSAXSource()
to handle the contained source (which is usually/always a SAXSource). Refactored a bit to avoid some code duplication. -
QueryReader.readSourceQuery()
now takes aConfiguration
argument so it can useResourceLoader.urlStream()
-
AbstractResourceCollection.getInputStream()
,AbstractResourceCollection.obtainBinaryContent()
, andAbstractResourceCollection.obtainCharacterContent
now take aConfiguration
argument so they can useResourceLoader.urlStream()
. -
DataURIScheme
usesUnparsedTextResource.FACTORY
to create anUnparsedTextResource
. It doesn’t have aConfiguration
handy so it just passesnull
as theConfiguration
but that’s ok becuase adata:
URI can never be aclasspath:
URI. -
JSONResource
andUnknownResource
now hold onto theConfiguration
that created them so that they can useResourceLoader.urlStream()
. - The constructor for
UnparsedTextResource
is now private. TheFACTORY
can be used to obtain one. TheConfiguration
is now used by the constructor so thatgetContent()
can useResourceLoader.urlStream()
.
Mike it would be good to get your review on commits 891c1849
and 9567e592
. (I haven't merged them into main
yet, pending your review.)
Updated by Norm Tovey-Walsh over 1 year ago
FYI: I've broken the SaxonCS build ... investigating.
Updated by Norm Tovey-Walsh over 1 year ago
I've fixed the SaxonCS problems, I think. See commit 79303539
Updated by Michael Kay over 1 year ago
- Assignee changed from Michael Kay to Norm Tovey-Walsh
Updated by Norm Tovey-Walsh over 1 year ago
- Assignee changed from Norm Tovey-Walsh to Michael Kay
Updated by Norm Tovey-Walsh over 1 year ago
- Status changed from New to Resolved
- Assignee changed from Michael Kay to Norm Tovey-Walsh
- Fix Committed on Branch trunk added
Right. I've pushed the fixes to main as well. The patch didn't apply cleanly to 11 so I haven't done that one yet.
Updated by O'Neil Delpratt over 1 year ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 12.2 added
Bug fix applied in the Saxon 12.2 maintenance release
Please register to edit this issue