Bug #6222
closed
XMLResolver fails on Windows paths in system identifiers that are reachable but can't be resolved, even if "-Dxml.catalog.fixWindowsSystemIdentifiers=true" was given.
Description
XMLResolver fails on Windows paths in system identifiers that are reachable but can't be resolved, even if "-Dxml.catalog.fixWindowsSystemIdentifiers=true" was given.
stack trace:
java.lang.IllegalArgumentException: Illegal character in path at index 1: .\test.dtd
at java.base/java.net.URI.create(Unknown Source)
at java.base/java.net.URI.resolve(Unknown Source)
at org.xmlresolver.Resolver.openConnection(Resolver.java:250)
at org.xmlresolver.Resolver.resolveEntity(Resolver.java:204)
at net.sf.saxon.lib.CatalogResourceResolver.resolveEntity(CatalogResourceResolver.java:194)
at net.sf.saxon.lib.EntityResolverWrappingResourceResolver.resolveEntity(EntityResolverWrappingResourceResolver.java:46)
[…]
Steps to reproduce (see attached ZIP file):
xml-inputfile.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root SYSTEM ".\test.dtd">
<root id="id12345"/>
test.dtd
- is a valid DTD
- is located in the same directory as xml-inputfile.xml
command line:
java -cp "path\to\saxon\*;path\to\saxon\lib\*" -Dxml.catalog.fixWindowsSystemIdentifiers=true net.sf.saxon.Transform -xsl:"main.xsl" -s:"xml-inputfile.xml" -o:"xml-outputfile.xml" [-catalog:catalog_0X.xml]
Current Results
- no catalog given (run_01.bat) - fails
- catalog is given and system identifier can be resolved (run_02.bat uses catalog_01.xml) - works
- catalog is given and system identifier can't be resolved (run_03.bat uses catalog_02.xml) - fails
Expected results (like run_02.bat):
- input file is loaded
- no exception is thrown
- no warning appears
Updated by Stefan Krause about 1 year ago
- File #6222_xmlresolver.zip added
Updated by Norm Tovey-Walsh about 1 year ago
- Status changed from New to AwaitingInfo
I think this is the expected behavior.
If I'm understanding this correctly, what's happening is that run_03.bat attempts to resolve .\test.dtd with catalog_02.xml. There is no matching catalog entry for the DTD, so no match is found. The resolver reports failure and Saxon goes on to try the original URI, .\test.dtd, which isn't a valid URI.
You can see this if you set -Dxml.catalog.defaultLoggerLogLevel=debug:
...
config: Fix Windows system identifiers: true
config: Default logger log level: debug
config: Searching for catalogs on classpath:
config: Catalog: jar:file:/C:/Users/norm/Desktop/%236222_xmlresolver/SaxonHE12-3J/lib/xmlresolver-5.2.0-data.jar!/org/xmlresolver/catalog.xml
config: Throw URI exceptions: true
config: Catalog list cleared
config: Catalog: file:/C:/Users/norm/Desktop/%236222_xmlresolver/catalog_02.xml
request: resolveEntity: ./test.dtd (baseURI: file:/C:/Users/norm/Desktop/%236222_xmlresolver/xml-inputfile.xml, publicId: null)
config: Loaded catalog: file:/C:/Users/norm/Desktop/%236222_xmlresolver/catalog_02.xml
config: Loaded catalog: jar:file:/C:/Users/norm/Desktop/%236222_xmlresolver/SaxonHE12-3J/lib/xmlresolver-5.2.0-data.jar!/org/xmlresolver/catalog.xml
response: resolveEntity: null
java.lang.IllegalArgumentException: Illegal character in path at index 1: .\test.dtd
at java.base/java.net.URI.create(URI.java:883)
at java.base/java.net.URI.resolve(URI.java:1066)
Note that the resolver is looking for ./test.dtd, indicating that it has fixed the Windows system identifier.
The point at which the resolver responds null is the point at which the resolver exits the picture. At this point, Saxon has no choice but to ask for the original URI and that...isn't a URI.
XML system identifiers are not filenames or file paths; they are URIs, and "\" is not a valid character in a URI.
If you can't fix the input documents to contain valid system identifiers as URIs, then your best bet is to make sure that the catalog successfully resolves the identifiers.
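The failure in the stack trace can be reproduced with java.net.URI alone. The following is a minimal sketch, not code from Saxon or XMLResolver: the class name, the helper names, and the base URI file:/C:/work/xml-inputfile.xml are made up for illustration. It shows that the raw system identifier is rejected by URI.create, while the same identifier with "\" replaced by "/" resolves against the base URI.

```java
import java.net.URI;

// Illustrative only: class, helper names, and the base URI are assumptions,
// not part of Saxon or XMLResolver.
class SystemIdDemo {

    // Hypothetical helper: replace Windows path separators with URI separators.
    static String fixBackslashes(String systemId) {
        return systemId.replace('\\', '/');
    }

    // Returns true if java.net.URI accepts the string as a URI reference.
    static boolean isValidUri(String candidate) {
        try {
            URI.create(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            // Same exception type as in the stack trace above.
            return false;
        }
    }

    public static void main(String[] args) {
        URI base = URI.create("file:/C:/work/xml-inputfile.xml"); // assumed location
        String raw = ".\\test.dtd"; // the system identifier from the DOCTYPE

        System.out.println("raw is a valid URI:   " + isValidUri(raw));
        System.out.println("fixed is a valid URI: " + isValidUri(fixBackslashes(raw)));
        System.out.println("fixed, resolved:      " + base.resolve(fixBackslashes(raw)));
    }
}
```

This only illustrates why the fallback to the original identifier fails; it says nothing about where in Saxon such a correction would belong.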
Updated by Stefan Krause about 1 year ago
Sorry for the delayed response, I was out of office.
I see your point here, and you can close this ticket.
For now, this is just a problem with our test cases. Maybe in the future we will need an easy solution to make run_01.bat and run_03.bat work, since we cannot know in advance which DTDs were referenced. Please consider that many applications, including OxygenXML, XMetaL, and Saxon, have accepted and/or produced such entity references during the last 25 years.