Bug #5635

closed

I cannot disable the catalog resolver disk cache and avoid it creating the cache folder and files

Added by Radu Coravu over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Low
Category: Resolvers
Sprint/Milestone: -
Start date: 2022-08-05
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

Any time a new transformer is created, a bunch of files get created here: C:/Users/radu_coravu/.xmlresolver.org/cache/. I've already set up the code to disable the cache:

    ResourceResolver rsrcResolver = config.getResourceResolver();
    if (rsrcResolver instanceof CatalogResourceResolver) {
      ((CatalogResourceResolver) rsrcResolver).setFeature(ResolverFeature.CACHE_ENABLED, Boolean.FALSE);
    }

but it's too late: that folder is still created, and content is read from it, while the Configuration is being created:

	at org.xmlresolver.cache.ResourceCache.reset(ResourceCache.java:211)
	at org.xmlresolver.cache.ResourceCache.<init>(ResourceCache.java:145)
	at org.xmlresolver.XMLResolverConfiguration.getFeature(XMLResolverConfiguration.java:1098)
	at org.xmlresolver.CatalogResolver.<init>(CatalogResolver.java:51)
	at org.xmlresolver.Resolver.<init>(Resolver.java:68)
	at net.sf.saxon.lib.CatalogResourceResolver.<init>(CatalogResourceResolver.java:46)
	at net.sf.saxon.Configuration.init(Configuration.java:622)
	at net.sf.saxon.Configuration.<init>(Configuration.java:433)
	at net.sf.saxon.s9api.Processor.<init>(Processor.java:74)

In general I do not see why an XML catalog resolver should have a cache folder on disk. Saxon doesn't have a cache and it's pretty fast. Are there performance tests which demonstrate that a disk cache for the resolver is useful? Also, for now we are not really using the xmlresolver for anything, so it would be nice to have some way of disabling its cache before it writes to and reads from that cache folder.

Actions #1

Updated by Michael Kay over 1 year ago

  • Assignee set to Norm Tovey-Walsh
Actions #2

Updated by Norm Tovey-Walsh over 1 year ago

The cache is supposed to help folks who access resources that aren't in the catalog, for example if you went and got a version of the JATS DTD or the DITA schemas without having a catalog for them. But it's also supposed to be possible to turn it off. I'll investigate.

Perhaps the default should be to disable the cache. Of course, that means almost no one will ever turn it on so it won't provide any benefit. It's not clear what the right answer is.

Actions #3

Updated by Radu Coravu over 1 year ago

In my opinion the role of an XML catalog resolver is to redirect a reference to another location based on the information in the XML catalog files. If the catalog files have no mapping for a resource, it returns null. If there is a need to cache the content of remote resources which are not resolved through the XML catalog, that component should live outside the XML catalog resolver implementation, or at least be something which needs to be explicitly enabled.

In my case the xmlresolver does not even receive a list of XML catalog files to parse, and it still creates the folder structure ".xmlresolver.org/cache" and still attempts to read "control.xml" from there every time a new transformer is created. I noticed this in an automated test where I had a listener for files being read from disk while the transformer was running. Anyway, it does not influence us much either way.

Actions #4

Updated by Norm Tovey-Walsh over 1 year ago

Hi Radu,

Just FYI, I’ve been struggling with some .NET 6 issues, but I think
those are behind me now. Tomorrow, I’ll be trying to sort out why
disabling the cache isn’t working for you.

I’ve also reached the conclusion that enabling the cache by default is
probably a mistake, so I’ll publish a new version of the XML Resolver
that ships with the default set to “false”.

That’ll also fix it for you, of course, but I’d like to figure out why
the API didn’t work as I expected.

Be seeing you,
norm

--
Norm Tovey-Walsh
Saxonica

Actions #5

Updated by Norm Tovey-Walsh over 1 year ago

Disabling the cache as you described works, but unfortunately, the cache directory and files are created before your code runs. I don't see any way to avoid that. If you set the system property xml.catalog.cacheEnabled to false before initializing Saxon, the cache directories will not be created.

However, I'm in the process of building and releasing version 4.5.0 of the XML Resolver which has caching disabled by default. Simply including that version of the resolver in your build should fix the problem as well.
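For anyone staying on an older resolver version, the system-property workaround mentioned above could look roughly like the sketch below. The property name `xml.catalog.cacheEnabled` comes from this comment; the class and method names are hypothetical, and the Saxon call is left as a comment so the snippet compiles without Saxon on the classpath.

```java
public class DisableResolverCache {

    // Property name as given in the comment above.
    static final String CACHE_PROPERTY = "xml.catalog.cacheEnabled";

    /**
     * Hypothetical helper: sets the system property that switches the
     * resolver cache off. It has to run before the first Saxon
     * Configuration or Processor is constructed, because the cache
     * directory is created during that construction.
     */
    static void disableResolverCache() {
        System.setProperty(CACHE_PROPERTY, "false");
    }

    public static void main(String[] args) {
        disableResolverCache();

        // Only after this point create Saxon objects, for example:
        //   Processor processor = new Processor(false);

        System.out.println(CACHE_PROPERTY + "=" + System.getProperty(CACHE_PROPERTY));
    }
}
```

Passing `-Dxml.catalog.cacheEnabled=false` on the JVM command line sets the same system property without a code change.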

Actions #6

Updated by Norm Tovey-Walsh over 1 year ago

  • Status changed from New to Resolved

XML Resolver 4.5.0 has been published on Maven.

Actions #7

Updated by Radu Coravu over 1 year ago

Thanks Norm.

Actions #8

Updated by Radu Coravu over 1 year ago

I confirm that after updating to 4.5.0 and creating transformers, the ".xmlresolver.org" folder no longer seems to appear on disk.
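A check like this can be scripted. The sketch below assumes the folder lives directly under the user's home directory, which matches the path in the description but is not confirmed as the resolver's general default; the class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ResolverCacheCheck {

    /** Assumed location: <user.home>/.xmlresolver.org, as in the description. */
    static Path resolverCacheRoot() {
        return Paths.get(System.getProperty("user.home"), ".xmlresolver.org");
    }

    /** True if the folder is currently present on disk. */
    static boolean cacheFolderExists() {
        return Files.exists(resolverCacheRoot());
    }

    public static void main(String[] args) {
        // Run this after creating a transformer to see whether the folder appeared.
        System.out.println(resolverCacheRoot() + " exists: " + cacheFolderExists());
    }
}
```

Deleting the folder before the test run avoids a false positive from a folder left behind by an older resolver version.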

Actions #9

Updated by Michael Kay over 1 year ago

  • Category set to Resolvers
Actions #10

Updated by Community Admin over 1 year ago

  • Status changed from Resolved to Closed

Closing this issue, as the fix has been applied in XML Resolver 4.5.0, as mentioned in comment #6.
