RelativeURIResolver and document caching

Added by Anonymous over 16 years ago

Legacy ID: #5030275 Legacy Poster: Vladimir Nesterovsky (vnesterovsky)

To support loading XSLT from my own source, I've implemented the RelativeURIResolver interface. I've noticed, however, that the engine has started to consume memory. This is not the case when URIResolver is implemented. I've checked the code of net.sf.saxon.functions.Document.makeDoc() and found the reason.

1. When the document() function is called for an input XML (not the source XSLT), my RelativeURIResolver returns null to delegate the work to the default resolver, as suggested by the documentation.
2. The cache lookup happens before the default resolver does its work:

    if (resolver instanceof RelativeURIResolver) {
        ...
        documentKey = ((RelativeURIResolver)resolver).makeAbsolute(href, baseURI);
        ...
    } else {
        ...
    }

    // see if the document is already loaded
    doc = controller.getDocumentPool().find(documentKey);
    if (doc != null) {
        return getFragment(doc, fragmentId, c);
    }
    ...
    controller.registerDocument(newdoc, documentKey);

This results in blowing the cache and other sad results.

P.S. My resolver defines a "resource:" protocol that allows XSLT to be loaded from METAINF without specifying the jar.


Replies (6)

RE: RelativeURIResolver and document caching - Added by Anonymous over 16 years ago

Legacy ID: #5030302 Legacy Poster: Michael Kay (mhkay)

I'm sorry, I'm missing something here. It would help to see the code of your RelativeURIResolver. When you say you return null to delegate to the standard URIResolver, which method are you referring to? I don't think the makeAbsolute() method can return null. I think it should be OK for the dereference() method to return null. I don't think it matters what the resolve() method does, because it won't be called - it's only there to satisfy the interface. Presumably if you're running out of memory this is because document() is being called with the same arguments repeatedly and your RelativeURIResolver is not returning the same documentKey each time? Or have I misunderstood completely?
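To illustrate the question about document keys, here is a minimal sketch. It assumes a pool that is nothing more than a map from document key to document; `DocumentPoolSketch` and its methods are hypothetical names for this illustration and are not Saxon's DocumentPool. It shows only that a lookup can hit the cache when repeated calls produce the same key, and that every distinct key adds another entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a document cache keyed by document key.
final class DocumentPoolSketch {
    private final Map<String, Object> pool = new HashMap<>();
    private int loads = 0;

    // Returns the cached document for the key, loading it on a miss.
    Object document(String documentKey) {
        Object doc = pool.get(documentKey);
        if (doc == null) {
            doc = new Object();   // placeholder for parsing the document
            loads++;
            pool.put(documentKey, doc);
        }
        return doc;
    }

    int loads() { return loads; }

    int size() { return pool.size(); }
}
```

If the resolver produced a different key for the same resource on each call, for example by skipping normalization, every call would count as a miss and the pool would keep growing.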

RE: RelativeURIResolver and document caching - Added by Anonymous over 16 years ago

Legacy ID: #5030696 Legacy Poster: Vladimir Nesterovsky (vnesterovsky)

1. I've quoted the source of my resolver below.
2. I have probably misunderstood the interface's contract. If the resource to resolve does not belong to my scheme, I return null. This results in null when the engine calls document() with a parameter like "file:/d:/path/file.xml", which is one of the input files in the file system. Should I resolve other schemes in my RelativeURIResolver?

    /**
     * This class implements an interface that can be called by the processor
     * to turn a URI used in document(), xsl:import, or xsl:include into a
     * Source object.
     */
    public class ResourceURIResolver implements RelativeURIResolver
    {
        /**
         * Create an absolute URI from a relative URI and a base URI.
         * This method performs the process which is correctly called
         * "URI resolution": this is purely a syntactic operation
         * on the URI strings, and does not retrieve any resources.
         * @param href - a relative or absolute URI, to be resolved.
         * @param base - a base URI against which the first argument will be made
         *   absolute if the absolute URI is required.
         * @return a string containing the absolute URI that results from URI
         *   resolution.
         */
        public String makeAbsolute(String href, String base)
            throws TransformerException
        {
            String resourceName = getResourceName(href, base);

            if (resourceName == null)
            {
                return null;
            }

            return prefix + resourceName;
        }

        /**
         * Called by the processor when it encounters
         * an xsl:include, xsl:import, or document() function.
         * @param uri - an absolute URI to be dereferenced.
         * @return a Source object, or null if the href cannot be dereferenced,
         *   and the processor should try to resolve the URI itself.
         */
        public Source dereference(String uri)
            throws TransformerException
        {
            String resourceName = getResourceName(uri, null);

            if (resourceName == null)
            {
                return null;
            }

            ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
            InputStream stream = classLoader.getResourceAsStream(resourceName);

            if (stream == null)
            {
                return null;
            }

            return new StreamSource(stream, uri);
        }

        /**
         * Called by the processor when it encounters
         * an xsl:include, xsl:import, or document() function.
         *
         * This resolver supports protocol "resource:".
         * Note that normalization is not performed, thus
         * "resource:xyz", and "resource:/xyz" are different.
         * @param href An href attribute, which may be relative or absolute.
         * @param base The base URI against which the first argument will be made
         *   absolute if the absolute URI is required.
         * @return A Source object, or null if the href cannot be resolved,
         *   and the processor should try to resolve the URI itself.
         */
        public Source resolve(String href, String base)
            throws TransformerException
        {
            String resourceName = getResourceName(href, base);

            if (resourceName == null)
            {
                return null;
            }

            ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
            InputStream stream = classLoader.getResourceAsStream(resourceName);

            if (stream == null)
            {
                return null;
            }

            return new StreamSource(stream, prefix + resourceName);
        }

        /**
         * Specify the media type of the resource that is expected to be delivered.
         * This information is supplied by the processor primarily to indicate
         * whether the URIResolver is allowed to return an XML tree already parsed.
         * If the value is "text/plain" then the Source returned by the resolve()
         * method should be a StreamSource.
         * @param mediaType the expected media type
         */
        public void setExpectedMediaType(String mediaType)
        {
        }

        /**
         * Gets resource name for base uri and relative part.
         * @param href - a relative or absolute URI, to be resolved.
         * @param base - a base URI against which the first argument will be made
         *   absolute if the absolute URI is required.
         * @return a string containing the absolute URI that results from URI
         *   resolution.
         */
        private static String getResourceName(String href, String base)
            throws TransformerException
        {
            if (href == null)
            {
                return null;
            }

            int hash = href.indexOf('#');

            if (hash >= 0)
            {
                href = href.substring(0, hash);
            }

            if (href.length() == 0)
            {
                return null;
            }

            String url;

            if (href.startsWith(prefix))
            {
                if (href.length() < prefix.length())
                {
                    return null;
                }

                url = href;
            }
            else
            {
                if ((base == null) ||
                    (base.length() < prefix.length()) ||
                    !base.startsWith(prefix))
                {
                    return null;
                }

                int pos = base.lastIndexOf('/');

                if (pos == -1)
                {
                    url = prefix + href;
                }
                else
                {
                    url = base.substring(0, pos + 1) + href;
                }
            }

            try
            {
                return new URI(url.substring(prefix.length())).normalize().toString();
            }
            catch (URISyntaxException e)
            {
                throw new TransformerException(e);
            }
        }

        /**
         * Resource url prefix.
         */
        private static final String prefix = "resource:";
    }

RE: RelativeURIResolver and document caching - Added by Anonymous over 16 years ago

Legacy ID: #5030937 Legacy Poster: Michael Kay (mhkay)

Yes, I'm afraid you have misunderstood the interface. The makeAbsolute() method must always return the absolute URI; it cannot return null. You could call

    net.sf.saxon.java.JavaPlatform.getInstance().makeAbsolute(href, base).toString()

Michael Kay
http://www.saxonica.com/
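A minimal sketch of a makeAbsolute() that never returns null might look like the following. It is an illustration under stated assumptions, not the poster's final code: the class name `AbsoluteUriSketch` is hypothetical, and plain `java.net.URI` resolution stands in for the Saxon platform call suggested above so that the example runs without Saxon on the classpath. `URI.create` is used for brevity; a real resolver would wrap syntax errors in a TransformerException.

```java
import java.net.URI;

// Hypothetical helper: always yields an absolute URI string, never null.
final class AbsoluteUriSketch {
    private static final String PREFIX = "resource:";

    static String makeAbsolute(String href, String base) {
        if (href.startsWith(PREFIX)) {
            // Already in the custom scheme: normalize the part after the prefix.
            return PREFIX + URI.create(href.substring(PREFIX.length())).normalize();
        }
        if (base != null && base.startsWith(PREFIX)) {
            // Relative reference against a resource: base, resolved on the path part only.
            URI basePath = URI.create(base.substring(PREFIX.length()));
            return PREFIX + basePath.resolve(href).normalize();
        }
        // Any other scheme: standard resolution instead of returning null.
        // (Assumption: java.net.URI is used here in place of Saxon's platform call.)
        URI target = URI.create(href);
        if (base == null || target.isAbsolute()) {
            return target.toString();
        }
        return URI.create(base).resolve(target).toString();
    }
}
```

With this shape, a call such as document('file:/d:/path/file.xml') gets a stable, non-null document key, while dereference() can still return null for schemes the resolver does not handle.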

RE: RelativeURIResolver and document caching - Added by Anonymous over 16 years ago

Legacy ID: #5031072 Legacy Poster: Vladimir Nesterovsky (vnesterovsky)

Thank you, your advice has helped me to implement RelativeURIResolver properly. However, that has not solved my original problem, which I should have described in the first place.

My XSLT is in the METAINF folder, say at:

    resource:METAINF/stylesheets/main/source.xslt

I want to get the URI of a relative resource. In particular, I want to call:

    resolve-uri('../other/source.xslt')

I was thinking that RelativeURIResolver could solve the problem (URIResolver does not solve it). Unfortunately, ResolveURI never calls the resolver but uses platform.makeAbsolute directly. This contrasts with the document() implementation.

Thanks.

RE: RelativeURIResolver and document caching - Added by Anonymous over 16 years ago

Legacy ID: #5031298 Legacy Poster: Michael Kay (mhkay)

Correct. The doc() and document() functions allow implementations a lot of latitude in how the URIs are interpreted. The specification for resolve-uri(), by contrast, requires implementations to follow RFC 3986 very closely, so I don't think it would be appropriate to call a user hook here. In any case, if you don't want the standard behaviour of resolve-uri(), why call it? You can call a Java extension function of your own devising if you want to do something different.

RE: RelativeURIResolver and document caching - Added by Anonymous over 16 years ago

Legacy ID: #5031359 Legacy Poster: Vladimir Nesterovsky (vnesterovsky)

1. Taking into account:

    ----
    15.5.4 fn:doc
    fn:doc($uri... ...
    If $uri is read from a source document, it is generally appropriate to resolve it relative to the base URI property of the relevant node in the source document. This can be achieved by calling the fn:resolve-uri function, and passing the resulting absolute URI as an argument to the fn:doc function.
    ----

   I would expect resolve-uri() to follow logic similar to what doc() does to resolve a URI.
2. RFC 3986 is rather liberal regarding what is treated as an acceptable URI. In particular, my custom scheme is acceptable.
3. I did not want to use an extension function, as I expected to integrate my scheme in a natural way.
4. I've played a little with my scheme syntax and found that if I write "resource:/METAINF/..." instead of "resource:METAINF/...", then Java's URI class (and Saxon as a result) succeeds in resolving a relative path for such a protocol.
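The difference described in point 4 can be reproduced with `java.net.URI` alone. The sketch below (the class name `OpaqueVsHierarchical` is hypothetical) relies on documented behaviour of that class: a URI whose scheme-specific part does not start with "/" is opaque, and `resolve()` on an opaque base simply returns the reference it was given.

```java
import java.net.URI;

// Hypothetical demo of opaque versus hierarchical base URIs.
final class OpaqueVsHierarchical {
    // Resolves a relative reference against a base using java.net.URI only.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    // True when the URI has a scheme and its scheme-specific part does not start with '/'.
    static boolean isOpaque(String uri) {
        return URI.create(uri).isOpaque();
    }
}
```

Calling `resolve("resource:METAINF/stylesheets/main/source.xslt", "../other/source.xslt")` hands back the relative reference unchanged, whereas the same call with the base "resource:/METAINF/stylesheets/main/source.xslt" yields "resource:/METAINF/stylesheets/other/source.xslt".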