Feature #5021
Option to supply base URI of the source text in SaxonJS.getResource() for Node.js
Status: Closed, 100% done
Description
I have a case where I would like to use the base-uri() function on a resource I resolved with SaxonJS.getResource({text: <source_text>, type: "xml"}). The reason is that I need to resolve links that are relative to the document I resolved from its text, while embedding it into the document I use in SaxonJS.transform().
I tried the following approaches in my attempt to make it work:
1. Use the text and location options together (like it is done in SaxonJS.transform()), then use a stylesheet parameter to access the resolved document: the text option is ignored and the document is resolved using the location option.
2. Use only the text option and supply {"<source_uri>": "<resource_object>"} in documentPool, then use doc($<source_uri>) to access the document: the document is still fetched by its URI instead of the provided resource.
I noticed that when I use the location option, the returned object is provided with _saxonBaseUri and _saxonDocUri properties that contain the URI of the resolved document. I tried setting _saxonBaseUri and _saxonDocUri directly on the resource I got, and it actually worked!
So, is there a proper way to do what I described, in case I am doing it wrong? And if not, would it be possible to provide such an option to align the behavior with SaxonJS.transform()?
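For reference, the workaround described above can be sketched roughly as follows. This is only an illustration of what was tried, not a supported API: _saxonBaseUri and _saxonDocUri are internal properties, and the helper name getTextResourceWithBaseUri is hypothetical. The saxon argument is assumed to be the object returned by require("saxon-js"); it is passed in so the helper does not depend on how the library is loaded.

```javascript
// Hypothetical helper illustrating the workaround from this report.
// `saxon` is assumed to be the object returned by require("saxon-js").
// NOTE: _saxonBaseUri and _saxonDocUri are internal, undocumented properties;
// setting them by hand is a workaround, not an official API.
async function getTextResourceWithBaseUri(saxon, sourceText, sourceUri) {
  // Parse the XML text into a resource (getResource returns a Promise).
  const resource = await saxon.getResource({ text: sourceText, type: "xml" });
  // Mimic what the `location` option appears to set on the resolved document.
  resource._saxonBaseUri = sourceUri;
  resource._saxonDocUri = sourceUri;
  return resource;
}

// Possible usage (assumes the saxon-js package is installed):
//   const SaxonJS = require("saxon-js");
//   const doc = await getTextResourceWithBaseUri(SaxonJS, text, "file:///path/to/Document.htm");
```

The resulting resource can then be passed to the transformation, for example as a stylesheet parameter, so that base-uri() returns the supplied URI.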
Updated by Yury Palyanitsa over 3 years ago
Please change the issue type from "Bug" to "Feature". Did this by accident.
Updated by Debbie Lockett over 3 years ago
Thanks for the suggestion. I'm actually looking at SaxonJS.getResource at the moment (related to #5017 and #4748). I think you are right that there are currently some gaps in the API.
Could you share a sample repro - i.e. XML source(s) and XSLT stylesheet? That would be helpful for me to reproduce the issues you describe, to do with using the SaxonJS API.
In point 1, you say you tried to "Use text and location options together (like it is done in SaxonJS.transform())". Could you expand what you mean by "like it is done in SaxonJS.transform()"? Do you mean using the sourceText and sourceLocation options together? It's a surprise to me if that works!
Updated by Yury Palyanitsa over 3 years ago
Hello Debbie,
Well, then it's surprising for me too if that is not actually supposed to work :) But it indeed works this way and I demonstrated it in the sample repro.
Here's the github repo with the samples that reproduce the environment in the issue: https://github.com/deiteris/saxonica-issue-5021
The structure:
- The MasterPages/ folder includes MasterPage.htm, which is used as the primary source in SaxonJS.transform(). It also includes Stylesheet.css in the same folder to demonstrate how the link will resolve for it.
- The Documents/ folder includes Document.htm, which is resolved using SaxonJS.getResource(). It also includes a stylesheet whose relative link points to ../Styles/DocumentStylesheet.css
- main.js is used to start the transformation and will output the base-uri() of the link//@href selector, and the resulting output. Notice the two TODO comments in it that outline what I said.
Instructions:
- Clone the project, go to the project folder and run npm install
- Run node main.js
Updated by Yury Palyanitsa over 3 years ago
Update: I was not correct about using sourceText and sourceLocation in SaxonJS.transform(). When both are supplied, sourceLocation is actually used.
Updated by Yury Palyanitsa over 3 years ago
Sorry for the spam, as I see no way to edit my messages, but I actually triple-checked the behavior of SaxonJS.transform(). And yes, it actually works with both options supplied!
How I checked this:
I am developing a plugin for live previewing of transformed XML documents. When the editor's content changes, it is not stored on disk immediately, but I am able to get the contents of the editor directly. When I supply only sourceLocation, Saxon fetches the document contents from disk and my preview doesn't update as I type, because the file is not updated on disk yet. BUT, if I additionally provide the contents of the editor in sourceText, the preview updates as I type, and base-uri() works at the same time.
Updated by Debbie Lockett over 3 years ago
- Status changed from New to In Progress
Thanks for the repro, etc.
It looks like the SaxonJS.transform() behaviour you have discovered of using sourceText with sourceLocation to provide the base URI is possibly an unintentional quirk in the code. It looks like it only works this way for an asynchronous transform (and not for a synchronous transform), and I don't think it was really by design. I think the intention is that sourceLocation, sourceFileName, sourceNode, and sourceText are mutually exclusive (though we no longer check that only one is used). Perhaps we should be adding a sourceBaseUri option which can be used with sourceText (and possibly sourceNode) to allow the base URI for this source to be supplied; rather like we have the stylesheetBaseURI option which can be used with stylesheetText or stylesheetInternal to supply the static base URI of the stylesheet.
Updated by Debbie Lockett over 3 years ago
Thanks again for your repro. I have been using it to run a number of tests trying out different combinations of the options for supplying the stylesheet, primary and secondary sources, to check how the base URIs for each of these are handled. As well as the need to add options to supply the base URI of sources loaded from text for both SaxonJS.getResource and SaxonJS.transform, the testing has also shown that there is some other tidying up to do in this area (i.e. we could better align the initial processing while loading sources and stylesheets for asynchronous and synchronous transforms).
By the way, in your initial report, in point 2 you said that you had not managed to use the documentPool for the preloaded secondary source; but this should work. I wonder if there was an issue in the way that you created the documentPool? Having loaded the resource with SaxonJS.getResource({text: <source_text>, type: "xml"}), and manually set its base URI with _saxonBaseUri and _saxonDocUri, the following should work:
const documentPool = {};
documentPool[<source_uri>] = <resource_object>;
This documentPool can then be supplied in the SaxonJS.transform, and you can use doc($<source_uri>) to access the preloaded resource from the stylesheet. Obviously you have found a good alternative with supplying the resource in a stylesheet parameter, but you might like to try again with the documentPool, as this is what it was designed for. Let me know if you still have issues using this.
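As a rough sketch of how the pieces above might fit together, the helper below builds a transform options object with a documentPool entry for the preloaded resource. The function name buildTransformOptions and the stylesheet parameter name secondary-uri are hypothetical, and the file locations in the usage comment are placeholders; stylesheetLocation, sourceLocation, documentPool, stylesheetParams and destination are SaxonJS.transform options.

```javascript
// Hypothetical helper: builds options for SaxonJS.transform with a preloaded
// secondary source registered in the document pool under its URI.
function buildTransformOptions(stylesheetLocation, sourceLocation, secondaryUri, secondaryResource) {
  const documentPool = {};
  // Key: the URI the stylesheet will pass to doc(); value: the preloaded resource.
  documentPool[secondaryUri] = secondaryResource;
  return {
    stylesheetLocation, // compiled stylesheet (SEF) location
    sourceLocation,     // primary source
    documentPool,
    // "secondary-uri" is an assumed parameter name; the stylesheet would call
    // doc($secondary-uri) to fetch the pooled document.
    stylesheetParams: { "secondary-uri": secondaryUri },
    destination: "serialized",
  };
}

// Possible usage (assumes saxon-js is installed and `resource` was preloaded):
//   const SaxonJS = require("saxon-js");
//   const options = buildTransformOptions("main.sef.json", "MasterPages/MasterPage.htm",
//                                         "file:///path/to/Documents/Document.htm", resource);
//   SaxonJS.transform(options, "async").then((out) => console.log(out.principalResult));
```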
Updated by Yury Palyanitsa over 3 years ago
> By the way, in your initial report, in point 2 you said that you had not managed to use the documentPool for the preloaded secondary source; but this should work. I wonder if there was an issue in the way that you created the documentPool?
The document pool works too, and I can access the preloaded resource using doc($<source_uri>) when I use documentPool; my point 2 is actually misleading here. What I meant is that documentPool doesn't help with the empty base-uri() issue for the preloaded text resource, even though I provide a document URI with the resource.
> Obviously you have found a good alternative with supplying the resource in a stylesheet parameter, but you might like to try again with the documentPool, as this is what it was designed for. Let me know if you still have issues using this.
In this specific case, I work with only 2 files so I don't see much benefit from using documentPool, but in a mass-transform scenario I'll definitely come back to it. It's demonstrated the same way on the documentation page https://www.saxonica.com/saxon-js/documentation/index.html#!api/getResource
Updated by Debbie Lockett over 3 years ago
A number of closely related but separate issues have been raised and discovered while investigating this feature request, which all need to be addressed:
1. Add option to set base URI for source loaded with SaxonJS.getResource from text.
2. Add option to set base URI for SaxonJS.transform source supplied by sourceText.
3. Source base URI is not set correctly from sourceFileName for async SaxonJS.transform.
4. SaxonJS.transform options sourceLocation and stylesheetLocation do not always handle relative URIs correctly.
5. SaxonJS.getResource option location does not handle relative URIs.
Further notes on the current status (for the latest release Saxon-JS 2.2):
Issue 1: The workaround is to set the _saxonBaseUri property manually, but an option should be added to the API.
Issue 2: Currently, for an asynchronous transform, there is a workaround to supply the source base URI using sourceLocation; but this is not really how the API is designed to work. The transform options for supplying the source (and stylesheet) should be mutually exclusive, and we should add a check for this in the code, to avoid confusion about which options are being used.
Issue 3: There is a bug in the code for asynchronous transforms; sourceFileName (absolute or relative) is not resolved correctly before being used to set the base URI.
Issue 4: There are 2 bugs in the code. For asynchronous transforms, at the point that they are used to obtain the resources, locations (sourceLocation and stylesheetLocation) are assumed to be absolute. These values have earlier been normalised to absolute URIs if working in the browser, but not on Node.js. For synchronous transforms, on Node.js there is a bug in resolving against the current working directory (caused by a missing trailing slash).
Issue 5: Currently the SaxonJS.getResource option location is assumed to be absolute. Meanwhile the SaxonJS.transform location options handle relative URIs, resolved against the current working directory on Node.js, and against the location of the HTML page in the browser. It seems reasonable to align the SaxonJS.getResource location option to similarly handle relative URIs.
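To illustrate why a missing trailing slash matters when resolving against the current working directory (Issue 4), here is a small sketch using Node's built-in URL support. It is not the Saxon-JS code; the helper names and the example paths are made up for illustration. Under standard URI resolution, a base without a trailing slash has its last segment treated as a file name and dropped.

```javascript
const path = require("path");
const { pathToFileURL } = require("url");

// Hypothetical helper: resolve a (possibly relative) URI against a base URI.
function resolveAgainstBase(relativeUri, baseUri) {
  return new URL(relativeUri, baseUri).href;
}

// Hypothetical helper: the current working directory as a base URI.
// The trailing separator is what makes it resolve as a directory.
function cwdBaseUri() {
  return pathToFileURL(process.cwd() + path.sep).href;
}

// Without the trailing slash, "project" is replaced rather than descended into:
const withoutSlash = resolveAgainstBase("Documents/Document.htm", "file:///home/user/project");
// → "file:///home/user/Documents/Document.htm"

// With the trailing slash, the relative URI resolves inside the directory:
const withSlash = resolveAgainstBase("Documents/Document.htm", "file:///home/user/project/");
// → "file:///home/user/project/Documents/Document.htm"

console.log(withoutSlash);
console.log(withSlash);
console.log(resolveAgainstBase("Documents/Document.htm", cwdBaseUri()));
```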
A set of Node.js tests covering these issues has been added (see src/test/nodejs/iss5021_test.js).
Updated by Debbie Lockett over 3 years ago
- Status changed from In Progress to Resolved
- Assignee set to Debbie Lockett
- Applies to JS Branch 2 added
- Fix Committed on JS Branch 2 added
Code fixes and documentation updates committed.
New options added to set base URIs for primary and secondary sources:
- baseURI for SaxonJS.getResource for use with text;
- sourceBaseURI for SaxonJS.transform for use with sourceText or sourceNode.
The other bugs noted above have also been fixed.
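A minimal sketch of how the new options might be used is given below. The option names baseURI and sourceBaseURI are the ones listed above; the helper names, file names and URIs are hypothetical, and the exact behaviour should be checked against the Saxon-JS 2.3 documentation.

```javascript
// Hypothetical helper: options for SaxonJS.getResource when loading XML from
// text, supplying the base URI through the new baseURI option.
function textResourceOptions(text, baseURI) {
  return { text, type: "xml", baseURI };
}

// Hypothetical helper: options for SaxonJS.transform when the primary source
// is supplied as text, with its base URI given through sourceBaseURI.
function textTransformOptions(stylesheetLocation, sourceText, sourceBaseURI) {
  return {
    stylesheetLocation, // compiled stylesheet (SEF) location
    sourceText,
    sourceBaseURI,
    destination: "serialized",
  };
}

// Possible usage (assumes saxon-js 2.3 or later is installed):
//   const SaxonJS = require("saxon-js");
//   const doc = await SaxonJS.getResource(
//     textResourceOptions(documentText, "file:///path/to/Documents/Document.htm"));
//   const out = await SaxonJS.transform(
//     textTransformOptions("main.sef.json", editorText, "file:///path/to/MasterPages/MasterPage.htm"),
//     "async");
//   console.log(out.principalResult);
```

With these options there should be no need to set _saxonBaseUri by hand or to combine sourceText with sourceLocation.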
Updated by Debbie Lockett over 3 years ago
- % Done changed from 0 to 100
- Fixed in JS Release set to Saxon-JS 2.3
Bug fix applied in the Saxon-JS 2.3 maintenance release.
Updated by Debbie Lockett over 3 years ago
- Status changed from Resolved to Closed