Feature #5021

Option to supply base URI of the source text in SaxonJS.getResource() for Node.js

Added by Yury Palyanitsa 7 days ago. Updated about 7 hours ago.

Status: In Progress
Priority: Low
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2021-06-11
Due date:
% Done: 0%
Estimated time:
Applies to JS Branch:
Fix Committed on JS Branch:
Fixed in JS Release:
SEF Generated with:
Company: -
Contact person: -
Additional contact persons: -

Description

I have a case where I would like to use the base-uri() function on a resource I resolved with SaxonJS.getResource({text: <source_text>, type: "xml"}). The reason is that I need to resolve links that are relative to the document I resolved from its text, while embedding it into the document I use in SaxonJS.transform().

I was trying the following approaches in my attempt to make it work:

  1. Use text and location options together (like it is done in SaxonJS.transform()), then use a stylesheet parameter to access the resolved document: the text option is ignored and the document is resolved using the location option.
  2. Use only the text option and supply {"<source_uri>": "<resource_object>"} in documentPool, then use doc($<source_uri>) to access the document: the document is still fetched by its URI instead of being taken from the provided resource.

I noticed that when I use the "location" option, the returned object carries _saxonBaseUri and _saxonDocUri properties that contain the URI of the resolved document. I tried setting _saxonBaseUri and _saxonDocUri directly on the resource I got from the text option, and it actually worked!
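For reference, the workaround described above can be sketched roughly as follows. This is only an illustration, not a supported API: _saxonBaseUri and _saxonDocUri are internal properties observed in this report and may change, and the helper names stampBaseUri and loadXmlWithBaseUri are made up for this example. The saxon argument is assumed to be the object returned by require("saxon-js").

```javascript
const { pathToFileURL } = require("url");

// Set the URI that a resource loaded from text should report.
// ASSUMPTION: _saxonBaseUri and _saxonDocUri are undocumented internals
// (as observed in this issue), not part of the public SaxonJS API.
function stampBaseUri(resource, filePath) {
  const uri = pathToFileURL(filePath).href;
  resource._saxonBaseUri = uri;
  resource._saxonDocUri = uri;
  return resource;
}

// Hypothetical helper: parse XML from text, then attach the URI of the
// file the text originally came from.
async function loadXmlWithBaseUri(saxon, text, filePath) {
  const resource = await saxon.getResource({ text: text, type: "xml" });
  return stampBaseUri(resource, filePath);
}
```

The resulting resource can then be passed to the stylesheet (for example as a stylesheet parameter), where base-uri() should report the file URI that was set.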

So, is there a proper way to do what I described, in case I am doing it wrong? And if not, would it be possible to provide such an option, to align the behavior with SaxonJS.transform()?

History

#1 Updated by Yury Palyanitsa 7 days ago

Please change the issue type from "Bug" to "Feature". Did this by accident.

#2 Updated by Debbie Lockett 7 days ago

  • Tracker changed from Bug to Feature

#3 Updated by Debbie Lockett 7 days ago

Thanks for the suggestion. I'm actually looking at SaxonJS.getResource at the moment (related to #5017 and #4748). I think you are right that there are currently some gaps in the API.

Could you share a sample repro - i.e. XML source(s) and XSLT stylesheet? That would be helpful for me to reproduce the issues you describe, to do with using the SaxonJS API.

In point 1, you say you tried to "Use text and location options together (like it is done in SaxonJS.transform())". Could you expand what you mean by "like it is done in SaxonJS.transform()"? Do you mean using the sourceText and sourceLocation options together? It's a surprise to me if that works!

#4 Updated by Yury Palyanitsa 7 days ago

Hello Debbie,

Well, then it's surprising to me too if that is not actually supposed to work :) But it does indeed work this way, and I demonstrate it in the sample repro.

Here's the github repo with the samples that reproduce the environment in the issue: https://github.com/deiteris/saxonica-issue-5021

The structure:

  • The MasterPages/ folder includes MasterPage.htm that is used as primary source in SaxonJS.transform(). It also includes Stylesheet.css in the same folder to demonstrate how the link will resolve for it.
  • The Documents/ folder includes Document.htm that is resolved using SaxonJS.getResource(). It also includes a stylesheet link whose relative href points to ../Styles/DocumentStylesheet.css.
  • main.js is used to start the transformation and will output the base-uri() of the link//@href selector, and the resulting output. Notice two TODO comments in it that outline what I said.

Instructions:

  • Clone the project, go to project folder and run npm install
  • Run node main.js

#5 Updated by Yury Palyanitsa 7 days ago

Update: I was not correct about using sourceText and sourceLocation in SaxonJS.transform(). When both are supplied, sourceLocation is actually used.

#6 Updated by Yury Palyanitsa 7 days ago

Sorry for the spam as I see no way to edit my messages, but I actually triple-checked the behavior of SaxonJS.transform(). And yes, it actually works with both options supplied!

How I checked this:

I am developing a plugin for live previewing of transformed XML documents. When the editor's content changes, it is not written to disk immediately, but I can get the contents of the editor directly. When I supply only sourceLocation, Saxon fetches the document contents from disk and my preview doesn't update as I type, because the file on disk has not been updated yet. BUT if I additionally provide the contents of the editor in sourceText, the preview updates as I type, and base-uri() works at the same time.
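A rough sketch of the option combination described above, for anyone trying to reproduce it. It relies on the observed (not documented) behaviour that sourceText supplies the content while sourceLocation supplies the base URI, and it runs the transform asynchronously. The helper names buildPreviewOptions and renderPreview and all paths are hypothetical; saxon is assumed to be the object returned by require("saxon-js").

```javascript
const { pathToFileURL } = require("url");

// Build SaxonJS.transform() options for a live preview.
// ASSUMPTION: supplying sourceText together with sourceLocation is the
// observed behaviour from this report, not a documented combination.
function buildPreviewOptions(stylesheetSefPath, editorText, documentPath) {
  return {
    stylesheetFileName: stylesheetSefPath,            // compiled SEF file on disk
    sourceText: editorText,                           // unsaved editor contents
    sourceLocation: pathToFileURL(documentPath).href, // used only for the base URI
    destination: "serialized",
  };
}

// Hypothetical helper: run the transform and return the serialized output.
async function renderPreview(saxon, stylesheetSefPath, editorText, documentPath) {
  const options = buildPreviewOptions(stylesheetSefPath, editorText, documentPath);
  const result = await saxon.transform(options, "async");
  return result.principalResult;
}
```

Calling renderPreview on every editor change would then give the behaviour described: the output follows the unsaved text, while relative links are resolved against the file's location.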

#7 Updated by Debbie Lockett 7 days ago

  • Status changed from New to In Progress

Thanks for the repro, etc.

It looks like the SaxonJS.transform() behaviour you have discovered of using sourceText with sourceLocation to provide the base URI is possibly an unintentional quirk in the code. It looks like it only works this way for an asynchronous transform (and not for a synchronous transform), and I don't think it was really by design. I think the intention is that sourceLocation, sourceFileName, sourceNode, and sourceText are mutually exclusive (though we no longer check that only one is used). Perhaps we should be adding a sourceBaseUri option which can be used with sourceText (and possibly sourceNode) to allow the base URI for this source to be supplied; rather like we have the stylesheetBaseURI option which can be used with stylesheetText or stylesheetInternal to supply the static base URI of the stylesheet.

#8 Updated by Debbie Lockett about 7 hours ago

Thanks again for your repro. I have been using it to run a number of tests, trying out different combinations of the options for supplying the stylesheet, primary and secondary sources, to check how the base URI for each of these is handled. As well as the need to add options to supply the base URI of sources loaded from text for both SaxonJS.getResource and SaxonJS.transform, the testing has also shown that there is some other tidying up to do in this area (i.e. we could better align the initial processing while loading sources and stylesheets for asynchronous and synchronous transforms).

By the way, in your initial report, in point 2 you said that you had not managed to use the documentPool for the preloaded secondary source; but this should work. I wonder if there was an issue in the way that you created the documentPool? Having loaded the resource with SaxonJS.getResource({text: <source_text>, type: "xml"}), and manually set its base URI with _saxonBaseUri and _saxonDocUri, the following should work:

    const documentPool = {};
    documentPool[<source_uri>] = <resource_object>;

This documentPool can then be supplied in the SaxonJS.transform, and you can use doc($<source_uri>) to access the preloaded resource from the stylesheet. Obviously you have found a good alternative with supplying the resource in a stylesheet parameter, but you might like to try again with the documentPool, as this is what it was designed for. Let me know if you still have issues using this.
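To make the documentPool suggestion concrete, here is a minimal sketch of how the options might be assembled. The helper name withPreloadedDocument and the stylesheet parameter name "source-uri" are hypothetical; the stylesheet is assumed to declare a matching parameter and call doc($source-uri). The resource is assumed to have been loaded with SaxonJS.getResource and to have had _saxonBaseUri and _saxonDocUri set as described above.

```javascript
// Return a copy of the transform options with one preloaded document
// registered in the documentPool under the given URI.
// ASSUMPTION: "source-uri" is an illustrative stylesheet parameter name.
function withPreloadedDocument(baseOptions, sourceUri, resource) {
  const documentPool = {};
  documentPool[sourceUri] = resource; // key must match the URI passed to doc()
  return {
    ...baseOptions,
    documentPool: documentPool,
    stylesheetParams: {
      ...(baseOptions.stylesheetParams || {}),
      "source-uri": sourceUri,
    },
  };
}
```

The returned object can be passed to SaxonJS.transform() in place of the original options; the original options object is left unmodified.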
