Project

Profile

Help

Support #6352

closed

Reduced performance of Saxon-JS 2.6 compared to 2.4

Added by Johan Gheys 2 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
2024-02-19
Due date:
% Done:

0%

Estimated time:
Applies to JS Branch:
2, Trunk
Fix Committed on JS Branch:
Fixed in JS Release:
SEF Generated with:
Platforms:
Company:
-
Contact person:
-
Additional contact persons:
-

Description

We have a single page application that allows our users to draw a so-called geoschematic net map. The nodes are the stations and the edges are the tracks between them. If the user drags the rectangle of a station (e.g. Jette), the edges (6 in the case of Jette) should move along almost immediately. This is the case with Saxon-JS 2.4, but with 2.6 there is a considerable delay.
Attached, you will find all the necessary files except the SaxonJS2.rt.js files: these should, of course, still be added under the folders int-map\js\saxon-js\2.4 and 2.6. For simplicity, we use a local nodejs http server here (which can be started with startup.bat). Two tabs can then be opened in the browser: http://localhost:8888/int-map/index-2.4.html on the one hand and http://localhost:8888/int-map/index-2.6.html on the other. Hopefully you guys can figure out what's going on so we can start using the latest Saxon-JS version.


Files

int-map.zip (2.63 MB) int-map.zip Johan Gheys, 2024-02-19 10:25
Actions #1

Updated by Norm Tovey-Walsh 2 months ago

Interesting. It's not at all clear what could have changes in that regard. The question is really what changed between 2.4 and 2.5, I think. The 2.6 release was very minor.

I will take a look...

Actions #2

Updated by Johan Gheys 2 months ago

Indeed, you are right: version 2.5 already has the same phenomenon and is also noticeably slower than 2.4. We have mentioned version 2.6 because it is the latest version we would like to start using.

Actions #3

Updated by Norm Tovey-Walsh 2 months ago

  • Status changed from New to In Progress

We've identified the problem. In SaxonJS 2.4, the id() function is implemented with a call to the native browser getElementById implementation. We discovered that doing so is insufficient in some cases, specifically where xml:id has been used. This was revealed by failing test suite tests, see #5492.

To compensate, we refactored how id() works so that it is correct by navigating through the tree ourselves. This is a lot slower. :-(

We'll have to think about how to fix this. It's clearly a regression that we must try to fix. Several possibilities occur to me:

  1. We could try to index the ID attributes. There's a comment on #5492 that suggests I explored that, but it's not clear if that was ever adopted. Note that a static index isn't sufficient because a user can add ID attributes at runtime.
  2. We could try to cache the values returned by id(). That might also run afoul of runtime changes.
  3. We could try the DOM function first and only go searching ourselves if the DOM call doesn't return a hit. That's presumably going to have performance on-par with 2.4 in the case where the ID exists and is an ID that the HTML DOM recognized. But would that leave the performance problem open in some use cases?
Actions #4

Updated by Martynas Jusevicius about 2 months ago

I'd rather have non-compliant behavior in this area than a performance hit. Fast id() is essential whereas @xml:id usage is negligible IMO.

Actions #5

Updated by Martynas Jusevicius about 2 months ago

To clarify, I'm voting for option #3

Actions #6

Updated by Norm Tovey-Walsh about 2 months ago

I think it's really important to deliver correct results. In this case, correct means taking xml:id into account, even if the browser doesn't. The fact that not doing so produces failing tests in the test suites is an existance proof that some applications may need it.

We already have a mechanism for building caches, they're used to support xsl:key. I tried my hand at extending that model, creating a new index for "id" values (attributes named id or xml:id). That seems to work reasonably well. In the application provided in this issue, performance is back to being on-par with SaxonJS 2.4 after the index is constructed. Constructing the index takes a fraction of a second, not at all noticable if you only do it once (as opposed to doing it between every drag event!)

Bonus feature: I think this will improve "get attribute by ID" performance on Node.js as well because I think that implementation will also benefit from caching. No one has ever noticed that its performance is linear with the size of the document because you don't usually do a lot of lookups unless you're being driven by something like user interaction events.

Actions #7

Updated by Martynas Jusevicius about 2 months ago

In the browser, can we please have a solution backed by the native getElementById? It provides a fast primitive and it seems strange to me not to take advantage of it. Any parallel index structures can only be slower.

It seems to me your proposal #3 would combine the best of both worlds?

Actions #8

Updated by Michael Kay about 2 months ago

Note that as a general principle, we NEVER compromise on standards conformance in order to deliver a performance benefit.

Very occasionally we compromise on that principle (an example is stability of collections); when we do so, we generally make it something the user has to explicitly configure.

Actions #9

Updated by Norm Tovey-Walsh about 2 months ago

Combining Martynas' comment 7 with Mike's comment 8, I'm tempted to add a user option to request the browser getElementById behavior. It will be faster and the user can be responsible for knowing whether or not xml:id attributes are relevant.

Actions #10

Updated by Martynas Jusevicius about 2 months ago

If automatic fallback from the fast lookup to the compliant lookup is not possible (which is what I thought Norm you were suggesting in option #3), then a config option works for me. Can you let us know how it would look like?

Actions #11

Updated by Norm Tovey-Walsh about 2 months ago

  • Status changed from In Progress to Resolved
  • Applies to JS Branch 2, Trunk added

My concern about fallback is again one of correctness. If you have both id and xml:id attributes, then a hit with the native getElementById function may be incorrect.

What I've done is added a new configuration property, nativeGetElementById. If you set it:

SaxonJS.setConfigurationProperty("nativeGetElementById", true);

then the native browser implementation will be used preferentially.

Actions #12

Updated by Johan Gheys about 2 months ago

Thanks Norm for your solution. The configuration parameter is a good compromise that allows everyone to enforce the desired behaviour.

Actions #13

Updated by Martynas Jusevicius about 2 months ago

Works for me.

Can this come out in 2.7? :) Or how far is 3.x?

Actions #14

Updated by Norm Tovey-Walsh about 2 months ago

If we do a 2.7 release, it will definitely include this fix. The estimate for 3.0 remains essentially unchanged: "as soon as it's ready". We're working hard on it and I hope that it isn't too far away.

Please register to edit this issue

Also available in: Atom PDF Tracking page