Bug #6352
closed
Reduced performance of Saxon-JS 2.6 compared to 2.4
Applies to JS Branch:
2, Trunk
Fix Committed on JS Branch:
2, Trunk
Description
We have a single page application that allows our users to draw a so-called geoschematic net map. The nodes are the stations and the edges are the tracks between them. If the user drags the rectangle of a station (e.g. Jette), the edges (6 in the case of Jette) should move along almost immediately. This is the case with Saxon-JS 2.4, but with 2.6 there is a considerable delay.
Attached, you will find all the necessary files except the SaxonJS2.rt.js files: these should, of course, still be added under the folders int-map\js\saxon-js\2.4 and 2.6. For simplicity, we use a local nodejs http server here (which can be started with startup.bat). Two tabs can then be opened in the browser: http://localhost:8888/int-map/index-2.4.html on the one hand and http://localhost:8888/int-map/index-2.6.html on the other. Hopefully you guys can figure out what's going on so we can start using the latest Saxon-JS version.
Files
Interesting. It's not at all clear what could have changes in that regard. The question is really what changed between 2.4 and 2.5, I think. The 2.6 release was very minor.
I will take a look...
Indeed, you are right: version 2.5 already has the same phenomenon and is also noticeably slower than 2.4. We have mentioned version 2.6 because it is the latest version we would like to start using.
- Status changed from New to In Progress
We've identified the problem. In SaxonJS 2.4, the id()
function is implemented with a call to the native browser getElementById
implementation. We discovered that doing so is insufficient in some cases, specifically where xml:id
has been used. This was revealed by failing test suite tests, see #5492.
To compensate, we refactored how id()
works so that it is correct by navigating through the tree ourselves. This is a lot slower. :-(
We'll have to think about how to fix this. It's clearly a regression that we must try to fix. Several possibilities occur to me:
- We could try to index the ID attributes. There's a comment on #5492 that suggests I explored that, but it's not clear if that was ever adopted. Note that a static index isn't sufficient because a user can add ID attributes at runtime.
- We could try to cache the values returned by
id()
. That might also run afoul of runtime changes.
- We could try the DOM function first and only go searching ourselves if the DOM call doesn't return a hit. That's presumably going to have performance on-par with 2.4 in the case where the ID exists and is an ID that the HTML DOM recognized. But would that leave the performance problem open in some use cases?
I'd rather have non-compliant behavior in this area than a performance hit. Fast id()
is essential whereas @xml:id
usage is negligible IMO.
To clarify, I'm voting for option #3
I think it's really important to deliver correct results. In this case, correct means taking xml:id
into account, even if the browser doesn't. The fact that not doing so produces failing tests in the test suites is an existance proof that some applications may need it.
We already have a mechanism for building caches, they're used to support xsl:key
. I tried my hand at extending that model, creating a new index for "id" values (attributes named id
or xml:id
). That seems to work reasonably well. In the application provided in this issue, performance is back to being on-par with SaxonJS 2.4 after the index is constructed. Constructing the index takes a fraction of a second, not at all noticable if you only do it once (as opposed to doing it between every drag event!)
Bonus feature: I think this will improve "get attribute by ID" performance on Node.js as well because I think that implementation will also benefit from caching. No one has ever noticed that its performance is linear with the size of the document because you don't usually do a lot of lookups unless you're being driven by something like user interaction events.
In the browser, can we please have a solution backed by the native getElementById
? It provides a fast primitive and it seems strange to me not to take advantage of it. Any parallel index structures can only be slower.
It seems to me your proposal #3 would combine the best of both worlds?
Note that as a general principle, we NEVER compromise on standards conformance in order to deliver a performance benefit.
Very occasionally we compromise on that principle (an example is stability of collections); when we do so, we generally make it something the user has to explicitly configure.
Combining Martynas' comment 7 with Mike's comment 8, I'm tempted to add a user option to request the browser getElementById
behavior. It will be faster and the user can be responsible for knowing whether or not xml:id
attributes are relevant.
If automatic fallback from the fast lookup to the compliant lookup is not possible (which is what I thought Norm you were suggesting in option #3), then a config option works for me.
Can you let us know how it would look like?
- Status changed from In Progress to Resolved
- Applies to JS Branch 2, Trunk added
My concern about fallback is again one of correctness. If you have both id
and xml:id
attributes, then a hit with the native getElementById
function may be incorrect.
What I've done is added a new configuration property, nativeGetElementById
. If you set it:
SaxonJS.setConfigurationProperty("nativeGetElementById", true);
then the native browser implementation will be used preferentially.
Thanks Norm for your solution. The configuration parameter is a good compromise that allows everyone to enforce the desired behaviour.
Works for me.
Can this come out in 2.7? :) Or how far is 3.x?
If we do a 2.7 release, it will definitely include this fix. The estimate for 3.0 remains essentially unchanged: "as soon as it's ready". We're working hard on it and I hope that it isn't too far away.
- Tracker changed from Support to Bug
- Category set to Performance
- Fix Committed on JS Branch 2, Trunk added
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in JS Release set to SaxonJS 2.7
Bug fix applied in the SaxonJS 2.7 maintenance release.
Please register to edit this issue
Also available in: Atom
PDF
Tracking page