Bug #4498
closed
Attribute order in Saxon 10.0
Fix Committed on Branch: 10
Fixed in Maintenance Release: 10.1
Description
Is it deliberate that the order of attributes in version 10.0 is completely different from what it has been so far?
What is the easiest way to keep the original order when applying a simple transformation such as filtering old records from a large file?
The order of attributes has always been undefined / implementation-defined, though many XML parsers retain the original attribute order and many operations in Saxon do so. We made changes in 10.0 to make finding an attribute in a large collection of attributes more efficient; this uses a HashMap, which does not retain order.
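To illustrate why a hash-based map loses the original order, here is a minimal sketch using only JDK collections. It is not Saxon code: the class name, method name, and attribute names are made up for the example, and java.util.HashMap merely stands in for whatever hash structure Saxon uses internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AttributeOrderDemo {

    // Inserts the names into the given map in list order and returns
    // the order in which the map then iterates over its keys.
    static List<String> iterationOrder(Map<String, String> map, List<String> names) {
        for (String name : names) {
            map.put(name, "value-of-" + name);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Hypothetical attribute names, chosen only for illustration.
        List<String> names = Arrays.asList(
                "id", "type", "created", "author", "status", "lang", "version");

        System.out.println("inserted:      " + names);
        // HashMap iterates in hash-bucket order, which is stable for a given
        // set of keys but generally unrelated to insertion order.
        System.out.println("HashMap:       " + iterationOrder(new HashMap<>(), names));
        // LinkedHashMap iterates in insertion order.
        System.out.println("LinkedHashMap: " + iterationOrder(new LinkedHashMap<>(), names));
    }
}
```

The HashMap line is repeatable from run to run for the same keys, which matches the "stable but not predictable" behaviour discussed in this issue.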
There's no guaranteed way of keeping the original attribute order, but you can control the attribute order in the serialized result using the saxon:attribute-order output parameter. (Needs PE or higher.)
You could try tweaking the static variable net.sf.saxon.om.SmallAttributeMap.LIMIT, currently set to 5, to a higher number; this is the threshold for using the large attribute set implementation. You would have to change the source code or use reflection because the variable isn't public.
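The reflection approach could look roughly like the sketch below. It does not touch Saxon itself: AttributeMapStandIn is a hypothetical stand-in class with a non-public static int field, and the "greater than LIMIT" comparison is an assumption for the demo. The technique only works if the real field is not declared final; a static final int constant is typically inlined by the compiler, and then changing it by reflection has no effect.

```java
import java.lang.reflect.Field;

public class ThresholdTweak {

    // Hypothetical stand-in for a library class whose threshold is not public.
    static class AttributeMapStandIn {
        private static int LIMIT = 5;

        // Assumed rule for the demo: the large implementation is used
        // when there are more than LIMIT attributes.
        static boolean usesLargeImplementation(int attributeCount) {
            return attributeCount > LIMIT;
        }
    }

    // Sets a static int field by name, bypassing Java access checks.
    static void setStaticInt(Class<?> owner, String fieldName, int newValue)
            throws ReflectiveOperationException {
        Field field = owner.getDeclaredField(fieldName);
        field.setAccessible(true);      // allow access to a non-public field
        field.setInt(null, newValue);   // null target because the field is static
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(AttributeMapStandIn.usesLargeImplementation(6));
        setStaticInt(AttributeMapStandIn.class, "LIMIT", 20);
        System.out.println(AttributeMapStandIn.usesLargeImplementation(6));
    }
}
```

Such a change would have to run before any documents are built, and it relies on internals that can change between releases.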
Thank you for your answer.
But somehow the order has to be present because @*/. gives the attributes in "document order", right?
There's a stable order of attributes, but it's not predictable, and it's not necessarily related to the lexical order of attributes in the serialized XML.
It was a bit strange to see that the order of the attributes in the result tree no longer determines the order in the serialized XML, but that may just require a change of mindset. And of course we prefer better performance over a more human-readable result. We will probably use saxon:attribute-order = "*" as a compromise from now on. Our customers will detect a large number of differences once (based on a digest calculation), but after that everything will remain stable. Thank you for the explanation.
- Tracker changed from Support to Bug
- Category set to Internals
- Status changed from New to In Progress
- Assignee set to Michael Kay
- Applies to branch 10 added
Although this isn't a bug, it's a change that does appear to have caused a number of people usability problems, e.g. because diff checking of test results is affected. So I'm inclined to revert to a data structure for attribute sets that retains order of insertion.
It only affects the LargeAttributeMap which is used when there are more than 5 attributes. This currently uses an ImmutableHashTrieMap internally, so that incremental addition of new attributes is efficient. (For a SmallAttributeMap, the entire structure is copied when a new entry is added.) I'll look into whether we can find another map implementation that retains insertion order.
- Status changed from In Progress to Resolved
- Priority changed from Low to Normal
- Fix Committed on Branch 10 added
I have changed the LargeAttributeMap implementation so it now maintains the order of attributes. (Specifically, if an attribute is added, it goes at the end, unless it is replacing an existing attribute with the same name, in which case it occupies the same position as that attribute.)
This proved quite tricky to implement, given the requirement to use immutable data structures internally, but I found a way that seems reasonably efficient. However, because of the extra complexity, I've raised the threshold at which we start using the LargeAttributeMap from 5 attributes to 8.
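The ordering rule described in this comment (a new attribute goes at the end, a replacement keeps the position of the attribute it replaces) can be sketched with a small immutable class. This is not the LargeAttributeMap implementation: OrderedAttributes and its methods are hypothetical, and it copies the whole list on every update, which is the simple approach and not the efficient one the fix required.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class OrderedAttributes {

    // Each entry is a two-element array: {name, value}.
    private final List<String[]> entries;

    private OrderedAttributes(List<String[]> entries) {
        this.entries = Collections.unmodifiableList(entries);
    }

    public static OrderedAttributes empty() {
        return new OrderedAttributes(new ArrayList<>());
    }

    // Returns a new instance with the attribute set; this instance is unchanged.
    public OrderedAttributes put(String name, String value) {
        List<String[]> copy = new ArrayList<>(entries);
        for (int i = 0; i < copy.size(); i++) {
            if (copy.get(i)[0].equals(name)) {
                // Same name already present: replace it at the same position.
                copy.set(i, new String[] {name, value});
                return new OrderedAttributes(copy);
            }
        }
        // New name: append at the end.
        copy.add(new String[] {name, value});
        return new OrderedAttributes(copy);
    }

    // Returns the value for the given name, or null if it is absent.
    public String get(String name) {
        for (String[] entry : entries) {
            if (entry[0].equals(name)) {
                return entry[1];
            }
        }
        return null;
    }

    // Returns the attribute names in their maintained order.
    public List<String> names() {
        List<String> result = new ArrayList<>();
        for (String[] entry : entries) {
            result.add(entry[0]);
        }
        return result;
    }
}
```

For example, starting from attributes id, type, lang, a put of type followed by a put of status yields the order id, type, lang, status, while the original instance still holds the old value of type.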
- % Done changed from 0 to 100
- Fixed in Maintenance Release 10.1 added
Bug fix committed in the Saxon 10.1 maintenance release.
- Status changed from Resolved to Closed