Bug #1852

ArrayIndexOutOfBounds doing Unicode normalization

Added by Michael Kay almost 11 years ago. Updated over 10 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: XPath conformance
Sprint/Milestone: -
Start date: 2013-07-23
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

Reported on the saxon-help list by Hugh Cayless:

Hi, I'm running into a sporadic exception when converting NFC strings to NFD, using Saxon 9.5.1.1 HE:

java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at net.sf.saxon.tree.util.FastStringBuffer.insertWideChar(FastStringBuffer.java:407)
    at net.sf.saxon.serialize.codenorm.Normalizer.internalDecompose(Normalizer.java:152)
    at net.sf.saxon.serialize.codenorm.Normalizer.normalize(Normalizer.java:83)
    at net.sf.saxon.functions.NormalizeUnicode.normalize(NormalizeUnicode.java:99)
    at net.sf.saxon.functions.NormalizeUnicode.evaluateItem(NormalizeUnicode.java:35)
    at net.sf.saxon.functions.NormalizeUnicode.evaluateItem(NormalizeUnicode.java:23)

I've played around a bit with varying strings, and it seems only to trigger in certain circumstances—maybe certain strings are sneaking by net.sf.saxon.tree.util.FastStringBuffer's ensureCapacity method?

Attached is a string that will consistently trigger the bug when passed to net.sf.saxon.serialize.codenorm.Normalizer's normalize method. Normalizer is set up to convert to NFD.

This seems like a regression, as I've been using Saxon on these files for a few years now, and I've only seen it since upgrading.

Thanks,

Hugh
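Note: the report above guesses that an insertion may be bypassing the buffer's capacity check. The following is a minimal sketch of that failure mode only. It is not Saxon's FastStringBuffer code, and the class and method names (InsertSketch, insert, insertWithoutGrowing) are hypothetical. It shows how inserting a code point into a full char array with System.arraycopy fails unless the array is grown first, which would match the top frames of the stack trace if the guess is right.

```java
import java.util.Arrays;

// Hypothetical illustration; not taken from the Saxon source.
final class InsertSketch {
    private char[] array = new char[4];
    private int used = 0;

    // Grow the backing array so that 'extra' more chars fit.
    private void ensureCapacity(int extra) {
        if (used + extra > array.length) {
            array = Arrays.copyOf(array, Math.max(used + extra, array.length * 2));
        }
    }

    void append(String s) {
        ensureCapacity(s.length());
        s.getChars(0, s.length(), array, used);
        used += s.length();
    }

    // Safe insertion: reserve one or two chars (two for a surrogate pair) before shifting.
    void insert(int index, int codePoint) {
        if (index < 0 || index > used) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        ensureCapacity(Character.charCount(codePoint));
        shiftAndWrite(index, codePoint);
    }

    // Same insertion with the capacity check skipped, to reproduce the failure mode.
    void insertWithoutGrowing(int index, int codePoint) {
        shiftAndWrite(index, codePoint);
    }

    private void shiftAndWrite(int index, int codePoint) {
        int width = Character.charCount(codePoint);
        // Throws an IndexOutOfBoundsException if used + width exceeds array.length.
        System.arraycopy(array, index, array, index + width, used - index);
        Character.toChars(codePoint, array, index);
        used += width;
    }

    @Override
    public String toString() {
        return new String(array, 0, used);
    }

    public static void main(String[] args) {
        InsertSketch buffer = new InsertSketch();
        buffer.append("abcd");           // fills the initial four-char array exactly
        buffer.insert(1, 'X');           // grows first, then shifts
        System.out.println(buffer);
        try {
            InsertSketch full = new InsertSketch();
            full.append("abcd");
            full.insertWithoutGrowing(1, 'X');
        } catch (IndexOutOfBoundsException e) {
            System.out.println("insertion without growing failed: " + e);
        }
    }
}
```

Whether the failure occurs depends on how full the buffer happens to be at the moment of insertion, which would be consistent with the exception appearing only for certain strings. This remains a guess until checked against the actual Saxon source.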

The string in question is:

ἱερου ιβ´ ἔτουσ λγ Παῦνι κα μεμέτρηκεν εἰσ τὸν ἐν Διὸσ πόλει τῆι μεγάληι θησαυρὸν εἰσ τὴν ἐπιγραφὴν τοῦ τρίτου καὶ λ ἔτουσ ὑπὲρ τοῦ τόπου Σῶσασ Ἀλεξάνδρου κριθῆσ πέντε
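Note: for comparison, the same NFC to NFD conversion can be run through the JDK's own java.text.Normalizer, which is independent of Saxon's net.sf.saxon.serialize.codenorm.Normalizer. The sketch below is only a reference check, not a reproduction of the bug. The class name NfdReference and the SAMPLE constant (the first word of the string above, written with Unicode escapes) are illustrative. Pass the full string as the first command-line argument to check it.

```java
import java.text.Normalizer;

// Reference normalization using the JDK only; no Saxon classes are involved.
final class NfdReference {
    // First word of the reported string: U+1F31 followed by epsilon, rho, omicron, upsilon.
    static final String SAMPLE = "\u1F31\u03B5\u03C1\u03BF\u03C5";

    // Canonical decomposition (NFD) as implemented by the JDK.
    static String toNfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String input = args.length > 0 ? args[0] : SAMPLE;
        String nfd = toNfd(input);
        System.out.println("input length in UTF-16 units: " + input.length());
        System.out.println("NFD length in UTF-16 units:   " + nfd.length());
        System.out.println("input already NFD: "
                + Normalizer.isNormalized(input, Normalizer.Form.NFD));
    }
}
```

Precomposed polytonic Greek letters expand under NFD into a base letter plus combining marks, so the decomposed string is longer than the input and a normalizer that inserts characters into a buffer has to make room for them.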
