Bug #5294

closed

fn:unparsed-text throws StringOutOfBoundsException (File contains Astral characters)

Added by Gunther Rademacher about 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: XPath conformance
Sprint/Milestone: -
Start date: 2022-02-07
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 11, trunk
Fix Committed on Branch: 11, trunk
Fixed in Maintenance Release:
Platforms: .NET, Java

Description

Reading the attached file with fn:unparsed-text, as in

java -Dfile.encoding=utf-8 net.sf.saxon.Query "-qs:unparsed-text('Zakynthos.ebnf','utf-8')"

on Saxon 11.1 throws this exception:

java.lang.StringIndexOutOfBoundsException: String index out of range: 2049
        at java.lang.String.<init>(String.java:205)
        at net.sf.saxon.functions.UnparsedTextFunction.readFile(UnparsedTextFunction.java:260)
        at net.sf.saxon.functions.UnparsedTextFunction.readFile(UnparsedTextFunction.java:94)
        at net.sf.saxon.functions.UnparsedText.process(UnparsedText.java:79)
        at net.sf.saxon.expr.SystemFunctionCall.process(SystemFunctionCall.java:486)
        at net.sf.saxon.query.XQueryExpression.run(XQueryExpression.java:461)
        at net.sf.saxon.s9api.XQueryEvaluator.run(XQueryEvaluator.java:429)
        at net.sf.saxon.Query.runQuery(Query.java:971)
        at net.sf.saxon.Query.doQuery(Query.java:443)
        at net.sf.saxon.Query.main(Query.java:103)
Fatal error during query: java.lang.StringIndexOutOfBoundsException: String index out of range: 2049

Files

Zakynthos.ebnf (2.89 KB), Gunther Rademacher, 2022-02-07 20:34
Actions #1

Updated by Michael Kay about 2 years ago

  • Subject changed from fn:unparsed-text throws StringOutOfBoundsException on Saxon 11.1 to fn:unparsed-text throws StringOutOfBoundsException (File contains Astral characters)
  • Category set to XPath conformance
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal
  • Applies to branch trunk added

Reproduced. It's on a code path that's only executed if the file contains non-BMP characters.
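
For context, a non-BMP ("astral") character is a code point above U+FFFF, which Java represents as a surrogate pair of two char values. The following is a minimal sketch of how such content can be detected. The class and method names are hypothetical, and this is not the Saxon code path itself.

```java
class AstralCheck {
    /** Returns true if the text contains at least one code point above U+FFFF. */
    static boolean containsNonBmp(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String bmpOnly = "EBNF";
        // U+1F600 lies outside the BMP, so it occupies two chars in a Java String.
        String astral = "a" + new String(Character.toChars(0x1F600));

        System.out.println(containsNonBmp(bmpOnly)); // false
        System.out.println(containsNonBmp(astral));  // true
        System.out.println(astral.length());         // 3 chars
        System.out.println(astral.codePointCount(0, astral.length())); // 2 code points
    }
}
```

The gap between char count and code point count is why text containing such characters often takes a separate code path.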

Actions #2

Updated by Michael Kay about 2 years ago

Another simple off-by-one bug: if the current buffer holds 256 chars starting with a BOM, we're doing new String(chars, 1, 256). The third argument is a length, not an end position, so it should be 255.
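
The off-by-one can be reproduced in isolation. The sketch below is not the Saxon source: the class name, method names and the `used` parameter are hypothetical, and it only shows that the third argument of new String(char[], int, int) is a count rather than an end index.

```java
import java.util.Arrays;

class BomStripDemo {
    private static final char BOM = '\uFEFF';

    /** Buggy variant: passes the number of chars in the buffer as if it were an end position. */
    static String stripBomBuggy(char[] chars, int used) {
        int offset = (used > 0 && chars[0] == BOM) ? 1 : 0;
        // Out of range when offset == 1 and used == chars.length.
        return new String(chars, offset, used);
    }

    /** Fixed variant: the count is reduced by the number of chars skipped. */
    static String stripBomFixed(char[] chars, int used) {
        int offset = (used > 0 && chars[0] == BOM) ? 1 : 0;
        return new String(chars, offset, used - offset);
    }

    public static void main(String[] args) {
        char[] buffer = new char[256];
        Arrays.fill(buffer, 'x');
        buffer[0] = BOM; // full buffer that starts with a BOM

        System.out.println(stripBomFixed(buffer, buffer.length).length()); // 255

        try {
            stripBomBuggy(buffer, buffer.length);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("buggy variant threw: " + e);
        }
    }
}
```

The same arithmetic would match the reported message: an offset of 1 plus a count of 2048 gives 2049, assuming a 2048-char buffer.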

Actions #3

Updated by Michael Kay about 2 years ago

  • Status changed from New to Resolved
  • Fix Committed on Branch 11, trunk added
  • Platforms .NET, Java added
Actions #4

Updated by Debbie Lockett about 2 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 11.2 added

Bug fix applied in the Saxon 11.2 maintenance release.
