Project

Profile

Help

Bug #2492

closed

ClassCastException in StringValue class when calling xquery function fn:string-length()

Added by Michal Jagannathan about 9 years ago. Updated about 9 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Multithreading
Sprint/Milestone:
Start date:
2015-11-06
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
9.6
Fix Committed on Branch:
9.6
Fixed in Maintenance Release:
Platforms:

Description

We have noticed a ClassCastExceptions in our application when calling xquery function "fn:sting-length()".

It happens every now and then in our system tests running on CI. We have created a simple test application that shows the issue.

Steps:

Run the attached demo application.

Actual:

ClassCastException is thrown

Expected:

No exceptions

Additional info:

This demo application will execute two methods on the same instance of StringValue in different threads.

After few runs an output as showed in the 'example_output.txt' should eventually be observed.

On my machine (windows 7) it can be reproduced almost every time a test is run.

  1. The bug is caused by StringValue class not being thread safe:

    1. This class has an instance variable 'value' of type CharSequence which is being modified by public methods - its type is being switched between java.util.String and net.sf.saxon.regex.UnicodeString classes.

    2. There is no any synchronization used

  2. This is very important for the case when empty string is used, because there are public values used:

     StringValue.EMPTY_STRING
    
     StringValue.SINGLE_SPACE
    
     StringValue.TRUE
    
     StringValue.FALSE
    

    These values can be used by other classes, in different threads.

  3. For the case of an empty string StringValue can be created using this static method:

     public static StringValue makeStringValue(CharSequence value)
    

    which internally uses the StringValue.EMPTY_STRING for an empty string.

    This static method is for example used in the net.sf.saxon.functions.StringLength class (implementation of xquery function 'fn:string-length') in method:

     public Int64Value evaluateItem(XPathContext c)
    

Looking at the implementation of StringLenth class I can see this bug may affect only EMPTY_STRING case, but there are concerns about the rest of the code - are there more possibilities of a similar issues that could affect arbitrary string values. Also it could be checked if similar potential issues could happen in other classes the StringValue.


Files

saxon_issue.zip (4.24 KB) saxon_issue.zip Michal Jagannathan, 2015-11-06 13:05
Actions #1

Updated by Michael Kay about 9 years ago

  • Assignee set to Michael Kay
  • Found in version set to 9.6

Many thanks for reporting this, and for your thorough analysis of the problem.

Actions #2

Updated by Michael Kay about 9 years ago

Interesting.

Apart from the multi-threading issue, the first thing I notice is that we have two competing optimizations: asString() is trying to ensure that once we've computed the value as a string, we retain that string, while makeUnicodeString() is trying to ensure that we maintain it as an array of 32-, 16-, or 8-bit codepoints. If memory were free we would retain both representations, but we've had cases where the space occupied by StringValue instances runs into gigabyte territory, so that's not really on.

It's a long time since we've done any measurements to justify either of these optimizations, and I guess we should probably do it again before making decisions.

asString() is used most often when comparing strings for equality, and I'm inclined to think we'd be better off calling String.equals() if both values are String objects, and doing character by character comparison otherwise.

Meanwhile for a 9.6 patch I'll probably just add synchronisation to the relevant methods.

Actions #3

Updated by Michael Kay about 9 years ago

Complicating things further, many paths don't use StringValue.asString() but instead use StringValue.getStringValue(), which returns the same result but doesn't modify what's stored.

Actions #4

Updated by Michael Kay about 9 years ago

  • Status changed from New to Resolved

For the moment I've fixed the problem in both 9.6 and 9.7 by synchronizing the relevant methods (and in the case of the character iterators, making them work off a reference to the original CharSequence rather than referring back to it to get each character).

I'm still studying whether there might be a better approach for the future.

Actions #5

Updated by O'Neil Delpratt about 9 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in version set to 9.6.0.8

Bug fix applied in the Saxon 9.6.0.8 maintenance release

Actions #6

Updated by O'Neil Delpratt about 9 years ago

  • Applies to branch 9.6 added
  • Fix Committed on Branch 9.6 added
  • Fixed in Maintenance Release 9.6.0.8 added
Actions #7

Updated by O'Neil Delpratt about 9 years ago

  • Sprint/Milestone set to 9.6.0.8

Please register to edit this issue

Also available in: Atom PDF