Looking at the usages of matchesEmptyString(), I think changing OpBackReference.matchesEmptyString() to return false is safe for all usages except the top-level RegularExpression.isNullable(), which relies solely on the static analysis to decide whether the expression matches a zero-length string in its entirety (i.e. with implicit anchoring). This is used in particular (a) when evaluating the pattern facet against an empty string in XSD, and (b) when enforcing the rule in various XPath functions (e.g. tokenize()) that "the regular expression must not be one that matches a zero-length string".
The XSD case is not affected because XSD doesn't use back-references, but case (b) is certainly relevant.
Given a regular expression that matches a sequence (OpSequence), e.g. ABC*, we return matchesEmptyString()=true only if each subexpression returns matchesEmptyString()=true. So if one of the subexpressions is a back-reference, we will now return false, even for an expression such as (A?)\1, which does match a zero-length string. As a result we will no longer prevent such an expression being used in functions such as tokenize().
Sure enough, tokenize('ABCD', '(A?)\1')
now goes into an infinite loop, rather than raising an error.
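To make the effect on a sequence concrete, here is a minimal sketch of the static analysis in Java. The class names (Op, AtomOp, OptionalOp, BackRefOp, SequenceOp) are hypothetical stand-ins for illustration only, not the actual Saxon classes; the only thing taken from the discussion above is the rule that a sequence reports true only if every part does, and that a back-reference now reports false.

```java
import java.util.List;

// Hypothetical miniature operation tree; names are assumptions, not Saxon's API.
interface Op {
    // true only when the analysis can prove that the node matches ""
    boolean matchesEmptyString();
}

// A single literal character such as A: never matches "".
final class AtomOp implements Op {
    public boolean matchesEmptyString() { return false; }
}

// X? : always able to match "".
final class OptionalOp implements Op {
    private final Op inner;
    OptionalOp(Op inner) { this.inner = inner; }
    public boolean matchesEmptyString() { return true; }
}

// \1 : what it matches depends on the captured group, so after the
// change the static answer is the conservative "false".
final class BackRefOp implements Op {
    public boolean matchesEmptyString() { return false; }
}

// A sequence is nullable only if every part is nullable.
final class SequenceOp implements Op {
    private final List<Op> parts;
    SequenceOp(List<Op> parts) { this.parts = parts; }
    public boolean matchesEmptyString() {
        for (Op part : parts) {
            if (!part.matchesEmptyString()) {
                return false;
            }
        }
        return true;
    }
}
```

With these definitions, a tree modelling (A?)\1 as a sequence of OptionalOp(AtomOp) followed by BackRefOp reports false, even though the expression does match a zero-length string at run time.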
I think the answer to this is that a positive response from matchesEmptyString() is sufficient to determine that a regex matches "", but a negative response is not sufficient to establish that it doesn't. So the method ARegularExpression.matches(CharSequence), which is the only place that calls regex.isNullable(), needs to change from
if (StringValue.isEmpty(input)) { return regex.isNullable(); }
to
if (StringValue.isEmpty(input) && regex.isNullable()) { return true; }
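The one-sided nature of the check can be sketched as follows. This is only an illustration under stated assumptions: EmptyInputCheck is a hypothetical class, the BooleanSupplier stands in for regex.isNullable(), and java.util.regex.Pattern stands in for the real matcher that runs when the shortcut does not apply.

```java
import java.util.function.BooleanSupplier;
import java.util.regex.Pattern;

// Hypothetical helper; not the actual ARegularExpression code.
final class EmptyInputCheck {
    static boolean matches(CharSequence input,
                           BooleanSupplier staticNullable, // stand-in for regex.isNullable()
                           Pattern compiled) {             // stand-in for the real matcher
        // A positive static answer is proof that "" matches, so take the shortcut.
        if (input.length() == 0 && staticNullable.getAsBoolean()) {
            return true;
        }
        // A negative static answer only means "not proven", so fall
        // through and let the matcher decide, even for empty input.
        return compiled.matcher(input).matches();
    }
}
```

For example, with the pattern (A?)\1 and a static analysis that answers false, matching against the empty string still returns true here, because the decision falls through to the matcher instead of being taken from the analysis.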
The one remaining doubt is whether we can go into an infinite loop if we don't do the optimization from A* to A+ where A can match an empty string. Testing this with
matches('ABCD', '(A?)\1*')
suggests there's no problem. There are other mechanisms in place to ensure that we don't try to match a zero-length string an infinite number of times without advancing the position.