Support #5124

closed

Unexpected error "Cannot read a document that was written during the same transformation" - race condition

Added by Johan Gheys over 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Low
Category: Multithreading
Sprint/Milestone: -
Start date: 2021-10-13
% Done: 0%

Description

We use Saxon-EE 10.5, and one of our transformations has, for the first time, failed with the unexpected error "Cannot read a document that was written during the same transformation" (see log file error-2021-10-12.txt). A simple restart fixed the problem. Could this be closely related to issue 4934?


Files

error-2021-10-12.txt (36.4 KB), Johan Gheys, 2021-10-13 10:10
shared-functions.xslt (66.2 KB), Johan Gheys, 2021-10-13 10:11
select-trains-and-movements-(proxy).xslt (15.3 KB), Johan Gheys, 2021-10-13 10:11

Related issues

Related to Saxon - Bug #5130: `saxon:discard-document()` should be synchronized (Closed, Michael Kay, 2021-10-15)

Actions #1

Updated by Michael Kay over 2 years ago

The first sign that things are wrong here is the message

Some child xsl:result-document threads have not finished

which should never happen.

Before that there are three messages of the form

*** MISSING file:/var/opt/app/arte2/u00p/interfac/exporter/publish/20211012-1726/formation/88_movement_HE_527.xml

which might well provide a clue.

The message

XTRE1500  Cannot read a document that was written during the same transformation:
  file:/var/opt/app/arte2/u00p/interfac/exporter/publish/20211012-1726/formation/88_movement_HE_527.xml

is actually a little inaccurate. It happens on a doc() call when the URI passed to doc() is unavailable, either because it was previously written using xsl:result-document, or for a variety of other reasons. The immediate cause is that the URI is present in a controller-owned collection called (misleadingly) allOutputDestinations, which is updated not only by xsl:result-document but also by fn:doc() and saxon:discard-document(). I suspect that if two threads both call doc() and subsequently call saxon:discard-document() on the same URI then it might be possible for the URI to be left in this collection at the end.

I'll need to think about that possibility a bit more carefully. If I'm right, then I think the only solution might be to introduce a function (say saxon:doc(uri, map{'stable':false()})) that's equivalent to doing doc() followed by discard-document(), but does it as an atomic action.
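To make the "atomic action" idea concrete, here is a minimal Java sketch. It is purely illustrative and is not Saxon's code: the class `UriRegistry`, its methods, and the `loader` parameter are all hypothetical names. It assumes a registry that records a URI while it is being read and forgets it again afterwards, with both steps performed under one lock so that no other thread can interleave between them.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical model, NOT Saxon's implementation.
final class UriRegistry {
    // Plain HashSet: safe here only because every access is synchronized.
    private final Set<String> inUse = new HashSet<>();

    // Record the URI, load the document, and forget the URI again,
    // all while holding the registry lock (one atomic step).
    // 'loader' stands in for whatever actually reads the document.
    public synchronized <T> T readAndDiscard(String uri, Function<String, T> loader) {
        inUse.add(uri);              // roughly what a doc() call would record
        try {
            return loader.apply(uri);
        } finally {
            inUse.remove(uri);       // roughly what a discard would undo
        }
    }

    public synchronized boolean isRegistered(String uri) {
        return inUse.contains(uri);
    }

    // Runs several concurrent readAndDiscard calls on one URI and reports
    // whether the URI is still registered once all threads have finished.
    public static boolean leftoverAfterConcurrentReads(int threads) {
        UriRegistry registry = new UriRegistry();
        String uri = "file:/tmp/example.xml"; // illustrative URI only
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> registry.readAndDiscard(uri, u -> "<doc/>"));
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.isRegistered(uri);
    }

    public static void main(String[] args) {
        System.out.println("URI left over: " + leftoverAfterConcurrentReads(8));
    }
}
```

The trade-off in this sketch is that the lock is held while the document is loaded, so concurrent reads are serialized. A real implementation would probably want finer-grained locking, for example per URI.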

In fact I see that the implementation of saxon:discard-document() isn't even synchronised within itself, which it certainly should be. That's easy to fix, but whether it's enough to solve the problem I don't know.

None of this explains the disquieting message "Some child xsl:result-document threads have not finished". Is that something you see elsewhere in your log files?

Actions #2

Updated by Michael Kay over 2 years ago

I raised bug #5130 regarding synchronization of saxon:discard-document(). It's quite hard to know whether that's actually a contributor to the current problem, but it's worth fixing anyway.

Actions #3

Updated by Michael Kay over 2 years ago

  • Related to Bug #5130: `saxon:discard-document()` should be synchronized added
Actions #4

Updated by Johan Gheys over 2 years ago

Hi Michael, thanks for the investigation. Every $runs/run creates an xsl:result-document (see line 27 of select-trains-and-movements-(proxy).xslt), and I thought that error message was the result of an earlier problem (probably the reading problem caused by saxon:discard-document(), as you suggest).

Actions #5

Updated by Michael Kay over 2 years ago

Noted also that allOutputDestinations is a (thread-unsafe) HashSet which is updated by various non-synchronised methods such as XsltController.addAvailableOutputDestination() and XsltController.removeAvailableOutputDestination(). We need to use a thread-safe set/map here. But again, I'm not sure that's enough: the possible interactions between doc-available(), doc(), and discard-document() are quite complex.
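For illustration, one standard way to get a thread-safe set in Java is `ConcurrentHashMap.newKeySet()`. The sketch below is not taken from Saxon: `DestinationTracker` and its methods are hypothetical stand-ins for the controller and its add/remove methods.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a controller tracking output destinations.
final class DestinationTracker {
    // Each individual add/remove/contains call on this set is thread-safe.
    private final Set<String> destinations = ConcurrentHashMap.newKeySet();

    public boolean add(String uri)      { return destinations.add(uri); }
    public boolean remove(String uri)   { return destinations.remove(uri); }
    public boolean contains(String uri) { return destinations.contains(uri); }
    public int size()                   { return destinations.size(); }

    // Each thread repeatedly adds and then removes a shared URI and a
    // private URI. Every thread's final operation on a URI is a remove,
    // so the set is expected to be empty once all threads have joined.
    public static int sizeAfterConcurrentAddRemove(int threads, int rounds) {
        DestinationTracker tracker = new DestinationTracker();
        String shared = "file:/tmp/shared.xml"; // illustrative URI only
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            String own = "file:/tmp/own-" + i + ".xml";
            workers[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    tracker.add(shared);
                    tracker.add(own);
                    tracker.remove(own);
                    tracker.remove(shared);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return tracker.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left: " + sizeAfterConcurrentAddRemove(8, 1000));
    }
}
```

A concurrent set only protects single operations. A sequence such as `contains()` followed by `add()` can still be interleaved with another thread's calls, which is why a thread-safe collection alone may not be enough for the doc-available() / doc() / discard-document() interactions.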

Actions #6

Updated by Michael Kay almost 2 years ago

  • Category set to Multithreading
  • Status changed from New to Closed
  • Assignee set to Michael Kay

I have changed the methods that access allOutputDestinations to be synchronized, as suggested (on the 11.x and 12.x branches). Apart from that, I don't think I can make more progress at this stage, so I'm closing the issue. Please re-open if problems occur again.

Actions #7

Updated by Johan Gheys almost 2 years ago

I know that race conditions occur only rarely, but since we moved to Saxon-EE 10.7 or higher, in which bug #5130 has been fixed, neither this issue nor support #4934 has occurred again. As far as I am concerned, both issues may therefore be closed.
