xsl:message serialization

Added by Vladimir Nesterovsky over 6 years ago

Hello,

I'm observing heavy serialization when xsl:message is used from multiple threads (for example, xsl:for-each with saxon:threads).

The impact depends on the xsl:message content. For example, if the following runs in multiple threads:

<xsl:message select="'count', count($items)"/>

it may remove virtually all of the benefit of threading.

Looking at the Message.java class, I can see that processing is done under:

            synchronized (emitter) {

I'm not sure, but this might be the cause of the contention.

Replies (10)

RE: xsl:message serialization - Added by Vladimir Nesterovsky over 6 years ago

I think I can model a deadlock.

Consider:

<xsl:message> <!-- thread1: lock over emitter -->

… content triggers parallel execution
   <xsl:for-each … saxon:threads …> <!-- thread1: wait for thread2 -->
     <xsl:message> <!-- thread2: lock attempt over emitter -->
     </xsl:message>
   </xsl:for-each>

</xsl:message>
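The same shape can be reproduced in plain Java. This is only a rough sketch, not Saxon code: the class name NestedMessageDeadlockSketch and the ReentrantLock standing in for synchronized (emitter) are assumptions made for illustration, and the timeout on tryLock exists only so that the demo terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the nested xsl:message case; not Saxon code.
class NestedMessageDeadlockSketch {

    // Returns true if the inner "message" managed to acquire the emitter lock.
    static boolean innerMessageAcquiresLock() {
        // Stands in for the monitor used by synchronized (emitter).
        ReentrantLock emitterLock = new ReentrantLock();
        AtomicBoolean acquired = new AtomicBoolean(false);

        emitterLock.lock();                        // thread1: outer xsl:message
        try {
            Thread worker = new Thread(() -> {     // thread2: inner xsl:message
                try {
                    // A plain synchronized block would wait here forever.
                    // The timeout is only there so the demo terminates.
                    if (emitterLock.tryLock(200, TimeUnit.MILLISECONDS)) {
                        acquired.set(true);
                        emitterLock.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            try {
                worker.join();                     // thread1: wait for thread2
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } finally {
            emitterLock.unlock();
        }
        return acquired.get();
    }

    public static void main(String[] args) {
        System.out.println("inner message acquired lock: " + innerMessageAcquiresLock());
    }
}
```

The inner thread never gets the lock while the outer one is waiting for it, which is the deadlock in the model above once the timeout is taken away.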

RE: xsl:message serialization - Added by Michael Kay over 6 years ago

All xsl:message output within a transformation is sent to the same MessageEmitter, and we can't rely on the MessageEmitter being thread-safe, so we have to synchronize. However, it might be better to compute the message content first, and then use synchronization only when sending it to the MessageEmitter. That's particularly true of course for an example like this where you access a lot of input data (count(...)) to construct a small message.
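As an illustration of "compute first, synchronize only when sending", here is a minimal Java sketch. It is not the Saxon implementation: BufferedMessageSketch, the List used as the emitter, and expensiveCount are all hypothetical stand-ins. The point is only that the expensive evaluation runs outside the lock and the critical section shrinks to the hand-off.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; the List plays the role of a non-thread-safe emitter.
class BufferedMessageSketch {
    static final List<String> emitter = new ArrayList<>();

    // Evaluate the message content outside the lock, then hold the
    // lock only for the short hand-off to the shared emitter.
    static void message(Supplier<String> content) {
        String text = content.get();       // user code runs unlocked
        synchronized (emitter) {
            emitter.add(text);             // brief critical section
        }
    }

    // Stand-in for something like count($items) over a large input.
    static int expensiveCount(int seed) {
        int total = 0;
        for (int i = 0; i < 1_000_000; i++) {
            if (i % (seed + 2) == 0) {
                total++;
            }
        }
        return total;
    }

    // Starts the given number of threads, each emitting one message,
    // and returns how many messages reached the emitter.
    static int runWorkers(int threads) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> message(() -> "count " + expensiveCount(id)));
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        synchronized (emitter) {
            return emitter.size();
        }
    }
}
```

The cost is that each message is held in memory in full before it is emitted, which is the buffering trade-off discussed in this thread.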

Certainly, holding a synchronization lock while executing user-written XSLT code is highly undesirable. On the other hand, it would also be undesirable to buffer all xsl:message content in cases where the synchronization problem does not arise.

Doing a multi-threaded for-each while executing xsl:message is a perverse case that we haven't really considered.

I'll raise a bug entry.

RE: xsl:message serialization - Added by Michael Kay over 6 years ago

Raised as a bug here: https://saxonica.plan.io/issues/3979

Note: the term "serialization" in the title does not refer to the process of converting a tree of nodes to lexical XML, it refers to the tendency whereby excessive synchronization can cause processes that should execute in parallel to actually execute in series.

RE: xsl:message serialization - Added by Vladimir Nesterovsky over 6 years ago

One solution could be to set up a separate message emitter per thread.
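Roughly what I have in mind, as a Java sketch. The names here (PerThreadEmitterSketch, allBuffers) are made up for illustration and are not Saxon classes; a plain list stands in for the emitter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of one message buffer per thread; not Saxon code.
class PerThreadEmitterSketch {
    // Every per-thread buffer is registered here so it can be collected later.
    static final ConcurrentLinkedQueue<List<String>> allBuffers = new ConcurrentLinkedQueue<>();

    // Each thread lazily gets its own buffer on first use.
    static final ThreadLocal<List<String>> emitter = ThreadLocal.withInitial(() -> {
        List<String> buffer = new ArrayList<>();
        allBuffers.add(buffer);
        return buffer;
    });

    static void message(String text) {
        emitter.get().add(text);   // no lock: the buffer belongs to this thread
    }

    // Runs the workers and returns the total number of messages collected.
    static int runWorkers(int threads, int messagesPerThread) {
        allBuffers.clear();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int m = 0; m < messagesPerThread; m++) {
                    message("thread " + id + " message " + m);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        int total = 0;
        for (List<String> buffer : allBuffers) {
            total += buffer.size();
        }
        return total;
    }
}
```

With this approach the relative order of messages from different threads is lost unless each record carries its own sequence number or timestamp.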

RE: xsl:message serialization - Added by Vladimir Nesterovsky over 6 years ago

"On the other hand, it would also be undesirable to buffer all xsl:message content in cases where the synchronization problem does not arise."

I can't speak in general, but in our codebase we use xsl:message for diagnostics, which is especially important in long-running transformations. The messages we issue are therefore essentially small log records. Full content buffering would have no noticeable impact in such cases.

RE: xsl:message serialization - Added by Vladimir Nesterovsky over 6 years ago

A similar deadlock threat exists when using XSLT keys.

Consider KeyManager.obtainSharedIndex():

lock doc
{
  assert no under construction mark

  put under construction mark
  run user code // This might be multithreaded and might trigger some other key over the same doc.
  put index
}

I think if you move the assert part above the lock on doc, you will be OK.

RE: xsl:message serialization - Added by Vladimir Nesterovsky over 6 years ago

"I think if you move the assert part above the lock on doc, you will be OK."

This won't help in general.

RE: xsl:message serialization - Added by Vladimir Nesterovsky over 6 years ago

I think it's best to keep a set of activities in the key manager, where an activity is a class keyed by the pair (doc, key).

Locks should be per activity.

The under-construction check should be above the lock.

Similar examples can be found at:

https://stackoverflow.com/questions/7985971/is-there-a-way-to-synchronize-using-two-lock-objects-in-java
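To make the proposal concrete, here is a rough Java sketch. It is not the real KeyManager: KeyActivitySketch, Activity, obtainIndex, and demo are hypothetical names, and the index is just an Object. Each (doc, key) pair gets its own lock, and the under-construction check is done before taking it. I assume here that the check only needs to catch the same thread re-entering its own key.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of per-(doc, key) locking; not the real KeyManager.
class KeyActivitySketch {
    static final class Activity {
        final Object lock = new Object();
        volatile Thread builder;   // non-null while the index is under construction
        Object index;              // guarded by lock
    }

    // Activities keyed by the pair (doc, key name), represented as a two-element list.
    static final Map<List<Object>, Activity> activities = new ConcurrentHashMap<>();

    static Object obtainIndex(Object doc, String keyName, Supplier<Object> build) {
        Activity a = activities.computeIfAbsent(List.of(doc, keyName), k -> new Activity());
        // Under-construction check above the lock: the same thread asking
        // again for the key it is currently building is a circular definition.
        if (a.builder == Thread.currentThread()) {
            throw new IllegalStateException("key " + keyName + " depends on itself");
        }
        synchronized (a.lock) {            // lock per activity, not per doc
            if (a.index == null) {
                a.builder = Thread.currentThread();
                try {
                    a.index = build.get(); // user code; may use other keys and threads
                } finally {
                    a.builder = null;
                }
            }
            return a.index;
        }
    }

    // Building one key spawns a thread that needs another key over the same doc.
    static Object demo() {
        Object doc = new Object();
        return obtainIndex(doc, "outer", () -> {
            Object[] inner = new Object[1];
            Thread t = new Thread(() -> inner[0] = obtainIndex(doc, "inner", () -> "inner-index"));
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "outer-index built after " + inner[0];
        });
    }
}
```

With a single lock per doc, the inner thread in demo would block on the lock held by the outer builder and never finish. A genuinely circular key that is reached through another thread would still block in this sketch.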

RE: xsl:message serialization - Added by Michael Kay over 6 years ago

Possibly related to https://saxonica.plan.io/issues/3984

Note, we have departed from the original subject of the thread. I don't think this is related to xsl:message any longer.

RE: xsl:message serialization - Added by Vladimir Nesterovsky over 6 years ago

Agree.

The only common theme here is multithreading.

Once we started to run our transformations with multithreading, we started to see some negative effects, including contention, deadlocks, and unstable outputs between runs.

It's not possible to provide simple tests for such cases. In fact, everything works perfectly within a single thread, so we need to look into the implementation.