Bug #2224 (Closed)

Blocking when running transformations on multiple threads in .NET

Added by Jeff Monnette about 10 years ago. Updated almost 10 years ago.

Status: Closed
Priority: Normal
Category: Multithreading
Sprint/Milestone: -
Start date: 2014-11-18
Due date: -
% Done: 0%
Estimated time: -
Legacy ID: -
Applies to branch: -
Fix Committed on Branch: -
Fixed in Maintenance Release: -
Platforms: -

Description

I've observed behavior similar to what was reported here.

https://saxonica.plan.io/boards/3/topics/5984

I'd like to work with you to see if we can find and address the cause of the blocking. What information can I provide you to assist?


Files

ZeroMQPoc.zip (6.55 MB) - Jeff Monnette, 2014-12-02 19:08
XSLT Blocking Results.xlsx (22.6 KB) - Jeff Monnette, 2014-12-02 19:08
Actions #1

Updated by O'Neil Delpratt about 10 years ago

  • Category set to Multithreading
  • Assignee set to O'Neil Delpratt

Please could you send us the stylesheet and any resource files needed to reproduce the problem? You can either add them to this bug entry or send them privately by email if the data is confidential.

It would also be good to know more about your setup, such as the version of Saxon you are running, the .NET version, etc.

Thanks,

O'Neil

Actions #2

Updated by Michael Kay about 10 years ago

To amplify O'Neil's answer:

(a) it's useful for us to check how your application is running the transformation, if only to confirm that you aren't doing anything at this level that is obviously wrong.

(b) it's useful for us to have a sample of stylesheets and source documents that demonstrate the problem. The problem is quite likely to depend on some detailed feature of the stylesheet that it would be hard to isolate.

(c) it's not clear from your description whether what you are seeing is a deadlock (all threads suspended), or simply excessive contention (the workload proceeds to completion, but slows down as you increase the number of threads).

(d) if threads are hanging (waiting indefinitely), it's useful for us to have a VM dump of some kind, e.g. diagnostics from DebugDiag 1.2.

Actions #3

Updated by Jeff Monnette almost 10 years ago

Sorry for the delay in getting back to you. I was pulled into other things.

To clarify, the issue I am seeing is excessive contention, not deadlocking.

I've attached a zip file that contains the test harness I created to test running a transformation in parallel using either threads or processes. The file also contains the XSLTs, dependencies, and a sample source file.

For my tests, I used .NET 4.5 and Saxon 9.4 EE. You'll see that the test harness does the following:

  1. Spin up the specified number of transformation workers (either threads or processes)
  2. Split the input file into individual records using an XmlReader (in this case, our input file is a flat list of records and each record can be transformed independently of the rest of the file)
  3. Distribute the individual records to workers for transformation
  4. The workers apply a series of transformations to the individual records
  5. The output of the final transformation is simply discarded (to eliminate disk contention as a cause of the problem)

The records are transformed from the input format to an intermediate format using medline_recs.xsl and then to a final format using i2a.xsl. I am using 0MQ to communicate across threads/processes. A simplified sketch of the thread-mode path is shown below.
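
The sketch is simplified in a few respects: the 0MQ distribution is replaced by an in-process BlockingCollection, the record element name ("MedlineCitation") is just a placeholder, and the Saxon.Api usage (Processor, XsltCompiler, XsltExecutable, XsltTransformer, XdmDestination) is indicative rather than a copy of the harness code.

    // Simplified sketch of the thread-mode path: compile the stylesheets once,
    // split the input into records with an XmlReader, and transform each record
    // on a pool of worker threads. (Placeholder names; 0MQ replaced by a queue.)
    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading;
    using System.Xml;
    using Saxon.Api;

    class ThreadedTransformSketch
    {
        static void Main()
        {
            Processor processor = new Processor(true);   // EE processor shared by all workers
            XsltCompiler compiler = processor.NewXsltCompiler();
            XsltExecutable stage1 = compiler.Compile(new Uri(Path.GetFullPath("medline_recs.xsl")));
            XsltExecutable stage2 = compiler.Compile(new Uri(Path.GetFullPath("i2a.xsl")));

            var records = new BlockingCollection<string>(boundedCapacity: 100);
            int workerCount = 4;                          // varied in the tests
            var workers = new Thread[workerCount];

            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    DocumentBuilder builder = processor.NewDocumentBuilder();
                    builder.BaseUri = new Uri(Path.GetFullPath("input.xml"));
                    foreach (string recordXml in records.GetConsumingEnumerable())
                    {
                        // Parse the record fragment into an XdmNode.
                        XdmNode input = builder.Build(XmlReader.Create(new StringReader(recordXml)));

                        // Stage 1: input format -> intermediate format (held in memory).
                        XsltTransformer t1 = stage1.Load();
                        t1.InitialContextNode = input;
                        XdmDestination intermediate = new XdmDestination();
                        t1.Run(intermediate);

                        // Stage 2: intermediate format -> final format; output is discarded.
                        XsltTransformer t2 = stage2.Load();
                        t2.InitialContextNode = intermediate.XdmNode;
                        Serializer sink = new Serializer();
                        sink.SetOutputWriter(TextWriter.Null);
                        t2.Run(sink);
                    }
                });
                workers[i].Start();
            }

            // Producer: pull individual records out of the flat input file.
            using (XmlReader reader = XmlReader.Create("input.xml"))
            {
                while (reader.ReadToFollowing("MedlineCitation"))   // placeholder element name
                {
                    records.Add(reader.ReadOuterXml());
                }
            }
            records.CompleteAdding();
            foreach (Thread w in workers) w.Join();
        }
    }

Note that in thread mode all of the workers share a single Processor instance, whereas in process mode each worker process necessarily creates its own.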

I also collect various timing info (sketched below) to measure:

  1. The total time needed to process the file
  2. The average time to process each record
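
The timing collection is nothing elaborate; roughly the following sketch (not the harness code itself):

    // Sketch of the timing collection: one Stopwatch for the whole run plus
    // per-record elapsed times, from which the average is computed afterwards.
    using System;
    using System.Collections.Concurrent;
    using System.Diagnostics;

    static class Timing
    {
        public static readonly Stopwatch Total = Stopwatch.StartNew();
        public static readonly ConcurrentBag<double> RecordMillis = new ConcurrentBag<double>();

        public static void TimeRecord(Action transformRecord)
        {
            Stopwatch sw = Stopwatch.StartNew();
            transformRecord();
            RecordMillis.Add(sw.Elapsed.TotalMilliseconds);
        }
    }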

I ran tests using a ~30,000 record input file and a sample 150 record file. My test cases included various numbers of workers in both thread and process mode.

As you can see in the results, when running the 30k record file, both the total time and the average times are substantially worse in thread mode compared to process mode.

The most interesting result is the steady increase in average times as the number of workers increases. This is what leads me to think there is some kind of blocking occurring in the transformation. I'm running these tests on a machine with 4 physical cores (8 logical cores due to hyperthreading), so I wouldn't expect to see much processor contention until the number of workers increases to 7 or 8.

Please let me know if there is any other information I can provide that would be helpful.

Actions #4

Updated by O'Neil Delpratt almost 10 years ago

Thanks for sending us the project. It is unclear to me how to run it; please could you give some instructions?

Actions #5

Updated by Michael Kay almost 10 years ago

One point to note is that we have made considerable progress since 9.4 in reducing NamePool contention. We don't know yet whether that's the culprit here, but it is prima facie the first suspect. NamePool contention occurs when multiple threads synchronise on the NamePool to allocate name codes. In 9.5 in particular, and to a lesser extent in 9.6, we made significant changes to reduce the number of synchronized accesses to the NamePool.

The NamePool holds a mapping from QNames to integers, so that XPath expressions and XSLT patterns that match elements by name can do integer comparisons rather than string comparisons. This gives a substantial speed-up, but does have the side-effect of causing contention because different threads need to allocate integer codes from the same pool.
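
To make that concrete, here is a much-simplified sketch of the kind of structure involved (an illustration of the mechanism, not the actual Saxon code): any thread that needs the integer code for a name it has not seen before has to go through the same lock as every other thread.

    // Much-simplified illustration of a shared name pool; not Saxon's implementation.
    using System.Collections.Generic;

    class SimplifiedNamePool
    {
        private readonly Dictionary<string, int> codes = new Dictionary<string, int>();
        private readonly object sync = new object();
        private int next;

        // Called whenever a document builder or expression needs the integer code
        // for a QName; allocation is synchronized, so concurrent callers queue here.
        public int Allocate(string qName)
        {
            lock (sync)
            {
                int code;
                if (!codes.TryGetValue(qName, out code))
                {
                    code = next++;
                    codes[qName] = code;
                }
                return code;
            }
        }
    }

Once the codes have been allocated, a name test is just an integer comparison, which is where the speed-up comes from; the cost is that the allocation path is shared by every thread using the same pool.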

Actions #6

Updated by Michael Kay almost 10 years ago

The figures you are reporting are not that unexpected: see my 2009 blog posting, where we did some measurements of concurrent throughput:

http://dev.saxonica.com/blog/mike/2009/02/some-threading-tests.html

I think we probably made some improvements between the date of that post (Feb 2009) and release 9.4 (which came out in Dec 2011), and we have made more improvements since, but NamePool contention has not been entirely eliminated.

Actions #7

Updated by O'Neil Delpratt almost 10 years ago

  • Status changed from New to Closed

This is not a bug, so I am closing the issue. We have made progress in reducing contention since 9.4 and have ideas for further improvements.
