Bug #2224
Status: Closed
Blocking when running transformations on multiple threads in .NET
Description
I've observed behavior similar to what was reported here.
https://saxonica.plan.io/boards/3/topics/5984
I'd like to work with you to see if we can find and address the cause of the blocking. What information can I provide you to assist?
Updated by O'Neil Delpratt about 10 years ago
- Category set to Multithreading
- Assignee set to O'Neil Delpratt
Please could you send us the stylesheet and any resource files needed to reproduce the problem. You can either attach them to this bug entry or, if the data is confidential, send them privately by email.
It would also help to know more about your setup, such as the Saxon version and the .NET version you are running.
Thanks,
O'Neil
Updated by Michael Kay about 10 years ago
To amplify O'Neil's answer:
(a) it's useful for us to check how your application is running the transformation, if only to confirm that you aren't doing anything at this level that is obviously wrong.
(b) it's useful for us to have a sample of stylesheets and source documents that demonstrate the problem. The problem is quite likely to depend on some detailed feature of the stylesheet that it would be hard to isolate.
(c) it's not clear from your description whether what you are seeing is a deadlock (all threads suspended), or simply excessive contention (the workload proceeds to completion, but slows down as you increase the number of threads).
(d) if threads are hanging (waiting indefinitely), it's useful for us to have a VM dump of some kind, e.g. diagnostics from DebugDiag 1.2.
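As an illustration of point (a), here is a minimal sketch of the usual pattern: compile the stylesheet once, share the compiled form between threads, and give each task its own transformer. It uses the JDK's built-in JAXP API rather than Saxon's .NET API, so it only shows the general shape (Saxon's API makes a similar split between a compiled stylesheet and a per-use transformer). The class name, the stylesheet and the `<rec>` records are hypothetical stand-ins, not taken from the reporter's project.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SharedStylesheetSketch {
    // Hypothetical stand-in stylesheet; the real ones are in the attached project.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:template match='/rec'><out><xsl:value-of select='.'/></out></xsl:template>"
        + "</xsl:stylesheet>";

    static List<String> transformAll(List<String> records, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Compile once: a Templates object may be shared between threads.
            Templates compiled = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSLT)));
            List<Future<String>> pending = new ArrayList<>();
            for (String record : records) {
                pending.add(pool.submit(() -> {
                    // A Transformer must not be shared, so each task creates its own.
                    Transformer transformer = compiled.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(
                        new StreamSource(new StringReader(record)), new StreamResult(out));
                    return out.toString();
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get()); // collected in input order
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(transformAll(List.of("<rec>a</rec>", "<rec>b</rec>"), 2));
    }
}
```

Recompiling the stylesheet for every record, or sharing one transformer object across threads, are the kinds of application-level mistakes this check is meant to rule out.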
Updated by Jeff Monnette almost 10 years ago
- File ZeroMQPoc.zip added
- File XSLT Blocking Results.xlsx added
Sorry for the delay in getting back to you. I was pulled into other things.
To clarify, the issue I am seeing is excessive contention, not deadlocking.
I've attached a zip file that contains the test harness I created to test
running a transformation in parallel using either threads or processes.
The file also contains the XSLTs, dependencies, and a sample source file.
For my tests, I used .NET 4.5 and Saxon 9.4 EE. You'll see that the test
harness does the following:
- Spin up the specified number of transformation workers (either threads
  or processes)
- Split the input file into individual records using an XmlReader (in this
  case, our input file is a flat list of records and each record can be
  transformed independently of the rest of the file)
- Distribute the individual records to workers for transformation
- The workers apply a series of transformations to the individual records
- The output of the final transformation is simply discarded (to eliminate
  disk contention as a cause of the problem)
The records are transformed from the input format to an intermediate format
using medline_recs.xsl and then to a final format using i2a.xsl.
I am using 0MQ to communicate across threads/processes.
I also collect various timing info to measure:
- The total time needed to process the file
- The average time to process each record
I ran tests using a ~30,000 record input file and a sample 150 record
file. My test cases included various numbers of workers in both thread and
process mode.
As you can see in the results, when running the 30k record file, both the
total time and the average times are substantially worse in thread mode
compared to process mode.
The most interesting result is the steady increase in average times as the
number of workers increases. This is what leads me to think there is some
kind of blocking occurring in the transformation. I'm running these tests
on a machine with 4 physical cores (8 logical cores due to hyperthreading)
so I wouldn't expect to see much processor contention until the number of
workers increases to 7 or 8.
Please let me know if there is any other information I can provide that
would be helpful.
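For readers without the attachment, the thread-mode measurement described above can be sketched roughly as follows. This is not the attached harness: it is a Java stand-in using the JDK's built-in XSLT engine, an in-memory queue instead of 0MQ, records that are assumed to be already split, and two trivial hypothetical stylesheets in place of medline_recs.xsl and i2a.xsl. Process mode is not covered.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class ThreadModeHarnessSketch {
    // Hypothetical stand-ins for the two stages: <rec> -> <mid> -> <final>.
    static final String STAGE1 = stylesheet("rec", "mid");
    static final String STAGE2 = stylesheet("mid", "final");

    static String stylesheet(String from, String to) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='/" + from + "'><" + to + "><xsl:value-of select='.'/></" + to + ">"
            + "</xsl:template></xsl:stylesheet>";
    }

    static Templates compile(String xslt) {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static String apply(Templates stage, String xml) {
        try {
            StringWriter out = new StringWriter();
            stage.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns {total nanoseconds, average nanoseconds per record, records processed}. */
    static long[] run(List<String> records, int workers) {
        Templates first = compile(STAGE1);
        Templates second = compile(STAGE2);
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(records);
        AtomicLong recordNanos = new AtomicLong();
        AtomicLong processed = new AtomicLong();
        Thread[] threads = new Thread[workers];
        long start = System.nanoTime();
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                String record;
                while ((record = queue.poll()) != null) {
                    long t0 = System.nanoTime();
                    apply(second, apply(first, record)); // final output is discarded
                    recordNanos.addAndGet(System.nanoTime() - t0);
                    processed.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        long total = System.nanoTime() - start;
        long count = processed.get();
        return new long[] {total, recordNanos.get() / Math.max(1, count), count};
    }

    public static void main(String[] args) {
        List<String> records = Collections.nCopies(2000, "<rec>sample</rec>");
        for (int workers : new int[] {1, 2, 4, 8}) {
            long[] r = run(records, workers);
            System.out.printf("workers=%d total=%d ms avg=%d us%n",
                workers, r[0] / 1_000_000, r[1] / 1_000);
        }
    }
}
```

If the workers were fully independent, the average per-record time would be expected to stay roughly flat while the worker count is below the number of cores. An average that climbs steadily with each added worker is the signal reported here.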
Updated by O'Neil Delpratt almost 10 years ago
Thanks for sending us the project. It is unclear to me how to run it. Please could you give some instructions.
Updated by Michael Kay almost 10 years ago
One point to note is that we have made considerable progress since 9.4 in reducing NamePool contention. We don't know yet whether that's the culprit here, but it is prima facie the first suspect. NamePool contention occurs when multiple threads synchronise on the NamePool to allocate name codes. In 9.5 in particular, and to a lesser extent in 9.6, we made significant changes to reduce the number of synchronized accesses to the NamePool.
The NamePool holds a mapping from QNames to integers, so that XPath expressions and XSLT patterns that match elements by name can do integer comparisons rather than string comparisons. This gives a substantial speed-up, but does have the side-effect of causing contention because different threads need to allocate integer codes from the same pool.
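To make the idea concrete, here is a minimal sketch of a name pool. It is not Saxon's actual NamePool code; the class and method names are hypothetical. The first class takes one lock on every lookup, which is where threads would queue up. The second shows one common way to cut lock traffic (a concurrent map with an atomic counter); it is offered only as an illustration and is not necessarily what was done in 9.5 or 9.6.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class NamePoolSketch {
    private final Map<String, Integer> codes = new HashMap<>();

    // Every caller takes the same lock, even when the name is already known.
    synchronized int allocate(String uri, String local) {
        String key = "{" + uri + "}" + local;
        Integer code = codes.get(key);
        if (code == null) {
            code = codes.size(); // next unused integer
            codes.put(key, code);
        }
        return code;
    }

    public static void main(String[] args) {
        NamePoolSketch pool = new NamePoolSketch();
        int record = pool.allocate("", "record");
        int title = pool.allocate("", "title");
        // Matching an element by name is now an integer comparison.
        System.out.println(pool.allocate("", "record") == record); // prints true
        System.out.println(record == title);                       // prints false
    }
}

final class ConcurrentNamePoolSketch {
    private final ConcurrentHashMap<String, Integer> codes = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    // Lookups of existing names normally complete without locking;
    // a new code is assigned at most once per name.
    int allocate(String uri, String local) {
        return codes.computeIfAbsent("{" + uri + "}" + local, k -> next.getAndIncrement());
    }
}
```

In a workload like the one reported, every worker thread would be calling something like `allocate` on the same shared pool, so the cost of that one lock grows with the number of threads.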
Updated by Michael Kay almost 10 years ago
The figures you are reporting are not that unexpected: see my 2009 blog post, where we took some measurements of concurrent throughput:
http://dev.saxonica.com/blog/mike/2009/02/some-threading-tests.html
I think we probably made some improvements between the date of that post (Feb 2009) and release 9.4 (which came out in Dec 2011), and we have made more improvements since, but NamePool contention has not been entirely eliminated.
Updated by O'Neil Delpratt almost 10 years ago
- Status changed from New to Closed
This is not a bug, so I am closing the issue. We have made progress in reducing contention since 9.4 and have ideas for future improvements.