Stream Data Lock

On page 88 of Stephen Toub's book Patterns of Parallel Programming

http://www.microsoft.com/download/en/details.aspx?id=19222

there is the following code:

private BlockingCollection<T> _streamingData = new BlockingCollection<T>();

// Parallel.ForEach
Parallel.ForEach(_streamingData.GetConsumingEnumerable(),
                 item => Process(item));

// PLINQ
var q = from item in _streamingData.GetConsumingEnumerable().AsParallel()
        ...
        select item;

Then Stephen mentions

"When passing the result of calling GetConsumingEnumerable as a data source to Parallel.ForEach, the threads used by the loop may block when the collection becomes empty. And a blocked thread cannot be released by Parallel.ForEach back to the ThreadPool for retirement or other purposes. So, with the code as shown above, if there are times when the collection is empty, the number of threads in the process may grow steadily."

I do not understand why the number of threads would grow.

If the collection is empty, won't the BlockingCollection simply not ask for more threads?

So shouldn't there be no need to use WithDegreeOfParallelism to limit the number of threads used with the BlockingCollection?
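For reference, the kind of cap I mean is something like this. If I understand the book correctly, the recommendation is to bound the degree of parallelism explicitly so that only a fixed number of threads can ever be blocked in the loop; this is just a sketch, and Process, the element type, and the cap of 4 are placeholders of mine, not from the book:

using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class CappedConsumers
{
    static BlockingCollection<string> _streamingData = new BlockingCollection<string>();

    static void Consume()
    {
        // Parallel.ForEach: at most 4 threads can ever be blocked inside the loop.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
        Parallel.ForEach(_streamingData.GetConsumingEnumerable(), options, item => Process(item));

        // PLINQ: WithDegreeOfParallelism bounds the worker threads the query uses.
        var q = _streamingData.GetConsumingEnumerable()
                              .AsParallel()
                              .WithDegreeOfParallelism(4)
                              .Select(item => item);
        foreach (var item in q) { /* consume results */ }
    }

    static void Process(string item) { /* placeholder */ }
}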

1 answer

The thread pool uses a hill-climbing algorithm to estimate the appropriate number of threads. As long as adding threads increases throughput, the pool keeps creating more of them. It assumes that some blocking or I/O is going on and tries to saturate the CPU by going above the number of processors in the machine.

This is why performing I/O and blocking on thread-pool threads can be dangerous.
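If you really do need to block like this, one way to keep it off the shared pool (my own sketch, not something from the book or the question) is to give the consumers dedicated long-running tasks, so blocking in GetConsumingEnumerable never occupies, or grows, the pool's worker threads; the consumer count is arbitrary:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class DedicatedConsumers
{
    static BlockingCollection<string> _streamingData = new BlockingCollection<string>();

    static Task[] StartConsumers(int count)
    {
        var consumers = new Task[count];
        for (int i = 0; i < count; i++)
        {
            // LongRunning asks the scheduler for a dedicated thread, so blocking
            // in GetConsumingEnumerable does not tie up a thread-pool worker.
            consumers[i] = Task.Factory.StartNew(() =>
            {
                foreach (var item in _streamingData.GetConsumingEnumerable())
                {
                    Console.WriteLine(item); // placeholder for real processing
                }
            }, TaskCreationOptions.LongRunning);
        }
        return consumers;
    }
}

Each consumer blocks on its own dedicated thread when the collection is empty, and the loops end once CompleteAdding is called on the collection.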

Here is a complete, working example of the behavior:

using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

BlockingCollection<string> _streamingData = new BlockingCollection<string>();

// Producer: one item every 100 ms; CompleteAdding is never called,
// so the consuming loop's threads block whenever the collection is empty.
Task.Factory.StartNew(() =>
{
    for (int i = 0; i < 100; i++)
    {
        _streamingData.Add(i.ToString());
        Thread.Sleep(100);
    }
});

// Print the process thread count once per second.
new Thread(() =>
{
    while (true)
    {
        Thread.Sleep(1000);
        Console.WriteLine("Thread count: " + Process.GetCurrentProcess().Threads.Count);
    }
}).Start();

Parallel.ForEach(_streamingData.GetConsumingEnumerable(), item => { });

I do not know why the number of threads keeps growing even though doing so does not increase throughput. According to the model I described, it should not grow. But I do not know whether that model is actually correct.

Perhaps there is an additional heuristic in the thread pool that makes it inject threads when it sees no progress at all (measured in tasks completed per second). That would make sense, because it would probably prevent many deadlocks in applications. Deadlocks can occur when important work items cannot run because they are waiting for already-running work items to finish and free up threads. This is a well-known thread pool issue.
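A sketch of the kind of starvation I mean (my own illustration under default thread pool settings, not anything from the book): every queued work item blocks waiting for another queued work item, so nothing can finish until the pool injects extra threads, which it only does slowly:

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class StarvationDemo
{
    static void Main()
    {
        var outer = new Task[32];
        for (int i = 0; i < outer.Length; i++)
        {
            int n = i;
            outer[n] = Task.Run(() =>
            {
                // Blocks a pool thread until some other pool thread runs 'inner'.
                var inner = Task.Run(() => Thread.Sleep(10));
                inner.Wait();
                Console.WriteLine("outer " + n + " done, threads: " +
                                  Process.GetCurrentProcess().Threads.Count);
            });
        }
        Task.WaitAll(outer); // completes only as the pool slowly injects new threads
    }
}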


Source: https://habr.com/ru/post/1393499/

