When does parallelism increase performance?

I am trying to understand when using parallelism will increase performance.
I tested it with simple code that iterated over 100,000 items in a List<Person> and set each Name to string.Empty .

The parallel version took twice as long as the sequential version. (Yes, I tested it on more than one core...)

I have seen an answer here saying that parallelism does not always improve performance.
In addition, this caution is repeated on every page of the parallel-programming samples on MSDN:

These examples are primarily intended to demonstrate usage, and may or may not run faster than the equivalent sequential LINQ to Objects queries.

I am looking for some rules of thumb for when parallelism will improve the performance of my code and when it will not.
The obvious answer, "measure your code and check whether the parallel loop is faster," is perfectly correct, but nobody runs a performance analysis on every loop they write.
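For reference, a minimal sketch of the kind of benchmark described above. Person, the 100,000-item count, and the two loop bodies come from the question; the Stopwatch harness is a reconstruction, and the exact timings will vary by machine:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

class Person { public string Name { get; set; } }

class Program
{
    static void Main()
    {
        var people = new List<Person>();
        for (int i = 0; i < 100_000; i++)
            people.Add(new Person { Name = "Name" + i });

        var sw = Stopwatch.StartNew();
        foreach (var p in people) p.Name = string.Empty;
        sw.Stop();
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        Parallel.ForEach(people, p => p.Name = string.Empty);
        sw.Stop();
        Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms");
        // On most machines the parallel version is no faster (often slower):
        // the per-item work is far too cheap to amortize the scheduling overhead.
    }
}
```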

+4
5 answers

Think about when it pays to parallelize something in real life. When is it better to just sit down and do the job yourself from start to finish, and when is it better to hire twenty guys?

  • Is the work inherently parallelizable, or inherently serial? Some jobs cannot be parallelized at all: nine women cannot work together to make one baby in a month. Some jobs can be parallelized, but give poor results: you could hire twenty guys, assign each of them fifty pages of War and Peace to read for you, have each of them write one-twentieth of an essay, and glue the fragments together into one paper to submit; that is unlikely to earn a good grade. Some jobs are highly parallelizable: twenty guys with shovels can dig a hole much faster than one guy.

  • If the work is inherently parallelizable, does parallelizing it actually save time? You can cook one pot of spaghetti with a hundred noodles in it, or you can cook twenty pots of spaghetti with five noodles each and pour the results together at the end. I guarantee you that parallelizing the task of cooking spaghetti does not get you your dinner any faster.

  • If the work is parallelizable and there is a potential time saving, is hiring those guys actually worth it? If it is faster to do the work yourself than to hire the guys, parallelization is not a win. Hiring twenty guys to do a job that takes you five seconds, in the hope that they will do it in a quarter of a second, saves you nothing if it takes you a day to find the guys.

Parallelization typically wins when the work is enormous and parallelizable. Setting a hundred thousand pointers to null is something a computer can do in a tiny fraction of a second; there is no enormous cost, so there are no savings to be had. Try doing something non-trivial instead; say, write a compiler and run the semantic analysis of the method bodies in parallel. You are far more likely to see a win there.
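The "enormous and parallelizable" point can be illustrated with a toy stand-in for expensive per-item work. IsPrime below is an illustrative placeholder, not anything from the answer; naive trial division is deliberately CPU-hungry, so each element gives the cores something real to chew on:

```csharp
using System;
using System.Linq;

class Demo
{
    // Deliberately expensive, CPU-bound work: naive trial division.
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int i = 2; (long)i * i <= n; i++)
            if (n % i == 0) return false;
        return true;
    }

    static void Main()
    {
        // Enough work per element that PLINQ's overhead is amortized;
        // compare against .Count(IsPrime) on the plain sequence.
        int primes = Enumerable.Range(2, 2_000_000)
                               .AsParallel()
                               .Count(IsPrime);
        Console.WriteLine(primes);
    }
}
```

Unlike zeroing a field, this kind of loop tends to scale with the number of cores.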

+20

If you iterate over a collection and do something computationally expensive for each element (especially if that "something" is not I/O-bound), you are likely to see a benefit from parallelizing the loop. Setting a property to string.Empty is not computationally expensive, which is why you saw no improvement.

+4

A loop benefits from parallelism when the computation performed in parallel outweighs the overhead of the parallelism itself (starting threads, switching between threads, communication, lock contention, and so on). Your test seems to assume that parallelism should benefit even a trivial computation, but it does not; what it is showing you is the overhead of parallelism. The amount of work must be greater (and usually significantly greater) than that overhead before you see any benefit.

You are also too quick to dismiss measurement. Measuring is the only way to find out whether parallelism buys you anything. You do not need to test the performance of every loop, only the performance-critical ones. If a loop is not performance-critical, why bother parallelizing it at all? And if it is critical enough to be worth the effort of parallelizing, you had better measure to make sure your labor paid off, and add regression tests so that some "clever" programmer does not destroy your gains later.

+2

Here are my rules of thumb for when to consider parallelizing code (and even then, you should still measure whether it is actually faster):

  • The code you want to parallelize is computationally intensive. Merely waiting on I/O will usually not gain you much; it should be something that genuinely burns CPU time (e.g. image rendering).
  • The code you want to parallelize does enough work per item that the overhead of setting up the parallelism is less than the savings from distributing the work (setting a string to string.Empty is incredibly simple and fast; you need something far more expensive per item for it to be worth it).
  • The iterations are independent of one another, with no dependencies between elements.
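A sketch that satisfies all three rules, using the image-rendering example from the first bullet. The dimensions and the per-pixel formula are made-up placeholders; the point is that each pixel's value is CPU-bound and depends on nothing but its own coordinates, so Parallel.For can safely split the rows across cores:

```csharp
using System;
using System.Threading.Tasks;

class RenderDemo
{
    const int Width = 1024, Height = 768;

    static void Main()
    {
        var image = new double[Height, Width];

        // Rows are independent: no iteration reads or writes another
        // row's data, so there is nothing to synchronize.
        Parallel.For(0, Height, y =>
        {
            for (int x = 0; x < Width; x++)
            {
                // Stand-in for real per-pixel work (shading, filtering...).
                image[y, x] = Math.Sqrt(x * x + y * y) * Math.Sin(x + y);
            }
        });

        Console.WriteLine(image[Height - 1, Width - 1]);
    }
}
```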
+1

Parallelism helps performance only to the extent that it lets you put all of your hardware to useful work.

Two CPU-bound threads will not be faster than one if they have to share a single core. In fact, they will be slower.
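One way to avoid paying for threads that only fight over the same core is to cap the degree of parallelism at the number of logical processors; ParallelOptions.MaxDegreeOfParallelism and Environment.ProcessorCount are real .NET APIs, though the summation below is just a placeholder workload:

```csharp
using System;
using System.Threading.Tasks;

class DegreeDemo
{
    static void Main()
    {
        // More CPU-bound workers than cores just adds context switching;
        // capping at ProcessorCount keeps each worker on its own core.
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        long total = 0;
        object gate = new object();
        Parallel.For(0, 1000, options,
            () => 0L,                                 // per-worker local sum
            (i, state, local) => local + i,           // no shared state per item
            local => { lock (gate) total += local; }  // merge once per worker
        );
        Console.WriteLine(total); // 0 + 1 + ... + 999 = 499500
    }
}
```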

There are also reasons other than performance for using multiple threads. For example, a web application that must serve many concurrent users could be written as a single thread that simply responds to events; however, the code is much simpler if it is written with threads.

That does not make the code faster; it makes it easier to write.

0

Source: https://habr.com/ru/post/1388503/

