Will a multi-threaded application run faster than a single-threaded application?

This is all completely theoretical; the question just came to mind, and I was not completely sure of the answer:

Suppose you have an application that performs 4 independent calculations. (Completely independent: it doesn't matter in what order you do them, and none of them needs the result of another.) Also assume that these calculations are long (minutes) and CPU-bound (no waiting on any I/O).

1) Now, if you have a 1-processor computer, a single-threaded application will logically be faster than (or the same as) a multi-threaded one. Since the computer cannot do more than one thing at a time with one processor, the multi-threaded version would only "waste time" on context switches and the like. So far so good?

2) If you have a computer with 4 processors, 4 threads will most likely be faster than a single thread, right? Your computer can now perform 4 operations at a time, so it's only logical to split your application into 4 threads, and it should finish in roughly the time the longest of the 4 calculations takes (see the sketch after point 3). Still good?

3) And now the part I'm actually unsure about: why would I want my application to create more threads than the number of processors (well, actually, cores)? I have written and seen applications that create dozens and hundreds of threads, but in reality, wouldn't the ideal number be about 8 for an average computer?
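
To make (2) concrete, here is roughly the kind of thing I mean, as a minimal Java sketch (the `longCalculation` method is just a placeholder for one of the four independent computations):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FourCalculations {

    // Placeholder for one of the four long, CPU-bound, independent calculations.
    static long longCalculation(int which) {
        long acc = which;
        for (long i = 0; i < 500_000_000L; i++) {
            acc = acc * 31 + i;   // pure CPU work, no I/O
        }
        return acc;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4); // one thread per calculation
        List<Future<Long>> results = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int which = i;
            results.add(pool.submit(() -> longCalculation(which)));
        }
        for (Future<Long> f : results) {
            System.out.println(f.get()); // blocks until that calculation is done
        }
        pool.shutdown();
    }
}
```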

P.S. I already read this: Threading vs single thread, but it did not answer this question.

Greetings

+5
4 answers

Why would I want my application to create more threads than the number of processors (well, actually, cores)?

One very good reason is if you have threads that wait on events. For example, you might have a producer/consumer application in which the producer reads data from some data stream, and the data arrives in bursts: several hundred (or thousand) records in a batch, then nothing for a while, then another burst. Say you have a 4-core machine. You could have one producer thread that reads the data and puts it on a queue, and three consumer threads that process items from the queue.

Or you could have one producer thread and four consumer threads. Most of the time, the producer thread is idle, giving you four consumer threads to process items from the queue. But when items arrive on the data stream, one of the consumer threads gets swapped out in favor of the producer.

This is a simplified example, but it's essentially the same as programs I have in production.
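
As a minimal sketch of that pattern, assuming Java with a `BlockingQueue` between one producer and three consumers (the `readBatch` and `process` methods are invented stand-ins for the real data source and processing):

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ProducerConsumer {

    // Invented stand-in for the bursty data source described above.
    static List<String> readBatch() throws InterruptedException {
        Thread.sleep(1000);                                   // nothing for a while...
        return List.of("record-1", "record-2", "record-3");   // ...then a burst of records
    }

    static void process(String record) {
        // CPU-bound work on one record would go here.
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // One producer thread: idle most of the time, wakes up when a burst arrives.
        Thread producer = new Thread(() -> {
            try {
                while (true) {
                    for (String record : readBatch()) {
                        queue.put(record);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // Three consumer threads draining the queue (or four, as described above).
        for (int i = 0; i < 3; i++) {
            new Thread(() -> {
                try {
                    while (true) {
                        process(queue.take());                // blocks while the queue is empty
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}
```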

More generally, it makes no sense to create more continuously-working (i.e. CPU-bound) threads than you have processors (really CPU cores, although the existence of hyper-threading muddies the water a bit). If you know your threads will never wait on external events, then having n+1 threads when you have only n cores just loses time to thread context switches. Note that this is strictly within the context of your program. If other applications and OS services are running, your application's threads will be swapped out from time to time so that those other applications and services get their time slices. But presumably, if you're running a CPU-intensive program, you'll limit the other applications and services that run at the same time.

The best thing, of course, is to set up a test. On a 4-core machine, run your application with 1, 2, 3, 4, 5, ... threads and time how long it takes to complete with each thread count. I think you'll find that on a 4-core machine the sweet spot is 3 or 4; most likely 4 if there are no other applications or OS services taking up a lot of CPU.
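
For example, a rough test harness along those lines could look like this (Java; the `work` method is a placeholder for your real calculation, and the numbers only mean something on an otherwise idle machine):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadCountBenchmark {

    // Placeholder CPU-bound task; substitute your real calculation here.
    static double work() {
        double x = 0;
        for (int i = 1; i < 50_000_000; i++) {
            x += Math.sqrt(i);
        }
        return x;
    }

    public static void main(String[] args) throws Exception {
        int totalTasks = 16; // the same total amount of work for every run
        for (int threads = 1; threads <= 8; threads++) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            List<Callable<Double>> jobs = new ArrayList<>();
            for (int i = 0; i < totalTasks; i++) {
                jobs.add(ThreadCountBenchmark::work);
            }
            long start = System.nanoTime();
            pool.invokeAll(jobs);            // waits until every task has finished
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + elapsedMs + " ms");
            pool.shutdown();
        }
    }
}
```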

+5

I think you're assuming that all programs are CPU-bound. Remember that some of your threads will be waiting on I/O (disk / network / user input).

+1

One reason to use more threads than cores is when some threads have to interact with other parties: waiting for a response from a server, requesting something from a database, and so on. That lets the waiting thread sleep until the response arrives, so other calculations don't have to wait. With 4 cores -> 4 threads, a thread waiting on input may force other work to wait too.
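
As a small sketch of that idea (Java; `fetchFromServer` is an invented stand-in for a blocking server or database call):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OverlapIoWithCpu {

    // Invented stand-in for a blocking call to a server or database.
    static String fetchFromServer() throws InterruptedException {
        Thread.sleep(2000);          // the thread sleeps here, using no CPU
        return "response";
    }

    static long crunchNumbers() {
        long acc = 0;
        for (long i = 0; i < 1_000_000_000L; i++) {
            acc += i;
        }
        return acc;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // The I/O-bound thread sleeps while waiting, so the CPU-bound
        // calculation does not have to wait for the response.
        Future<String> response = pool.submit(OverlapIoWithCpu::fetchFromServer);
        Future<Long> result = pool.submit(OverlapIoWithCpu::crunchNumbers);
        System.out.println(result.get() + " / " + response.get());
        pool.shutdown();
    }
}
```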

+1

Adding threads to your application is not only about performance gains. Sometimes you want or need to perform more than one task at a time because that is the most logical way to architect your program.

As an example, say you are writing a game engine. With a multi-threaded approach you can have one thread for physics, one thread for graphics, one thread for networking, one thread for user input, one thread for loading resources from disk, and so on.
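
A stripped-down sketch of that kind of structure might look like this (Java; the subsystem names and loop bodies are invented for illustration):

```java
public class EngineThreads {

    private static volatile boolean running = true;

    // Each subsystem gets its own dedicated thread, regardless of how many cores exist.
    private static void startLoop(String name, Runnable step) {
        Thread t = new Thread(() -> {
            while (running) {
                step.run();                 // one iteration of this subsystem's work
            }
        }, name);
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        startLoop("physics",  () -> { /* advance the simulation */ });
        startLoop("graphics", () -> { /* render a frame */ });
        startLoop("network",  () -> { /* read/write sockets, mostly blocked on I/O */ });
        startLoop("input",    () -> { /* poll user input */ });
        startLoop("loader",   () -> { /* stream assets in from disk */ });

        Thread.sleep(5000);                 // pretend the game runs for a while
        running = false;                    // signal every subsystem thread to stop
    }
}
```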

James Baxter's point is also very true. Threads often wait on a resource and cannot run until they get access to that resource. If you have only as many threads as cores, a core would sit wasted.

+1

Source: https://habr.com/ru/post/1202347/

