Motivation for creating a new process vs. a new thread

I understand that if your program has large parts that can run in parallel, it is useful to create new threads when those parts are not coupled through a shared resource. An example would be a web server handling requests for pages.

Threads are useful in that inter-thread communication is much cheaper, and context switching between threads is much faster.

Processes give you more protection, in the sense that one process cannot corrupt the memory of another, whereas if one thread crashes, it will most likely take down every thread in its process.

My question is: what are some examples of when you would want to use a process (e.g. fork() in C)?

I can see that if a program wants to run another program, it makes sense to encapsulate that in a new process, but beyond this I can't think of a compelling reason to start a new process.

In particular, when does it make sense for a program to spawn a new process rather than a new thread?

+6
3 answers

The main reason to use processes is that a process can crash or be killed, and the OS will limit the effect of that on other processes. For example, Firefox recently started running plugins in separate processes, IIRC Chrome runs different pages in different processes, and web servers have long handled separate requests in separate processes.

There are several different kinds of limits the OS can enforce at a process boundary:

  • Crashes. As you noted, if a thread crashes, it generally takes down the whole process. That is the motivation for the browser process boundaries: browsers and browser plug-ins are complex pieces of code under constant attack, so it makes sense to take unusual precautions.
  • Resource limits. If a thread in your process opens a lot of files, allocates a lot of memory, etc., it affects you. A different process doesn't have to suffer, because it can be limited separately. So each request on a web server can be restricted in resource use more tightly than the server as a whole, because you want the server to keep serving other requests even when one remote user hogs resources.
  • Capabilities. This is OS-dependent, but for example you can run a process in a chroot jail to make sure it cannot read or modify files it shouldn't, no matter how exploitable your code turns out to be. As another example, Symbian OS has an explicit list of capabilities for performing various actions on the system ("read the user's phone book", "write the user's phone book", "decrypt DRM files", and so on). A process cannot regain capabilities it has given up, so if you want to do something highly sensitive and then return to a low-privilege mode, you need a process boundary somewhere. One reason to do this is security: untrusted code, or code that may contain security flaws, can be somewhat sandboxed, while the smaller amount of unrestricted code can be reviewed more closely. Another reason is to have the OS enforce certain aspects of your design.
  • Drivers. In general, a device driver mediates shared access to a unique system resource. As with capabilities, restricting that access to a single driver process means you can deny it to every other process. For example, IIRC TrueCrypt on Windows installs a driver with elevated permissions, which lets it register an encrypted container under a drive letter and then behave like any other Windows file system, while the GUI part of the application runs in normal user mode. I'm not sure whether file system drivers on Windows really need a process associated with them, but device drivers in general can have one, so even if this isn't a perfect example, hopefully it gives the idea.

Another potential reason to use processes is that they make your code easier to reason about. In multi-threaded code you have to rely on the invariants of all your classes to conclude that access to a particular object is serialized; if your code is single-threaded, you simply know that it is [*]. Of course the same discipline is possible in multi-threaded code: just make sure you know which thread "owns" each object, and never touch an object from a thread that doesn't own it. But process boundaries enforce this, rather than merely designing for it. Again, I'm not sure this was the motivation, but for example the World Community Grid client can use several cores; in that mode it starts several processes with a completely separate task in each, so it gets the performance benefit of the extra cores without having to parallelize any individual task, or make any task's code thread-safe.

[*] Well, unless it was created in shared memory. You also need to avoid unexpected re-entrant calls and so on, but that's usually an easier problem than synchronizing multi-threaded code.

+3

Threads share the same memory as the other threads in their process. So if you want to pass data from one thread to another, you have to take care of locking, synchronization, etc., which is complex and error-prone and best avoided where possible. If one thread crashes, it takes down the whole process. Creating a thread is lightweight compared to creating a new process.

With a separate process per task, the upside is that you don't have to worry about shared mutable data that needs locking and synchronization, because you communicate with the process via message passing, and even if the process crashes it will not take down your entire application (browsers do this: each tab = a new process). The downside is that creating a process is heavier than creating a thread. Keep in mind that a single process may itself contain several threads.

Using the points above, you can choose the better approach for your particular application. There is no silver bullet; it all "depends", case by case.

It is also important to remember that in both cases you should normally use pools (a thread pool or a process pool) rather than create a new thread or process for every task you want to perform. The browser example is an exception: each tab gets its own process without a pool, because a tab's process is not a "worker" that waits for the main process to hand it a task, completes it, and then goes back to waiting.

0

I would say that threads have a limited stack, which caps the amount of work you can keep in each of them. OTOH, you can exchange data between them very simply, using shared memory and thread messages. Ten threads, each performing a simple task, are quite easy to manage even on a basic computer.

Processes, in my opinion, are harder to manage, because you have to handle the transfer of data between parent and child. You must use pipes, message queues, etc., and transferring data is more expensive than with threads. Stopping a process involves sending SIGKILL. OTOH, you get crash protection: a process failure will not take down your main application.

thread examples
- asynchronous I/O: you hand off a buffer, and the thread notifies you when the read/write completes
- matrix multiplication: you can split the calculation by rows or columns, and each task is quite simple

process examples
- exporting data to an image format using complex compression
- an indexer process that collects metadata and converts it into your application's internal format
- each new drawing window of a complex application

TL;DR
IMHO:
- Use processes if the task is heavy.
- Use threads if the task is light.

0

Source: https://habr.com/ru/post/886997/

