What is the point of fork creating a copy of the parent process?

I understand that the answer is partly "because that is how it was designed", but it seems like a lot of effort for fork() to create a copy of the process that called it. Perhaps this is sometimes useful, but surely most of the time someone starts a new process, they do not want a duplicate of the caller? Why does fork create an identical process rather than an empty one, or one defined by passing an argument?

From yolinux

The fork() system call will spawn a new child process which is an identical process to the parent, except that it has a new system process ID

In other words, when is it useful to start with a copy of the parent process?

+6
4 answers

One big advantage of having the parent process duplicated in the child is that it lets the parent program set up the child's environment before the new program is executed. For example, the parent may want to read the child process's stdout, in which case it needs to set up the pipes for this before executing the new program.
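The pipe setup described above can be sketched as follows. This is a minimal illustration in Python, whose `os` module wraps the same system calls; the function name `run_and_capture` is hypothetical and error handling is kept to a minimum.

```python
import os

def run_and_capture(argv):
    """Fork, point the child's stdout at a pipe, exec argv, return (output, exit code)."""
    read_fd, write_fd = os.pipe()        # created before the fork, so both sides inherit it
    pid = os.fork()
    if pid == 0:                         # child: still a copy of this program
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)         # the pipe becomes the child's stdout
            os.close(write_fd)
            os.execvp(argv[0], argv)     # replace the copy with the new program
        finally:
            os._exit(127)                # only reached if the setup or exec failed
    os.close(write_fd)                   # parent keeps only the read end
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                    # EOF: the child closed its end
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(status)
```

Calling `run_and_capture(["echo", "hello"])` returns the bytes the child wrote together with its exit code. Everything between `fork` and `exec` is ordinary code running in the child, which is what makes this kind of setup possible.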

It is also not as bad as it seems in terms of efficiency. On Linux the whole thing is implemented using copy-on-write semantics for the process memory (except for the special cases noted in the man page):

Under Linux (and in most Unices since Version 7, the parent of all Unices alive now), fork() is implemented using copy-on-write pages, so the only penalty it incurs is the time and memory required to duplicate the parent's page tables (which can also be copy-on-write) and to create a unique task structure for the child.

+8

Contrary to all expectations, it is mainly fork that makes process creation so incredibly fast on Unices.

AFAIK, on Linux the actual process memory is not copied on fork. The child starts with the same virtual memory mapping as the parent, and pages are copied only where and when the child makes a change. Most pages are read-only code anyway, so they are never copied. This is called copy-on-write.
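The visible effect of copy-on-write can be sketched like this: a write made in the child lands in the child's private copy and is never seen by the parent. The function name `child_modifies_copy` is hypothetical. The sketch demonstrates the semantics only, not the page-level mechanism, and CPython's reference counting touches more pages than an equivalent C program would.

```python
import os

def child_modifies_copy():
    """Return (value seen by parent, value seen by child) after the child writes."""
    data = [0] * 1000                    # shared copy-on-write after the fork
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                         # child
        try:
            data[0] = 42                 # the write goes to the child's private copy
            os.write(write_fd, str(data[0]).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16))
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data[0], child_value          # the parent's list is unchanged
```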

Use cases where copying the parent process is useful:

  1. Shells

When you say cat foo >bar, the shell forks, and in the child process (still the shell) it prepares the redirection and then execs cat foo. The executed program runs under the same PID as the child shell and inherits all open file descriptors. You would not believe how easy it is to write a basic Unix shell.

  2. Daemons (services)

Daemons run in the background. Many of them fork after some initial preparation; the parent exits, and the child detaches from the terminal and keeps running in the background.

  3. Network servers

Many network daemons must handle multiple connections simultaneously. Example: sshd. The main daemon runs as root and listens for new connections on port 22. When a new connection comes in, it forks a child. The child just keeps the new socket representing that connection, authenticates the user, drops privileges, and so on.

  4. Etc.
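The shell case from the list above, a command with its output redirected to a file, can be sketched in a few lines. This is a minimal illustration rather than how any particular shell is implemented; the function name `run_redirected` is hypothetical.

```python
import os

def run_redirected(argv, out_path):
    """Roughly what a shell does for `argv > out_path`; returns the exit code."""
    pid = os.fork()
    if pid == 0:                         # child: still the "shell"
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)               # prepare the redirection
            os.close(fd)
            os.execvp(argv[0], argv)     # same PID, inherits the redirected stdout
        finally:
            os._exit(127)                # only reached if the setup or exec failed
    _, status = os.waitpid(pid, 0)       # the parent shell waits for the command
    return os.WEXITSTATUS(status)
```

For example, `run_redirected(["cat", "foo"], "bar")` corresponds to `cat foo >bar`. The parent's own stdout is untouched, because the redirection happens only in the child.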
+4

There are several legitimate ways to use the fork system call. Here are some examples:

  • Saving memory. Since fork on any modern UNIX / Linux system shares memory between the child and the parent (using copy-on-write semantics), the parent process can load some static data that is instantly available to the child process. The zygote process on Android does this: it preloads the Java runtime (Dalvik) and many classes, and then simply forks on demand to create new application processes (which inherit a copy of the parent's runtime and loaded classes).
  • Saving time. A process can perform some expensive initialization procedure (for example, Apache loading its configuration files and modules), then fork off workers to perform tasks that use the preloaded initialization data.
  • Arbitrary process setup. On systems that have direct methods for creating processes (for example, Windows with CreateProcess, QNX with spawn, etc.), these direct process APIs tend to be very complex, since every process setting must be specified in the function call itself. With fork/exec, by contrast, a process can simply fork, perform settings through standard system calls (close, signal, dup, etc.) and then exec when it is ready. fork/exec is therefore one of the simplest process-creation APIs, yet also one of the most powerful and flexible.
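The memory and time savings listed above can be sketched as a parent that initializes once and then forks workers that use the preloaded data. The names `preload` and `run_workers`, and the table of squares, are hypothetical stand-ins for a real expensive initialization.

```python
import os

def preload():
    """Stand-in for an expensive one-time initialization (hypothetical data)."""
    return {n: n * n for n in range(100_000)}

def run_workers(table, keys):
    """Fork one worker per key; each reads the inherited table and reports via a pipe."""
    children = []
    for key in keys:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:                     # worker: the table is already in memory
            try:
                os.close(read_fd)
                os.write(write_fd, str(table[key]).encode())
            finally:
                os._exit(0)
        os.close(write_fd)
        children.append((key, pid, read_fd))
    results = {}
    for key, pid, read_fd in children:
        results[key] = int(os.read(read_fd, 64))
        os.close(read_fd)
        os.waitpid(pid, 0)
    return results
```

The table is built once in the parent; no worker rebuilds it or receives it through an argument, which is the same idea as the zygote and Apache examples.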

To be fair, fork also has its share of problems. For example, it does not work very well with multi-threaded programs: only one thread exists in the new process, and locks are not released correctly (which leads to the need for atfork handlers to reset lock states across the fork).
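An atfork handler of the kind mentioned above can be sketched with Python's `os.register_at_fork`, which plays a role similar to `pthread_atfork` in C. This is a simplified, single-threaded illustration: the parent holds the lock across the fork to stand in for "some other thread held it", and the function name `child_can_lock_while_parent_holds` is hypothetical.

```python
import os
import threading

lock = threading.Lock()

def _fresh_lock_in_child():
    # Whoever held the lock at fork time does not exist in the child,
    # so the inherited lock could never be released there. Replace it.
    global lock
    lock = threading.Lock()

os.register_at_fork(after_in_child=_fresh_lock_in_child)

def child_can_lock_while_parent_holds():
    """Fork while the lock is held; report whether the child can still acquire it."""
    with lock:                           # held across the fork
        pid = os.fork()
        if pid == 0:                     # child: sees the fresh lock from the handler
            try:
                ok = lock.acquire(blocking=False)
                os._exit(0 if ok else 1)
            finally:
                os._exit(2)
        _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

Without the handler, the child would inherit a lock that is already held and would block forever on a blocking acquire.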

+4

Why fork()? It had nothing to do with C; C itself was only just coming into existence at the time. It is because of the way the original UNIX memory paging and process management worked: it was trivial to cause a process to be paged out and then paged back in at a different location, without unloading the first copy of the process.

In The Evolution of the Unix Time-sharing System ( http://cm.bell-labs.com/cm/cs/who/dmr/hist.html ), Dennis Ritchie says: "In fact, the PDP-7's fork call required precisely 27 lines of assembly code." See the link for more details.

Threads are evil. With threads, you essentially have a number of processes that all have access to the same memory space and can dance all over each other's values. There is no memory protection at all. See "The Art of Unix Programming", chapter 7 ( http://www.faqs.org/docs/artu/ch07s03.html#id2923889 ) for a more complete explanation.

+3

Source: https://habr.com/ru/post/975372/

