Why does creating a process using `clone` cause out of memory crashes?

I have a process that allocates about 20 GB of RAM on a 32 GB machine. On certain events, I pass data from the parent process to the stdin of a child process. The parent must still be holding the 20 GB of data at the moment the child is spawned.

The application is written in Rust and I call Command::new("path/to/command") to create the child process.
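
Roughly, the relevant part looks like this (the path and the buffer are placeholders, not the exact code):

    use std::io::Write;
    use std::process::{Command, Stdio};

    fn main() -> std::io::Result<()> {
        // Stand-in for the ~20 GB the parent is holding (much smaller here).
        let huge_buffer = vec![0u8; 1024 * 1024];

        // spawn() is where the underlying clone() fails with ENOMEM.
        let mut child = Command::new("path/to/command")
            .stdin(Stdio::piped())
            .spawn()?;

        // The parent's data is streamed to the child's stdin.
        child.stdin.as_mut().unwrap().write_all(&huge_buffer)?;
        child.wait()?;
        Ok(())
    }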

When I spawn the child process, the operating system reports an out-of-memory error.

strace output:

[pid 747] 16:04:41.128377 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c7f87b10) = -1 ENOMEM (Cannot allocate memory)

Why does this error occur? The child process should never consume more than 1 GB, and exec() is called immediately after clone().

+6
1 answer

Problem

When a child process is created by a Rust call, several things happen at the C/C++ level. This is a simplification, but it helps explain the dilemma.

  • The standard streams are duplicated (with dup2 or a similar call)
  • The parent process forks (with the fork or clone system call)
  • The forked process executes the child program (with a call from the execvp family)

The parent and child are now parallel processes. The Rust call you are currently using results in a clone call that behaves like a plain fork, so you are asking for 20 GB × 2 − 32 GB = 8 GB more than the machine has, not counting the space required by the operating system and everything else that may be running. The clone call returns a negative value and errno is set to ENOMEM.
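
To make the three steps concrete, here is a rough sketch of the same sequence written directly against the libc crate (assumed as a dependency); it illustrates the pattern, not the standard library's actual implementation, and /bin/cat is just a placeholder command:

    use std::ffi::CString;

    fn main() {
        unsafe {
            let mut fds = [0i32; 2];
            libc::pipe(fds.as_mut_ptr()); // pipe that will feed the child's stdin

            let pid = libc::fork(); // step 2: fork/clone duplicates the parent
            if pid == 0 {
                // Child: wire the pipe to stdin (step 1: dup2), then exec (step 3).
                libc::dup2(fds[0], 0);
                libc::close(fds[0]);
                libc::close(fds[1]);
                let prog = CString::new("/bin/cat").unwrap();
                let argv = [prog.as_ptr(), std::ptr::null()];
                libc::execvp(prog.as_ptr(), argv.as_ptr());
                libc::_exit(127); // reached only if exec failed
            } else if pid < 0 {
                // Fork failed: this is where the ENOMEM from the question appears.
                eprintln!("fork failed: {}", std::io::Error::last_os_error());
            } else {
                // Parent: write into the pipe, close it, and wait for the child.
                let msg = b"hello\n";
                libc::write(fds[1], msg.as_ptr() as *const libc::c_void, msg.len());
                libc::close(fds[1]);
                libc::close(fds[0]);
                let mut status = 0;
                libc::waitpid(pid, &mut status, 0);
            }
        }
    }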

If architectural options such as adding physical memory, compressing the data, or streaming it through the process so that it never has to be fully resident in memory at any one time are not on the table, then the classical solution is quite simple.

Recommendation

Keep the parent process lean. Then create two worker children, one that handles your 20 GB needs and one that handles your 1 GB needs [1]. These children can be connected to each other through a pipe, file, shared memory, socket, semaphore, signal, and/or other communication mechanisms, just as a parent and child can be.
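
A minimal sketch of that shape, assuming the two workers already exist as separate executables (the paths are placeholders, and a socket or shared memory would serve as well as the pipe used here):

    use std::process::{Command, Stdio};

    fn main() -> std::io::Result<()> {
        // Lean parent: it holds no large buffers itself; it only wires the
        // two workers together and waits for them.
        let mut big_worker = Command::new("path/to/20g-worker")
            .stdout(Stdio::piped())
            .spawn()?;

        let mut small_worker = Command::new("path/to/1g-worker")
            .stdin(big_worker.stdout.take().unwrap()) // connect the two with a pipe
            .spawn()?;

        // Because the parent stays small, each fork/clone it performs has
        // almost nothing to duplicate or account for.
        big_worker.wait()?;
        small_worker.wait()?;
        Ok(())
    }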

Many mature software packages, from Apache httpd to embedded cell-routing stacks, use this design pattern. It is reliable, maintainable, extensible, and portable.

The 32 GB is then likely to be sufficient for the 20 GB and 1 GB processing needs, plus the OS and the lean parent process.

Although this solution will almost certainly solve your immediate problem, if the code is reused or extended later it may be worth looking at redesigning the processing around data frames or multidimensional slices, so that the data can be streamed and the memory footprint reduced.

Always Overcommitting Memory

Setting overcommit_memory to 1 eliminates the clone error condition mentioned in the question, because the Rust call ends up in the Linux clone call, which consults this setting. But there are several caveats to this approach that point back to the recommendation above as the better one, first among them that a value of 1 is dangerous, especially in production environments.
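
If you want to inspect or change the setting, it lives under /proc. A small diagnostic sketch, assuming a Linux system with the usual /proc layout:

    use std::fs;

    fn main() {
        // 0 = heuristic (the default), 1 = always overcommit, 2 = strict accounting.
        if let Ok(mode) = fs::read_to_string("/proc/sys/vm/overcommit_memory") {
            println!("vm.overcommit_memory = {}", mode.trim());
        }

        // CommitLimit and Committed_AS show how much commit headroom the
        // kernel's accounting believes is left (most meaningful in mode 2).
        if let Ok(meminfo) = fs::read_to_string("/proc/meminfo") {
            for line in meminfo.lines() {
                if line.starts_with("CommitLimit") || line.starts_with("Committed_AS") {
                    println!("{line}");
                }
            }
        }

        // Changing the mode requires root, e.g.
        //   sysctl vm.overcommit_memory=1
        // or writing "1" to /proc/sys/vm/overcommit_memory.
    }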

Background

In the late 1990s and early 2000s there were kernel discussions around OpenBSD's rfork and the clone call. The functionality that grew out of those discussions permits degrees of forking less extreme than a full process copy, symmetrically similar to permitting greater degrees of independence between pthreads. Some of those discussions produced extensions to the traditional process model that had entered POSIX standardization.

In the early 2000s, Linus Torvalds proposed a flag structure to determine which components of the execution model are shared and which are copied when a fork occurs, blurring the distinction between processes and threads. That is where the clone call comes from.

Memory overcommit is not discussed much, if at all, in those threads. The design goal was finer control over the results of forking, not delegating the optimization of memory usage to an operating system heuristic, which is what the default setting overcommit_memory = 0 does.

Warning

Memory overcommit goes beyond the scope of those extensions, bringing with its added complexity the trade-offs of its modes [2], caveats about design trends [3], practical limits at run time [4], and performance impacts [5].

Portability and Longevity

In addition, absent standardization, code that relies on memory overcommit may not be portable, and the question of longevity is a fair one, especially when a configuration setting controls the behavior of a function. There is no guarantee of backward compatibility, or even a deprecation warning, if the configuration mechanism changes.

Danger

The linuxdevcenter documentation [2] describes value 1 as always overcommitting and points out the danger of that mode. There are other indications of the danger of always overcommitting [6], [7].

The implementers of overcommit on Linux, Windows, and VMware may vouch for its reliability, but it is a statistical game that, combined with the many other complexities of process control, can lead to unstable behavior under certain conditions. Even the name overcommit tells us something about its true nature as a practice.

A non-default overcommit_memory mode, about which several warnings are issued, but which happens to work when you directly test your immediate case, may later lead to intermittent reliability problems.

Predictability and Its Impact on System Reliability and Response-Time Consistency

The idea of a process in UNIX, since its inception at Bell Labs, is that a process makes concrete requests of its container, the operating system. The result is both predictable and binary: either the request is denied or it is granted. Once granted, the process is given complete and direct control of the resource until it relinquishes it.

The swap-space aspect of virtual memory violates this principle, and it shows up as a sharp slowdown of activity on workstations whose RAM is heavily consumed. For example, during development there are times when one presses a key and must wait ten seconds to see the character on the display.

Conclusion

There are many ways to get the most out of physical memory, but hoping that memory you have allocated will end up sparsely used is likely to have negative consequences. Swap performance when overcommit is in use is a well-documented example. If you keep 20 GB of data in RAM, this is especially relevant.

Allocating only what is necessary, forking intelligently, using threads, and releasing memory that is certainly no longer needed produce economy without hurting reliability or creating spikes in swap-disk usage, and allow the system to run dependably up to the limits of its resources.

The position of the designers of the Command::new call may well be based on this perspective. In that view, the fact that exec is called soon after fork is not the determining factor in how much memory must be requested during a spawn.

Notes and links

[1] Spawning worker children may require some code refactoring and may look like too much trouble at a superficial level, but the refactoring can be surprisingly easy and significantly beneficial.

[2] http://www.linuxdevcenter.com/pub/a/linux/2006/11/30/linux-out-of-memory.html?page=2

[3] https://www.etalabs.net/overcommit.html

[4] http://www.gabesvirtualworld.com/memory-overcommit-in-production-yes-yes-yes/

[5] https://labs.vmware.com/vmtj/memory-overcommitment-in-the-esx-server

[6] https://github.com/kubernetes/kubernetes/issues/14452

[7] http://linuxtoolkit.blogspot.com/2011_08_01_archive.html

+3

Source: https://habr.com/ru/post/1015205/

