Actually, the kernel documentation on overcommit accounting has some details: https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
The Linux kernel supports the following overcommit handling modes
0 - Heuristic overcommit handling.
Obvious overcommits of address space are refused. Used for a typical system. It ensures a seriously wild allocation fails while allowing overcommit to reduce swap usage. root is allowed to allocate slightly more memory in this mode. This is the default.
Also, Documentation/sysctl/vm.txt:
overcommit_memory: This value contains a flag that enables memory overcommitment.
When this flag is 0, the kernel attempts to estimate the amount of free memory left when userspace requests more memory ...
See Documentation/vm/overcommit-accounting and mm/mmap.c::__vm_enough_memory() for more information.
In addition, from man 5 proc:
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
In mode 0, calls of mmap(2) with MAP_NORESERVE are not checked, and the default check is very weak, leading to the risk of getting a process "OOM-killed".
So, very large allocations are refused by the heuristic, but an application can sometimes allocate more virtual memory than the physical memory in the system, as long as it does not use all of it. With MAP_NORESERVE, the amount of mmapable memory may be higher.
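To see which of the three modes a machine is actually running, something like the following user-space sketch could be used. The function names are my own (hypothetical), the path is the one from proc(5) quoted above, and the strings simply repeat the man page wording.

```c
#include <stdio.h>

/* Map the numeric value of vm.overcommit_memory to the wording of proc(5).
 * The function name is hypothetical, not a kernel or libc API. */
const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit";
    case 1:  return "always overcommit, never check";
    case 2:  return "always check, never overcommit";
    default: return "unknown";
    }
}

/* Read the current mode from procfs. Returns -1 if the file cannot be
 * opened or parsed (for example, on a non-Linux system). */
int read_overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Printing overcommit_mode_name(read_overcommit_mode()) on a system with default settings should give the mode 0 description, since 0 is the default.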
As for the setting: "The overcommit policy is set via the sysctl `vm.overcommit_memory'", so we can find how it is implemented in the source code: http://lxr.free-electrons.com/ident?v=4.4;i=sysctl_overcommit_memory . It is defined at line 112 of mm/mmap.c:

    112 int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;

and the constant OVERCOMMIT_GUESS (defined in linux/mman.h) is actually used only at line 170 of mm/mmap.c, which is the implementation of the heuristic:

    154 int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
    ...
    170         if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
    171                 free = global_page_state(NR_FREE_PAGES);
    172                 free += global_page_state(NR_FILE_PAGES);
    180                 free -= global_page_state(NR_SHMEM);
    182                 free += get_nr_swap_pages();
    190                 free += global_page_state(NR_SLAB_RECLAIMABLE);
    195                 if (free <= totalreserve_pages)
    196                         goto error;
    197                 else
    198                         free -= totalreserve_pages;
    203                 if (!cap_sys_admin)
    204                         free -= sysctl_admin_reserve_kbytes >> (PAGE_SHIFT - 10);
    206                 if (free > pages)
    207                         return 0;
    209                 goto error;
    210         }

So, the heuristic estimates how many pages could be made available right now (free: free pages, page cache without shmem, free swap and reclaimable slab, minus the reserves) and checks that against the request being processed (the application asks for pages pages).
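The same arithmetic can be replayed in user space. This is only a sketch of the mode 0 branch: the struct and function names are hypothetical, every value is assumed to be in pages already, and the real kernel reads its counters live instead of taking them as arguments.

```c
/* Hypothetical snapshot of the counters the heuristic reads, all in pages. */
struct guess_inputs {
    long free_pages;        /* NR_FREE_PAGES */
    long file_pages;        /* NR_FILE_PAGES (page cache) */
    long shmem_pages;       /* NR_SHMEM, subtracted again from the estimate */
    long swap_free_pages;   /* get_nr_swap_pages() */
    long slab_reclaimable;  /* NR_SLAB_RECLAIMABLE */
    long totalreserve;      /* totalreserve_pages */
    long admin_reserve;     /* sysctl_admin_reserve_kbytes, converted to pages */
};

/* Returns 1 if a request for `pages` pages would pass the heuristic,
 * 0 if it would be refused. Mirrors the order of operations in the
 * OVERCOMMIT_GUESS branch shown above. */
int guess_allows(const struct guess_inputs *in, long pages, int cap_sys_admin)
{
    long free = in->free_pages + in->file_pages;

    free -= in->shmem_pages;
    free += in->swap_free_pages;
    free += in->slab_reclaimable;

    /* Not even the system reserve is covered: refuse. */
    if (free <= in->totalreserve)
        return 0;
    free -= in->totalreserve;

    /* Keep a little extra aside unless the caller has CAP_SYS_ADMIN. */
    if (!cap_sys_admin)
        free -= in->admin_reserve;

    return free > pages;
}
```

With made-up numbers (1000 free, 500 file, 100 shmem, 2000 swap, 50 slab, 200 reserved, 64 admin reserve) the estimate is 3186 pages for an ordinary user and 3250 for root, so a request for 3200 pages passes only for root.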
When overcommit is always enabled ("1"), this function always returns 0 ("there is enough memory for this request"):

    167         if (sysctl_overcommit_memory == OVERCOMMIT_ALWAYS)
    168                 return 0;

Without this default heuristic, in mode "2", the kernel accounts for the requested pages pages to get the new Committed_AS (from /proc/meminfo):

    162         vm_acct_memory(pages);
    ...

This actually just increments vm_committed_as: __percpu_counter_add(&vm_committed_as, pages, vm_committed_as_batch);

    212         allowed = vm_commit_limit();

The magic is here:

    404 unsigned long vm_commit_limit(void)
    405 {
    406         unsigned long allowed;
    408         if (sysctl_overcommit_kbytes)
    409                 allowed = sysctl_overcommit_kbytes >> (PAGE_SHIFT - 10);
    410         else
    411                 allowed = ((totalram_pages - hugetlb_total_pages())
    412                            * sysctl_overcommit_ratio / 100);
    413         allowed += total_swap_pages;
    415         return allowed;
    416 }

So, allowed is set either in kilobytes through the vm.overcommit_kbytes sysctl, or through vm.overcommit_ratio as a percentage of physical memory (not counting huge pages); the swap size is added in both cases.
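To put numbers on this, here is a user-space sketch of the same formula. The function names and the parameter list are mine (hypothetical); the kernel uses globals, and page_shift is passed in only so that the example does not depend on the machine it runs on.

```c
/* Convert kilobytes to pages, the same way the kernel expression
 * `kb >> (PAGE_SHIFT - 10)` does. Assumes page_shift >= 10. */
unsigned long kb_to_pages(unsigned long kb, unsigned int page_shift)
{
    return kb >> (page_shift - 10);
}

/* User-space model of vm_commit_limit(). All sizes are in pages except
 * overcommit_kbytes (kilobytes) and overcommit_ratio (percent). */
unsigned long commit_limit_pages(unsigned long totalram_pages,
                                 unsigned long hugetlb_pages,
                                 unsigned long overcommit_kbytes,
                                 unsigned long overcommit_ratio,
                                 unsigned long total_swap_pages,
                                 unsigned int page_shift)
{
    unsigned long allowed;

    if (overcommit_kbytes)   /* an absolute limit takes precedence */
        allowed = kb_to_pages(overcommit_kbytes, page_shift);
    else                     /* otherwise a percentage of non-hugetlb RAM */
        allowed = (totalram_pages - hugetlb_pages) * overcommit_ratio / 100;

    return allowed + total_swap_pages;   /* swap is always added */
}
```

For example, with 8 GiB of RAM (2097152 pages of 4 KiB), no huge pages, the default ratio of 50 and 2 GiB of swap (524288 pages), commit_limit_pages(2097152, 0, 0, 50, 524288, 12) gives 1572864 pages, that is 6 GiB.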

    213         /*
    214          * Reserve some for root
    215          */
    216         if (!cap_sys_admin)
    217                 allowed -= sysctl_admin_reserve_kbytes >> (PAGE_SHIFT - 10);

This keeps some amount of memory for root only (PAGE_SHIFT is 12 on a typical system with 4 KiB pages; the shift by PAGE_SHIFT - 10 is just a conversion from kilobytes to a page count).

    222         if (mm) {
    223                 reserve = sysctl_user_reserve_kbytes >> (PAGE_SHIFT - 10);
    224                 allowed -= min_t(long, mm->total_vm / 32, reserve);
    225         }
    227         if (percpu_counter_read_positive(&vm_committed_as) < allowed)
    228                 return 0;

If, after accounting for the request, all of user space still has less committed memory than allowed, the allocation is permitted. Otherwise, the request is denied (and the accounting for it is undone):

    229 error:
    230         vm_unacct_memory(pages);
    232         return -ENOMEM;

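Put together, the mode "2" path can be modelled in user space roughly like this. It is a sketch under assumptions: the struct and function names are hypothetical, all values are already in pages, and the caller is assumed to have an mm (so the per-process reserve always applies).

```c
#include <errno.h>

/* Hypothetical snapshot of what the strict check looks at, all in pages. */
struct strict_inputs {
    long committed;      /* current vm_committed_as */
    long limit;          /* result of vm_commit_limit() */
    long admin_reserve;  /* sysctl_admin_reserve_kbytes, converted to pages */
    long user_reserve;   /* sysctl_user_reserve_kbytes, converted to pages */
    long total_vm;       /* mm->total_vm of the calling process */
};

/* Returns 0 if the request is accepted and -ENOMEM if it is refused.
 * On refusal the accounting is rolled back, as in the error path above. */
int strict_check(struct strict_inputs *in, long pages, int cap_sys_admin)
{
    long allowed = in->limit;
    long per_process = in->total_vm / 32;

    in->committed += pages;              /* vm_acct_memory(pages) */

    if (!cap_sys_admin)                  /* reserve some for root */
        allowed -= in->admin_reserve;

    /* min(total_vm / 32, user reserve) is kept back per process */
    if (per_process > in->user_reserve)
        per_process = in->user_reserve;
    allowed -= per_process;

    if (in->committed < allowed)
        return 0;

    in->committed -= pages;              /* vm_unacct_memory(pages) */
    return -ENOMEM;
}
```

With 1000 pages committed, a limit of 2000, an admin reserve of 100, a user reserve of 50 and total_vm of 3200, an ordinary process may commit 500 more pages (1500 < 1850) but not 900 (1900 is not below 1850), while root still gets the 900 (1900 < 1950).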
In other words, as stated in "The Linux Kernel. Some Notes on the Linux Kernel", 2003-02-01 by Andries Brouwer, 9. Memory, 9.6 Overcommit and OOM - https://www.win.tue.nl/~aeb/linux/lk/lk-9.html :
Going in the right direction
Since 2.5.30 the values are:
0 (default): as before: guess about how much overcommitment is reasonable,
1: never refuse any malloc(),
2: be precise about the overcommit - never commit a virtual address space larger than swap space plus a fraction overcommit_ratio of the physical memory.
So, "2" is a precise calculation of the amount of memory committed after the request, and "0" is a heuristic estimate.
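If you want to watch the two numbers that mode "2" compares, /proc/meminfo exposes them as CommitLimit and Committed_AS (both in kB). Below is a small parsing sketch; the function name is hypothetical, and it works on a text buffer that you would fill from the file yourself, so it can also be tried on a pasted sample.

```c
#include <stdio.h>
#include <string.h>

/* Look up "Key:   12345 kB" in a /proc/meminfo style text buffer and
 * return the number, or -1 if the key is missing or has no value. */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* The key must start the line and be followed by a colon. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long value;

            if (sscanf(line + klen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line)
            line++;
    }
    return -1;
}
```

Comparing meminfo_field_kb(buf, "Committed_AS") with meminfo_field_kb(buf, "CommitLimit") shows roughly how much headroom a strict-mode system has left, before the root and per-process reserves are taken off.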