=====================
Overcommit Accounting
=====================

The Linux kernel supports the following overcommit handling modes:

0
	Heuristic overcommit handling. Obvious overcommits of address
	space are refused. Used for a typical system. It ensures that a
	seriously wild allocation fails while still allowing overcommit
	to reduce swap usage. This is the default.

1
	Always overcommit. Appropriate for some scientific
	applications. The classic example is code that uses sparse
	arrays and relies on the virtual memory consisting almost
	entirely of zero pages.

2
	Don't overcommit. The total address space commit for the
	system is not permitted to exceed swap + a configurable amount
	(default is 50%) of physical RAM.  Depending on the amount you
	use, in most situations this means a process will not be
	killed while accessing pages but will receive errors on memory
	allocation as appropriate.

	Useful for applications that want to guarantee their memory
	allocations will be available in the future without having to
	initialize every page.

The overcommit policy is set via the sysctl ``vm.overcommit_memory``.
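
As a rough illustration, the current policy can be read from
``/proc/sys/vm/overcommit_memory``, the procfs view of this sysctl. The
sketch below maps the raw value to the mode names used above. The names
``MODES``, ``describe_overcommit_mode`` and ``read_overcommit_mode`` are
hypothetical helpers for this example, not kernel interfaces.

```python
from pathlib import Path

# Hypothetical lookup table; the descriptions paraphrase the modes above.
MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "don't overcommit",
}


def describe_overcommit_mode(raw: str) -> str:
    """Map the text read from the sysctl file to a mode description."""
    mode = int(raw.strip())
    return MODES.get(mode, f"unknown mode {mode}")


def read_overcommit_mode(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read and describe the live setting (assumes procfs is mounted at /proc)."""
    return describe_overcommit_mode(Path(path).read_text())
```

On a Linux system, ``print(read_overcommit_mode())`` would report the
active mode. The parsing is kept separate from the file access so it can
be exercised without procfs.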

The overcommit amount can be set via ``vm.overcommit_ratio`` (percentage)
or ``vm.overcommit_kbytes`` (absolute value). These only have an effect
when ``vm.overcommit_memory`` is set to 2.
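
The mode 2 limit described above (swap plus a configurable amount of
physical RAM) can be approximated with a few lines of arithmetic. This is
a simplified sketch, not the kernel's code: ``commit_limit_kb`` is a
hypothetical helper, it assumes a nonzero ``vm.overcommit_kbytes`` takes
the place of the ratio, and it ignores details such as huge page
reservations.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int,
                    overcommit_ratio: int = 50,
                    overcommit_kbytes: int = 0) -> int:
    """Approximate the mode 2 commit limit in kB.

    Assumption: a nonzero overcommit_kbytes replaces the percentage
    based RAM term; otherwise overcommit_ratio percent of RAM is used.
    """
    if overcommit_kbytes:
        allowed_ram_kb = overcommit_kbytes
    else:
        allowed_ram_kb = ram_kb * overcommit_ratio // 100
    # The limit is swap plus the allowed share of physical RAM.
    return swap_kb + allowed_ram_kb
```

For example, with 8000000 kB of RAM, 2000000 kB of swap and the default
ratio of 50, the sketch yields a limit of 6000000 kB.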

The current overcommit limit and amount committed are viewable in
``/proc/meminfo`` as CommitLimit and Committed_AS respectively.
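
A minimal sketch of reading those two fields follows. It assumes the
usual ``Name:   value kB`` layout of ``/proc/meminfo``; the function
names ``parse_commit_fields`` and ``commit_headroom_kb`` are hypothetical.

```python
def parse_commit_fields(meminfo_text: str) -> dict:
    """Extract CommitLimit and Committed_AS (in kB) from meminfo text."""
    wanted = {"CommitLimit", "Committed_AS"}
    values = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # The first token after the colon is the numeric value.
            values[name] = int(rest.split()[0])
    return values


def commit_headroom_kb(meminfo_text: str) -> int:
    """Return CommitLimit minus Committed_AS; may be negative."""
    fields = parse_commit_fields(meminfo_text)
    return fields["CommitLimit"] - fields["Committed_AS"]
```

On a live system the text would come from
``open("/proc/meminfo").read()``. A negative result is possible in modes
0 and 1, where the limit is reported but not strictly enforced.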

Gotchas
=======

Stack growth in C programs does an implicit mremap. If you want absolute
guarantees and run close to the edge, you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much, but it is a corner case if you really care about
guarantees.

In mode 2 the ``MAP_NORESERVE`` flag is ignored.


How It Works
============

The overcommit accounting is based on the following rules:

For a file backed map
	| SHARED or READ-only	-	0 cost (the file backs the map, not swap)
	| PRIVATE WRITABLE	-	size of mapping per instance

For an anonymous or ``/dev/zero`` map
	| SHARED			-	size of mapping
	| PRIVATE READ-only	-	0 cost (but of little use)
	| PRIVATE WRITABLE	-	size of mapping per instance

Additional accounting
	| Pages turned into writable copies by mmap
	| shmfs memory drawn from the same pool
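
The per-mapping rules above can be written as a small decision function.
This is an illustrative model of the table only: ``commit_cost`` and its
parameters are hypothetical names, and the additional accounting (pages
turned into writable copies, shmfs memory) is not modelled.

```python
def commit_cost(size: int, *, file_backed: bool, shared: bool,
                writable: bool) -> int:
    """Return the amount charged to the commit for one mapping instance."""
    if file_backed:
        # SHARED or read-only file maps are backed by the file, not swap.
        if shared or not writable:
            return 0
        # PRIVATE WRITABLE file map: charged in full per instance.
        return size
    # Anonymous or /dev/zero map.
    if shared:
        return size
    # PRIVATE: charged only when writable.
    return size if writable else 0
```

For instance, a private writable mapping of a file is charged its full
size for every instance, while a shared mapping of the same file costs
nothing.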

Status
======

*	We account mmap memory mappings
*	We account mprotect changes in commit
*	We account mremap changes in size
*	We account brk
*	We account munmap
*	We report the commit status in /proc
*	Account and check on fork
*	Review stack handling/building on exec
*	SHMfs accounting
*	Implement actual limit enforcement

To Do
=====

*	Account ptrace pages (this is hard)