=====================
Overcommit Accounting
=====================

The Linux kernel supports the following overcommit handling modes

0
    Heuristic overcommit handling. Obvious overcommits of address
    space are refused. Used for a typical system. It ensures a
    seriously wild allocation fails while allowing overcommit to
    reduce swap usage. This is the default.

1
    Always overcommit. Appropriate for some scientific
    applications. Classic example is code using sparse arrays and
    just relying on the virtual memory consisting almost entirely
    of zero pages.

2
    Don't overcommit. The total address space commit for the
    system is not permitted to exceed swap + a configurable amount
    (default is 50%) of physical RAM. Depending on the amount you
    use, in most situations this means a process will not be
    killed while accessing pages but will receive errors on memory
    allocation as appropriate.

    Useful for applications that want to guarantee their memory
    allocations will be available in the future without having to
    initialize every page.

The overcommit policy is set via the sysctl ``vm.overcommit_memory``.
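
As a rough illustration of the three modes listed above, the sketch below
maps the value of ``vm.overcommit_memory`` to a short description. It
assumes the sysctl is visible at ``/proc/sys/vm/overcommit_memory``; the
names ``OVERCOMMIT_MODES``, ``describe_overcommit_mode`` and
``read_overcommit_mode`` are made up for this example and are not part of
any kernel interface.

```python
from pathlib import Path

# Descriptions paraphrase the mode list above; the names are illustrative only.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit handling (default)",
    1: "always overcommit",
    2: "don't overcommit",
}


def describe_overcommit_mode(raw: str) -> str:
    """Map the text read from the sysctl file to a short description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def read_overcommit_mode(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read the current policy (assumes the usual procfs view of the sysctl)."""
    return describe_overcommit_mode(Path(path).read_text())


if __name__ == "__main__":
    try:
        print(read_overcommit_mode())
    except OSError:
        # Not a Linux system, or procfs is not mounted.
        print("sysctl file not readable on this system")
```

The parsing is kept separate from the file read so that it can be
exercised on systems without procfs.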

The overcommit amount can be set via ``vm.overcommit_ratio`` (percentage)
or ``vm.overcommit_kbytes`` (absolute value). These only have an effect
when ``vm.overcommit_memory`` is set to 2.

The current overcommit limit and amount committed are viewable in
``/proc/meminfo`` as CommitLimit and Committed_AS respectively.

Gotchas
=======

The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much but it's a corner case if you really really care.

In mode 2 the MAP_NORESERVE flag is ignored.
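
For readers who want to check the numbers, here is a minimal sketch of
reading CommitLimit and Committed_AS from ``/proc/meminfo`` text and of
the mode 2 rule as stated in this document (swap plus a percentage of
physical RAM). It is a simplification: it does not model
``vm.overcommit_kbytes`` or any other adjustment the kernel may apply.
The function names and the sample values are invented for illustration.

```python
def parse_commit_fields(meminfo_text: str) -> dict:
    """Extract CommitLimit and Committed_AS (in kB) from /proc/meminfo text."""
    wanted = {"CommitLimit", "Committed_AS"}
    found = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # Lines look like "CommitLimit:     6000000 kB".
            found[name] = int(rest.split()[0])
    return found


def mode2_limit_kb(ram_kb: int, swap_kb: int, ratio_percent: int = 50) -> int:
    """Swap plus ratio_percent of RAM, per the mode 2 description (simplified)."""
    return swap_kb + ram_kb * ratio_percent // 100


# Invented sample input, not taken from a real machine.
SAMPLE = """\
MemTotal:        8000000 kB
SwapTotal:       2000000 kB
CommitLimit:     6000000 kB
Committed_AS:    3500000 kB
"""

if __name__ == "__main__":
    fields = parse_commit_fields(SAMPLE)
    print(fields)
    print(mode2_limit_kb(ram_kb=8000000, swap_kb=2000000))
```

On a live system the same parser could be fed the contents of
``/proc/meminfo`` instead of ``SAMPLE``.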


How It Works
============

The overcommit is based on the following rules

For a file backed map
    | SHARED or READ-only   -   0 cost (the file is the map not swap)
    | PRIVATE WRITABLE      -   size of mapping per instance

For an anonymous or ``/dev/zero`` map
    | SHARED                -   size of mapping
    | PRIVATE READ-only     -   0 cost (but of little use)
    | PRIVATE WRITABLE      -   size of mapping per instance

Additional accounting
    | Pages made writable copies by mmap
    | shmfs memory drawn from the same pool

Status
======

* We account mmap memory mappings
* We account mprotect changes in commit
* We account mremap changes in size
* We account brk
* We account munmap
* We report the commit status in /proc
* Account and check on fork
* Review stack handling/building on exec
* SHMfs accounting
* Implement actual limit enforcement

To Do
=====

* Account ptrace pages (this is hard)