8 here but make sure to check the current code if you need a deeper
18 we call it "memory cgroup". When you see git-log and source code, you'll
30 Memory-hungry applications can be isolated and limited to a smaller
42 Current Status: linux-2.6.34-mmotm (development version of April 2010)
46 - accounting anonymous pages, file caches, swap caches usage and limiting them.
47 - pages are linked to per-memcg LRU exclusively, and there is no global LRU.
48 - optionally, memory+swap usage can be accounted and limited.
49 - hierarchical accounting
50 - soft limit
51  - moving (recharging) charges when a task is moved is selectable.
52 - usage threshold notifier
53 - memory pressure notifier
54 - oom-killer disable knob and oom-notifier
55 - Root cgroup has no limit controls.
57 Kernel memory support is a work in progress, and the current version provides
58 basic functionality. (See :ref:`section 2.7
59 <cgroup-v1-memory-kernel-extension>`)
69 memory.usage_in_bytes show current usage for memory
71 memory.memsw.usage_in_bytes show current usage for memory+Swap
73 memory.limit_in_bytes set/show limit of memory usage
74 memory.memsw.limit_in_bytes set/show limit of memory+Swap usage
79 memory.soft_limit_in_bytes set/show soft limit of memory usage
102 memory hard limit. Kernel hard limit is not
108 memory.kmem.usage_in_bytes show current kernel memory allocation
113 memory.kmem.tcp.limit_in_bytes set/show hard limit for tcp buf memory
116 memory.kmem.tcp.usage_in_bytes show current tcp buf memory allocation
139 raised to allow user space handling of OOM. The current memory controller is
162 -----------
165 page_counter tracks the current memory usage and limit of the group of
170 ---------------
172 .. code-block::
175 +--------------------+
178 +--------------------+
181 +---------------+ | +---------------+
184 +---------------+ | +---------------+
186 + --------------+
188 +---------------+ +------+--------+
189 | page +----------> page_cgroup|
191 +---------------+ +---------------+
204 charged is over its limit. If it is, then reclaim is invoked on the cgroup.
206 If everything goes well, a page meta-data-structure called page_cgroup is
208 (*) page_cgroup structure is allocated at boot/memory-hotplug time.
211 ------------------------
226 A swapped-in page is accounted after adding into swapcache.
228 Note: The kernel does swapin-readahead and reads multiple swaps at once.
234 Note: we just account pages-on-LRU because our purpose is to control amount
235 of used pages; not-on-LRU pages tend to be out-of-control from VM view.
238 --------------------------
244 the cgroup that brought it in -- this will happen on memory pressure).
246 But see :ref:`section 8.2 <cgroup-v1-memory-movable-charges>` when moving a
251 --------------------------------------
254 read and limit it.
258 - memory.memsw.usage_in_bytes.
259 - memory.memsw.limit_in_bytes.
267 By using the memsw limit, you can avoid system OOM which can be caused by swap
273 The global LRU (kswapd) can swap out arbitrary pages. Swap-out moves the
274 charge from memory to swap, so there is no change in the usage of
275 memory+swap. In other words, when we want to limit the usage of swap without
276 affecting the global LRU, a memory+swap limit is better than just limiting swap from
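The accounting argument above can be illustrated with a toy model. This is only a sketch, not kernel code: the `Charge` class and its field names are hypothetical, and it tracks nothing but two page counters.

```python
from dataclasses import dataclass


@dataclass
class Charge:
    """Toy model of one cgroup's counters, in pages (hypothetical names)."""
    memory: int  # pages charged as memory
    swap: int    # pages charged as swap

    @property
    def memsw(self) -> int:
        # memory+swap usage, the quantity limited by memory.memsw.limit_in_bytes
        return self.memory + self.swap

    def swap_out(self, pages: int) -> None:
        # Swap-out moves the charge from memory to swap.
        pages = min(pages, self.memory)
        self.memory -= pages
        self.swap += pages


c = Charge(memory=100, swap=20)
before = c.memsw
c.swap_out(30)
print(c.memory, c.swap, c.memsw == before)
```

In this model the memory counter drops and the swap counter rises by the same amount, so the memory+swap total is unchanged by swap-out.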
282 When a cgroup hits memory.memsw.limit_in_bytes, it's useless to do swap-out
283 in this cgroup. Then, swap-out will not be done by cgroup routine and file
289 -----------
292 global VM. When a cgroup goes over its limit, we first try
296 cgroup. (See :ref:`10. OOM Control <cgroup-v1-memory-oom-control>` below.)
299 pages that are selected for reclaiming come from the per-cgroup LRU
310 (See :ref:`oom_control <cgroup-v1-memory-oom-control>` section)
313 -----------
318 mm->page_table_lock or split pte_lock
319 folio_memcg_lock (memcg->move_lock)
320 mapping->i_pages lock
321 lruvec->lru_lock.
323 Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
324 lruvec->lru_lock; the folio LRU flag is cleared before
325 isolating a page from its LRU under lruvec->lru_lock.
327 .. _cgroup-v1-memory-kernel-extension:
330 -----------------------------------------------
332 With the Kernel memory extension, the Memory Controller is able to limit
338 it can be disabled system-wide by passing cgroup.memory=nokmem to the kernel
349 Currently no soft limit is implemented for kernel memory. It is future work
352 2.7.1 Current Kernel Memory resources accounted
353 -----------------------------------------------
377 ----------------------
381 limit, and "K" the kernel limit. There are three possible ways limits can be
390 deployments where the total amount of memory per-cgroup is overcommitted.
392 box can still run out of non-reclaimable memory.
398 In the current implementation, memory reclaim will NOT be triggered for
414 2. Prepare the cgroups (see :ref:`Why are cgroups needed?
415 <cgroups-why-needed>` for the background information)::
417 # mount -t tmpfs none /sys/fs/cgroup
419 # mount -t cgroup none /sys/fs/cgroup/memory -o memory
426 4. Since now we're in the 0 cgroup, we can alter the memory limit::
430 The limit can now be queried::
441 We can write "-1" to reset ``*.limit_in_bytes`` (unlimited).
453 this limit to the value written into the file. This can be due to a
455 availability of memory on the system. The user is required to re-read
462 The memory.failcnt field gives the number of times that the cgroup limit was
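The write-then-re-read pattern described above might be scripted roughly as follows. This is a sketch under assumptions: `write_limit` and `read_failcnt` are hypothetical helper names, and the demo runs against a temporary directory standing in for a cgroup directory, so it does not exercise the kernel at all.

```python
import tempfile
from pathlib import Path


def write_limit(cgroup_dir, value, name="memory.limit_in_bytes"):
    """Write a limit and return the value read back afterwards.

    Writing -1 resets the limit to unlimited.
    """
    path = Path(cgroup_dir) / name
    path.write_text(f"{value}\n")
    # Re-read: the kernel may commit a different value than the one written.
    return path.read_text().strip()


def read_failcnt(cgroup_dir):
    """Return the counter stored in memory.failcnt."""
    return int((Path(cgroup_dir) / "memory.failcnt").read_text())


# Demo against a temporary directory standing in for a cgroup directory,
# so the sketch runs without a mounted hierarchy or root privileges.
demo_dir = tempfile.mkdtemp()
(Path(demo_dir) / "memory.failcnt").write_text("0\n")
limit_after_write = write_limit(demo_dir, 4 * 1024 * 1024)
limit_after_reset = write_limit(demo_dir, -1)
print(limit_after_write, limit_after_reset, read_failcnt(demo_dir))
```

On a real hierarchy the directory would be something like the "0" cgroup under /sys/fs/cgroup/memory, and the value read back may differ from the one written, which is why the helper returns the re-read value instead of echoing its argument.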
477 Page-fault scalability is also important. When measuring parallel
478 page faults, a multi-process test may be better than a multi-thread
484 .. _cgroup-v1-memory-test-troubleshoot:
487 -------------------
492 1. The cgroup limit is too low (just too low to do anything useful)
498 To know what happens, disabling OOM_Kill as per :ref:`"10. OOM Control"
499 <cgroup-v1-memory-oom-control>` (below) and seeing what happens will be
502 .. _cgroup-v1-memory-test-task-migration:
505 ------------------
513 See :ref:`8. "Move charges at task migration" <cgroup-v1-memory-move-charges>`
516 ---------------------
518 A cgroup can be removed by rmdir, but as discussed in :ref:`sections 4.1
519 <cgroup-v1-memory-test-troubleshoot>` and :ref:`4.2
520 <cgroup-v1-memory-test-task-migration>`, a cgroup might have some charge
524 We move the stats to the parent, and the charge does not change except for uncharging
535 ---------------
545 charged file caches. Some out-of-use page caches may keep charged until
549 -------------
553 * per-memory cgroup local status
576 inactive_file # of bytes of file-backed memory and MADV_FREE anonymous
578 active_file # of bytes of file-backed memory on active LRU list.
585 hierarchical_memory_limit # of bytes of memory limit with regard to
588 hierarchical_memsw_limit # of bytes of memory+swap limit with regard to
623 --------------
628 Please note that unlike during the global reclaim, limit reclaim
631 if there are no file pages to reclaim.
634 -----------
638 hit its limit. When a memory cgroup hits a limit, failcnt increases and
646 ------------------
656 -------------
658 This is similar to numa_maps but operates on a per-memcg basis. This is
665 per-node page counts including "hierarchical_<counter>" which sums up all
696 If one of the ancestors goes over its limit, the reclaim algorithm reclaims
700 ---------------------------------------
718 a. There is no memory contention
719 b. They do not exceed their hard limit
722 are pushed back to their soft limits. If the soft limit of each control
726 Please note that the soft limit is a best-effort feature; it comes with
727 no guarantees, but it does its best to make sure that when memory is
728 heavily contended for, memory is allocated based on the soft limit
729 hints/setup. Currently soft limit based reclaim is set up such that
733 -------------
736 assume a soft limit of 256 MiB)::
749 It is recommended to set the soft limit always below the hard limit,
750 otherwise the hard limit will take precedence.
752 .. _cgroup-v1-memory-move-charges:
761 cgroups to allow fine-grained policy adjustments without having to
770 -------------
781 of charges should be moved. See :ref:`section 8.2
782 <cgroup-v1-memory-movable-charges>` for details.
785 Charges are moved only when you move mm->owner, in other words,
800 .. _cgroup-v1-memory-movable-charges:
803 --------------------------------------
807 a page or a swap can be moved only when it is charged to the task's current
810 +---+--------------------------------------------------------------------------+
815 +---+--------------------------------------------------------------------------+
824 +---+--------------------------------------------------------------------------+
827 --------
829 - All charge-moving operations are done under cgroup_mutex. It's not good
841 - create an eventfd using eventfd(2);
842 - open memory.usage_in_bytes or memory.memsw.usage_in_bytes;
843 - write string like "<event_fd> <fd of memory.usage_in_bytes> <threshold>" to
849 It applies to both root and non-root cgroups.
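The three registration steps above could look roughly like this in Python. It is a sketch under several assumptions: `threshold_event_line` and `register_threshold` are hypothetical names, `os.eventfd` requires Python 3.10 or later on Linux, and the control file name `cgroup.event_control` is not shown in the excerpt above and is assumed here.

```python
import os


def threshold_event_line(event_fd, usage_fd, threshold):
    """Build the '<event_fd> <fd of memory.usage_in_bytes> <threshold>' string."""
    return f"{event_fd} {usage_fd} {threshold}"


def register_threshold(cgroup_dir, threshold):
    """Register a usage threshold and return (event_fd, usage_fd).

    Assumes a mounted cgroup v1 memory hierarchy at cgroup_dir.
    The caller is responsible for closing both descriptors.
    """
    # Step 1: create an eventfd.
    event_fd = os.eventfd(0)
    # Step 2: open memory.usage_in_bytes.
    usage_fd = os.open(
        os.path.join(cgroup_dir, "memory.usage_in_bytes"), os.O_RDONLY
    )
    # Step 3: write the registration string to the control file
    # (file name assumed, see the lead-in).
    with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
        ctl.write(threshold_event_line(event_fd, usage_fd, threshold))
    return event_fd, usage_fd


print(threshold_event_line(5, 6, 1048576))
```

After registration, a blocking `os.eventfd_read(event_fd)` would be one way to wait for the notification. Opening memory.memsw.usage_in_bytes instead would follow the same pattern.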
851 .. _cgroup-v1-memory-oom-control:
866 - create an eventfd using eventfd(2)
867 - open memory.oom_control file
868 - write string like "<event_fd> <fd of memory.oom_control>" to
874 You can disable the OOM-killer by writing "1" to memory.oom_control file, as:
878 If OOM-killer is disabled, tasks under cgroup will hang/sleep
879 in memory cgroup's OOM-waitqueue when they request accountable memory.
883 * enlarge limit or reduce usage.
893 On reading, the current OOM status is shown.
895 - oom_kill_disable 0 or 1
896 (if 1, oom-killer is disabled)
897 - under_oom 0 or 1
899 - oom_kill integer counter
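The fields listed above can be read into a small structure for monitoring. The sketch below assumes the file contains one "name value" pair per line; the `OomStatus` class and `parse_oom_control` function are hypothetical names, and the sample text is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class OomStatus:
    oom_kill_disable: bool  # True if the OOM-killer is disabled
    under_oom: bool         # reported as 0 or 1 in the file
    oom_kill: int           # integer counter


def parse_oom_control(text):
    """Parse 'name value' lines as read from memory.oom_control."""
    fields = dict(line.split() for line in text.splitlines() if line.strip())
    return OomStatus(
        oom_kill_disable=fields["oom_kill_disable"] == "1",
        under_oom=fields["under_oom"] == "1",
        # Treat a missing counter as zero rather than failing.
        oom_kill=int(fields.get("oom_kill", 0)),
    )


# Made-up sample contents, not captured from a real system.
status = parse_oom_control("oom_kill_disable 1\nunder_oom 0\noom_kill 0\n")
print(status)
```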
923 resources that can be easily reconstructed or re-read from a disk.
926 about to run out of memory (OOM) or even the in-kernel OOM killer is on its
932 events are not pass-through. For example, you have three cgroups: A->B->C. Now
938 notification only if there are no event listeners for group C.
942 - "default": this is the default behavior specified above. This mode is the
946 - "hierarchy": events always propagate up to the root, similar to the default
951 - "local": events are pass-through, i.e. they only receive notifications when
960 specified by a comma-delimited string, i.e. "low,hierarchy" specifies
961 hierarchical, pass-through, notification for all ancestor memcgs. Notification
962 that is the default, non pass-through behavior, does not specify a mode.
963 "medium,local" specifies pass-through notification for the medium level.
968 - create an eventfd using eventfd(2);
969 - open memory.pressure_level;
970 - write string as "<event_fd> <fd of memory.pressure_level> <level[,mode]>"
975 memory.pressure_level are not implemented.
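The "<level[,mode]>" part of the registration string can be validated before it is written. The helper below is a sketch: `pressure_event_line` is a hypothetical name, the mode names come from the list above, and the "critical" level does not appear in this excerpt and is assumed alongside "low" and "medium".

```python
LEVELS = ("low", "medium", "critical")    # "critical" is assumed, see lead-in
MODES = ("default", "hierarchy", "local")


def pressure_event_line(event_fd, pressure_fd, level, mode=None):
    """Build '<event_fd> <fd of memory.pressure_level> <level[,mode]>'."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    if mode is None:
        # No mode given: the default, non pass-through behavior.
        spec = level
    else:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        spec = f"{level},{mode}"
    return f"{event_fd} {pressure_fd} {spec}"


print(pressure_event_line(3, 4, "low", "hierarchy"))
print(pressure_event_line(3, 4, "medium"))
```

Rejecting unknown levels and modes up front keeps a typo from surfacing only as a failed write to the control file.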
980 memory limit, sets up a notification in the cgroup and then makes child
992 (Expect a bunch of notifications, and eventually, the oom-killer will
998 1. Make per-cgroup scanner reclaim not-shared pages first
999 2. Teach controller to account for shared-pages
1000 3. Start reclamation in the background when the limit is
1030 https://lore.kernel.org/r/20070819094658.654.84837.sendpatchset@balbir-laptop
1033 https://lore.kernel.org/r/20070817084228.26003.12568.sendpatchset@balbir-laptop