1 .. _cgroup-v2:
11 conventions of cgroup v2. It describes all userland-visible aspects
12 of cgroup including core and specific controller behaviors. All
14 v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgroup-v1>`.
22 1-1. Terminology
23 1-2. What is cgroup?
25 2-1. Mounting
26 2-2. Organizing Processes and Threads
27 2-2-1. Processes
28 2-2-2. Threads
29 2-3. [Un]populated Notification
30 2-4. Controlling Controllers
31 2-4-1. Availability
32 2-4-2. Enabling and Disabling
33 2-4-3. Top-down Constraint
34 2-4-4. No Internal Process Constraint
35 2-5. Delegation
36 2-5-1. Model of Delegation
37 2-5-2. Delegation Containment
38 2-6. Guidelines
39 2-6-1. Organize Once and Control
40 2-6-2. Avoid Name Collisions
42 3-1. Weights
43 3-2. Limits
44 3-3. Protections
45 3-4. Allocations
47 4-1. Format
48 4-2. Conventions
49 4-3. Core Interface Files
51 5-1. CPU
52 5-1-1. CPU Interface Files
53 5-2. Memory
54 5-2-1. Memory Interface Files
55 5-2-2. Usage Guidelines
56 5-2-3. Memory Ownership
57 5-3. IO
58 5-3-1. IO Interface Files
59 5-3-2. Writeback
60 5-3-3. IO Latency
61 5-3-3-1. How IO Latency Throttling Works
62 5-3-3-2. IO Latency Interface Files
63 5-3-4. IO Priority
64 5-4. PID
65 5-4-1. PID Interface Files
66 5-5. Cpuset
67 5-5-1. Cpuset Interface Files
68 5-6. Device controller
69 5-7. RDMA
70 5-7-1. RDMA Interface Files
71 5-8. DMEM
72 5-8-1. DMEM Interface Files
73 5-9. HugeTLB
74 5-9-1. HugeTLB Interface Files
75 5-10. Misc
76 5-10-1. Misc Interface Files
77 5-10-2. Migration and Ownership
78 5-11. Others
79 5-11-1. perf_event
80 5-N. Non-normative information
81 5-N-1. CPU controller root cgroup process behaviour
82 5-N-2. IO controller root cgroup process behaviour
84 6-1. Basics
85 6-2. The Root and Views
86 6-3. Migration and setns(2)
87 6-4. Interaction with Other Namespaces
89 P-1. Filesystem Support for Writeback
92 R-1. Multiple Hierarchies
93 R-2. Thread Granularity
94 R-3. Competition Between Inner Nodes and Threads
95 R-4. Other Interface Issues
96 R-5. Controller Issues and Remedies
97 R-5-1. Memory
104 -----------
113 ---------------
119 cgroup is largely composed of two parts - the core and controllers.
121 processes. A cgroup controller is usually responsible for
134 disabled selectively on a cgroup. All controller behaviors are
135 hierarchical - if a controller is enabled on a cgroup, it affects all
137 sub-hierarchy of the cgroup. When a controller is enabled on a nested
147 --------
152 # mount -t cgroup2 none $MOUNT_POINT
154 cgroup2 filesystem has the magic number 0x63677270 ("cgrp"). All
161 A controller can be moved across hierarchies only after the controller
162 is no longer referenced in its current hierarchy. Because per-cgroup
163 controller states are destroyed asynchronously and controllers may
164 have lingering references, a controller may not show up immediately on
166 Similarly, a controller should be fully disabled to be moved out of
168 controller to become available for other hierarchies; furthermore, due
169 to inter-controller dependencies, other controllers may need to be
175 the hierarchies and controller associations before starting using the
190 ignored on non-init namespace mounts. Please refer to the
195 task migrations and controller on/offs at the cost of making
207 option is ignored on non-init namespace mounts.
215 behavior but is a mount-option to avoid regressing setups
221 memory usage for the memory controller (for the purpose of
229 controller. The pre-allocated pool does not belong to anyone.
232 memory controller. It is only charged to a cgroup when it is
236 done via other mechanisms (such as the HugeTLB controller).
237 * Failure to charge a HugeTLB folio to the memory controller
241 * Charging HugeTLB memory towards the memory controller affects
245 will not be tracked by the memory controller (even if cgroup
249 The option restores v1-like behavior of pids.events:max, that is only
257 --------------------------------
263 A child cgroup can be created by creating a sub-directory::
268 structure. Each cgroup has a read-writable interface file
270 belong to the cgroup one-per-line. The PIDs are not ordered and the
301 0::/test-cgroup/test-cgroup-nested
308 0::/test-cgroup/test-cgroup-nested (deleted)
334 constraint - threaded controllers can be enabled on non-leaf cgroups
358 - As the cgroup will join the parent's resource domain. The parent
361 - When the parent is an unthreaded domain, it must not have any domain
365 Topology-wise, a cgroup can be in an invalid state. Please consider
368 A (threaded domain) - B (threaded) - C (domain, just created)
383 threads in the cgroup. Except that the operations are per-thread
384 instead of per-process, "cgroup.threads" has the same format and
399 a threaded controller is enabled inside a threaded subtree, it only
405 constraint, a threaded controller must be able to handle competition
406 between threads in a non-leaf cgroup and its child cgroups. Each
407 threaded controller defines how such competitions are handled.
412 - cpu
413 - cpuset
414 - perf_event
415 - pids
418 --------------------------
420 Each non-root cgroup has a "cgroup.events" file which contains
421 "populated" field indicating whether the cgroup's sub-hierarchy has
425 example, to start a clean-up operation after all processes of a given
426 sub-hierarchy have exited. The populated state updates and
427 notifications are recursive. Consider the following sub-hierarchy
431 A(4) - B(0) - C(1)
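The recursive populated state can be sketched as follows. This is a minimal illustration, not kernel code: the `Cgroup` class and `populated()` helper are hypothetical names, the numbers in parentheses are taken to be per-cgroup process counts, and the empty sibling `D` is an assumption added only to show a zero result.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cgroup:
    """Hypothetical in-memory stand-in for one cgroup directory."""
    name: str
    nr_procs: int = 0
    children: List["Cgroup"] = field(default_factory=list)


def populated(cg: Cgroup) -> int:
    """Return 1 if cg or any of its descendants holds a process, else 0."""
    if cg.nr_procs > 0:
        return 1
    return int(any(populated(child) for child in cg.children))


# A(4) - B(0) - C(1), plus an assumed empty sibling D(0) under B.
c = Cgroup("C", 1)
d = Cgroup("D", 0)
b = Cgroup("B", 0, [c, d])
a = Cgroup("A", 4, [b])
```

Under these assumptions B reports 1 even though it holds no processes itself, because C below it does.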
441 -----------------------
446 A controller is available in a cgroup when it is supported by the kernel (i.e.,
448 "cgroup.controllers" file. Availability means the controller's interface files
461 No controller is enabled by default. Controllers can be enabled and
464 # echo "+cpu +memory -io" > cgroup.subtree_control
468 all succeed or fail. If multiple operations on the same controller
471 Enabling a controller in a cgroup indicates that the distribution of
473 Consider the following sub-hierarchy. The enabled controllers are
476 A(cpu,memory) - B(memory) - C()
484 As a controller regulates the distribution of the target resource to
485 the cgroup's children, enabling it creates the controller's interface
487 would create the "cpu." prefixed controller interface files in C and
489 prefixed controller interface files from C and D. This means that the
490 controller interface files - anything which doesn't start with
494 Top-down Constraint
497 Resources are distributed top-down and a cgroup can further distribute
499 parent. This means that all non-root "cgroup.subtree_control" files
501 "cgroup.subtree_control" file. A controller can be enabled only if
502 the parent has the controller enabled and a controller can't be
509 Non-root cgroups can distribute domain resources to their children
514 This guarantees that, when a domain controller is looking at the part
523 is up to each controller (for more information on this topic please
524 refer to the Non-normative information section in the Controllers
528 enabled controller in the cgroup's "cgroup.subtree_control". This is
537 ----------
559 delegated, the user can build sub-hierarchy under the directory,
563 happens in the delegated sub-hierarchy, nothing can escape the
566 Currently, cgroup doesn't impose any restrictions on the number of
567 cgroups in or nesting depth of a delegated sub-hierarchy; however,
574 A delegated sub-hierarchy is contained in the sense that processes
575 can't be moved into or out of the sub-hierarchy by the delegatee.
578 requiring the following conditions for a process with a non-root euid
582 - The writer must have write access to the "cgroup.procs" file.
584 - The writer must have write access to the "cgroup.procs" file of the
588 processes around freely in the delegated sub-hierarchy it can't pull
589 in from or push out to outside the sub-hierarchy.
595 ~~~~~~~~~~~~~ - C0 - C00
598 ~~~~~~~~~~~~~ - C1 - C10
605 will be denied with -EACCES.
610 is not reachable, the migration is rejected with -ENOENT.
614 ----------
622 inherent trade-offs between migration and various hot paths in terms
628 resource structure once on start-up. Dynamic adjustments to resource
629 distribution can be made by changing controller configuration through
641 controller's interface files are prefixed with the controller name and
642 a dot. A controller's name is composed of lower case letters and
661 -------
667 work-conserving. Due to the dynamic nature, this model is usually
682 .. _cgroupv2-limits-distributor:
685 ------
688 Limits can be over-committed - the sum of the limits of children can
693 As limits can be over-committed, all configuration combinations are
700 .. _cgroupv2-protections-distributor:
703 -----------
708 soft boundaries. Protections can also be over-committed in which case
715 As protections can be over-committed, all configuration combinations
719 "memory.low" implements best-effort memory protection and is an
724 -----------
727 resource. Allocations can't be over-committed - the sum of the
734 As allocations can't be over-committed, some configuration
739 "cpu.rt.max" hard-allocates realtime slices and is an example of this
747 ------
752 New-line separated values
760 (when read-only or multiple values can be written at once)
786 -----------
788 - Settings for a single feature should be contained in a single file.
790 - The root cgroup should be exempt from resource control and thus
793 - The default time unit is microseconds. If a different unit is ever
796 - A parts-per quantity should use a percentage decimal with at least
797 two digit fractional part - e.g. 13.40.
799 - If a controller implements weight based resource distribution, its
805 - If a controller implements an absolute resource guarantee and/or
807 respectively. If a controller implements best effort resource
814 - If a setting has a configurable default value and keyed specific
828 # cat cgroup-example-interface-file
834 # echo 125 > cgroup-example-interface-file
838 # echo "default 125" > cgroup-example-interface-file
842 # echo "8:16 170" > cgroup-example-interface-file
846 # echo "8:0 default" > cgroup-example-interface-file
847 # cat cgroup-example-interface-file
851 - For events which are not very high frequency, an interface file
858 --------------------
863 A read-write single value file which exists on non-root
869 - "domain" : A normal valid domain cgroup.
871 - "domain threaded" : A threaded domain cgroup which is
874 - "domain invalid" : A cgroup which is in an invalid state.
878 - "threaded" : A threaded cgroup which is a member of a
885 A read-write new-line separated values file which exists on
889 the cgroup one-per-line. The PIDs are not ordered and the
898 - It must have write access to the "cgroup.procs" file.
900 - It must have write access to the "cgroup.procs" file of the
903 When delegating a sub-hierarchy, write access to this file
911 A read-write new-line separated values file which exists on
915 the cgroup one-per-line. The TIDs are not ordered and the
924 - It must have write access to the "cgroup.threads" file.
926 - The cgroup that the thread is currently in must be in the
929 - It must have write access to the "cgroup.procs" file of the
932 When delegating a sub-hierarchy, write access to this file
936 A read-only space separated values file which exists on all
943 A read-write space separated values file which exists on all
950 Space separated list of controllers prefixed with '+' or '-'
951 can be written to enable or disable controllers. A controller
952 name prefixed with '+' enables the controller and '-'
953 disables. If a controller appears more than once on the list,
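The '+'/'-' write syntax can be modelled with a short sketch. The helper name `apply_subtree_control` is hypothetical, and the sketch assumes that operations are applied in order so that the last one on a given controller wins, and that a malformed token rejects the whole write. It does not model the kernel's availability checks.

```python
def apply_subtree_control(enabled, request):
    """Return the new enabled set after applying a '+name -name' request.

    The input set is never modified, so a rejected request changes nothing.
    """
    result = set(enabled)
    for token in request.split():
        if len(token) < 2 or token[0] not in "+-":
            raise ValueError("invalid token: %r" % token)
        name = token[1:]
        if token[0] == "+":
            result.add(name)      # '+' enables the controller
        else:
            result.discard(name)  # '-' disables it
    return result
```

For example, applying `"+cpu +memory -io"` to a set that contains only `io` would leave `cpu` and `memory` enabled.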
958 A read-only flat-keyed file which exists on non-root cgroups.
970 A read-write single value file. The default is "max".
972 Maximum allowed number of descendant cgroups.
973 If the actual number of descendants is equal or larger,
977 A read-write single value file. The default is "max".
984 A read-only flat-keyed file with the following entries:
987 Total number of visible descendant cgroups.
990 Total number of dying descendant cgroups. A cgroup becomes
1002 Total number of live cgroup subsystems (e.g. memory
1006 Total number of dying cgroup subsystems (e.g. memory
1010 A read-only flat-keyed file which exists in non-root cgroups.
1028 A read-write single value file which exists on non-root cgroups.
1051 create new sub-cgroups.
1054 A write-only single value file which exists in non-root cgroups.
1066 the whole thread-group.
1069 A read-write single value file whose allowed values are "0" and "1".
1073 Writing "1" to the file will re-enable the cgroup PSI accounting.
1081 This may cause non-negligible overhead for some workloads when under
1083 be used to disable PSI accounting in the non-leaf cgroups.
1086 A read-write nested-keyed file.
1094 .. _cgroup-v2-cpu:
1097 ---
1100 controller implements weight and absolute bandwidth limit models for
1111 WARNING: cgroup2 cpu controller doesn't yet support the (bandwidth) control of
1113 enabled for group scheduling of realtime processes, the cpu controller can only
1115 management software may already have placed RT processes into non-root cgroups
1117 root cgroup before the cpu controller can be enabled with a
1122 the following section for details. Only the cpu controller is affected by
1130 The interaction of a process with the cpu controller depends on its scheduling
1131 policy and the underlying scheduler. From the point of view of the cpu controller,
1134 * Processes under the fair-class scheduler
1139 For details on when a process is under the fair-class scheduler or a BPF scheduler,
1140 check out :ref:`Documentation/scheduler/sched-ext.rst <sched-ext>`.
1146 A read-only flat-keyed file.
1147 This file exists whether the controller is enabled or not.
1152 - usage_usec
1153 - user_usec
1154 - system_usec
1156 and the following five when the controller is enabled, which account for
1157 only the processes under the fair-class scheduler:
1159 - nr_periods
1160 - nr_throttled
1161 - throttled_usec
1162 - nr_bursts
1163 - burst_usec
1166 A read-write single value file which exists on non-root
1175 This file affects only processes under the fair-class scheduler and a BPF
1180 A read-write single value file which exists on non-root
1183 The nice value is in the range [-20, 19].
1191 This file affects only processes under the fair-class scheduler and a BPF
1196 A read-write two value file which exists on non-root cgroups.
1205 one number is written, $MAX is updated.
1207 This file affects only processes under the fair-class scheduler.
1210 A read-write single value file which exists on non-root
1215 This file affects only processes under the fair-class scheduler.
1218 A read-write nested-keyed file.
1226 A read-write single value file which exists on non-root cgroups.
1230 rational number, e.g. 12.34 for 12.34%.
1244 A read-write single value file which exists on non-root cgroups.
1248 number, e.g. 98.76 for 98.76%.
1258 A read-write single value file which exists on non-root cgroups.
1261 This is the cgroup analog of the per-task SCHED_IDLE sched policy.
1267 This file affects only processes under the fair-class scheduler.
1270 ------
1272 The "memory" controller regulates distribution of memory. Memory is
1278 While not completely water-tight, all major memory usages by a given
1283 - Userland memory - page cache and anonymous memory.
1285 - Kernel data structures such as dentries and inodes.
1287 - TCP socket buffers.
1300 A read-only single value file which exists on non-root
1307 A read-write single value file which exists on non-root
1333 A read-write single value file which exists on non-root
1336 Best-effort memory protection. If the memory usage of a
1356 A read-write single value file which exists on non-root
1379 busy-hitting its memory to slow down reclaim.
1382 A read-write single value file which exists on non-root
1391 In default configuration regular 0-order allocations always
1396 as -ENOMEM or silently ignore in cases like disk readahead.
1399 reclaim and oom-kill are bypassed. This is useful for admin
1402 The job will trigger the reclaim and/or oom-kill on its next
1408 busy-hitting its memory to slow down reclaim.
1411 A write-only nested-keyed file which exists for all cgroups.
1422 specified amount, -EAGAIN is returned.
1442 The valid range for swappiness is [0-200, max], setting
1446 A read-write single value file which exists on non-root cgroups.
1451 A write of any non-empty string to this file resets it to the
1456 A read-write single value file which exists on non-root
1466 Tasks with the OOM protection (oom_score_adj set to -1000)
1474 A read-only flat-keyed file which exists on non-root cgroups.
1485 The number of times the cgroup is reclaimed due to
1488 boundary is over-committed.
1491 The number of times processes of the cgroup are
1499 The number of times the cgroup's memory usage was
1504 The number of times the cgroup's memory usage was
1508 considered as an option, e.g. for failed high-order
1512 The number of processes belonging to this cgroup
1516 The number of times a group OOM has occurred.
1524 A read-only flat-keyed file which exists on non-root cgroups.
1527 types of memory, type-specific details, and other information
1536 If the entry has no per-node counter (or is not shown in the
1537 memory.numa_stat). We use 'npn' (non-per-node) as the tag
1568 Amount of memory used for storing per-cpu kernel
1578 Amount of cached filesystem data that is swap-backed,
1618 Amount of memory, swap-backed and filesystem-backed,
1624 the value for the foo counter, since the foo counter is type-based, not
1625 list-based.
1636 Amount of memory used for storing in-kernel data
1640 Number of refaults of previously evicted anonymous pages.
1643 Number of refaults of previously evicted file pages.
1646 Number of refaulted anonymous pages that were immediately
1650 Number of refaulted file pages that were immediately activated.
1653 Number of restored anonymous pages which have been detected as
1657 Number of restored file pages which have been detected as an
1661 Number of times a shadow node has been reclaimed
1664 Number of pages swapped into memory
1667 Number of pages swapped out of memory
1700 Total number of page faults incurred
1703 Number of major page faults incurred
1721 Number of pages swapped into memory and filled with zero, where I/O
1726 Number of zero-filled pages swapped out with I/O skipped due to the
1730 Number of pages moved in to memory from zswap.
1733 Number of pages moved out of memory to zswap.
1736 Number of pages written from zswap to swap.
1739 Number of transparent hugepages which were allocated to satisfy
1744 Number of transparent hugepages which were allocated to allow
1749 Number of transparent hugepages which are swapped out in one piece
1753 Number of transparent hugepages which were split before swapout.
1758 Number of pages migrated by NUMA balancing.
1761 Number of pages whose page table entries are modified by
1765 Number of NUMA hinting faults.
1768 Number of pages demoted by kswapd.
1771 Number of pages demoted directly.
1774 Number of pages demoted by khugepaged.
1777 Number of pages demoted proactively.
1785 A read-only nested-keyed file which exists on non-root cgroups.
1788 types of memory, type-specific details, and other information
1810 A read-only single value file which exists on non-root
1817 A read-write single value file which exists on non-root
1822 allow userspace to implement custom out-of-memory procedures.
1833 A read-write single value file which exists on non-root cgroups.
1838 A write of any non-empty string to this file resets it to the
1843 A read-write single value file which exists on non-root
1850 A read-only flat-keyed file which exists on non-root cgroups.
1856 The number of times the cgroup's swap usage was over
1860 The number of times the cgroup's swap usage was about
1865 The number of times swap allocation failed either
1866 because of running out of swap system-wide or max
1875 A read-only single value file which exists on non-root
1882 A read-write single value file which exists on non-root
1890 A read-write single value file. The default value is "1".
1908 A read-only nested-keyed file.
1918 Over-committing on high limit (sum of high limits > available memory)
1932 pressure - how much the workload is being impacted due to lack of
1933 memory - is necessary to determine whether a workload needs more
1947 To which cgroup the area will be charged is non-deterministic; however,
1958 --
1960 The "io" controller regulates the distribution of IO resources. This
1961 controller implements both weight based and absolute bandwidth or IOPS
1963 only if cfq-iosched is in use and neither scheme is available for
1964 blk-mq devices.
1971 A read-only nested-keyed file.
1979 rios Number of read IOs
1980 wios Number of write IOs
1982 dios Number of discard IOs
1991 A read-write nested-keyed file which exists only on the root
1995 model based controller (CONFIG_BLK_CGROUP_IOCOST) which
2003 enable Weight-based control enable
2013 The controller is disabled by default and can be enabled by
2015 to zero and the controller uses internal device saturation
2023 shows that on sdb, the controller is enabled, will consider
2035 devices which show wide temporary behavior changes - e.g. a
2046 A read-write nested-keyed file which exists only on the root
2050 controller (CONFIG_BLK_CGROUP_IOCOST) which currently
2059 model The cost model in use - "linear"
2085 generate device-specific coefficients.
2088 A read-write flat-keyed file which exists on non-root cgroups.
2108 A read-write nested-keyed file which exists on non-root
2122 When writing, any number of nested key-value pairs can be
2147 A read-only nested-keyed file.
2162 The io controller, in conjunction with the memory controller,
2163 implements control of page cache writeback IOs. The memory controller
2165 maintained for and the io controller defines the io domain which
2166 writes out dirty pages for the memory domain. Both system-wide and
2167 per-cgroup dirty memory states are examined and the more restrictive
2193 As the memory controller assigns page ownership on the first use and
2205 memory controller and system-wide clean memory.
2216 This is a cgroup v2 controller for IO workload protection. You provide a group
2218 controller will throttle any peers that have a lower latency target than the
2238 your real setting, setting at 10-15% higher than the value in io.stat.
2244 target the controller doesn't do anything. Once a group starts missing its
2248 - Queue depth throttling. This is the number of outstanding IO's a group is
2252 - Artificial delay induction. There are certain types of IO that cannot be
2258 being added to any process that runs in this group. Because this number can
2275 If the controller is enabled you will see extra stats in io.stat in
2285 corresponding number of samples based on the win value.
2299 no-change
2302 promote-to-rt
2303 For requests that have a non-RT I/O priority class, change it into RT.
2307 restrict-to-be
2317 none-to-rt
2318 Deprecated. Just an alias for promote-to-rt.
2322 +----------------+---+
2323 | no-change | 0 |
2324 +----------------+---+
2325 | promote-to-rt | 1 |
2326 +----------------+---+
2327 | restrict-to-be | 2 |
2328 +----------------+---+
2330 +----------------+---+
2334 +-------------------------------+---+
2336 +-------------------------------+---+
2337 | IOPRIO_CLASS_RT (real-time) | 1 |
2338 +-------------------------------+---+
2340 +-------------------------------+---+
2342 +-------------------------------+---+
2346 - If I/O priority class policy is promote-to-rt, change the request I/O
2349 - If I/O priority class policy is not promote-to-rt, translate the I/O priority
2350 class policy into a number, then change the request I/O priority class
2351 into the maximum of the I/O priority class policy number and the numerical
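The two rules above can be sketched as a small function. The name `effective_class` is hypothetical. The policy numbers for no-change, promote-to-rt and restrict-to-be, and the value 1 for `IOPRIO_CLASS_RT`, come from the tables above; the "idle" policy number and the remaining class constants are assumptions taken from the kernel's ioprio definitions. Only the priority class is modelled, not the priority level.

```python
# I/O priority classes (RT = 1 is from the table above; the rest are assumed).
IOPRIO_CLASS_NONE = 0
IOPRIO_CLASS_RT = 1
IOPRIO_CLASS_BE = 2
IOPRIO_CLASS_IDLE = 3

# Policy name to number; "idle" = 3 is an assumption, the others are listed above.
POLICY_NUMBER = {
    "no-change": 0,
    "promote-to-rt": 1,
    "restrict-to-be": 2,
    "idle": 3,
}


def effective_class(policy, request_class):
    """Return the I/O priority class of a request after applying the policy."""
    if policy == "none-to-rt":  # deprecated alias for promote-to-rt
        policy = "promote-to-rt"
    if policy == "promote-to-rt":
        return IOPRIO_CLASS_RT
    # Any other policy: maximum of the policy number and the request's class.
    return max(POLICY_NUMBER[policy], request_class)
```

Because a larger class number means a lower priority, taking the maximum can only keep or lower a request's priority, never raise it.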
2355 ---
2357 The process number controller is used to allow a cgroup to stop any
2361 The number of tasks in a cgroup can be exhausted in ways which other
2362 controllers cannot prevent, thus warranting its own controller. For
2363 example, a fork bomb is likely to exhaust the number of tasks before
2366 Note that PIDs used in this controller refer to TIDs, process IDs as
2374 A read-write single value file which exists on non-root
2377 Hard limit of number of processes.
2380 A read-only single value file which exists on non-root cgroups.
2382 The number of processes currently in the cgroup and its
2386 A read-only single value file which exists on non-root cgroups.
2388 The maximum value that the number of processes in the cgroup and its
2392 A read-only flat-keyed file which exists on non-root cgroups. Unless
2397 The number of times the cgroup's total number of processes hit the pids.max
2410 through fork() or clone(). These will return -EAGAIN if the creation
2415 ------
2417 The "cpuset" controller provides a mechanism for constraining
2422 memory placement to reduce cross-node memory access and contention
2425 The "cpuset" controller is hierarchical. That means the controller
2433 A read-write multiple values file which exists on non-root
2434 cpuset-enabled cgroups.
2441 The CPU numbers are comma-separated numbers or ranges.
2445 0-4,6,8-10
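A tool that reads such a value can expand it with a few lines of code. This is a minimal sketch, assuming the comma-separated numbers-or-ranges format described above; `parse_cpu_list` is a hypothetical helper name, not part of any kernel or library interface.

```python
def parse_cpu_list(text):
    """Expand a list such as "0-4,6,8-10" into a sorted list of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty value yields an empty list
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            if first > last:
                raise ValueError("descending range: %r" % part)
            cpus.update(range(first, last + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return sorted(cpus)
```

The same helper would apply to memory node lists such as `0-1,3`, which use the same notation.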
2448 setting as the nearest cgroup ancestor with a non-empty
2455 A read-only multiple values file which exists on all
2456 cpuset-enabled cgroups.
2472 A read-write multiple values file which exists on non-root
2473 cpuset-enabled cgroups.
2480 The memory node numbers are comma-separated numbers or ranges.
2484 0-1,3
2487 setting as the nearest cgroup ancestor with a non-empty
2494 Setting a non-empty value to "cpuset.mems" causes memory of
2506 A read-only multiple values file which exists on all
2507 cpuset-enabled cgroups.
2522 A read-write multiple values file which exists on non-root
2523 cpuset-enabled cgroups.
2556 A read-only multiple values file which exists on all non-root
2557 cpuset-enabled cgroups.
2569 A read-only multiple values file which exists only on the root cgroup.
2576 A read-write single value file which exists on non-root
2577 cpuset-enabled cgroups. This flag is owned by the parent cgroup
2583 "member" Non-root member of a partition
2588 A cpuset partition is a collection of cpuset-enabled cgroups with
2595 There are two types of partitions - local and remote. A local
2611 be changed. All other non-root cgroups start out as "member".
2624 two possible states - valid or invalid. An invalid partition
2635 "member" Non-root member of a partition
2662 A valid non-root parent partition may distribute out all its CPUs
2681 A user can pre-configure certain CPUs to an isolated state
2687 Device controller
2688 -----------------
2690 Device controller manages access to device files. It includes both
2694 Cgroup v2 device controller has no interface files and is implemented
2699 on the return value the attempt will succeed or fail with -EPERM.
2704 If the program returns 0, the attempt fails with -EPERM, otherwise it
2712 ----
2714 The "rdma" controller regulates the distribution and accounting of
2721 A read-write nested-keyed file that exists for all the cgroups
2732 hca_handle Maximum number of HCA Handles
2733 hca_object Maximum number of HCA Objects
2742 A read-only file that describes current resource usage.
2751 ----
2753 The "dmem" controller regulates the distribution and accounting of
2761 A read-write nested-keyed file that exists for all the cgroups
2770 The semantics are the same as for the memory cgroup controller, and are
2774 A read-only file that describes maximum region capacity.
2785 A read-only file that describes current resource usage.
2794 -------
2796 The HugeTLB controller allows limiting the HugeTLB usage per control group and
2797 enforces the controller limit during page fault.
2811 A read-only flat-keyed file which exists on non-root cgroups.
2814 The number of allocation failures due to HugeTLB limit
2824 use hugetlb pages are included. The per-node values are in bytes.
2827 ----
2831 cgroup resources. Controller is enabled by the CONFIG_CGROUP_MISC config
2834 A resource can be added to the controller via enum misc_res_type{} in the
2840 uncharge APIs. All of the APIs to interact with misc controller are in
2846 Miscellaneous controller provides 3 interface files. If two misc resources (res_a and res_b) are re…
2849 A read-only flat-keyed file shown only in the root cgroup. It shows
2858 A read-only flat-keyed file shown in all cgroups. It shows
2866 A read-only flat-keyed file shown in all cgroups. It shows the
2875 A read-write flat-keyed file shown in non-root cgroups. Allowed
2894 A read-only flat-keyed file which exists on non-root cgroups. The
2900 The number of times the cgroup's resource usage was
2917 ------
2922 perf_event controller, if not mounted on a legacy hierarchy, is
2924 always be filtered by cgroup v2 path. The controller can still be
2928 Non-normative information
2929 -------------------------
2935 CPU controller root cgroup process behaviour
2945 appropriately so the neutral - nice 0 - value is 100 instead of 1024).
2948 IO controller root cgroup process behaviour
2961 ------
2980 The path '/batchjobs/container_id1' can be considered as system-data
2985 # ls -l /proc/self/ns/cgroup
2986 lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835]
2992 # ls -l /proc/self/ns/cgroup
2993 lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183]
2997 When some thread from a multi-threaded process unshares its cgroup
3009 ------------------
3020 # ~/unshare -c # unshare cgroupns in some cgroup
3028 Each process gets its namespace-specific view of "/proc/$PID/cgroup"
3059 ----------------------
3088 ---------------------------------
3091 running inside a non-init cgroup namespace::
3093 # mount -t cgroup2 none $MOUNT_POINT
3100 the view of cgroup hierarchy by namespace-private cgroupfs mount
3113 --------------------------------
3116 address_space_operations->writepages() to annotate bio's using the
3133 super_block by setting SB_I_CGROUPWB in ->s_iflags. This allows for
3150 - Multiple hierarchies including named ones are not supported.
3152 - None of the v1 mount options are supported.
3154 - The "tasks" file is removed and "cgroup.procs" is not sorted.
3156 - "cgroup.clone_children" is removed.
3158 - /proc/cgroups is meaningless for v2. Use "cgroup.controllers" or
3166 --------------------
3168 cgroup v1 allowed an arbitrary number of hierarchies and each
3169 hierarchy could host any number of controllers. While this seemed to
3172 For example, as there is only one instance of each controller, utility
3179 the specific controller.
3183 each controller on its own hierarchy. Only closely related ones, such
3196 length. The key might contain any number of entries and was unlimited
3199 which in turn exacerbated the original problem of proliferating number
3202 Also, as a controller couldn't have any expectation regarding the
3204 controller had to assume that all other controllers were attached to
3211 depending on the specific controller. In other words, hierarchy may
3219 ------------------
3227 Generally, in-process knowledge is available only to the process
3228 itself; thus, unlike service-level organization of processes,
3235 sub-hierarchies and control resource distributions along them. This
3236 effectively raised cgroup to the status of a syscall-like API exposed
3246 that the process would actually be operating on its own sub-hierarchy.
3248 cgroup controllers implemented a number of knobs which would never be
3250 system-management pseudo filesystem. cgroup ended up with interface
3253 individual applications through the ill-defined delegation mechanism
3263 -------------------------------------------
3271 The cpu controller considered threads and cgroups as equivalents and
3274 cycles and the number of internal threads fluctuated - the ratios
3275 constantly changed as the number of competing entities fluctuated.
3280 The io controller implicitly created a hidden leaf node for each
3288 The memory controller didn't have a way to control what happened
3290 clearly defined. There were attempts to add ad-hoc behaviors and
3304 ----------------------
3306 cgroup v1 grew without oversight and developed a large number of
3308 was how an empty cgroup was notified - a userland helper binary was
3311 to in-kernel event delivery filtering mechanism further complicating
3314 Controller interfaces were problematic too. An extreme example is
3326 formats and units even in the same controller.
3332 Controller Issues and Remedies
3333 ------------------------------
3340 global reclaim prefers is opt-in, rather than opt-out. The costs for
3350 becomes self-defeating.
3352 The memory.low boundary on the other hand is a top-down allocated
3390 new limit is met - or the task writing to memory.max is killed.
3399 groups can sabotage swapping by other means - such as referencing its
3400 anonymous memory in a tight loop - and an admin can not assume full