1 .. _cgroup-v2:
11 conventions of cgroup v2. It describes all userland-visible aspects
14 v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgroup-v1>`.
19 1-1. Terminology
20 1-2. What is cgroup?
22 2-1. Mounting
23 2-2. Organizing Processes and Threads
24 2-2-1. Processes
25 2-2-2. Threads
26 2-3. [Un]populated Notification
27 2-4. Controlling Controllers
28 2-4-1. Enabling and Disabling
29 2-4-2. Top-down Constraint
30 2-4-3. No Internal Process Constraint
31 2-5. Delegation
32 2-5-1. Model of Delegation
33 2-5-2. Delegation Containment
34 2-6. Guidelines
35 2-6-1. Organize Once and Control
36 2-6-2. Avoid Name Collisions
38 3-1. Weights
39 3-2. Limits
40 3-3. Protections
41 3-4. Allocations
43 4-1. Format
44 4-2. Conventions
45 4-3. Core Interface Files
47 5-1. CPU
48 5-1-1. CPU Interface Files
49 5-2. Memory
50 5-2-1. Memory Interface Files
51 5-2-2. Usage Guidelines
52 5-2-3. Memory Ownership
53 5-3. IO
54 5-3-1. IO Interface Files
55 5-3-2. Writeback
56 5-3-3. IO Latency
57 5-3-3-1. How IO Latency Throttling Works
58 5-3-3-2. IO Latency Interface Files
59 5-3-4. IO Priority
60 5-4. PID
61 5-4-1. PID Interface Files
62 5-5. Cpuset
63 5-5-1. Cpuset Interface Files
64 5-6. Device
65 5-7. RDMA
66 5-7-1. RDMA Interface Files
67 5-8. DMEM
68 5-9. HugeTLB
69 5-9-1. HugeTLB Interface Files
70 5-10. Misc
71 5-10-1. Miscellaneous cgroup Interface Files
72 5-10-2. Migration and Ownership
73 5-11. Others
74 5-11-1. perf_event
75 5-N. Non-normative information
76 5-N-1. CPU controller root cgroup process behaviour
77 5-N-2. IO controller root cgroup process behaviour
79 6-1. Basics
80 6-2. The Root and Views
81 6-3. Migration and setns(2)
82 6-4. Interaction with Other Namespaces
84 P-1. Filesystem Support for Writeback
87 R-1. Multiple Hierarchies
88 R-2. Thread Granularity
89 R-3. Competition Between Inner Nodes and Threads
90 R-4. Other Interface Issues
91 R-5. Controller Issues and Remedies
92 R-5-1. Memory
99 -----------
108 ---------------
111 distribute system resources along the hierarchy in a controlled and
114 cgroup is largely composed of two parts - the core and controllers.
117 distributing a specific type of system resource along the hierarchy
121 cgroups form a tree structure and every process in the system belongs
130 hierarchical - if a controller is enabled on a cgroup, it affects all
132 sub-hierarchy of the cgroup. When a controller is enabled on a nested
142 --------
147 # mount -t cgroup2 none $MOUNT_POINT
157 is no longer referenced in its current hierarchy. Because per-cgroup
164 to inter-controller dependencies, other controllers may need to be
171 controllers after system boot.
173 During transition to v2, system management software might still
183 option is system wide and can only be set on mount or modified
185 ignored on non-init namespace mounts. Please refer to the
200 This option is system wide and can only be set on mount or
202 option is ignored on non-init namespace mounts.
210 behavior but is a mount-option to avoid regressing setups
224 controller. The pre-allocated pool does not belong to anyone.
244 The option restores v1-like behavior of pids.events:max, that is only
252 --------------------------------
258 A child cgroup can be created by creating a sub-directory::
263 structure. Each cgroup has a read-writable interface file
265 belong to the cgroup one-per-line. The PIDs are not ordered and the
270 target cgroup's "cgroup.procs" file. Only one process can be migrated
290 cgroup is in use in the system, this file may contain multiple lines,
296 0::/test-cgroup/test-cgroup-nested
303 0::/test-cgroup/test-cgroup-nested (deleted)
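As the entries above suggest, each line of "/proc/$PID/cgroup" has three colon-separated fields (hierarchy ID, controller list, path), and the path may carry a " (deleted)" suffix. The following is a minimal Python sketch of parsing such a line. The names ``CgroupEntry`` and ``parse_proc_cgroup_line`` are illustrative assumptions, not part of any kernel or library interface.

```python
from typing import NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: list
    path: str
    deleted: bool


def parse_proc_cgroup_line(line: str) -> CgroupEntry:
    """Parse one line of /proc/$PID/cgroup (hypothetical helper)."""
    # Split on the first two colons only, so a path containing ':' stays intact.
    hier, ctrls, path = line.rstrip("\n").split(":", 2)
    suffix = " (deleted)"
    deleted = path.endswith(suffix)
    if deleted:
        path = path[: -len(suffix)]
    # The controller list is empty in the "0::" entries shown above.
    controllers = [c for c in ctrls.split(",") if c]
    return CgroupEntry(int(hier), controllers, path, deleted)
```

Note that this sketch cannot distinguish a removed cgroup from a live cgroup whose name literally ends in " (deleted)".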
329 constraint - threaded controllers can be enabled on non-leaf cgroups
340 "cgroup.type" file which indicates whether the cgroup is a normal
345 threaded by writing "threaded" to the "cgroup.type" file. The
353 - As the cgroup will join the parent's resource domain. The parent
356 - When the parent is an unthreaded domain, it must not have any domain
360 Topology-wise, a cgroup can be in an invalid state. Please consider
363 A (threaded domain) - B (threaded) - C (domain, just created)
367 threaded cgroup. "cgroup.type" file will report "domain invalid" in
373 "cgroup.subtree_control" file while there are processes in the cgroup.
378 threads in the cgroup. Except that the operations are per-thread
379 instead of per-process, "cgroup.threads" has the same format and
401 between threads in a non-leaf cgroup and its child cgroups. Each
407 - cpu
408 - cpuset
409 - perf_event
410 - pids
413 --------------------------
415 Each non-root cgroup has a "cgroup.events" file which contains
416 "populated" field indicating whether the cgroup's sub-hierarchy has
420 example, to start a clean-up operation after all processes of a given
421 sub-hierarchy have exited. The populated state updates and
422 notifications are recursive. Consider the following sub-hierarchy
426 A(4) - B(0) - C(1)
431 file modified events will be generated on the "cgroup.events" files of
436 -----------------------
441 Each cgroup has a "cgroup.controllers" file which lists all
448 disabled by writing to the "cgroup.subtree_control" file::
450 # echo "+cpu +memory -io" > cgroup.subtree_control
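The string written above is a space-separated list of controller names, each prefixed with '+' or '-'. Management code might assemble it with a helper like the following sketch. The function name and the check for conflicting requests are illustrative assumptions rather than documented kernel behavior.

```python
def subtree_control_request(enable=(), disable=()):
    """Build a string suitable for writing to "cgroup.subtree_control"."""
    enable, disable = list(enable), list(disable)
    conflict = set(enable) & set(disable)
    if conflict:
        # Assumption of this sketch: refuse contradictory requests up front.
        raise ValueError(f"both enabled and disabled: {sorted(conflict)}")
    return " ".join([f"+{c}" for c in enable] + [f"-{c}" for c in disable])
```

For example, ``subtree_control_request(["cpu", "memory"], ["io"])`` produces the same string as the shell command above, which could then be written to the cgroup's "cgroup.subtree_control" file.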
459 Consider the following sub-hierarchy. The enabled controllers are
462 A(cpu,memory) - B(memory) - C()
476 controller interface files - anything which doesn't start with
480 Top-down Constraint
483 Resources are distributed top-down and a cgroup can further distribute
485 parent. This means that all non-root "cgroup.subtree_control" files
487 "cgroup.subtree_control" file. A controller can be enabled only if
495 Non-root cgroups can distribute domain resources to their children
510 refer to the Non-normative information section in the Controllers
519 file.
523 ----------
545 delegated, the user can build sub-hierarchy under the directory,
549 happens in the delegated sub-hierarchy, nothing can escape the
553 cgroups in or nesting depth of a delegated sub-hierarchy; however,
560 A delegated sub-hierarchy is contained in the sense that processes
561 can't be moved into or out of the sub-hierarchy by the delegatee.
564 requiring the following conditions for a process with a non-root euid
566 "cgroup.procs" file.
568 - The writer must have write access to the "cgroup.procs" file.
570 - The writer must have write access to the "cgroup.procs" file of the
574 processes around freely in the delegated sub-hierarchy it can't pull
575 in from or push out to outside the sub-hierarchy.
581 ~~~~~~~~~~~~~ - C0 - C00
584 ~~~~~~~~~~~~~ - C1 - C10
588 file; however, the common ancestor of the source cgroup C10 and the
591 will be denied with -EACCES.
596 is not reachable, the migration is rejected with -ENOENT.
600 ----------
608 inherent trade-offs between migration and various hot paths in terms
613 should be assigned to a cgroup according to the system's logical and
614 resource structure once on start-up. Dynamic adjustments to resource
630 character for collision avoidance. Also, interface file names won't
647 -------
653 work-conserving. Due to the dynamic nature, this model is usually
668 .. _cgroupv2-limits-distributor:
671 ------
674 Limits can be over-committed - the sum of the limits of children can
679 As limits can be over-committed, all configuration combinations are
686 .. _cgroupv2-protections-distributor:
689 -----------
694 soft boundaries. Protections can also be over-committed in which case
701 As protections can be over-committed, all configuration combinations
705 "memory.low" implements best-effort memory protection and is an
710 -----------
713 resource. Allocations can't be over-committed - the sum of the
720 As allocations can't be over-committed, some configuration
725 "cpu.rt.max" hard-allocates realtime slices and is an example of this
733 ------
738 New-line separated values
746 (when read-only or multiple values can be written at once)
762 For a writable file, the format for writing should generally match
772 -----------
774 - Settings for a single feature should be contained in a single file.
776 - The root cgroup should be exempt from resource control and thus
779 - The default time unit is microseconds. If a different unit is ever
782 - A parts-per quantity should use a percentage decimal with at least
783 two digit fractional part - e.g. 13.40.
785 - If a controller implements weight based resource distribution, its
786 interface file should be named "weight" and have the range [1,
791 - If a controller implements an absolute resource guarantee and/or
800 - If a setting has a configurable default value and keyed specific
802 appear as the first entry in the file.
814 # cat cgroup-example-interface-file
820 # echo 125 > cgroup-example-interface-file
824 # echo "default 125" > cgroup-example-interface-file
828 # echo "8:16 170" > cgroup-example-interface-file
832 # echo "8:0 default" > cgroup-example-interface-file
833 # cat cgroup-example-interface-file
837 - For events which are not very high frequency, an interface file
839 Whenever a notifiable event happens, file modified event should be
840 generated on the file.
844 --------------------
849 A read-write single value file which exists on non-root
855 - "domain" : A normal valid domain cgroup.
857 - "domain threaded" : A threaded domain cgroup which is
860 - "domain invalid" : A cgroup which is in an invalid state.
864 - "threaded" : A threaded cgroup which is a member of a
868 "threaded" to this file.
871 A read-write new-line separated values file which exists on
875 the cgroup one-per-line. The PIDs are not ordered and the
884 - It must have write access to the "cgroup.procs" file.
886 - It must have write access to the "cgroup.procs" file of the
889 When delegating a sub-hierarchy, write access to this file
892 In a threaded cgroup, reading this file fails with EOPNOTSUPP
897 A read-write new-line separated values file which exists on
901 the cgroup one-per-line. The TIDs are not ordered and the
910 - It must have write access to the "cgroup.threads" file.
912 - The cgroup that the thread is currently in must be in the
915 - It must have write access to the "cgroup.procs" file of the
918 When delegating a sub-hierarchy, write access to this file
922 A read-only space separated values file which exists on all
929 A read-write space separated values file which exists on all
936 Space separated list of controllers prefixed with '+' or '-'
938 name prefixed with '+' enables the controller and '-'
944 A read-only flat-keyed file which exists on non-root cgroups.
946 otherwise, a value change in this file generates a file
956 A read-write single value file. The default is "max".
963 A read-write single value file. The default is "max".
970 A read-only flat-keyed file with the following entries:
979 on system load) before being completely destroyed.
984 A dying cgroup can consume system resources not exceeding
996 A read-write single value file which exists on non-root cgroups.
999 Writing "1" to the file causes freezing of the cgroup and all
1003 is completed, the "frozen" value in the cgroup.events control file
1019 create new sub-cgroups.
1022 A write-only single value file which exists in non-root cgroups.
1025 Writing "1" to the file causes the cgroup and all descendant cgroups to
1032 In a threaded cgroup, writing this file fails with EOPNOTSUPP as
1034 the whole thread-group.
1037 A read-write single value file whose allowed values are "0" and "1".
1040 Writing "0" to the file will disable the cgroup PSI accounting.
1041 Writing "1" to the file will re-enable the cgroup PSI accounting.
1049 This may cause non-negligible overhead for some workloads when under
1051 be used to disable PSI accounting in the non-leaf cgroups.
1054 A read-write nested-keyed file.
1062 .. _cgroup-v2-cpu:
1065 ---
1082 be enabled when all RT processes are in the root cgroup. Be aware that system
1083 management software may already have placed RT processes into non-root cgroups
1084 during the system boot process, and these processes may need to be moved to the
1102 * Processes under the fair-class scheduler
1107 For details on when a process is under the fair-class scheduler or a BPF scheduler,
1108 check out :ref:`Documentation/scheduler/sched-ext.rst <sched-ext>`.
1114 A read-only flat-keyed file.
1115 This file exists whether the controller is enabled or not.
1120 - usage_usec
1121 - user_usec
1122 - system_usec
1125 only the processes under the fair-class scheduler:
1127 - nr_periods
1128 - nr_throttled
1129 - throttled_usec
1130 - nr_bursts
1131 - burst_usec
1134 A read-write single value file which exists on non-root
1143 This file affects only processes under the fair-class scheduler and a BPF
1148 A read-write single value file which exists on non-root
1151 The nice value is in the range [-20, 19].
1153 This interface file is an alternative interface for
1159 This file affects only processes under the fair-class scheduler and a BPF
1164 A read-write two value file which exists on non-root cgroups.
1175 This file affects only processes under the fair-class scheduler.
1178 A read-write single value file which exists on non-root
1183 This file affects only processes under the fair-class scheduler.
1186 A read-write nested-keyed file.
1191 This file accounts for all the processes in the cgroup.
1194 A read-write single value file which exists on non-root cgroups.
1209 This file affects all the processes in the cgroup.
1212 A read-write single value file which exists on non-root cgroups.
1223 This file affects all the processes in the cgroup.
1226 A read-write single value file which exists on non-root cgroups.
1229 This is the cgroup analog of the per-task SCHED_IDLE sched policy.
1235 This file affects only processes under the fair-class scheduler.
1238 ------
1246 While not completely water-tight, all major memory usages by a given
1251 - Userland memory - page cache and anonymous memory.
1253 - Kernel data structures such as dentries and inodes.
1255 - TCP socket buffers.
1268 A read-only single value file which exists on non-root
1275 A read-write single value file which exists on non-root
1301 A read-write single value file which exists on non-root
1304 Best-effort memory protection. If the memory usage of a
1324 A read-write single value file which exists on non-root
1347 busy-hitting its memory to slow down reclaim.
1350 A read-write single value file which exists on non-root
1359 In default configuration regular 0-order allocations always
1364 as -ENOMEM or silently ignore in cases like disk readahead.
1367 reclaim and oom-kill are bypassed. This is useful for admin
1370 The job will trigger the reclaim and/or oom-kill on its next
1376 busy-hitting its memory to slow down reclaim.
1379 A write-only nested-keyed file which exists for all cgroups.
1390 specified amount, -EAGAIN is returned.
1410 The valid range for swappiness is [0-200, max], setting
1414 A read-write single value file which exists on non-root cgroups.
1419 A write of any non-empty string to this file resets it to the
1421 file descriptor.
1424 A read-write single value file which exists on non-root
1434 Tasks with the OOM protection (oom_score_adj set to -1000)
1442 A read-only flat-keyed file which exists on non-root cgroups.
1444 otherwise, a value change in this file generates a file
1447 Note that all fields in this file are hierarchical and the
1448 file modified event can be generated due to an event down the
1456 boundary is over-committed.
1476 considered as an option, e.g. for failed high-order
1487 Similar to memory.events but the fields in the file are local
1488 to the cgroup i.e. not hierarchical. The file modified event
1489 generated on this file reflects only the local events.
1492 A read-only flat-keyed file which exists on non-root cgroups.
1495 types of memory, type-specific details, and other information
1496 on the state and past events of the memory management system.
1504 If the entry has no per-node counter (or is not shown in the
1505 memory.numa_stat). We use 'npn' (non-per-node) as the tag
1515 file
1536 Amount of memory used for storing per-cpu kernel
1546 Amount of cached filesystem data that is swap-backed,
1586 Amount of memory, swap-backed and filesystem-backed,
1592 the value for the foo counter, since the foo counter is type-based, not
1593 list-based.
1604 Amount of memory used for storing in-kernel data
1611 Number of refaults of previously evicted file pages.
1618 Number of refaulted file pages that were immediately activated.
1625 Number of restored file pages which have been detected as an
1694 Number of zero-filled pages swapped out with I/O skipped due to the
1753 A read-only nested-keyed file which exists on non-root cgroups.
1756 types of memory, type-specific details, and other information
1757 per node on the state of the memory management system.
1778 A read-only single value file which exists on non-root
1785 A read-write single value file which exists on non-root
1790 allow userspace to implement custom out-of-memory procedures.
1801 A read-write single value file which exists on non-root cgroups.
1806 A write of any non-empty string to this file resets it to the
1808 file descriptor.
1811 A read-write single value file which exists on non-root
1818 A read-only flat-keyed file which exists on non-root cgroups.
1820 otherwise, a value change in this file generates a file
1834 because of running out of swap system-wide or max
1843 A read-only single value file which exists on non-root
1850 A read-write single value file which exists on non-root
1858 A read-write single value file. The default value is "1".
1876 A read-only nested-keyed file.
1886 Over-committing on high limit (sum of high limits > available memory)
1898 network to a file can use all available memory but can also operate as
1900 pressure - how much the workload is being impacted due to lack of
1901 memory - is necessary to determine whether a workload needs more
1915 To which cgroup the area will be charged is nondeterministic; however,
1926 --
1931 only if cfq-iosched is in use and neither scheme is available for
1932 blk-mq devices.
1939 A read-only nested-keyed file.
1959 A read-write nested-keyed file which exists only on the root
1962 This file configures the Quality of Service of the IO cost
1971 enable Weight-based control enable
2003 devices which show wide temporary behavior changes - e.g. a
2014 A read-write nested-keyed file which exists only on the root
2017 This file configures the cost model of the IO cost model based
2027 model The cost model in use - "linear"
2053 generate device-specific coefficients.
2056 A read-write flat-keyed file which exists on non-root cgroups.
2076 A read-write nested-keyed file which exists on non-root
2090 When writing, any number of nested key-value pairs can be
2115 A read-only nested-keyed file.
2134 writes out dirty pages for the memory domain. Both system-wide and
2135 per-cgroup dirty memory states are examined and the more restrictive
2173 memory controller and system-wide clean memory.
2206 your real setting, setting at 10-15% higher than the value in io.stat.
2216 - Queue depth throttling. This is the number of outstanding IO's a group is
2220 - Artificial delay induction. There are certain types of IO that cannot be
2267 no-change
2270 promote-to-rt
2271 For requests that have a non-RT I/O priority class, change it into RT.
2275 restrict-to-be
2285 none-to-rt
2286 Deprecated. Just an alias for promote-to-rt.
2290 +----------------+---+
2291 | no-change | 0 |
2292 +----------------+---+
2293 | promote-to-rt | 1 |
2294 +----------------+---+
2295 | restrict-to-be | 2 |
2296 +----------------+---+
2298 +----------------+---+
2302 +-------------------------------+---+
2304 +-------------------------------+---+
2305 | IOPRIO_CLASS_RT (real-time) | 1 |
2306 +-------------------------------+---+
2308 +-------------------------------+---+
2310 +-------------------------------+---+
2314 - If I/O priority class policy is promote-to-rt, change the request I/O
2317 - If I/O priority class policy is not promote-to-rt, translate the I/O priority
2323 ---
2342 A read-write single value file which exists on non-root
2348 A read-only single value file which exists on non-root cgroups.
2354 A read-only single value file which exists on non-root cgroups.
2360 A read-only flat-keyed file which exists on non-root cgroups. Unless
2361 specified otherwise, a value change in this file generates a file
2369 Similar to pids.events but the fields in the file are local
2370 to the cgroup i.e. not hierarchical. The file modified event
2371 generated on this file reflects only the local events.
2378 through fork() or clone(). These will return -EAGAIN if the creation
2383 ------
2390 memory placement to reduce cross-node memory access and contention
2391 can improve overall system performance.
2401 A read-write multiple values file which exists on non-root
2402 cpuset-enabled cgroups.
2409 The CPU numbers are comma-separated numbers or ranges.
2413 0-4,6,8-10
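The comma-separated numbers-or-ranges format shown above can be expanded into individual CPU numbers. This is a minimal Python sketch; ``parse_cpu_list`` is a hypothetical helper name, and rejecting reversed ranges is an assumption of this sketch, not a statement about how the kernel parses the file.

```python
def parse_cpu_list(text):
    """Expand a list such as "0-4,6,8-10" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file or stray comma yields no CPUs
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-", 1))
            if lo > hi:
                raise ValueError(f"reversed range: {part!r}")
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```

The same helper applies to the memory node lists used by "cpuset.mems", which share this format.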
2416 setting as the nearest cgroup ancestor with a non-empty
2423 A read-only multiple values file which exists on all
2424 cpuset-enabled cgroups.
2430 If "cpuset.cpus" is empty, the "cpuset.cpus.effective" file shows
2440 A read-write multiple values file which exists on non-root
2441 cpuset-enabled cgroups.
2448 The memory node numbers are comma-separated numbers or ranges.
2452 0-1,3
2455 setting as the nearest cgroup ancestor with a non-empty
2462 Setting a non-empty value to "cpuset.mems" causes memory of
2474 A read-only multiple values file which exists on all
2475 cpuset-enabled cgroups.
2490 A read-write multiple values file which exists on non-root
2491 cpuset-enabled cgroups.
2494 to create a new cpuset partition. Its value is not used
2495 unless the cgroup becomes a valid partition root. See the
2496 "cpuset.cpus.partition" section below for a description of what
2497 a cpuset partition is.
2499 When the cgroup becomes a partition root, the actual exclusive
2500 CPUs that are allocated to that partition are listed in
2520 The root cgroup is a partition root and all its available CPUs
2524 A read-only multiple values file which exists on all non-root
2525 cpuset-enabled cgroups.
2527 This file shows the effective set of exclusive CPUs that
2528 can be used to create a partition root. The content
2529 of this file will always be a subset of its parent's
2534 formation of local partition.
2537 A read-only and root cgroup only multiple values file.
2539 This file shows the set of all isolated CPUs used in existing
2540 isolated partitions. It will be empty if no isolated partition
2543 cpuset.cpus.partition
2544 A read-write single value file which exists on non-root
2545 cpuset-enabled cgroups. This flag is owned by the parent cgroup
2551 "member" Non-root member of a partition
2552 "root" Partition root
2553 "isolated" Partition root without load balancing
2556 A cpuset partition is a collection of cpuset-enabled cgroups with
2557 a partition root at the top of the hierarchy and its descendants
2558 except those that are separate partition roots themselves and
2559 their descendants. A partition has exclusive access to the
2561 of that partition cannot use any CPUs in that set.
2563 There are two types of partitions - local and remote. A local
2564 partition is one whose parent cgroup is also a valid partition
2565 root. A remote partition is one whose parent cgroup is not a
2566 valid partition root itself. Writing to "cpuset.cpus.exclusive"
2567 is optional for the creation of a local partition as its
2568 "cpuset.cpus.exclusive" file will assume an implicit value that
2571 before the target partition root is mandatory for the creation
2572 of a remote partition.
2574 Currently, a remote partition cannot be created under a local
2575 partition. All the ancestors of a remote partition root except
2576 the root cgroup cannot be a partition root.
2578 The root cgroup is always a partition root and its state cannot
2579 be changed. All other non-root cgroups start out as "member".
2582 partition or scheduling domain. The set of exclusive CPUs is
2585 When set to "isolated", the CPUs in that partition will be in
2588 a partition with multiple CPUs should be carefully distributed
2591 A partition root ("root" or "isolated") can be in one of the
2592 two possible states - valid or invalid. An invalid partition
2599 On read, the "cpuset.cpus.partition" file can show the following
2603 "member" Non-root member of a partition
2604 "root" Partition root
2605 "isolated" Partition root without load balancing
2606 "root invalid (<reason>)" Invalid partition root
2607 "isolated invalid (<reason>)" Invalid isolated partition root
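A program reading "cpuset.cpus.partition" has to handle both the plain values and the "invalid (<reason>)" forms listed above. The following Python sketch splits the value into its parts. The function name and the returned tuple layout are illustrative assumptions.

```python
import re

# Matches the five value forms listed in the table above.
_PARTITION_RE = re.compile(r"^(member|root|isolated)(?: invalid \((.*)\))?$")


def parse_partition_state(text):
    """Return (state, is_valid, reason) for a cpuset.cpus.partition value."""
    m = _PARTITION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized partition value: {text!r}")
    state, reason = m.group(1), m.group(2)
    if state == "member" and reason is not None:
        # The table lists no invalid form for "member".
        raise ValueError(f"unrecognized partition value: {text!r}")
    return state, reason is None, reason
```

For a plain "isolated" value this returns ``("isolated", True, None)``; for an invalid root, the third element carries the text found between the parentheses.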
2610 In the case of an invalid partition root, a descriptive string on
2611 why the partition is invalid is included within parentheses.
2613 For a local partition root to be valid, the following conditions
2616 1) The parent cgroup is a valid partition root.
2617 2) The "cpuset.cpus.exclusive.effective" file cannot be empty,
2620 no task associated with this partition.
2622 For a remote partition root to be valid, all the above conditions
2626 "cpuset.cpus.exclusive" can cause a valid partition root to
2630 A valid non-root parent partition may distribute out all its CPUs
2634 Care must be taken to change a valid partition root to "member"
2638 their parent is switched back to a partition root with a proper
2642 "cpuset.cpus.partition" changes. That includes changes caused
2643 by write to "cpuset.cpus.partition", cpu hotplug or other
2644 changes that modify the validity status of the partition.
2646 to "cpuset.cpus.partition" without the need to do continuous
2649 A user can pre-configure certain CPUs to an isolated state
2652 into a partition, they have to be used in an isolated partition.
2656 -----------------
2666 device file, corresponding BPF programs will be executed, and depending
2667 on the return value the attempt will succeed or fail with -EPERM.
2672 If the program returns 0, the attempt fails with -EPERM, otherwise it
2680 ----
2689 A read-write nested-keyed file that exists for all the cgroups
2710 A read-only file that describes current resource usage.
2719 ----
2723 which does not have to be equal to the system page size, the units are always bytes.
2729 A read-write nested-keyed file that exists for all the cgroups
2742 A read-only file that describes maximum region capacity.
2753 A read-only file that describes current resource usage.
2762 -------
2779 A read-only flat-keyed file which exists on non-root cgroups.
2785 Similar to hugetlb.<hugepagesize>.events but the fields in the file
2786 are local to the cgroup i.e. not hierarchical. The file modified event
2787 generated on this file reflects only the local events.
2792 use hugetlb pages are included. The per-node values are in bytes.
2795 ----
2803 include/linux/misc_cgroup.h file and the corresponding name via misc_res_name[]
2804 in the kernel/cgroup/misc.c file. Provider of the resource must set its
2817 A read-only flat-keyed file shown only in the root cgroup. It shows
2826 A read-only flat-keyed file shown in all cgroups. It shows
2834 A read-only flat-keyed file shown in all cgroups. It shows the
2843 A read-write flat-keyed file shown in non-root cgroups. Allowed
2859 file.
2862 A read-only flat-keyed file which exists on non-root cgroups. The
2864 change in this file generates a file modified event. All fields in
2865 this file are hierarchical.
2872 Similar to misc.events but the fields in the file are local to the
2873 cgroup i.e. not hierarchical. The file modified event generated on
2874 this file reflects only the local events.
2885 ------
2896 Non-normative information
2897 -------------------------
2912 kernel/sched/core.c file (values from this array should be scaled
2913 appropriately so the neutral - nice 0 - value is 100 instead of 1024).
2929 ------
2932 "/proc/$PID/cgroup" file and cgroup mounts. The CLONE_NEWCGROUP clone
2939 Without cgroup namespace, the "/proc/$PID/cgroup" file shows the
2942 "/proc/$PID/cgroup" file may leak potential system level information
2948 The path '/batchjobs/container_id1' can be considered as system-data
2953 # ls -l /proc/self/ns/cgroup
2954 lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835]
2960 # ls -l /proc/self/ns/cgroup
2961 lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183]
2965 When some thread from a multi-threaded process unshares its cgroup
2977 ------------------
2988 # ~/unshare -c # unshare cgroupns in some cgroup
2996 Each process gets its namespace-specific view of "/proc/$PID/cgroup"
3027 ----------------------
3056 ---------------------------------
3059 running inside a non-init cgroup namespace::
3061 # mount -t cgroup2 none $MOUNT_POINT
3067 The virtualization of /proc/self/cgroup file combined with restricting
3068 the view of cgroup hierarchy by namespace-private cgroupfs mount
3081 --------------------------------
3084 address_space_operations->writepages() to annotate bio's using the
3101 super_block by setting SB_I_CGROUPWB in ->s_iflags. This allows for
3118 - Multiple hierarchies including named ones are not supported.
3120 - None of the v1 mount options are supported.
3122 - The "tasks" file is removed and "cgroup.procs" is not sorted.
3124 - "cgroup.clone_children" is removed.
3126 - /proc/cgroups is meaningless for v2. Use "cgroup.controllers" or
3134 --------------------
3187 ------------------
3193 individual applications and system management interface.
3195 Generally, in-process knowledge is available only to the process
3196 itself; thus, unlike service-level organization of processes,
3203 sub-hierarchies and control resource distributions along them. This
3204 effectively raised cgroup to the status of a syscall-like API exposed
3214 that the process would actually be operating on its own sub-hierarchy.
3218 system-management pseudo filesystem. cgroup ended up with interface
3221 individual applications through the ill-defined delegation mechanism
3231 -------------------------------------------
3242 cycles and the number of internal threads fluctuated - the ratios
3258 clearly defined. There were attempts to add ad-hoc behaviors and
3272 ----------------------
3276 was how an empty cgroup was notified - a userland helper binary was
3279 to in-kernel event delivery filtering mechanism further complicating
3301 ------------------------------
3308 global reclaim prefers is opt-in, rather than opt-out. The costs for
3316 introduces high allocation latencies into the system, but also impacts
3317 system performance due to overreclaim, to the point where the feature
3318 becomes self-defeating.
3320 The memory.low boundary on the other hand is a top-down allocated
3350 system than killing the group. Otherwise, memory.max is there to
3358 new limit is met - or the task writing to memory.max is killed.
3367 groups can sabotage swapping by other means - such as referencing its
3368 anonymous memory in a tight loop - and an admin cannot assume full
3374 resources. Swap space is a resource like all others in the system,