Lines Matching +full:cpu +full:- +full:idle +full:- +full:states
1 .. SPDX-License-Identifier: GPL-2.0
5 ``intel_idle`` CPU Idle Time Management Driver
17 :doc:`CPU idle time management subsystem <cpuidle>` in the Linux kernel
18 (``CPUIdle``). It is the default CPU idle time management driver for the
24 Documentation/admin-guide/pm/cpuidle.rst if you have not done that yet.]
27 logical CPU executing it is idle and so it may be possible to put some of the
28 processor's functional blocks into low-power states. That instruction takes two
29 arguments (passed in the ``EAX`` and ``ECX`` registers of the target CPU), the
38 only way to pass early-configuration-time parameters to it is via the kernel
45 ``/sys/devices/system/cpu/cpuidle/``:
55 C-state requests from the OS (e.g., C6 requests) to C1. The idea is that
56 firmware monitors CPU wake-up rate, and if it is higher than a
57 platform-specific threshold, the firmware demotes deep C-state requests
59 wake-ups per second, and it keeps the CPU in C1. When the CPU stays in
63 .. _intel-idle-enumeration-of-states:
65 Enumeration of Idle States
71 as C-states (in the ACPI terminology) or idle states. The list of meaningful
72 ``MWAIT`` hint values and idle states (i.e. low-power configurations of the
76 In order to create a list of available idle states required by the ``CPUIdle``
77 subsystem (see :ref:`idle-states-representation` in
78 Documentation/admin-guide/pm/cpuidle.rst),
79 ``intel_idle`` can use two sources of information: static tables of idle states
87 `below <intel-idle-parameters_>`_.]
89 If the ACPI tables are going to be used for building the list of available idle
90 states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
93 ``CPUIdle`` subsystem expects that the list of idle states supplied by the
96 driver looks for the first ``_CST`` object returning at least one valid idle
97 state description and such that all of the idle states included in its return
101 applicable to all of the other CPUs in the system and the idle state
102 descriptions extracted from it are stored in a preliminary list of idle states
104 configured to ignore the ACPI tables; see `below <intel-idle-parameters_>`_.]
106 Next, the first (index 0) entry in the list of available idle states is
107 initialized to represent a "polling idle state" (a pseudo-idle state in which
108 the target CPU continuously fetches and executes instructions), and the
109 subsequent (real) idle state entries are populated as follows.
112 (static) table of idle state descriptions for it in the driver. In that case,
113 the "internal" table is the primary source of information on idle states and the
114 information from it is copied to the final list of available idle states. If
115 using the ACPI tables for the enumeration of idle states is not required
116 (depending on the processor model), all of the listed idle states are enabled by
118 governors during CPU idle state selection). Otherwise, some of the listed idle
119 states may not be enabled by default if there are no matching entries in the
120 preliminary list of idle states coming from the ACPI tables. In that case, user
121 space can still enable them later (on a per-CPU basis) with the help of
122 the ``disable`` idle state attribute in ``sysfs`` (see
123 :ref:`idle-states-representation` in
124 Documentation/admin-guide/pm/cpuidle.rst). This basically means that
125 the idle states "known" to the driver may not be enabled by default if they have
129 supports ``MWAIT``, the preliminary list of idle states coming from the ACPI
131 ``CPUIdle`` core during driver registration. For each idle state in that list,
133 entry in the final list of idle states. The name of the idle state represented
134 by it (to be returned by the ``name`` idle state attribute in ``sysfs``) is
135 "CX_ACPI", where X is the index of that idle state in the final list (note that
138 C1-type idle states the exit latency value is also used as the target residency
139 (for compatibility with the majority of the "internal" tables of idle states for
140 various processor models recognized by ``intel_idle``) and for the other idle
144 All of the idle states in the final list are enabled by default in this case.
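
The ``name`` and ``disable`` idle state attributes mentioned above can be inspected from user space. The following is a minimal sketch, not part of the driver: the helper name ``list_idle_states`` is hypothetical, and it assumes the usual per-CPU ``cpuidle`` directory layout with one ``stateN`` subdirectory per idle state, each containing ``name`` and ``disable`` files.

```python
from pathlib import Path


def list_idle_states(cpuidle_dir):
    """Return (directory, name, disabled) tuples for every stateN directory.

    cpuidle_dir is assumed to be a per-CPU cpuidle directory such as
    /sys/devices/system/cpu/cpu0/cpuidle (layout assumed, not verified here).
    """
    base = Path(cpuidle_dir)
    # Keep only directories named state<digits>, sorted by numeric index
    # so that state10 sorts after state9.
    dirs = [d for d in base.glob("state*") if d.name[5:].isdigit()]
    states = []
    for d in sorted(dirs, key=lambda p: int(p.name[5:])):
        name = (d / "name").read_text().strip()
        # The disable attribute holds "0" when the state is enabled.
        disabled = (d / "disable").read_text().strip() != "0"
        states.append((d.name, name, disabled))
    return states
```

Calling ``list_idle_states("/sys/devices/system/cpu/cpu0/cpuidle")`` on a running system would list the final idle states for CPU 0, which is one way to see whether any "CX_ACPI" entries were created.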
147 .. _intel-idle-initialization:
157 driver, which determines the idle states enumeration method (see
158 `above <intel-idle-enumeration-of-states_>`_), and whether or not the processor
165 `below <intel-idle-parameters_>`_), the idle states information provided by the
169 available idle states is created as explained
170 `above <intel-idle-enumeration-of-states_>`_.
173 as the ``CPUIdle`` driver for all CPUs in the system and a CPU online callback
176 CPUs present in the system at that time (each CPU executes its own instance of
177 the callback routine). That routine registers a ``CPUIdle`` device for the CPU
178 running it (which enables the ``CPUIdle`` subsystem to operate that CPU) and
179 optionally performs some CPU-specific initialization actions that may be
183 .. _intel-idle-parameters:
189 options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
190 and ``idle=nomwait``. If any of them is present in the kernel command line, the
198 The ``max_cstate`` parameter value is the maximum idle state index in the list
199 of idle states supplied to the ``CPUIdle`` core during the registration of the
200 driver. It is also the maximum number of regular (non-polling) idle states that
201 can be used by ``intel_idle``, so the enumeration of idle states is terminated
202 after finding that number of usable idle states (the other idle states that
205 ``intel_idle`` from exposing idle states that are regarded as "too deep" for
208 be desirable. In practice, it is only really necessary to do that if the idle
209 states in question cannot be enabled during system startup, because in the
210 working state of the system the CPU power management quality of service (PM
211 QoS) feature can be used to prevent ``CPUIdle`` from touching those idle states
212 even if they have been enumerated (see :ref:`cpu-pm-qos` in
213 Documentation/admin-guide/pm/cpuidle.rst).
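
The effect of ``max_cstate`` on the list of idle states can be illustrated with a small model. This is only a sketch of the rule stated above (the polling entry at index 0 plus at most ``max_cstate`` regular states), not the driver's actual code; the function name ``limit_states`` and the example state names are hypothetical, and only positive parameter values are modelled.

```python
def limit_states(states, max_cstate):
    """Model the max_cstate limit on an enumerated list of idle states.

    states[0] is assumed to be the polling pseudo-idle state; the rest are
    regular idle states in enumeration order. Only positive max_cstate
    values are covered by this sketch.
    """
    if max_cstate < 1:
        raise ValueError("this sketch only models positive max_cstate values")
    # The highest index that survives is max_cstate itself, which leaves
    # at most max_cstate regular (non-polling) states after index 0.
    return states[: max_cstate + 1]
```

For instance, ``limit_states(["POLL", "C1", "C1E", "C6"], 2)`` keeps the first three entries, so the deepest state in this made-up list is never exposed.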
221 ``no_acpi`` - Do not use ACPI at all. Only native mode is available, no
224 ``use_acpi`` - No-op in ACPI mode; in native mode, the driver will consult ACPI
225 tables for C-state on/off status.
227 ``no_native`` - Work only in ACPI mode, no native mode available (ignore
231 list of idle states to be disabled by default in the form of a bitmask.
234 the indices of idle states to be disabled by default (as reflected by the names
235 of the corresponding idle state directories in ``sysfs``, :file:`state0`,
237 idle state; see :ref:`idle-states-representation` in
238 Documentation/admin-guide/pm/cpuidle.rst).
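
The bitmask interpretation of ``states_off`` described above can be sketched as follows. This is an illustrative helper, not driver code: the name ``disabled_state_indices`` and the ``num_states`` argument are assumptions, and bits at positions outside the list of idle states are simply skipped here.

```python
def disabled_state_indices(states_off, num_states):
    """Return the idle state indices selected by the states_off bitmask.

    Bit i set in states_off means idle state i (the sysfs stateN directory
    with N == i) is disabled by default. Bits at positions >= num_states
    select nothing in this sketch.
    """
    return [i for i in range(num_states) if states_off & (1 << i)]
```

With four idle states, a mask of 3 selects indices 0 and 1, and a mask of 8 selects index 3 only.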
240 For example, if ``states_off`` is equal to 3, the driver will disable idle
241 states 0 and 1 by default, and if it is equal to 8, idle state 3 will be
242 disabled by default and so on (bit positions beyond the maximum idle state index
245 The idle states disabled this way can be enabled (on a per-CPU basis) from user
250 Speculation) should be turned off when the CPU enters an idle state.
256 have a performance impact on its sibling CPU. The IBRS mode will be turned off
257 by default when the CPU enters a deep idle state, but not in some
259 mode to off when the CPU is in any one of the available idle states. This may
260 help performance of a sibling CPU at the expense of a slightly higher wakeup
261 latency for the idle CPU.
264 .. _intel-idle-core-and-package-idle-states:
266 Core and Package Levels of Idle States
270 least) two levels of idle states (or C-states). One level, referred to as
271 "core C-states", covers individual cores in the processor, whereas the other
272 level, referred to as "package C-states", covers the entire processor package
276 Some of the ``MWAIT`` hint values allow the processor to use core C-states only
278 to the ``C1`` idle state), but the majority of them give it a license to put
279 the target core (i.e. the core containing the logical CPU executing ``MWAIT``
280 with the given hint value) into a specific core C-state and then (if possible)
281 to enter a specific package C-state at the deeper level. For example, the
282 ``MWAIT`` hint value representing the ``C3`` idle state allows the processor to
283 put the target core into the low-power state referred to as "core ``C3``" (or
286 representing a deeper idle state), and in addition to that (in the majority of
288 including some non-CPU components such as a GPU or a memory controller) into the
289 low-power state referred to as "package ``C3``" (or ``PC3``), which happens if
292 be required to be in a certain GPU-specific low-power state for ``PC3`` to be
295 As a rule, there is no simple way to make the processor use core C-states only
296 if the conditions for entering the corresponding package C-states are met, so
297 the logical CPU executing ``MWAIT`` with a hint value that is not core-level
299 enter a package C-state. [That is why the exit latency and target residency
301 tables of idle states in ``intel_idle`` reflect the properties of package
302 C-states.] If using package C-states is not desirable at all, either
303 :ref:`PM QoS <cpu-pm-qos>` or the ``max_cstate`` module parameter of
304 ``intel_idle`` described `above <intel-idle-parameters_>`_ must be used to
305 restrict the range of permissible idle states to the ones with core-level only
312 .. [1] *Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B*,
313 …www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-develo…