.. SPDX-License-Identifier: GPL-2.0

============================================================
Hardware-Feedback Interface for scheduling on Intel Hardware
============================================================

Overview
--------

Intel has described the Hardware Feedback Interface (HFI) in the Intel 64 and
IA-32 Architectures Software Developer's Manual (Intel SDM) Volume 3 Section
14.6 [1]_.

The HFI gives the operating system performance and energy efficiency
capability data for each CPU in the system. Linux can use the information from
the HFI to influence task placement decisions.

The Hardware Feedback Interface
-------------------------------

The Hardware Feedback Interface provides the operating system with information
about the performance and energy efficiency of each CPU in the system. Each
capability is given as a unit-less quantity in the range [0-255]. Higher values
indicate higher capability. Energy efficiency and performance are reported in
separate capabilities. Even though on some systems these two metrics may be
related, they are specified as independent capabilities in the Intel SDM.

These capabilities may change at runtime as a result of changes in the
operating conditions of the system or the action of external factors. The rate
at which these capabilities are updated is specific to each processor model. On
some models, capabilities are set at boot time and never change. On others,
capabilities may change every tens of milliseconds. For instance, a remote
mechanism may be used to lower Thermal Design Power. Such a change can be
reflected in the HFI. Likewise, if the system needs to be throttled due to
excessive heat, the HFI may reflect reduced performance on specific CPUs.

The kernel or a userspace policy daemon can use these capabilities to modify
task placement decisions.
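
As a rough illustration of how a policy could consume these values, the sketch
below picks the CPU with the highest capability for a given goal. It is only a
sketch under stated assumptions: ``struct example_hfi_caps``,
``enum example_goal`` and ``example_pick_cpu()`` are hypothetical names made up
for this example. They are neither the HFI table layout defined in the Intel
SDM nor an interface of the Linux HFI driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-CPU copy of the two HFI capabilities. This is an
 * illustration only, not the table layout defined in the Intel SDM.
 */
struct example_hfi_caps {
	uint8_t perf;	/* performance capability, 0-255 */
	uint8_t ee;	/* energy efficiency capability, 0-255 */
};

/* Hypothetical selector for the capability a policy cares about. */
enum example_goal {
	EXAMPLE_GOAL_PERF,
	EXAMPLE_GOAL_EE,
};

/* Return the capability of one CPU that matches the requested goal. */
static uint8_t example_cap(const struct example_hfi_caps *c,
			   enum example_goal goal)
{
	return goal == EXAMPLE_GOAL_PERF ? c->perf : c->ee;
}

/*
 * Return the index of the CPU with the highest capability for @goal,
 * or -1 if @nr_cpus is zero. Ties go to the lowest index.
 */
static int example_pick_cpu(const struct example_hfi_caps *caps,
			    size_t nr_cpus, enum example_goal goal)
{
	int best = -1;
	size_t i;

	for (i = 0; i < nr_cpus; i++) {
		if (best < 0 ||
		    example_cap(&caps[i], goal) > example_cap(&caps[best], goal))
			best = (int)i;
	}

	return best;
}
```

Because performance and energy efficiency are independent capabilities, the
same set of CPUs can yield a different answer for each goal, which is why the
goal is passed as a separate argument instead of combining both values into a
single score.
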
For instance, if either the performance or energy efficiency
capability of a given logical processor becomes zero, it is an indication that
the hardware recommends that the operating system not schedule any tasks on
that processor for performance or energy efficiency reasons, respectively.

Implementation details for Linux
--------------------------------

The infrastructure to handle thermal event interrupts has two parts. In the
Local Vector Table of a CPU's local APIC, there is a register for the thermal
monitor. This register controls how interrupts are delivered to a CPU when the
thermal monitor generates an interrupt. Further details can be found in the
Intel SDM Vol. 3 Section 10.5 [1]_.

The thermal monitor may generate interrupts per CPU or per package. The HFI
generates package-level interrupts. This monitor is configured and initialized
via a set of machine-specific registers. Specifically, the HFI interrupt and
status are controlled via designated bits in the IA32_PACKAGE_THERM_INTERRUPT
and IA32_PACKAGE_THERM_STATUS registers, respectively. There exists one HFI
table per package. Further details can be found in the Intel SDM Vol. 3
Section 14.9 [1]_.

The hardware issues an HFI interrupt after it has updated the HFI table and the
table is ready for the operating system to consume. CPUs receive such an
interrupt via the thermal entry in the Local APIC's Local Vector Table.

When servicing such an interrupt, the HFI driver parses the updated table and
relays the update to userspace using the thermal notification framework. Given
that there may be many HFI updates every second, the updates relayed to
userspace are throttled to one every CONFIG_HZ jiffies, that is, one per
second.

References
----------

.. [1] https://www.intel.com/sdm