.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

=======================================
Intel thermal throttle events reporting
=======================================

:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

Introduction
------------

Intel processors have built-in automatic and adaptive thermal monitoring
mechanisms that force the processor to reduce its power consumption in order
to operate within predetermined temperature limits.

Refer to section "THERMAL MONITORING AND PROTECTION" in the "Intel® 64 and
IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B, 3C, & 3D):
System Programming Guide" for more details.

In general, there are two mechanisms to control the core temperature of the
processor. They are called "Thermal Monitor 1" (TM1) and "Thermal Monitor 2"
(TM2).

The status of the temperature sensor that triggers the thermal monitor (TM1/TM2)
is indicated through the "thermal status flag" and "thermal status log flag" in
MSR_IA32_THERM_STATUS for core level and MSR_IA32_PACKAGE_THERM_STATUS for
package level.
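
The two flags are single bits in the status register value (bit 0 and bit 1,
as detailed in the paragraphs that follow). As a minimal sketch, assuming the
raw 64-bit register value has already been obtained by some other means, the
flags could be extracted as shown here. The function name
``decode_thermal_flags`` and the dictionary keys are hypothetical and exist only
for this illustration; they are not part of any kernel interface.

```python
# Bit positions as given in this document for MSR_IA32_THERM_STATUS and
# MSR_IA32_PACKAGE_THERM_STATUS.
THERMAL_STATUS_BIT = 0      # "Thermal Status flag"
THERMAL_STATUS_LOG_BIT = 1  # "Thermal Status Log flag" (sticky)


def decode_thermal_flags(msr_value):
    """Split a raw status register value into the two flags.

    Hypothetical helper for illustration only: msr_value is an integer that
    the caller has already read from the status register.
    """
    return {
        "thermal_status": bool((msr_value >> THERMAL_STATUS_BIT) & 1),
        "thermal_status_log": bool((msr_value >> THERMAL_STATUS_LOG_BIT) & 1),
    }
```

A value with only bit 1 set decodes to "not throttling now, but throttled at
some earlier point", which is the case the sysfs counters are meant to
quantify.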

Thermal Status flag, bit 0: When set, indicates that the processor core
temperature is currently at the trip temperature of the thermal monitor and that
the processor power consumption is being reduced via either TM1 or TM2, depending
on which is enabled. When clear, the flag indicates that the core temperature is
below the thermal monitor trip temperature. This flag is read only.

Thermal Status Log flag, bit 1: When set, indicates that the thermal sensor has
tripped since the last power-up or reset or since the last time that software
cleared this flag. This flag is a sticky bit; once set it remains set until
cleared by software or until a power-up or reset of the processor. The default
state is clear.

It is possible that TM1/TM2 is not active when the user reads
MSR_IA32_THERM_STATUS or MSR_IA32_PACKAGE_THERM_STATUS. In this case, the
"Thermal Status flag" reads 0 and the "Thermal Status Log flag" is set if
TM1/TM2 was activated previously. But since the log flag is a single bit that
has to be cleared by software, it can't show the number of TM1/TM2 activations.

Hence, Linux provides counters of how many times the "Thermal Status flag" was
set. It also reports how long the "Thermal Status flag" was active, in
milliseconds.
Using these counters, users can check if the performance was limited because of
thermal events. It is recommended to read from sysfs instead of directly reading
MSRs as the "Thermal Status Log flag" is reset by the driver to implement rate
control.

Sysfs Interface
---------------

Thermal throttling events are presented for each CPU under
"/sys/devices/system/cpu/cpuX/thermal_throttle/", where "X" is the CPU number.

All these counters are read-only and can't be reset to 0, so they can
potentially overflow after reaching the maximum 64-bit unsigned integer.

``core_throttle_count``
	Shows the number of times "Thermal Status flag" changed from 0 to 1 for this
	CPU since the OS booted and the thermal vector was initialized. This is a
	64-bit counter.

``package_throttle_count``
	Shows the number of times "Thermal Status flag" changed from 0 to 1 for the
	package containing this CPU since the OS booted and the thermal vector was
	initialized. Package status is broadcast to all CPUs; all CPUs in the
	package increment this count. This is a 64-bit counter.

``core_throttle_max_time_ms``
	Shows the maximum amount of time for which "Thermal Status flag" has been
	set to 1 for this CPU at the core level since the OS booted and the thermal
	vector was initialized.

``package_throttle_max_time_ms``
	Shows the maximum amount of time for which "Thermal Status flag" has been
	set to 1 for the package containing this CPU since the OS booted and the
	thermal vector was initialized.

``core_throttle_total_time_ms``
	Shows the cumulative time for which "Thermal Status flag" has been
	set to 1 for this CPU at the core level since the OS booted and the thermal
	vector was initialized.

``package_throttle_total_time_ms``
	Shows the cumulative time for which "Thermal Status flag" has been set
	to 1 for the package containing this CPU since the OS booted and the
	thermal vector was initialized.
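
The counters above can be sampled from user space with ordinary file reads. The
following is a minimal sketch, not a reference tool. The helper names
``read_throttle_counters`` and ``counter_delta`` and the ``sysfs_root``
parameter are assumptions made for this example. Only the directory layout and
the file names come from this document. The sketch assumes each file holds a
single decimal integer, and it skips files that are missing or unreadable.

```python
from pathlib import Path

# File names as listed in the "Sysfs Interface" section.
COUNTER_NAMES = (
    "core_throttle_count",
    "package_throttle_count",
    "core_throttle_max_time_ms",
    "package_throttle_max_time_ms",
    "core_throttle_total_time_ms",
    "package_throttle_total_time_ms",
)

U64_MODULUS = 1 << 64  # the counters are described as 64-bit unsigned values


def read_throttle_counters(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return {file name: integer value} for one CPU.

    Hypothetical helper: sysfs_root is a parameter only so the function can
    be pointed at a test directory. Missing or unparsable files are skipped.
    """
    directory = Path(sysfs_root) / f"cpu{cpu}" / "thermal_throttle"
    counters = {}
    for name in COUNTER_NAMES:
        try:
            counters[name] = int((directory / name).read_text().strip())
        except (OSError, ValueError):
            continue
    return counters


def counter_delta(previous, current):
    """Change between two samples of a 64-bit counter, tolerating one wrap.

    Meaningful for the *_count and *_total_time_ms files; the *_max_time_ms
    files report a maximum rather than an accumulating value.
    """
    return (current - previous) % U64_MODULUS
```

Because the counters cannot be reset, a monitoring script would typically take
one sample with ``read_throttle_counters(0)``, run the workload, take a second
sample, and compare the two with ``counter_delta`` to see whether throttling
occurred during that interval.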