.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

===============================================
``amd-pstate`` CPU Performance Scaling Driver
===============================================

:Copyright: |copy| 2021 Advanced Micro Devices, Inc.

:Author: Huang Rui <ray.huang@amd.com>


Introduction
===================

``amd-pstate`` is the AMD CPU performance scaling driver that introduces a
new CPU frequency control mechanism on modern AMD APU and CPU series in the
Linux kernel. The new mechanism is based on Collaborative Processor
Performance Control (CPPC), which provides finer-grained frequency management
than legacy ACPI hardware P-States. Current AMD CPU/APU platforms use
the ACPI P-states driver to manage CPU frequency and clocks, switching
among only 3 P-states. CPPC replaces the ACPI P-states controls and provides a
flexible, low-latency interface for the Linux kernel to directly
communicate performance hints to the hardware.

``amd-pstate`` leverages the Linux kernel governors such as ``schedutil``,
``ondemand``, etc. to manage the performance hints which are provided by
CPPC hardware functionality that internally follows the hardware
specification (for details refer to AMD64 Architecture Programmer's Manual
Volume 2: System Programming [1]_). Currently, ``amd-pstate`` supports basic
frequency control according to kernel governors on some of the
Zen2 and Zen3 processors, and more AMD-specific functions will be implemented
in the future after they are verified on the hardware and SBIOS.


AMD CPPC Overview
=======================

The Collaborative Processor Performance Control (CPPC) interface enumerates a
continuous, abstract, and unit-less performance value in a scale that is
not tied to a specific performance state / frequency.
This is an ACPI
standard [2]_ through which software can specify application performance goals and
hints as a relative target to the infrastructure limits. AMD processors
provide a low-latency register model (MSR) instead of an AML code
interpreter for performance adjustments. ``amd-pstate`` will initialize a
``struct cpufreq_driver`` instance, ``amd_pstate_driver``, with the callbacks
to manage each performance update behavior. ::

 Highest Perf ------>+-----------------------+                         +-----------------------+
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |          Max Perf  ---->|                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
 Nominal Perf ------>+-----------------------+                         +-----------------------+
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |      Desired Perf  ---->|                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
  Lowest non-        |                       |                         |                       |
  linear perf ------>+-----------------------+                         +-----------------------+
                     |                       |                         |                       |
                     |                       |          Min perf  ---->|                       |
                     |                       |                         |                       |
  Lowest perf ------>+-----------------------+                         +-----------------------+
                     |                       |                         |                       |
                     |                       |                         |                       |
                     |                       |                         |                       |
          0   ------>+-----------------------+                         +-----------------------+

                                     AMD P-States Performance Scale


.. _perf_cap:

AMD CPPC Performance Capability
--------------------------------

Highest Performance (RO)
.........................

This is the absolute maximum performance an individual processor may reach,
assuming ideal conditions. This performance level may not be sustainable
for long durations and may only be achievable if other platform components
are in a specific state; for example, it may require other processors to be in
an idle state. This would be equivalent to the highest frequencies
supported by the processor.

Nominal (Guaranteed) Performance (RO)
......................................

This is the maximum sustained performance level of the processor, assuming
ideal operating conditions.
In the absence of an external constraint (power,
thermal, etc.), this is the performance level the processor is expected to
be able to maintain continuously. All cores/processors are expected to be
able to sustain their nominal performance state simultaneously.

Lowest non-linear Performance (RO)
...................................

This is the lowest performance level at which nonlinear power savings are
achieved, for example, due to the combined effects of voltage and frequency
scaling. Above this threshold, lower performance levels should be generally
more energy efficient than higher performance levels. This register
effectively conveys the most efficient performance level to ``amd-pstate``.

Lowest Performance (RO)
........................

This is the absolute lowest performance level of the processor. Selecting a
performance level lower than the lowest nonlinear performance level may
cause an efficiency penalty but should reduce the instantaneous power
consumption of the processor.

AMD CPPC Performance Control
------------------------------

``amd-pstate`` passes performance goals through these registers. The
registers drive the behavior of the desired performance target.

Minimum requested performance (RW)
...................................

``amd-pstate`` specifies the minimum allowed performance level.

Maximum requested performance (RW)
...................................

``amd-pstate`` specifies a limit on the maximum performance that is expected
to be supplied by the hardware.

Desired performance target (RW)
...................................

``amd-pstate`` specifies a desired target in the CPPC performance scale as
a relative number. This can be expressed as a percentage of nominal
performance (infrastructure max).
Below the nominal sustained performance
level, desired performance expresses the average performance level of the
processor subject to hardware. Above the nominal performance level,
the processor must provide at least nominal performance requested and go higher
if current operating conditions allow.

Energy Performance Preference (EPP) (RW)
.........................................

This attribute provides a hint to the hardware if software wants to bias
toward performance (0x0) or energy efficiency (0xff).


Key Governors Support
=======================

``amd-pstate`` can be used with all the (generic) scaling governors listed
by the ``scaling_available_governors`` policy attribute in ``sysfs``. Then,
it is responsible for the configuration of policy objects corresponding to
CPUs and provides the ``CPUFreq`` core (and the scaling governors attached
to the policy objects) with accurate information on the maximum and minimum
operating frequencies supported by the hardware. Users can check the
``scaling_cur_freq`` information, which comes from the ``CPUFreq`` core.

``amd-pstate`` mainly supports ``schedutil`` and ``ondemand`` for dynamic
frequency control. The processor configuration in ``amd-pstate`` is fine-tuned
for ``schedutil`` with the CPU CFS scheduler. ``amd-pstate``
registers the ``adjust_perf`` callback to implement performance update behavior
similar to CPPC. It is initialized by ``sugov_start`` and then populates the
CPU's ``update_util_data`` pointer to assign ``sugov_update_single_perf`` as the
utilization update callback function in the CPU scheduler. The CPU scheduler
calls ``cpufreq_update_util`` and assigns the target performance according
to the ``struct sugov_cpu`` that the utilization update belongs to.
Then, ``amd-pstate`` updates the desired performance according to the target
assigned by the CPU scheduler.

.. _processor_support:

Processor Support
=======================

The ``amd-pstate`` initialization will fail if the ``_CPC`` entry in the ACPI
SBIOS does not exist in the detected processor. It uses ``acpi_cpc_valid``
to check the existence of ``_CPC``. All Zen based processors support the legacy
ACPI hardware P-States function, so when ``amd-pstate`` fails initialization,
the kernel will fall back to initialize the ``acpi-cpufreq`` driver.

There are two types of hardware implementations for ``amd-pstate``: one is
`Full MSR Support <perf_cap_>`_ and another is `Shared Memory Support
<perf_cap_>`_. The :c:macro:`X86_FEATURE_CPPC` feature flag indicates
which type is present. (For details, refer to the Processor Programming
Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors [3]_.)
``amd-pstate`` registers different ``static_call`` instances for different
hardware implementations.

Currently, some of the Zen2 and Zen3 processors support ``amd-pstate``. In the
future, it will be supported on more and more AMD processors.

Full MSR Support
-----------------

Some new Zen3 processors such as Cezanne provide the MSR registers directly
while the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is set.
``amd-pstate`` can handle the MSR register to implement the fast switch
function in ``CPUFreq`` that can reduce the latency of frequency control in
interrupt context. The functions with a ``pstate_xxx`` prefix represent the
operations on MSR registers.

Shared Memory Support
----------------------

If the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is not set, the
processor supports the shared memory solution. In this case, ``amd-pstate``
uses the ``cppc_acpi`` helper methods to implement the callback functions
that are defined on ``static_call``.
The functions with the ``cppc_xxx`` prefix
represent the operations of ACPI CPPC helpers for the shared memory solution.


AMD P-States and ACPI hardware P-States can always both be supported in one
processor. AMD P-States has the higher priority: if it is enabled
with :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, the hardware will
respond to the requests from AMD P-States.


User Space Interface in ``sysfs`` - Per-policy control
======================================================

``amd-pstate`` exposes several per-policy attributes (files) in ``sysfs`` to
control its functionality for each policy. They are located in the
``/sys/devices/system/cpu/cpufreq/policyX/`` directory and affect the CPUs
that belong to that policy. ::

 root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_hw_prefcore
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_freq
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_count
 /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_prefcore_ranking


``amd_pstate_highest_perf / amd_pstate_max_freq``

Maximum CPPC performance and CPU frequency that the driver is allowed to
set, in percent of the maximum supported CPPC performance level (the highest
performance supported in `AMD CPPC Performance Capability <perf_cap_>`_).
In some ASICs, the highest CPPC performance is not the one in the ``_CPC``
table, so we need to expose it to sysfs. If boost is not active, but
still supported, this maximum frequency will be larger than the one in
``cpuinfo``.
This attribute is read-only.

``amd_pstate_lowest_nonlinear_freq``

The lowest non-linear CPPC CPU frequency that the driver is allowed to set,
in percent of the maximum supported CPPC performance level. (Please see the
lowest non-linear performance in `AMD CPPC Performance Capability
<perf_cap_>`_.)
This attribute is read-only.

``amd_pstate_hw_prefcore``

Whether the platform supports the preferred core feature and it has
been enabled. This attribute is read-only. This file is only visible
on platforms which support the preferred core feature.

``amd_pstate_prefcore_ranking``

The performance ranking of the core. This number doesn't have any unit, but
larger numbers are preferred at the time of reading. This can change at
runtime based on platform conditions. This attribute is read-only. This file
is only visible on platforms which support the preferred core feature.

``amd_pstate_floor_freq``

The floor frequency associated with each CPU. Userspace can write any
value between ``cpuinfo_min_freq`` and ``scaling_max_freq`` into this
file. When the system is under power or thermal constraints, the
platform firmware will attempt to throttle the CPU frequency to the
value specified in ``amd_pstate_floor_freq`` before throttling it
further. This allows userspace to specify different floor frequencies
to different CPUs. For optimal results, threads of the same core
should have the same floor frequency value. This file is only visible
on platforms that support the CPPC Performance Priority feature.


``amd_pstate_floor_count``

The number of distinct Floor Performance levels supported by the
platform.
For example, if this value is 2, then the number of unique
values obtained from the command ``cat
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_freq |
sort -n | uniq`` should be at most this number for the behavior
described in ``amd_pstate_floor_freq`` to take effect. A zero value
implies that the platform supports unlimited floor performance levels.
This file is only visible on platforms that support the CPPC
Performance Priority feature.

**Note**: When ``amd_pstate_floor_count`` is non-zero, the frequency to
which the CPU is throttled under power or thermal constraints is
undefined when the number of unique values of ``amd_pstate_floor_freq``
across all CPUs in the system exceeds ``amd_pstate_floor_count``.

``energy_performance_available_preferences``

A list of all the supported EPP preferences that could be used for
``energy_performance_preference`` on this system.
These profiles represent different hints that are provided
to the low-level firmware about the user's desired performance vs. energy
efficiency tradeoff. ``default`` indicates that the EPP value is set by platform
firmware. ``custom`` designates that integer values 0-255 may be written
as well. This attribute is read-only.

``energy_performance_preference``

The current energy performance preference can be read from this attribute,
and users can change the current preference according to energy or performance
needs. Coarse named profiles are available in the attribute
``energy_performance_available_preferences``.
Users can also write individual integer values between 0 and 255.
When dynamic EPP is enabled, writes to ``energy_performance_preference`` are blocked
even when the EPP feature is enabled by platform firmware. Lower EPP values shift the bias
towards improved performance while higher EPP values shift the bias towards
power savings. The exact impact can change from one platform to the other.
If a valid integer was last written, then a number will be returned on future reads.
If a valid string was last written, then a string will be returned on future reads.
This attribute is read-write.

``boost``

The ``boost`` sysfs attribute provides control over the CPU core
performance boost, allowing users to manage the maximum frequency limitation
of the CPU. This attribute can be used to enable or disable the boost feature
on individual CPUs.

When the boost feature is enabled, the CPU can dynamically increase its frequency
beyond the base frequency, providing enhanced performance for demanding workloads.
On the other hand, disabling the boost feature restricts the CPU to operate at the
base frequency, which may be desirable in certain scenarios to prioritize power
efficiency or manage temperature.

To manipulate the ``boost`` attribute, users can write a value of ``0`` to disable the
boost or ``1`` to enable it, for the respective CPU using the sysfs path
``/sys/devices/system/cpu/cpuX/cpufreq/boost``, where ``X`` represents the CPU number.

Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.

Dynamic energy performance profile
==================================

The ``amd-pstate`` driver supports dynamically selecting the energy performance
profile based on whether the machine is running on AC or DC power.

Whether this behavior is enabled by default depends on the kernel
config option ``CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP``. This behavior can also be overridden
at runtime by the sysfs file ``/sys/devices/system/cpu/cpufreq/policyX/dynamic_epp``.

When set to enabled, the driver will select a different energy performance
profile depending on whether the machine is running on battery or AC power.
The driver will
also register with the platform profile handler to receive notifications of
user desired power state and react to those.
When set to disabled, the driver will not change the energy performance profile
based on the power source and will not react to user desired power state.

Attempting to manually write to the ``energy_performance_preference`` sysfs
file will fail when ``dynamic_epp`` is enabled.

``amd-pstate`` vs ``acpi-cpufreq``
======================================

On the majority of AMD platforms supported by ``acpi-cpufreq``, the ACPI tables
provided by the platform firmware are used for CPU performance scaling, but
only provide 3 P-states on AMD processors.
However, on modern AMD APU and CPU series, hardware provides the Collaborative
Processor Performance Control according to the ACPI protocol and customizes this
for AMD platforms. That is, it provides fine-grained and continuous frequency ranges
instead of the legacy hardware P-states. ``amd-pstate`` is the kernel
module which supports the new AMD P-States mechanism on most of the future AMD
platforms. The AMD P-States mechanism is the more performance- and
energy-efficient frequency management method on AMD processors.


``amd-pstate`` Driver Operation Modes
======================================

``amd_pstate`` CPPC has 3 operation modes: autonomous (active) mode,
non-autonomous (passive) mode and guided autonomous (guided) mode.
Active/passive/guided mode can be chosen by different kernel parameters.

- In autonomous mode, the platform ignores the desired performance level request
  and takes into account only the values set to the minimum, maximum and energy
  performance preference registers.
- In non-autonomous mode, the platform gets the desired performance level
  from the OS directly through the Desired Performance Register.
- In guided-autonomous mode, the platform sets the operating performance level
  autonomously according to the current workload and within the limits set by
  the OS through the min and max performance registers.

Active Mode
------------

``amd_pstate=active``

This is the low-level firmware control mode which is implemented by the ``amd_pstate_epp``
driver with ``amd_pstate=active`` passed to the kernel in the command line.
In this mode, the ``amd_pstate_epp`` driver provides a hint to the CPPC firmware if software
wants to bias toward performance (0x0) or energy efficiency (0xff).
The CPPC power algorithm then calculates the runtime workload and adjusts the real-time
core frequency according to the power supply, thermal conditions, core voltage and some other
hardware conditions.

Passive Mode
------------

``amd_pstate=passive``

It will be enabled if ``amd_pstate=passive`` is passed to the kernel in the command line.
In this mode, the ``amd_pstate`` driver software specifies a desired QoS target in the CPPC
performance scale as a relative number. This can be expressed as a percentage of nominal
performance (infrastructure max). Below the nominal sustained performance level,
desired performance expresses the average performance level of the processor subject
to the Performance Reduction Tolerance register. Above the nominal performance level,
the processor must provide at least nominal performance requested and go higher if current
operating conditions allow.

Guided Mode
-----------

``amd_pstate=guided``

This mode is activated if ``amd_pstate=guided`` is passed on the kernel command
line. In this mode, the driver requests minimum and maximum performance
levels and the platform autonomously selects a performance level in this range
that is appropriate to the current workload.
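
As a compact summary of the three modes above, the following Python sketch
records which CPPC control inputs the text says the platform takes into
account in each mode, and builds the matching kernel command line parameter.
The names ``HONORED_CONTROLS``, ``kernel_parameter`` and ``platform_honors``
are hypothetical helpers for illustration only; they are not part of the
kernel or of any tool, and the table only restates what this section says.

```python
# Hypothetical summary of the operation modes described in this section.
# Keys are the values accepted by the amd_pstate= kernel parameter; the
# sets list the CPPC inputs that the text says the platform honors.
HONORED_CONTROLS = {
    "active": {"min_perf", "max_perf", "epp"},   # desired perf is ignored
    "passive": {"desired_perf"},                 # OS supplies desired perf
    "guided": {"min_perf", "max_perf"},          # platform picks within range
}


def kernel_parameter(mode):
    """Return the kernel command line parameter selecting ``mode``."""
    if mode not in HONORED_CONTROLS:
        raise ValueError(f"unknown amd-pstate mode: {mode!r}")
    return f"amd_pstate={mode}"


def platform_honors(mode, control):
    """Tell whether, per the text above, ``control`` is used in ``mode``."""
    return control in HONORED_CONTROLS[mode]


if __name__ == "__main__":
    for mode in HONORED_CONTROLS:
        print(kernel_parameter(mode), sorted(HONORED_CONTROLS[mode]))
```

The table is deliberately limited to the inputs named in the mode
descriptions; real hardware behavior may involve further registers.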

``amd-pstate`` Preferred Core
=================================

The core frequency is subjected to the process variation in semiconductors.
Not all cores are able to reach the maximum frequency respecting the
infrastructure limits. Consequently, AMD has redefined the concept of
maximum frequency of a part. This means that a fraction of cores can reach
maximum frequency. To find the best process scheduling policy for a given
scenario, the OS needs to know the core ordering informed by the platform through
the highest performance capability register of the CPPC interface.

``amd-pstate`` preferred core enables the scheduler to prefer scheduling on
cores that can achieve a higher frequency with lower voltage. The preferred
core rankings can dynamically change based on the workload, platform conditions,
thermals and ageing.

The priority metric will be initialized by the ``amd-pstate`` driver. The ``amd-pstate``
driver will also determine whether or not ``amd-pstate`` preferred core is
supported by the platform.

The ``amd-pstate`` driver will provide an initial core ordering when the system boots.
The platform uses the CPPC interfaces to communicate the core ranking to the
operating system and scheduler to make sure that the OS chooses the cores
with the highest performance first when scheduling a process. When the ``amd-pstate``
driver receives a message with the highest performance change, it will
update the core ranking and set the CPU's priority.

``amd-pstate`` Preferred Core Switch
=====================================

Kernel Parameters
-----------------

``amd-pstate`` preferred core has two states: enable and disable.
Enable/disable states can be chosen by different kernel parameters.
``amd-pstate`` preferred core is enabled by default.

``amd_prefcore=disable``

For systems that support ``amd-pstate`` preferred core, the core rankings will
always be advertised by the platform. But the OS can choose to ignore that via the
kernel parameter ``amd_prefcore=disable``.

``amd_dynamic_epp``

When AMD pstate is in auto mode, dynamic EPP will control whether the kernel
autonomously changes the EPP mode. The default is configured by
``CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`` but can be explicitly enabled with
``amd_dynamic_epp=enable`` or disabled with ``amd_dynamic_epp=disable``.

User Space Interface in ``sysfs`` - General
===========================================

Global Attributes
-----------------

``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
control its functionality at the system level. They are located in the
``/sys/devices/system/cpu/amd_pstate/`` directory and affect all CPUs.

``status``
        Operation mode of the driver: "active", "passive", "guided" or "disable".

        "active"
                The driver is functional and in the ``active mode``

        "passive"
                The driver is functional and in the ``passive mode``

        "guided"
                The driver is functional and in the ``guided mode``

        "disable"
                The driver is unregistered and not functional now.

        This attribute can be written to in order to change the driver's
        operation mode or to unregister it. The string written to it must be
        one of its possible values and, if successful, writing one of
        these values to the sysfs file will cause the driver to switch over
        to the operation mode represented by that string - or to be
        unregistered in the "disable" case.

``prefcore``
        Preferred core state of the driver: "enabled" or "disabled".

        "enabled"
                Enable the ``amd-pstate`` preferred core.

        "disabled"
                Disable the ``amd-pstate`` preferred core.


        This attribute is read-only to check the state of preferred core set
        by the kernel parameter.

``cpupower`` tool support for ``amd-pstate``
===============================================

``amd-pstate`` is supported by the ``cpupower`` tool, which can be used to dump
frequency information. Development is in progress to support more and more
operations for the new ``amd-pstate`` module with this tool. ::

 root@hr-test1:/home/ray# cpupower frequency-info
 analyzing CPU 0:
   driver: amd-pstate
   CPUs which run at the same hardware frequency: 0
   CPUs which need to have their frequency coordinated by software: 0
   maximum transition latency: 131 us
   hardware limits: 400 MHz - 4.68 GHz
   available cpufreq governors: ondemand conservative powersave userspace performance schedutil
   current policy: frequency should be within 400 MHz and 4.68 GHz.
                   The governor "schedutil" may decide which speed to use
                   within this range.
   current CPU frequency: Unable to call hardware
   current CPU frequency: 4.02 GHz (asserted by call to kernel)
   boost state support:
     Supported: yes
     Active: yes
     AMD PSTATE Highest Performance: 166. Maximum Frequency: 4.68 GHz.
     AMD PSTATE Nominal Performance: 117. Nominal Frequency: 3.30 GHz.
     AMD PSTATE Lowest Non-linear Performance: 39. Lowest Non-linear Frequency: 1.10 GHz.
     AMD PSTATE Lowest Performance: 15. Lowest Frequency: 400 MHz.


Diagnostics and Tuning
=======================

Trace Events
--------------

There are two static trace events that can be used for ``amd-pstate``
diagnostics. One of them is the ``cpu_frequency`` trace event generally used
by ``CPUFreq``, and the other one is the ``amd_pstate_perf`` trace event
specific to ``amd-pstate``.
The following sequence of shell commands can
be used to enable them and see their output (if the kernel is
configured to support event tracing). ::

 root@hr-test1:/home/ray# cd /sys/kernel/tracing/
 root@hr-test1:/sys/kernel/tracing# echo 1 > events/amd_cpu/enable
 root@hr-test1:/sys/kernel/tracing# cat trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 47827/42233061   #P:2
 #
 #                                _-----=> irqs-off
 #                               / _----=> need-resched
 #                              | / _---=> hardirq/softirq
 #                              || / _--=> preempt-depth
 #                              ||| /     delay
 #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
 #              | |         |   ||||      |         |
          <idle>-0       [015] dN...  4995.979886: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=15 changed=false fast_switch=true
          <idle>-0       [007] d.h..  4995.979893: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
             cat-2161    [000] d....  4995.980841: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=0 changed=false fast_switch=true
            sshd-2125    [004] d.s..  4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=4 changed=false fast_switch=true
          <idle>-0       [007] d.s..  4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
          <idle>-0       [003] d.s..  4995.980971: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=3 changed=false fast_switch=true
          <idle>-0       [011] d.s..  4995.980996: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=11 changed=false fast_switch=true

The ``cpu_frequency`` trace event will be triggered either by the ``schedutil`` scaling
governor (for the policies it is attached to), or by the ``CPUFreq`` core (for the
policies with other scaling governors).
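
For quick post-processing of such a trace, the ``key=value`` fields of an
``amd_pstate_perf`` line can be split out with a few lines of Python. This is
a minimal sketch that assumes the line layout shown in the sample output
above; the function name ``parse_amd_pstate_perf`` is hypothetical and the
field set may differ between kernel versions.

```python
import re

# Matches the event name and captures everything after it on the line.
# Assumes the "amd_pstate_perf: key=value ..." layout of the sample trace.
TRACE_RE = re.compile(r"amd_pstate_perf:\s+(?P<fields>.*)$")


def parse_amd_pstate_perf(line):
    """Return the key=value fields of one trace line as a dict, or None."""
    match = TRACE_RE.search(line)
    if match is None:
        return None
    record = {}
    for token in match.group("fields").split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip anything that is not a key=value pair
        if value in ("true", "false"):
            record[key] = value == "true"
        elif value.isdigit():
            record[key] = int(value)
        else:
            record[key] = value
    return record


if __name__ == "__main__":
    sample = ("<idle>-0 [015] dN... 4995.979886: amd_pstate_perf: "
              "amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 "
              "cpu_id=15 changed=false fast_switch=true")
    print(parse_amd_pstate_perf(sample))
```

Lines that do not contain the ``amd_pstate_perf`` event, such as the ``#``
header lines, yield ``None`` and can simply be skipped by the caller.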


Tracer Tool
-------------

``amd_pstate_trace.py`` can record and parse the ``amd-pstate`` trace log, then
generate performance plots. This utility can be used to debug and tune the
performance of the ``amd-pstate`` driver. The tracer tool needs to import the
Intel P-state tracer.

The tracer tool is located in ``linux/tools/power/x86/amd_pstate_tracer``. It can be
used in two ways. If a trace file is available, then directly parse the file
with the command ::

 ./amd_pstate_trace.py [-c cpus] -t <trace_file> -n <test_name>

Or generate a trace file with root privilege, then parse and plot with the command ::

 sudo ./amd_pstate_trace.py [-c cpus] -n <test_name> -i <interval> [-m kbytes]

The test result can be found in ``results/test_name``. The following is an example
of part of the output. ::

 common_cpu  common_secs  common_usecs  min_perf  des_perf  max_perf  freq    mperf   apef    tsc       load   duration_ms  sample_num  elapsed_time  common_comm
 CPU_005     712          116384        39        49        166       0.7565  9645075 2214891 38431470  25.1   11.646       469         2.496         kworker/5:0-40
 CPU_006     712          116408        39        49        166       0.6769  8950227 1839034 37192089  24.06  11.272       470         2.496         kworker/6:0-1264

Unit Tests for amd-pstate
-------------------------

``amd-pstate-ut`` is a test module for testing the ``amd-pstate`` driver.

 * It can help all users to verify their processor support (SBIOS/Firmware or Hardware).

 * The kernel can have a basic function test to avoid regressions during updates.

 * More functional or performance tests can be introduced to align the results, which will benefit power and performance scale optimization.

1. Test case descriptions

    1). Basic tests

        Test prerequisite and basic functions for the ``amd-pstate`` driver.

        +---------+--------------------------------+------------------------------------------------------------------------------------+
        | Index   | Functions                      | Description                                                                        |
        +=========+================================+====================================================================================+
        | 1       | amd_pstate_ut_acpi_cpc_valid   || Check whether the _CPC object is present in SBIOS.                                |
        |         |                                ||                                                                                   |
        |         |                                || For details, refer to `Processor Support <processor_support_>`_.                  |
        +---------+--------------------------------+------------------------------------------------------------------------------------+
        | 2       | amd_pstate_ut_check_enabled    || Check whether AMD P-State is enabled.                                             |
        |         |                                ||                                                                                   |
        |         |                                || AMD P-States and ACPI hardware P-States can always be supported in one processor. |
        |         |                                | But AMD P-States has the higher priority and if it is enabled with                 |
        |         |                                | :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, it will respond to the      |
        |         |                                | request from AMD P-States.                                                         |
        +---------+--------------------------------+------------------------------------------------------------------------------------+
        | 3       | amd_pstate_ut_check_perf       || Check whether each performance value is reasonable:                               |
        |         |                                || highest_perf >= nominal_perf > lowest_nonlinear_perf > lowest_perf > 0.           |
        +---------+--------------------------------+------------------------------------------------------------------------------------+
        | 4       | amd_pstate_ut_check_freq       || Check whether each frequency value, and the maximum frequency when boost is       |
        |         |                                | supported, is reasonable:                                                          |
        |         |                                || max_freq >= nominal_freq > lowest_nonlinear_freq > min_freq > 0.                  |
        |         |                                || If boost is not active but supported, this maximum frequency will be larger than  |
        |         |                                | the one in ``cpuinfo``.                                                            |
        +---------+--------------------------------+------------------------------------------------------------------------------------+

    2). Tbench test

        Test and monitor the CPU changes when running the tbench benchmark under the specified governor.
        These changes include desired performance, frequency, load, performance, energy etc.
        The specified governor is ondemand or schedutil.
        Tbench can also be tested on the ``acpi-cpufreq`` kernel driver for comparison.

    3). Gitsource test

        Test and monitor the CPU changes when running the gitsource benchmark under the specified governor.
        These changes include desired performance, frequency, load, time, energy etc.
        The specified governor is ondemand or schedutil.
        Gitsource can also be tested on the ``acpi-cpufreq`` kernel driver for comparison.

#. How to execute the tests

   The tests are implemented as a test module in the kselftest framework.
   The ``amd-pstate-ut`` module is tied into kselftest (for
   details refer to Linux Kernel Selftests [4]_).

    1). Build

        + enable the :c:macro:`CONFIG_X86_AMD_PSTATE` configuration option.
        + set the :c:macro:`CONFIG_X86_AMD_PSTATE_UT` configuration option to M.
        + make project
        + make selftest ::

            $ cd linux
            $ make -C tools/testing/selftests

        + make perf ::

            $ cd tools/perf/
            $ make


    2). Installation & Steps ::

        $ make -C tools/testing/selftests install INSTALL_PATH=~/kselftest
        $ cp tools/perf/perf /usr/bin/perf
        $ sudo ./kselftest/run_kselftest.sh -c amd-pstate

    3). Specified test case ::

        $ cd ~/kselftest/amd-pstate
        $ sudo ./run.sh -c basic
        $ sudo ./run.sh -c tbench
        $ sudo ./run.sh -c tbench -m acpi-cpufreq
        $ sudo ./run.sh -c gitsource
        $ sudo ./run.sh -c gitsource -m acpi-cpufreq
        $ ./run.sh --help
        ./run.sh: illegal option -- -
        Usage: ./run.sh [OPTION...]
725 [-h <help>] 726 [-o <output-file-for-dump>] 727 [-c <all: All testing, 728 basic: Basic testing, 729 tbench: Tbench testing, 730 gitsource: Gitsource testing.>] 731 [-t <tbench time limit>] 732 [-p <tbench process number>] 733 [-l <loop times for tbench>] 734 [-i <amd tracer interval>] 735 [-m <comparative test: acpi-cpufreq>] 736 737 738 4). Results 739 740 + basic 741 742 When you finish test, you will get the following log info :: 743 744 $ dmesg | grep "amd_pstate_ut" | tee log.txt 745 [12977.570663] amd_pstate_ut: 1 amd_pstate_ut_acpi_cpc_valid success! 746 [12977.570673] amd_pstate_ut: 2 amd_pstate_ut_check_enabled success! 747 [12977.571207] amd_pstate_ut: 3 amd_pstate_ut_check_perf success! 748 [12977.571212] amd_pstate_ut: 4 amd_pstate_ut_check_freq success! 749 750 + tbench 751 752 When you finish test, you will get selftest.tbench.csv and png images. 753 The selftest.tbench.csv file contains the raw data and the drop of the comparative test. 754 The png images shows the performance, energy and performan per watt of each test. 
          Open selftest.tbench.csv :

          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | Governor                                        | Round        | Des-perf | Freq    | Load     | Performance | Energy  | Performance Per Watt |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | Unit                                            |              |          | GHz     |          | MB/s        | J       | MB/J                 |
          +=================================================+==============+==========+=========+==========+=============+=========+======================+
          | amd-pstate-ondemand                             | 1            |          |         |          | 2504.05     | 1563.67 | 158.5378             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand                             | 2            |          |         |          | 2243.64     | 1430.32 | 155.2941             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand                             | 3            |          |         |          | 2183.88     | 1401.32 | 154.2860             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand                             | Average      |          |         |          | 2310.52     | 1465.1  | 156.1268             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | 1            | 165.329  | 1.62257 | 99.798   | 2136.54     | 1395.26 | 151.5971             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | 2            | 166      | 1.49761 | 99.9993  | 2100.56     | 1380.5  | 150.6377             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | 3            | 166      | 1.47806 | 99.9993  | 2084.12     | 1375.76 | 149.9737             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | Average      | 165.776  | 1.53275 | 99.9322  | 2107.07     | 1383.84 | 150.7399             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | 1            |          |         |          | 2529.9      | 1564.4  | 160.0997             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | 2            |          |         |          | 2249.76     | 1432.97 | 155.4297             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | 3            |          |         |          | 2181.46     | 1406.88 | 153.5060             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | Average      |          |         |          | 2320.37     | 1468.08 | 156.4741             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | 1            |          |         |          | 2137.64     | 1385.24 | 152.7723             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | 2            |          |         |          | 2107.05     | 1372.23 | 152.0138             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | 3            |          |         |          | 2085.86     | 1365.35 | 151.2433             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | Average      |          |         |          | 2110.18     | 1374.27 | 152.0136             |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand VS acpi-cpufreq-schedutil | Comprison(%) |          |         |          | -9.0584     | -6.3899 | -2.8506              |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand VS amd-pstate-schedutil     | Comprison(%) |          |         |          | -8.8053     | -5.5463 | -3.4503              |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand VS amd-pstate-ondemand    | Comprison(%) |          |         |          | -0.4245     | -0.2029 | -0.2219              |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil VS amd-pstate-schedutil  | Comprison(%) |          |         |          | -0.1473     | 0.6963  | -0.8378              |
          +-------------------------------------------------+--------------+----------+---------+----------+-------------+---------+----------------------+

        + gitsource

          When the test finishes, you will get selftest.gitsource.csv and png images.
          The selftest.gitsource.csv file contains the raw data and the comparison (percentage change) between the tested configurations.
          The png images show the performance, energy and performance per watt of each test.
          Open selftest.gitsource.csv :

          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | Governor                                        | Round        | Des-perf | Freq     | Load     | Time        | Energy  | Performance Per Watt |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | Unit                                            |              |          | GHz      |          | s           | J       | 1/J                  |
          +=================================================+==============+==========+==========+==========+=============+=========+======================+
          | amd-pstate-ondemand                             | 1            | 50.119   | 2.10509  | 23.3076  | 475.69      | 865.78  | 0.001155027          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand                             | 2            | 94.8006  | 1.98771  | 56.6533  | 467.1       | 839.67  | 0.001190944          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand                             | 3            | 76.6091  | 2.53251  | 43.7791  | 467.69      | 855.85  | 0.001168429          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand                             | Average      | 73.8429  | 2.20844  | 41.2467  | 470.16      | 853.767 | 0.001171279          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | 1            | 165.919  | 1.62319  | 98.3868  | 464.17      | 866.8   | 0.001153668          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | 2            | 165.97   | 1.31309  | 99.5712  | 480.15      | 880.4   | 0.001135847          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | 3            | 165.973  | 1.28448  | 99.9252  | 481.79      | 867.02  | 0.001153375          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-schedutil                            | Average      | 165.954  | 1.40692  | 99.2944  | 475.37      | 871.407 | 0.001147569          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | 1            |          |          |          | 379.62      | 742.96  | 0.001345967          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | 2            |          |          |          | 441.74      | 817.49  | 0.001223256          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | 3            |          |          |          | 455.48      | 820.01  | 0.001219497          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand                           | Average      |          |          |          | 425.613     | 793.487 | 0.001260260          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | 1            |          |          |          | 459.69      | 838.54  | 0.001192548          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | 2            |          |          |          | 466.55      | 830.89  | 0.001203528          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | 3            |          |          |          | 470.38      | 837.32  | 0.001194286          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil                          | Average      |          |          |          | 465.54      | 835.583 | 0.001196769          |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand VS acpi-cpufreq-schedutil | Comprison(%) |          |          |          | 9.3810      | 5.3051  | -5.0379              |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | amd-pstate-ondemand VS amd-pstate-schedutil     | Comprison(%) | 124.7392 | -36.2934 | 140.7329 | 1.1081      | 2.0661  | -2.0242              |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-ondemand VS amd-pstate-ondemand    | Comprison(%) |          |          |          | 10.4665     | 7.5968  | -7.0605              |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+
          | acpi-cpufreq-schedutil VS amd-pstate-schedutil  | Comprison(%) |          |          |          | 2.1115      | 4.2873  | -4.1110              |
          +-------------------------------------------------+--------------+----------+----------+----------+-------------+---------+----------------------+

Reference
===========

.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming,
       https://docs.amd.com/v/u/en-US/24593_3.44_APM_Vol2

.. [2] Advanced Configuration and Power Interface Specification,
       https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf

.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors,
       https://docs.amd.com/v/u/en-US/56569-A1-PUB_3.03

.. [4] Linux Kernel Selftests,
       https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html