.. SPDX-License-Identifier: GPL-2.0

=======================
Energy Model of devices
=======================

1. Overview
-----------

The Energy Model (EM) framework serves as an interface between drivers that know
the power consumed by devices at various performance levels, and the kernel
subsystems that want to use that information to make energy-aware decisions.

The source of the information about the power consumed by devices can vary greatly
from one platform to another. These power costs can be estimated using
devicetree data in some cases. In others, the firmware will know better.
Alternatively, userspace might be best positioned. To avoid having every client
subsystem re-implement support for every possible source of information on its
own, the EM framework acts as an abstraction layer which standardizes the
format of power cost tables in the kernel, thereby avoiding redundant work.

The figure below depicts an example of drivers (Arm-specific here, but the
approach is applicable to any architecture) providing power costs to the EM
framework, and interested clients reading the data from it::

       +---------------+  +-----------------+  +---------------+
       | Thermal (IPA) |  | Scheduler (EAS) |  |     Other     |
       +---------------+  +-----------------+  +---------------+
               |                   | em_cpu_energy()   |
               |                   | em_cpu_get()      |
               +---------+         |         +---------+
                         |         |         |
                         v         v         v
                        +---------------------+
                        |    Energy Model     |
                        |     Framework       |
                        +---------------------+
                           ^       ^       ^
                           |       |       | em_dev_register_perf_domain()
                +----------+       |       +---------+
                |                  |                 |
        +---------------+  +---------------+  +--------------+
        |  cpufreq-dt   |  |   arm_scmi    |  |    Other     |
        +---------------+  +---------------+  +--------------+
                ^                  ^                 ^
                |                  |                 |
        +--------------+   +---------------+  +--------------+
        | Device Tree  |   |   Firmware    |  |      ?       |
        +--------------+   +---------------+  +--------------+

For CPU devices, the EM framework manages power cost tables per
'performance domain' in the system. A performance domain is a group of CPUs
whose performance is scaled together. Performance domains generally have a
1-to-1 mapping with CPUFreq policies. All CPUs in a performance domain are
required to have the same micro-architecture. CPUs in different performance
domains can have different micro-architectures.


2. Core APIs
------------

2.1 Config options
^^^^^^^^^^^^^^^^^^

CONFIG_ENERGY_MODEL must be enabled to use the EM framework.


2.2 Registration of performance domains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Drivers are expected to register performance domains into the EM framework by
calling the following API::

  int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
		struct em_data_callback *cb, cpumask_t *cpus);

Drivers must provide a callback function returning <frequency, power> tuples
for each performance state. The callback function provided by the driver is free
to fetch data from any relevant location (DT, firmware, ...), and by any means
deemed necessary. For CPU devices only, drivers must specify the CPUs of the
performance domain using a cpumask. For devices other than CPUs, the last
argument must be set to NULL.
See Section 3. for an example of a driver implementing this
callback, and kernel/power/energy_model.c for further documentation on this
API.

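To make the callback contract above more concrete, the following is a minimal
userspace sketch of how a framework could build a power cost table by
repeatedly invoking a driver-provided callback. It is not the kernel
implementation: every name prefixed with demo_ is hypothetical, the operating
points are made-up values, and the loop that asks for "the next frequency
above the previous one" is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
struct demo_state {
	unsigned long khz;	/* frequency of the performance state */
	unsigned long mw;	/* power consumed at that frequency */
};

typedef int (*demo_power_cb)(unsigned long *mw, unsigned long *khz, void *ctx);

/* Made-up operating points, sorted by ascending frequency. */
static const struct demo_state demo_opps[] = {
	{  500000, 100 },
	{ 1000000, 300 },
	{ 1500000, 700 },
};

/* Callback: round *khz up to the next available point and report its power. */
static int demo_est_power(unsigned long *mw, unsigned long *khz, void *ctx)
{
	size_t i;

	(void)ctx;	/* unused in this sketch */
	for (i = 0; i < sizeof(demo_opps) / sizeof(demo_opps[0]); i++) {
		if (demo_opps[i].khz >= *khz) {
			*khz = demo_opps[i].khz;
			*mw = demo_opps[i].mw;
			return 0;
		}
	}
	return -1;	/* no operating point at or above the request */
}

/*
 * Fill 'table' with 'nr_states' <frequency, power> tuples by querying the
 * callback with strictly increasing frequency requests.
 * Returns 0 on success, the callback's error code if it fails, or -2 if the
 * callback returns a non-increasing frequency or a zero power value.
 */
static int demo_build_table(struct demo_state *table, size_t nr_states,
			    demo_power_cb cb, void *ctx)
{
	unsigned long khz = 0, mw = 0, prev_khz = 0;
	size_t i;
	int ret;

	assert(table != NULL && cb != NULL);

	for (i = 0; i < nr_states; i++) {
		khz++;	/* request the first state above the previous one */
		ret = cb(&mw, &khz, ctx);
		if (ret)
			return ret;
		if (khz <= prev_khz || mw == 0)
			return -2;
		table[i].khz = khz;
		table[i].mw = mw;
		prev_khz = khz;
	}
	return 0;
}
```

With the three made-up operating points, requesting three states fills the
table in ascending frequency order. Requesting a fourth state makes the
callback fail, and that error is propagated to the caller.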
88
892.3 Accessing performance domains
90^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
91
There are two API functions which provide access to the energy model:
em_cpu_get(), which takes a CPU id as an argument, and em_pd_get(), which takes
a device pointer as an argument. Which interface to use is up to the
subsystem, but for CPU devices both functions return the same
performance domain.

Subsystems interested in the energy model of a CPU can retrieve it using the
em_cpu_get() API. The energy model tables are allocated once upon creation of
the performance domains, and kept in memory untouched.

The energy consumed by a performance domain can be estimated using the
em_cpu_energy() API. For CPU devices, the estimation is performed assuming
that the schedutil CPUfreq governor is in use. Currently this calculation is
not provided for other types of devices.

More details about the above APIs can be found in include/linux/energy_model.h.

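As a rough illustration of what an energy estimate over a power cost table
can look like, here is a simplified userspace sketch. It is a model under
stated assumptions and not the em_cpu_energy() implementation: the demo_
names are hypothetical, the 25% frequency headroom is assumed to mimic a
schedutil-like governor, and the result is expressed in abstract units. See
include/linux/energy_model.h for the authoritative definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of a power cost table. */
struct demo_perf_state {
	unsigned long khz;	/* frequency of the performance state */
	unsigned long mw;	/* power consumed at that frequency */
};

/*
 * Estimate the energy of a performance domain.
 *
 * table:     states sorted by ascending frequency
 * max_util:  highest utilization among the CPUs of the domain
 * sum_util:  sum of the utilization of all CPUs of the domain
 * capacity:  capacity of one CPU at the highest frequency
 *
 * Assumption: the governor requests 1.25 * fmax * max_util / capacity, and
 * the domain then runs at the lowest state able to satisfy that request.
 */
static unsigned long demo_energy(const struct demo_perf_state *table,
				 size_t nr_states, unsigned long max_util,
				 unsigned long sum_util, unsigned long capacity)
{
	const struct demo_perf_state *top, *ps;
	unsigned long target_khz, cost;
	size_t i;

	assert(table != NULL && nr_states > 0 && capacity > 0);

	top = &table[nr_states - 1];
	ps = top;	/* fall back to the highest state */

	/* Frequency request with 25% headroom above the utilization. */
	target_khz = (top->khz + (top->khz >> 2)) * max_util / capacity;

	/* Pick the lowest state whose frequency covers the request. */
	for (i = 0; i < nr_states; i++) {
		if (table[i].khz >= target_khz) {
			ps = &table[i];
			break;
		}
	}

	/*
	 * Running slower stretches the busy time by fmax / f, so the cost of
	 * a unit of work at this state is power * fmax / f.
	 */
	cost = top->khz * ps->mw / ps->khz;

	return cost * sum_util / capacity;
}
```

In this model a lightly loaded domain is charged at a low performance state,
while a domain whose busiest CPU is close to full capacity is charged at the
highest state, which is why spreading or packing tasks changes the estimate.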

3. Example driver
-----------------

This section provides a simple example of a CPUFreq driver registering a
performance domain in the Energy Model framework using the (fake) 'foo'
protocol. The driver implements an est_power() function to be provided to the
EM framework::

  -> drivers/cpufreq/foo_cpufreq.c

	static int est_power(unsigned long *mW, unsigned long *KHz,
			struct device *dev)
	{
		long freq, power;

		/* Use the 'foo' protocol to ceil the frequency */
		freq = foo_get_freq_ceil(dev, *KHz);
		if (freq < 0)
			return freq;

		/* Estimate the power cost for the dev at the relevant freq. */
		power = foo_estimate_power(dev, freq);
		if (power < 0)
			return power;

		/* Return the values to the EM framework */
		*mW = power;
		*KHz = freq;

		return 0;
	}

	static int foo_cpufreq_init(struct cpufreq_policy *policy)
	{
		struct em_data_callback em_cb = EM_DATA_CB(est_power);
		struct device *cpu_dev;
		int nr_opp, ret;

		cpu_dev = get_cpu_device(cpumask_first(policy->cpus));

		/* Do the actual CPUFreq init work ... */
		ret = do_foo_cpufreq_init(policy);
		if (ret)
			return ret;

		/* Find the number of OPPs for this policy */
		nr_opp = foo_get_nr_opp(policy);

		/* And register the new performance domain */
		em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus);

		return 0;
	}