Commit 9bdcb44

cpufreq: schedutil: New governor based on scheduler utilization data
Add a new cpufreq scaling governor, called "schedutil", that uses scheduler-provided CPU utilization information as input for making its decisions.

Doing that is possible after commit 34e2c55 (cpufreq: Add mechanism for registering utilization update callbacks) that introduced cpufreq_update_util() called by the scheduler on utilization changes (from CFS) and RT/DL task status updates. In particular, CPU frequency scaling decisions may be based on the utilization data passed to cpufreq_update_util() by CFS.

The new governor is relatively simple.

The frequency selection formula used by it depends on whether or not the utilization is frequency-invariant. In the frequency-invariant case the new CPU frequency is given by

    next_freq = 1.25 * max_freq * util / max

where util and max are the last two arguments of cpufreq_update_util(). In turn, if util is not frequency-invariant, the maximum frequency in the above formula is replaced with the current frequency of the CPU:

    next_freq = 1.25 * curr_freq * util / max

The coefficient 1.25 corresponds to the frequency tipping point at (util / max) = 0.8.

All of the computations are carried out in the utilization update handlers provided by the new governor. One of those handlers is used for cpufreq policies shared between multiple CPUs and the other one is for policies with one CPU only (and therefore it doesn't need to use any extra synchronization means).

The governor supports fast frequency switching if that is supported by the cpufreq driver in use and possible for the given policy. In the fast switching case, all operations of the governor take place in its utilization update handlers. If fast switching cannot be used, the frequency switch operations are carried out with the help of a work item which only calls __cpufreq_driver_target() (under a mutex) to trigger a frequency update (to a value already computed beforehand in one of the utilization update handlers).

Currently, the governor treats all of the RT and DL tasks as "unknown utilization" and sets the frequency to the allowed maximum when updated from the RT or DL sched classes. That heavy-handed approach should be replaced with something more subtle and specifically targeted at RT and DL tasks.

The governor shares some tunables management code with the "ondemand" and "conservative" governors and uses some common definitions from cpufreq_governor.h, but apart from that it is stand-alone.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
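
For illustration only, the two frequency selection formulas from the commit message can be sketched as a small stand-alone C function. This is not the code added by the commit: the function name sketch_next_freq(), its parameter list, and the freq_invariant flag are hypothetical, and computing the 1.25 coefficient as base + base / 4 in integer arithmetic is an assumption of this sketch. It also assumes max is non-zero and leaves clamping to the policy limits to the caller.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space sketch of the formulas in the commit message:
 *
 *   next_freq = 1.25 * max_freq  * util / max   (frequency-invariant util)
 *   next_freq = 1.25 * curr_freq * util / max   (otherwise)
 *
 * Frequencies are in kHz. The caller must pass max > 0.
 */
static unsigned int sketch_next_freq(unsigned int max_freq,
                                     unsigned int curr_freq,
                                     unsigned long util, unsigned long max,
                                     bool freq_invariant)
{
    /* Pick the base frequency depending on frequency invariance. */
    unsigned long long base = freq_invariant ? max_freq : curr_freq;

    /* Multiply by 1.25 without floating point: base + base / 4. */
    base += base / 4;

    /* Scale by the utilization/capacity ratio; the result is not clamped. */
    return (unsigned int)(base * util / max);
}
```

With max_freq = 2000000 kHz and util / max = 8 / 10, the sketch returns exactly max_freq, which is the 0.8 tipping point mentioned above. Any higher ratio yields a value above max_freq, which a real governor would have to clamp to the policy maximum.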
1 parent b7898fd commit 9bdcb44

5 files changed, 568 insertions(+), 0 deletions(-)
drivers/cpufreq/Kconfig

Lines changed: 30 additions & 0 deletions
@@ -107,6 +107,16 @@ config CPU_FREQ_DEFAULT_GOV_CONSERVATIVE
 	  Be aware that not all cpufreq drivers support the conservative
 	  governor. If unsure have a look at the help section of the
 	  driver. Fallback governor will be the performance governor.
+
+config CPU_FREQ_DEFAULT_GOV_SCHEDUTIL
+	bool "schedutil"
+	select CPU_FREQ_GOV_SCHEDUTIL
+	select CPU_FREQ_GOV_PERFORMANCE
+	help
+	  Use the 'schedutil' CPUFreq governor by default. If unsure,
+	  have a look at the help section of that governor. The fallback
+	  governor will be 'performance'.
+
 endchoice
 
 config CPU_FREQ_GOV_PERFORMANCE
@@ -188,6 +198,26 @@ config CPU_FREQ_GOV_CONSERVATIVE
 
 	  If in doubt, say N.
 
+config CPU_FREQ_GOV_SCHEDUTIL
+	tristate "'schedutil' cpufreq policy governor"
+	depends on CPU_FREQ
+	select CPU_FREQ_GOV_ATTR_SET
+	select IRQ_WORK
+	help
+	  This governor makes decisions based on the utilization data provided
+	  by the scheduler. It sets the CPU frequency to be proportional to
+	  the utilization/capacity ratio coming from the scheduler. If the
+	  utilization is frequency-invariant, the new frequency is also
+	  proportional to the maximum available frequency. If that is not the
+	  case, it is proportional to the current frequency of the CPU. The
+	  frequency tipping point is at utilization/capacity equal to 80% in
+	  both cases.
+
+	  To compile this driver as a module, choose M here: the module will
+	  be called cpufreq_schedutil.
+
+	  If in doubt, say N.
+
 comment "CPU frequency scaling drivers"
 
 config CPUFREQ_DT
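
As a rough illustration of how the two new Kconfig symbols fit together, a kernel .config that builds the governor in and makes it the default might contain the fragment below. This is a sketch derived only from the Kconfig entries in this diff; the rest of the configuration is assumed, and the comments are explanatory rather than generated output.

```
# Prerequisite stated by "depends on CPU_FREQ"
CONFIG_CPU_FREQ=y
# Choosing schedutil as the default governor ...
CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL=y
# ... selects the governor itself and the 'performance' fallback
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
```

Because CPU_FREQ_GOV_SCHEDUTIL is a tristate, it can instead be set to m to build the cpufreq_schedutil module when schedutil is not the default governor.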

kernel/sched/Makefile

Lines changed: 1 addition & 0 deletions
@@ -24,3 +24,4 @@ obj-$(CONFIG_SCHEDSTATS) += stats.o
 obj-$(CONFIG_SCHED_DEBUG) += debug.o
 obj-$(CONFIG_CGROUP_CPUACCT) += cpuacct.o
 obj-$(CONFIG_CPU_FREQ) += cpufreq.o
+obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o
