.. SPDX-License-Identifier: GPL-2.0

=====================
Theory of operation
=====================

:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Preface
=======

PREEMPT_RT transforms the Linux kernel into a real-time kernel. It achieves
this by replacing locking primitives, such as spinlock_t, with a preemptible
and priority-inheritance aware implementation known as rtmutex, and by enforcing
the use of threaded interrupts. As a result, the kernel becomes fully
preemptible, with the exception of a few critical code paths, including entry
code, the scheduler, and low-level interrupt handling routines.

This transformation places the majority of kernel execution contexts under the
control of the scheduler and significantly increases the number of preemption
points. Consequently, it reduces the latency between a high-priority task
becoming runnable and its actual execution on the CPU.

Scheduling
==========

The core principles of Linux scheduling and the associated user-space API are
documented in the man page
`sched(7) <https://man7.org/linux/man-pages/man7/sched.7.html>`_.
By default, the Linux kernel uses the SCHED_OTHER scheduling policy. Under
this policy, a task is preempted when the scheduler determines that it has
consumed a fair share of CPU time relative to other runnable tasks. However,
the policy does not guarantee immediate preemption when a new SCHED_OTHER task
becomes runnable. The currently running task may continue executing.

This behavior differs from that of real-time scheduling policies such as
SCHED_FIFO. When a task with a real-time policy becomes runnable, the
scheduler immediately selects it for execution if it has a higher priority than
the currently running task. The task continues to run until it voluntarily
yields the CPU, typically by blocking on an event, or until a task with an even
higher priority preempts it.
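
As a rough user-space illustration of the API described in sched(7), the
following sketch validates a real-time priority and then tries to move the
calling thread to SCHED_FIFO. The helper names ``rt_prio_is_valid()`` and
``try_become_fifo()`` are made up for this example. The priority range is
queried at run time rather than hard-coded, and the switch is expected to fail
with EPERM for callers that lack the necessary privileges.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if prio is a valid static priority for the
 * given real-time policy on this system, 0 otherwise.
 */
static int rt_prio_is_valid(int policy, int prio)
{
	assert(policy == SCHED_FIFO || policy == SCHED_RR);
	return prio >= sched_get_priority_min(policy) &&
	       prio <= sched_get_priority_max(policy);
}

/*
 * Hypothetical helper: try to switch the calling thread to SCHED_FIFO at
 * the given priority. Returns 0 on success or a positive errno value on
 * failure. EPERM is the expected result for unprivileged callers.
 */
static int try_become_fifo(int prio)
{
	struct sched_param sp;

	/* Reject out-of-range priorities before issuing the system call. */
	if (!rt_prio_is_valid(SCHED_FIFO, prio))
		return EINVAL;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;

	/* A pid of 0 refers to the calling thread. */
	if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
		return errno;

	return 0;
}
```

A caller would typically invoke ``try_become_fifo()`` once during start-up and
fall back to the default policy if it returns a non-zero value.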

Sleeping spin locks
===================

The various lock types and their behavior under real-time configurations are
described in detail in Documentation/locking/locktypes.rst.
In a non-PREEMPT_RT configuration, a spinlock_t is acquired by first disabling
preemption and then actively spinning until the lock becomes available. Once
the lock is released, preemption is enabled. From a real-time perspective,
this approach is undesirable because disabling preemption prevents the
scheduler from switching to a higher-priority task, potentially increasing
latency.

To address this, PREEMPT_RT replaces spinning locks with sleeping spin locks
that do not disable preemption. On PREEMPT_RT, spinlock_t is implemented using
rtmutex. Instead of spinning, a task attempting to acquire a contended lock
disables CPU migration, donates its priority to the lock owner (priority
inheritance), and voluntarily schedules out while waiting for the lock to
become available.

Disabling CPU migration keeps the task on the same CPU for as long as it holds
a sleeping lock, which is the part of the behavior of disabled preemption that
is retained, while the task itself remains preemptible.

Priority inheritance
====================

Lock types such as spinlock_t and mutex in a PREEMPT_RT-enabled kernel are
implemented on top of rtmutex, which provides support for priority inheritance
(PI). When a task blocks on such a lock, the PI mechanism temporarily
propagates the blocked task’s scheduling parameters to the lock owner.

For example, if a SCHED_FIFO task A blocks on a lock currently held by a
SCHED_OTHER task B, task A’s scheduling policy and priority are temporarily
inherited by task B. After this inheritance, task A is put to sleep while
waiting for the lock, and task B effectively becomes the highest-priority task
in the system. This allows B to continue executing, make progress, and
eventually release the lock.

Once B releases the lock, it reverts to its original scheduling parameters, and
task A can resume execution.
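
The A/B example above can be traced with a small user-space toy model. This is
only a sketch: all ``toy_*`` names are invented for illustration and are not
kernel types or functions, the model supports a single waiter, and it ignores
everything else a real implementation has to handle, such as several waiters or
chains of locks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's task or rtmutex structures. */
enum toy_policy { TOY_SCHED_OTHER, TOY_SCHED_FIFO };

struct toy_task {
	enum toy_policy base_policy;	/* parameters the task was given */
	int base_prio;
	enum toy_policy eff_policy;	/* parameters currently in effect */
	int eff_prio;
};

struct toy_lock {
	struct toy_task *owner;		/* NULL while the lock is free */
	struct toy_task *waiter;	/* at most one waiter in this model */
};

static void toy_task_init(struct toy_task *t, enum toy_policy policy, int prio)
{
	t->base_policy = t->eff_policy = policy;
	t->base_prio = t->eff_prio = prio;
}

/* Model assumption: FIFO priorities start at 1, so FIFO outranks OTHER. */
static int toy_rank(const struct toy_task *t)
{
	return t->eff_policy == TOY_SCHED_FIFO ? t->eff_prio : 0;
}

/* Returns 1 if the lock was taken, 0 if the caller has to wait for it. */
static int toy_lock_acquire(struct toy_lock *l, struct toy_task *t)
{
	if (!l->owner) {
		l->owner = t;
		return 1;
	}

	assert(!l->waiter);
	l->waiter = t;

	/* Priority inheritance: lend the waiter's parameters to the owner. */
	if (toy_rank(t) > toy_rank(l->owner)) {
		l->owner->eff_policy = t->eff_policy;
		l->owner->eff_prio = t->eff_prio;
	}
	return 0;
}

/* The owner drops the boost and the waiter, if any, becomes the owner. */
static void toy_lock_release(struct toy_lock *l)
{
	struct toy_task *old = l->owner;

	assert(old);
	old->eff_policy = old->base_policy;
	old->eff_prio = old->base_prio;
	l->owner = l->waiter;
	l->waiter = NULL;
}
```

With task B initialised as ``TOY_SCHED_OTHER`` and task A as ``TOY_SCHED_FIFO``,
letting B acquire the lock first and A second leaves B running with A's
parameters until ``toy_lock_release()`` is called.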

Threaded interrupts
===================

Interrupt handlers are another source of code that executes with preemption
disabled and outside the control of the scheduler. To bring interrupt handling
under scheduler control, PREEMPT_RT enforces threaded interrupt handlers.

With forced threading, interrupt handling is split into two stages. The first
stage, the primary handler, is executed in IRQ context with interrupts disabled.
Its sole responsibility is to wake the associated threaded handler. The second
stage, the threaded handler, is the function passed to request_irq() as the
interrupt handler. It runs in process context, scheduled by the kernel.

From waking the interrupt thread until threaded handling is completed, the
interrupt source is masked in the interrupt controller. This ensures that the
device interrupt remains pending but does not retrigger the CPU, allowing the
system to exit IRQ context and handle the interrupt in a scheduled thread.
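
The two stages and the masking window can be pictured with a single-threaded
toy model. It is a sketch under simplifying assumptions, not kernel code: the
``toy_*`` names are invented, the "thread" is an ordinary function call, and
masking is reduced to a flag that the primary stage sets and the threaded stage
clears.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct toy_irq {
	bool masked;		/* source masked in the imaginary controller */
	bool thread_woken;	/* threaded handler has been asked to run */
	int handled;		/* how often the threaded handler has run */
	void (*thread_fn)(struct toy_irq *irq);	/* the driver's handler */
};

/* Stage 1, "IRQ context": only mask the source and wake the thread. */
static void toy_primary_handler(struct toy_irq *irq)
{
	irq->masked = true;
	irq->thread_woken = true;
}

/* Stage 2, "process context": runs once the thread gets scheduled. */
static void toy_irq_thread_run(struct toy_irq *irq)
{
	if (!irq->thread_woken)
		return;

	irq->thread_woken = false;
	irq->thread_fn(irq);	/* device-specific work happens here */
	irq->masked = false;	/* handling is complete, unmask the source */
}

/* Example driver handler: the source is still masked while it runs. */
static void toy_device_handler(struct toy_irq *irq)
{
	assert(irq->masked);
	irq->handled++;
}
```

Calling ``toy_primary_handler()`` followed by ``toy_irq_thread_run()`` walks
through one interrupt. Between the two calls the source stays masked, which
corresponds to the window in which the device interrupt remains pending.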

By default, the threaded handler executes with the SCHED_FIFO scheduling policy
and a priority of 50 (MAX_RT_PRIO / 2), which is midway between the minimum and
maximum real-time priorities.

If the threaded interrupt handler raises any soft interrupts during its
execution, those soft interrupt routines are invoked after the threaded handler
completes, within the same thread. Preemption remains enabled during the
execution of the soft interrupt handler.

Summary
=======

By using sleeping locks and forced-threaded interrupts, PREEMPT_RT
significantly reduces sections of code where interrupts or preemption is
disabled, allowing the scheduler to preempt the current execution context and
switch to a higher-priority task.