.. SPDX-License-Identifier: GPL-2.0

=====================
Theory of operation
=====================

:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Preface
=======

PREEMPT_RT transforms the Linux kernel into a real-time kernel. It achieves
this by replacing locking primitives, such as spinlock_t, with a preemptible
and priority-inheritance aware implementation known as rtmutex, and by enforcing
the use of threaded interrupts. As a result, the kernel becomes fully
preemptible, with the exception of a few critical code paths, including entry
code, the scheduler, and low-level interrupt handling routines.

This transformation places the majority of kernel execution contexts under the
control of the scheduler and significantly increases the number of preemption
points. Consequently, it reduces the latency between a high-priority task
becoming runnable and its actual execution on the CPU.

Scheduling
==========

The core principles of Linux scheduling and the associated user-space API are
documented in the man page
`sched(7) <https://man7.org/linux/man-pages/man7/sched.7.html>`_.
By default, the Linux kernel uses the SCHED_OTHER scheduling policy. Under
this policy, a task is preempted when the scheduler determines that it has
consumed a fair share of CPU time relative to other runnable tasks. However,
the policy does not guarantee immediate preemption when a new SCHED_OTHER task
becomes runnable. The currently running task may continue executing.

This behavior differs from that of real-time scheduling policies such as
SCHED_FIFO. When a task with a real-time policy becomes runnable, the
scheduler immediately selects it for execution if it has a higher priority than
the currently running task. The task then runs until it voluntarily yields the
CPU, typically by blocking on an event, or until a task with a higher priority
becomes runnable.
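
As an illustration of the user-space API described in sched(7), the following
minimal sketch switches the calling thread to SCHED_FIFO. The helper name
become_fifo() is hypothetical and not part of any kernel or libc interface.
sched_setscheduler() usually requires CAP_SYS_NICE or a suitable RLIMIT_RTPRIO,
so the call may fail with EPERM.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: switch the calling thread to SCHED_FIFO at the
 * given priority. Returns 0 on success or a positive errno value on
 * failure (for example EPERM without sufficient privileges, or EINVAL
 * for a priority outside the valid range).
 */
int become_fifo(int priority)
{
	struct sched_param param;

	memset(&param, 0, sizeof(param));
	param.sched_priority = priority;

	/* A pid of 0 selects the calling thread. */
	if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
		return errno;

	return 0;
}
```

A caller might invoke become_fifo(10) during start-up and continue under its
current policy if the call fails.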

Sleeping spin locks
===================

The various lock types and their behavior under real-time configurations are
described in detail in Documentation/locking/locktypes.rst.
In a non-PREEMPT_RT configuration, a spinlock_t is acquired by first disabling
preemption and then actively spinning until the lock becomes available. Once
the lock is released, preemption is enabled. From a real-time perspective,
this approach is undesirable because disabling preemption prevents the
scheduler from switching to a higher-priority task, potentially increasing
latency.

To address this, PREEMPT_RT replaces spinning locks with sleeping spin locks
that do not disable preemption. On PREEMPT_RT, spinlock_t is implemented using
rtmutex. Instead of spinning, a task attempting to acquire a contended lock
disables CPU migration, donates its priority to the lock owner (priority
inheritance), and voluntarily schedules out while waiting for the lock to
become available.

Disabling CPU migration provides the same CPU-locality guarantee as disabling
preemption: the task continues to run on the same CPU while holding a sleeping
lock. Unlike disabling preemption, it still allows the task to be preempted.

Priority inheritance
====================

Lock types such as spinlock_t and struct mutex in a PREEMPT_RT enabled kernel
are implemented on top of rtmutex, which provides support for priority
inheritance (PI). When a task blocks on such a lock, the PI mechanism
temporarily propagates the blocked task’s scheduling parameters to the lock
owner.

For example, if a SCHED_FIFO task A blocks on a lock currently held by a
SCHED_OTHER task B, task A’s scheduling policy and priority are temporarily
inherited by task B. After this inheritance, task A is put to sleep while
waiting for the lock, and task B runs with task A’s priority, so it can no
longer be preempted by tasks with a lower priority than A. This allows B to
continue executing, make progress, and eventually release the lock.

Once B releases the lock, it reverts to its original scheduling parameters, and
task A can resume execution.

Threaded interrupts
===================

Interrupt handlers are another source of code that executes with preemption
disabled and outside the control of the scheduler. To bring interrupt handling
under scheduler control, PREEMPT_RT enforces threaded interrupt handlers.

With forced threading, interrupt handling is split into two stages. The first
stage, the primary handler, is executed in IRQ context with interrupts disabled.
Its sole responsibility is to wake the associated threaded handler. The second
stage, the threaded handler, is the function passed to request_irq() as the
interrupt handler. It runs in process context, scheduled by the kernel.

From waking the interrupt thread until threaded handling is completed, the
interrupt source is masked in the interrupt controller. This ensures that the
device interrupt remains pending but does not retrigger the CPU, allowing the
system to exit IRQ context and handle the interrupt in a scheduled thread.

By default, the threaded handler executes with the SCHED_FIFO scheduling policy
and a priority of 50 (MAX_RT_PRIO / 2), which is midway between the minimum and
maximum real-time priorities.
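
The midpoint can be checked from user space. The sketch below uses hypothetical
helper names: fifo_mid_priority() computes the midpoint of the SCHED_FIFO range
reported by sched_get_priority_min() and sched_get_priority_max(), and
query_sched() reads the policy and priority of a thread, for example an
interrupt thread, given its PID.

```c
#include <sched.h>
#include <sys/types.h>

/* Midpoint of the SCHED_FIFO priority range reported by the system. */
int fifo_mid_priority(void)
{
	int min = sched_get_priority_min(SCHED_FIFO);
	int max = sched_get_priority_max(SCHED_FIFO);

	return (min + max) / 2;
}

/*
 * Hypothetical helper: report the scheduling policy and real-time
 * priority of the thread identified by pid (0 selects the caller).
 * Returns 0 on success, or -1 with errno set on failure.
 */
int query_sched(pid_t pid, int *policy, int *priority)
{
	struct sched_param param;

	*policy = sched_getscheduler(pid);
	if (*policy == -1)
		return -1;

	if (sched_getparam(pid, &param) == -1)
		return -1;

	*priority = param.sched_priority;
	return 0;
}
```

Interrupt threads are typically named irq/<number>-<name>, so their PIDs can be
looked up with ps before calling query_sched().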

If the threaded interrupt handler raises any soft interrupts during its
execution, those soft interrupt routines are invoked after the threaded handler
completes, within the same thread. Preemption remains enabled during the
execution of the soft interrupt handler.

Summary
=======

By using sleeping locks and forced-threaded interrupts, PREEMPT_RT
significantly reduces the sections of code where interrupts or preemption are
disabled, allowing the scheduler to preempt the current execution context and
switch to a higher-priority task.