.. SPDX-License-Identifier: GPL-2.0

=====================
Theory of operation
=====================

:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Preface
=======

PREEMPT_RT transforms the Linux kernel into a real-time kernel. It achieves
this by replacing locking primitives, such as spinlock_t, with a preemptible
and priority-inheritance aware implementation known as rtmutex, and by enforcing
the use of threaded interrupts. As a result, the kernel becomes fully
preemptible, with the exception of a few critical code paths, including entry
code, the scheduler, and low-level interrupt handling routines.

This transformation places the majority of kernel execution contexts under the
control of the scheduler and significantly increases the number of preemption
points. Consequently, it reduces the latency between a high-priority task
becoming runnable and its actual execution on the CPU.

Scheduling
==========

The core principles of Linux scheduling and the associated user-space API are
documented in the man page
`sched(7) <https://man7.org/linux/man-pages/man7/sched.7.html>`_.
By default, the Linux kernel uses the SCHED_OTHER scheduling policy. Under
this policy, a task is preempted when the scheduler determines that it has
consumed a fair share of CPU time relative to other runnable tasks. However,
the policy does not guarantee immediate preemption when a new SCHED_OTHER task
becomes runnable. The currently running task may continue executing.

This behavior differs from that of real-time scheduling policies such as
SCHED_FIFO. When a task with a real-time policy becomes runnable, the
scheduler immediately selects it for execution if it has a higher priority than
the currently running task. The task continues to run until it voluntarily
yields the CPU, typically by blocking on an event, or until an even
higher-priority task preempts it.
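
As an illustration of the user-space side of this, the following is a minimal
sketch of how a task might request the SCHED_FIFO policy for itself through
sched_setscheduler(), which is described in sched(7). The helper name
become_fifo_task() and the priority passed to it are hypothetical and chosen
only for this example.

```c
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper (not a kernel or libc API): ask the scheduler to run
 * the calling thread under SCHED_FIFO with the given real-time priority.
 * Returns 0 on success or a positive errno value on failure.
 */
static int become_fifo_task(int priority)
{
	struct sched_param param = { .sched_priority = priority };

	/* A pid of 0 selects the calling thread. */
	if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
		return errno;

	return 0;
}
```

A call such as become_fifo_task(10) usually fails with EPERM for an
unprivileged process, because real-time policies require CAP_SYS_NICE or a
suitable RLIMIT_RTPRIO limit. On success, the task is scheduled ahead of every
SCHED_OTHER task as soon as it becomes runnable.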

Sleeping spin locks
===================

The various lock types and their behavior under real-time configurations are
described in detail in Documentation/locking/locktypes.rst.
In a non-PREEMPT_RT configuration, a spinlock_t is acquired by first disabling
preemption and then actively spinning until the lock becomes available. Once
the lock is released, preemption is re-enabled. From a real-time perspective,
this approach is undesirable because disabling preemption prevents the
scheduler from switching to a higher-priority task, potentially increasing
latency.

To address this, PREEMPT_RT replaces spinning locks with sleeping spin locks
that do not disable preemption. On PREEMPT_RT, spinlock_t is implemented using
rtmutex. Instead of spinning, a task attempting to acquire a contended lock
disables CPU migration, donates its priority to the lock owner (priority
inheritance), and voluntarily schedules out while waiting for the lock to
become available.

Disabling CPU migration keeps the one property of disabled preemption that
lock holders rely on: the task continues to run on the same CPU while it holds
a sleeping lock. Unlike disabling preemption, it still allows the task to be
preempted.

Priority inheritance
====================

Lock types such as spinlock_t and struct mutex in a PREEMPT_RT-enabled kernel
are implemented on top of rtmutex, which provides support for priority
inheritance (PI). When a task blocks on such a lock, the PI mechanism
temporarily propagates the blocked task’s scheduling parameters to the lock
owner.

For example, if a SCHED_FIFO task A blocks on a lock currently held by a
SCHED_OTHER task B, task A’s scheduling policy and priority are temporarily
inherited by task B. After this inheritance, task A is put to sleep while
waiting for the lock, and task B effectively becomes the highest-priority task
in the system. This allows B to continue executing, make progress, and
eventually release the lock.
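
The example with tasks A and B can be traced with a small toy model. This is
only a sketch and not the kernel's rtmutex implementation: the types toy_task
and toy_lock and both functions are hypothetical, scheduling policy and
priority are collapsed into a single integer (0 standing in for SCHED_OTHER),
and only a single waiter is modelled.

```c
#include <stddef.h>

/* Toy task: a higher number means a higher priority (assumption). */
struct toy_task {
	const char *name;
	int base_prio;	/* priority assigned to the task itself */
	int eff_prio;	/* priority currently used for scheduling */
};

/* Toy lock: owner is NULL while the lock is free. */
struct toy_lock {
	struct toy_task *owner;
};

/*
 * Try to take the lock. Returns 1 if the lock was acquired. Returns 0 if
 * it is contended; in that case the owner inherits the waiter's priority
 * when that priority is higher, and the waiter would go to sleep.
 */
static int toy_lock_acquire(struct toy_lock *lock, struct toy_task *task)
{
	if (lock->owner == NULL) {
		lock->owner = task;
		return 1;
	}

	if (task->eff_prio > lock->owner->eff_prio)
		lock->owner->eff_prio = task->eff_prio;

	return 0;
}

/* Release the lock and drop any inherited priority. */
static void toy_lock_release(struct toy_lock *lock)
{
	if (lock->owner == NULL)
		return;

	lock->owner->eff_prio = lock->owner->base_prio;
	lock->owner = NULL;
}
```

With B (priority 0) owning the lock and A (priority 80) calling
toy_lock_acquire(), B's eff_prio becomes 80 until toy_lock_release() restores
it to 0. The real rtmutex additionally handles multiple waiters and chains of
locks, which this model leaves out.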

Once B releases the lock, it reverts to its original scheduling parameters, and
task A can resume execution.

Threaded interrupts
===================

Interrupt handlers are another source of code that executes with preemption
disabled and outside the control of the scheduler. To bring interrupt handling
under scheduler control, PREEMPT_RT enforces threaded interrupt handlers.

With forced threading, interrupt handling is split into two stages. The first
stage, the primary handler, is executed in IRQ context with interrupts disabled.
Its sole responsibility is to wake the associated threaded handler. The second
stage, the threaded handler, is the function passed to request_irq() as the
interrupt handler. It runs in process context under the control of the
scheduler.

From waking the interrupt thread until threaded handling is completed, the
interrupt source is masked in the interrupt controller. This ensures that the
device interrupt remains pending but does not retrigger the CPU, allowing the
system to exit IRQ context and handle the interrupt in a scheduled thread.

By default, the threaded handler executes with the SCHED_FIFO scheduling policy
and a priority of 50 (MAX_RT_PRIO / 2), which is midway between the minimum and
maximum real-time priorities.

If the threaded interrupt handler raises any soft interrupts during its
execution, those soft interrupt routines are invoked after the threaded handler
completes, within the same thread. Preemption remains enabled during the
execution of the soft interrupt handler.

Summary
=======

By using sleeping locks and forced-threaded interrupts, PREEMPT_RT
significantly reduces the amount of code that runs with interrupts or
preemption disabled, allowing the scheduler to preempt the current execution
context and switch to a higher-priority task.