==================================
RT-mutex subsystem with PI support
==================================

RT-mutexes with priority inheritance are used to support PI-futexes,
which enable pthread_mutex_t priority inheritance attributes
(PTHREAD_PRIO_INHERIT). [See Documentation/locking/pi-futex.rst for more details
about PI-futexes.]
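
As a userspace illustration of the attribute mentioned above, the following
minimal sketch creates a pthread mutex that requests priority inheritance.
The helper name init_pi_mutex() is made up for this example; only the
pthread calls are standard POSIX, and their availability depends on the C
library supporting PTHREAD_PRIO_INHERIT.

.. code-block:: c

    #include <pthread.h>

    /*
     * Initialise *m as a priority-inheriting mutex.
     * Returns 0 on success or the error code of the failing pthread call.
     * init_pi_mutex() is a hypothetical helper, not part of any library.
     */
    int init_pi_mutex(pthread_mutex_t *m)
    {
        pthread_mutexattr_t attr;
        int ret;

        ret = pthread_mutexattr_init(&attr);
        if (ret)
            return ret;

        /* Request the priority inheritance protocol for this mutex. */
        ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
        if (!ret)
            ret = pthread_mutex_init(m, &attr);

        /* The attribute object is no longer needed once the mutex exists. */
        pthread_mutexattr_destroy(&attr);
        return ret;
    }

A mutex set up this way is locked and unlocked with the usual
pthread_mutex_lock() and pthread_mutex_unlock() calls; the kernel only
becomes involved when the lock is contended.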

This technology was developed in the -rt tree and streamlined for
pthread_mutex support.

Basic principles:
-----------------

RT-mutexes extend the semantics of simple mutexes by the priority
inheritance protocol.

A low priority owner of an rt-mutex inherits the priority of a higher
priority waiter until the rt-mutex is released. If the temporarily
boosted owner blocks on an rt-mutex itself, it propagates the priority
boosting to the owner of the other rt-mutex it gets blocked on. The
priority boosting is immediately removed once the rt-mutex has been
unlocked.
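
The propagation described above can be pictured with a small toy model.
This is only a sketch under simplifying assumptions, not the kernel
implementation: all names (toy_task, toy_lock, toy_block_on, toy_unlock)
are invented for illustration, a larger number means a more urgent
priority, each lock has at most one waiter, and there is no locking or
deadlock detection.

.. code-block:: c

    #include <stddef.h>

    struct toy_lock;

    /* Toy stand-ins for a task and a lock; not the kernel's layouts. */
    struct toy_task {
        int normal_prio;             /* priority without any boosting */
        int prio;                    /* effective priority, higher is more urgent */
        struct toy_lock *blocked_on; /* lock this task waits for, or NULL */
    };

    struct toy_lock {
        struct toy_task *owner;      /* current holder, or NULL */
    };

    /*
     * Record that @waiter blocks on @lock, then walk the chain of owners
     * and boost every owner whose priority is lower than the waiter's.
     */
    void toy_block_on(struct toy_task *waiter, struct toy_lock *lock)
    {
        struct toy_task *owner;

        waiter->blocked_on = lock;
        while (lock && (owner = lock->owner) != NULL) {
            if (owner->prio >= waiter->prio)
                break;                  /* owner is already urgent enough */
            owner->prio = waiter->prio; /* inherit the waiter's priority */
            lock = owner->blocked_on;   /* the owner may be blocked itself */
        }
    }

    /*
     * Release @lock and drop the boost. Resetting to normal_prio is only
     * right in this toy, where the owner holds no other contended lock.
     */
    void toy_unlock(struct toy_lock *lock)
    {
        lock->owner->prio = lock->owner->normal_prio;
        lock->owner = NULL;
    }

With three tasks of priority 1, 5 and 9, where the first holds lock A, the
second holds lock B and waits for A, and the third then blocks on B, the
boost of 9 travels through the second task to the first one.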

This approach allows us to shorten the time that high-priority tasks
spend blocked on mutexes which protect shared resources. Priority
inheritance is not a magic bullet for poorly designed applications, but
it allows well-designed applications to use userspace locks in critical
parts of a high-priority thread, without losing determinism.

The waiters are enqueued into the rt-mutex waiter tree in priority
order. Waiters of equal priority are queued in FIFO order. For each
rt-mutex, only the top priority waiter is enqueued into the owner's
priority waiters tree. This tree, too, is kept in priority order.
Whenever the top priority waiter of a task changes (for example because
it timed out or got a signal), the priority of the owner task is
readjusted. The priority enqueueing is handled by "pi_waiters".
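
The ordering rule (priority order, FIFO among equal priorities) can be
sketched as follows. This is an illustrative toy only: it uses a singly
linked list instead of a tree, the names toy_waiter, toy_enqueue and
toy_top_waiter are invented for this example, and a larger number is
assumed to mean a more urgent priority.

.. code-block:: c

    #include <stddef.h>

    /* Toy waiter; a plain list stands in for the waiter tree. */
    struct toy_waiter {
        int prio;                    /* higher value is more urgent (toy convention) */
        struct toy_waiter *next;
    };

    /*
     * Insert @w into the list at *@head, most urgent waiter first. A new
     * waiter is placed behind existing waiters of equal priority (FIFO).
     */
    void toy_enqueue(struct toy_waiter **head, struct toy_waiter *w)
    {
        struct toy_waiter **pos = head;

        /* Skip every waiter that is more urgent or equally urgent. */
        while (*pos && (*pos)->prio >= w->prio)
            pos = &(*pos)->next;

        w->next = *pos;
        *pos = w;
    }

    /* The top waiter is the one whose priority the owner would inherit. */
    struct toy_waiter *toy_top_waiter(struct toy_waiter *head)
    {
        return head;
    }

Enqueueing waiters with priorities 5, 9 and 5 in that order yields the
sequence 9, 5 (first), 5 (second), so the waiter with priority 9 is the
top waiter.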

RT-mutexes are optimized for fastpath operations and have no internal
locking overhead when locking an uncontended mutex or unlocking a mutex
without waiters. The optimized fastpath operations require cmpxchg
support. [If that is not available then the rt-mutex internal spinlock
is used]

The state of the rt-mutex is tracked via the owner field of the rt-mutex
structure:

lock->owner holds the task_struct pointer of the owner. Bit 0 is used to
keep track of the "lock has waiters" state:

 ============ ======= ================================================
 owner        bit0    Notes
 ============ ======= ================================================
 NULL         0       lock is free (fast acquire possible)
 NULL         1       lock is free and has waiters and the top waiter
                      is going to take the lock [1]_
 taskpointer  0       lock is held (fast release possible)
 taskpointer  1       lock is held and has waiters [2]_
 ============ ======= ================================================

The fast atomic compare exchange based acquire and release is only
possible when bit 0 of lock->owner is 0.
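
The owner encoding and the fast path condition can be sketched with C11
atomics. This is a hedged userspace illustration, not kernel code: the
types fake_task and toy_rtmutex and all toy_* functions are invented for
this example, and the slow path that would run under the internal lock is
left out.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define TOY_HAS_WAITERS ((uintptr_t)1)   /* bit 0 of the owner word */

    struct fake_task { int id; };            /* stand-in for task_struct */

    struct toy_rtmutex {
        _Atomic uintptr_t owner;             /* task pointer, bit 0 is the flag */
    };

    /* Strip the flag bit to recover the owning task, or NULL if free. */
    struct fake_task *toy_owner(struct toy_rtmutex *lock)
    {
        return (struct fake_task *)(atomic_load(&lock->owner) &
                                    ~TOY_HAS_WAITERS);
    }

    /* Fast acquire: succeeds only if the word is 0 (free, no waiters). */
    bool toy_try_fast_lock(struct toy_rtmutex *lock, struct fake_task *me)
    {
        uintptr_t expected = 0;

        return atomic_compare_exchange_strong(&lock->owner, &expected,
                                              (uintptr_t)me);
    }

    /* Fast release: succeeds only if we own the lock and bit 0 is clear. */
    bool toy_try_fast_unlock(struct toy_rtmutex *lock, struct fake_task *me)
    {
        uintptr_t expected = (uintptr_t)me;

        return atomic_compare_exchange_strong(&lock->owner, &expected, 0);
    }

    /* Slow path helper: set bit 0 so that later fast path attempts fail. */
    void toy_mark_waiters(struct toy_rtmutex *lock)
    {
        atomic_fetch_or(&lock->owner, TOY_HAS_WAITERS);
    }

Because both compare exchange operations expect a word with bit 0 clear,
setting the bit is enough to force every later acquire or release into the
slow path.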

.. [1] It also can be a transitional state when grabbing the lock
       with ->wait_lock held. To prevent any fast path cmpxchg to the lock,
       we need to set bit 0 before looking at the lock, and the owner may
       be NULL in this small time, hence this can be a transitional state.

.. [2] There is a small time when bit 0 is set but there are no
       waiters. This can happen when grabbing the lock in the slow path.
       To prevent a cmpxchg of the owner releasing the lock, we need to
       set this bit before looking at the lock.

BTW, there is still technically a "Pending Owner", it's just not called
that anymore. The pending owner happens to be the top_waiter of a lock
that has no owner and has been woken up to grab the lock.