==================================
RT-mutex subsystem with PI support
==================================

RT-mutexes with priority inheritance are used to support PI-futexes,
which enable pthread_mutex_t priority inheritance attributes
(PTHREAD_PRIO_INHERIT). [See Documentation/locking/pi-futex.rst for more details
about PI-futexes.]

This technology was developed in the -rt tree and streamlined for
pthread_mutex support.

Basic principles:
-----------------

RT-mutexes extend the semantics of simple mutexes by the priority
inheritance protocol.

A low priority owner of an rt-mutex inherits the priority of a higher
priority waiter until the rt-mutex is released. If the temporarily
boosted owner blocks on an rt-mutex itself, it propagates the priority
boosting to the owner of the other rt-mutex it gets blocked on. The
priority boosting is immediately removed once the rt-mutex has been
unlocked.

This approach allows us to shorten the time high-priority tasks block on
mutexes which protect shared resources.
Priority inheritance is not a
magic bullet for poorly designed applications, but it allows
well-designed applications to use userspace locks in critical parts of
a high-priority thread, without losing determinism.

The enqueueing of the waiters into the rt-mutex waiter tree is done in
priority order. Waiters of equal priority are queued in FIFO order. For
each rt-mutex, only the top priority waiter is enqueued into the owner's
priority waiters tree. This tree is also ordered by priority. Whenever
the top priority waiter of a task changes (for example it timed out or
got a signal), the priority of the owner task is readjusted. The
priority enqueueing is handled by "pi_waiters".

RT-mutexes are optimized for fastpath operations and have no internal
locking overhead when locking an uncontended mutex or unlocking a mutex
without waiters. The optimized fastpath operations require cmpxchg
support. [If that is not available then the rt-mutex internal spinlock
is used.]

The state of the rt-mutex is tracked via the owner field of the rt-mutex
structure:

lock->owner holds the task_struct pointer of the owner.
Bit 0 is used to
keep track of the "lock has waiters" state:

 ============ ======= ================================================
 owner        bit0    Notes
 ============ ======= ================================================
 NULL         0       lock is free (fast acquire possible)
 NULL         1       lock is free and has waiters and the top waiter
                      is going to take the lock [1]_
 taskpointer  0       lock is held (fast release possible)
 taskpointer  1       lock is held and has waiters [2]_
 ============ ======= ================================================

The fast atomic compare exchange based acquire and release is only
possible when bit 0 of lock->owner is 0.

.. [1] It can also be a transitional state when grabbing the lock
       with ->wait_lock held. To prevent any fast path cmpxchg on the lock,
       we need to set bit 0 before looking at the lock, and the owner may
       be NULL during this short window, hence this can be a transitional
       state.

.. [2] There is a short window when bit 0 is set but there are no
       waiters. This can happen when grabbing the lock in the slow path.
       To prevent a cmpxchg of the owner releasing the lock, we need to
       set this bit before looking at the lock.
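
The encoding in the table above can be modelled in a few lines of
userspace C. This is only an illustrative sketch using C11 atomics, not
the kernel implementation: every ``model_*`` name is hypothetical, and
the slow path is reduced to setting bit 0. It shows why the fast path
cmpxchg can only succeed while bit 0 of the owner word is 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: a userspace model, not kernel code. */
#define MODEL_HAS_WAITERS ((uintptr_t)1)

struct model_task { int prio; };                 /* stand-in for task_struct */
struct model_lock { _Atomic uintptr_t owner; };  /* owner pointer | bit 0 */

/* Owner pointer with the "has waiters" bit masked off. */
static struct model_task *model_owner(struct model_lock *l)
{
	return (struct model_task *)(atomic_load(&l->owner) & ~MODEL_HAS_WAITERS);
}

static bool model_has_waiters(struct model_lock *l)
{
	return (atomic_load(&l->owner) & MODEL_HAS_WAITERS) != 0;
}

/* Fast acquire: succeeds only if the word is exactly NULL with bit 0 clear. */
static bool model_fast_acquire(struct model_lock *l, struct model_task *t)
{
	uintptr_t expected = 0;

	return atomic_compare_exchange_strong(&l->owner, &expected, (uintptr_t)t);
}

/* Fast release: succeeds only if the word is exactly the owner, bit 0 clear. */
static bool model_fast_release(struct model_lock *l, struct model_task *t)
{
	uintptr_t expected = (uintptr_t)t;

	return atomic_compare_exchange_strong(&l->owner, &expected, (uintptr_t)0);
}

/* Slow path stand-in: set bit 0 so that any later fast path cmpxchg fails. */
static void model_mark_waiters(struct model_lock *l)
{
	atomic_fetch_or(&l->owner, MODEL_HAS_WAITERS);
}

/* Walks through the rows of the table; returns 0 if all checks hold. */
static int model_demo(void)
{
	struct model_lock lock = { 0 };
	struct model_task a = { 10 }, b = { 20 };

	assert(model_owner(&lock) == NULL && !model_has_waiters(&lock));
	assert(model_fast_acquire(&lock, &a));   /* NULL, 0 -> a, 0 */
	assert(!model_fast_acquire(&lock, &b));  /* held: b must go slow path */
	model_mark_waiters(&lock);               /* a, 0 -> a, 1 */
	assert(model_owner(&lock) == &a && model_has_waiters(&lock));
	assert(!model_fast_release(&lock, &a));  /* bit 0 set: no fast release */
	return 0;
}
```

Because the pointer and the flag share one word, a single compare
exchange checks both at once, which is what lets the uncontended paths
avoid taking any internal lock.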

BTW, there is still technically a "Pending Owner", it's just not called
that anymore. The pending owner happens to be the top_waiter of a lock
that has no owner and has been woken up to grab the lock.
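
For reference, the userspace side mentioned at the top of this document
(PTHREAD_PRIO_INHERIT) is requested through the standard POSIX mutex
attribute interface. The following is a minimal sketch: the helper name
``pi_mutex_init`` is hypothetical, and the POSIX calls are the only part
that is not an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical helper: initialize *m as a priority inheritance mutex.
 * Returns 0 on success or the error number reported by the pthread call.
 */
static int pi_mutex_init(pthread_mutex_t *m)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;

	/* Ask for the priority inheritance protocol on this mutex. */
	ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	if (!ret)
		ret = pthread_mutex_init(m, &attr);

	pthread_mutexattr_destroy(&attr);
	return ret;
}
```

A mutex set up this way is locked and unlocked with the usual
pthread_mutex_lock() and pthread_mutex_unlock() calls.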