.\" Copyright (c) 2007 Julian Elischer (julian - freebsd org )
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 14, 2007
.Dt LOCKING 9
.Os
.Sh NAME
.Nm locking
.Nd kernel synchronization primitives
.Sh SYNOPSIS
All sorts of stuff to go here.
.Sh DESCRIPTION
The
.Em FreeBSD
kernel is written to run across multiple CPUs and as such requires
several different synchronization primitives to allow the developers
to safely access and manipulate the many data types required.
.Pp
These include:
.Bl -enum
.It
Spin Mutexes
.It
Sleep Mutexes
.It
Pool Mutexes
.It
Shared-Exclusive locks
.It
Reader-Writer locks
.It
Turnstiles
.It
Semaphores
.It
Condition variables
.It
Sleep/wakeup
.It
Giant
.It
Lockmanager locks
.El
.Pp
The primitives interact and have a number of rules regarding how
they can and cannot be combined.
There are too many for the average
human mind and they keep changing.
(If you disagree, please write replacement text.) :-)
.Pp
Some of these primitives may be used at the low (interrupt) level and
some may not.
.Pp
There are strict ordering requirements and for some of the types this
is checked using the
.Xr witness 4
code.
.Ss SPIN Mutexes
Mutexes are the basic primitive.
You either hold one or you don't.
If you don't own it then you just spin, waiting for the holder (on
another CPU) to release it.
Hopefully the holder is doing something fast.
You
.Em must not
do anything that deschedules the thread while you
are holding a SPIN mutex.
.Ss Mutexes
Basically (regular) mutexes will deschedule the thread if the
mutex cannot be acquired.
A non-spin mutex can be considered to be equivalent
to getting a write lock on an
.Em rw_lock
(see below), and in fact non-spin mutexes and rw_locks may soon become the same thing.
As in spin mutexes, you either get it or you don't.
You may only call
.Xr sleep 9
via
.Fn msleep
or the new
.Fn mtx_sleep
variant.
These will atomically drop the mutex and reacquire it
as part of waking up.
However, this is often a
.Em BAD
idea because it generally relies on you having
such a good knowledge of all the call graph above you
and what assumptions it is making that there are a lot
of ways to make hard-to-find mistakes.
For example you MUST re-test all the assumptions you made before,
all the way up the call graph to where you got the lock.
You cannot just assume that
.Fn mtx_sleep
can be inserted anywhere.
If any caller above you holds any mutex or
rwlock, your sleep will cause a panic.
If the sleep only happens rarely it may be years before the
bad code path is found.
.Ss Pool Mutexes
A variant of regular mutexes where the allocation of the mutex is handled
more by the system.
.Ss Rw_locks
Reader/writer locks allow shared access to protected data by multiple threads,
or exclusive access by a single thread.
The threads with shared access are known as
.Em readers
since they should only read the protected data.
A thread with exclusive access is known as a
.Em writer
since it may modify protected data.
.Pp
Although reader/writer locks look very similar to
.Xr sx 9
(see below) locks, their usage pattern is different.
Reader/writer locks can be treated as mutexes (see above and
.Xr mutex 9 )
with shared/exclusive semantics.
More specifically, regular mutexes can be
considered to be equivalent to a write-lock on an
.Em rw_lock .
In the future this may literally become the case.
An
.Em rw_lock
can be locked while holding a regular mutex, but
can
.Em not
be held while sleeping.
The
.Em rw_lock
locks have priority propagation like mutexes, but priority
can be propagated only to an exclusive holder.
This limitation comes from the fact that shared owners
are anonymous.
Another important property is that shared holders of an
.Em rw_lock
can recurse, but exclusive locks are not allowed to recurse.
This ability should not be used lightly and
.Em may go away .
Users of recursion in any locks should be prepared to
defend their decision against vigorous criticism.
.Ss Sx_locks
Shared/exclusive locks are used to protect data that are read far more often
than they are written.
Mutexes are inherently more efficient than shared/exclusive locks, so
shared/exclusive locks should be used prudently.
The main reason for using an
.Em sx_lock
is that a thread may hold a shared or exclusive lock on an
.Em sx_lock
while sleeping.
As a consequence of this however, an
.Em sx_lock
may not be acquired while holding a mutex.
The reason for this is that, if one thread slept while holding an
.Em sx_lock
while another thread blocked on the same
.Em sx_lock
after acquiring a mutex, then the second thread would effectively
end up sleeping while holding a mutex, which is not allowed.
The
.Em sx_lock
should be considered to be closely related to
.Xr sleep 9 .
In fact it could in some cases be
considered a conditional sleep.
.Ss Turnstiles
Turnstiles are used to hold a queue of threads blocked on
non-sleepable locks.
Sleepable locks use condition variables to implement their queues.
Turnstiles differ from a sleep queue in that turnstile queues
are assigned to a lock held by an owning thread.
Thus, when one thread is enqueued onto a turnstile, it can lend its
priority to the owning thread.
If this sounds confusing, we need to describe it better.
.Ss Semaphores
.Ss Condition variables
Condition variables are used in conjunction with mutexes to wait for
conditions to occur.
A thread must hold the mutex before calling the
.Fn cv_wait*
functions.
When a thread waits on a condition, the mutex
is atomically released before the thread is blocked, then reacquired
before the function call returns.
.Ss Giant
Giant is a special instance of a sleep lock.
It has several special characteristics.
.Bl -enum
.It
It is recursive.
.It
Drivers can request that Giant be locked around them, but this is
going away.
.It
You can sleep while Giant is recursed, which is not allowed with other
recursive locks.
.It
Giant must be locked before other locks.
.It
There are places in the kernel that drop Giant and pick it back up
again.
Sleep locks will do this before sleeping.
Parts of the network or VM code may do this as well, depending on the
setting of a sysctl.
This means that you cannot count on Giant keeping other code from
running if your code sleeps, even if you want it to.
.El
.Ss Sleep/wakeup
The functions
.Fn tsleep ,
.Fn msleep ,
.Fn msleep_spin ,
.Fn pause ,
.Fn wakeup ,
and
.Fn wakeup_one
handle event-based thread blocking.
If a thread must wait for an external event, it is put to sleep by
.Fn tsleep ,
.Fn msleep ,
.Fn msleep_spin ,
or
.Fn pause .
Threads may also wait using one of the locking primitive sleep routines
.Xr mtx_sleep 9 ,
.Xr rw_sleep 9 ,
or
.Xr sx_sleep 9 .
.Pp
The parameter
.Fa chan
is an arbitrary address that uniquely identifies the event on which
the thread is being put to sleep.
All threads sleeping on a single
.Fa chan
are woken up later by
.Fn wakeup ,
often called from inside an interrupt routine, to indicate that the
resource the thread was blocking on is available now.
.Pp
Several of the sleep functions, including
.Fn msleep ,
.Fn msleep_spin ,
and the locking primitive sleep routines, specify an additional lock
parameter.
The lock will be released before sleeping and reacquired
before the sleep routine returns.
If
.Fa priority
includes the
.Dv PDROP
flag, then the lock will not be reacquired before returning.
The lock is used to ensure that a condition can be checked atomically,
and that the current thread can be suspended without missing a
change to the condition, or an associated wakeup.
In addition, all of the sleep routines will fully drop the
.Va Giant
mutex
(even if recursed)
while the thread is suspended and will reacquire the
.Va Giant
mutex before the function returns.
.Ss lockmanager locks
Largely deprecated.
See the
.Xr lock 9
page for more information.
I don't know what the downsides are but I'm sure someone will fill in this part.
.Sh USAGE TABLES
.Ss Interaction table
The following table shows what you can and cannot do if you hold
one of the synchronization primitives discussed here:
(someone who knows what they are talking about should write this table)
.Bl -column ".Ic xxxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXX" -offset indent
.It Xo
.Em "You have: You want:" Ta Spin_mtx Ta Slp_mtx Ta sx_lock Ta rw_lock Ta sleep
.Xc
.It Ic SPIN mutex Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no-3
.It Ic Sleep mutex Ta \&ok Ta \&ok-1 Ta \&no Ta \&ok Ta \&no-3
.It Ic sx_lock Ta \&ok Ta \&no Ta \&ok-2 Ta \&no Ta \&ok-4
.It Ic rw_lock Ta \&ok Ta \&ok Ta \&no Ta \&ok-2 Ta \&no-3
.El
.Pp
.Em *1
Recursion is defined per lock.
Lock order is important.
.Pp
.Em *2
Readers can recurse though writers cannot.
Lock order is important.
.Pp
.Em *3
There are calls that atomically release this primitive when going to sleep
and reacquire it on wakeup (e.g.
.Fn mtx_sleep ,
.Fn rw_sleep
and
.Fn msleep_spin ) .
.Pp
.Em *4
Though one can sleep holding an sx lock, one can also use
.Fn sx_sleep
which atomically releases this primitive when going to sleep and
reacquires it on wakeup.
.Ss Context mode table
The next table shows what can be used in different contexts.
At this time this is a rather easy-to-remember table.
.Bl -column ".Ic Xxxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXX" -offset indent
.It Xo
.Em "Context:" Ta Spin_mtx Ta Slp_mtx Ta sx_lock Ta rw_lock Ta sleep
.Xc
.It interrupt: Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no
.It idle: Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no
.El
.Sh SEE ALSO
.Xr witness 4 ,
.Xr condvar 9 ,
.Xr lock 9 ,
.Xr mtx_pool 9 ,
.Xr mutex 9 ,
.Xr rwlock 9 ,
.Xr sema 9 ,
.Xr sleep 9 ,
.Xr sx 9 ,
.Xr LOCK_PROFILING 9
.Sh HISTORY
These
functions appeared in
.Bsx 4.1
through
.Fx 7.0 .