.\" xref: /freebsd/share/man/man9/locking.9 (revision bb15ca603fa442c72dde3f3cb8b46db6970e3950)
.\" Copyright (c) 2007 Julian Elischer  (julian -  freebsd org )
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd November 3, 2010
.Dt LOCKING 9
.Os
.Sh NAME
.Nm locking
.Nd kernel synchronization primitives
.Sh DESCRIPTION
The
.Em FreeBSD
kernel is written to run across multiple CPUs and as such requires
several different synchronization primitives to allow the developers
to safely access and manipulate the many data types required.
.Ss Mutexes
Mutexes (also called "sleep mutexes") are the most commonly used
synchronization primitive in the kernel.
A thread acquires (locks) a mutex before accessing data shared with other
threads (including interrupt threads), and releases (unlocks) it afterwards.
If the mutex cannot be acquired, the thread requesting it will sleep.
Mutexes fully support priority propagation.
.Pp
See
.Xr mutex 9
for details.
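.Pp
The acquire, access, release pattern above can be illustrated in user space.
The sketch below is only an analogy: it uses POSIX threads rather than the
kernel interface described in
.Xr mutex 9 ,
and the names
.Dv MAX_WORKERS ,
.Va counter_lock ,
.Va shared_counter ,
.Fn worker
and
.Fn run_workers
are hypothetical.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define	MAX_WORKERS	8	/* hypothetical upper bound for this sketch */

/* Hypothetical shared state; counter_lock protects shared_counter. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Each worker locks, touches the shared data, then unlocks. */
void *
worker(void *arg)
{
	long iterations = *(long *)arg;

	for (long i = 0; i < iterations; i++) {
		pthread_mutex_lock(&counter_lock);	/* acquire */
		shared_counter++;			/* access shared data */
		pthread_mutex_unlock(&counter_lock);	/* release */
	}
	return (NULL);
}

/* Start up to nthreads workers, wait for them, return the final count. */
long
run_workers(int nthreads, long iterations)
{
	pthread_t tids[MAX_WORKERS];
	int started = 0;

	assert(nthreads > 0 && nthreads <= MAX_WORKERS);
	shared_counter = 0;
	while (started < nthreads &&
	    pthread_create(&tids[started], NULL, worker, &iterations) == 0)
		started++;
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	/* All workers have exited, so reading without the lock is safe. */
	return (shared_counter);
}
```
.Ed
.Pp
Because every increment happens with the lock held, four workers running
10000 iterations each leave the counter at 40000.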
.Ss Spin mutexes
Spin mutexes are a variation of basic mutexes; the main difference between
the two is that spin mutexes never sleep.
Instead, they spin, waiting
for the thread holding the lock, which runs on another CPU, to release it.
Unlike ordinary mutexes, spin mutexes disable interrupts when acquired.
Since disabling interrupts is expensive, they are also generally slower.
Spin mutexes should be used only when necessary, e.g. to protect data shared
with interrupt filter code (see
.Xr bus_setup_intr 9
for details).
.Ss Pool mutexes
With most synchronization primitives, such as mutexes, the programmer must
provide a piece of allocated memory to hold the primitive.
For example, a mutex may be embedded inside the structure it protects.
A pool mutex is a variant of mutex without this requirement: to lock or unlock
a pool mutex, one uses the address of the structure being protected with it,
not the mutex itself.
Pool mutexes are seldom used.
.Pp
See
.Xr mtx_pool 9
for details.
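.Pp
The idea of locking by the address of the protected structure can be
sketched in user space as follows.
This is not the
.Xr mtx_pool 9
implementation: the pool size, the hash and the names
.Dv POOL_SIZE ,
.Va lock_pool ,
.Fn pool_find ,
.Fn pool_lock
and
.Fn pool_unlock
are assumptions made for illustration, and POSIX mutexes stand in for
kernel mutexes.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define	POOL_SIZE	8	/* hypothetical; must be a power of two */

static_assert((POOL_SIZE & (POOL_SIZE - 1)) == 0,
    "POOL_SIZE must be a power of two");

/* A fixed pool of locks; callers never allocate a lock themselves. */
static pthread_mutex_t lock_pool[POOL_SIZE] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* Map the address of a protected object to one lock in the pool. */
pthread_mutex_t *
pool_find(const void *obj)
{
	uintptr_t key = (uintptr_t)obj;

	/* Discard low bits, which are often zero because of alignment. */
	key >>= 4;
	return (&lock_pool[key & (POOL_SIZE - 1)]);
}

/* Lock and unlock by object address; both return 0 on success. */
int
pool_lock(const void *obj)
{
	return (pthread_mutex_lock(pool_find(obj)));
}

int
pool_unlock(const void *obj)
{
	return (pthread_mutex_unlock(pool_find(obj)));
}
```
.Ed
.Pp
In this sketch two different objects may hash to the same lock, so a thread
must not hold the pool lock for one object while locking another.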
.Ss Reader/writer locks
Reader/writer locks allow shared access to protected data by multiple threads,
or exclusive access by a single thread.
The threads with shared access are known as
.Em readers
since they should only read the protected data.
A thread with exclusive access is known as a
.Em writer
since it may modify protected data.
.Pp
Reader/writer locks can be treated as mutexes (see above and
.Xr mutex 9 )
with shared/exclusive semantics.
More specifically, regular mutexes can be
considered to be equivalent to a write-lock on an
.Em rw_lock .
The
.Em rw_lock
locks have priority propagation like mutexes, but priority
can be propagated only to an exclusive holder.
This limitation comes from the fact that shared owners
are anonymous.
Another important property is that shared holders of
.Em rw_lock
can recurse, but exclusive locks are not allowed to recurse.
This ability should not be used lightly and
.Em may go away .
.Pp
See
.Xr rwlock 9
for details.
.Ss Read-mostly locks
Read-mostly locks are similar to
.Em reader/writer
locks but optimized for very infrequent write locking.
.Em Read-mostly
locks implement full priority propagation by tracking shared owners
using a caller-supplied
.Em tracker
data structure.
.Pp
See
.Xr rmlock 9
for details.
.Ss Shared/exclusive locks
Shared/exclusive locks are similar to reader/writer locks; the main difference
between them is that shared/exclusive locks may be held during unbounded sleep
(and acquiring one may therefore itself sleep for an unbounded time).
They are inherently less efficient than mutexes, reader/writer locks
and read-mostly locks.
They do not support priority propagation.
They should be considered to be closely related to
.Xr sleep 9 .
In fact, a shared/exclusive lock could in some cases be
considered a conditional sleep.
.Pp
See
.Xr sx 9
for details.
.Ss Counting semaphores
Counting semaphores provide a mechanism for synchronizing access
to a pool of resources.
Unlike mutexes, semaphores do not have the concept of an owner,
so they can be useful in situations where one thread needs
to acquire a resource, and another thread needs to release it.
They are largely deprecated.
.Pp
See
.Xr sema 9
for details.
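.Pp
The resource-pool use of a counting semaphore can be sketched in user space.
The example uses POSIX semaphores as a stand-in for
.Xr sema 9 ;
the names
.Va res_sem ,
.Fn res_init ,
.Fn res_try_take
and
.Fn res_give
are hypothetical.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical semaphore whose value is the number of free resources. */
static sem_t res_sem;

/* Set up the pool with nresources free entries; 0 on success. */
int
res_init(unsigned int nresources)
{
	assert(nresources > 0);
	return (sem_init(&res_sem, 0, nresources));
}

/*
 * Take one resource without blocking.
 * Returns 0 on success, or -1 with errno set to EAGAIN if none is free.
 */
int
res_try_take(void)
{
	return (sem_trywait(&res_sem));
}

/*
 * Give one resource back.  A semaphore has no owner, so any thread
 * may call this, not only the thread that took the resource.
 */
int
res_give(void)
{
	return (sem_post(&res_sem));
}
```
.Ed
.Pp
With a pool of two resources, two takes succeed, a third fails with
.Er EAGAIN ,
and a take succeeds again after one resource is given back.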
.Ss Condition variables
Condition variables are used in conjunction with mutexes to wait for
conditions to occur.
A thread must hold the mutex before calling the
.Fn cv_wait*
functions.
When a thread waits on a condition, the mutex
is atomically released before the thread is blocked, then reacquired
before the function call returns.
.Pp
See
.Xr condvar 9
for details.
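.Pp
The wait protocol described above (hold the mutex, wait, find the mutex held
again on return) has a close user-space counterpart, sketched here with
POSIX condition variables rather than the kernel functions from
.Xr condvar 9 .
The names
.Va ready_lock ,
.Va ready_cv ,
.Va ready ,
.Va payload ,
.Fn producer
and
.Fn wait_for_payload
are hypothetical.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t ready_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static int ready;	/* the condition, protected by ready_lock */
static int payload;	/* data published together with the condition */

/* Producer: change the condition with the mutex held, then signal. */
void *
producer(void *arg)
{
	pthread_mutex_lock(&ready_lock);
	payload = *(int *)arg;
	ready = 1;
	pthread_cond_signal(&ready_cv);
	pthread_mutex_unlock(&ready_lock);
	return (NULL);
}

/* Consumer: returns the published value, or -1 if no thread started. */
int
wait_for_payload(int value)
{
	pthread_t tid;
	int got;

	pthread_mutex_lock(&ready_lock);
	ready = 0;
	pthread_mutex_unlock(&ready_lock);

	if (pthread_create(&tid, NULL, producer, &value) != 0)
		return (-1);

	pthread_mutex_lock(&ready_lock);
	/*
	 * The wait releases ready_lock while blocked and reacquires it
	 * before returning; the loop re-checks the condition each time.
	 */
	while (!ready)
		pthread_cond_wait(&ready_cv, &ready_lock);
	assert(ready);		/* the mutex is held again here */
	got = payload;
	pthread_mutex_unlock(&ready_lock);

	pthread_join(tid, NULL);
	return (got);
}
```
.Ed
.Pp
Checking the condition in a loop with the mutex held is what prevents a
wakeup from being missed if the producer runs before the consumer waits.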
.Ss Giant
Giant is an instance of a mutex, with some special characteristics:
.Bl -enum
.It
It is recursive.
.It
Drivers and filesystems can request that Giant be locked around them
by not marking themselves MPSAFE.
Note that the infrastructure to do this is slowly going away as non-MPSAFE
drivers either become properly locked or disappear.
.It
Giant must be acquired before any other locks.
.It
It is OK to hold Giant while performing an unbounded sleep; in such a case,
Giant will be dropped before sleeping and picked up after wakeup.
.It
There are places in the kernel that drop Giant and pick it back up
again.
Sleep locks will do this before sleeping.
Parts of the network or VM code may do this as well, depending on the
setting of a sysctl.
This means that you cannot count on Giant keeping other code from
running if your code sleeps, even if you want it to.
.El
.Ss Sleep/wakeup
The functions
.Fn tsleep ,
.Fn msleep ,
.Fn msleep_spin ,
.Fn pause ,
.Fn wakeup ,
and
.Fn wakeup_one
handle event-based thread blocking.
If a thread must wait for an external event, it is put to sleep by
.Fn tsleep ,
.Fn msleep ,
.Fn msleep_spin ,
or
.Fn pause .
Threads may also wait using one of the locking primitive sleep routines
.Xr mtx_sleep 9 ,
.Xr rw_sleep 9 ,
or
.Xr sx_sleep 9 .
.Pp
The parameter
.Fa chan
is an arbitrary address that uniquely identifies the event on which
the thread is being put to sleep.
All threads sleeping on a single
.Fa chan
are woken up later by
.Fn wakeup ,
often called from inside an interrupt routine, to indicate that the
resource the thread was blocking on is available now.
.Pp
Several of the sleep functions, including
.Fn msleep ,
.Fn msleep_spin ,
and the locking primitive sleep routines, take an additional lock
parameter.
The lock will be released before sleeping and reacquired
before the sleep routine returns.
If
.Fa priority
includes the
.Dv PDROP
flag, then the lock will not be reacquired before returning.
The lock is used to ensure that a condition can be checked atomically,
and that the current thread can be suspended without missing a
change to the condition, or an associated wakeup.
In addition, all of the sleep routines will fully drop the
.Va Giant
mutex
(even if recursed)
while the thread is suspended and will reacquire the
.Va Giant
mutex before the function returns.
.Pp
See
.Xr sleep 9
for details.
.Ss Lockmanager locks
Lockmanager locks are shared/exclusive locks used mostly in
.Xr VFS 9 ,
in particular as a
.Xr vnode 9
lock.
They have features other lock types do not have, such as sleep timeout,
writer starvation avoidance, draining, and interlock mutex, but this makes them
complicated to implement; for this reason, they are deprecated.
.Pp
See
.Xr lock 9
for details.
.Sh INTERACTIONS
The primitives interact and have a number of rules regarding how
they can and cannot be combined.
Many of these rules are checked using the
.Xr witness 4
code.
.Ss Bounded vs. unbounded sleep
The following primitives perform bounded sleep: mutexes, pool mutexes,
reader/writer locks and read-mostly locks.
.Pp
The following primitives block (perform unbounded sleep): shared/exclusive locks,
counting semaphores, condition variables, sleep/wakeup and lockmanager locks.
.Pp
It is an error to do any operation that could result in any kind of sleep while
holding a spin mutex.
.Pp
As a general rule, it is an error to do any operation that could result
in unbounded sleep while holding any primitive from the 'bounded sleep' group.
For example, it is an error to try to acquire a shared/exclusive lock while
holding a mutex, or to try to allocate memory with
.Dv M_WAITOK
while holding a reader/writer lock.
.Pp
As a special case, it is possible to call
.Fn sleep
or
.Fn mtx_sleep
while holding a single mutex.
It will atomically drop that mutex and reacquire it as part of waking up.
This is often a bad idea because it generally relies on the programmer having
good knowledge of all of the call graph above the place where
.Fn mtx_sleep
is being called and of the assumptions the calling code has made.
Because the lock gets dropped during sleep, one must re-test all
the assumptions that were made before, all the way up the call graph to the
place where the lock was acquired.
.Pp
It is an error to do any operation that could result in any kind of sleep when
running inside an interrupt filter.
.Pp
It is an error to do any operation that could result in unbounded sleep when
running inside an interrupt thread.
.Ss Interaction table
The following table shows what you can and cannot do while holding
one of the synchronization primitives discussed:
.Bl -column ".Ic xxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXX" -offset indent
.It Xo
.Em "You have: You want:" Ta spin mtx Ta mutex Ta sx Ta rwlock Ta rmlock Ta sleep
.Xc
.It spin mtx  Ta \&ok-1 Ta \&no Ta \&no Ta \&no Ta \&no Ta \&no-3
.It mutex     Ta \&ok Ta \&ok-1 Ta \&no Ta \&ok Ta \&ok Ta \&no-3
.It sx        Ta \&ok Ta \&ok Ta \&ok-2 Ta \&ok Ta \&ok Ta \&ok-4
.It rwlock    Ta \&ok Ta \&ok Ta \&no Ta \&ok-2 Ta \&ok Ta \&no-3
.It rmlock    Ta \&ok Ta \&ok Ta \&ok-5 Ta \&ok Ta \&ok-2 Ta \&ok-5
.El
.Pp
The numeric suffix in a table entry (e.g. ok-1 or no-3) refers to
the correspondingly numbered note below.
.Pp
.Em *1
Recursion is defined per lock.
Lock order is important.
.Pp
.Em *2
Readers can recurse though writers cannot.
Lock order is important.
.Pp
.Em *3
There are calls that atomically release this primitive when going to sleep
and reacquire it on wakeup (e.g.
.Fn mtx_sleep ,
.Fn rw_sleep
and
.Fn msleep_spin ) .
.Pp
.Em *4
Though one can sleep holding an sx lock, one can also use
.Fn sx_sleep
which will atomically release this primitive when going to sleep and
reacquire it on wakeup.
.Pp
.Em *5
.Em Read-mostly
locks can be initialized to support sleeping while holding a write lock.
See
.Xr rmlock 9
for details.
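.Pp
For code that wants to consult these rules programmatically, for instance in
a review tool, the table can be transcribed into a lookup array.
The sketch below is one possible encoding and is not part of the kernel;
the names
.Vt "enum prim" ,
.Vt "struct cell" ,
.Va interaction
and
.Fn may_acquire
are hypothetical, while the entries are copied from the table above.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>

/* Column order of the table; P_SLEEP is a column only, never a row. */
enum prim { P_SPIN, P_MUTEX, P_SX, P_RWLOCK, P_RMLOCK, P_SLEEP, P_COUNT };

struct cell {
	bool ok;	/* true for "ok", false for "no" */
	int note;	/* number of the note that applies, or 0 */
};

#define	OK(n)	{ true, (n) }
#define	NO(n)	{ false, (n) }

/* Rows: primitive already held.  Columns: primitive or sleep wanted. */
static const struct cell interaction[P_SLEEP][P_COUNT] = {
	[P_SPIN]   = { OK(1), NO(0), NO(0), NO(0), NO(0), NO(3) },
	[P_MUTEX]  = { OK(0), OK(1), NO(0), OK(0), OK(0), NO(3) },
	[P_SX]     = { OK(0), OK(0), OK(2), OK(0), OK(0), OK(4) },
	[P_RWLOCK] = { OK(0), OK(0), NO(0), OK(2), OK(0), NO(3) },
	[P_RMLOCK] = { OK(0), OK(0), OK(5), OK(0), OK(2), OK(5) },
};

/* Look up whether "wanted" is permitted while "held" is held. */
struct cell
may_acquire(enum prim held, enum prim wanted)
{
	assert(held < P_SLEEP && wanted < P_COUNT);
	return (interaction[held][wanted]);
}
```
.Ed
.Pp
For example, looking up a mutex holder that wants an sx lock yields
.Dq no ,
while an sx holder that wants to sleep yields
.Dq ok
with note 4.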
.Ss Context mode table
The next table shows what can be used in different contexts.
At this time the table is rather easy to remember.
.Bl -column ".Ic Xxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXX" -offset indent
.It Xo
.Em "Context:"  Ta spin mtx Ta mutex Ta sx Ta rwlock Ta rmlock Ta sleep
.Xc
.It interrupt filter:  Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no Ta \&no
.It interrupt thread:  Ta \&ok Ta \&ok Ta \&no Ta \&ok Ta \&ok Ta \&no
.It callout:    Ta \&ok Ta \&ok Ta \&no Ta \&ok Ta \&no Ta \&no
.It syscall:    Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok
.El
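.Pp
Since each row of the context table is a set of permitted operations, it can
be written compactly as one bit mask per context.
The following sketch is an illustration only; the names
.Vt "enum ctx" ,
the
.Dv A_*
flags,
.Va allowed
and
.Fn context_allows
are hypothetical, and the masks are copied from the table above.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>

enum ctx { CTX_FILTER, CTX_ITHREAD, CTX_CALLOUT, CTX_SYSCALL, CTX_COUNT };

/* One flag per column of the table. */
#define	A_SPIN		(1u << 0)
#define	A_MUTEX		(1u << 1)
#define	A_SX		(1u << 2)
#define	A_RWLOCK	(1u << 3)
#define	A_RMLOCK	(1u << 4)
#define	A_SLEEP		(1u << 5)

/* Operations marked "ok" in each row of the table. */
static const unsigned int allowed[CTX_COUNT] = {
	[CTX_FILTER]  = A_SPIN,
	[CTX_ITHREAD] = A_SPIN | A_MUTEX | A_RWLOCK | A_RMLOCK,
	[CTX_CALLOUT] = A_SPIN | A_MUTEX | A_RWLOCK,
	[CTX_SYSCALL] = A_SPIN | A_MUTEX | A_SX | A_RWLOCK | A_RMLOCK |
	    A_SLEEP,
};

/* True only if every operation in "what" is permitted in context c. */
bool
context_allows(enum ctx c, unsigned int what)
{
	assert(c < CTX_COUNT);
	return ((allowed[c] & what) == what);
}
```
.Ed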
.Sh SEE ALSO
.Xr witness 4 ,
.Xr condvar 9 ,
.Xr lock 9 ,
.Xr mtx_pool 9 ,
.Xr mutex 9 ,
.Xr rmlock 9 ,
.Xr rwlock 9 ,
.Xr sema 9 ,
.Xr sleep 9 ,
.Xr sx 9 ,
.Xr BUS_SETUP_INTR 9 ,
.Xr LOCK_PROFILING 9
.Sh HISTORY
These
functions appeared in
.Bsx 4.1
through
.Fx 7.0 .
.Sh BUGS
There are too many locking primitives to choose from.