.\" Copyright (c) 2005 Robert N. M. Watson
.\" Copyright (c) 2014,2015,2021 The FreeBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" Part of this documentation was written by
.\" Konstantin Belousov <kib@FreeBSD.org> under sponsorship
.\" from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd October 1, 2021
.Dt LIBTHR 3
.Os
.Sh NAME
.Nm libthr
.Nd "1:1 POSIX threads library"
.Sh LIBRARY
.Lb libthr
.Sh SYNOPSIS
.In pthread.h
.Sh DESCRIPTION
The
.Nm
library provides a 1:1 implementation of the
.Xr pthread 3
library interfaces for application threading.
It has been optimized for use by applications expecting system scope thread
semantics.
.Pp
The library is tightly integrated with the run-time link editor
.Xr ld-elf.so.1 1
and
.Lb libc ;
all three components must be built from the same source tree.
Mixing
.Li libc
and
.Nm
libraries from different versions of
.Fx
is not supported.
The run-time linker
.Xr ld-elf.so.1 1
has some code to ensure backward-compatibility with older versions of
.Nm .
.Pp
This manual page documents the quirks and tunables of
.Nm .
When linking with
.Li -lpthread ,
the run-time dependency
.Li libthr.so.3
is recorded in the produced object.
.Sh MUTEX ACQUISITION
A locked mutex (see
.Xr pthread_mutex_lock 3 )
is represented by a volatile variable of type
.Dv lwpid_t ,
which records the global system identifier of the thread
owning the lock.
.Nm
performs a contended mutex acquisition in three stages, each of which
is more resource-consuming than the previous.
The first two stages are only applied for a mutex of
.Dv PTHREAD_MUTEX_ADAPTIVE_NP
type and
.Dv PTHREAD_PRIO_NONE
protocol (see
.Xr pthread_mutexattr 3 ) .
.Pp
First, on SMP systems, a spin loop
is performed, where the library attempts to acquire the lock by
.Xr atomic 9
operations.
The loop count is controlled by the
.Ev LIBPTHREAD_SPINLOOPS
environment variable, with a default value of 2000.
.Pp
If the spin loop
was unable to acquire the mutex, a yield loop
is executed, performing the same
.Xr atomic 9
acquisition attempts as the spin loop,
but each attempt is followed by a yield of the CPU time
of the thread using the
.Xr sched_yield 2
syscall.
By default, the yield loop
is not executed.
This is controlled by the
.Ev LIBPTHREAD_YIELDLOOPS
environment variable.
.Pp
If both the spin and yield loops
failed to acquire the lock, the thread is taken off the CPU and
put to sleep in the kernel with the
.Xr _umtx_op 2
syscall.
The kernel wakes up a thread and hands the ownership of the lock to
the woken thread when the lock becomes available.
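.Pp
The staged approach described above can be modelled with the simplified
sketch below.
It is not the code used by the library: the lock word type, the function
names and the return codes are hypothetical, C11 atomics stand in for the
.Xr atomic 9
operations, and the final sleeping stage is only reported to the caller
rather than implemented.
```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 means unowned, otherwise the owner's id. */
typedef atomic_uint toy_lock_t;

/* One atomic acquisition attempt; true if the caller now owns the lock. */
static bool
toy_try(toy_lock_t *l, unsigned self)
{
    unsigned expected = 0;

    return (atomic_compare_exchange_strong_explicit(l, &expected, self,
        memory_order_acquire, memory_order_relaxed));
}

/*
 * Model of the first two stages.
 * Returns 1 if the spin loop acquired the lock, 2 if the yield loop
 * acquired it, and 0 if both failed and the caller would have to sleep.
 */
int
toy_lock_stages(toy_lock_t *l, unsigned self, int spinloops, int yieldloops)
{
    int i;

    /* Stage 1: spin, retrying the atomic operation. */
    for (i = 0; i < spinloops; i++) {
        if (toy_try(l, self))
            return (1);
    }

    /* Stage 2: same attempts, each failed one followed by a yield. */
    for (i = 0; i < yieldloops; i++) {
        if (toy_try(l, self))
            return (2);
        sched_yield();
    }

    /* Stage 3 (sleeping in the kernel) is outside this model. */
    return (0);
}
```
With the defaults quoted above, a caller would pass 2000 for
.Fa spinloops
and 0 for
.Fa yieldloops .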
.Sh THREAD STACKS
Each thread is provided with a private user-mode stack area
used by the C runtime.
The size of the main (initial) thread stack is set by the kernel, and is
controlled by the
.Dv RLIMIT_STACK
process resource limit (see
.Xr getrlimit 2 ) .
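.Pp
A program can inspect this limit with
.Xr getrlimit 2 ,
as in the minimal sketch below.
The helper name
.Fn stack_soft_limit
is hypothetical, and the sketch assumes that the soft (current) limit is
the value of interest.
```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: store the soft RLIMIT_STACK value in *out.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit).
 */
int
stack_soft_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return (-1);
    *out = rl.rlim_cur;    /* may be RLIM_INFINITY */
    return (0);
}
```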
.Pp
By default, the main thread's stack size is equal to the value of
.Dv RLIMIT_STACK
for the process.
If the
.Ev LIBPTHREAD_SPLITSTACK_MAIN
environment variable is present in the process environment
(its value does not matter),
the main thread's stack is reduced to 4MB on 64bit architectures, and to
2MB on 32bit architectures, when the threading library is initialized.
The rest of the address space area which has been reserved by the
kernel for the initial process stack is used for non-initial thread stacks
in this case.
The presence of the
.Ev LIBPTHREAD_BIGSTACK_MAIN
environment variable overrides
.Ev LIBPTHREAD_SPLITSTACK_MAIN ;
it is kept for backward-compatibility.
.Pp
The size of stacks for threads created by the process at run-time
with the
.Xr pthread_create 3
call is controlled by thread attributes: see
.Xr pthread_attr 3 ,
in particular, the
.Xr pthread_attr_setstacksize 3 ,
.Xr pthread_attr_setguardsize 3
and
.Xr pthread_attr_setstackaddr 3
functions.
If no attributes for the thread stack size are specified, the default
non-initial thread stack size is 2MB for 64bit architectures, and 1MB
for 32bit architectures.
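.Pp
The following sketch shows one way to query the stack size carried by a
default attribute object and to create a thread with an explicit stack
size.
The function names
.Fn mark_done ,
.Fn default_stack_size
and
.Fn run_with_stack
are hypothetical helpers introduced for illustration only.
```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body: sets the int pointed to by arg to 1. */
static void *
mark_done(void *arg)
{
    *(int *)arg = 1;
    return (NULL);
}

/* Store the stack size of a default-initialized attribute object in *out. */
int
default_stack_size(size_t *out)
{
    pthread_attr_t attr;
    int error;

    error = pthread_attr_init(&attr);
    if (error != 0)
        return (error);
    error = pthread_attr_getstacksize(&attr, out);
    (void)pthread_attr_destroy(&attr);
    return (error);
}

/* Run fn(arg) on a new thread with the requested stack size and join it. */
int
run_with_stack(size_t stacksize, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t td;
    int error;

    error = pthread_attr_init(&attr);
    if (error != 0)
        return (error);
    error = pthread_attr_setstacksize(&attr, stacksize);
    if (error == 0)
        error = pthread_create(&td, &attr, fn, arg);
    (void)pthread_attr_destroy(&attr);
    if (error == 0)
        error = pthread_join(td, NULL);
    return (error);
}
```
For example,
.Li run_with_stack(4u << 20, mark_done, &flag)
would request a 4MB stack for a thread that sets
.Va flag .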
.Sh RUN-TIME SETTINGS
The following environment variables are recognized by
.Nm
and adjust the operation of the library at run-time:
.Bl -tag -width "Ev LIBPTHREAD_SPLITSTACK_MAIN"
.It Ev LIBPTHREAD_BIGSTACK_MAIN
Disables the reduction of the initial thread stack enabled by
.Ev LIBPTHREAD_SPLITSTACK_MAIN .
.It Ev LIBPTHREAD_SPLITSTACK_MAIN
Causes a reduction of the initial thread stack, as described in the
section
.Sx THREAD STACKS .
This was the default behaviour of
.Nm
before
.Fx 11.0 .
.It Ev LIBPTHREAD_SPINLOOPS
The integer value of the variable overrides the default count of
iterations in the
.Li spin loop
of the mutex acquisition.
The default count is 2000, set by the
.Dv MUTEX_ADAPTIVE_SPINS
constant in the
.Nm
sources.
.It Ev LIBPTHREAD_YIELDLOOPS
A non-zero integer value enables the yield loop
in the process of the mutex acquisition.
The value is the count of loop operations.
.It Ev LIBPTHREAD_QUEUE_FIFO
The integer value of the variable specifies how often blocked
threads are inserted at the head of the sleep queue, instead of its tail.
Larger values reduce the frequency of the FIFO discipline.
The value must be between 0 and 255.
.It Ev LIBPTHREAD_UMTX_MIN_TIMEOUT
The minimal amount of time, in nanoseconds, the thread is required to sleep
for pthread operations specifying a timeout.
If the operation requests a timeout less than the value provided,
it is silently increased to the value.
A value of zero means no minimum (default).
.El
.Pp
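A launcher program might export one of these variables before starting
the threaded program, as in the minimal sketch below.
The helper name
.Fn export_tunable
is hypothetical.
The sketch assumes that the variables are examined when the library is
initialized, so they need to be in the environment before the target
program starts; changing them in an already running process is assumed to
have no effect.
```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: export NAME=VALUE (in decimal) into the current
 * environment so that a program started afterwards, for example with
 * execvp(3), inherits it.
 * Returns 0 on success, -1 on a negative value or setenv(3) failure.
 */
int
export_tunable(const char *name, long value)
{
    char buf[32];

    if (value < 0)
        return (-1);
    snprintf(buf, sizeof(buf), "%ld", value);
    return (setenv(name, buf, 1));
}
```
.Pp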
The following
.Dv sysctl
MIBs affect the operation of the library:
.Bl -tag -width "Dv debug.umtx.robust_faults_verbose"
.It Dv kern.ipc.umtx_vnode_persistent
By default, a shared lock backed by a mapped file in memory is
automatically destroyed on the last unmap of the corresponding file's page,
which is allowed by POSIX.
Setting the sysctl to 1 makes such a shared lock object persist until
the vnode is recycled by the Virtual File System.
Note that if the file is neither open nor mapped, the kernel might
recycle the vnode at any moment, making this sysctl less useful than it sounds.
.It Dv kern.ipc.umtx_max_robust
The maximal number of robust mutexes allowed for one thread.
The kernel will not unlock more mutexes than specified, see
.Xr _umtx_op 2
for more details.
The default value is large enough for most useful applications.
.It Dv debug.umtx.robust_faults_verbose
A non-zero value makes the kernel emit a diagnostic message when the
unlocking of robust mutexes was prematurely aborted after detecting an
inconsistency, as a measure to prevent memory corruption.
.El
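.Pp
The current value of one of these MIBs could be read from a program with
.Xr sysctlbyname 3 ,
as in the sketch below.
The helper name
.Fn read_int_sysctl
is hypothetical, and the sketch assumes that the MIB holds a plain integer.
On systems other than
.Fx
the helper only reports failure.
```c
#include <sys/types.h>
#ifdef __FreeBSD__
#include <sys/sysctl.h>
#endif
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: read an integer-valued sysctl by name.
 * Returns 0 on success, -1 with errno set otherwise.
 */
int
read_int_sysctl(const char *name, int *out)
{
#ifdef __FreeBSD__
    size_t len = sizeof(*out);

    /* Read-only query: no new value is supplied. */
    return (sysctlbyname(name, out, &len, NULL, 0));
#else
    /* sysctlbyname(3) is not assumed to exist elsewhere. */
    (void)name;
    (void)out;
    errno = ENOSYS;
    return (-1);
#endif
}
```
For example,
.Li read_int_sysctl("kern.ipc.umtx_vnode_persistent", &v)
would store the current setting in
.Va v .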
.Pp
The
.Dv RLIMIT_UMTXP
limit (see
.Xr getrlimit 2 )
defines how many shared locks a given user may create simultaneously.
.Sh INTERACTION WITH RUN-TIME LINKER
On load,
.Nm
installs interposing handlers into the hooks exported by
.Li libc .
The interposers provide a real locking implementation instead of the
stubs for single-threaded processes in
.Li libc ,
cancellation support and some modifications to the signal operations.
.Pp
.Nm
cannot be unloaded; the
.Xr dlclose 3
function does not perform any action when called with a handle for
.Nm .
One of the reasons is that the internal interposing of
.Li libc
functions cannot be undone.
.Sh SIGNALS
The implementation interposes the user-installed
.Xr signal 3
handlers.
This interposing is done to postpone signal delivery to threads which
have entered (libthr-internal) critical sections, where the calling
of the user-provided signal handler is unsafe.
An example of such a situation is owning the internal library lock.
When a signal is delivered while the signal handler cannot be safely
called, the call is postponed until after the exit from
the critical section.
This should be taken into account when interpreting
.Xr ktrace 1
logs.
.Sh PROCESS-SHARED SYNCHRONIZATION OBJECTS
In the
.Li libthr
implementation,
user-visible types for all synchronization objects (e.g. pthread_mutex_t)
are pointers to internal structures, allocated either by the corresponding
.Fn pthread_<objtype>_init
method call, or implicitly on first use when a static initializer
was specified.
The initial implementation of process-private locking objects used this
model with internal allocation, and the addition of process-shared objects
was done in a way that did not break the application binary interface.
.Pp
For process-private objects, the internal structure is allocated using
either
.Xr malloc 3
or, for
.Xr pthread_mutex_init 3 ,
an internal memory allocator implemented in
.Nm .
The internal allocator for mutexes is used to avoid bootstrap issues
with many
.Xr malloc 3
implementations which need working mutexes to function.
The same allocator is used for thread-specific data, see
.Xr pthread_setspecific 3 ,
for the same reason.
.Pp
For process-shared objects, the internal structure is created by first
allocating a shared memory segment using
.Xr _umtx_op 2
operation
.Dv UMTX_OP_SHM ,
and then mapping it into process address space with
.Xr mmap 2
with the
.Dv MAP_SHARED
flag.
The POSIX standard requires that:
.Bd -literal
only the process-shared synchronization object itself can be used for
performing synchronization.  It need not be referenced at the address
used to initialize it (that is, another mapping of the same object can
be used).
.Ed
.Pp
With the
.Fx
implementation, process-shared objects require initialization
in each process that uses them.
In particular, if you map the shared memory containing the user portion of
a process-shared object already initialized in a different process, locking
functions do not work on it.
.Pp
Another broken case is a forked child creating the object in memory shared
with the parent, which cannot be used from the parent.
Note that processes should not use non-async-signal-safe functions after
.Xr fork 2
anyway.
.Sh SEE ALSO
.Xr ktrace 1 ,
.Xr ld-elf.so.1 1 ,
.Xr getrlimit 2 ,
.Xr errno 2 ,
.Xr thr_exit 2 ,
.Xr thr_kill 2 ,
.Xr thr_kill2 2 ,
.Xr thr_new 2 ,
.Xr thr_self 2 ,
.Xr thr_set_name 2 ,
.Xr _umtx_op 2 ,
.Xr dlclose 3 ,
.Xr dlopen 3 ,
.Xr getenv 3 ,
.Xr pthread_attr 3 ,
.Xr pthread_attr_setstacksize 3 ,
.Xr pthread_create 3 ,
.Xr signal 3 ,
.Xr atomic 9
.Sh HISTORY
The
.Nm
library first appeared in
.Fx 5.2 .
.Sh AUTHORS
.An -nosplit
The
.Nm
library
was originally created by
.An Jeff Roberson Aq Mt jeff@FreeBSD.org ,
and enhanced by
.An Jonathan Mini Aq Mt mini@FreeBSD.org
and
.An Mike Makonnen Aq Mt mtm@FreeBSD.org .
It has been substantially rewritten and optimized by
.An David Xu Aq Mt davidxu@FreeBSD.org .