.\" Copyright (c) 2005 Robert N. M. Watson
.\" Copyright (c) 2014,2015,2021 The FreeBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" Part of this documentation was written by
.\" Konstantin Belousov <kib@FreeBSD.org> under sponsorship
.\" from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd October 1, 2021
.Dt LIBTHR 3
.Os
.Sh NAME
.Nm libthr
.Nd "1:1 POSIX threads library"
.Sh LIBRARY
.Lb libthr
.Sh SYNOPSIS
.In pthread.h
.Sh DESCRIPTION
The
.Nm
library provides a 1:1 implementation of the
.Xr pthread 3
library interfaces for application threading.
It has been optimized for use by applications expecting system scope thread
semantics.
.Pp
The library is tightly integrated with the run-time link editor
.Xr ld-elf.so.1 1
and
.Lb libc ;
all three components must be built from the same source tree.
Mixing
.Li libc
and
.Nm
libraries from different versions of
.Fx
is not supported.
The run-time linker
.Xr ld-elf.so.1 1
has some code to ensure backward-compatibility with older versions of
.Nm .
.Pp
This manual page documents the quirks and tunables of
.Nm .
When linking with
.Li -lpthread ,
the run-time dependency
.Li libthr.so.3
is recorded in the produced object.
.Sh MUTEX ACQUISITION
A locked mutex (see
.Xr pthread_mutex_lock 3 )
is represented by a volatile variable of type
.Dv lwpid_t ,
which records the global system identifier of the thread
owning the lock.
.Nm
performs a contended mutex acquisition in three stages, each of which
is more resource-consuming than the previous one.
The first two stages are only applied for a mutex of
.Dv PTHREAD_MUTEX_ADAPTIVE_NP
type and
.Dv PTHREAD_PRIO_NONE
protocol (see
.Xr pthread_mutexattr 3 ) .
.Pp
First, on SMP systems, a spin loop
is performed, where the library attempts to acquire the lock by
.Xr atomic 9
operations.
The loop count is controlled by the
.Ev LIBPTHREAD_SPINLOOPS
environment variable, with a default value of 2000.
.Pp
If the spin loop
was unable to acquire the mutex, a yield loop
is executed, performing the same
.Xr atomic 9
acquisition attempts as the spin loop,
but each attempt is followed by a yield of the CPU time
of the thread using the
.Xr sched_yield 2
syscall.
By default, the yield loop
is not executed.
This is controlled by the
.Ev LIBPTHREAD_YIELDLOOPS
environment variable.
.Pp
If both the spin and yield loops
failed to acquire the lock, the thread is taken off the CPU and
put to sleep in the kernel with the
.Xr _umtx_op 2
syscall.
The kernel wakes up a thread and hands the ownership of the lock to
the woken thread when the lock becomes available.
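.Pp
As an illustration only, the following sketch shows one way an application
might request the adaptive mutex type, so that the spin and yield stages
apply to it.
The helper name
.Fn adaptive_mutex_init
is hypothetical and is not provided by
.Nm ;
the protocol attribute is left untouched on the assumption that its default is
.Dv PTHREAD_PRIO_NONE .
.Bd -literal
```c
#include <pthread.h>

/*
 * Hypothetical helper: initialize *m as an adaptive mutex.
 * The protocol attribute is not set, so the mutex keeps the
 * default protocol (assumed to be PTHREAD_PRIO_NONE).
 * Returns 0 on success or the error number from pthread.
 */
int
adaptive_mutex_init(pthread_mutex_t *m)
{
	pthread_mutexattr_t attr;
	int error;

	error = pthread_mutexattr_init(&attr);
	if (error != 0)
		return (error);
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
	if (error == 0)
		error = pthread_mutex_init(m, &attr);
	/* The attribute object is no longer needed once the mutex exists. */
	pthread_mutexattr_destroy(&attr);
	return (error);
}
```
.Ed
.Pp
A mutex set up this way is then used with the ordinary
.Xr pthread_mutex_lock 3
and
.Xr pthread_mutex_unlock 3
calls; whether spinning actually helps depends on the workload.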
.Sh THREAD STACKS
Each thread is provided with a private user-mode stack area
used by the C runtime.
The size of the main (initial) thread stack is set by the kernel, and is
controlled by the
.Dv RLIMIT_STACK
process resource limit (see
.Xr getrlimit 2 ) .
.Pp
By default, the main thread's stack size is equal to the value of
.Dv RLIMIT_STACK
for the process.
If the
.Ev LIBPTHREAD_SPLITSTACK_MAIN
environment variable is present in the process environment
(its value does not matter),
the main thread's stack is reduced to 4MB on 64-bit architectures, and to
2MB on 32-bit architectures, when the threading library is initialized.
The rest of the address space area which has been reserved by the
kernel for the initial process stack is used for non-initial thread stacks
in this case.
The presence of the
.Ev LIBPTHREAD_BIGSTACK_MAIN
environment variable overrides
.Ev LIBPTHREAD_SPLITSTACK_MAIN ;
it is kept for backward-compatibility.
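.Pp
To see which limit applies to the main thread stack, a process can query
.Dv RLIMIT_STACK
itself.
The sketch below is a minimal example; the function name
.Fn main_stack_limit
is hypothetical, and it assumes that the current (soft) limit is the value
of interest.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: store the soft RLIMIT_STACK limit in *cur.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit).
 */
int
main_stack_limit(rlim_t *cur)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0)
		return (-1);
	*cur = rl.rlim_cur;
	return (0);
}
```
.Ed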
.Pp
The size of stacks for threads created by the process at run-time
with the
.Xr pthread_create 3
call is controlled by thread attributes: see
.Xr pthread_attr 3 ,
in particular, the
.Xr pthread_attr_setstacksize 3 ,
.Xr pthread_attr_setguardsize 3
and
.Xr pthread_attr_setstackaddr 3
functions.
If no attributes for the thread stack size are specified, the default
non-initial thread stack size is 2MB for 64-bit architectures, and 1MB
for 32-bit architectures.
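.Pp
The following sketch shows how an application might override the default
stack size for a single thread through the attribute interface.
The names
.Fn run_with_stack
and
.Fn worker
are hypothetical, and the size passed in is the caller's choice, not a
recommendation.
.Bd -literal
```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body: record through the argument that it ran. */
static void *
worker(void *arg)
{
	*(int *)arg = 1;
	return (NULL);
}

/*
 * Hypothetical helper: create and join one thread with an explicit
 * stack size.  On success *actual holds the size recorded in the
 * attribute object.  Returns 0 on success, a pthread error number
 * on failure, or -1 if the thread body did not run.
 */
int
run_with_stack(size_t stacksize, size_t *actual)
{
	pthread_attr_t attr;
	pthread_t td;
	int error, ran = 0;

	error = pthread_attr_init(&attr);
	if (error != 0)
		return (error);
	error = pthread_attr_setstacksize(&attr, stacksize);
	if (error == 0)
		error = pthread_attr_getstacksize(&attr, actual);
	if (error == 0)
		error = pthread_create(&td, &attr, worker, &ran);
	if (error == 0)
		error = pthread_join(td, NULL);
	pthread_attr_destroy(&attr);
	if (error == 0 && ran != 1)
		error = -1;
	return (error);
}
```
.Ed
.Pp
For instance, calling the helper with a size of 4MB asks for a stack twice
the 64-bit default quoted above.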
.Sh RUN-TIME SETTINGS
The following environment variables are recognized by
.Nm
and adjust the operation of the library at run-time:
.Bl -tag -width "Ev LIBPTHREAD_SPLITSTACK_MAIN"
.It Ev LIBPTHREAD_BIGSTACK_MAIN
Disables the reduction of the initial thread stack enabled by
.Ev LIBPTHREAD_SPLITSTACK_MAIN .
.It Ev LIBPTHREAD_SPLITSTACK_MAIN
Causes a reduction of the initial thread stack, as described in the
section
.Sx THREAD STACKS .
This was the default behaviour of
.Nm
before
.Fx 11.0 .
.It Ev LIBPTHREAD_SPINLOOPS
The integer value of the variable overrides the default count of
iterations in the
.Li spin loop
of the mutex acquisition.
The default count is 2000, set by the
.Dv MUTEX_ADAPTIVE_SPINS
constant in the
.Nm
sources.
.It Ev LIBPTHREAD_YIELDLOOPS
A non-zero integer value enables the yield loop
in the process of the mutex acquisition.
The value is the count of loop operations.
.It Ev LIBPTHREAD_QUEUE_FIFO
The integer value of the variable specifies how often blocked
threads are inserted at the head of the sleep queue, instead of its tail.
Bigger values reduce the frequency of the FIFO discipline.
The value must be between 0 and 255.
.It Ev LIBPTHREAD_UMTX_MIN_TIMEOUT
The minimal amount of time, in nanoseconds, the thread is required to sleep
for pthread operations specifying a timeout.
If the operation requests a timeout less than the value provided,
it is silently increased to the value.
The value of zero means no minimum (default).
.El
.Pp
The following
.Dv sysctl
MIBs affect the operation of the library:
.Bl -tag -width "Dv debug.umtx.robust_faults_verbose"
.It Dv kern.ipc.umtx_vnode_persistent
By default, a shared lock backed by a mapped file in memory is
automatically destroyed on the last unmap of the corresponding file's page,
which is allowed by POSIX.
Setting the sysctl to 1 makes such a shared lock object persist until
the vnode is recycled by the Virtual File System.
Note that if the file is neither open nor mapped, the kernel might
recycle it at any moment, making this sysctl less useful than it sounds.
.It Dv kern.ipc.umtx_max_robust
The maximal number of robust mutexes allowed for one thread.
The kernel will not unlock more mutexes than specified, see
.Xr _umtx_op 2
for more details.
The default value is large enough for most useful applications.
.It Dv debug.umtx.robust_faults_verbose
A non-zero value makes the kernel emit a diagnostic when the unlocking of
robust mutexes was prematurely aborted after detecting some inconsistency,
as a measure to prevent memory corruption.
.El
.Pp
The
.Dv RLIMIT_UMTXP
limit (see
.Xr getrlimit 2 )
defines how many shared locks a given user may create simultaneously.
.Sh INTERACTION WITH RUN-TIME LINKER
On load,
.Nm
installs interposing handlers into the hooks exported by
.Li libc .
The interposers provide a real locking implementation instead of the
stubs for single-threaded processes in
.Li libc ,
cancellation support and some modifications to the signal operations.
.Pp
.Nm
cannot be unloaded; the
.Xr dlclose 3
function does not perform any action when called with a handle for
.Nm .
One of the reasons is that the internal interposing of
.Li libc
functions cannot be undone.
.Sh SIGNALS
The implementation interposes the user-installed
.Xr signal 3
handlers.
This interposing is done to postpone signal delivery to threads which
entered (libthr-internal) critical sections, where the calling
of the user-provided signal handler is unsafe.
An example of such a situation is owning the internal library lock.
When a signal is delivered while the signal handler cannot be safely
called, the call is postponed until after the exit from
the critical section.
This should be taken into account when interpreting
.Xr ktrace 1
logs.
.Sh PROCESS-SHARED SYNCHRONIZATION OBJECTS
In the
.Li libthr
implementation,
user-visible types for all synchronization objects (e.g.,
.Vt pthread_mutex_t )
are pointers to internal structures, allocated either by the corresponding
.Fn pthread_<objtype>_init
method call, or implicitly on first use when a static initializer
was specified.
The initial implementation of process-private locking objects used this
model with internal allocation, and the addition of process-shared objects
was done in a way that did not break the application binary interface.
.Pp
For process-private objects, the internal structure is allocated using
either
.Xr malloc 3
or, for
.Xr pthread_mutex_init 3 ,
an internal memory allocator implemented in
.Nm .
The internal allocator for mutexes is used to avoid bootstrap issues
with many
.Xr malloc 3
implementations which need working mutexes to function.
The same allocator is used for thread-specific data, see
.Xr pthread_setspecific 3 ,
for the same reason.
.Pp
For process-shared objects, the internal structure is created by first
allocating a shared memory segment using
.Xr _umtx_op 2
operation
.Dv UMTX_OP_SHM ,
and then mapping it into process address space with
.Xr mmap 2
with the
.Dv MAP_SHARED
flag.
The POSIX standard requires that:
.Bd -literal
only the process-shared synchronization object itself can be used for
performing synchronization.  It need not be referenced at the address
used to initialize it (that is, another mapping of the same object can
be used).
.Ed
.Pp
With the
.Fx
implementation, process-shared objects require initialization
in each process that uses them.
In particular, if you map the shared memory containing the user portion of
a process-shared object already initialized in a different process, locking
functions do not work on it.
.Pp
Another broken case is a forked child creating the object in memory shared
with the parent, which cannot be used from the parent.
Note that processes should not use non-async-signal-safe functions after
.Xr fork 2
anyway.
.Sh SEE ALSO
.Xr ktrace 1 ,
.Xr ld-elf.so.1 1 ,
.Xr getrlimit 2 ,
.Xr errno 2 ,
.Xr thr_exit 2 ,
.Xr thr_kill 2 ,
.Xr thr_kill2 2 ,
.Xr thr_new 2 ,
.Xr thr_self 2 ,
.Xr thr_set_name 2 ,
.Xr _umtx_op 2 ,
.Xr dlclose 3 ,
.Xr dlopen 3 ,
.Xr getenv 3 ,
.Xr pthread_attr 3 ,
.Xr pthread_attr_setstacksize 3 ,
.Xr pthread_create 3 ,
.Xr signal 3 ,
.Xr atomic 9
.Sh HISTORY
The
.Nm
library first appeared in
.Fx 5.2 .
.Sh AUTHORS
.An -nosplit
The
.Nm
library
was originally created by
.An Jeff Roberson Aq Mt jeff@FreeBSD.org ,
and enhanced by
.An Jonathan Mini Aq Mt mini@FreeBSD.org
and
.An Mike Makonnen Aq Mt mtm@FreeBSD.org .
It has been substantially rewritten and optimized by
.An David Xu Aq Mt davidxu@FreeBSD.org .
365