.\" Copyright (c) 2005 Robert N. M. Watson
.\" Copyright (c) 2014,2015,2021 The FreeBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" Part of this documentation was written by
.\" Konstantin Belousov <kib@FreeBSD.org> under sponsorship
.\" from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd October 1, 2021
.Dt LIBTHR 3
.Os
.Sh NAME
.Nm libthr
.Nd "1:1 POSIX threads library"
.Sh LIBRARY
.Lb libthr
.Sh SYNOPSIS
.In pthread.h
.Sh DESCRIPTION
The
.Nm
library provides a 1:1 implementation of the
.Xr pthread 3
library interfaces for application threading.
It has been optimized for use by applications expecting system scope thread
semantics.
.Pp
The library is tightly integrated with the run-time link editor
.Xr ld-elf.so.1 1
and
.Lb libc ;
all three components must be built from the same source tree.
Mixing
.Li libc
and
.Nm
libraries from different versions of
.Fx
is not supported.
The run-time linker
.Xr ld-elf.so.1 1
has some code to ensure backward-compatibility with older versions of
.Nm .
.Pp
This man page documents the quirks and tunables of
.Nm .
When linking with
.Li -lpthread ,
the run-time dependency
.Li libthr.so.3
is recorded in the produced object.
.Sh MUTEX ACQUISITION
A locked mutex (see
.Xr pthread_mutex_lock 3 )
is represented by a volatile variable of type
.Dv lwpid_t ,
which records the global system identifier of the thread
owning the lock.
.Nm
performs a contested mutex acquisition in three stages, each of which
is more resource-consuming than the previous.
The first two stages are only applied for a mutex of
.Dv PTHREAD_MUTEX_ADAPTIVE_NP
type and
.Dv PTHREAD_PRIO_NONE
protocol (see
.Xr pthread_mutexattr 3 ) .
.Pp
First, on SMP systems, a spin loop
is performed, where the library attempts to acquire the lock by
.Xr atomic 9
operations.
The loop count is controlled by the
.Ev LIBPTHREAD_SPINLOOPS
environment variable, with a default value of 2000.
.Pp
If the spin loop
was unable to acquire the mutex, a yield loop
is executed, performing the same
.Xr atomic 9
acquisition attempts as the spin loop,
but each attempt is followed by a yield of the CPU time
of the thread using the
.Xr sched_yield 2
syscall.
By default, the yield loop
is not executed.
This is controlled by the
.Ev LIBPTHREAD_YIELDLOOPS
environment variable.
.Pp
If both the spin and yield loops
failed to acquire the lock, the thread is taken off the CPU and
put to sleep in the kernel with the
.Xr _umtx_op 2
syscall.
The kernel wakes up a thread and hands the ownership of the lock to
the woken thread when the lock becomes available.
.Sh THREAD STACKS
Each thread is provided with a private user-mode stack area
used by the C runtime.
The size of the main (initial) thread stack is set by the kernel, and is
controlled by the
.Dv RLIMIT_STACK
process resource limit (see
.Xr getrlimit 2 ) .
.Pp
By default, the main thread's stack size is equal to the value of
.Dv RLIMIT_STACK
for the process.
If the
.Ev LIBPTHREAD_SPLITSTACK_MAIN
environment variable is present in the process environment
(its value does not matter),
the main thread's stack is reduced to 4MB on 64bit architectures, and to
2MB on 32bit architectures, when the threading library is initialized.
The rest of the address space area which has been reserved by the
kernel for the initial process stack is used for non-initial thread stacks
in this case.
The presence of the
.Ev LIBPTHREAD_BIGSTACK_MAIN
environment variable overrides
.Ev LIBPTHREAD_SPLITSTACK_MAIN ;
it is kept for backward-compatibility.
.Pp
The size of stacks for threads created by the process at run-time
with the
.Xr pthread_create 3
call is controlled by thread attributes: see
.Xr pthread_attr 3 ,
in particular, the
.Xr pthread_attr_setstacksize 3 ,
.Xr pthread_attr_setguardsize 3
and
.Xr pthread_attr_setstackaddr 3
functions.
If no attributes for the thread stack size are specified, the default
non-initial thread stack size is 2MB for 64bit architectures, and 1MB
for 32bit architectures.
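.Pp
As an illustration of the attribute-based control described above, the
following sketch creates a thread with an explicit stack size instead of
the library default.
The names
.Fn run_with_stack
and
.Fn worker
are hypothetical helpers introduced for this example; they are not part of
.Nm .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body: hand the argument back to the joiner. */
static void *
worker(void *arg)
{
	return (arg);
}

/*
 * Run worker(arg) on a new thread whose stack is stacksize bytes,
 * then wait for it to finish.
 * Returns 0 on success or an error number; on success *result
 * receives the value returned by worker().
 */
int
run_with_stack(size_t stacksize, void *arg, void **result)
{
	pthread_attr_t attr;
	pthread_t td;
	int error;

	error = pthread_attr_init(&attr);
	if (error != 0)
		return (error);
	/* Fails with EINVAL if stacksize is below PTHREAD_STACK_MIN. */
	error = pthread_attr_setstacksize(&attr, stacksize);
	if (error == 0)
		error = pthread_create(&td, &attr, worker, arg);
	pthread_attr_destroy(&attr);
	if (error != 0)
		return (error);
	return (pthread_join(td, result));
}
```
.Ed
.Pp
A caller that needs deep recursion might request, for example, 8MB with
.Li run_with_stack(8 * 1024 * 1024, arg, &result) .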
.Sh RUN-TIME SETTINGS
The following environment variables are recognized by
.Nm
and adjust the operation of the library at run-time:
.Bl -tag -width "Ev LIBPTHREAD_SPLITSTACK_MAIN"
.It Ev LIBPTHREAD_BIGSTACK_MAIN
Disables the reduction of the initial thread stack enabled by
.Ev LIBPTHREAD_SPLITSTACK_MAIN .
.It Ev LIBPTHREAD_SPLITSTACK_MAIN
Causes a reduction of the initial thread stack, as described in the
section
.Sx THREAD STACKS .
This was the default behaviour of
.Nm
before
.Fx 11.0 .
.It Ev LIBPTHREAD_SPINLOOPS
The integer value of the variable overrides the default count of
iterations in the
.Li spin loop
of the mutex acquisition.
The default count is 2000, set by the
.Dv MUTEX_ADAPTIVE_SPINS
constant in the
.Nm
sources.
.It Ev LIBPTHREAD_YIELDLOOPS
A non-zero integer value enables the yield loop
in the process of the mutex acquisition.
The value is the count of loop operations.
.It Ev LIBPTHREAD_QUEUE_FIFO
The integer value of the variable specifies how often blocked
threads are inserted at the head of the sleep queue, instead of its tail.
Bigger values reduce the frequency of the FIFO discipline.
The value must be between 0 and 255.
.It Ev LIBPTHREAD_UMTX_MIN_TIMEOUT
The minimal amount of time, in nanoseconds, the thread is required to sleep
for pthread operations specifying a timeout.
If the operation requests a timeout less than the value provided,
it is silently increased to the value.
The value of zero means no minimum (default).
.El
.Pp
The following
.Dv sysctl
MIBs affect the operation of the library:
.Bl -tag -width "Dv debug.umtx.robust_faults_verbose"
.It Dv kern.ipc.umtx_vnode_persistent
By default, a shared lock backed by a mapped file in memory is
automatically destroyed on the last unmap of the corresponding file's page,
which is allowed by POSIX.
Setting the sysctl to 1 makes such a shared lock object persist until
the vnode is recycled by the Virtual File System.
Note that if the file is neither open nor mapped, the kernel might
recycle it at any moment, making this sysctl less useful than it sounds.
.It Dv kern.ipc.umtx_max_robust
The maximal number of robust mutexes allowed for one thread.
The kernel will not unlock more mutexes than specified, see
.Xr _umtx_op 2
for more details.
The default value is large enough for most useful applications.
.It Dv debug.umtx.robust_faults_verbose
A non-zero value makes the kernel emit a diagnostic message when the
unlocking of robust mutexes was prematurely aborted after detecting some
inconsistency, as a measure to prevent memory corruption.
.El
.Pp
The
.Dv RLIMIT_UMTXP
limit (see
.Xr getrlimit 2 )
defines how many shared locks a given user may create simultaneously.
.Sh INTERACTION WITH RUN-TIME LINKER
On load,
.Nm
installs interposing handlers into the hooks exported by
.Li libc .
The interposers provide a real locking implementation instead of the
stubs for single-threaded processes in
.Li libc ,
cancellation support, and some modifications to the signal operations.
.Pp
.Nm
cannot be unloaded; the
.Xr dlclose 3
function does not perform any action when called with a handle for
.Nm .
One of the reasons is that the internal interposing of
.Li libc
functions cannot be undone.
.Sh SIGNALS
The implementation interposes the user-installed
.Xr signal 3
handlers.
This interposing is done to postpone signal delivery to threads which
entered (libthr-internal) critical sections, where the calling
of the user-provided signal handler is unsafe.
An example of such a situation is owning the internal library lock.
When a signal is delivered while the signal handler cannot be safely
called, the call is postponed until after the exit from
the critical section.
This should be taken into account when interpreting
.Xr ktrace 1
logs.
.Sh PROCESS-SHARED SYNCHRONIZATION OBJECTS
In the
.Li libthr
implementation,
user-visible types for all synchronization objects (e.g.,
.Vt pthread_mutex_t )
are pointers to internal structures, allocated either by the corresponding
.Fn pthread_<objtype>_init
method call, or implicitly on first use when a static initializer
was specified.
The initial implementation of process-private locking objects used this
model with internal allocation, and the addition of process-shared objects
was done in a way that did not break the application binary interface.
.Pp
For process-private objects, the internal structure is allocated using
either
.Xr malloc 3
or, for
.Xr pthread_mutex_init 3 ,
an internal memory allocator implemented in
.Nm .
The internal allocator for mutexes is used to avoid bootstrap issues
with many
.Xr malloc 3
implementations which need working mutexes to function.
The same allocator is used for thread-specific data, see
.Xr pthread_setspecific 3 ,
for the same reason.
.Pp
For process-shared objects, the internal structure is created by first
allocating a shared memory segment using
.Xr _umtx_op 2
operation
.Dv UMTX_OP_SHM ,
and then mapping it into process address space with
.Xr mmap 2
with the
.Dv MAP_SHARED
flag.
The POSIX standard requires that:
.Bd -literal
only the process-shared synchronization object itself can be used for
performing synchronization. It need not be referenced at the address
used to initialize it (that is, another mapping of the same object can
be used).
.Ed
.Pp
With the
.Fx
implementation, process-shared objects require initialization
in each process that uses them.
In particular, if you map the shared memory containing the user portion of
a process-shared object already initialized in a different process, locking
functions do not work on it.
.Pp
Another broken case is a forked child creating the object in memory shared
with the parent, which cannot be used from the parent.
Note that processes should not use non-async-signal safe functions after
.Xr fork 2
anyway.
.Sh SEE ALSO
.Xr ktrace 1 ,
.Xr ld-elf.so.1 1 ,
.Xr getrlimit 2 ,
.Xr errno 2 ,
.Xr thr_exit 2 ,
.Xr thr_kill 2 ,
.Xr thr_kill2 2 ,
.Xr thr_new 2 ,
.Xr thr_self 2 ,
.Xr thr_set_name 2 ,
.Xr _umtx_op 2 ,
.Xr dlclose 3 ,
.Xr dlopen 3 ,
.Xr getenv 3 ,
.Xr pthread_attr 3 ,
.Xr pthread_attr_setstacksize 3 ,
.Xr pthread_create 3 ,
.Xr signal 3 ,
.Xr atomic 9
.Sh HISTORY
The
.Nm
library first appeared in
.Fx 5.2 .
.Sh AUTHORS
.An -nosplit
The
.Nm
library
was originally created by
.An Jeff Roberson Aq Mt jeff@FreeBSD.org ,
and enhanced by
.An Jonathan Mini Aq Mt mini@FreeBSD.org
and
.An Mike Makonnen Aq Mt mtm@FreeBSD.org .
It has been substantially rewritten and optimized by
.An David Xu Aq Mt davidxu@FreeBSD.org .