.\" Copyright (c) 2016 The FreeBSD Foundation, Inc.
.\"
.\" This documentation was written by
.\" Konstantin Belousov <kib@FreeBSD.org> under sponsorship
.\" from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd November 23, 2020
.Dt _UMTX_OP 2
.Os
.Sh NAME
.Nm _umtx_op
.Nd interface for implementation of userspace threading synchronization primitives
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/umtx.h
.Ft int
.Fn _umtx_op "void *obj" "int op" "u_long val" "void *uaddr" "void *uaddr2"
.Sh DESCRIPTION
The
.Fn _umtx_op
system call provides kernel support for userspace implementation of
the threading synchronization primitives.
The
.Lb libthr
uses the syscall to implement
.St -p1003.1-2001
pthread locks, like mutexes, condition variables and so on.
.Ss STRUCTURES
The operations, performed by the
.Fn _umtx_op
syscall, operate on userspace objects which are described
by the following structures.
Reserved fields and paddings are omitted.
All objects require ABI-mandated alignment, but this is not currently
enforced consistently on all architectures.
.Pp
The following flags are defined for flag fields of all structures:
.Bl -tag -width indent
.It Dv USYNC_PROCESS_SHARED
Allow selection of the process-shared sleep queue for the thread sleep
container, when the lock ownership cannot be granted immediately,
and the operation must sleep.
The process-shared or process-private sleep queue is selected based on
the attributes of the memory mapping which contains the first byte of
the structure, see
.Xr mmap 2 .
Otherwise, if the flag is not specified, the process-private sleep queue
is selected regardless of the memory mapping attributes, as an optimization.
.Pp
See the
.Sx SLEEP QUEUES
subsection below for more details on sleep queues.
.El
.Bl -hang -offset indent
.It Sy Mutex
.Bd -literal
struct umutex {
	volatile lwpid_t	m_owner;
	uint32_t		m_flags;
	uint32_t		m_ceilings[2];
	uintptr_t		m_rb_lnk;
};
.Ed
.Pp
The
.Dv m_owner
field is the actual lock.
It contains either the thread identifier of the lock owner in the
locked state, or zero when the lock is unowned.
The highest bit set indicates that there is contention on the lock.
The constants are defined for special values:
.Bl -tag -width indent
.It Dv UMUTEX_UNOWNED
Zero, the value stored in the unowned lock.
.It Dv UMUTEX_CONTESTED
The contention indicator.
.It Dv UMUTEX_RB_OWNERDEAD
A thread owning the robust mutex terminated.
The mutex is in unlocked state.
.It Dv UMUTEX_RB_NOTRECOV
The robust mutex is in a non-recoverable state.
It cannot be locked until reinitialized.
.El
.Pp
The
.Dv m_flags
field may contain the following umutex-specific flags, in addition to
the common flags:
.Bl -tag -width indent
.It Dv UMUTEX_PRIO_INHERIT
Mutex implements
.Em Priority Inheritance
protocol.
.It Dv UMUTEX_PRIO_PROTECT
Mutex implements
.Em Priority Protection
protocol.
.It Dv UMUTEX_ROBUST
Mutex is robust, as described in the
.Sx ROBUST UMUTEXES
section below.
.It Dv UMUTEX_NONCONSISTENT
Robust mutex is in a transient non-consistent state.
Not used by kernel.
.El
.Pp
In the manual page, mutexes not having
.Dv UMUTEX_PRIO_INHERIT
and
.Dv UMUTEX_PRIO_PROTECT
flags set, are called normal mutexes.
Each type of mutex
.Pq normal, priority-inherited, and priority-protected
has a separate sleep queue associated
with the given key.
.Pp
For priority protected mutexes, the
.Dv m_ceilings
array contains priority ceiling values.
The
.Dv m_ceilings[0]
is the ceiling value for the mutex, as specified by
.St -p1003.1-2008
for the
.Em Priority Protected
mutex protocol.
The
.Dv m_ceilings[1]
is used only for the unlock of a priority protected mutex, when
unlock is done in an order other than the reversed lock order.
In this case,
.Dv m_ceilings[1]
must contain the ceiling value for the last locked priority protected
mutex, for proper priority reassignment.
If, instead, the mutex being unlocked was the last priority protected
mutex locked by the thread,
.Dv m_ceilings[1]
should contain \-1.
This is required because the kernel does not maintain the ordered lock list.
.It Sy Condition variable
.Bd -literal
struct ucond {
	volatile uint32_t	c_has_waiters;
	uint32_t		c_flags;
	uint32_t		c_clockid;
};
.Ed
.Pp
A non-zero
.Dv c_has_waiters
value indicates that there are in-kernel waiters for the condition,
executing the
.Dv UMTX_OP_CV_WAIT
request.
.Pp
The
.Dv c_flags
field contains flags.
Only the common flags
.Pq Dv USYNC_PROCESS_SHARED
are defined for ucond.
.Pp
The
.Dv c_clockid
member provides the clock identifier to use for timeout, when the
.Dv UMTX_OP_CV_WAIT
request has both the
.Dv CVWAIT_CLOCKID
flag and the timeout specified.
Valid clock identifiers are a subset of those for
.Xr clock_gettime 2 :
.Bl -bullet -compact
.It
.Dv CLOCK_MONOTONIC
.It
.Dv CLOCK_MONOTONIC_FAST
.It
.Dv CLOCK_MONOTONIC_PRECISE
.It
.Dv CLOCK_PROF
.It
.Dv CLOCK_REALTIME
.It
.Dv CLOCK_REALTIME_FAST
.It
.Dv CLOCK_REALTIME_PRECISE
.It
.Dv CLOCK_SECOND
.It
.Dv CLOCK_UPTIME
.It
.Dv CLOCK_UPTIME_FAST
.It
.Dv CLOCK_UPTIME_PRECISE
.It
.Dv CLOCK_VIRTUAL
.El
.It Sy Reader/writer lock
.Bd -literal
struct urwlock {
	volatile int32_t	rw_state;
	uint32_t		rw_flags;
	uint32_t		rw_blocked_readers;
	uint32_t		rw_blocked_writers;
};
.Ed
.Pp
The
.Dv rw_state
field is the actual lock.
It contains both the flags and counter of the read locks which were
granted.
The names of the
.Dv rw_state
bits are as follows:
.Bl -tag -width indent
.It Dv URWLOCK_WRITE_OWNER
Write lock was granted.
.It Dv URWLOCK_WRITE_WAITERS
There are write lock waiters.
.It Dv URWLOCK_READ_WAITERS
There are read lock waiters.
.It Dv URWLOCK_READER_COUNT(c)
Returns the count of currently granted read locks.
.El
.Pp
At any given time there may be only one thread to which the writer lock
is granted on the
.Vt struct urwlock ,
and no threads are granted the read lock.
Or, at the given time, up to
.Dv URWLOCK_MAX_READERS
threads may be granted the read lock simultaneously, but the write lock is
not granted to any thread.
.Pp
The following flags for the
.Dv rw_flags
member of
.Vt struct urwlock
are defined, in addition to the common flags:
.Bl -tag -width indent
.It Dv URWLOCK_PREFER_READER
If specified, immediately grant read lock requests when
.Dv urwlock
is already read-locked, even in presence of unsatisfied write
lock requests.
By default, if there is a write lock waiter, further read requests are
not granted, to prevent unfair write lock waiter starvation.
.El
.Pp
The
.Dv rw_blocked_readers
and
.Dv rw_blocked_writers
members contain the count of threads which are sleeping in kernel,
waiting for the associated request type to be granted.
The fields are used by kernel to update the
.Dv URWLOCK_READ_WAITERS
and
.Dv URWLOCK_WRITE_WAITERS
flags of the
.Dv rw_state
lock after the requesting thread was woken up.
.It Sy Semaphore
.Bd -literal
struct _usem2 {
	volatile uint32_t	_count;
	uint32_t		_flags;
};
.Ed
.Pp
The
.Dv _count
word represents a counting semaphore.
A non-zero value indicates an unlocked (posted) semaphore, while zero
represents the locked state.
The maximal supported semaphore count is
.Dv USEM_MAX_COUNT .
.Pp
The
.Dv _count
word, besides the counter of posts (unlocks), also contains the
.Dv USEM_HAS_WAITERS
bit, which indicates that the locked semaphore has waiting threads.
.Pp
The
.Dv USEM_COUNT()
macro, applied to the
.Dv _count
word, returns the current semaphore counter, which is the number of posts
issued on the semaphore.
.Pp
The following bits for the
.Dv _flags
member of
.Vt struct _usem2
are defined, in addition to the common flags:
.Bl -tag -width indent
.It Dv USEM_NAMED
Flag is ignored by kernel.
.El
.It Sy Timeout parameter
.Bd -literal
struct _umtx_time {
	struct timespec	_timeout;
	uint32_t	_flags;
	uint32_t	_clockid;
};
.Ed
.Pp
Several
.Fn _umtx_op
operations allow the blocking time to be limited, failing the request
if it cannot be satisfied in the specified time period.
The timeout is specified by passing either the address of
.Vt struct timespec ,
or its extended variant,
.Vt struct _umtx_time ,
as the
.Fa uaddr2
argument of
.Fn _umtx_op .
They are distinguished by the
.Fa uaddr
value, which must be equal to the size of the structure pointed to by
.Fa uaddr2 ,
cast to
.Vt uintptr_t .
.Pp
The
.Dv _timeout
member specifies the time when the timeout should occur.
Legal values for clock identifier
.Dv _clockid
are shared with the
.Fa clock_id
argument to the
.Xr clock_gettime 2
function,
and use the same underlying clocks.
The specified clock is used to obtain the current time value.
Interval counting is always performed by the monotonic wall clock.
.Pp
The
.Dv _flags
argument allows the following flags to further define the timeout behaviour:
.Bl -tag -width indent
.It Dv UMTX_ABSTIME
The
.Dv _timeout
value is the absolute time.
The thread will be unblocked and the request failed when the specified
clock value equals or exceeds the
.Dv _timeout .
.Pp
If the flag is absent, the timeout value is relative, that is, the amount
of time measured by the monotonic wall clock from the moment of the request
start.
.El
.El
.Ss SLEEP QUEUES
When a locking request cannot be immediately satisfied, the thread is
typically put to
.Em sleep ,
which is a non-runnable state terminated by the
.Em wake
operation.
Lock operations include a
.Em try
variant which returns an error rather than sleeping if the lock cannot
be obtained.
Also,
.Fn _umtx_op
provides requests which explicitly put the thread to sleep.
.Pp
Wakes need to know which threads to make runnable, so sleeping threads
are grouped into containers called
.Em sleep queues .
A sleep queue is identified by a key, which for
.Fn _umtx_op
is defined as the physical address of some variable.
Note that the
.Em physical
address is used, which means that the same variable mapped multiple
times will give one key value.
This mechanism enables the construction of
.Em process-shared
locks.
.Pp
A related attribute of the key is shareability.
Some requests always interpret keys as private for the current process,
creating sleep queues with the scope of the current process even if
the memory is shared.
Others either select the shareability automatically from the
mapping attributes, or take additional input as the
.Dv USYNC_PROCESS_SHARED
common flag.
This is done as an optimization, allowing the lock scope to be limited
regardless of the kind of backing memory.
.Pp
Only the address of the start byte of the variable specified as key is
important for determining the corresponding sleep queue.
The size of the variable does not matter, so, for example, sleep on the same
address interpreted as
.Vt uint32_t
and
.Vt long
on a little-endian 64-bit platform would collide.
.Pp
The last attribute of the key is the object type.
The sleep queue to which a sleeping thread is assigned is an individual
one for simple wait requests, mutexes, rwlocks, condvars and other
primitives, even when the physical address of the key is the same.
.Pp
When waking up a limited number of threads from a given sleep queue,
the highest priority threads that have been blocked for the longest on
the queue are selected.
.Ss ROBUST UMUTEXES
The
.Em robust umutexes
are provided as a substrate for a userspace library to implement
.Tn POSIX
robust mutexes.
A robust umutex must have the
.Dv UMUTEX_ROBUST
flag set.
.Pp
On thread termination, the kernel walks two lists of mutexes.
The head addresses of the two lists must be provided by a prior
.Dv UMTX_OP_ROBUST_LISTS
request.
The lists are singly-linked.
The link to the next element is provided by the
.Dv m_rb_lnk
member of the
.Vt struct umutex .
.Pp
Robust list processing is aborted if the kernel finds a mutex
with any of the following conditions:
.Bl -dash -offset indent -compact
.It
the
.Dv UMUTEX_ROBUST
flag is not set
.It
not owned by the current thread, except when the mutex is pointed to
by the
.Dv robust_inact_offset
member of the
.Vt struct umtx_robust_lists_params ,
registered for the current thread
.It
the combination of mutex flags is invalid
.It
read of the umutex memory faults
.It
the list length limit described in
.Xr libthr 3
is reached.
.El
.Pp
Every mutex in both lists is unlocked as if the
.Dv UMTX_OP_MUTEX_UNLOCK
request is performed on it, but instead of the
.Dv UMUTEX_UNOWNED
value, the
.Dv m_owner
field is written with the
.Dv UMUTEX_RB_OWNERDEAD
value.
When a mutex in the
.Dv UMUTEX_RB_OWNERDEAD
state is locked by kernel due to the
.Dv UMTX_OP_MUTEX_TRYLOCK
and
.Dv UMTX_OP_MUTEX_LOCK
requests, the lock is granted and
.Er EOWNERDEAD
error is returned.
.Pp
Also, the kernel handles the
.Dv UMUTEX_RB_NOTRECOV
value of the
.Dv m_owner
field specially, always returning the
.Er ENOTRECOVERABLE
error for lock attempts, without granting the lock.
.Ss OPERATIONS
The following operations, requested by the
.Fa op
argument to the function, are implemented:
.Bl -tag -width indent
.It Dv UMTX_OP_WAIT
Wait.
The arguments for the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to a variable of type
.Vt long .
.It Fa val
Current value of the
.Dv *obj .
.El
.Pp
The current value of the variable pointed to by the
.Fa obj
argument is compared with the
.Fa val .
If they are equal, the requesting thread is put to interruptible sleep
until woken up or the optionally specified timeout expires.
.Pp
The comparison and sleep are atomic.
In other words, if another thread writes a new value to
.Dv *obj
and then issues
.Dv UMTX_OP_WAKE ,
the request is guaranteed to not miss the wakeup,
which might otherwise happen between comparison and blocking.
.Pp
The physical address of memory where the
.Fa *obj
variable is located, is used as a key to index sleeping threads.
.Pp
The read of the current value of the
.Dv *obj
variable is not guarded by barriers.
In particular, it is the user's duty to ensure the lock acquire
and release memory semantics, if the
.Dv UMTX_OP_WAIT
and
.Dv UMTX_OP_WAKE
requests are used as a substrate for implementing a simple lock.
.Pp
The request is not restartable.
An unblocked signal delivered during the wait always results in sleep
interruption and
.Er EINTR
error.
.Pp
Optionally, a timeout for the request may be specified.
.It Dv UMTX_OP_WAKE
Wake the threads possibly sleeping due to
.Dv UMTX_OP_WAIT .
The arguments for the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to a variable, used as a key to find sleeping threads.
.It Fa val
Up to
.Fa val
threads are woken up by this request.
Specify
.Dv INT_MAX
to wake up all waiters.
.El
.It Dv UMTX_OP_MUTEX_TRYLOCK
Try to lock umutex.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the umutex.
.El
.Pp
Operates the same as the
.Dv UMTX_OP_MUTEX_LOCK
request, but returns
.Er EBUSY
instead of sleeping if the lock cannot be obtained immediately.
.It Dv UMTX_OP_MUTEX_LOCK
Lock umutex.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the umutex.
.El
.Pp
Locking is performed by writing the current thread id into the
.Dv m_owner
word of the
.Vt struct umutex .
The write is atomic, preserves the
.Dv UMUTEX_CONTESTED
contention indicator, and provides the acquire barrier for
lock entrance semantic.
.Pp
If the lock cannot be obtained immediately because another thread owns
the lock, the current thread is put to sleep, with the
.Dv UMUTEX_CONTESTED
bit set beforehand.
Upon wake up, the lock conditions are re-tested.
.Pp
The request adheres to the priority protection or inheritance protocol
of the mutex, specified by the
.Dv UMUTEX_PRIO_PROTECT
or
.Dv UMUTEX_PRIO_INHERIT
flag, respectively.
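.Pp
The owner-word update described for this request can be modelled in
userspace as a compare-and-swap loop.
The following is a minimal sketch, not the kernel implementation.
The
.Dv SK_
names and the helper function are hypothetical, the two constants are
assumed to mirror
.Dv UMUTEX_UNOWNED
and
.Dv UMUTEX_CONTESTED
from
.In sys/umtx.h ,
and the robust owner values and priority protocols are ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror UMUTEX_UNOWNED / UMUTEX_CONTESTED; names are hypothetical. */
#define SK_UMUTEX_UNOWNED   0x0u
#define SK_UMUTEX_CONTESTED 0x80000000u

/*
 * Try to store `tid` into the owner word.  The contention bit already
 * present in the word is preserved, and the successful store has acquire
 * ordering.  Returns false when another thread owns the lock, which is
 * the point where a real implementation would sleep in the kernel.
 */
static bool
sk_umutex_acquire_word(_Atomic uint32_t *m_owner, uint32_t tid)
{
	uint32_t old, desired;

	/* A thread id must be non-zero and must not use the flag bit. */
	assert(tid != 0 && (tid & SK_UMUTEX_CONTESTED) == 0);

	old = atomic_load_explicit(m_owner, memory_order_relaxed);
	for (;;) {
		if ((old & ~SK_UMUTEX_CONTESTED) != SK_UMUTEX_UNOWNED)
			return (false);		/* owned by some thread */
		desired = tid | (old & SK_UMUTEX_CONTESTED);
		/* On failure `old` is reloaded and the test is repeated. */
		if (atomic_compare_exchange_weak_explicit(m_owner, &old,
		    desired, memory_order_acquire, memory_order_relaxed))
			return (true);
	}
}
```

A caller that gets
.Dv false
back would fall through to the
.Dv UMTX_OP_MUTEX_LOCK
request rather than spin.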
.Pp
Optionally, a timeout for the request may be specified.
.Pp
A request with a timeout specified is not restartable.
An unblocked signal delivered during the wait always results in sleep
interruption and
.Er EINTR
error.
A request without timeout specified is always restarted after return
from a signal handler.
.It Dv UMTX_OP_MUTEX_UNLOCK
Unlock umutex.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the umutex.
.El
.Pp
Unlocks the mutex, by writing
.Dv UMUTEX_UNOWNED
(zero) value into
.Dv m_owner
word of the
.Vt struct umutex .
The write is done with a release barrier, to provide lock leave semantic.
.Pp
If there are threads sleeping in the sleep queue associated with the
umutex, one thread is woken up.
If more than one thread sleeps in the sleep queue, the
.Dv UMUTEX_CONTESTED
bit is set together with the write of the
.Dv UMUTEX_UNOWNED
value into
.Dv m_owner .
.Pp
The request adheres to the priority protection or inheritance protocol
of the mutex, specified by the
.Dv UMUTEX_PRIO_PROTECT
or
.Dv UMUTEX_PRIO_INHERIT
flag, respectively.
See description of the
.Dv m_ceilings
member of the
.Vt struct umutex
structure for additional details of the request operation on the
priority protected protocol mutex.
.It Dv UMTX_OP_SET_CEILING
Set ceiling for the priority protected umutex.
The arguments to the request are:
.Bl -tag -width "uaddr"
.It Fa obj
Pointer to the umutex.
.It Fa val
New ceiling value.
.It Fa uaddr
Address of a variable of type
.Vt uint32_t .
If not
.Dv NULL
and the update was successful, the previous ceiling value is
written to the location pointed to by
.Fa uaddr .
.El
.Pp
The request locks the umutex pointed to by the
.Fa obj
parameter, waiting for the lock if not immediately available.
After the lock is obtained, the new ceiling value
.Fa val
is written to the
.Dv m_ceilings[0]
member of the
.Vt struct umutex ,
after which the umutex is unlocked.
.Pp
The locking does not adhere to the priority protect protocol,
to conform to the
.Tn POSIX
requirements for the
.Xr pthread_mutex_setprioceiling 3
interface.
.It Dv UMTX_OP_CV_WAIT
Wait for a condition.
The arguments to the request are:
.Bl -tag -width "uaddr2"
.It Fa obj
Pointer to the
.Vt struct ucond .
.It Fa val
Request flags, see below.
.It Fa uaddr
Pointer to the umutex.
.It Fa uaddr2
Optional pointer to a
.Vt struct timespec
for timeout specification.
.El
.Pp
The request must be issued by the thread owning the mutex pointed to
by the
.Fa uaddr
argument.
The
.Dv c_has_waiters
member of the
.Vt struct ucond ,
pointed to by the
.Fa obj
argument, is set to an arbitrary non-zero value, after which the
.Fa uaddr
mutex is unlocked (following the appropriate protocol), and
the current thread is put to sleep on the sleep queue keyed by
the
.Fa obj
argument.
The operations are performed atomically.
It is guaranteed to not miss a wakeup from
.Dv UMTX_OP_CV_SIGNAL
or
.Dv UMTX_OP_CV_BROADCAST
sent between mutex unlock and putting the current thread on the sleep queue.
.Pp
Upon wakeup, if the timeout expired and no other threads are sleeping in
the same sleep queue, the
.Dv c_has_waiters
member is cleared.
After wakeup, the
.Fa uaddr
umutex is not relocked.
.Pp
The following flags are defined:
.Bl -tag -width "CVWAIT_CLOCKID"
.It Dv CVWAIT_ABSTIME
Timeout is absolute.
.It Dv CVWAIT_CLOCKID
Clockid is provided.
.El
.Pp
Optionally, a timeout for the request may be specified.
Unlike other requests, the timeout value is specified directly by a
.Vt struct timespec ,
pointed to by the
.Fa uaddr2
argument.
If the
.Dv CVWAIT_CLOCKID
flag is provided, the timeout uses the clock from the
.Dv c_clockid
member of the
.Vt struct ucond ,
pointed to by the
.Fa obj
argument.
Otherwise,
.Dv CLOCK_REALTIME
is used, regardless of the clock identifier possibly specified in the
.Vt struct _umtx_time .
If the
.Dv CVWAIT_ABSTIME
flag is supplied, the timeout specifies absolute time value, otherwise
it denotes a relative time interval.
.Pp
The request is not restartable.
An unblocked signal delivered during
the wait always results in sleep interruption and
.Er EINTR
error.
.It Dv UMTX_OP_CV_SIGNAL
Wake up one condition waiter.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to
.Vt struct ucond .
.El
.Pp
The request wakes up at most one thread sleeping on the sleep queue keyed
by the
.Fa obj
argument.
If the woken up thread was the last on the sleep queue, the
.Dv c_has_waiters
member of the
.Vt struct ucond
is cleared.
.It Dv UMTX_OP_CV_BROADCAST
Wake up all condition waiters.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to
.Vt struct ucond .
.El
.Pp
The request wakes up all threads sleeping on the sleep queue keyed by the
.Fa obj
argument.
The
.Dv c_has_waiters
member of the
.Vt struct ucond
is cleared.
.It Dv UMTX_OP_WAIT_UINT
Same as
.Dv UMTX_OP_WAIT ,
but the type of the variable pointed to by
.Fa obj
is
.Vt u_int
.Pq a 32-bit integer .
.It Dv UMTX_OP_RW_RDLOCK
Read-lock a
.Vt struct urwlock
lock.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the lock (of type
.Vt struct urwlock )
to be read-locked.
.It Fa val
Additional flags to augment locking behaviour.
The valid flags in the
.Fa val
argument are:
.Bl -tag -width indent
.It Dv URWLOCK_PREFER_READER
.El
.El
.Pp
The request obtains the read lock on the specified
.Vt struct urwlock
by incrementing the count of readers in the
.Dv rw_state
word of the structure.
If the
.Dv URWLOCK_WRITE_OWNER
bit is set in the word
.Dv rw_state ,
the lock was granted to a writer which has not yet relinquished
its ownership.
In this case the current thread is put to sleep until it makes sense to
retry.
.Pp
If the
.Dv URWLOCK_PREFER_READER
flag is set either in the
.Dv rw_flags
word of the structure, or in the
.Fa val
argument of the request, the presence of the threads trying to obtain
the write lock on the same structure does not prevent the current thread
from trying to obtain the read lock.
Otherwise, if the flag is not set, and the
.Dv URWLOCK_WRITE_WAITERS
flag is set in
.Dv rw_state ,
the current thread does not attempt to obtain the read lock.
Instead it sets the
.Dv URWLOCK_READ_WAITERS
bit in the
.Dv rw_state
word and puts itself to sleep on the corresponding sleep queue.
Upon wakeup, the locking conditions are re-evaluated.
.Pp
Optionally, a timeout for the request may be specified.
.Pp
The request is not restartable.
An unblocked signal delivered during the wait always results in sleep
interruption and
.Er EINTR
error.
.It Dv UMTX_OP_RW_WRLOCK
Write-lock a
.Vt struct urwlock
lock.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the lock (of type
.Vt struct urwlock )
to be write-locked.
.El
.Pp
The request obtains a write lock on the specified
.Vt struct urwlock ,
by setting the
.Dv URWLOCK_WRITE_OWNER
bit in the
.Dv rw_state
word of the structure.
If there is already a write lock owner, as indicated by the
.Dv URWLOCK_WRITE_OWNER
bit being set, or there are read lock owners, as indicated
by the read-lock counter, the current thread does not attempt to
obtain the write lock.
Instead it sets the
.Dv URWLOCK_WRITE_WAITERS
bit in the
.Dv rw_state
word and puts itself to sleep on the corresponding sleep queue.
Upon wakeup, the locking conditions are re-evaluated.
.Pp
Optionally, a timeout for the request may be specified.
.Pp
The request is not restartable.
An unblocked signal delivered during the wait always results in sleep
interruption and
.Er EINTR
error.
.It Dv UMTX_OP_RW_UNLOCK
Unlock rwlock.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the lock (of type
.Vt struct urwlock )
to be unlocked.
.El
.Pp
The unlock type (read or write) is determined by the
current lock state.
Note that the
.Vt struct urwlock
does not save information about the identity of the thread which
acquired the lock.
.Pp
If there are pending writers after the unlock, and the
.Dv URWLOCK_PREFER_READER
flag is not set in the
.Dv rw_flags
member of the
.Fa *obj
structure, one writer is woken up, selected as described in the
.Sx SLEEP QUEUES
subsection.
If the
.Dv URWLOCK_PREFER_READER
flag is set, a pending writer is woken up only if there are
no pending readers.
.Pp
If there are no pending writers, or if the
.Dv URWLOCK_PREFER_READER
flag is set, all pending readers are woken up by the unlock.
.It Dv UMTX_OP_WAIT_UINT_PRIVATE
Same as
.Dv UMTX_OP_WAIT_UINT ,
but unconditionally select the process-private sleep queue.
.It Dv UMTX_OP_WAKE_PRIVATE
Same as
.Dv UMTX_OP_WAKE ,
but unconditionally select the process-private sleep queue.
.It Dv UMTX_OP_MUTEX_WAIT
Wait for mutex availability.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Address of the mutex.
.El
.Pp
Similarly to the
.Dv UMTX_OP_MUTEX_LOCK ,
put the requesting thread to sleep if the mutex lock cannot be obtained
immediately.
The
.Dv UMUTEX_CONTESTED
bit is set in the
.Dv m_owner
word of the mutex to indicate that there is a waiter, before the thread
is added to the sleep queue.
Unlike the
.Dv UMTX_OP_MUTEX_LOCK
request, the lock is not obtained.
.Pp
The operation is not implemented for priority protected and
priority inherited protocol mutexes.
.Pp
Optionally, a timeout for the request may be specified.
.Pp
A request with a timeout specified is not restartable.
An unblocked signal delivered during the wait always results in sleep
interruption and
.Er EINTR
error.
A request without a timeout automatically restarts if the signal disposition
requested restart via the
.Dv SA_RESTART
flag in
.Vt struct sigaction
member
.Dv sa_flags .
.It Dv UMTX_OP_NWAKE_PRIVATE
Wake up a batch of sleeping threads.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the array of pointers.
.It Fa val
Number of elements in the array pointed to by
.Fa obj .
.El
.Pp
For each element in the array pointed to by
.Fa obj ,
wakes up all threads waiting on the
.Em private
sleep queue with the key
being the byte addressed by the array element.
.It Dv UMTX_OP_MUTEX_WAKE
Check if a normal umutex is unlocked and wake up a waiter.
The arguments for the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the umutex.
.El
.Pp
If the
.Dv m_owner
word of the mutex pointed to by the
.Fa obj
argument indicates unowned mutex, which has its contention indicator bit
.Dv UMUTEX_CONTESTED
set, clear the bit and wake up one waiter in the sleep queue associated
with the byte addressed by the
.Fa obj ,
if any.
Only normal mutexes are supported by the request.
The sleep queue is always one for a normal mutex type.
.Pp
This request is deprecated in favor of
.Dv UMTX_OP_MUTEX_WAKE2
since mutexes using it cannot synchronize their own destruction.
That is, the
.Dv m_owner
word has already been set to
.Dv UMUTEX_UNOWNED
when this request is made,
so that another thread can lock, unlock and destroy the mutex
(if no other thread uses the mutex afterwards).
Clearing the
.Dv UMUTEX_CONTESTED
bit may then modify freed memory.
.It Dv UMTX_OP_MUTEX_WAKE2
Check if a umutex is unlocked and wake up a waiter.
The arguments for the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the umutex.
.It Fa val
The umutex flags.
.El
.Pp
The request does not read the
.Dv m_flags
member of the
.Vt struct umutex ;
instead, the
.Fa val
argument supplies flag information, in particular, to determine the
sleep queue where the waiters are found for wake up.
.Pp
If the mutex is unowned, one waiter is woken up.
.Pp
If the mutex memory cannot be accessed, all waiters are woken up.
.Pp
If there is more than one waiter on the sleep queue, or there is only
one waiter but the mutex is owned by a thread, the
.Dv UMUTEX_CONTESTED
bit is set in the
.Dv m_owner
word of the
.Vt struct umutex .
.It Dv UMTX_OP_SEM2_WAIT
Wait until semaphore is available.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the semaphore (of type
.Vt struct _usem2 ) .
.It Fa uaddr
Size of the memory passed in via the
.Fa uaddr2
argument.
.It Fa uaddr2
Optional pointer to a structure of type
.Vt struct _umtx_time ,
which may be followed by a structure of type
.Vt struct timespec .
.El
.Pp
Put the requesting thread onto a sleep queue if the semaphore counter
is zero.
If the thread is put to sleep, the
.Dv USEM_HAS_WAITERS
bit is set in the
.Dv _count
word to indicate waiters.
The function returns either due to
.Dv _count
indicating the semaphore is available (non-zero count due to post),
or due to a wakeup.
The return does not guarantee that the semaphore is available,
nor does it consume the semaphore lock on successful return.
.Pp
Optionally, a timeout for the request may be specified.
.Pp
A request with non-absolute timeout value is not restartable.
An unblocked signal delivered during such wait results in sleep
interruption and
.Er EINTR
error.
.Pp
If
.Dv UMTX_ABSTIME
was not set, and the operation was interrupted and the caller passed in a
.Fa uaddr2
large enough to hold a
.Vt struct timespec
following the initial
.Vt struct _umtx_time ,
then the
.Vt struct timespec
is updated to contain the unslept amount.
.It Dv UMTX_OP_SEM2_WAKE
Wake up waiters on semaphore lock.
The arguments to the request are:
.Bl -tag -width "obj"
.It Fa obj
Pointer to the semaphore (of type
.Vt struct _usem2 ) .
.El
.Pp
The request wakes up one waiter for the semaphore lock.
The function does not increment the semaphore lock count.
If the
.Dv USEM_HAS_WAITERS
bit was set in the
.Dv _count
word, and the last sleeping thread was woken up, the bit is cleared.
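.Pp
Because neither semaphore request consumes or increments the count,
userspace has to update the
.Dv _count
word itself.
The following is a minimal sketch of that bookkeeping, not the
.Lb libthr
implementation.
The
.Dv SK_
names and both helper functions are hypothetical, and the constants are
assumed to mirror
.Dv USEM_HAS_WAITERS ,
.Dv USEM_MAX_COUNT
and
.Dv USEM_COUNT()
from
.In sys/umtx.h .

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the USEM_* definitions; names are hypothetical. */
#define SK_USEM_HAS_WAITERS 0x80000000u
#define SK_USEM_MAX_COUNT   0x7fffffffu
#define SK_USEM_COUNT(c)    ((c) & SK_USEM_MAX_COUNT)

/*
 * Consume one post if the counter is non-zero.  The waiters bit is left
 * untouched.  A false return is where a caller would issue
 * UMTX_OP_SEM2_WAIT and then retry.
 */
static bool
sk_sem_trywait(_Atomic uint32_t *count)
{
	uint32_t old;

	old = atomic_load_explicit(count, memory_order_relaxed);
	while (SK_USEM_COUNT(old) != 0) {
		if (atomic_compare_exchange_weak_explicit(count, &old,
		    old - 1, memory_order_acquire, memory_order_relaxed))
			return (true);
	}
	return (false);
}

/*
 * Add one post.  Returns -1 if the counter is already at its maximum,
 * 1 if the waiters bit was set (the caller should then issue
 * UMTX_OP_SEM2_WAKE), and 0 otherwise.
 */
static int
sk_sem_post(_Atomic uint32_t *count)
{
	uint32_t old;

	old = atomic_load_explicit(count, memory_order_relaxed);
	do {
		if (SK_USEM_COUNT(old) == SK_USEM_MAX_COUNT)
			return (-1);
	} while (!atomic_compare_exchange_weak_explicit(count, &old,
	    old + 1, memory_order_release, memory_order_relaxed));
	return ((old & SK_USEM_HAS_WAITERS) != 0);
}
```

Checking the waiters bit in the value observed by the successful update
lets the common uncontended post avoid the system call entirely.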
.It Dv UMTX_OP_SHM
Manage anonymous
.Tn POSIX
shared memory objects (see
.Xr shm_open 2 ) ,
which can be attached to a byte of physical memory, mapped into the
process address space.
The objects are used to implement process-shared locks in
.Dv libthr .
.Pp
The
.Fa val
argument specifies the sub-request of the
.Dv UMTX_OP_SHM
request:
.Bl -tag -width indent
.It Dv UMTX_SHM_CREAT
Creates the anonymous shared memory object, which can be looked up
with the specified key
.Fa uaddr .
If the object associated with the
.Fa uaddr
key already exists, it is returned instead of creating a new object.
The object's size is one page.
On success, the file descriptor referencing the object is returned.
The descriptor can be used for mapping the object using
.Xr mmap 2 ,
or for other shared memory operations.
.It Dv UMTX_SHM_LOOKUP
Same as the
.Dv UMTX_SHM_CREAT
request, but if there is no shared memory object associated with
the specified key
.Fa uaddr ,
an error is returned, and no new object is created.
.It Dv UMTX_SHM_DESTROY
De-associate the shared object from the specified key
.Fa uaddr .
The object is destroyed after the last open file descriptor is closed
and the last mapping for it is destroyed.
.It Dv UMTX_SHM_ALIVE
Checks whether there is a live shared object associated with the
supplied key
.Fa uaddr .
Returns zero if there is, and an error otherwise.
This request is an optimization of the
.Dv UMTX_SHM_LOOKUP
request.
It is cheaper when only the liveness of the associated object is asked
for, since no file descriptor is installed in the process fd table
on success.
.El
.Pp
The
.Fa uaddr
argument specifies a virtual address; the identity of the physical memory
byte backing that address is used as the key for the anonymous shared
object creation or lookup.
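.Pp
A minimal sketch of the create-and-map sequence described above follows.
The helper name
.Fn shm_offpage_map
is hypothetical, and the non-FreeBSD branch is an assumption added only so
the fragment compiles on other systems, where it fails with
.Er ENOSYS .

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/umtx.h>
#endif

/*
 * Map the one-page anonymous shared object keyed by the memory that
 * backs `key`, creating the object if it does not exist yet.
 * Returns the mapping, or NULL with errno set.
 */
void *
shm_offpage_map(void *key)
{
#ifdef __FreeBSD__
	void *page;
	int fd, saved_errno;

	/* On success the request returns a file descriptor, not zero. */
	fd = _umtx_op(NULL, UMTX_OP_SHM, UMTX_SHM_CREAT, key, NULL);
	if (fd == -1)
		return (NULL);
	page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
	    PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	/* The mapping keeps the object alive; the descriptor can go. */
	saved_errno = errno;
	(void)close(fd);
	errno = saved_errno;
	return (page == MAP_FAILED ? NULL : page);
#else
	(void)key;
	errno = ENOSYS;
	return (NULL);
#endif
}
```

A caller would typically pass the address of a lock word that lives in
memory shared between processes, so that every process derives the same key.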
.It Dv UMTX_OP_ROBUST_LISTS
Register the list heads for the current thread's robust mutex lists.
The arguments to the request are:
.Bl -tag -width "uaddr"
.It Fa val
Size of the structure passed in the
.Fa uaddr
argument.
.It Fa uaddr
Pointer to the structure of type
.Vt struct umtx_robust_lists_params .
.El
.Pp
The structure is defined as
.Bd -literal
struct umtx_robust_lists_params {
	uintptr_t	robust_list_offset;
	uintptr_t	robust_priv_list_offset;
	uintptr_t	robust_inact_offset;
};
.Ed
.Pp
The
.Dv robust_list_offset
member contains the address of the first element in the list of locked
robust shared mutexes.
The
.Dv robust_priv_list_offset
member contains the address of the first element in the list of locked
robust private mutexes.
The private and shared robust locked lists are split to allow fast
termination of the shared list on fork, in the child.
.Pp
The
.Dv robust_inact_offset
member contains a pointer to the mutex which might be locked in the near
future, or might have been just unlocked.
It is typically set by the lock or unlock mutex implementation code
around the whole operation, since lists can only be changed race-free
when the thread owns the mutex.
The kernel inspects the
.Dv robust_inact_offset
in addition to walking the shared and private lists.
Also, the mutex pointed to by
.Dv robust_inact_offset
is handled more loosely at the thread termination time
than other mutexes on the list.
That mutex is allowed to be not owned by the current thread,
in which case list processing is continued.
See the
.Sx ROBUST UMUTEXES
subsection for details.
.It Dv UMTX_OP_GET_MIN_TIMEOUT
Writes out the current value of minimal umtx operations timeout,
in nanoseconds, into the long integer variable pointed to by
.Fa uaddr .
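.Pp
For illustration, a query of the current minimum might look like the
sketch below.
The wrapper name
.Fn umtx_min_timeout_ns
is hypothetical, and the preprocessor guard is an assumption that lets the
fragment build on systems or releases whose headers lack the request, in
which case it fails with
.Er ENOSYS .

```c
#include <sys/types.h>
#include <errno.h>
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/umtx.h>
#endif

/*
 * Return the minimal umtx timeout in nanoseconds,
 * or -1 with errno set on failure.
 */
long
umtx_min_timeout_ns(void)
{
#if defined(__FreeBSD__) && defined(UMTX_OP_GET_MIN_TIMEOUT)
	long ns = 0;

	/* The kernel writes the value through the uaddr argument. */
	if (_umtx_op(NULL, UMTX_OP_GET_MIN_TIMEOUT, 0, &ns, NULL) == -1)
		return (-1);
	return (ns);
#else
	errno = ENOSYS;
	return (-1);
#endif
}
```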
.It Dv UMTX_OP_SET_MIN_TIMEOUT
Set the minimal amount of time, in nanoseconds, the thread is required
to sleep for umtx operations specifying a timeout using absolute clocks.
The value is taken from the
.Fa val
argument of the call.
Zero means no minimum.
.El
.Pp
The
.Fa op
argument may be a bitwise OR of a single command from above with one or more of
the following flags:
.Bl -tag -width indent
.It Dv UMTX_OP__I386
Request i386 ABI compatibility from the native
.Nm
system call.
Specifically, this implies that:
.Bl -hang -offset indent
.It
.Fa obj
arguments that point to a word, point to a 32-bit integer.
.It
The
.Dv UMTX_OP_NWAKE_PRIVATE
.Fa obj
argument is a pointer to an array of 32-bit pointers.
.It
The
.Dv m_rb_lnk
member of
.Vt struct umutex
is a 32-bit pointer.
.It
.Vt struct timespec
uses a 32-bit time_t.
.El
.Pp
.Dv UMTX_OP__32BIT
has no effect if this flag is set.
This flag is valid for all architectures, but it is ignored on i386.
.It Dv UMTX_OP__32BIT
Request non-i386, 32-bit ABI compatibility from the native
.Nm
system call.
Specifically, this implies that:
.Bl -hang -offset indent
.It
.Fa obj
arguments that point to a word, point to a 32-bit integer.
.It
The
.Dv UMTX_OP_NWAKE_PRIVATE
.Fa obj
argument is a pointer to an array of 32-bit pointers.
.It
The
.Dv m_rb_lnk
member of
.Vt struct umutex
is a 32-bit pointer.
.It
.Vt struct timespec
uses a 64-bit time_t.
.El
.Pp
This flag has no effect if
.Dv UMTX_OP__I386
is set.
This flag is valid for all architectures.
.El
.Pp
Note that if any 32-bit ABI compatibility is being requested, then care must be
taken with robust lists.
A single thread may not mix 32-bit compatible robust lists with native
robust lists.
The first
.Dv UMTX_OP_ROBUST_LISTS
call in a given thread determines which ABI that thread will use for robust
lists going forward.
.Sh RETURN VALUES
If successful,
all requests, except
.Dv UMTX_SHM_CREAT
and
.Dv UMTX_SHM_LOOKUP
sub-requests of the
.Dv UMTX_OP_SHM
request, will return zero.
The
.Dv UMTX_SHM_CREAT
and
.Dv UMTX_SHM_LOOKUP
sub-requests return a shared memory file descriptor on success.
On error \-1 is returned, and the
.Va errno
variable is set to indicate the error.
.Sh ERRORS
The
.Fn _umtx_op
operations can fail with the following errors:
.Bl -tag -width "[ETIMEDOUT]"
.It Bq Er EFAULT
One of the arguments points to invalid memory.
.It Bq Er EINVAL
The clock identifier, specified for the
.Vt struct _umtx_time
timeout parameter, or in the
.Dv c_clockid
member of
.Vt struct ucond ,
is invalid.
.It Bq Er EINVAL
The type of the mutex, encoded by the
.Dv m_flags
member of
.Vt struct umutex ,
is invalid.
.It Bq Er EINVAL
The
.Dv m_owner
member of the
.Vt struct umutex
has changed the lock owner thread identifier during unlock.
.It Bq Er EINVAL
The
.Dv timeout.tv_sec
or
.Dv timeout.tv_nsec
member of
.Vt struct _umtx_time
is less than zero, or
.Dv timeout.tv_nsec
is greater than or equal to 1000000000.
.It Bq Er EINVAL
The
.Fa op
argument specifies an invalid operation.
.It Bq Er EINVAL
The
.Fa val
argument for the
.Dv UMTX_OP_SHM
request specifies an invalid sub-request.
.It Bq Er EINVAL
The
.Dv UMTX_OP_SET_CEILING
request specifies a mutex that is not priority-protected.
.It Bq Er EINVAL
The new ceiling value for the
.Dv UMTX_OP_SET_CEILING
request, or one or more of the values read from the
.Dv m_ceilings
array during lock or unlock operations, is greater than
.Dv RTP_PRIO_MAX .
.It Bq Er EPERM
Unlock attempted on an object not owned by the current thread.
.It Bq Er EOWNERDEAD
The lock was requested on a umutex where the
.Dv m_owner
field was set to the
.Dv UMUTEX_RB_OWNERDEAD
value, indicating a terminated robust mutex.
The lock was granted to the caller, so this error in fact
indicates success with additional conditions.
.It Bq Er ENOTRECOVERABLE
The lock was requested on a umutex whose
.Dv m_owner
field is equal to the
.Dv UMUTEX_RB_NOTRECOV
value, indicating a robust mutex abandoned after termination.
The lock was not granted to the caller.
.It Bq Er ENOTTY
The shared memory object, associated with the address passed to the
.Dv UMTX_SHM_ALIVE
sub-request of
.Dv UMTX_OP_SHM
request, was destroyed.
.It Bq Er ESRCH
For the
.Dv UMTX_SHM_LOOKUP ,
.Dv UMTX_SHM_DESTROY ,
and
.Dv UMTX_SHM_ALIVE
sub-requests of the
.Dv UMTX_OP_SHM
request, there is no shared memory object associated with the provided key.
.It Bq Er ENOMEM
The
.Dv UMTX_SHM_CREAT
sub-request of the
.Dv UMTX_OP_SHM
request cannot be satisfied, because allocation of the shared memory object
would exceed the
.Dv RLIMIT_UMTXP
resource limit, see
.Xr setrlimit 2 .
.It Bq Er EAGAIN
The maximum number of readers
.Dv ( URWLOCK_MAX_READERS )
were already granted ownership of the given
.Vt struct urwlock
for read.
.It Bq Er EBUSY
A try mutex lock operation was not able to obtain the lock.
.It Bq Er ETIMEDOUT
The request specified a timeout in the
.Fa uaddr
and
.Fa uaddr2
arguments, and timed out before obtaining the lock or being woken up.
.It Bq Er EINTR
A signal was delivered during wait, for a non-restartable operation.
Operations with timeouts are typically non-restartable, but timeouts
specified in absolute time may be restartable.
.It Bq Er ERESTART
A signal was delivered during wait, for a restartable operation.
Mutex lock requests without timeout specified are restartable.
The error is not returned to userspace code since restart
is handled by usual adjustment of the instruction counter.
.El
.Sh SEE ALSO
.Xr clock_gettime 2 ,
.Xr mmap 2 ,
.Xr setrlimit 2 ,
.Xr shm_open 2 ,
.Xr sigaction 2 ,
.Xr thr_exit 2 ,
.Xr thr_kill 2 ,
.Xr thr_kill2 2 ,
.Xr thr_new 2 ,
.Xr thr_self 2 ,
.Xr thr_set_name 2 ,
.Xr signal 3
.Sh STANDARDS
The
.Fn _umtx_op
system call is non-standard and is used by the
.Lb libthr
to implement
.St -p1003.1-2001
.Xr pthread 3
functionality.
.Sh BUGS
The window between unlocking a robust mutex and resetting the pointer in the
.Dv robust_inact_offset
member of the registered
.Vt struct umtx_robust_lists_params
allows another thread to destroy the mutex, thus making the kernel inspect
freed or reused memory.
The
.Li libthr
implementation is only vulnerable to this race when operating on
a shared mutex.
A possible fix for the current implementation is to strengthen the checks
for shared mutexes before terminating them, in particular, verifying
that the mutex memory is mapped from a shared memory object allocated
by the
.Dv UMTX_OP_SHM
request.
This is not done because it is believed that the race is adequately
covered by other consistency checks, while adding the check would
prevent alternative implementations of
.Li libpthread .