.\" Copyright (c) 2000-2001 John H. Baldwin <jhb@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
.\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
.\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 12, 2016
.Dt ATOMIC 9
.Os
.Sh NAME
.Nm atomic_add ,
.Nm atomic_clear ,
.Nm atomic_cmpset ,
.Nm atomic_fetchadd ,
.Nm atomic_load ,
.Nm atomic_readandclear ,
.Nm atomic_set ,
.Nm atomic_subtract ,
.Nm atomic_store
.Nd atomic operations
.Sh SYNOPSIS
.In sys/types.h
.In machine/atomic.h
.Ft void
.Fn atomic_add_[acq_|rel_]<type> "volatile <type> *p" "<type> v"
.Ft void
.Fn atomic_clear_[acq_|rel_]<type> "volatile <type> *p" "<type> v"
.Ft int
.Fo atomic_cmpset_[acq_|rel_]<type>
.Fa "volatile <type> *dst"
.Fa "<type> old"
.Fa "<type> new"
.Fc
.Ft <type>
.Fn atomic_fetchadd_<type> "volatile <type> *p" "<type> v"
.Ft <type>
.Fn atomic_load_acq_<type> "volatile <type> *p"
.Ft <type>
.Fn atomic_readandclear_<type> "volatile <type> *p"
.Ft void
.Fn atomic_set_[acq_|rel_]<type> "volatile <type> *p" "<type> v"
.Ft void
.Fn atomic_subtract_[acq_|rel_]<type> "volatile <type> *p" "<type> v"
.Ft void
.Fn atomic_store_rel_<type> "volatile <type> *p" "<type> v"
.Ft <type>
.Fn atomic_swap_<type> "volatile <type> *p" "<type> v"
.Ft int
.Fn atomic_testandclear_<type> "volatile <type> *p" "u_int v"
.Ft int
.Fn atomic_testandset_<type> "volatile <type> *p" "u_int v"
.Sh DESCRIPTION
Each of the atomic operations is guaranteed to be atomic across multiple
threads and in the presence of interrupts.
They can be used to implement reference counts or as building blocks for more
advanced synchronization primitives such as mutexes.
.Ss Types
Each atomic operation operates on a specific
.Fa type .
The type to use is indicated in the function name.
The available types that can be used are:
.Pp
.Bl -tag -offset indent -width short -compact
.It Li int
unsigned integer
.It Li long
unsigned long integer
.It Li ptr
unsigned integer the size of a pointer
.It Li 32
unsigned 32-bit integer
.It Li 64
unsigned 64-bit integer
.El
.Pp
For example, the function to atomically add two integers is called
.Fn atomic_add_int .
.Pp
Certain architectures also provide operations for types smaller than
.Dq Li int .
.Pp
.Bl -tag -offset indent -width short -compact
.It Li char
unsigned character
.It Li short
unsigned short integer
.It Li 8
unsigned 8-bit integer
.It Li 16
unsigned 16-bit integer
.El
.Pp
These must not be used in MI code because the instructions to implement them
efficiently might not be available.
.Ss Acquire and Release Operations
By default, a thread's accesses to different memory locations might not be
performed in
.Em program order ,
that is, the order in which the accesses appear in the source code.
To optimize the program's execution, both the compiler and processor might
reorder the thread's accesses.
However, both ensure that their reordering of the accesses is not visible to
the thread.
Otherwise, the traditional memory model that is expected by single-threaded
programs would be violated.
Nonetheless, other threads in a multithreaded program, such as the
.Fx
kernel, might observe the reordering.
Moreover, in some cases, such as the implementation of synchronization between
threads, arbitrary reordering might result in the incorrect execution of the
program.
To constrain the reordering that both the compiler and processor might perform
on a thread's accesses, the thread should use atomic operations with
.Em acquire
and
.Em release
semantics.
.Pp
Most of the atomic operations on memory have three variants.
The first variant performs the operation without imposing any ordering
constraints on memory accesses to other locations.
The second variant has acquire semantics, and the third variant has release
semantics.
In effect, operations with acquire and release semantics establish one-way
barriers to reordering.
.Pp
When an atomic operation has acquire semantics, the effects of the operation
must have completed before any subsequent load or store (by program order) is
performed.
Conversely, acquire semantics do not require that prior loads or stores have
completed before the atomic operation is performed.
To denote acquire semantics, the suffix
.Dq Li _acq
is inserted into the function name immediately prior to the
.Dq Li _ Ns Aq Fa type
suffix.
For example, to subtract two integers ensuring that subsequent loads and
stores happen after the subtraction is performed, use
.Fn atomic_subtract_acq_int .
.Pp
When an atomic operation has release semantics, the effects of all prior
loads or stores (by program order) must have completed before the operation
is performed.
Conversely, release semantics do not require that the effects of the
atomic operation must have completed before any subsequent load or store is
performed.
To denote release semantics, the suffix
.Dq Li _rel
is inserted into the function name immediately prior to the
.Dq Li _ Ns Aq Fa type
suffix.
For example, to add two long integers ensuring that all prior loads and
stores happen before the addition, use
.Fn atomic_add_rel_long .
.Pp
The one-way barriers provided by acquire and release operations allow the
implementations of common synchronization primitives to express their
ordering requirements without also imposing unnecessary ordering.
For example, for a critical section guarded by a mutex, an acquire operation
when the mutex is locked and a release operation when the mutex is unlocked
will prevent any loads or stores from moving outside of the critical
section.
However, they will not prevent the compiler or processor from moving loads
or stores into the critical section, which does not violate the semantics of
a mutex.
.Ss Multiple Processors
In multiprocessor systems, the atomicity of the atomic operations on memory
depends on support for cache coherence in the underlying architecture.
In general, cache coherence on the default memory type,
.Dv VM_MEMATTR_DEFAULT ,
is guaranteed by all architectures that are supported by
.Fx .
For example, cache coherence is guaranteed on write-back memory by the
.Tn amd64
and
.Tn i386
architectures.
However, on some architectures, cache coherence might not be enabled on all
memory types.
To determine if cache coherence is enabled for a non-default memory type,
consult the architecture's documentation.
.Ss Semantics
This section describes the semantics of each operation using a C-like notation.
.Bl -hang
.It Fn atomic_add p v
.Bd -literal -compact
*p += v;
.Ed
.It Fn atomic_clear p v
.Bd -literal -compact
*p &= ~v;
.Ed
.It Fn atomic_cmpset dst old new
.Bd -literal -compact
if (*dst == old) {
	*dst = new;
	return (1);
} else
	return (0);
.Ed
.El
.Pp
The
.Fn atomic_cmpset
functions are not implemented for the types
.Dq Li char ,
.Dq Li short ,
.Dq Li 8 ,
and
.Dq Li 16 .
.Bl -hang
.It Fn atomic_fetchadd p v
.Bd -literal -compact
tmp = *p;
*p += v;
return (tmp);
.Ed
.El
.Pp
The
.Fn atomic_fetchadd
functions are only implemented for the types
.Dq Li int ,
.Dq Li long
and
.Dq Li 32
and do not have any variants with memory barriers at this time.
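.Pp
As a rough userland illustration of the
.Fn atomic_cmpset
semantics described above, the following sketch builds a
.Dq raise to at least v
update out of a compare-and-set retry loop.
It is not the kernel implementation: C11
.In stdatomic.h
is assumed as a stand-in for
.In machine/atomic.h ,
the default sequentially consistent ordering is used rather than the
acquire or release variants, and the names
.Fn sketch_cmpset_int
and
.Fn sketch_fetch_max
are hypothetical.
.Bd -literal
```c
#include <stdatomic.h>

/*
 * Userland stand-in (an assumption, not the kernel code): models the
 * documented atomic_cmpset semantics with C11 atomics.
 * Returns 1 if *dst equalled old and was replaced by newval, else 0.
 */
static int
sketch_cmpset_int(atomic_uint *dst, unsigned int old, unsigned int newval)
{
	return (atomic_compare_exchange_strong(dst, &old, newval) ? 1 : 0);
}

/*
 * Hypothetical helper: raise *p to at least v.
 * The loop retries when another thread changes *p between the read
 * and the compare-and-set.
 * Returns the value observed before the update (or before giving up).
 */
static unsigned int
sketch_fetch_max(atomic_uint *p, unsigned int v)
{
	unsigned int cur;

	do {
		cur = atomic_load(p);
		if (cur >= v)
			break;		/* already large enough */
	} while (!sketch_cmpset_int(p, cur, v));
	return (cur);
}
```
.Ed
.Pp
The same read, compute, compare-and-set pattern applies to any update that
cannot be expressed with a single operation such as
.Fn atomic_add
or
.Fn atomic_fetchadd .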
.Bl -hang
.It Fn atomic_load p
.Bd -literal -compact
return (*p);
.Ed
.El
.Pp
The
.Fn atomic_load
functions are only provided with acquire memory barriers.
.Bl -hang
.It Fn atomic_readandclear p
.Bd -literal -compact
tmp = *p;
*p = 0;
return (tmp);
.Ed
.El
.Pp
The
.Fn atomic_readandclear
functions are not implemented for the types
.Dq Li char ,
.Dq Li short ,
.Dq Li ptr ,
.Dq Li 8 ,
and
.Dq Li 16
and do not have any variants with memory barriers at this time.
.Bl -hang
.It Fn atomic_set p v
.Bd -literal -compact
*p |= v;
.Ed
.It Fn atomic_subtract p v
.Bd -literal -compact
*p -= v;
.Ed
.It Fn atomic_store p v
.Bd -literal -compact
*p = v;
.Ed
.El
.Pp
The
.Fn atomic_store
functions are only provided with release memory barriers.
.Bl -hang
.It Fn atomic_swap p v
.Bd -literal -compact
tmp = *p;
*p = v;
return (tmp);
.Ed
.El
.Pp
The
.Fn atomic_swap
functions are not implemented for the types
.Dq Li char ,
.Dq Li short ,
.Dq Li ptr ,
.Dq Li 8 ,
and
.Dq Li 16
and do not have any variants with memory barriers at this time.
.Bl -hang
.It Fn atomic_testandclear p v
.Bd -literal -compact
bit = 1 << (v % (sizeof(*p) * NBBY));
tmp = (*p & bit) != 0;
*p &= ~bit;
return (tmp);
.Ed
.El
.Bl -hang
.It Fn atomic_testandset p v
.Bd -literal -compact
bit = 1 << (v % (sizeof(*p) * NBBY));
tmp = (*p & bit) != 0;
*p |= bit;
return (tmp);
.Ed
.El
.Pp
The
.Fn atomic_testandset
and
.Fn atomic_testandclear
functions are only implemented for the types
.Dq Li int ,
.Dq Li long
and
.Dq Li 32
and do not have any variants with memory barriers at this time.
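.Pp
The bit arithmetic in the
.Fn atomic_testandset
and
.Fn atomic_testandclear
notation above can be tried out in userland with the sketch below.
It is only an approximation under stated assumptions: C11
.In stdatomic.h
fetch-or and fetch-and operations stand in for the kernel primitives,
.Dv CHAR_BIT
stands in for
.Dv NBBY ,
and the function names
.Fn sketch_testandset_int
and
.Fn sketch_testandclear_int
are hypothetical.
.Bd -literal
```c
#include <limits.h>
#include <stdatomic.h>

/*
 * Userland stand-in (an assumption, not the kernel code) for the
 * documented atomic_testandset semantics on an unsigned int:
 * set bit (v mod width) and return whether it was already set.
 */
static int
sketch_testandset_int(atomic_uint *p, unsigned int v)
{
	unsigned int bit = 1u << (v % (sizeof(unsigned int) * CHAR_BIT));

	/* atomic_fetch_or returns the value held before the OR. */
	return ((atomic_fetch_or(p, bit) & bit) != 0);
}

/*
 * Counterpart for atomic_testandclear: clear bit (v mod width) and
 * return whether it was set beforehand.
 */
static int
sketch_testandclear_int(atomic_uint *p, unsigned int v)
{
	unsigned int bit = 1u << (v % (sizeof(unsigned int) * CHAR_BIT));

	/* atomic_fetch_and returns the value held before the AND. */
	return ((atomic_fetch_and(p, ~bit) & bit) != 0);
}
```
.Ed
.Pp
A caller could use the return value to claim a slot in a bitmap: a result of
0 from the test-and-set means that this caller set the bit first.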
.Pp
The type
.Dq Li 64
is currently not implemented for any of the atomic operations on the
.Tn arm ,
.Tn i386 ,
and
.Tn powerpc
architectures.
.Sh RETURN VALUES
The
.Fn atomic_cmpset
function returns the result of the compare operation.
The
.Fn atomic_fetchadd ,
.Fn atomic_load ,
.Fn atomic_readandclear ,
and
.Fn atomic_swap
functions return the value at the specified address.
The
.Fn atomic_testandset
and
.Fn atomic_testandclear
functions return the result of the test operation.
.Sh EXAMPLES
This example uses the
.Fn atomic_cmpset_acq_ptr
and
.Fn atomic_set_ptr
functions to obtain a sleep mutex and handle recursion.
Since the
.Va mtx_lock
member of a
.Vt "struct mtx"
is a pointer, the
.Dq Li ptr
type is used.
.Bd -literal
/* Try to obtain mtx_lock once. */
#define _obtain_lock(mp, tid)						\\
	atomic_cmpset_acq_ptr(&(mp)->mtx_lock, MTX_UNOWNED, (tid))

/* Get a sleep lock, deal with recursion inline. */
#define _get_sleep_lock(mp, tid, opts, file, line) do {			\\
	uintptr_t _tid = (uintptr_t)(tid);				\\
									\\
	if (!_obtain_lock(mp, _tid)) {					\\
		if (((mp)->mtx_lock & MTX_FLAGMASK) != _tid)		\\
			_mtx_lock_sleep((mp), _tid, (opts), (file), (line));\\
		else {							\\
			atomic_set_ptr(&(mp)->mtx_lock, MTX_RECURSE);	\\
			(mp)->mtx_recurse++;				\\
		}							\\
	}								\\
} while (0)
.Ed
.Sh HISTORY
The
.Fn atomic_add ,
.Fn atomic_clear ,
.Fn atomic_set ,
and
.Fn atomic_subtract
operations were first introduced in
.Fx 3.0 .
This first set only supported the types
.Dq Li char ,
.Dq Li short ,
.Dq Li int ,
and
.Dq Li long .
The
.Fn atomic_cmpset ,
.Fn atomic_load ,
.Fn atomic_readandclear ,
and
.Fn atomic_store
operations were added in
.Fx 5.0 .
The types
.Dq Li 8 ,
.Dq Li 16 ,
.Dq Li 32 ,
.Dq Li 64 ,
and
.Dq Li ptr
and all of the acquire and release variants
were added in
.Fx 5.0
as well.
The
.Fn atomic_fetchadd
operations were added in
.Fx 6.0 .
The
.Fn atomic_swap
and
.Fn atomic_testandset
operations were added in
.Fx 10.0 .
The
.Fn atomic_testandclear
operation was added in
.Fx 11.0 .