1.\" SPDX-License-Identifier: BSD-2-Clause 2.\" 3.\" Copyright (c) 2023 The FreeBSD Foundation 4.\" 5.\" This documentation was written by Mark Johnston <markj@FreeBSD.org> 6.\" under sponsorship from the FreeBSD Foundation. 7.\" 8.\" Redistribution and use in source and binary forms, with or without 9.\" modification, are permitted provided that the following conditions 10.\" are met: 11.\" 1. Redistributions of source code must retain the above copyright 12.\" notice, this list of conditions and the following disclaimer. 13.\" 2. Redistributions in binary form must reproduce the above copyright 14.\" notice, this list of conditions and the following disclaimer in the 15.\" documentation and/or other materials provided with the distribution. 16.\" 17.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND 18.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 19.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 20.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE 21.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 22.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 23.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 24.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 25.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 26.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 27.\" SUCH DAMAGE. 28.\" 29.Dd January 17, 2023 30.Dt SMR 9 31.Os 32.Sh NAME 33.Nm smr 34.Nd safe memory reclamation for lock-free data structures 35.Sh SYNOPSIS 36.In sys/smr.h 37.Ft smr_seq_t 38.Fo smr_advance 39.Fa "smr_t smr" 40.Fc 41.Ft smr_t 42.Fo smr_create 43.Fa "const char *name" 44.Fc 45.Ft void 46.Fo smr_destroy 47.Fa "smr_t smr" 48.Fc 49.Ft void 50.Fo smr_enter 51.Fa "smr_t smr" 52.Fc 53.Ft void 54.Fo smr_exit 55.Fa "smr_t smr" 56.Fc 57.Ft bool 58.Fo smr_poll 59.Fa "smr_t smr" 60.Fa "smr_seq_t goal" 61.Fa "bool wait" 62.Fc 63.Ft void 64.Fo smr_synchronize 65.Fa "smr_t smr" 66.Fc 67.Ft bool 68.Fo smr_wait 69.Fa "smr_t smr" 70.Fa "smr_seq_t goal" 71.Fc 72.Sh DESCRIPTION 73Safe Memory Reclamation (SMR) is a facility which enables the implementation of 74memory-safe lock-free data structures. 75In typical usage, read accesses to an SMR-protected data structure, such as a 76hash table or tree, are performed in a 77.Dq read section 78consisting of code bracketed by 79.Fn smr_enter 80and 81.Fn smr_exit 82calls, while mutations of the data structure are serialized by a traditional 83mutex such as 84.Xr mutex 9 . 85In contrast with reader-writer locks such as 86.Xr rwlock 9 , 87.Xr rmlock 9 , 88and 89.Xr sx 9 , 90SMR allows readers and writers to access the data structure concurrently. 91Readers can always enter a read section immediately 92.Po 93.Fn smr_enter 94never blocks 95.Pc , 96so mutations do not introduce read latency. 97Furthermore, 98.Fn smr_enter 99and 100.Fn smr_exit 101operate only on per-CPU data and thus avoid some of the performance problems 102inherent in the implementation of traditional reader-writer mutexes. 103SMR can therefore be a useful building block for data structures which are 104accessed frequently but are only rarely modified. 105.Pp 106Note that any SMR-protected data structure must be implemented carefully such 107that operations behave correctly in the absence of mutual exclusion between 108readers and writers. 109The data structure must be designed to be lock-free; SMR merely facilitates 110the implementation, for example by making it safe to follow dangling pointers 111and by helping avoid the ABA problem. 112.Pp 113When shared accesses to and mutations of a data structure can proceed 114concurrently, writers must take care to ensure that any items removed from the 115structure are not freed and recycled while readers are accessing them in 116parallel. 117This requirement results in a two-phase approach to the removal of items: 118first, the item is unlinked such that all pointers to the item are removed from 119the structure, preventing any new readers from observing the item. 120Then, the writer waits until some mechanism guarantees that no existing readers 121are still accessing the item. 122At that point the memory for that item can be freed and reused safely. 123SMR provides this mechanism: readers may access a lock-free data structure in 124between calls to the 125.Fn smr_enter 126and 127.Fn smr_exit 128functions, which together create a read section, and the 129.Fn smr_advance , 130.Fn smr_poll , 131.Fn smr_wait , 132and 133.Fn smr_synchronize 134functions can be used to wait for threads in read sections to finish. 135All of these functions operate on a 136.Ft smr_t 137state block which holds both per-CPU and global state. 138Readers load global state and modify per-CPU state, while writers must scan all 139per-CPU states to detect active readers. 140SMR is designed to amortize this cost by batching to give acceptable 141performance in write-heavy workloads. 142.Ss Readers 143Threads enter a read section by calling 144.Fn smr_enter . 145Read sections should be short, and many operations are not permitted while in 146a read section. 147Specifically, context switching is disabled, and thus readers may not acquire 148blocking mutexes such as 149.Xr mutex 9 . 150Another consequence of this is that the thread is pinned to the current CPU for 151the duration of the read section. 152Furthermore, read sections may not be nested: it is incorrect to call 153.Fn smr_enter 154with a given 155.Ft smr_t 156state block when already in a read section for that state block. 157.Ss UMA Integration 158To simplify the integration of SMR into consumers, the 159.Xr uma 9 160kernel memory allocator provides some SMR-specified facilities. 161This eliminates a good deal of complexity from the implementation of consumers 162and automatically batches write operations. 163.Pp 164In typical usage, a UMA zone (created with the 165.Dv UMA_ZONE_SMR 166flag or initialized with the 167.Fn uma_zone_set_smr 168function) is coupled with a 169.Ft smr_t 170state block. 171To insert an item into an SMR-protected data structure, memory is allocated 172from the zone using the 173.Fn uma_zalloc_smr 174function. 175Insertions and removals are serialized using traditional mutual exclusion 176and items are freed using the 177.Fn uma_zfree_smr 178function. 179Read-only lookup operations are performed in SMR read sections. 180.Fn uma_zfree_smr 181waits for all active readers which may be accessing the freed item to finish 182their read sections before recycling that item's memory. 183.Pp 184If the zone has an associated per-item destructor, it will be invoked at some 185point when no readers can be accessing a given item. 186The destructor can be used to release additional resources associated with the 187item. 188Note however that there is no guarantee that the destructor will be invoked in 189a bounded time period. 190.Ss Writers 191Consumers are expected to use SMR in conjunction with UMA and thus need only 192make use of the 193.Fn smr_enter 194and 195.Fn smr_exit 196functions, and the SMR helper macros defined in 197.Pa sys/smr_types.h . 198However, an introduction to the write-side interface of SMR can be useful. 199.Pp 200Internally, SMR maintains a global 201.Ql write sequence 202number. 203When entering a read section, 204.Fn smr_enter 205loads a copy of the write sequence and stores it in per-CPU memory, hence 206.Ql observing 207that value. 208To exit a read section, this per-CPU memory is overwritten with an invalid 209value, making the CPU inactive. 210Writers perform two operations: advancing the write sequence number, and 211polling all CPUs to see whether active readers have observed a given sequence 212number. 213These operations are performed by 214.Fn smr_advance 215and 216.Fn smr_poll , 217respectively, which do not require serialization between multiple writers. 218.Pp 219After a writer unlinks an item from a data structure, it increments the write 220sequence number and tags the item with the new value returned by 221.Fn smr_advance . 222Once all CPUs have observed the new value, the writer can use 223.Fn smr_poll 224to deduce that no active readers have access to the unlinked item, and thus the 225item is safe to recycle. 226Because this pair of operations is relatively expensive, it is generally a good 227idea to amortize this cost by accumulating a collection of multiple unlinked 228items and tagging the entire batch with a target write sequence number. 229.Pp 230.Fn smr_poll 231is a non-blocking operation and returns true only if all active readers are 232guaranteed to have observed the target sequence number value. 233.Fn smr_wait 234is a variant of 235.Fn smr_poll 236which waits until all CPUs have observed the target sequence number value. 237.Fn smr_synchronize 238combines 239.Fn smr_advance 240with 241.Fn smr_wait 242to wait for all CPUs to observe a new write sequence number. 243This is an expensive operation and should only be used if polling cannot be 244deferred in some way. 245.Ss Memory Ordering 246The 247.Fn smr_enter 248function has acquire semantics, and the 249.Fn smr_exit 250function has release semantics. 251The 252.Fn smr_advance , 253.Fn smr_poll , 254.Fn smr_wait , 255and 256.Fn smr_synchronize 257functions should not be assumed to have any guarantees with respect to memory 258ordering. 259In practice, some of these functions have stronger ordering semantics than 260is stated here, but this is specific to the implementation and should not be 261relied upon. 262See 263.Xr atomic 9 264for more details. 265.Sh NOTES 266Outside of 267.Fx 268the acronym SMR typically refers to a family of algorithms which enable 269memory-safe read-only access to a data structure concurrent with modifications 270to that data structure. 271Here, SMR refers to a particular algorithm belonging to this family, as well as 272its implementation in 273.Fx . 274See 275.Pa sys/sys/smr.h 276and 277.Pa sys/kern/subr_smr.c 278in the 279.Fx 280source tree for further details on the algorithm and the context. 281.Pp 282The acronym SMR is also used to mean "shingled magnetic recording", a 283technology used to store data on hard disk drives which requires operating 284system support. 285These two uses of the acronym are unrelated. 286.Sh SEE ALSO 287.Xr atomic 9 , 288.Xr locking 9 , 289.Xr uma 9 290.Sh AUTHORS 291The SMR algorithm and its implementation were provided by 292.An Jeff Roberson Aq Mt jeff@FreeBSD.org . 293This manual page was written by 294.An Mark Johnston Aq Mt markj@FreeBSD.org . 295