1.. SPDX-License-Identifier: GPL-2.0 2 3====================================== 4Sequence counters and sequential locks 5====================================== 6 7Introduction 8============ 9 10Sequence counters are a reader-writer consistency mechanism with 11lockless readers (read-only retry loops), and no writer starvation. They 12are used for data that's rarely written to (e.g. system time), where the 13reader wants a consistent set of information and is willing to retry if 14that information changes. 15 16A data set is consistent when the sequence count at the beginning of the 17read side critical section is even and the same sequence count value is 18read again at the end of the critical section. The data in the set must 19be copied out inside the read side critical section. If the sequence 20count has changed between the start and the end of the critical section, 21the reader must retry. 22 23Writers increment the sequence count at the start and the end of their 24critical section. After starting the critical section the sequence count 25is odd and indicates to the readers that an update is in progress. At 26the end of the write side critical section the sequence count becomes 27even again which lets readers make progress. 28 29A sequence counter write side critical section must never be preempted 30or interrupted by read side sections. Otherwise the reader will spin for 31the entire scheduler tick due to the odd sequence count value and the 32interrupted writer. If that reader belongs to a real-time scheduling 33class, it can spin forever and the kernel will livelock. 34 35This mechanism cannot be used if the protected data contains pointers, 36as the writer can invalidate a pointer that the reader is following. 37 38 39.. _seqcount_t: 40 41Sequence counters (``seqcount_t``) 42================================== 43 44This is the raw counting mechanism, which does not protect against 45multiple writers. Write side critical sections must thus be serialized 46by an external lock. 47 48If the write serialization primitive is not implicitly disabling 49preemption, preemption must be explicitly disabled before entering the 50write side section. If the read section can be invoked from hardirq or 51softirq contexts, interrupts or bottom halves must also be respectively 52disabled before entering the write section. 53 54If it's desired to automatically handle the sequence counter 55requirements of writer serialization and non-preemptibility, use 56:ref:`seqlock_t` instead. 57 58Initialization:: 59 60 /* dynamic */ 61 seqcount_t foo_seqcount; 62 seqcount_init(&foo_seqcount); 63 64 /* static */ 65 static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount); 66 67 /* C99 struct init */ 68 struct { 69 .seq = SEQCNT_ZERO(foo.seq), 70 } foo; 71 72Write path:: 73 74 /* Serialized context with disabled preemption */ 75 76 write_seqcount_begin(&foo_seqcount); 77 78 /* ... [[write-side critical section]] ... */ 79 80 write_seqcount_end(&foo_seqcount); 81 82Read path:: 83 84 do { 85 seq = read_seqcount_begin(&foo_seqcount); 86 87 /* ... [[read-side critical section]] ... */ 88 89 } while (read_seqcount_retry(&foo_seqcount, seq)); 90 91 92.. _seqcount_locktype_t: 93 94Sequence counters with associated locks (``seqcount_LOCKNAME_t``) 95----------------------------------------------------------------- 96 97As discussed at :ref:`seqcount_t`, sequence count write side critical 98sections must be serialized and non-preemptible. This variant of 99sequence counters associate the lock used for writer serialization at 100initialization time, which enables lockdep to validate that the write 101side critical sections are properly serialized. 102 103This lock association is a NOOP if lockdep is disabled and has neither 104storage nor runtime overhead. If lockdep is enabled, the lock pointer is 105stored in struct seqcount and lockdep's "lock is held" assertions are 106injected at the beginning of the write side critical section to validate 107that it is properly protected. 108 109For lock types which do not implicitly disable preemption, preemption 110protection is enforced in the write side function. 111 112The following sequence counters with associated locks are defined: 113 114 - ``seqcount_spinlock_t`` 115 - ``seqcount_raw_spinlock_t`` 116 - ``seqcount_rwlock_t`` 117 - ``seqcount_mutex_t`` 118 - ``seqcount_ww_mutex_t`` 119 120The sequence counter read and write APIs can take either a plain 121seqcount_t or any of the seqcount_LOCKNAME_t variants above. 122 123Initialization (replace "LOCKNAME" with one of the supported locks):: 124 125 /* dynamic */ 126 seqcount_LOCKNAME_t foo_seqcount; 127 seqcount_LOCKNAME_init(&foo_seqcount, &lock); 128 129 /* static */ 130 static seqcount_LOCKNAME_t foo_seqcount = 131 SEQCNT_LOCKNAME_ZERO(foo_seqcount, &lock); 132 133 /* C99 struct init */ 134 struct { 135 .seq = SEQCNT_LOCKNAME_ZERO(foo.seq, &lock), 136 } foo; 137 138Write path: same as in :ref:`seqcount_t`, while running from a context 139with the associated write serialization lock acquired. 140 141Read path: same as in :ref:`seqcount_t`. 142 143 144.. _seqcount_latch_t: 145 146Latch sequence counters (``seqcount_latch_t``) 147---------------------------------------------- 148 149Latch sequence counters are a multiversion concurrency control mechanism 150where the embedded seqcount_t counter even/odd value is used to switch 151between two copies of protected data. This allows the sequence counter 152read path to safely interrupt its own write side critical section. 153 154Use seqcount_latch_t when the write side sections cannot be protected 155from interruption by readers. This is typically the case when the read 156side can be invoked from NMI handlers. 157 158Check `write_seqcount_latch()` for more information. 159 160 161.. _seqlock_t: 162 163Sequential locks (``seqlock_t``) 164================================ 165 166This contains the :ref:`seqcount_t` mechanism earlier discussed, plus an 167embedded spinlock for writer serialization and non-preemptibility. 168 169If the read side section can be invoked from hardirq or softirq context, 170use the write side function variants which disable interrupts or bottom 171halves respectively. 172 173Initialization:: 174 175 /* dynamic */ 176 seqlock_t foo_seqlock; 177 seqlock_init(&foo_seqlock); 178 179 /* static */ 180 static DEFINE_SEQLOCK(foo_seqlock); 181 182 /* C99 struct init */ 183 struct { 184 .seql = __SEQLOCK_UNLOCKED(foo.seql) 185 } foo; 186 187Write path:: 188 189 write_seqlock(&foo_seqlock); 190 191 /* ... [[write-side critical section]] ... */ 192 193 write_sequnlock(&foo_seqlock); 194 195Read path, three categories: 196 1971. Normal Sequence readers which never block a writer but they must 198 retry if a writer is in progress by detecting change in the sequence 199 number. Writers do not wait for a sequence reader:: 200 201 do { 202 seq = read_seqbegin(&foo_seqlock); 203 204 /* ... [[read-side critical section]] ... */ 205 206 } while (read_seqretry(&foo_seqlock, seq)); 207 2082. Locking readers which will wait if a writer or another locking reader 209 is in progress. A locking reader in progress will also block a writer 210 from entering its critical section. This read lock is 211 exclusive. Unlike rwlock_t, only one locking reader can acquire it:: 212 213 read_seqlock_excl(&foo_seqlock); 214 215 /* ... [[read-side critical section]] ... */ 216 217 read_sequnlock_excl(&foo_seqlock); 218 2193. Conditional lockless reader (as in 1), or locking reader (as in 2), 220 according to a passed marker. This is used to avoid lockless readers 221 starvation (too much retry loops) in case of a sharp spike in write 222 activity. First, a lockless read is tried (even marker passed). If 223 that trial fails (odd sequence counter is returned, which is used as 224 the next iteration marker), the lockless read is transformed to a 225 full locking read and no retry loop is necessary:: 226 227 /* marker; even initialization */ 228 int seq = 0; 229 do { 230 read_seqbegin_or_lock(&foo_seqlock, &seq); 231 232 /* ... [[read-side critical section]] ... */ 233 234 } while (need_seqretry(&foo_seqlock, seq)); 235 done_seqretry(&foo_seqlock, seq); 236 237 238API documentation 239================= 240 241.. kernel-doc:: include/linux/seqlock.h 242