xref: /linux/Documentation/locking/seqlock.rst (revision ba653158f40deccb3f79005bf1d5c6c37d45b247)
1.. SPDX-License-Identifier: GPL-2.0
2
3======================================
4Sequence counters and sequential locks
5======================================
6
7Introduction
8============
9
10Sequence counters are a reader-writer consistency mechanism with
11lockless readers (read-only retry loops), and no writer starvation. They
12are used for data that's rarely written to (e.g. system time), where the
13reader wants a consistent set of information and is willing to retry if
14that information changes.
15
16A data set is consistent when the sequence count at the beginning of the
17read side critical section is even and the same sequence count value is
18read again at the end of the critical section. The data in the set must
19be copied out inside the read side critical section. If the sequence
20count has changed between the start and the end of the critical section,
21the reader must retry.
22
23Writers increment the sequence count at the start and the end of their
24critical section. After starting the critical section the sequence count
25is odd and indicates to the readers that an update is in progress. At
26the end of the write side critical section the sequence count becomes
27even again which lets readers make progress.
28
29A sequence counter write side critical section must never be preempted
30or interrupted by read side sections. Otherwise the reader will spin for
31the entire scheduler tick due to the odd sequence count value and the
32interrupted writer. If that reader belongs to a real-time scheduling
33class, it can spin forever and the kernel will livelock.
34
35This mechanism cannot be used if the protected data contains pointers,
36as the writer can invalidate a pointer that the reader is following.
37
38
39.. _seqcount_t:
40
41Sequence counters (``seqcount_t``)
42==================================
43
44This is the raw counting mechanism, which does not protect against
45multiple writers.  Write side critical sections must thus be serialized
46by an external lock.
47
48If the write serialization primitive is not implicitly disabling
49preemption, preemption must be explicitly disabled before entering the
50write side section. If the read section can be invoked from hardirq or
51softirq contexts, interrupts or bottom halves must also be respectively
52disabled before entering the write section.
53
54If it's desired to automatically handle the sequence counter
55requirements of writer serialization and non-preemptibility, use
56:ref:`seqlock_t` instead.
57
58Initialization::
59
60	/* dynamic */
61	seqcount_t foo_seqcount;
62	seqcount_init(&foo_seqcount);
63
64	/* static */
65	static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);
66
67	/* C99 struct init */
68	struct {
69		.seq   = SEQCNT_ZERO(foo.seq),
70	} foo;
71
72Write path::
73
74	/* Serialized context with disabled preemption */
75
76	write_seqcount_begin(&foo_seqcount);
77
78	/* ... [[write-side critical section]] ... */
79
80	write_seqcount_end(&foo_seqcount);
81
82Read path::
83
84	do {
85		seq = read_seqcount_begin(&foo_seqcount);
86
87		/* ... [[read-side critical section]] ... */
88
89	} while (read_seqcount_retry(&foo_seqcount, seq));
90
91
92.. _seqcount_locktype_t:
93
94Sequence counters with associated locks (``seqcount_LOCKNAME_t``)
95-----------------------------------------------------------------
96
97As discussed at :ref:`seqcount_t`, sequence count write side critical
98sections must be serialized and non-preemptible. This variant of
99sequence counters associate the lock used for writer serialization at
100initialization time, which enables lockdep to validate that the write
101side critical sections are properly serialized.
102
103This lock association is a NOOP if lockdep is disabled and has neither
104storage nor runtime overhead. If lockdep is enabled, the lock pointer is
105stored in struct seqcount and lockdep's "lock is held" assertions are
106injected at the beginning of the write side critical section to validate
107that it is properly protected.
108
109For lock types which do not implicitly disable preemption, preemption
110protection is enforced in the write side function.
111
112The following sequence counters with associated locks are defined:
113
114  - ``seqcount_spinlock_t``
115  - ``seqcount_raw_spinlock_t``
116  - ``seqcount_rwlock_t``
117  - ``seqcount_mutex_t``
118  - ``seqcount_ww_mutex_t``
119
120The sequence counter read and write APIs can take either a plain
121seqcount_t or any of the seqcount_LOCKNAME_t variants above.
122
123Initialization (replace "LOCKNAME" with one of the supported locks)::
124
125	/* dynamic */
126	seqcount_LOCKNAME_t foo_seqcount;
127	seqcount_LOCKNAME_init(&foo_seqcount, &lock);
128
129	/* static */
130	static seqcount_LOCKNAME_t foo_seqcount =
131		SEQCNT_LOCKNAME_ZERO(foo_seqcount, &lock);
132
133	/* C99 struct init */
134	struct {
135		.seq   = SEQCNT_LOCKNAME_ZERO(foo.seq, &lock),
136	} foo;
137
138Write path: same as in :ref:`seqcount_t`, while running from a context
139with the associated write serialization lock acquired.
140
141Read path: same as in :ref:`seqcount_t`.
142
143
144.. _seqcount_latch_t:
145
146Latch sequence counters (``seqcount_latch_t``)
147----------------------------------------------
148
149Latch sequence counters are a multiversion concurrency control mechanism
150where the embedded seqcount_t counter even/odd value is used to switch
151between two copies of protected data. This allows the sequence counter
152read path to safely interrupt its own write side critical section.
153
154Use seqcount_latch_t when the write side sections cannot be protected
155from interruption by readers. This is typically the case when the read
156side can be invoked from NMI handlers.
157
158Check `write_seqcount_latch()` for more information.
159
160
161.. _seqlock_t:
162
163Sequential locks (``seqlock_t``)
164================================
165
166This contains the :ref:`seqcount_t` mechanism earlier discussed, plus an
167embedded spinlock for writer serialization and non-preemptibility.
168
169If the read side section can be invoked from hardirq or softirq context,
170use the write side function variants which disable interrupts or bottom
171halves respectively.
172
173Initialization::
174
175	/* dynamic */
176	seqlock_t foo_seqlock;
177	seqlock_init(&foo_seqlock);
178
179	/* static */
180	static DEFINE_SEQLOCK(foo_seqlock);
181
182	/* C99 struct init */
183	struct {
184		.seql   = __SEQLOCK_UNLOCKED(foo.seql)
185	} foo;
186
187Write path::
188
189	write_seqlock(&foo_seqlock);
190
191	/* ... [[write-side critical section]] ... */
192
193	write_sequnlock(&foo_seqlock);
194
195Read path, three categories:
196
1971. Normal Sequence readers which never block a writer but they must
198   retry if a writer is in progress by detecting change in the sequence
199   number.  Writers do not wait for a sequence reader::
200
201	do {
202		seq = read_seqbegin(&foo_seqlock);
203
204		/* ... [[read-side critical section]] ... */
205
206	} while (read_seqretry(&foo_seqlock, seq));
207
2082. Locking readers which will wait if a writer or another locking reader
209   is in progress. A locking reader in progress will also block a writer
210   from entering its critical section. This read lock is
211   exclusive. Unlike rwlock_t, only one locking reader can acquire it::
212
213	read_seqlock_excl(&foo_seqlock);
214
215	/* ... [[read-side critical section]] ... */
216
217	read_sequnlock_excl(&foo_seqlock);
218
2193. Conditional lockless reader (as in 1), or locking reader (as in 2),
220   according to a passed marker. This is used to avoid lockless readers
221   starvation (too much retry loops) in case of a sharp spike in write
222   activity. First, a lockless read is tried (even marker passed). If
223   that trial fails (odd sequence counter is returned, which is used as
224   the next iteration marker), the lockless read is transformed to a
225   full locking read and no retry loop is necessary::
226
227	/* marker; even initialization */
228	int seq = 0;
229	do {
230		read_seqbegin_or_lock(&foo_seqlock, &seq);
231
232		/* ... [[read-side critical section]] ... */
233
234	} while (need_seqretry(&foo_seqlock, seq));
235	done_seqretry(&foo_seqlock, seq);
236
237
238API documentation
239=================
240
241.. kernel-doc:: include/linux/seqlock.h
242