Lines Matching +full:a +full:- +full:side
3 What is RCU? -- "Read, Copy, Update"
21 …ries: Fundamentals https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries
22 …Cases https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries-additional-use-cases
27 RCU is a synchronization mechanism that was added to the Linux kernel
28 during the 2.5 development effort that is optimized for read-mostly
47 :ref:`6. ANALOGY WITH READER-WRITER LOCKING <6_whatisRCU>`
55 People who prefer starting with a conceptual overview should focus on
67 everything, feel free to read the whole thing -- but if you are really
69 never need this document anyway. ;-)
74 ----------------
78 within a data structure (possibly by replacing them with references to
82 either the old or the new version of the data structure rather than a
92 completed, either by blocking until they finish or by registering a
95 after the removal phase will be unable to gain a reference to the removed
100 a. Remove pointers to a data structure, so that subsequent
101 readers cannot gain a reference to it.
103 b. Wait for all previous readers to complete their RCU read-side
112 use much lighter-weight synchronization, in some cases, absolutely no
113 synchronization at all. In contrast, in more conventional lock-based
114 schemes, readers must use heavy-weight synchronization in order to
116 This is because lock-based updaters typically update data items in place,
117 and must therefore exclude readers. In contrast, RCU-based updaters
120 and replacement of data items in a linked structure without disrupting
123 and communications cache misses that are so expensive on present-day
126 In the three-step procedure shown above, the updater is performing both
129 in the Linux kernel's directory-entry cache (dcache). Even if the same
130 thread performs both the update step (step (a) above) and the reclamation
133 but RCU provides implicit low-overhead communication between readers
136 So how the heck can a reclaimer tell when a reader is done, given
143 ---------------------------
147 a. rcu_read_lock()
165 This temporal primitive is used by a reader to inform the
166 reclaimer that the reader is entering an RCU read-side critical
167 section. It is illegal to block while in an RCU read-side
169 can preempt RCU read-side critical sections. Any RCU-protected
170 data structure accessed during an RCU read-side critical section
173 with RCU to maintain longer-term references to data structures.
176 or interrupts also enters an RCU read-side critical section.
177 Acquiring a spinlock also enters an RCU read-side critical
180 Sleeplocks do *not* enter RCU read-side critical sections.
186 This temporal primitives is used by a reader to inform the
187 reclaimer that the reader is exiting an RCU read-side critical
189 or interrupts also exits an RCU read-side critical section.
190 Releasing a spinlock also exits an RCU read-side critical section.
192 Note that RCU read-side critical sections may be nested and/or
201 all pre-existing RCU read-side critical sections on all CPUs
203 necessarily wait for any subsequent RCU read-side critical
208 ----------------- ------------------------- ---------------
217 read-side critical sections to complete, not necessarily for
221 **immediately** after the last pre-existing RCU read-side critical
229 to be useful in all but the most read-intensive situations,
233 synchronize_rcu(), and is described in more detail in a later
234 section. Instead of blocking, it registers a function and
235 argument which are invoked after all ongoing RCU read-side
238 or where update-side performance is critically important.
245 of denial-of-service attacks. Code using call_rcu() should limit
253 Yes, rcu_assign_pointer() **is** implemented as a macro, though
254 it would be cool to be able to declare a function in this manner.
258 The updater uses this spatial macro to assign a new value to an
259 RCU-protected pointer, in order to safely communicate the change
260 in value from the updater to the reader. This is a spatial (as
262 but it does provide any compiler directives and memory-barrier
263 instructions required for a given compile or CPU architecture.
264 Its ordering properties are that of a store-release operation,
271 a given structure becomes accessible to other CPUs. That said,
273 the _rcu list-manipulation primitives such as list_add_rcu().
280 as a macro.
283 an RCU-protected pointer, which returns a value that may
287 executes any needed memory-barrier instructions for a given
289 within rcu_dereference() -- on other CPUs, it compiles to a
297 RCU-protected pointer to a local variable, then dereferences
301 return p->data;
306 return rcu_dereference(head.next)->data;
309 RCU-protected structure, using the local variable is of
316 only within the enclosing RCU read-side critical section [1]_.
322 x = p->address; /* BUG!!! */
324 y = p->data; /* BUG!!! */
327 Holding a reference from one RCU read-side critical section
328 to another is just as illegal as holding a reference from
329 one lock-based critical section to another! Similarly,
330 using a reference outside of the critical section in which
336 RCU, in particular, flagging a pointer that is subject to changing
339 typically used indirectly, via the _rcu list-manipulation
343 of an RCU read-side critical section as long as the usage is
344 protected by locks acquired by the update-side code. This variant
350 a lockdep expression to indicate which locks must be acquired
352 a lockdep splat is emitted. See Design/Requirements/Requirements.rst
356 update-side code as well as by RCU readers, then an additional
360 invoked outside of an RCU read-side critical section and without
369 +--------+
370 +---------------------->| reader |---------+
371 | +--------+ |
377 +---------+ | |
378 | updater |<----------------+ |
379 +---------+ V
380 | +-----------+
381 +----------------------------------->| reclaimer |
382 +-----------+
394 spatial changes via stores to and loads from the RCU-protected pointer in
398 above shows the most common one. On the updater side, the rcu_assign_pointer(),
400 flavors. However for protection (on the reader side), the primitives used vary
403 a. rcu_read_lock() / rcu_read_unlock()
419 a. RCU applied to normal data structures.
422 to remote denial-of-service attacks.
424 c. RCU applied to scheduler and interrupt/NMI-handler tasks.
426 Again, most uses will be of (a). The (b) and (c) cases are important
427 for specialized uses, but are relatively uncommon. The SRCU, RCU-Tasks,
428 RCU-Tasks-Rude, and RCU-Tasks-Trace have similar relationships among
434 -----------------------------------------------
436 This section shows a simple use of the core RCU API to protect a
437 global pointer to a dynamically allocated structure. More-typical
438 uses of RCU may be found in listRCU.rst and NMI-RCU.rst.
442 int a;
451 * Create a new struct foo that is the same as the one currently
452 * pointed to by gbl_foo, except that field "a" is replaced
454 * frees up the old structure after a grace period.
472 new_fp->a = new_a;
480 * Return the value of field "a" of the current gbl_foo
492 retval = rcu_dereference(gbl_foo)->a;
499 - Use rcu_read_lock() and rcu_read_unlock() to guard RCU
500 read-side critical sections.
502 - Within an RCU read-side critical section, use rcu_dereference()
503 to dereference RCU-protected pointers.
505 - Use some solid design (such as locks or semaphores) to
508 - Use rcu_assign_pointer() to update an RCU-protected pointer.
514 - Use synchronize_rcu() **after** removing a data element from an
515 RCU-protected data structure, but **before** reclaiming/freeing
517 RCU read-side critical sections that might be referencing that
521 And again, more-typical uses of RCU may be found in listRCU.rst
522 and NMI-RCU.rst.
527 --------------------------------------------
529 In the example above, foo_update_a() blocks until a grace period elapses.
531 long -- there might be other high-priority work to be done.
538 This function invokes func(head) after a grace period has elapsed.
544 int a;
553 * Create a new struct foo that is the same as the one currently
554 * pointed to by gbl_foo, except that field "a" is replaced
556 * frees up the old structure after a grace period.
574 new_fp->a = new_a;
577 call_rcu(&old_fp->rcu, foo_reclaim);
586 foo_cleanup(fp->a);
591 The container_of() primitive is a macro that, given a pointer into a
592 struct, the type of the struct, and the pointed-to field within the
593 struct, returns a pointer to the beginning of the struct.
604 - Use call_rcu() **after** removing a data element from an
605 RCU-protected data structure in order to register a callback
607 read-side critical sections that might be referencing that
616 If the occasional sleep is permitted, the single-argument form may
622 synchronize_rcu() in response to memory-allocation failure.
629 ------------------------------------------------
632 implementations that are a good first step towards understanding the
633 production-quality implementations in the Linux kernel. This section
636 resembles "classic" RCU. Both are way too simple for real-world use,
638 in getting a feel for how RCU works. See kernel/rcu/update.c for a
639 production-quality implementation, and see:
644 and OLS'02 papers are a good introduction, and the dissertation provides
648 5A. "TOY" IMPLEMENTATION #1: LOCKING
650 This section presents a "toy" RCU implementation that is based on
651 familiar locking primitives. Its overhead makes it a non-starter for
652 real-life use, as does its lack of scalability. It is also unsuitable
654 one read-side critical section to another. It also assumes recursive
655 reader-writer locks: If you try this with non-recursive locks, and
659 a good starting point.
698 The rcu_read_lock() and rcu_read_unlock() primitive read-acquire
699 and release a global reader-writer lock. The synchronize_rcu()
700 primitive write-acquires this same lock, then releases it. This means
701 that once synchronize_rcu() exits, all RCU read-side critical sections
703 to have completed -- there is no way that synchronize_rcu() would have
704 been able to write-acquire the lock otherwise. The smp_mb__after_spinlock()
705 promotes synchronize_rcu() to a full memory barrier in compliance with
706 the "Memory-Barrier Guarantees" listed in:
710 It is possible to nest rcu_read_lock(), since reader-writer locks may
713 that the only thing that can block rcu_read_lock() is a synchronize_rcu().
720 Why is this argument naive? How could a deadlock
721 occur when using this algorithm in a real-world Linux
728 This section presents a "toy" RCU implementation that is based on
748 This is the great strength of classic RCU in a non-preemptive kernel:
749 read-side overhead is precisely zero, at least on non-Alpha CPUs.
751 participate in a deadlock cycle!
755 in terms of the sched_setaffinity() primitive. Of course, a somewhat less
762 Remember that it is illegal to block while in an RCU read-side critical
763 section. Therefore, if a given CPU executes a context switch, we know
764 that it must have completed all preceding RCU read-side critical sections.
765 Once **all** CPUs have executed a context switch, then **all** preceding
766 RCU read-side critical sections will have completed.
768 So, suppose that we remove a data item from its structure and then invoke
770 that there are no RCU read-side critical sections holding a reference
776 Give an example where Classic RCU's read-side
784 If it is illegal to block in an RCU read-side
792 6. ANALOGY WITH READER-WRITER LOCKING
793 --------------------------------------
795 Although RCU can be used in many different ways, a very common use of
796 RCU is analogous to reader-writer locking. The following unified
797 diff shows how closely related RCU and reader-writer locking can be.
800 @@ -5,5 +5,5 @@ struct el {
804 -rwlock_t listmutex;
808 @@ -13,15 +14,15 @@
812 - read_lock(&listmutex);
813 - list_for_each_entry(p, head, lp) {
816 if (p->key == key) {
817 *result = p->data;
818 - read_unlock(&listmutex);
823 - read_unlock(&listmutex);
828 @@ -29,15 +30,16 @@
832 - write_lock(&listmutex);
835 if (p->key == key) {
836 - list_del(&p->list);
837 - write_unlock(&listmutex);
838 + list_del_rcu(&p->list);
845 - write_unlock(&listmutex);
850 Or, for those who prefer a side-by-side listing::
871 8 if (p->key == key) { 8 if (p->key == key) {
872 9 *result = p->data; 9 *result = p->data;
889 7 if (p->key == key) { 7 if (p->key == key) {
890 8 list_del(&p->list); 8 list_del_rcu(&p->list);
901 Either way, the differences are quite small. Read-side locking moves
902 to rcu_read_lock() and rcu_read_unlock, update-side locking moves from
903 a reader-writer lock to a simple spinlock, and a synchronize_rcu()
906 However, there is one potential catch: the read-side and update-side
908 not be a problem, but it is necessary to check carefully regardless.
910 a single atomic update, converting to RCU will require special care.
913 delete() can now block. If this is a problem, there is a callback-based
920 -----------------------------------
922 The reader-writer analogy (illustrated by the previous section) is not
927 A reference count typically does not prevent the referenced object's
928 values from changing, but does prevent changes to type -- particularly the
930 re-allocated for some other purpose. Once a type-safe reference to the
932 access to the data in the object. This could involve taking a spinlock,
933 but with RCU the typical approach is to perform reads with SMP-aware
935 read-modify-write operations, and to provide the necessary ordering.
936 RCU provides a number of support functions that embed the required
940 A more focused view of the reference counting behavior is that,
942 rcu_dereference() on a pointer marked as ``__rcu`` can be treated as
943 though a reference-count on that object has been temporarily increased.
953 - Copying out data that is guaranteed to be stable by the object's type.
954 - Using kref_get_unless_zero() or similar to get a longer-term
956 - Acquiring a spinlock in the object, and checking if the object still
959 The understanding that RCU provides a reference that only prevents a
960 change of type is particularly visible with objects allocated from a
961 slab cache marked ``SLAB_TYPESAFE_BY_RCU``. RCU operations may yield a
962 reference to an object from such a cache that has been concurrently freed
963 and the memory reallocated to a completely different object, though of
966 one expected, but it will be one where it is safe to take a reference
967 (and then potentially acquiring a spinlock), allowing subsequent code
970 unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
972 reference-free spinlock acquisition completely unsafe. Therefore, when
973 using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.
974 (Those willing to initialize their locks in a kmem_cache constructor
975 may also use locking, including cache-friendly sequence locking.)
977 With traditional reference counting -- such as that implemented by the
978 kref library in Linux -- there is typically code that runs when the last
982 been updated, and then a grace period has passed. Every remaining
983 globally visible pointer to the object must be considered to be a
987 To see how to choose between these two analogies -- of RCU as a
988 reader-writer lock and RCU as a reference counting system -- it is useful
989 to reflect on the scale of the thing being protected. The reader-writer
990 lock analogy looks at larger multi-part objects such as a linked list
992 to, and removed from, the list. The reference-count analogy looks at
994 within whatever whole they are a part of.
999 -------------------------
1001 The RCU APIs are documented in docbook-format header comments in the
1002 Linux-kernel source code, but it helps to have a full list of the
1003 APIs, since there does not appear to be a way to categorize them
1094 RCU-Tasks::
1098 N/A call_rcu_tasks rcu_barrier_tasks
1102 RCU-Tasks-Rude::
1106 N/A N/A
1110 RCU-Tasks-Trace::
1135 All: lockdep-checked RCU utility APIs::
1140 All: Unchecked RCU-protected pointer access::
1144 All: Unchecked RCU-protected pointer access with dereferencing prohibited::
1155 a. Will readers need to block? If so, you need SRCU.
1158 example, ftrace or BPF? If so, you need RCU-tasks,
1159 RCU-tasks-rude, and/or RCU-tasks-trace.
1161 c. What about the -rt patchset? If readers would need to block in
1162 an non-rt kernel, you need SRCU. If readers would block when
1163 acquiring spinlocks in a -rt kernel, but not in a non-rt kernel,
1164 SRCU is not necessary. (The -rt patchset turns spinlocks into
1171 If so, RCU-sched readers are the only choice that will work
1177 is your code subject to network-based denial-of-service attacks?
1182 f. Is your workload too update-intensive for normal use of
1187 g. Do you need read-side critical sections that are respected even
1189 from user-mode execution, or on an offlined CPU? If so, SRCU
1201 ----------------------------
1204 Why is this argument naive? How could a deadlock
1205 occur when using this algorithm in a real-world Linux
1206 kernel? [Referring to the lock-based "toy" RCU
1216 2. CPU 1 enters synchronize_rcu(), write-acquiring
1237 consider task A in an RCU read-side critical section
1238 (thus read-holding rcu_gp_mutex), task B blocked
1239 attempting to write-acquire rcu_gp_mutex, and
1241 read_acquire rcu_gp_mutex. Task A's RCU read-side
1245 Realtime RCU implementations therefore use a counter-based
1246 approach where tasks in RCU read-side critical sections
1252 Give an example where Classic RCU's read-side
1256 Imagine a single-CPU system with a non-CONFIG_PREEMPTION
1257 kernel where a routing table is used by process-context
1258 code, but can be updated by irq-context code (for example,
1260 this would be to have the process-context code disable
1262 RCU allows such interrupt-disabling to be dispensed with.
1267 case is negative with respect to the single-CPU
1268 interrupt-disabling approach. Others might argue that
1270 the positive overhead of the interrupt-disabling scheme
1271 with the zero-overhead RCU scheme does not constitute
1276 a synchronization primitive is a bit unexpected. ;-)
1281 If it is illegal to block in an RCU read-side
1288 read-side critical sections. It also permits
1289 spinlocks blocking while in RCU read-side critical
1298 process we need to boost might well be a human being
1299 who just went out for a pizza or something. And although
1300 a computer-operated cattle prod might arouse serious
1309 My thanks to the people who helped make this human-readable, including