Lines Matching +full:cost +full:- +full:effective
3 What is RCU? -- "Read, Copy, Update"
24 | 1. Unraveling RCU Mysteries: Fundamentals https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries
25 | 2. Unraveling RCU Mysteries: Additional Use Cases https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries-additional-use-cases
31 during the 2.5 development effort that is optimized for read-mostly
32 situations. Although RCU is actually quite simple, making effective use
50 :ref:`6. ANALOGY WITH READER-WRITER LOCKING <6_whatisRCU>`
70 everything, feel free to read the whole thing -- but if you are really
72 never need this document anyway. ;-)
77 ----------------
106 b. Wait for all previous readers to complete their RCU read-side
115 use much lighter-weight synchronization, in some cases, absolutely no
116 synchronization at all. In contrast, in more conventional lock-based
117 schemes, readers must use heavy-weight synchronization in order to
119 This is because lock-based updaters typically update data items in place,
120 and must therefore exclude readers. In contrast, RCU-based updaters
126 and communications cache misses that are so expensive on present-day
129 In the three-step procedure shown above, the updater is performing both
132 in the Linux kernel's directory-entry cache (dcache). Even if the same
136 but RCU provides implicit low-overhead communication between readers
146 ---------------------------
169 reclaimer that the reader is entering an RCU read-side critical
170 section. It is illegal to block while in an RCU read-side
172 can preempt RCU read-side critical sections. Any RCU-protected
173 data structure accessed during an RCU read-side critical section
176 with RCU to maintain longer-term references to data structures.
179 or interrupts also enters an RCU read-side critical section.
180 Acquiring a spinlock also enters an RCU read-side critical
183 Sleeplocks do *not* enter RCU read-side critical sections.
190 reclaimer that the reader is exiting an RCU read-side critical
192 or interrupts also exits an RCU read-side critical section.
193 Releasing a spinlock also exits an RCU read-side critical section.
195 Note that RCU read-side critical sections may be nested and/or
204 all pre-existing RCU read-side critical sections on all CPUs
206 necessarily wait for any subsequent RCU read-side critical
211 ----------------- ------------------------- ---------------
220 read-side critical sections to complete, not necessarily for
224 **immediately** after the last pre-existing RCU read-side critical
232 to be useful in all but the most read-intensive situations,
238 argument which are invoked after all ongoing RCU read-side
241 or where update-side performance is critically important.
248 of denial-of-service attacks. Code using call_rcu() should limit
262 RCU-protected pointer, in order to safely communicate the change
265 but it does provide any compiler directives and memory-barrier
267 Its ordering properties are that of a store-release operation,
276 the _rcu list-manipulation primitives such as list_add_rcu().
286 an RCU-protected pointer, which returns a value that may
290 executes any needed memory-barrier instructions for a given
292 within rcu_dereference() -- on other CPUs, it compiles to a
300 RCU-protected pointer to a local variable, then dereferences
304 return p->data;
309 return rcu_dereference(head.next)->data;
312 RCU-protected structure, using the local variable is of
319 only within the enclosing RCU read-side critical section [1]_.
325 x = p->address; /* BUG!!! */
327 y = p->data; /* BUG!!! */
330 Holding a reference from one RCU read-side critical section
332 one lock-based critical section to another! Similarly,
342 typically used indirectly, via the _rcu list-manipulation
346 of an RCU read-side critical section as long as the usage is
347 protected by locks acquired by the update-side code. This variant
359 update-side code as well as by RCU readers, then an additional
363 invoked outside of an RCU read-side critical section and without
372 +--------+
373 +---------------------->| reader |---------+
374 | +--------+ |
380 +---------+ | |
381 | updater |<----------------+ |
382 +---------+ V
383 | +-----------+
384 +----------------------------------->| reclaimer |
385 +-----------+
397 spatial changes via stores to and loads from the RCU-protected pointer in
425 to remote denial-of-service attacks.
427 c. RCU applied to scheduler and interrupt/NMI-handler tasks.
430 for specialized uses, but are relatively uncommon. The SRCU, RCU-Tasks,
431 RCU-Tasks-Rude, and RCU-Tasks-Trace have similar relationships among
437 -----------------------------------------------
440 global pointer to a dynamically allocated structure. More-typical
441 uses of RCU may be found in listRCU.rst and NMI-RCU.rst.
475 new_fp->a = new_a;
495 retval = rcu_dereference(gbl_foo)->a;
502 - Use rcu_read_lock() and rcu_read_unlock() to guard RCU
503 read-side critical sections.
505 - Within an RCU read-side critical section, use rcu_dereference()
506 to dereference RCU-protected pointers.
508 - Use some solid design (such as locks or semaphores) to
511 - Use rcu_assign_pointer() to update an RCU-protected pointer.
517 - Use synchronize_rcu() **after** removing a data element from an
518 RCU-protected data structure, but **before** reclaiming/freeing
520 RCU read-side critical sections that might be referencing that
524 And again, more-typical uses of RCU may be found in listRCU.rst
525 and NMI-RCU.rst.
530 --------------------------------------------
534 long -- there might be other high-priority work to be done.
577 new_fp->a = new_a;
580 call_rcu(&old_fp->rcu, foo_reclaim);
589 foo_cleanup(fp->a);
595 struct, the type of the struct, and the pointed-to field within the
607 - Use call_rcu() **after** removing a data element from an
608 RCU-protected data structure in order to register a callback
610 read-side critical sections that might be referencing that
619 If the occasional sleep is permitted, the single-argument form may
625 synchronize_rcu() in response to memory-allocation failure.
632 ------------------------------------------------
636 production-quality implementations in the Linux kernel. This section
639 resembles "classic" RCU. Both are way too simple for real-world use,
642 production-quality implementation, and see:
654 familiar locking primitives. Its overhead makes it a non-starter for
655 real-life use, as does its lack of scalability. It is also unsuitable
657 one read-side critical section to another. It also assumes recursive
658 reader-writer locks: If you try this with non-recursive locks, and
701 The rcu_read_lock() and rcu_read_unlock() primitive read-acquire
702 and release a global reader-writer lock. The synchronize_rcu()
703 primitive write-acquires this same lock, then releases it. This means
704 that once synchronize_rcu() exits, all RCU read-side critical sections
706 to have completed -- there is no way that synchronize_rcu() would have
707 been able to write-acquire the lock otherwise. The smp_mb__after_spinlock()
709 the "Memory-Barrier Guarantees" listed in:
713 It is possible to nest rcu_read_lock(), since reader-writer locks may
724 occur when using this algorithm in a real-world Linux
751 This is the great strength of classic RCU in a non-preemptive kernel:
752 read-side overhead is precisely zero, at least on non-Alpha CPUs.
765 Remember that it is illegal to block while in an RCU read-side critical
767 that it must have completed all preceding RCU read-side critical sections.
769 RCU read-side critical sections will have completed.
773 that there are no RCU read-side critical sections holding a reference
779 Give an example where Classic RCU's read-side
787 If it is illegal to block in an RCU read-side
795 6. ANALOGY WITH READER-WRITER LOCKING
796 --------------------------------------
799 RCU is analogous to reader-writer locking. The following unified
800 diff shows how closely related RCU and reader-writer locking can be.
803 @@ -5,5 +5,5 @@ struct el {
807 -rwlock_t listmutex;
811 @@ -13,15 +14,15 @@
815 - read_lock(&listmutex);
816 - list_for_each_entry(p, head, lp) {
819 if (p->key == key) {
820 *result = p->data;
821 - read_unlock(&listmutex);
826 - read_unlock(&listmutex);
831 @@ -29,15 +30,16 @@
835 - write_lock(&listmutex);
838 if (p->key == key) {
839 - list_del(&p->list);
840 - write_unlock(&listmutex);
841 + list_del_rcu(&p->list);
848 - write_unlock(&listmutex);
853 Or, for those who prefer a side-by-side listing::
874 8 if (p->key == key) { 8 if (p->key == key) {
875 9 *result = p->data; 9 *result = p->data;
892 7 if (p->key == key) { 7 if (p->key == key) {
893 8 list_del(&p->list); 8 list_del_rcu(&p->list);
904 Either way, the differences are quite small. Read-side locking moves
905 to rcu_read_lock() and rcu_read_unlock, update-side locking moves from
906 a reader-writer lock to a simple spinlock, and a synchronize_rcu()
909 However, there is one potential catch: the read-side and update-side
916 delete() can now block. If this is a problem, there is a callback-based
923 -----------------------------------
925 The reader-writer analogy (illustrated by the previous section) is not
927 considers RCU an effective reference count on everything which is
931 values from changing, but does prevent changes to type -- particularly the
933 re-allocated for some other purpose. Once a type-safe reference to the
936 but with RCU the typical approach is to perform reads with SMP-aware
938 read-modify-write operations, and to provide the necessary ordering.
946 though a reference-count on that object has been temporarily increased.
956 - Copying out data that is guaranteed to be stable by the object's type.
957 - Using kref_get_unless_zero() or similar to get a longer-term
959 - Acquiring a spinlock in the object, and checking if the object still
975 reference-free spinlock acquisition completely unsafe. Therefore, when
988 may also use locking, including cache-friendly sequence locking.)
990 With traditional reference counting -- such as that implemented by the
991 kref library in Linux -- there is typically code that runs when the last
1000 To see how to choose between these two analogies -- of RCU as a
1001 reader-writer lock and RCU as a reference counting system -- it is useful
1002 to reflect on the scale of the thing being protected. The reader-writer
1003 lock analogy looks at larger multi-part objects such as a linked list
1005 to, and removed from, the list. The reference-count analogy looks at
1012 -------------------------
1014 The RCU APIs are documented in docbook-format header comments in the
1015 Linux-kernel source code, but it helps to have a full list of the
1107 RCU-Tasks::
1115 RCU-Tasks-Rude::
1123 RCU-Tasks-Trace::
1148 All: lockdep-checked RCU utility APIs::
1153 All: Unchecked RCU-protected pointer access::
1157 All: Unchecked RCU-protected pointer access with dereferencing prohibited::
1171 example, ftrace or BPF? If so, you need RCU-tasks,
1172 RCU-tasks-rude, and/or RCU-tasks-trace.
1174 c. What about the -rt patchset? If readers would need to block in
1175 an non-rt kernel, you need SRCU. If readers would block when
1176 acquiring spinlocks in a -rt kernel, but not in a non-rt kernel,
1177 SRCU is not necessary. (The -rt patchset turns spinlocks into
1184 If so, RCU-sched readers are the only choice that will work
1190 is your code subject to network-based denial-of-service attacks?
1195 f. Is your workload too update-intensive for normal use of
1200 g. Do you need read-side critical sections that are respected even
1202 from user-mode execution, or on an offlined CPU? If so, SRCU
1214 ----------------------------
1218 occur when using this algorithm in a real-world Linux
1219 kernel? [Referring to the lock-based "toy" RCU
1229 2. CPU 1 enters synchronize_rcu(), write-acquiring
1250 consider task A in an RCU read-side critical section
1251 (thus read-holding rcu_gp_mutex), task B blocked
1252 attempting to write-acquire rcu_gp_mutex, and
1254 read_acquire rcu_gp_mutex. Task A's RCU read-side
1258 Realtime RCU implementations therefore use a counter-based
1259 approach where tasks in RCU read-side critical sections
1265 Give an example where Classic RCU's read-side
1269 Imagine a single-CPU system with a non-CONFIG_PREEMPTION
1270 kernel where a routing table is used by process-context
1271 code, but can be updated by irq-context code (for example,
1273 this would be to have the process-context code disable
1275 RCU allows such interrupt-disabling to be dispensed with.
1276 Thus, without RCU, you pay the cost of disabling interrupts,
1280 case is negative with respect to the single-CPU
1281 interrupt-disabling approach. Others might argue that
1283 the positive overhead of the interrupt-disabling scheme
1284 with the zero-overhead RCU scheme does not constitute
1289 a synchronization primitive is a bit unexpected. ;-)
1294 If it is illegal to block in an RCU read-side
1301 read-side critical sections. It also permits
1302 spinlocks blocking while in RCU read-side critical
1313 a computer-operated cattle prod might arouse serious
1322 My thanks to the people who helped make this human-readable, including