.. _rcu_barrier:

RCU and Unloadable Modules
==========================

[Originally published in LWN Jan. 14, 2007: http://lwn.net/Articles/217484/]

RCU updaters sometimes use call_rcu() to initiate an asynchronous wait for
a grace period to elapse. This primitive takes a pointer to an rcu_head
struct placed within the RCU-protected data structure and another pointer
to a function that may be invoked later to free that structure. Code to
delete an element p from the linked list from IRQ context might then be
as follows::

	list_del_rcu(p);
	call_rcu(&p->rcu, p_callback);

Since call_rcu() never blocks, this code can safely be used from within
IRQ context. The function p_callback() might be defined as follows::

	static void p_callback(struct rcu_head *rp)
	{
		struct pstruct *p = container_of(rp, struct pstruct, rcu);

		kfree(p);
	}


Unloading Modules That Use call_rcu()
-------------------------------------

But what if the p_callback() function is defined in an unloadable module?

If we unload the module while some RCU callbacks are pending,
the CPUs executing these callbacks are going to be severely
disappointed when they are later invoked, as fancifully depicted at
http://lwn.net/images/ns/kernel/rcu-drop.jpg.

We could try placing a synchronize_rcu() in the module-exit code path,
but this is not sufficient. Although synchronize_rcu() does wait for a
grace period to elapse, it does not wait for the callbacks to complete.

One might be tempted to try several back-to-back synchronize_rcu()
calls, but this is still not guaranteed to work. If there is a very
heavy RCU-callback load, then some of the callbacks might be deferred in
order to allow other processing to proceed. For but one example, such
deferral is required in realtime kernels in order to avoid excessive
scheduling latencies.


rcu_barrier()
-------------

This situation can be handled by the rcu_barrier() primitive. Rather
than waiting for a grace period to elapse, rcu_barrier() waits for all
outstanding RCU callbacks to complete. Please note that rcu_barrier()
does **not** imply synchronize_rcu(); in particular, if there are no RCU
callbacks queued anywhere, rcu_barrier() is within its rights to return
immediately, without waiting for anything, let alone a grace period.

Pseudo-code using rcu_barrier() is as follows:

1. Prevent any new RCU callbacks from being posted.
2. Execute rcu_barrier().
3. Allow the module to be unloaded.

There is also an srcu_barrier() function for SRCU, and you of course
must match the flavor of srcu_barrier() with that of call_srcu().
If your module uses multiple srcu_struct structures, then it must also
use multiple invocations of srcu_barrier() when unloading that module.
For example, if it uses call_rcu(), call_srcu() on srcu_struct_1, and
call_srcu() on srcu_struct_2, then the following three lines of code
will be required when unloading::

  1 rcu_barrier();
  2 srcu_barrier(&srcu_struct_1);
  3 srcu_barrier(&srcu_struct_2);

If latency is of the essence, workqueues could be used to run these
three functions concurrently.

An ancient version of the rcutorture module makes use of rcu_barrier()
in its exit function as follows::

  1 static void
  2 rcu_torture_cleanup(void)
  3 {
  4 	int i;
  5
  6 	fullstop = 1;
  7 	if (shuffler_task != NULL) {
  8 		VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
  9 		kthread_stop(shuffler_task);
 10 	}
 11 	shuffler_task = NULL;
 12
 13 	if (writer_task != NULL) {
 14 		VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
 15 		kthread_stop(writer_task);
 16 	}
 17 	writer_task = NULL;
 18
 19 	if (reader_tasks != NULL) {
 20 		for (i = 0; i < nrealreaders; i++) {
 21 			if (reader_tasks[i] != NULL) {
 22 				VERBOSE_PRINTK_STRING(
 23 					"Stopping rcu_torture_reader task");
 24 				kthread_stop(reader_tasks[i]);
 25 			}
 26 			reader_tasks[i] = NULL;
 27 		}
 28 		kfree(reader_tasks);
 29 		reader_tasks = NULL;
 30 	}
 31 	rcu_torture_current = NULL;
 32
 33 	if (fakewriter_tasks != NULL) {
 34 		for (i = 0; i < nfakewriters; i++) {
 35 			if (fakewriter_tasks[i] != NULL) {
 36 				VERBOSE_PRINTK_STRING(
 37 					"Stopping rcu_torture_fakewriter task");
 38 				kthread_stop(fakewriter_tasks[i]);
 39 			}
 40 			fakewriter_tasks[i] = NULL;
 41 		}
 42 		kfree(fakewriter_tasks);
 43 		fakewriter_tasks = NULL;
 44 	}
 45
 46 	if (stats_task != NULL) {
 47 		VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
 48 		kthread_stop(stats_task);
 49 	}
 50 	stats_task = NULL;
 51
 52 	/* Wait for all RCU callbacks to fire. */
 53 	rcu_barrier();
 54
 55 	rcu_torture_stats_print(); /* -After- the stats thread is stopped! */
 56
 57 	if (cur_ops->cleanup != NULL)
 58 		cur_ops->cleanup();
 59 	if (atomic_read(&n_rcu_torture_error))
 60 		rcu_torture_print_module_parms("End of test: FAILURE");
 61 	else
 62 		rcu_torture_print_module_parms("End of test: SUCCESS");
 63 }

Line 6 sets a global variable that prevents any RCU callbacks from
re-posting themselves. This will not be necessary in most cases, since
RCU callbacks rarely include calls to call_rcu(). However, the rcutorture
module is an exception to this rule, and therefore needs to set this
global variable.

Lines 7-50 stop all the kernel tasks associated with the rcutorture
module. Therefore, once execution reaches line 53, no more rcutorture
RCU callbacks will be posted. The rcu_barrier() call on line 53 waits
for any pre-existing callbacks to complete.

Then lines 55-62 print status and do operation-specific cleanup, and
then return, permitting the module-unload operation to be completed.

.. _rcubarrier_quiz_1:

Quick Quiz #1:
	Is there any other situation where rcu_barrier() might
	be required?

:ref:`Answer to Quick Quiz #1 <answer_rcubarrier_quiz_1>`

Your module might have additional complications. For example, if your
module invokes call_rcu() from timers, you will need to first refrain
from posting new timers, cancel (or wait for) all the already-posted
timers, and only then invoke rcu_barrier() to wait for any remaining
RCU callbacks to complete.

Of course, if your module uses call_rcu(), you will need to invoke
rcu_barrier() before unloading. Similarly, if your module uses
call_srcu(), you will need to invoke srcu_barrier() before unloading,
and on the same srcu_struct structure. If your module uses call_rcu()
**and** call_srcu(), then (as noted above) you will need to invoke
rcu_barrier() **and** srcu_barrier().


Implementing rcu_barrier()
--------------------------

Dipankar Sarma's implementation of rcu_barrier() makes use of the fact
that RCU callbacks are never reordered once queued on one of the per-CPU
queues.
His implementation queues an RCU callback on each of the per-CPU
callback queues, and then waits until they have all started executing, at
which point, all earlier RCU callbacks are guaranteed to have completed.

The original code for rcu_barrier() was roughly as follows::

  1 void rcu_barrier(void)
  2 {
  3 	BUG_ON(in_interrupt());
  4 	/* Take cpucontrol mutex to protect against CPU hotplug */
  5 	mutex_lock(&rcu_barrier_mutex);
  6 	init_completion(&rcu_barrier_completion);
  7 	atomic_set(&rcu_barrier_cpu_count, 1);
  8 	on_each_cpu(rcu_barrier_func, NULL, 0, 1);
  9 	if (atomic_dec_and_test(&rcu_barrier_cpu_count))
 10 		complete(&rcu_barrier_completion);
 11 	wait_for_completion(&rcu_barrier_completion);
 12 	mutex_unlock(&rcu_barrier_mutex);
 13 }

Line 3 verifies that the caller is in process context, and lines 5 and 12
use rcu_barrier_mutex to ensure that only one rcu_barrier() is using the
global completion and counters at a time, which are initialized on lines
6 and 7. Line 8 causes each CPU to invoke rcu_barrier_func(), which is
shown below. Note that the final "1" in on_each_cpu()'s argument list
ensures that all the calls to rcu_barrier_func() will have completed
before on_each_cpu() returns. Line 9 removes the initial count from
rcu_barrier_cpu_count, and if this count is now zero, line 10 finalizes
the completion, which prevents line 11 from blocking. Either way,
line 11 then waits (if needed) for the completion.

.. _rcubarrier_quiz_2:

Quick Quiz #2:
	Why doesn't line 8 initialize rcu_barrier_cpu_count to zero,
	thereby avoiding the need for lines 9 and 10?

:ref:`Answer to Quick Quiz #2 <answer_rcubarrier_quiz_2>`

This code was rewritten in 2008 and several times thereafter, but this
still gives the general idea.

The rcu_barrier_func() runs on each CPU, where it invokes call_rcu()
to post an RCU callback, as follows::

  1 static void rcu_barrier_func(void *notused)
  2 {
  3 	int cpu = smp_processor_id();
  4 	struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
  5 	struct rcu_head *head;
  6
  7 	head = &rdp->barrier;
  8 	atomic_inc(&rcu_barrier_cpu_count);
  9 	call_rcu(head, rcu_barrier_callback);
 10 }

Lines 3 and 4 locate RCU's internal per-CPU rcu_data structure,
which contains the struct rcu_head needed for the later call to
call_rcu(). Line 7 picks up a pointer to this struct rcu_head, and line
8 increments the global counter.
This counter will later be decremented
by the callback. Line 9 then registers the rcu_barrier_callback() on
the current CPU's queue.

The rcu_barrier_callback() function simply atomically decrements the
rcu_barrier_cpu_count variable and finalizes the completion when it
reaches zero, as follows::

  1 static void rcu_barrier_callback(struct rcu_head *notused)
  2 {
  3 	if (atomic_dec_and_test(&rcu_barrier_cpu_count))
  4 		complete(&rcu_barrier_completion);
  5 }

.. _rcubarrier_quiz_3:

Quick Quiz #3:
	What happens if CPU 0's rcu_barrier_func() executes
	immediately (thus incrementing rcu_barrier_cpu_count to the
	value one), but the other CPU's rcu_barrier_func() invocations
	are delayed for a full grace period? Couldn't this result in
	rcu_barrier() returning prematurely?

:ref:`Answer to Quick Quiz #3 <answer_rcubarrier_quiz_3>`

The current rcu_barrier() implementation is more complex, due to the need
to avoid disturbing idle CPUs (especially on battery-powered systems)
and the need to minimally disturb non-idle CPUs in real-time systems.
In addition, a great many optimizations have been applied. However,
the code above illustrates the concepts.


rcu_barrier() Summary
---------------------

The rcu_barrier() primitive is used relatively infrequently, since most
code using RCU is in the core kernel rather than in modules. However, if
you are using RCU from an unloadable module, you need to use rcu_barrier()
so that your module may be safely unloaded.


Answers to Quick Quizzes
------------------------

.. _answer_rcubarrier_quiz_1:

Quick Quiz #1:
	Is there any other situation where rcu_barrier() might
	be required?

Answer:
	Interestingly enough, rcu_barrier() was not originally
	implemented for module unloading. Nikita Danilov was using
	RCU in a filesystem, which resulted in a similar situation at
	filesystem-unmount time. Dipankar Sarma coded up rcu_barrier()
	in response, so that Nikita could invoke it during the
	filesystem-unmount process.

	Much later, yours truly hit the RCU module-unload problem when
	implementing rcutorture, and found that rcu_barrier() solves
	this problem as well.

:ref:`Back to Quick Quiz #1 <rcubarrier_quiz_1>`

.. _answer_rcubarrier_quiz_2:

Quick Quiz #2:
	Why doesn't line 8 initialize rcu_barrier_cpu_count to zero,
	thereby avoiding the need for lines 9 and 10?

Answer:
	Suppose that the on_each_cpu() function shown on line 8 was
	delayed, so that CPU 0's rcu_barrier_func() executed and
	the corresponding grace period elapsed, all before CPU 1's
	rcu_barrier_func() started executing. This would result in
	rcu_barrier_cpu_count being decremented to zero, so that line
	11's wait_for_completion() would return immediately, failing to
	wait for CPU 1's callbacks to be invoked.

	Note that this was not a problem when the rcu_barrier() code
	was first added back in 2005. This is because on_each_cpu()
	disables preemption, which acted as an RCU read-side critical
	section, thus preventing CPU 0's grace period from completing
	until on_each_cpu() had dealt with all of the CPUs. However,
	with the advent of preemptible RCU, rcu_barrier() no longer
	waited on nonpreemptible regions of code in preemptible kernels,
	that being the job of the new rcu_barrier_sched() function.

	However, with the RCU flavor consolidation around v4.20, this
	possibility was once again ruled out, because the consolidated
	RCU once again waits on nonpreemptible regions of code.

	Nevertheless, that extra count might still be a good idea.
	Relying on this sort of accident of implementation can result
	in later surprise bugs when the implementation changes.

:ref:`Back to Quick Quiz #2 <rcubarrier_quiz_2>`

.. _answer_rcubarrier_quiz_3:

Quick Quiz #3:
	What happens if CPU 0's rcu_barrier_func() executes
	immediately (thus incrementing rcu_barrier_cpu_count to the
	value one), but the other CPU's rcu_barrier_func() invocations
	are delayed for a full grace period? Couldn't this result in
	rcu_barrier() returning prematurely?

Answer:
	This cannot happen. The reason is that on_each_cpu() has its last
	argument, the wait flag, set to "1". This flag is passed through
	to smp_call_function() and further to smp_call_function_on_cpu(),
	causing this latter to spin until the cross-CPU invocation of
	rcu_barrier_func() has completed. This by itself would prevent
	a grace period from completing on non-CONFIG_PREEMPTION kernels,
	since each CPU must undergo a context switch (or other quiescent
	state) before the grace period can complete. However, this is
	of no use in CONFIG_PREEMPTION kernels.

	Therefore, on_each_cpu() disables preemption across its call
	to smp_call_function() and also across the local call to
	rcu_barrier_func(). Because recent RCU implementations treat
	preemption-disabled regions of code as RCU read-side critical
	sections, this prevents grace periods from completing. This
	means that all CPUs have executed rcu_barrier_func() before
	the first rcu_barrier_callback() can possibly execute, in turn
	preventing rcu_barrier_cpu_count from prematurely reaching zero.

	But if on_each_cpu() ever decides to forgo disabling preemption,
	as might well happen due to real-time latency considerations,
	initializing rcu_barrier_cpu_count to one will save the day.

:ref:`Back to Quick Quiz #3 <rcubarrier_quiz_3>`