| #
1a72f4bb |
| 29-Dec-2025 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Add noinstr-fast rcu_read_{,un}lock_tasks_trace() APIs
When expressing RCU Tasks Trace in terms of SRCU-fast, it was necessary to keep a nesting count and per-CPU srcu_ctr structure pointer in
rcu: Add noinstr-fast rcu_read_{,un}lock_tasks_trace() APIs
When expressing RCU Tasks Trace in terms of SRCU-fast, it was necessary to keep a nesting count and per-CPU srcu_ctr structure pointer in the task_struct structure, which is slow to access. But an alternative is to instead make rcu_read_lock_tasks_trace() and rcu_read_unlock_tasks_trace(), which match the underlying SRCU-fast semantics, avoiding the task_struct accesses.
When all callers have switched to the new API, the previous rcu_read_lock_trace() and rcu_read_unlock_trace() APIs will be removed.
The rcu_read_{,un}lock_{,tasks_}trace() functions need to use smp_mb() only if invoked where RCU is not watching, that is, from locations where a call to rcu_is_watching() would return false. In architectures that define the ARCH_WANTS_NO_INSTR Kconfig option, use of noinstr and friends ensures that tracing happens only where RCU is watching, so those architectures can dispense entirely with the read-side calls to smp_mb().
Other architectures include these read-side calls by default, but in many installations there might be either larger than average tolerance for risk, prohibition of removing tracing on a running system, or careful review and approval of removing of tracing. Such installations can build their kernels with CONFIG_TASKS_TRACE_RCU_NO_MB=y to avoid those read-side calls to smp_mb(), thus accepting responsibility for run-time removal of tracing from code regions that RCU is not watching.
Those wishing to disable read-side memory barriers for an entire architecture can select this TASKS_TRACE_RCU_NO_MB Kconfig option, hence the polarity.
[ paulmck: Apply Peter Zijlstra feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
176a6aea |
| 29-Dec-2025 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Move rcu_tasks_trace_srcu_struct out of #ifdef CONFIG_TASKS_RCU_GENERIC
Moving the rcu_tasks_trace_srcu_struct structure instance out from under the CONFIG_TASKS_RCU_GENERIC Kconfig option perm
rcu: Move rcu_tasks_trace_srcu_struct out of #ifdef CONFIG_TASKS_RCU_GENERIC
Moving the rcu_tasks_trace_srcu_struct structure instance out from under the CONFIG_TASKS_RCU_GENERIC Kconfig option permits the CONFIG_TASKS_TRACE_RCU Kconfig option to stop enabling this CONFIG_TASKS_RCU_GENERIC Kconfig option. This commit also therefore makes it so.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
a73fc3dc |
| 29-Dec-2025 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Clean up after the SRCU-fastification of RCU Tasks Trace
Now that RCU Tasks Trace has been re-implemented in terms of SRCU-fast, the ->trc_ipi_to_cpu, ->trc_blkd_cpu, ->trc_blkd_node, ->trc_hol
rcu: Clean up after the SRCU-fastification of RCU Tasks Trace
Now that RCU Tasks Trace has been re-implemented in terms of SRCU-fast, the ->trc_ipi_to_cpu, ->trc_blkd_cpu, ->trc_blkd_node, ->trc_holdout_list, and ->trc_reader_special task_struct fields are no longer used.
In addition, the rcu_tasks_trace_qs(), rcu_tasks_trace_qs_blkd(), exit_tasks_rcu_finish_trace(), and rcu_spawn_tasks_trace_kthread(), show_rcu_tasks_trace_gp_kthread(), rcu_tasks_trace_get_gp_data(), rcu_tasks_trace_torture_stats_print(), and get_rcu_tasks_trace_gp_kthread() functions and all the other functions that they invoke are no longer used.
Also, the TRC_NEED_QS and TRC_NEED_QS_CHECKED CPP macros are no longer used. Neither are the rcu_tasks_trace_lazy_ms and rcu_task_ipi_delay rcupdate module parameters and the TASKS_TRACE_RCU_READ_MB Kconfig option.
This commit therefore removes all of them.
[ paulmck: Apply Alexei Starovoitov feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
1dc1e0b9 |
| 25-Mar-2025 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Make FORCE_NEED_SRCU_NMI_SAFE depend on RCU_EXPERT
The FORCE_NEED_SRCU_NMI_SAFE is useful only for those wishing to test the SRCU code paths that accommodate architectures that do not have NMI
srcu: Make FORCE_NEED_SRCU_NMI_SAFE depend on RCU_EXPERT
The FORCE_NEED_SRCU_NMI_SAFE is useful only for those wishing to test the SRCU code paths that accommodate architectures that do not have NMI-safe per-CPU operations, that is, those architectures that do not select the ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option. As such, this is a specialized Kconfig option that is not intended for casual users.
This commit therefore hides it behind the RCU_EXPERT Kconfig option. Given that this new FORCE_NEED_SRCU_NMI_SAFE Kconfig option has no effect unless the ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option is also selected, it also depends on this Kconfig option.
[ paulmck: Apply Geert Uytterhoeven feedback. ]
[ boqun: Add the "Fixes" tag. ]
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/all/CAMuHMdX6dy9_tmpLkpcnGzxyRbe6qSWYukcPp=H1GzZdyd3qBQ@mail.gmail.com/ Fixes: 536e8b9b80bc ("srcu: Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
467c890f |
| 05-Mar-2025 |
Boqun Feng <boqun.feng@gmail.com> |
Merge branches 'docs.2025.02.04a', 'lazypreempt.2025.03.04a', 'misc.2025.03.04a', 'srcu.2025.02.05a' and 'torture.2025.02.05a'
|
| #
8437bb84 |
| 13-Dec-2024 |
Ankur Arora <ankur.a.arora@oracle.com> |
rcu: limit PREEMPT_RCU configurations
PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC which allows for dynamic switching of preemption models.
The choice of PREEMPT_RCU or not,
rcu: limit PREEMPT_RCU configurations
PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC which allows for dynamic switching of preemption models.
The choice of PREEMPT_RCU or not, however, is fixed at compile time.
Given that PREEMPT_RCU makes some trade-offs to optimize for latency as opposed to throughput, configurations with limited preemption might prefer the stronger forward-progress guarantees of PREEMPT_RCU=n.
Accordingly, explicitly limit PREEMPT_RCU=y to the latency oriented preemption models: PREEMPT, PREEMPT_RT, and the runtime configurable model PREEMPT_DYNAMIC.
This means the throughput oriented models, PREEMPT_NONE, PREEMPT_VOLUNTARY, and PREEMPT_LAZY will run with PREEMPT_RCU=n.
Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
536e8b9b |
| 10-Jan-2025 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing
The srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() functions map to __srcu_read_lock() and __srcu_read_unlock() on systems like x86 th
srcu: Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing
The srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() functions map to __srcu_read_lock() and __srcu_read_unlock() on systems like x86 that have NMI-safe this_cpu_inc() operations. This makes the underlying __srcu_read_lock_nmisafe() and __srcu_read_unlock_nmisafe() functions difficult to test on (for example) x86 systems, allowing bugs to creep in.
This commit therefore creates a FORCE_NEED_SRCU_NMI_SAFE Kconfig that forces those underlying functions to be used even on systems where they are not needed, thus providing better testing coverage.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
4dca1af4 |
| 13-Dec-2024 |
Ankur Arora <ankur.a.arora@oracle.com> |
rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
Replace mentions of PREEMPT_AUTO with PREEMPT_LAZY.
Also, since PREMPT_LAZY implies PREEMPTION, we can reduce the TASKS_RCU selection criteria from this:
rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
Replace mentions of PREEMPT_AUTO with PREEMPT_LAZY.
Also, since PREMPT_LAZY implies PREEMPTION, we can reduce the TASKS_RCU selection criteria from this:
NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
to this:
NEED_TASKS_RCU && PREEMPTION
CC: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
81a208c5 |
| 09-Jan-2025 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Clarify RCU_LAZY and RCU_LAZY_DEFAULT_OFF help text
This commit wordsmiths the RCU_LAZY and RCU_LAZY_DEFAULT_OFF Kconfig options' help text.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org
rcu: Clarify RCU_LAZY and RCU_LAZY_DEFAULT_OFF help text
This commit wordsmiths the RCU_LAZY and RCU_LAZY_DEFAULT_OFF Kconfig options' help text.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
f30e2582 |
| 09-Oct-2024 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Add rcuog kthreads to RCU_NOCB_CPU help text
The RCU_NOCB_CPU help text currently fails to mention rcuog kthreads, so this commit adds this information.
Reported-by: Olivier Langlois <olivier@
rcu: Add rcuog kthreads to RCU_NOCB_CPU help text
The RCU_NOCB_CPU help text currently fails to mention rcuog kthreads, so this commit adds this information.
Reported-by: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
show more ...
|
| #
1b4e9fdf |
| 22-Feb-2024 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Create NEED_TASKS_RCU to factor out enablement logic
Currently, if a Kconfig option depends on TASKS_RCU, it conditionally does "select TASKS_RCU if PREEMPTION". This works, but requires any c
rcu: Create NEED_TASKS_RCU to factor out enablement logic
Currently, if a Kconfig option depends on TASKS_RCU, it conditionally does "select TASKS_RCU if PREEMPTION". This works, but requires any change in this enablement logic to be replicated across all such "select" clauses. This commit therefore creates a new NEED_TASKS_RCU Kconfig option so that the default value of TASKS_RCU can depend on a combination of this new option and any needed enablement logic, so that this logic is in one place.
While in the area, also anticipate a likely future change by adding PREEMPT_AUTO to that logic.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Ankur Arora <ankur.a.arora@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
show more ...
|
| #
c1ec7c15 |
| 15-Feb-2024 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Make TINY_RCU depend on !PREEMPT_RCU rather than !PREEMPTION
Right now, TINY_RCU depends on (!PREEMPTION && !SMP), which has served the kernel well for many years due to the fact that PREEMPT_R
rcu: Make TINY_RCU depend on !PREEMPT_RCU rather than !PREEMPTION
Right now, TINY_RCU depends on (!PREEMPTION && !SMP), which has served the kernel well for many years due to the fact that PREEMPT_RCU is normally a synonym for PREEMPTION. But with the advent of lazy preemption, it will be possible to have non-preemptible RCU in a preemptible kernel, so that kernels could be built with PREEMPT_RCU=n and PREEMPTION=y.
This commit therefore makes TINY_RCU depend on (!PREEMPT_RCU && !SMP), thus allowing for a non-preemptible RCU in preemptible kernels.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Ankur Arora <ankur.a.arora@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
show more ...
|
| #
7f66f099 |
| 03-Dec-2023 |
Qais Yousef <qyousef@layalina.io> |
rcu: Provide a boot time parameter to control lazy RCU
To allow more flexible arrangements while still provide a single kernel for distros, provide a boot time parameter to enable/disable lazy RCU.
rcu: Provide a boot time parameter to control lazy RCU
To allow more flexible arrangements while still provide a single kernel for distros, provide a boot time parameter to enable/disable lazy RCU.
Specify:
rcutree.enable_rcu_lazy=[y|1|n|0]
Which also requires
rcu_nocbs=all
at boot time to enable/disable lazy RCU.
To disable it by default at build time when CONFIG_RCU_LAZY=y, the new CONFIG_RCU_LAZY_DEFAULT_OFF can be used.
Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Tested-by: Andrea Righi <andrea.righi@canonical.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
show more ...
|
| #
f51164a8 |
| 31-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Employ jiffies-based backstop to callback time limit
Currently, if there are more than 100 ready-to-invoke RCU callbacks queued on a given CPU, the rcu_do_batch() function sets a timeout for in
rcu: Employ jiffies-based backstop to callback time limit
Currently, if there are more than 100 ready-to-invoke RCU callbacks queued on a given CPU, the rcu_do_batch() function sets a timeout for invocation of the series. This timeout defaulting to three milliseconds, and may be adjusted using the rcutree.rcu_resched_ns kernel boot parameter. This timeout is checked using local_clock(), but the overhead of this function combined with the common-case very small callback-invocation overhead means that local_clock() is checked every 32nd invocation.
This works well except for longer-than average callbacks. For example, a series of 500-microsecond-duration callbacks means that local_clock() is checked only once every 16 milliseconds, which makes it difficult to enforce a three-millisecond timeout.
This commit therefore adds a Kconfig option RCU_DOUBLE_CHECK_CB_TIME that enables backup timeout checking using the coarser grained but lighter weight jiffies. If the jiffies counter detects a timeout, then local_clock() is consulted even if this is not the 32nd callback. This prevents the aforementioned 16-millisecond latency blow.
Reported-by: Domas Mituzas <dmituzas@meta.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
| #
e035e887 |
| 24-Mar-2023 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Remove CONFIG_SRCU
Now that all references to CONFIG_SRCU have been removed, it is time to remove CONFIG_SRCU itself.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: John Ogness <john
rcu: Remove CONFIG_SRCU
Now that all references to CONFIG_SRCU have been removed, it is time to remove CONFIG_SRCU itself.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Reviewed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
show more ...
|
| #
98d0052d |
| 12-Dec-2022 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'printk-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Add NMI-safe SRCU reader API. It uses atomic_inc() instead of th
Merge tag 'printk-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Add NMI-safe SRCU reader API. It uses atomic_inc() instead of this_cpu_inc() on strong load-store architectures.
- Introduce new console_list_lock to synchronize a manipulation of the list of registered consoles and their flags.
This is a first step in removing the big-kernel-lock-like behavior of console_lock(). This semaphore still serializes console->write() calbacks against:
- each other. It primary prevents potential races between early and proper console drivers using the same device.
- suspend()/resume() callbacks and init() operations in some drivers.
- various other operations in the tty/vt and framebufer susbsystems. It is likely that console_lock() serializes even operations that are not directly conflicting with the console->write() callbacks here. This is the most complicated big-kernel-lock aspect of the console_lock() that will be hard to untangle.
- Introduce new console_srcu lock that is used to safely iterate and access the registered console drivers under SRCU read lock.
This is a prerequisite for introducing atomic console drivers and console kthreads. It will reduce the complexity of serialization against normal consoles and console_lock(). Also it should remove the risk of deadlock during critical situations, like Oops or panic, when only atomic consoles are registered.
- Check whether the console is registered instead of enabled on many locations. It was a historical leftover.
- Cleanly force a preferred console in xenfb code instead of a dirty hack.
- A lot of code and comment clean ups and improvements.
* tag 'printk-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (47 commits) printk: htmldocs: add missing description tty: serial: sh-sci: use setup() callback for early console printk: relieve console_lock of list synchronization duties tty: serial: kgdboc: use console_list_lock to trap exit tty: serial: kgdboc: synchronize tty_find_polling_driver() and register_console() tty: serial: kgdboc: use console_list_lock for list traversal tty: serial: kgdboc: use srcu console list iterator proc: consoles: use console_list_lock for list iteration tty: tty_io: use console_list_lock for list synchronization printk, xen: fbfront: create/use safe function for forcing preferred netconsole: avoid CON_ENABLED misuse to track registration usb: early: xhci-dbc: use console_is_registered() tty: serial: xilinx_uartps: use console_is_registered() tty: serial: samsung_tty: use console_is_registered() tty: serial: pic32_uart: use console_is_registered() tty: serial: earlycon: use console_is_registered() tty: hvc: use console_is_registered() efi: earlycon: use console_is_registered() tty: nfcon: use console_is_registered() serial_core: replace uart_console_enabled() with uart_console_registered() ...
show more ...
|
| #
87492c06 |
| 30-Nov-2022 |
Paul E. McKenney <paulmck@kernel.org> |
Merge branches 'doc.2022.10.20a', 'fixes.2022.10.21a', 'lazy.2022.11.30a', 'srcunmisafe.2022.11.09a', 'torture.2022.10.18c' and 'torturescript.2022.10.20a' into HEAD
doc.2022.10.20a: Documentation u
Merge branches 'doc.2022.10.20a', 'fixes.2022.10.21a', 'lazy.2022.11.30a', 'srcunmisafe.2022.11.09a', 'torture.2022.10.18c' and 'torturescript.2022.10.20a' into HEAD
doc.2022.10.20a: Documentation updates. fixes.2022.10.21a: Miscellaneous fixes. lazy.2022.11.30a: Lazy call_rcu() and NOCB updates. srcunmisafe.2022.11.09a: NMI-safe SRCU readers. torture.2022.10.18c: Torture-test updates. torturescript.2022.10.20a: Torture-test scripting updates.
show more ...
|
| #
0cd7e350 |
| 22-Nov-2022 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Make SRCU mandatory
Kernels configured with CONFIG_PRINTK=n and CONFIG_SRCU=n get build failures. This causes trouble for deep embedded systems. But given that there are more than 25 instance
rcu: Make SRCU mandatory
Kernels configured with CONFIG_PRINTK=n and CONFIG_SRCU=n get build failures. This causes trouble for deep embedded systems. But given that there are more than 25 instances of "select SRCU" in the kernel, it is hard to believe that there are many kernels running in production without SRCU. This commit therefore makes SRCU mandatory. The SRCU Kconfig option remains for backwards compatibility, and will be removed when it is no longer used.
[ paulmck: Update per kernel test robot feedback. ]
Reported-by: John Ogness <john.ogness@linutronix.de> Reported-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: <linux-arch@vger.kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: John Ogness <john.ogness@linutronix.de>
show more ...
|
| #
3cb278e7 |
| 16-Oct-2022 |
Joel Fernandes (Google) <joel@joelfernandes.org> |
rcu: Make call_rcu() lazy to save power
Implement timer-based RCU callback batching (also known as lazy callbacks). With this we save about 5-10% of power consumed due to RCU requests that happen wh
rcu: Make call_rcu() lazy to save power
Implement timer-based RCU callback batching (also known as lazy callbacks). With this we save about 5-10% of power consumed due to RCU requests that happen when system is lightly loaded or idle.
By default, all async callbacks (queued via call_rcu) are marked lazy. An alternate API call_rcu_hurry() is provided for the few users, for example synchronize_rcu(), that need the old behavior.
The batch is flushed whenever a certain amount of time has passed, or the batch on a particular CPU grows too big. Also memory pressure will flush it in a future patch.
To handle several corner cases automagically (such as rcu_barrier() and hotplug), we re-use bypass lists which were originally introduced to address lock contention, to handle lazy CBs as well. The bypass list length has the lazy CB length included in it. A separate lazy CB length counter is also introduced to keep track of the number of lazy CBs.
[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ] [ paulmck: Apply Zqiang feedback. ] [ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Suggested-by: Paul McKenney <paulmck@kernel.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
show more ...
|
| #
2e83b879 |
| 15-Sep-2022 |
Paul E. McKenney <paulmck@kernel.org> |
srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()
On strict load-store architectures, the use of this_cpu_inc() by srcu_read_lock() and srcu_read_unlock() is not NMI-safe in TR
srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()
On strict load-store architectures, the use of this_cpu_inc() by srcu_read_lock() and srcu_read_unlock() is not NMI-safe in TREE SRCU. To see this suppose that an NMI arrives in the middle of srcu_read_lock(), just after it has read ->srcu_lock_count, but before it has written the incremented value back to memory. If that NMI handler also does srcu_read_lock() and srcu_read_lock() on that same srcu_struct structure, then upon return from that NMI handler, the interrupted srcu_read_lock() will overwrite the NMI handler's update to ->srcu_lock_count, but leave unchanged the NMI handler's update by srcu_read_unlock() to ->srcu_unlock_count.
This can result in a too-short SRCU grace period, which can in turn result in arbitrary memory corruption.
If the NMI handler instead interrupts the srcu_read_unlock(), this can result in eternal SRCU grace periods, which is not much better.
This commit therefore creates a pair of new srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() functions, which allow SRCU readers in both NMI handlers and in process and IRQ context. It is bad practice to mix the existing and the new _nmisafe() primitives on the same srcu_struct structure. Use one set or the other, not both.
Just to underline that "bad practice" point, using srcu_read_lock() at process level and srcu_read_lock_nmisafe() in your NMI handler will not, repeat NOT, work. If you do not immediately understand why this is the case, please review the earlier paragraphs in this commit log.
[ paulmck: Apply kernel test robot feedback. ] [ paulmck: Apply feedback from Randy Dunlap. ] [ paulmck: Apply feedback from John Ogness. ] [ paulmck: Apply feedback from Frederic Weisbecker. ]
Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com>
show more ...
|
| #
34bc7b45 |
| 22-Jul-2022 |
Paul E. McKenney <paulmck@kernel.org> |
Merge branch 'ctxt.2022.07.05a' into HEAD
ctxt.2022.07.05a: Linux-kernel memory model development branch.
|
| #
8f489b4d |
| 11-May-2022 |
Uladzislau Rezki (Sony) <urezki@gmail.com> |
rcu/nocb: Add option to opt rcuo kthreads out of RT priority
This commit introduces a RCU_NOCB_CPU_CB_BOOST Kconfig option that prevents rcuo kthreads from running at real-time priority, even in ker
rcu/nocb: Add option to opt rcuo kthreads out of RT priority
This commit introduces a RCU_NOCB_CPU_CB_BOOST Kconfig option that prevents rcuo kthreads from running at real-time priority, even in kernels built with RCU_BOOST. This capability is important to devices needing low-latency (as in a few milliseconds) response from expedited RCU grace periods, but which are not running a classic real-time workload. On such devices, permitting the rcuo kthreads to run at real-time priority results in unacceptable latencies imposed on the application tasks, which run as SCHED_OTHER.
See for example the following trace output:
<snip> <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270 <snip>
If that rcuop kthread were permitted to run at real-time SCHED_FIFO priority, it would monopolize its CPU for hundreds of milliseconds while invoking those 34619 RCU callback functions, which would cause an unacceptably long latency spike for many application stacks on Android platforms.
However, some existing real-time workloads require that callback invocation run at SCHED_FIFO priority, for example, those running on systems with heavy SCHED_OTHER background loads. (It is the real-time system's administrator's responsibility to make sure that important real-time tasks run at a higher priority than do RCU's kthreads.)
Therefore, this new RCU_NOCB_CPU_CB_BOOST Kconfig option defaults to "y" on kernels built with PREEMPT_RT and defaults to "n" otherwise. The effect is to preserve current behavior for real-time systems, but for other systems to allow expedited RCU grace periods to run with real-time priority while continuing to invoke RCU callbacks as SCHED_OTHER.
As you would expect, this RCU_NOCB_CPU_CB_BOOST Kconfig option has no effect except on CPUs with offloaded RCU callbacks.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
show more ...
|
| #
b37a667c |
| 22-Apr-2022 |
Joel Fernandes <joel@joelfernandes.org> |
rcu/nocb: Add an option to offload all CPUs on boot
Systems built with CONFIG_RCU_NOCB_CPU=y but booted without either the rcu_nocbs= or rcu_nohz_full= kernel-boot parameters will not have callback
rcu/nocb: Add an option to offload all CPUs on boot
Systems built with CONFIG_RCU_NOCB_CPU=y but booted without either the rcu_nocbs= or rcu_nohz_full= kernel-boot parameters will not have callback offloading on any of the CPUs, nor can any of the CPUs be switched to enable callback offloading at runtime. Although this is intentional, it would be nice to have a way to offload all the CPUs without having to make random bootloaders specify either the rcu_nocbs= or the rcu_nohz_full= kernel-boot parameters.
This commit therefore provides a new CONFIG_RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that switches the default so as to offload callback processing on all of the CPUs. This default can still be overridden using the rcu_nocbs= and rcu_nohz_full= kernel-boot parameters.
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Uladzislau Rezki <urezki@gmail.com> (In v4.1, fixed issues with CONFIG maze reported by kernel test robot). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
show more ...
|
| #
e67198cc |
| 08-Jun-2022 |
Frederic Weisbecker <frederic@kernel.org> |
context_tracking: Take idle eqs entrypoints over RCU
The RCU dynticks counter is going to be merged into the context tracking subsystem. Start with moving the idle extended quiescent states entrypoi
context_tracking: Take idle eqs entrypoints over RCU
The RCU dynticks counter is going to be merged into the context tracking subsystem. Start with moving the idle extended quiescent states entrypoints to context tracking. For now those are dumb redirections to existing RCU calls.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Yu Liao <liaoyu15@huawei.com> Cc: Phil Auld <pauld@redhat.com> Cc: Paul Gortmaker<paul.gortmaker@windriver.com> Cc: Alex Belits <abelits@marvell.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
show more ...
|
| #
ce133890 |
| 11-May-2022 |
Paul E. McKenney <paulmck@kernel.org> |
Merge branch 'exp.2022.05.11a' into HEAD
exp.2022.05.11a: Expedited-grace-period latency-reduction updates.
|