#
686bcb5c |
| 15-Dec-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
schedlock 4/4
Don't hold the scheduler lock while doing context switches. Instead we unlock after selecting the new thread and switch within a spinlock section, leaving interrupts and preemption disabled to prevent local concurrency. This means that mi_switch() is entered with the thread locked but returns without. This dramatically simplifies scheduler locking because we will not hold the schedlock while spinning on a blocked lock in switch.
This change has not been made to 4BSD but in principle it would be more straightforward.
Discussed with: markj Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D22778
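Editor's illustration, not the committed diff: a sketch of the resulting calling convention, assuming the single-argument mi_switch() of this era; the helper name and the flag combination are only for show. The caller takes the thread lock, and mi_switch() returns with it already released.

    static void
    example_block_point(struct thread *td)
    {
            thread_lock(td);
            /* ... record why this thread is switching out ... */
            mi_switch(SW_VOL | SWT_RELINQUISH);
            /* No thread_unlock() here: mi_switch() dropped the thread lock. */
    }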
|
#
61a74c5c |
| 15-Dec-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
schedlock 1/4
Eliminate recursion from most thread_lock consumers. Return from sched_add() without the thread_lock held. This eliminates unnecessary atomics and lock word loads as well as reducing the hold time for scheduler locks. This will eventually allow for lockless remote adds.
Discussed with: kib Reviewed by: jhb Tested by: pho Differential Revision: https://reviews.freebsd.org/D22626
|
#
9825eadf |
| 13-Dec-2019 |
Ryan Libby <rlibby@FreeBSD.org> |
bitset: rename confusing macro NAND to ANDNOT
s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too. The actual implementation is "and not" (or "but not"), i.e. A but not B. Fortunately this does appear to be what all existing callers want.
Don't supply a NAND (not (A and B)) operation at this time.
Discussed with: jeff Reviewed by: cem Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22791
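Editor's illustration (assuming the two-argument CPU_ANDNOT() of this era; the sets are made up): the operation keeps the destination bits that are not set in the source, i.e. "A but not B", whereas a true NAND would be ~(A & B).

    cpuset_t candidates, busy;

    CPU_FILL(&candidates);            /* start with every CPU */
    CPU_ZERO(&busy);
    CPU_SET(3, &busy);                /* pretend CPU 3 is busy */
    CPU_ANDNOT(&candidates, &busy);   /* candidates &= ~busy */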
|
#
7789ab32 |
| 12-Dec-2019 |
Mark Johnston <markj@FreeBSD.org> |
Rename tdq_ipipending and clear it in sched_switch().
This fixes a regression after r355311. Specifically, sched_preempt() may trigger a context switch by calling thread_lock(), since thread_lock() calls critical_exit() in its slow path and the interrupted thread may have already been marked for preemption. This would happen before tdq_ipipending is cleared, blocking further preemption IPIs. The CPU can be left in this state indefinitely if the interrupted thread migrates.
Rename tdq_ipipending to tdq_owepreempt. Any switch satisfies a remote preemption request, so clear tdq_owepreempt in sched_switch() instead of sched_preempt() to avoid subtle problems of the sort described above.
Reviewed by: jeff, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22758
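Editor's illustration (the function is hypothetical; tdq_owepreempt is the renamed field from the commit): the invariant is simply that the flag is cleared wherever a switch happens.

    static void
    sched_switch_sketch(struct tdq *tdq)
    {
            /*
             * Any context switch satisfies a pending remote preemption
             * request, so clear the flag here rather than in sched_preempt(),
             * which may not be reached before the switch.
             */
            tdq->tdq_owepreempt = 0;
            /* ... select the next thread and switch ... */
    }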
|
#
c3cccf95 |
| 08-Dec-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Handle multiple clock interrupts simultaneously in sched_clock().
Reviewed by: kib, markj, mav Differential Revision: https://reviews.freebsd.org/D22625
|
#
61322a0a |
| 04-Dec-2019 |
Alexander Motin <mav@FreeBSD.org> |
Mark some more hot global variables with __read_mostly.
MFC after: 1 week
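Editor's illustration (the variable is hypothetical): __read_mostly moves a frequently-read, rarely-written global into a separate linker section so it does not share cache lines with frequently-written data.

    static int sched_example_knob __read_mostly = 1;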
|
#
e1504695 |
| 02-Dec-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Initialize the idle thread's lock sooner so it's not evaluated on every fork exit and we can rely on it elsewhere.
Reviewed by: mav, kib, jhb, markj Differential Revision: https://reviews.freebsd.org/D22624
|
Revision tags: release/12.1.0 |
|
#
668ee101 |
| 26-Sep-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r352587 through r352763.
|
#
176dd236 |
| 26-Sep-2019 |
Alexander Motin <mav@FreeBSD.org> |
Microoptimize sched_pickcpu() CPU affinity on SMT.
Using CPU_FFS() to implement CPUSET_FOREACH() saves up to ~0.5% of CPU time on a 72-thread SMT system doing 80K IOPS to NVMe from one thread.
MFC after: 1 month Sponsored by: iXsystems, Inc.
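Editor's illustration of the pattern (the source set and the callback are hypothetical): walk only the set bits of a cpuset_t with CPU_FFS() instead of testing every possible CPU id.

    cpuset_t mask;
    int cpu;

    CPU_COPY(&interesting_cpus, &mask);
    while ((cpu = CPU_FFS(&mask)) != 0) {
            cpu--;                  /* CPU_FFS() is 1-based */
            CPU_CLR(cpu, &mask);
            consider_cpu(cpu);      /* hypothetical per-CPU check */
    }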
|
#
c55dc51c |
| 25-Sep-2019 |
Alexander Motin <mav@FreeBSD.org> |
Microoptimize sched_pickcpu() after r352658.
I noticed that I missed an intr check at one more SCHED_AFFINITY() use, so instead of adding one more branch I prefer to remove a few.
The profiler shows the function's CPU time dropping from 0.24% to 0.16%.
MFC after: 1 month Sponsored by: iXsystems, Inc.
|
#
bb3dfc6a |
| 25-Sep-2019 |
Alexander Motin <mav@FreeBSD.org> |
Fix wrong assertion in r352658.
MFC after: 1 month
|
#
c9205e35 |
| 24-Sep-2019 |
Alexander Motin <mav@FreeBSD.org> |
Fix/improve interrupt threads scheduling.
While doing some tests with very high interrupt rates I noticed that one of the conditions I added in r232207 to make interrupt threads run on the local CPU in most cases never worked as expected: it only applied when the thread had previously run on some other CPU, which is the opposite of the intent. It caused additional CPU usage for a full CPU search and could schedule interrupt threads to some other CPU.
This patch removes that code and instead reuses the existing non-interrupt code path, with some tweaks for the interrupt case:
 - On SMT systems, if the current hardware thread is idle, don't look at the other threads. Even if they are busy, it may take more time to do a full search and bounce the interrupt thread to another core than to execute it locally, even while sharing CPU resources. It is the other threads that should migrate, not bound interrupts.
 - Try hard to keep interrupt threads within the LLC of their original CPU. This improves scheduling cost and supposedly cache and memory locality.
On a test system with 72 threads doing 2.2M IOPS to NVMe this saves a few percent of CPU time while adding a few percent to IOPS.
MFC after: 1 month Sponsored by: iXsystems, Inc.
|
#
018ff686 |
| 13-Aug-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Move scheduler state into the per-cpu area where it can be allocated on the correct NUMA domain.
Reviewed by: markj, gallatin Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19315
|
Revision tags: release/11.3.0 |
|
#
7648bc9f |
| 13-May-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead @347527
Sponsored by: The FreeBSD Foundation
|
#
ac97da9a |
| 08-May-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
Reduce umtx-related work on exec and exit
- there is no need to take the process lock to iterate the thread list after single-threading is enforced
- typically there are no mutexes to clean up (testable without taking the global umtx lock)
- typically there is no need to adjust the priority (testable without taking thread lock)
Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20160
|
Revision tags: release/12.0.0 |
|
#
14b841d4 |
| 11-Aug-2018 |
Kyle Evans <kevans@FreeBSD.org> |
MFH @ r337607, in preparation for boarding
|
#
290d9060 |
| 29-Jul-2018 |
Don Lewis <truckman@FreeBSD.org> |
Fix the long term ULE load balancer so that it actually works. The initial call to sched_balance() during startup is meant to initialize balance_ticks, but does not actually do that since smp_started is still zero at that time. Since balance_ticks does not get set, there are no further calls to sched_balance(). Fix this by setting balance_ticks in sched_initticks() since we know the value of balance_interval at that time, and eliminate the useless startup call to sched_balance(). We don't need to randomize the initial value of balance_ticks.
Since there is now only one call to sched_balance(), we can hoist the tests at the top of this function out to the caller and avoid the overhead of the function call when running an SMP kernel on UP hardware.
PR: 223914 Reviewed by: kib MFC after: 2 weeks
|
#
2bf95012 |
| 05-Jul-2018 |
Andrew Turner <andrew@FreeBSD.org> |
Create a new macro for static DPCPU data.
On arm64 (and possibly other architectures) we are unable to use static DPCPU data in kernel modules. This is because the compiler will generate PC-relative accesses; however, the runtime linker expects to be able to relocate these.
In preparation for fixing this, create two macros depending on whether the data is global or static.
Reviewed by: bz, emaste, markj Sponsored by: ABT Systems Ltd Differential Revision: https://reviews.freebsd.org/D16140
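Editor's usage sketch of the two forms (the variables and helper are hypothetical):

    /* File-local per-CPU data, visible only in this translation unit. */
    DPCPU_DEFINE_STATIC(u_long, example_counter);

    /* Global per-CPU data, referenced elsewhere via DPCPU_DECLARE(). */
    DPCPU_DEFINE(u_long, example_total);

    static void
    example_bump(void)
    {
            DPCPU_SET(example_counter, DPCPU_GET(example_counter) + 1);
    }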
|
Revision tags: release/11.2.0 |
|
#
28240885 |
| 08-May-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
Inlined sched_userret.
The tested condition is rarely true, yet it induces a function call on each return to userspace.
Bumps getuid rate by about 1% on Broadwell.
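Editor's sketch of the fast-path pattern (names approximate; the out-of-line slow path is assumed): the rarely-true condition is tested inline so the common return to userspace costs no function call.

    static inline void
    sched_userret(struct thread *td)
    {
            if (__predict_false(td->td_priority != td->td_user_pri))
                    sched_userret_slowpath(td);
    }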
|
#
4c8a8cfc |
| 23-Feb-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Restore UP build.
Reviewed by: truckman Sponsored by: The FreeBSD Foundation
|
#
97e9382d |
| 23-Feb-2018 |
Don Lewis <truckman@FreeBSD.org> |
Decrease latency by not wrapping the idle loop's potentially lengthy search for a thread to steal inside a critical section. Since this allows the search to be preempted, restart the search if preemption happens since the search results found earlier may no longer be valid.
Decrease the latency of starting a thread that may be assigned to this CPU during the search by polling for incoming threads during the search and switching to that thread instead of continuing the search.
Test for stale search results and restart the search before going through the expense of calling tdq_lock_pair(). Retry some tests after grabbing the locks since things may have changed while waiting to get both locks.
Eliminate special case handling for stealing from an SMT peer that uses 1 as the steal threshold. This can only succeed if a thread has been assigned but our SMT peer has not yet started executing it. This is quite rare and when it happens the other SMT thread is generally waiting for the same tdq lock that we hold. Basically both SMT threads are racing to grab the same spin lock.
Add the kern.sched.always_steal knob from a ULE patch by jeff@.
Incorporate another idea from Jeff's ULE patch. If sched_switch() detects that the CPU is about to go idle, try to steal a thread before switching to the idle thread. Since the search for a thread to steal has to be done inside a critical section in this context, limit the impact on latency by adding the knob kern.sched.trysteal_limit to limit the topological distance of the search and don't restart the search if we detect stale results. If this search can't find a stealable thread, the idle loop can do a more complete search. Also poll for threads being assigned to this CPU during the search and switch to them instead of continuing the search. This change is responsible for the majority of the improvement in parallel buildworld times.
In sched_balance_group() change the minimum threshold for stealing a thread from 1 to 2. Poaching a newly assigned thread from a CPU that is waking up but hasn't yet switched to that thread from idle is likely very rare and is likely to have the same lock race as is seen when stealing threads in the idle loop. Also use tdq_notify() to kick the destination CPU instead of always sending an IPI. Update a stale comment; the number of transferable threads is not calculated.
Reviewed by: kib (earlier version) Comments by: avg, jeff, mav MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D12130
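Editor's sketch of the restart logic described above (the search helper is hypothetical; tdq_switchcnt is used only for illustration): run the preemptible search, then retry if this CPU switched in the meantime, since the results may be stale.

    static struct tdq *
    steal_search_sketch(struct tdq *tdq)
    {
            struct tdq *steal;
            int switchcnt;

            for (;;) {
                    switchcnt = tdq->tdq_switchcnt;
                    steal = find_steal_candidate(tdq);      /* preemptible */
                    if (steal == NULL)
                            return (NULL);
                    if (tdq->tdq_switchcnt == switchcnt)
                            return (steal); /* caller re-checks under tdq_lock_pair() */
                    /* We were preempted; the candidate may be stale, retry. */
            }
    }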
|
#
0127914c |
| 22-Feb-2018 |
Eric van Gyzen <vangyzen@FreeBSD.org> |
sched_ule: update a comment to reflect reality
MFC after: 3 days Sponsored by: Dell EMC
|
#
072c8a3b |
| 24-Jan-2018 |
Wojciech Macek <wma@FreeBSD.org> |
Reverting r328320
|
#
4d249cdd |
| 24-Jan-2018 |
Wojciech Macek <wma@FreeBSD.org> |
ULE: provide defaults to ts_cpu
Fix a bug when the system has no CPU 0. When created, threads were implicitly assigned to CPU 0. This had no practical effect since a real CPU was chosen immediately by the scheduler. However, on systems without a CPU 0, sched_ule attempted to access the scheduler queue of the non-existent "old" CPU when making the initial placement choice. This caused an attempt to use illegal memory and a crash (or, more usually, a deadlock). Fix this by assigning new threads to the BSP explicitly and add some asserts to verify that this problem does not recur.
Authored by: Nathan Whitehorn <nwhitehorn@freebsd.org> Submitted by: Wojciech Macek <wma@semihalf.com> Obtained from: Semihalf Differential revision: https://reviews.freebsd.org/D13932
|
#
72bfb31a |
| 13-Jan-2018 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r327886 through r327930.
|