History log of /freebsd/sys/kern/sched_ule.c (Results 51 – 75 of 810)
Revision Date Author Comments
# 686bcb5c 15-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

schedlock 4/4

Don't hold the scheduler lock while doing context switches. Instead we
unlock after selecting the new thread and switch within a spinlock
section leaving interrupts and preemption disabled to prevent local
concurrency. This means that mi_switch() is entered with the thread
locked but returns without. This dramatically simplifies scheduler
locking because we will not hold the schedlock while spinning on
a blocked lock in switch.

This change has not been made to 4BSD, but in principle it would be
even more straightforward there.

Discussed with: markj
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22778
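
The new contract can be sketched in userspace (a hedged illustration, not
the kernel implementation: a pthread mutex stands in for the scheduler
lock, and empty critical_enter()/critical_exit() stubs stand in for the
interrupts-and-preemption-disabled section):

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;
static int runq[4] = { 11, 22, 33, 44 };
static int runq_head;

/* Stand-ins for disabling/enabling preemption around the switch. */
static void critical_enter(void) { }
static void critical_exit(void) { }

/* Must be called with sched_lock held. */
static int
choose_next(void)
{
	return (runq[runq_head++ % 4]);
}

/* Entered with sched_lock held; returns with it released. */
static void
mi_switch_sketch(void)
{
	int next;

	critical_enter();		/* no local concurrency from here on */
	next = choose_next();		/* pick the next thread under the lock */
	pthread_mutex_unlock(&sched_lock); /* drop the lock BEFORE switching */
	printf("switch to thread %d outside the scheduler lock\n", next);
	critical_exit();
}

int
main(void)
{
	pthread_mutex_lock(&sched_lock);  /* caller locks, callee unlocks */
	mi_switch_sketch();
	return (0);
}
```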



# 61a74c5c 15-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

schedlock 1/4

Eliminate recursion from most thread_lock consumers. Return from
sched_add() without the thread_lock held. This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks. This will eventually allow for lockless remote adds.

Discussed with: kib
Reviewed by: jhb
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22626
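
A minimal sketch of the changed ownership contract, with a pthread mutex
standing in for thread_lock and illustrative names throughout (not the
kernel code):

```c
#include <pthread.h>

/* Stand-in for a kernel thread; td_lock plays the role of thread_lock. */
struct thread_sketch {
	pthread_mutex_t td_lock;
	int		td_on_runq;
};

/*
 * Old contract: sched_add() returned with td_lock still held, so the
 * caller unlocked it afterwards.  New contract: the lock is consumed
 * inside, shortening the hold time and enabling lockless remote adds.
 */
static void
sched_add_sketch(struct thread_sketch *td)
{
	td->td_on_runq = 1;			/* enqueue (elided) */
	pthread_mutex_unlock(&td->td_lock);	/* consume the caller's lock */
}

int
main(void)
{
	struct thread_sketch td = { PTHREAD_MUTEX_INITIALIZER, 0 };

	pthread_mutex_lock(&td.td_lock);
	sched_add_sketch(&td);		/* ownership moved: do not unlock again */
	return (0);
}
```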



# 9825eadf 13-Dec-2019 Ryan Libby <rlibby@FreeBSD.org>

bitset: rename confusing macro NAND to ANDNOT

s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too. The actual
implementation is "and not" (or "but not"), i.e. A but not B.
Fortunately this does appear to be what all existing callers want.

Don't supply a NAND (not (A and B)) operation at this time.

Discussed with: jeff
Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22791
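
A tiny standalone demo of the distinction, with plain unsigned ints
standing in for the kernel's bitset type:

```c
#include <stdio.h>

int
main(void)
{
	unsigned a = 0xC;		/* binary 1100 */
	unsigned b = 0xA;		/* binary 1010 */

	unsigned andnot = a & ~b;	/* BIT_ANDNOT: A but not B */
	unsigned nand = ~(a & b);	/* what the old name suggested */

	printf("andnot = 0x%x\n", andnot);	/* 0x4 */
	printf("nand   = 0x%x\n", nand);	/* 0xfffffff7 */
	return (0);
}
```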



# 7789ab32 12-Dec-2019 Mark Johnston <markj@FreeBSD.org>

Rename tdq_ipipending and clear it in sched_switch().

This fixes a regression after r355311. Specifically, sched_preempt()
may trigger a context switch by calling thread_lock(), since
thread_lock() calls critical_exit() in its slow path and the interrupted
thread may have already been marked for preemption. This would happen
before tdq_ipipending is cleared, blocking further preemption IPIs. The
CPU can be left in this state indefinitely if the interrupted thread
migrates.

Rename tdq_ipipending to tdq_owepreempt. Any switch satisfies a remote
preemption request, so clear tdq_owepreempt in sched_switch() instead of
sched_preempt() to avoid subtle problems of the sort described above.

Reviewed by: jeff, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22758
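
The flag protocol can be sketched in miniature with C11 atomics
(illustrative names; not the kernel code):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_bool tdq_owepreempt;	/* a preemption request is pending */

/* Remote CPU: send a preemption IPI only if none is pending yet. */
static bool
maybe_send_preempt_ipi(void)
{
	bool expected = false;

	return (atomic_compare_exchange_strong(&tdq_owepreempt, &expected,
	    true));
}

/* Local CPU: any context switch satisfies the request, so clear the
 * flag here rather than only in the IPI handler, which can be bypassed
 * when a switch is triggered from another path (e.g. thread_lock()). */
static void
sched_switch_sketch(void)
{
	/* ... choose the next thread and switch ... */
	atomic_store(&tdq_owepreempt, false);
}

int
main(void)
{
	printf("sent: %d\n", maybe_send_preempt_ipi());	/* 1: flag now set */
	sched_switch_sketch();				/* any switch clears it */
	printf("sent: %d\n", maybe_send_preempt_ipi());	/* 1: can IPI again */
	return (0);
}
```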



# c3cccf95 08-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

Handle multiple clock interrupts simultaneously in sched_clock().

Reviewed by: kib, markj, mav
Differential Revision: https://reviews.freebsd.org/D22625
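
A hedged sketch of the idea, assuming the per-tick accounting is driven
by a count of pending ticks (illustrative names, not the actual
interface):

```c
#include <stdio.h>

static int td_ticks;	/* per-thread run-time accounting, illustrative */

/* Account for 'cnt' pending ticks in a single call instead of assuming
 * exactly one tick per invocation. */
static void
sched_clock_sketch(int cnt)
{
	while (cnt-- > 0)
		td_ticks++;	/* per-tick work: decay, interactivity, ... */
}

int
main(void)
{
	sched_clock_sketch(3);	/* three ticks delivered at once */
	printf("ticks accounted: %d\n", td_ticks);
	return (0);
}
```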


# 61322a0a 04-Dec-2019 Alexander Motin <mav@FreeBSD.org>

Mark some more hot global variables with __read_mostly.

MFC after: 1 week
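
Usage looks like the line below; the macro definition shown is a
userspace stand-in for illustration, since in the kernel it comes from
the standard headers:

```c
#include <stdio.h>

/* Stand-in for the kernel macro: place the variable in a dedicated
 * section so rarely written globals share cache lines with each other
 * rather than with write-hot data. */
#define	__read_mostly	__attribute__((__section__(".data.read_mostly")))

static int sched_slice_sketch __read_mostly = 10; /* read on every tick,
						     written only rarely */

int
main(void)
{
	printf("slice = %d\n", sched_slice_sketch);
	return (0);
}
```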


# e1504695 02-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

Initialize the idle thread's lock sooner so it's not evaluated on every fork
exit and we can rely on it elsewhere.

Reviewed by: mav, kib, jhb, markj
Differential Revision: https://reviews.freebsd.org/D22624



Revision tags: release/12.1.0
# 668ee101 26-Sep-2019 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r352587 through r352763.


# 176dd236 26-Sep-2019 Alexander Motin <mav@FreeBSD.org>

Microoptimize sched_pickcpu() CPU affinity on SMT.

Using CPU_FFS() to implement CPUSET_FOREACH() saves up to ~0.5% of CPU
time on a 72-thread SMT system doing 80K IOPS to NVMe from one thread.

MFC after: 1 month
Sponsored by: iXsystems, Inc.
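
A standalone demo of the technique, with __builtin_ffsl() standing in
for CPU_FFS() and an unsigned long for cpuset_t:

```c
#include <stdio.h>

int
main(void)
{
	unsigned long work = (1UL << 3) | (1UL << 17) | (1UL << 30);
	int cpu;

	/* One find-first-set per visited CPU instead of one test per
	 * possible CPU index, as a naive FOREACH would do. */
	while ((cpu = __builtin_ffsl((long)work)) != 0) {
		cpu--;				/* ffs is 1-based */
		printf("visit cpu %d\n", cpu);
		work &= ~(1UL << cpu);		/* clear the bit, continue */
	}
	return (0);
}
```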



# c55dc51c 25-Sep-2019 Alexander Motin <mav@FreeBSD.org>

Microoptimize sched_pickcpu() after r352658.

I noticed that I had missed an intr check at one more SCHED_AFFINITY(),
so instead of adding another branch I preferred to remove a few.

The profiler shows the function's CPU time reduced from 0.24% to 0.16%.

MFC after: 1 month
Sponsored by: iXsystems, Inc.



# bb3dfc6a 25-Sep-2019 Alexander Motin <mav@FreeBSD.org>

Fix a wrong assertion in r352658.

MFC after: 1 month


# c9205e35 24-Sep-2019 Alexander Motin <mav@FreeBSD.org>

Fix/improve interrupt threads scheduling.

Doing some tests with very high interrupt rates I noticed that one of
the conditions I added in r232207 to make interrupt threads in most
cases run on the local CPU never worked as expected (it applied only
when the thread had previously run on some other CPU, which is quite
the opposite). It caused additional CPU usage on a full CPU search and
could still schedule interrupt threads to some other CPU.

This patch removes that code and instead reuses the existing
non-interrupt code path with some tweaks for the interrupt case:
- On SMT systems, if the current thread is idle, don't look at the
other hardware threads. Even if they are busy, it may take more time
to do a full search and bounce the interrupt thread to another core
than to execute it locally, even while sharing CPU resources. It is
other threads that should migrate, not bound interrupts.
- Try hard to keep interrupt threads within the LLC of their original
CPU. This reduces scheduling cost and presumably improves cache and
memory locality.

On a test system with 72 threads doing 2.2M IOPS to NVMe this saves a
few percent of CPU time while adding a few percent to IOPS.

MFC after: 1 month
Sponsored by: iXsystems, Inc.
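
A toy model of the two tweaks above (a fixed eight-CPU topology with two
LLC groups; all names illustrative, not the kernel implementation):

```c
#include <stdio.h>

#define	NCPU	8
static int cpu_busy[NCPU] = { 1, 0, 1, 1, 0, 1, 1, 1 };

/* Toy topology: CPUs 0-3 share one LLC, CPUs 4-7 another. */
static int
llc_first(int cpu)
{
	return (cpu < 4 ? 0 : 4);
}

static int
pick_cpu_for_ithread(int curcpu, int prevcpu)
{
	/* Local CPU idle: run here rather than bounce to another core. */
	if (!cpu_busy[curcpu])
		return (curcpu);

	/* Otherwise search only within the previous CPU's LLC group. */
	int base = llc_first(prevcpu);
	for (int c = base; c < base + 4; c++)
		if (!cpu_busy[c])
			return (c);
	return (prevcpu);		/* no idle sibling: stay put */
}

int
main(void)
{
	/* Current CPU 2 is busy, so search CPU 5's LLC group: picks 4. */
	printf("picked cpu %d\n", pick_cpu_for_ithread(2, 5));
	return (0);
}
```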



# 018ff686 13-Aug-2019 Jeff Roberson <jeff@FreeBSD.org>

Move scheduler state into the per-cpu area where it can be allocated on the
correct NUMA domain.

Reviewed by: markj, gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19315



Revision tags: release/11.3.0
# 7648bc9f 13-May-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @347527

Sponsored by: The FreeBSD Foundation


# ac97da9a 08-May-2019 Mateusz Guzik <mjg@FreeBSD.org>

Reduce umtx-related work on exec and exit

- there is no need to take the process lock to iterate the thread
list after single-threading is enforced
- typically there are no mutexes to clean up (testable without taking
the global umtx lock)
- typically there is no need to adjust the priority (testable without
taking thread lock)

Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20160
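
The underlying pattern in miniature (illustrative names, not the umtx
code itself): test the cheap, lock-free conditions first and take the
heavyweight lock only when there is actually work to do.

```c
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t global_umtx_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int owned_mutex_count;	/* typically zero at exec/exit */

static void
umtx_exit_cleanup(void)
{
	/* Common case: nothing owned, so skip the global lock entirely. */
	if (atomic_load(&owned_mutex_count) == 0)
		return;

	pthread_mutex_lock(&global_umtx_lock);
	/* ... walk and release any owned robust mutexes ... */
	pthread_mutex_unlock(&global_umtx_lock);
}

int
main(void)
{
	umtx_exit_cleanup();
	return (0);
}
```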



Revision tags: release/12.0.0
# 14b841d4 11-Aug-2018 Kyle Evans <kevans@FreeBSD.org>

MFH @ r337607, in preparation for boarding


# 290d9060 29-Jul-2018 Don Lewis <truckman@FreeBSD.org>

Fix the long term ULE load balancer so that it actually works. The
initial call to sched_balance() during startup is meant to initialize
balance_ticks, but does not actually do that since smp_started is
still zero at that time. Since balance_ticks does not get set,
there are no further calls to sched_balance(). Fix this by setting
balance_ticks in sched_initticks() since we know the value of
balance_interval at that time, and eliminate the useless startup
call to sched_balance(). We don't need to randomize the initial
value of balance_ticks.

Since there is now only one call to sched_balance(), we can hoist
the tests at the top of this function out to the caller and avoid
the overhead of the function call when running an SMP kernel on UP
hardware.

PR: 223914
Reviewed by: kib
MFC after: 2 weeks
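
The shape of the fix, sketched with illustrative names:

```c
#include <stdio.h>

static int smp_started_sketch = 1;	/* set once APs are running */
static int rebalance = 1;		/* kern.sched.balance analogue */
static int balance_interval = 128;	/* ticks between balance runs */
static int balance_ticks;

static void
sched_balance_sketch(void)
{
	printf("balancing run queues\n");	/* move threads (elided) */
}

/* sched_initticks() analogue: the interval is known here, so seed the
 * countdown here instead of relying on an early sched_balance() call
 * that bails out because smp_started is still zero. */
static void
sched_initticks_sketch(void)
{
	balance_ticks = balance_interval;
}

/* Per-tick caller with the guards hoisted out of sched_balance(), so
 * an SMP kernel on UP hardware never pays for the function call. */
static void
sched_tick_sketch(void)
{
	if (!smp_started_sketch || !rebalance)
		return;
	if (--balance_ticks == 0) {
		balance_ticks = balance_interval;
		sched_balance_sketch();
	}
}

int
main(void)
{
	sched_initticks_sketch();
	for (int i = 0; i < 256; i++)
		sched_tick_sketch();	/* balances twice */
	return (0);
}
```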



# 2bf95012 05-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Create a new macro for static DPCPU data.

On arm64 (and possibly other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.

In preparation for fixing this, create two macros depending on whether
the data is global or static.

Reviewed by: bz, emaste, markj
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D16140
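
A kernel-side usage sketch; the accessors are the existing DPCPU
interface, while the variable and function names are made up for
illustration:

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/pcpu.h>

/* File-local per-CPU data: DPCPU_DEFINE_STATIC keeps the symbol
 * relocatable by the module linker, where a plain C 'static' inside
 * DPCPU_DEFINE could not be fixed up on arm64. */
DPCPU_DEFINE_STATIC(int, sketch_count);

static void
bump_sketch_count(void)
{
	DPCPU_SET(sketch_count, DPCPU_GET(sketch_count) + 1);
}
```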



Revision tags: release/11.2.0
# 28240885 08-May-2018 Mateusz Guzik <mjg@FreeBSD.org>

Inlined sched_userret.

The tested condition is rarely true and it induces a function call
on each return to userspace.

Bumps the getuid rate by about 1% on Broadwell.
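
The general shape of such an inlining, sketched standalone with
illustrative names (plain ints stand in for the thread's priority
fields):

```c
#include <stdio.h>

/* Rare out-of-line slow path: re-adjust the priority. */
static void
sched_userret_slowpath(int *td_priority, int td_user_pri)
{
	*td_priority = td_user_pri;
}

/* Inline fast path: the common return-to-userspace case costs only a
 * well-predicted branch, not a function call. */
static inline void
sched_userret_sketch(int *td_priority, int td_user_pri)
{
	if (__builtin_expect(*td_priority != td_user_pri, 0))
		sched_userret_slowpath(td_priority, td_user_pri);
}

int
main(void)
{
	int td_priority = 120;

	sched_userret_sketch(&td_priority, 120);  /* common case: no call */
	printf("priority = %d\n", td_priority);
	return (0);
}
```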


# 4c8a8cfc 23-Feb-2018 Konstantin Belousov <kib@FreeBSD.org>

Restore UP build.

Reviewed by: truckman
Sponsored by: The FreeBSD Foundation


# 97e9382d 23-Feb-2018 Don Lewis <truckman@FreeBSD.org>

Decrease latency by not wrapping the idle loop's potentially lengthy
search for a thread to steal inside a critical section. Since this
allows the search to be preempted, restart the search if preemption
happens, since the search results found earlier may no longer be
valid.

Decrease the latency of starting a thread that may be assigned to
this CPU during the search by polling for incoming threads during
the search and switching to that thread instead of continuing the
search.

Test for stale search results and restart the search before going
through the expense of calling tdq_lock_pair(). Retry some tests
after grabbing the locks since things may have changed while waiting
to get both locks.

Eliminate special case handling for stealing from an SMT peer that
uses 1 as the steal threshold. This can only succeed if a thread
has been assigned but our SMT peer has not yet started executing
it. This is quite rare and when it happens the other SMT thread
is generally waiting for the same tdq lock that we hold. Basically
both SMT threads are racing to grab the same spin lock.

Add the kern.sched.always_steal knob from a ULE patch by jeff@.

Incorporate another idea from Jeff's ULE patch. If the sched_switch()
detects that the CPU is about to go idle, try to steal a thread
before switching to the idle thread. Since the search for a thread
to steal has to be done inside a critical section in this context,
limit the impact on latency by adding the knob kern.sched.trysteal_limit
to limit the topological distance of the search and don't restart
the search if we detect stale results. If this search can't find
a stealable thread, the idle loop can do a more complete search.
Also poll for threads being assigned to this CPU during the search
and switch to them instead of continuing the search. This change
is responsible for the majority of the improvement in parallel
buildworld times.

In sched_balance_group() change the minimum threshold for stealing
a thread from 1 to 2. Poaching a newly assigned thread from a CPU
that is waking up but hasn't yet switched to that thread from idle
is likely very rare and is likely to have the same lock race as is
seen when stealing threads in the idle loop. Also use tdq_notify()
to kick the destination CPU instead of always sending an IPI.
Update a stale comment; the number of transferable threads is not
calculated.

Reviewed by: kib (earlier version)
Comments by: avg, jeff, mav
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12130
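
A skeleton of the restart logic described above (illustrative, not the
kernel code): snapshot a per-CPU switch counter to detect preemption
during the unprotected search, and prefer any thread that arrives
locally:

```c
#include <stdatomic.h>
#include <stdio.h>

static atomic_long my_switch_count;	/* bumped on every context switch */
static atomic_int incoming_thread;	/* a thread was queued to this CPU */

/* Stand-in for the long, preemptible search across the topology. */
static int
find_busiest_cpu(void)
{
	return (3);
}

static int
steal_search(void)
{
	long before;
	int victim;

restart:
	before = atomic_load(&my_switch_count);
	victim = find_busiest_cpu();	/* runs outside a critical section */

	if (atomic_load(&incoming_thread))
		return (-1);		/* run what arrived here instead */
	if (atomic_load(&my_switch_count) != before)
		goto restart;		/* we were preempted: results stale */
	return (victim);
}

int
main(void)
{
	printf("steal from cpu %d\n", steal_search());
	return (0);
}
```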



# 0127914c 22-Feb-2018 Eric van Gyzen <vangyzen@FreeBSD.org>

sched_ule: update a comment to reflect reality

MFC after: 3 days
Sponsored by: Dell EMC


# 072c8a3b 24-Jan-2018 Wojciech Macek <wma@FreeBSD.org>

Reverting r328320


# 4d249cdd 24-Jan-2018 Wojciech Macek <wma@FreeBSD.org>

ULE: provide defaults to ts_cpu

Fix a bug on systems that have no CPU 0. When created, threads were
implicitly assigned to CPU 0. This had no practical effect since a real
CPU was chosen immediately by the scheduler. However, on systems without
a CPU 0, sched_ule attempted to access the scheduler queue of the "old"
CPU when making the initial choice. This caused an attempt to use illegal
memory and a crash (or, more usually, a deadlock). Fix this by assigning
new threads to the BSP explicitly and adding some asserts to verify that
this problem does not recur.

Authored by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D13932



# 72bfb31a 13-Jan-2018 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r327886 through r327930.

