#
ef50d5fb |
| 23-Sep-2021 |
Alexander Motin <mav@FreeBSD.org> |
x86: Add NUMA nodes into CPU topology.
Depending on the hardware, NUMA nodes may match last-level caches, or they may be above them (AMD Zen 2/3) or below them (Intel Xeon with SNC). This information is provided by ACPI rather than CPUID, and it is provided per CPU individually rather than as mask widths, but this code should be able to properly handle all the above cases.
This change should immediately allow idle stealing in sched_ule(4) to prefer load from NUMA-local CPUs to remote ones when the node does not match LLC. Later we may think of how to better handle it on sched_pickcpu() side.
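[Illustration, not part of the commit: a toy user-space model of the idea. The struct and field names below are invented; the kernel's real topology lives in struct cpu_group.]

    /*
     * Toy model: a topology tree in which a NUMA-node level may sit
     * above or below the LLC level, and an idle CPU prefers stealing
     * from the nearest enclosing domain that still has load.
     */
    #include <stdio.h>

    struct grp {
        const char *name;   /* "LLC", "NUMA node", "socket" */
        struct grp *parent; /* next wider sharing domain */
        int load;           /* runnable threads there (made up) */
    };

    /* Walk outward from the smallest domain; nearer domains win. */
    static struct grp *
    steal_domain(struct grp *g)
    {
        for (; g != NULL; g = g->parent)
            if (g->load > 0)
                return (g);
        return (NULL);
    }

    int
    main(void)
    {
        /* AMD Zen 2/3 shape: node above LLC; Intel SNC flips them. */
        struct grp socket = { "socket",    NULL,    3 };
        struct grp node   = { "NUMA node", &socket, 2 };
        struct grp llc    = { "LLC",       &node,   0 };

        struct grp *g = steal_domain(&llc);
        printf("steal from: %s\n", g != NULL ? g->name : "nowhere");
        return (0);
    }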
MFC after: 1 month
|
#
8db16699 |
| 22-Sep-2021 |
Alexander Motin <mav@FreeBSD.org> |
Fix build without SMP.
MFC after: 1 month
|
#
e745d729 |
| 22-Sep-2021 |
Alexander Motin <mav@FreeBSD.org> |
sched_ule(4): Improve long-term load balancer.
Before this change the long-term load balancer was unable to migrate running threads, only ones waiting on run queues. But with the growing number of CPU cores it is now quite typical for a system to have few waiting threads. At the same time, if by some coincidence two long-running CPU-bound threads end up sharing the same physical CPU core, they can suffer from the SMT penalty indefinitely, and the load balancer can't help.
Improve that by teaching the load balancer to hint running threads to migrate, marking them with TDF_NEEDRESCHED and the new TDF_PICKCPU flag, which makes sched_pickcpu() search for a better CPU later, when it is convenient.
Also fix the CPU search logic when balancing, to limit round-robin migrations in the case of nearly equal load across a group of physical cores. The previous code bounced threads across the whole system, which is bad for caches and NUMA affinity, while the extra fairness was almost invisible, diminishing with the number of cores in the group.
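[Sketch of the hinting idea, not the actual diff; TDF_NEEDRESCHED and TDF_PICKCPU are the real flag names, but the values and the struct here are simplified stand-ins.]

    #include <stdio.h>

    #define TDF_NEEDRESCHED 0x01 /* made-up value: reschedule soon */
    #define TDF_PICKCPU     0x02 /* made-up value: redo CPU search */

    struct thread { int td_flags; };

    /* Balancer side: cannot move a running thread, so just hint. */
    static void
    balance_hint(struct thread *td)
    {
        td->td_flags |= TDF_NEEDRESCHED | TDF_PICKCPU;
    }

    /* Thread side, at its next switch: honor the hint. */
    static int
    next_cpu(struct thread *td, int curcpu)
    {
        if (td->td_flags & TDF_PICKCPU) {
            td->td_flags &= ~TDF_PICKCPU;
            return ((curcpu + 1) % 4); /* stand-in for sched_pickcpu() */
        }
        return (curcpu);
    }

    int
    main(void)
    {
        struct thread td = { 0 };

        balance_hint(&td);
        printf("moved to CPU %d\n", next_cpu(&td, 0));
        return (0);
    }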
MFC after: 1 month
|
#
bd84094a |
| 21-Sep-2021 |
Alexander Motin <mav@FreeBSD.org> |
sched_ule(4): Fix interactive threads stealing.
In scenarios where the first thread in the queue can migrate to the specified CPU but later ones can't, runq_steal_from() incorrectly returned NULL.
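[The shape of the bug and fix, as a runnable toy; runq_steal_from() itself is more involved, and the first-candidate skip here mirrors its preference for leaving the queue head in place.]

    #include <stddef.h>
    #include <stdio.h>

    struct thread {
        struct thread *next;
        int can_migrate; /* THREAD_CAN_MIGRATE() stand-in */
    };

    static struct thread *
    steal_from(struct thread *head)
    {
        struct thread *first = NULL, *td;

        for (td = head; td != NULL; td = td->next) {
            if (!td->can_migrate)
                continue;
            if (first == NULL) { /* skip the first candidate... */
                first = td;
                continue;
            }
            return (td); /* ...preferring a later one */
        }
        /* The fix: fall back to the first instead of NULL. */
        return (first);
    }

    int
    main(void)
    {
        struct thread b = { NULL, 0 }, a = { &b, 1 };

        printf("stole: %p\n", (void *)steal_from(&a)); /* &a, not NULL */
        return (0);
    }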
MFC after: 2 weeks
|
#
ca34553b |
| 02-Aug-2021 |
Alexander Motin <mav@FreeBSD.org> |
sched_ule(4): Pre-seed sched_random().
I don't think it changes anything, but why not.
While there, make cpu_search_highest() use all 8 lower load bits for noise, since it does not use cs_prefer and the code is not shared with cpu_search_lowest() any more.
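[The tie-breaking trick, sketched; the generator and constants are illustrative, not copied from sched_random().]

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t seed = 0x45c02594; /* "pre-seeded": any nonzero value */

    /* xorshift generator, similar in spirit to sched_random() */
    static uint32_t
    toy_random(void)
    {
        seed ^= seed << 13;
        seed ^= seed >> 17;
        seed ^= seed << 5;
        return (seed);
    }

    int
    main(void)
    {
        int load[4] = { 3, 7, 7, 1 }, best = -1;
        uint32_t key, best_key = 0;

        for (int cpu = 0; cpu < 4; cpu++) {
            /* load in the high bits, 8 bits of noise below it */
            key = ((uint32_t)load[cpu] << 8) | (toy_random() & 0xff);
            if (best < 0 || key > best_key) {
                best_key = key;
                best = cpu;
            }
        }
        printf("highest: CPU %d\n", best); /* CPU 1 or 2, randomized */
        return (0);
    }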
MFC after: 1 month
|
#
8bb173fb |
| 02-Aug-2021 |
Alexander Motin <mav@FreeBSD.org> |
sched_ule(4): Use trylock when stealing load.
Under some load patterns several CPUs may try to steal a thread from the same CPU despite the randomization introduced. This can cause significant lock contention when an idle thread, while holding one queue lock, tries to acquire another one. Using trylock on the remote queue both reduces the contention and makes lock ordering easier to handle. If we can't get the lock inside tdq_trysteal() we just return, letting tdq_idled() handle it. If it happens in tdq_idled(), we repeat the search for load, skipping this CPU.
On a 2-socket 80-thread Xeon system I observe a dramatic reduction of lock spinning time when doing random uncached 4KB reads from 12 ZVOLs, while IOPS increase from 327K to 403K.
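[The locking pattern, transplanted to pthreads for a self-contained example; the kernel uses spin mutexes on the per-CPU run queues instead.]

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t self_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t remote_lock = PTHREAD_MUTEX_INITIALIZER;

    /*
     * Returns 1 if the steal was possible, 0 to tell the caller to
     * look elsewhere -- the tdq_trysteal()/tdq_idled() division of
     * labor described above.
     */
    static int
    try_steal(void)
    {
        int stolen = 0;

        pthread_mutex_lock(&self_lock);
        if (pthread_mutex_trylock(&remote_lock) == 0) {
            /* Both queues locked: move a thread here (elided). */
            stolen = 1;
            pthread_mutex_unlock(&remote_lock);
        }
        pthread_mutex_unlock(&self_lock);
        return (stolen);
    }

    int
    main(void)
    {
        printf("steal %s\n", try_steal() ? "succeeded" : "skipped");
        return (0);
    }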
MFC after: 1 month
|
#
2668bb2a |
| 02-Aug-2021 |
Alexander Motin <mav@FreeBSD.org> |
sched_ule(4): Reduce duplicate search for load.
When sched_highest() called for some CPU group returns nothing, the idle thread calls it for the parent CPU group. But the parent group also includes the group we've just searched, and unless there is a race going on, it is unlikely we find anything new this time.
Avoid the double search when the parent group has only two subgroups (the most common case). Instead of escalating to the parent group, run the next search over the sibling subgroup, and escalate two levels up afterwards if that fails too. With more than two siblings the difference is less significant, while searching the parent group can result in a better decision if we find several candidate CPUs.
On a 2-socket 40-core Xeon system I measure a ~25% reduction of CPU time spent inside cpu_search_highest() in both SMT (2x20x2) and non-SMT (2x20) cases.
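[A toy of the new traversal; the kernel operates on struct cpu_group, and the types here are invented for illustration.]

    #include <stddef.h>
    #include <stdio.h>

    struct grp {
        const char *name;
        struct grp *parent;
        struct grp *child[2]; /* two-subgroup case only */
        int load;
    };

    /*
     * After group g came up empty, search only its sibling, then
     * escalate: the parent itself is never re-searched, since that
     * would just cover g again.
     */
    static struct grp *
    search_up(struct grp *g)
    {
        struct grp *p, *sib;

        while ((p = g->parent) != NULL) {
            sib = (p->child[0] == g) ? p->child[1] : p->child[0];
            if (sib != NULL && sib->load > 0)
                return (sib);
            g = p; /* next pass looks at p's sibling, two levels up */
        }
        return (NULL);
    }

    int
    main(void)
    {
        struct grp root = { "root", NULL, { NULL, NULL }, 0 };
        struct grp a = { "core A", &root, { NULL, NULL }, 0 };
        struct grp b = { "core B", &root, { NULL, NULL }, 2 };

        root.child[0] = &a;
        root.child[1] = &b;
        printf("found load in: %s\n", search_up(&a)->name);
        return (0);
    }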
MFC after: 1 month
|
#
af29f399 |
| 29-Jul-2021 |
Dmitry Chagin <dchagin@FreeBSD.org> |
umtx: Split umtx.h into two counterparts.
To keep future changes from polluting umtx.h, split it into two headers: umtx.h, the ABI header for userspace, and umtxvar.h, the kernel stuff.
While here, fix umtx_key_match style.
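[The split, schematically; the guard names follow the usual sys/ convention, and the sample contents are placeholders rather than the real declarations.]

    /* umtx.h: the ABI header -- what userspace may rely on. */
    #ifndef _SYS_UMTX_H_
    #define _SYS_UMTX_H_
    #define UMTX_OP_WAIT 2 /* placeholder op; see the real header */
    #endif

    /* umtxvar.h: the kernel stuff -- includes the ABI, adds internals. */
    #ifndef _SYS_UMTXVAR_H_
    #define _SYS_UMTXVAR_H_
    #include <sys/umtx.h>
    struct umtx_key; /* kernel-internal, invisible to userspace */
    #endif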
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31248
MFC after: 2 weeks
|
#
aefe0a8c |
| 29-Jul-2021 |
Alexander Motin <mav@FreeBSD.org> |
Refactor/optimize cpu_search_*().
Remove cpu_search_both(), unused for many years. Without it there is little sense in the trick of compiling the common cpu_search() into separate cpu_search_lowest() and cpu_search_highest(), so split them completely, making the code more readable. While there, split the iteration over child groups from the iteration over CPUs; combining them had complicated the code for very little deduplication.
Stop passing cpuset_t arguments by value and avoid some manipulations. Since the MAXCPU bump from 64 to 256, what used to be a single register has turned into a 32-byte memory array, requiring memory allocation and accesses. Splitting struct cpu_search into parameter and result parts reduces stack usage even further, since the former can be passed through on recursion.
Remove CPU_FFS() from the hot paths, precalculating the first and last CPU for each CPU group during initialization. Again, this was not a problem for 64 CPUs before, but for 256 FFS needs much more code.
With these changes, on an 80-thread system doing ~260K uncached ZFS reads per second, I observe a ~30% reduction of time spent in cpu_search_*().
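[Two of the optimizations in a compilable toy; the struct and field names imitate the kernel's, but the sizes and values are invented.]

    #include <stdint.h>
    #include <stdio.h>

    #define TOY_MAXCPU 256

    struct toy_cpuset { uint64_t bits[TOY_MAXCPU / 64]; }; /* 32 bytes */

    struct toy_group {
        struct toy_cpuset mask;
        int cg_first, cg_last; /* filled once at topology init */
    };

    static int
    cpu_in_set(const struct toy_cpuset *s, int cpu)
    {
        return ((s->bits[cpu / 64] >> (cpu % 64)) & 1);
    }

    /* Const pointer instead of by-value: no 32-byte copy per call. */
    static int
    lowest_loaded(const struct toy_group *g, const int *load)
    {
        int best = -1;

        /* Bounds precomputed at init; no CPU_FFS() in the hot path. */
        for (int c = g->cg_first; c <= g->cg_last; c++) {
            if (!cpu_in_set(&g->mask, c))
                continue;
            if (best < 0 || load[c] < load[best])
                best = c;
        }
        return (best);
    }

    int
    main(void)
    {
        struct toy_group g = { .cg_first = 2, .cg_last = 5 };
        int load[TOY_MAXCPU] = { 0 };

        for (int c = 2; c <= 5; c++) {
            g.mask.bits[0] |= 1ULL << c;
            load[c] = 10 - c;
        }
        printf("lowest: CPU %d\n", lowest_loaded(&g, load));
        return (0);
    }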
MFC after: 1 month
|
Revision tags: release/13.0.0, release/12.2.0, release/11.4.0 |
|
#
43521b46 |
| 19-May-2020 |
wiklam <wiktorpilar99@gmail.com> |
Correcting comment about "sched_interact_score".
Reviewed by: jrtc@, imp@
Pull Request: https://github.com/freebsd/freebsd-src/pull/431
Sponsored by: Netflix
|
#
b77594bb |
| 15-Nov-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
sched: fix an incorrect comparison in sched_lend_user_prio_cond
Compare with sched_lend_user_prio.
|
#
e43d33d2 |
| 05-Mar-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r358466 through r358677.
|
#
b05ca429 |
| 02-Mar-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
sys/: Document a few more sysctls.
Submitted by: Antranig Vartanian <antranigv@freebsd.am>
Reviewed by: kaktus
Commented by: jhb
Approved by: kib (mentor)
Sponsored by: illuria security
Differential Revision: https://reviews.freebsd.org/D23759
|
#
75dfc66c |
| 27-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r358269 through r358399.
|
#
7029da5c |
| 26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT.
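[What the annotation looks like at a call site; the flags and macros are the real sysctl(9) ones, but the node names and handler here are invented for illustration.]

    #include <sys/types.h>
    #include <sys/sysctl.h>

    /* Verified Giant-free: annotate as MPSAFE. */
    static SYSCTL_NODE(_kern, OID_AUTO, example,
        CTLFLAG_RD | CTLFLAG_MPSAFE, NULL,
        "Example node, known not to need Giant");

    /* Not yet reviewed: default to NEEDGIANT until audited. */
    static int example_val;
    SYSCTL_PROC(_kern, OID_AUTO, example_legacy,
        CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT,
        &example_val, 0, sysctl_handle_int, "I",
        "Legacy handler, still runs under Giant");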
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
|
#
bc02c18c |
| 07-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357408 through r357661.
|
#
e4894505 |
| 03-Feb-2020 |
Mark Johnston <markj@FreeBSD.org> |
Fix the !SMP case in sched_add() after r355779.
If the thread's lock is already that of the runqueue, don't recurse on the queue lock.
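[Paraphrase of the idea; TDQ_LOCK, TDQ_LOCKPTR, and thread_lock_set() are the real primitives, but this is not the actual diff.]

    static void
    toy_sched_add(struct thread *td, struct tdq *tdq)
    {
        /*
         * The thread may already be protected by the run-queue lock;
         * taking it again would recurse on a non-recursive mutex.
         */
        if (td->td_lock != TDQ_LOCKPTR(tdq)) {
            TDQ_LOCK(tdq);
            thread_lock_set(td, TDQ_LOCKPTR(tdq));
        }
        /* ... enqueue td ... */
    }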
Reviewed by: jeff, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23492
|
#
59abbffa |
| 31-Jan-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357270 through r357349.
|
#
3ff65f71 |
| 30-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove duplicated empty lines from kern/*.c
No functional changes.
|
#
051669e8 |
| 25-Jan-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r356931 through r357118.
|
#
a89c2c8c |
| 24-Jan-2020 |
Mark Johnston <markj@FreeBSD.org> |
Revert r357050.
It seems to have introduced a couple of regressions.
Reported by: cy, pho
|
#
1bfca40c |
| 23-Jan-2020 |
Mark Johnston <markj@FreeBSD.org> |
Set td_oncpu before dropping the thread lock during a switch.
After r355784 we no longer hold a thread's thread lock when switching it out. Preserve the previous synchronization protocol for td_oncpu by setting it together with td_state, before dropping the thread lock during a switch.
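[The ordering, shown with a pthread stand-in for the thread lock; td_oncpu and td_state are the real field names, everything else is simplified.]

    #include <pthread.h>
    #include <stdio.h>

    #define NOCPU (-1)

    struct toy_thread {
        pthread_mutex_t lock; /* stands in for the thread lock */
        int state;            /* td_state stand-in */
        int oncpu;            /* td_oncpu stand-in */
    };

    /* Switch-out: update both fields before the lock is dropped. */
    static void
    switch_out(struct toy_thread *td)
    {
        pthread_mutex_lock(&td->lock);
        td->state = 0;     /* no longer running */
        td->oncpu = NOCPU; /* previously set only after unlock */
        pthread_mutex_unlock(&td->lock);
    }

    int
    main(void)
    {
        struct toy_thread td = { PTHREAD_MUTEX_INITIALIZER, 1, 0 };

        switch_out(&td);
        printf("oncpu=%d state=%d\n", td.oncpu, td.state);
        return (0);
    }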
Reported and tested by: pho
Reviewed by: kib
Discussed with: jeff
Differential Revision: https://reviews.freebsd.org/D23270
|
#
1eb13fce |
| 23-Jan-2020 |
Jeff Roberson <jeff@FreeBSD.org> |
Block the thread lock in sched_throw() and use cpu_switch() to unblock it.
The introduction of lockless switch in r355784 created a race to re-use the exiting thread that was only possible to hit on a hypervisor.
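[My reading of the mechanism, heavily condensed; blocked_lock and the cpu_switch() contract of storing the new lock pointer into the old thread's td_lock are real, but the snippet is schematic, not the diff.]

    /* In sched_throw(), before switching off the dying thread: */
    td->td_lock = &blocked_lock;  /* would-be lockers now spin */
    cpu_switch(td, newtd, TDQ_LOCKPTR(tdq));
    /*
     * cpu_switch() publishes the real lock into td->td_lock only
     * after td's stack is no longer in use, so the exiting thread
     * cannot be re-used while it is still being switched out.
     */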
Reported/Tested by: rlibby
Discussed with: rlibby, jhb
|
#
879e0604 |
| 12-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Add KERNEL_PANICKED macro for use in place of direct panicstr tests
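[The macro as defined in sys/systm.h, to the best of my recollection; check the tree for the authoritative form.]

    #define KERNEL_PANICKED() __predict_false(panicstr != NULL)

    /* Call sites change from:  if (panicstr != NULL) ...
     * to:                      if (KERNEL_PANICKED()) ...  */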
|
#
d8d5f036 |
| 19-Dec-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Fix a bug in r355784. I missed a sched_add() call that needed to reacquire the thread lock.
Reported by: mjg
|