#
2c27cb3a |
| 10-Nov-2012 |
Alexander Motin <mav@FreeBSD.org> |
Several optimizations to sched_idletd(): - Do not try to steal load from other CPUs if there was no contest switches on this CPU (i.e. it was idle all the time and woke up just for bus mastering or
Several optimizations to sched_idletd(): - Do not try to steal load from other CPUs if there was no contest switches on this CPU (i.e. it was idle all the time and woke up just for bus mastering or TLB shutdown). If current CPU was idle, then it is quite unlikely that some other CPU has load to steal. Under high I/O rate, when TLB shutdowns cause numerous CPU wakeups, on 24-CPU system load stealing code may consume up to 25% of all CPU time without giving any benefits. - Change code that implements spinning for load to restart spin in case of context switch. Previous code periodically called cpu_idle() even under high interrupt/context switch rate. - Rise spinning threshold to 10KHz, where it gives at least some effect that may worth consumed power.
Reviewed by: jeff@
show more ...
|
#
5e5c3873 |
| 08-Nov-2012 |
Jeff Roberson <jeff@FreeBSD.org> |
- Change ULE to use dynamic slice sizes for the timeshare queue in order to further reduce latency for threads in this queue. This should help as threads transition from realtime to timeshare.
- Change ULE to use dynamic slice sizes for the timeshare queue in order to further reduce latency for threads in this queue. This should help as threads transition from realtime to timeshare. The latency is bound to a max of sched_slice until we have more than sched_slice / 6 threads runnable. Then the min slice is allotted to all threads and latency becomes (nthreads - 1) * min_slice.
Discussed with: mav
show more ...
|
#
23090366 |
| 04-Nov-2012 |
Simon J. Gerraty <sjg@FreeBSD.org> |
Sync from head
|
#
4ceaf45d |
| 31-Oct-2012 |
Attilio Rao <attilio@FreeBSD.org> |
Rework the known mutexes to benefit about staying on their own cache line in order to avoid manual frobbing but using struct mtx_padalign.
The sole exception being nvme and sxfge drivers, where the
Rework the known mutexes to benefit about staying on their own cache line in order to avoid manual frobbing but using struct mtx_padalign.
The sole exception being nvme and sxfge drivers, where the author redefined CACHE_LINE_SIZE manually, so they need to be analyzed and dealt with separately.
Reviwed by: jimharris, alc
show more ...
|
#
a049aa05 |
| 30-Oct-2012 |
Attilio Rao <attilio@FreeBSD.org> |
tdq_lock_pair() already does spinlock_enter() so migration is not possible in sched_balance_pair(). Remove redundant sched_pin().
Reviewed by: marius, jeff
|
#
39f819e2 |
| 24-Oct-2012 |
Jim Harris <jimharris@FreeBSD.org> |
Pad tdq_lock to avoid false sharing with tdq_load and tdq_cpu_idle.
This enables CPU searches (which read tdq_load) to operate independently of any contention on the spinlock. Some scheduler-intens
Pad tdq_lock to avoid false sharing with tdq_load and tdq_cpu_idle.
This enables CPU searches (which read tdq_load) to operate independently of any contention on the spinlock. Some scheduler-intensive workloads running on an 8C single-socket SNB Xeon show considerable improvement with this change (2-3% perf improvement, 5-6% decrease in CPU util).
Sponsored by: Intel Reviewed by: jeff
show more ...
|
#
db702c59 |
| 22-Oct-2012 |
Eitan Adler <eadler@FreeBSD.org> |
remove duplicate semicolons where possible.
Approved by: cperciva MFC after: 1 week
|
#
e87fc7cf |
| 14-Sep-2012 |
Andriy Gapon <avg@FreeBSD.org> |
sched_ule: fix inverted condition in reporting of priority lending via ktr
Reviewed by: kan MFC after: 1 week
|
#
24bf3585 |
| 04-Sep-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Merge head r233826 through r240095.
|
#
ba96d2d8 |
| 22-Aug-2012 |
John Baldwin <jhb@FreeBSD.org> |
Mark the idle threads as non-sleepable and also assert that an idle thread never blocks on a turnstile.
|
#
37f4e025 |
| 11-Aug-2012 |
Alexander Motin <mav@FreeBSD.org> |
Some more minor tunings inspired by bde@.
|
#
bf89d544 |
| 11-Aug-2012 |
Alexander Motin <mav@FreeBSD.org> |
Allow idle threads to steal second threads from other cores on systems with 8 or more cores to improve utilization. None of my tests on 2xXeon (2x6x2) system shown any slowdown from mentioned "exces
Allow idle threads to steal second threads from other cores on systems with 8 or more cores to improve utilization. None of my tests on 2xXeon (2x6x2) system shown any slowdown from mentioned "excess thrashing". Same time in pbzip2 test with number of threads more then number of CPUs I see up to 10% speedup with SMT disabled and up 5% with SMT enabled. Thinking about trashing I was trying to limit that stealing within same last level cache, but got only worse results. Present code any way prefers to steal threads from topologically closer cores.
Sponsored by: iXsystems, Inc.
show more ...
|
#
579895df |
| 10-Aug-2012 |
Alexander Motin <mav@FreeBSD.org> |
Some minor tunings/cleanups inspired by bde@ after previous commits: - remove extra dynamic variable initializations; - restore (4BSD) and implement (ULE) hogticks variable setting; - make sched_r
Some minor tunings/cleanups inspired by bde@ after previous commits: - remove extra dynamic variable initializations; - restore (4BSD) and implement (ULE) hogticks variable setting; - make sched_rr_interval() more tolerant to options; - restore (4BSD) and implement (ULE) kern.sched.quantum sysctl, a more user-friendly wrapper for sched_slice; - tune some sysctl descriptions; - make some style fixes.
show more ...
|
#
d2679663 |
| 10-Aug-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Merge head r233826 through r239173.
|
#
3d7f4117 |
| 09-Aug-2012 |
Alexander Motin <mav@FreeBSD.org> |
Rework r220198 change (by fabient). I believe it solves the problem from the wrong direction. Before it, if preemption and end of time slice happen same time, thread was put to the head of the queue
Rework r220198 change (by fabient). I believe it solves the problem from the wrong direction. Before it, if preemption and end of time slice happen same time, thread was put to the head of the queue as for only preemption. It could cause single thread to run for indefinitely long time. r220198 handles it by not clearing TDF_NEEDRESCHED in case of preemption. But that causes delayed context switch every time preemption happens, even when not needed.
Solve problem by introducing scheduler-specifoc thread flag TDF_SLICEEND, set when thread's time slice is over and it should be put to the tail of queue. Using SW_PREEMPT flag for that purpose as it was before just not enough informative to work correctly.
On my tests this by 2-3 times reduces run time deviation (improves fairness) in cases when several threads share one CPU.
Reviewed by: fabient MFC after: 2 months Sponsored by: iXsystems, Inc.
show more ...
|
#
b652778e |
| 11-Jul-2012 |
Peter Grehan <grehan@FreeBSD.org> |
IFC @ r238370
|
#
2d5e7d2e |
| 30-May-2012 |
Will Andrews <will@FreeBSD.org> |
IFC @ r236291. Diff reductions to the enclosure driver made in r235911.
|
#
31ccd489 |
| 28-May-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Merge head r233826 through r236168.
|
#
17f4cae4 |
| 27-May-2012 |
Rafal Jaworowski <raj@FreeBSD.org> |
Let us manage differences of Book-E PowerPC variations i.e. vendor / implementation specific vs. the common architecture definition.
Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions un
Let us manage differences of Book-E PowerPC variations i.e. vendor / implementation specific vs. the common architecture definition.
Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under BOOKE_PPC4XX are not used in the code yet.
This change set is not supposed to affect existing E500 support, it's just another reorg step before bringing support for E500mc, E5500 and PPC465.
Obtained from: AppliedMicro, Freescale, Semihalf
show more ...
|
#
b3e9e682 |
| 15-May-2012 |
Ryan Stone <rstone@FreeBSD.org> |
Implement the DTrace sched provider. This implementation aims to be compatible with the sched provider implemented by Solaris and its open- source derivatives. Full documentation of the sched provi
Implement the DTrace sched provider. This implementation aims to be compatible with the sched provider implemented by Solaris and its open- source derivatives. Full documentation of the sched provider can be found on Oracle's DTrace wiki pages.
Note that for compatibility with scripts originally written for Solaris, serveral probes are defined that will never fire. These probes are defined to fire when Solaris-specific features perform certain actions. As these features are not present in FreeBSD, the probes can never fire.
Also, I have added a two probes that are not defined in Solaris, lend-pri and load-change. These probes have been added to make it possible to collect schedgraph data with DTrace.
Finally, a few probes are defined in Solaris to take a cpuinfo_t * argument. As it was not immediately clear to me how to translate that to FreeBSD, currently those probes are passed NULL in place of a cpuinfo_t *.
Sponsored by: Sandvine Incorporated MFC after: 2 weeks
show more ...
|
#
6a068746 |
| 15-May-2012 |
Alexander Motin <mav@FreeBSD.org> |
MFC
|
#
38f1b189 |
| 26-Apr-2012 |
Peter Grehan <grehan@FreeBSD.org> |
IFC @ r234692
sys/amd64/include/cpufunc.h sys/amd64/include/fpu.h sys/amd64/amd64/fpu.c sys/amd64/vmm/vmm.c
- Add API to allow vmm FPU state init/save/restore.
FP stuff discussed with: kib
|
#
7ab97117 |
| 10-Apr-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Merge head r233826 through r234091.
|
#
70801abe |
| 09-Apr-2012 |
Alexander Motin <mav@FreeBSD.org> |
Microoptimize cpu_search().
According to profiling, it makes one take 6% of CPU time on hackbench with its million of context switches per second, instead of 8% before.
|
Revision tags: release/8.3.0_cvs, release/8.3.0 |
|
#
8833b15f |
| 03-Apr-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Merge head r232686 through r233825 into projects/pf/head.
|