#
e9271f53 |
| 23-Oct-2007 |
Julian Elischer <julian@FreeBSD.org> |
Take out the single-threading code in fork. After discussions with jeff, alc, (various Ironport people), david Xu, and mostly Alfred (who found the problem) it has been demonstrated that this is not
Take out the single-threading code in fork. After discussions with jeff, alc, (various Ironport people), david Xu, and mostly Alfred (who found the problem) it has been demonstrated that this is not needed for our implementations of threads and represents a real (as in we've seen it happen a lot) deadlock danger.
Several points: Since forking multiple threads is not allowed, and posix states that any mutexes owned by othre threads wilol be owned in the child by phantom threads, and therads shouldn't ba accessing shared structures without protection, It can be proved that if this leads to the child process accessing inconsistent data, it's a programming error.
The mode of thread_single() being used in fork() is the wrong one. It is using SINGLE_NO_EXIT when it should be using SINGLE_BOUNDARY.
Even if this we used, System processes have no need to do it as they have no userland to get inconsistent.
This commmit first fixes the above bugs to get tehm correct in CVS. then removes them with #ifdef. This is so that history contains the corrected version should it be needed in the future. This code may be needed if we implement the forkall() syscall from Solaris. It may be needed for other non-posix thread libraries at some time in the future, so let the code sit for a short while while I do some work on it anyhow.
This removes a reproducible lockup in NFS. It may be argued that maybe doing a fork while holding a vnode lock may not be the best idea in th efirst place but it shouldn't cause a deadlock. The removal has been running under soak test for several days now.
This removal should be seriously considered for 7.0 and RELENG_6.
Note. There is code in the core-dumping code that may have a similar problem with coredumping threaded processes
MFC After: 4 days
show more ...
|
#
3745c395 |
| 21-Oct-2007 |
Julian Elischer <julian@FreeBSD.org> |
Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. Thos makes way for us to add REAL kthread_create() and friends that actually make theads. it t
Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. Thos makes way for us to add REAL kthread_create() and friends that actually make theads. it turns out that most of these calls actually end up being moved back to the thread version when it's added. but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
show more ...
|
#
54b0e65f |
| 21-Sep-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- Redefine p_swtime and td_slptime as p_swtick and td_slptick. This changes the units from seconds to the value of 'ticks' when swapped in/out. ULE does not have a periodic timer that scans a
- Redefine p_swtime and td_slptime as p_swtick and td_slptick. This changes the units from seconds to the value of 'ticks' when swapped in/out. ULE does not have a periodic timer that scans all threads in the system and as such maintaining a per-second counter is difficult. - Change computations requiring the unit in seconds to subtract ticks and divide by hz. This does make the wraparound condition hz times more frequent but this is still in the range of several months to years and the adverse effects are minimal.
Approved by: re
show more ...
|
#
b61ce5b0 |
| 17-Sep-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some
- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM.
Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)
show more ...
|
#
7251b786 |
| 17-Jun-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Rather than passing SUSER_RUID into priv_check_cred() to specify when a privilege is checked against the real uid rather than the effective uid, instead decide which uid to use in priv_check_cred() b
Rather than passing SUSER_RUID into priv_check_cred() to specify when a privilege is checked against the real uid rather than the effective uid, instead decide which uid to use in priv_check_cred() based on the privilege passed in. We use the real uid for PRIV_MAXFILES, PRIV_MAXPROC, and PRIV_PROC_LIMIT. Remove the definition of SUSER_RUID; there are now no flags defined for priv_check_cred().
Obtained from: TrustedBSD Project
show more ...
|
#
fe54587f |
| 12-Jun-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- Move some common code out of sched_fork_exit() and back into fork_exit().
|
#
32f9753c |
| 12-Jun-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present.
Eliminate caller-side jai
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present.
Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c.
We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.
Reviewed by: csjp Obtained from: TrustedBSD Project
show more ...
|
#
393a081d |
| 10-Jun-2007 |
Attilio Rao <attilio@FreeBSD.org> |
Optimize vmmeter locking. In particular: - Add an explicative table for locking of struct vmmeter members - Apply new rules for some of those members - Remove some unuseful comments
Heavily reviewed
Optimize vmmeter locking. In particular: - Add an explicative table for locking of struct vmmeter members - Apply new rules for some of those members - Remove some unuseful comments
Heavily reviewed by: alc, bde, jeff Approved by: jeff (mentor)
show more ...
|
#
faef5371 |
| 08-Jun-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Move per-process audit state from a pointer in the proc structure to embedded storage in struct ucred. This allows audit state to be cached with the thread, avoiding locking operations with each sys
Move per-process audit state from a pointer in the proc structure to embedded storage in struct ucred. This allows audit state to be cached with the thread, avoiding locking operations with each system call, and makes it available in asynchronous execution contexts, such as deep in the network stack or VFS.
Reviewed by: csjp Approved by: re (kensmith) Obtained from: TrustedBSD Project
show more ...
|
#
11bda9b8 |
| 05-Jun-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
Commit 6/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-pr
Commit 6/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. - Replace the tail-end of fork_exit() with a scheduler specific routine which can do the appropriate lock manipulations.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
show more ...
|
#
1c4bcd05 |
| 01-Jun-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously
- Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock which is going away. All modifications to rusage are now done in the context of the owning thread. reads proceed without locks. - Aggregate exiting threads rusage in thread_exit() such that the exiting thread's rusage is not lost. - Provide a new routine, rufetch() to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to it's exit. The exited process's rusage is still available via p_ru. - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits.
Initial patch by: attilio Reviewed by: attilio, bde (some objections), arch (mostly silent)
show more ...
|
#
2feb50bf |
| 01-Jun-2007 |
Attilio Rao <attilio@FreeBSD.org> |
Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved b
Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
show more ...
|
#
eefc4972 |
| 18-May-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Remove unnecessary assignment.
CID: 2227 Found with: Coverity Prevent(tm)
|
#
222d0195 |
| 18-May-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sc
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
show more ...
|
#
5e3f7694 |
| 04-Apr-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared
Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.
- Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
- Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
- Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
- Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.
In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio).
Tested by: kris Discussed with: jhb, kris, attilio, jeff
show more ...
|
#
0c14ff0e |
| 04-Mar-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Remove 'MPSAFE' annotations from the comments above most system calls: all system calls now enter without Giant held, and then in some cases, acquire Giant explicitly.
Remove a number of other MPSAF
Remove 'MPSAFE' annotations from the comments above most system calls: all system calls now enter without Giant held, and then in some cases, acquire Giant explicitly.
Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.
show more ...
|
#
84d37a46 |
| 27-Feb-2007 |
John Baldwin <jhb@FreeBSD.org> |
Use pause() rather than tsleep() on explicit global dummy variables.
|
#
1ad9ee86 |
| 26-Feb-2007 |
Xin LI <delphij@FreeBSD.org> |
Close race conditions between fork() and [sg]etpriority()'s PRIO_USER case, possibly also other places that deferences p_ucred.
In the past, we insert a new process into the allproc list right after
Close race conditions between fork() and [sg]etpriority()'s PRIO_USER case, possibly also other places that deferences p_ucred.
In the past, we insert a new process into the allproc list right after PID allocation, and release the allproc_lock sx. Because most content in new proc's structure is not yet initialized, this could lead to undefined result if we do not handle PRS_NEW with care.
The problem with PRS_NEW state is that it does not provide fine grained information about how much initialization is done for a new process. By defination, after PRIO_USER setpriority(), all processes that belongs to given user should have their nice value set to the specified value. Therefore, if p_{start,end}copy section was done for a PRS_NEW process, we can not safely ignore it because p_nice is in this area. On the other hand, we should be careful on PRS_NEW processes because we do not allow non-root users to lower their nice values, and without a successful copy of the copy section, we can get stale values that is inherted from the uninitialized area of the process structure.
This commit tries to close the race condition by grabbing proc mutex *before* we release allproc_lock xlock, and do copy as well as zero immediately after the allproc_lock xunlock. This guarantees that the new process would have its p_copy and p_zero sections, as well as user credential informaion initialized. In getpriority() case, instead of grabbing PROC_LOCK for a PRS_NEW process, we just skip the process in question, because it does not affect the final result of the call, as the p_nice value would be copied from its parent, and we will see it during allproc traverse.
Other potential solutions are still under evaluation.
Discussed with: davidxu, jhb, rwatson PR: kern/108071 MFC after: 2 weeks
show more ...
|
#
f0393f06 |
| 23-Jan-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_a
- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched.
Discussed with: julian
- Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers.
Suggested by: jhb
Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.
show more ...
|
Revision tags: release/6.2.0_cvs, release/6.2.0 |
|
#
ad1e7d28 |
| 06-Dec-2006 |
Julian Elischer <julian@FreeBSD.org> |
Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made t
Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
The ULE scheduler compiles again but I have no idea if it works.
The 4bsd scheduler still reqires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
Tested by David Xu, and Dan Eischen using libthr and libpthread.
show more ...
|
#
acd3428b |
| 06-Nov-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
show more ...
|
#
8460a577 |
| 26-Oct-2006 |
John Birrell <jb@FreeBSD.org> |
Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE).
Reviewed by: davidxu@
|
#
aed55708 |
| 22-Oct-2006 |
Robert Watson <rwatson@FreeBSD.org> |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitio
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project Sponsored by: SPARTA
show more ...
|
#
993182e5 |
| 15-Aug-2006 |
Alexander Leidinger <netchild@FreeBSD.org> |
- Change process_exec function handlers prototype to include struct image_params arg. - Change struct image_params to include struct sysentvec pointer and initialize it. - Change all consumers of
- Change process_exec function handlers prototype to include struct image_params arg. - Change struct image_params to include struct sysentvec pointer and initialize it. - Change all consumers of process_exit/process_exec eventhandlers to new prototypes (includes splitting up into distinct exec/exit functions). - Add eventhandler to userret.
Sponsored by: Google SoC 2006 Submitted by: rdivacky Parts suggested by: jhb (on hackers@)
show more ...
|
#
38affe13 |
| 01-Aug-2006 |
John Baldwin <jhb@FreeBSD.org> |
Don't lock each of the processes while looking for a pid. The allproc and proctree locks that we already hold provide sufficient protection.
|