#
14961ba7 |
| 27-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Replace AUDIT_ARG() with variable argument macros with a set more more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis too
Replace AUDIT_ARG() with variable argument macros with a set more more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr).
In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules.
Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week
show more ...
|
#
3364c323 |
| 23-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associ
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4).
Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced.
In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
show more ...
|
#
7e857dd1 |
| 12-Jun-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
- Merge from HEAD
|
#
d8b0556c |
| 10-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive acces
Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock.
Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock.
Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly.
Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization.
Submitted by: jhb [1] Noted by: jhb [2] Reviewed by: jhb Tested by: rnoland
show more ...
|
#
bcf11e8d |
| 05-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in du
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
show more ...
|
#
0304c731 |
| 27-May-2009 |
Jamie Gritton <jamie@FreeBSD.org> |
Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their
Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings.
Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge().
Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call.
Approved by: bz (mentor)
show more ...
|
#
2e370a5c |
| 26-May-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
Merge from HEAD
|
#
29b02909 |
| 08-May-2009 |
Marko Zec <zec@FreeBSD.org> |
Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds po
Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import.
As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet.
Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch.
Permit kldload / kldunload operations to be executed only from the default vimage context.
This change should have no functional impact on nooptions VIMAGE kernel builds.
Reviewed by: bz Approved by: julian (mentor)
show more ...
|
#
e7153b25 |
| 07-May-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
Merge from HEAD
|
#
21ca7b57 |
| 05-May-2009 |
Marko Zec <zec@FreeBSD.org> |
Change the curvnet variable from a global const struct vnet *, previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set
Change the curvnet variable from a global const struct vnet *, previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discuouraged.
This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_* macros expand to whitespace.
The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another.
The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, whith an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typicall cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions.
This change also introduces a DDB subcommand to show the list of all vnet instances.
Approved by: julian (mentor)
show more ...
|
Revision tags: release/7.2.0_cvs, release/7.2.0, release/7.1.0_cvs, release/7.1.0 |
|
#
aeb32571 |
| 05-Dec-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Several threads in a process may do vfork() simultaneously. Then, all parent threads sleep on the parent' struct proc until corresponding child releases the vmspace. Each sleep is interlocked with pr
Several threads in a process may do vfork() simultaneously. Then, all parent threads sleep on the parent' struct proc until corresponding child releases the vmspace. Each sleep is interlocked with proc mutex of the child, that triggers assertion in the sleepq_add(). The assertion requires that at any time, all simultaneous sleepers for the channel use the same interlock.
Silent the assertion by using conditional variable allocated in the child. Broadcast the variable event on exec() and exit().
Since struct proc * sleep wait channel is overloaded for several unrelated events, I was unable to remove wakeups from the places where cv_broadcast() is added, except exec().
Reported and tested by: ganbold Suggested and reviewed by: jhb MFC after: 2 week
show more ...
|
#
e57c2b13 |
| 04-Dec-2008 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
integrate from head@185615
|
#
413628a7 |
| 29-Nov-2008 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
MFp4: Bring in updated jail support from bz_jail branch.
This enhances the current jail implementation to permit multiple addresses per jail. In addtion to IPv4, IPv6 is supported as well. Due to
MFp4: Bring in updated jail support from bz_jail branch.
This enhances the current jail implementation to permit multiple addresses per jail. In addtion to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,..
SCTP support was updated and supports IPv6 in jails as well.
Cpuset support permits jails to be bound to specific processor sets after creation.
Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future.
DDB 'show jails' command was added to aid debugging.
Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities.
Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years.
Bump __FreeBSD_version for the afore mentioned and in kernel changes.
Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this.
Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible
show more ...
|
Revision tags: release/6.4.0_cvs, release/6.4.0 |
|
#
50d6e424 |
| 19-Oct-2008 |
Kip Macy <kmacy@FreeBSD.org> |
- Forward port flush of page table updates on context switch or userret - Forward port vfork XEN hack
|
#
8b4a2800 |
| 23-Jul-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Do the pargs_hold() on the copy of the pointer to the p_args of the child process immediately after bulk bcopy() without dropping the process lock.
Since process is not single-threaded when forking,
Do the pargs_hold() on the copy of the pointer to the p_args of the child process immediately after bulk bcopy() without dropping the process lock.
Since process is not single-threaded when forking, dropping and reacquiring the lock allows an other thread to change the process title of the parent in between, and results in hold being done on the invalid pointer. The problem manifested itself as the double free of the old p_args.
Reported by: kris Reviewed by: jhb MFC after: 1 week
show more ...
|
#
7054ee4e |
| 07-Jul-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
The kqueue_register() function assumes that it is called from the top of the syscall code and acquires various event subsystem locks as needed. The handling of the NOTE_TRACK for EVFILT_PROC is curre
The kqueue_register() function assumes that it is called from the top of the syscall code and acquires various event subsystem locks as needed. The handling of the NOTE_TRACK for EVFILT_PROC is currently done by calling the kqueue_register() from filt_proc() filter, causing recursive entrance of the kqueue code. This results in the LORs and recursive acquisition of the locks.
Implement the variant of the knote() function designed to only handle the fork() event. It mostly copies the knote() body, but also handles the NOTE_TRACK, removing the handling from the filt_proc(), where it causes problems described above. The function is called from the fork1() instead of knote().
When encountering NOTE_TRACK knote, it marks the knote as influx and drops the knlist and kqueue lock. In this context call to kqueue_register is safe from the problems.
An error from the kqueue_register() is reported to the observer as NOTE_TRACKERR fflag.
PR: 108201 Reviewed by: jhb, Pramod Srinivasan <pramod juniper net> (previous version) Discussed with: jmg Tested by: pho MFC after: 2 weeks
show more ...
|
#
5d217f17 |
| 24-May-2008 |
John Birrell <jb@FreeBSD.org> |
Add DTrace 'proc' provider probes using the Statically Defined Trace (sdt) mechanism.
|
#
69aa768a |
| 20-Mar-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix the leak of the vmspace on the fork when the process limits are exceeded.
Pointy hat to: me MFC after: 3 days
|
#
0ac213ef |
| 20-Mar-2008 |
Jeff Roberson <jeff@FreeBSD.org> |
- Don't call the empty sched_newproc() function. sched_newproc() already existed as sched_fork() which is a non empty function in both schedulers.
|
#
6617724c |
| 12-Mar-2008 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potent
Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
show more ...
|
Revision tags: release/7.0.0_cvs, release/7.0.0, release/6.3.0_cvs, release/6.3.0 |
|
#
4b9322ae |
| 15-Nov-2007 |
Julian Elischer <julian@FreeBSD.org> |
When forking, the new thread deserves a name too. Don't just use the td_startcopy section as it is not the right thing to do in other cases (e.g. if starting a new thread from one that is already nam
When forking, the new thread deserves a name too. Don't just use the td_startcopy section as it is not the right thing to do in other cases (e.g. if starting a new thread from one that is already named).
show more ...
|
#
e01eafef |
| 14-Nov-2007 |
Julian Elischer <julian@FreeBSD.org> |
A bunch more files that should probably print out a thread name instead of a process name.
|
#
89b57fcf |
| 05-Nov-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. B
Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicing or dereferencing NULL.
As consequence, vmspace_exec() and vmspace_unshare() returns the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper).
In collaboration with: Peter Holm Reviewed by: jhb
show more ...
|
#
f4bb4fc8 |
| 02-Nov-2007 |
Julian Elischer <julian@FreeBSD.org> |
Completely remove the code for single threading the mainline fork code. Put in a little comment explaining why it went away. Re-enable it in the case there an exisiting process is just splitting off
Completely remove the code for single threading the mainline fork code. Put in a little comment explaining why it went away. Re-enable it in the case there an exisiting process is just splitting off its address space and file descriptors. (I donpt think anything uses that code but it needs some sort of locking and this does the job.
Reviewed by: Davidxu, alc, others MFC after: 3 days
show more ...
|
#
30d239bc |
| 24-Oct-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
mac_<object>_<method/action> mac_<objec
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
mac_<object>_<method/action> mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names.
All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
show more ...
|