#
24a590a0 |
| 27-Jul-2001 |
Peter Wemm <peter@FreeBSD.org> |
Fix cut/paste blunder. Serves me right for doing a last minute tweak to what I had for some time.
Submitted by: bde
|
#
0cddd8f0 |
| 04-Jul-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctls on those subsystems which the authors believe can operate without Giant.
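A minimal sketch of the idea behind the GIANT_ macros, assuming a hypothetical sysctl-backed knob (the macro and variable names here are illustrative, not the ones the commit added):

    static int vm_use_giant = 1;    /* hypothetical sysctl-backed knob */

    #define VM_GIANT_LOCK()    do { if (vm_use_giant) mtx_lock(&Giant); } while (0)
    #define VM_GIANT_UNLOCK()  do { if (vm_use_giant) mtx_unlock(&Giant); } while (0)

With wrappers like these at every former Giant call site, flipping the sysctl tests the subsystem without Giant while leaving an easy way back.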
|
#
ac8f990b |
| 24-May-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
This patch implements O_DIRECT about 80% of the way. It takes a patchset Tor created a while ago, removes the raw I/O piece (that has cache coherency problems), and adds a buffer cache / VM freeing piece.
Essentially this patch causes O_DIRECT I/O to not be left in the cache, but does not prevent it from going through the cache, hence the 80%. For the last 20% we need a method by which the I/O can be issued directly to the buffer supplied by the user process and bypass the buffer cache entirely, but still maintain cache coherency.
I also have the code working under -stable but the changes made to sys/file.h may not be MFCable, so an MFC is not on the table yet.
Submitted by: tegge, dillon
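From userland the 80% solution is already visible: opening with O_DIRECT keeps completed I/O from lingering in the buffer cache even though it still passes through it. A usage sketch (the file path is illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
            char buf[65536];
            ssize_t n;
            /* O_DIRECT: do not retain this data in the buffer cache
             * once the I/O completes. */
            int fd = open("/data/bigfile", O_RDONLY | O_DIRECT);

            if (fd == -1) {
                    perror("open");
                    return (1);
            }
            while ((n = read(fd, buf, sizeof(buf))) > 0)
                    ;       /* consume the data */
            close(fd);
            return (0);
    }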
|
#
8aa66068 |
| 24-May-2001 |
John Baldwin <jhb@FreeBSD.org> |
- Always call bfreekva() w/o vm_mtx held.
- Always call vfs_setdirty() with vm_mtx held.
- Fix an old comment: vm_hold_unload_pages is called vm_hold_free_pages() nowadays.
- Always call vm_hold_free_pages() w/o vm_mtx held.
|
#
23955314 |
| 19-May-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low-level vm operations.
Faults cannot be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
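The "minor changes" other filesystems need amount to bracketing page-property manipulation with the new lock. A hedged sketch of the pattern (the page helpers shown are typical of the era, not taken from this commit):

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <vm/vm.h>
    #include <vm/vm_page.h>

    static void
    twiddle_page(vm_page_t m)
    {
            mtx_lock(&vm_mtx);      /* vm_mtx covers low-level vm ops */
            vm_page_busy(m);
            /* ... modify page flags / valid bits ... */
            vm_page_wakeup(m);
            mtx_unlock(&vm_mtx);
    }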
|
#
60fb0ce3 |
| 29-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Revert consequences of changes to mount.h, part 2.
Requested by: bde
|
#
d98dc34f |
| 23-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Correct #includes to work with fixed sys/mount.h.
|
Revision tags: release/4.3.0_cvs, release/4.3.0 |
|
#
793d6d5d |
| 18-Apr-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
bread() is a special case of breadn(), so don't replicate code.
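The deduplication is straightforward: bread() becomes a thin wrapper that calls breadn() with an empty read-ahead list. A sketch with era-approximate signatures:

    /* bread() as a degenerate breadn(): no read-ahead blocks. */
    int
    bread(struct vnode *vp, daddr_t blkno, int size, struct ucred *cred,
        struct buf **bpp)
    {
            return (breadn(vp, blkno, size, NULL, NULL, 0, cred, bpp));
    }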
|
#
0dfba3ce |
| 17-Apr-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Rewrite a switch statement as less obscure if statements.
|
#
f84e29a0 |
| 17-Apr-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
This patch removes the VOP_BWRITE() vector.
VOP_BWRITE() was a hack which made it possible for NFS client side to use struct buf with non-bio backing.
This patch takes a more general approach and adds a bp->b_op vector where more methods can be added.
The success of this patch depends on bp->b_op being initialized in all relevant places for some value of "relevant" which is not easy to determine. For now the buffers have grown a b_magic element which will make such issues a tiny bit easier to debug.
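A sketch of the method-vector shape (b_magic and b_op come from the commit text; the ops-structure layout and the macro are assumptions):

    struct buf;                             /* forward declaration */

    struct buf_ops {
            char    *bop_name;
            int     (*bop_write)(struct buf *);
    };

    struct buf {
            u_int32_t        b_magic;       /* catches an uninitialized b_op */
            struct buf_ops  *b_op;          /* per-backing method vector */
            /* ... remaining buffer fields ... */
    };

    #define BUF_WRITE(bp)   ((bp)->b_op->bop_write(bp))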
|
#
5819ab3f |
| 17-Apr-2001 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add a debugging option to always read/write cylinder groups as full-sized blocks. To enable this option, use: `sysctl -w debug.bigcgs=1'. Add a debugging option to disable background writes of cylinder groups. To enable this option, use: `sysctl -w debug.dobkgrdwrite=0'. These debugging options should be tried on systems that are panicking with corrupted cylinder group maps to see if they make the problem go away. The set of panics in question is:
ffs_clusteralloc: map mismatch
ffs_nodealloccg: map corrupted
ffs_nodealloccg: block not in map
ffs_alloccg: map corrupted
ffs_alloccg: block not in map
ffs_alloccgblk: cyl groups corrupted
ffs_alloccgblk: can't find blk in cyl
ffs_checkblk: partially free fragment
The following panics are less likely to be related to this problem, but might be helped by these debugging options:
ffs_valloc: dup alloc
ffs_blkfree: freeing free block
ffs_blkfree: freeing free frag
ffs_vfree: freeing free inode
If you try these options, please report whether they helped reduce your bitmap corruption panics to Kirk McKusick at <mckusick@mckusick.com> and to Matt Dillon <dillon@earth.backplane.com>.
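The knobs themselves are plain integer sysctls; a declaration sketch in the 2001-era style (defaults inferred from the description above):

    static int bigcgs = 0;          /* full-sized cg I/O off by default */
    SYSCTL_INT(_debug, OID_AUTO, bigcgs, CTLFLAG_RW, &bigcgs, 0, "");

    static int dobkgrdwrite = 1;    /* background cg writes on by default */
    SYSCTL_INT(_debug, OID_AUTO, dobkgrdwrite, CTLFLAG_RW, &dobkgrdwrite, 0, "");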
|
#
63692125 |
| 28-Feb-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix lockup for loopback NFS mounts. The pipelined I/O limitations could be hit on the client side and prevent the server side from retiring writes. Pipeline operations are turned off for all READs (no big loss since reads are usually synchronous) and for NFS writes, and left on for the default bwrite(). (MFC expected prior to 4.3 freeze)
Testing by: mjacob, dillon
|
#
9ed346ba |
| 09-Feb-2001 |
Bosko Milekic <bmilekic@FreeBSD.org> |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarly, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
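A before/after sketch of a call-site conversion (the lock name is illustrative):

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>

    static struct mtx example_mtx;          /* an MTX_DEF (sleep) lock */

    static void
    example(void)
    {
            /* Old interface: mtx_enter(&example_mtx, MTX_DEF); */
            mtx_lock(&example_mtx);
            /* ... critical section ... */
            mtx_unlock(&example_mtx);
            /* Old interface: mtx_exit(&example_mtx, MTX_DEF); */

            /* Flags now go through the explicit wrappers: */
            mtx_lock_flags(&example_mtx, MTX_QUIET);
            mtx_unlock_flags(&example_mtx, MTX_QUIET);
    }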
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
#
4e71e795 |
| 04-Feb-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
This commit represents work mainly submitted by Tor and slightly modified by myself. It solves a serious vm_map corruption problem that can occur with the buffer cache when block sizes > 64K are used. This code has been heavily tested in -stable but only tested somewhat on -current. An MFC will occur in a few days. My additions include the vm_map_simplify_entry() calls and a minor buffer cache boundary case fix.
Make the buffer cache use a system map for buffer cache KVM rather than a normal map.
Ensure that VM objects are not allocated for system maps. There were cases where a buffer map could wind up with a backing VM object -- normally harmless, but this could also result in the buffer cache blocking in places where it assumes no blocking will occur, possibly resulting in corrupted maps.
Fix a minor boundary case when the buffer cache size limit is reached that could result in non-optimal code.
Add vm_map_simplify_entry() calls to prevent 'creeping proliferation' of vm_map_entry's in the buffer cache's vm_map. Previously only a simple linear optimization was made. (The buffer vm_map typically has only a handful of vm_map_entry's. This stabilizes it at that level permanently).
PR: 20609
Submitted by: (Tor Egge) tegge
|
#
ef73ae4b |
| 10-Jan-2001 |
Jake Burkholder <jake@FreeBSD.org> |
Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other than curproc.
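An access-pattern sketch (cpuid is a real per-cpu field; the write is purely illustrative):

    #include <sys/param.h>
    #include <sys/pcpu.h>

    static void
    example(void)
    {
            int cpu;

            cpu = PCPU_GET(cpuid);  /* read a per-cpu variable */
            PCPU_SET(cpuid, cpu);   /* write one back (illustrative no-op) */
            /* PCPU_PTR(field) similarly yields a pointer to the field. */
    }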
|
#
2b6b0df7 |
| 26-Dec-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
This implements a better launder limiting solution. There was a solution in 4.2-REL which I ripped out in -stable and -current when implementing the low-memory handling solution. However, maxlaunder turns out to be the saving grace in certain very heavily loaded systems (e.g. newsreader box). The new algorithm limits the number of pages laundered in the first pageout daemon pass. If that is not sufficient, successive passes will be run without any limit.
Write I/O is now pipelined using two sysctls, vfs.lorunningspace and vfs.hirunningspace. This prevents excessive buffered writes in the disk queues which cause long (multi-second) delays for reads. It leads to more stable (less jerky) and generally faster I/O streaming to disk by allowing required read ops (e.g. for indirect blocks and such) to occur without interrupting the write stream, among other things.
NOTE: eventually, filesystem write I/O pipelining needs to be done on a per-device basis. At the moment it is globalized.
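A hedged sketch of the hysteresis (only the two sysctl names come from the commit; the other identifiers are illustrative):

    static long runningbufspace;    /* bytes of write I/O in flight */
    static long lorunningspace;     /* vfs.lorunningspace: resume threshold */
    static long hirunningspace;     /* vfs.hirunningspace: stall threshold */

    /* Writers stall here once too much write I/O is queued... */
    static void
    waitrunningbufspace(void)
    {
            while (runningbufspace > hirunningspace)
                    tsleep(&runningbufspace, PVM, "wdrain", 0);
    }

    /* ...and completions wake them once the backlog drains below the
     * low-water mark. */
    static void
    runningbufwakeup(long completed)
    {
            runningbufspace -= completed;
            if (runningbufspace < lorunningspace)
                    wakeup(&runningbufspace);
    }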
|
#
ffc831da |
| 15-Dec-2000 |
John Baldwin <jhb@FreeBSD.org> |
Stick the kthread API in a kthread_* namespace, and the specialized kproc functions in a kproc_* namespace.
Reviewed by: -arch
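A creation sketch under the new namespace (the signature is approximated from the era and may not match exactly):

    #include <sys/param.h>
    #include <sys/kthread.h>

    static struct proc *exampleproc;

    static void
    example_thread(void *arg)
    {
            for (;;) {
                    /* ... do the work ... */
            }
    }

    static void
    start_example(void)
    {
            kthread_create(example_thread, NULL, &exampleproc, "example");
    }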
|
Revision tags: release/4.2.0 |
|
#
936524aa |
| 19-Nov-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
Implement a low-memory deadlock solution.
Removed most of the hacks that were trying to deal with low-memory situations prior to now.
The new code is based on the concept that I/O must be able to function in a low memory situation. All major modules related to I/O (except networking) have been adjusted to allow allocation out of the system reserve memory pool. These modules now detect a low memory situation but, rather than block, continue to operate, then return resources to the memory pool instead of caching them or leaving them wired.
Code has been added to stall in a low-memory situation prior to a vnode being locked.
Thus situations where a process blocks in a low-memory condition while holding a locked vnode have been reduced to near nothing. Not only will I/O continue to operate, but many prior deadlock conditions simply no longer exist.
Implement a number of VFS/BIO fixes (found by Ian):
In biodone()'s bogus-page replacement code, the loop was not properly incrementing loop variables prior to a continue statement. We do not believe this code can be hit anyway but we aren't taking any chances. We'll turn the whole section into a panic (as it already is in brelse()) after the release is rolled.
In biodone(), the foff calculation was incorrectly clamped to the iosize, causing the wrong foff to be calculated for pages in the case of an I/O error or biodone() called without initiating I/O. The problem always caused a panic before. Now it doesn't. The problem is mainly an issue with NFS.
Fixed casts for ~PAGE_MASK. This code worked properly before only because the calculations used signed arithmetic. Better to properly extend PAGE_MASK first before inverting it for the 64 bit masking op.
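The cast pitfall is easy to demonstrate in isolation: inverting a 32-bit mask and then widening it zeroes the upper half of a 64-bit offset, so the mask must be extended before the inversion. A standalone illustration (not the kernel code itself):

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
            uint64_t foff = 0x100001234ULL;         /* an offset above 4GB */

            /* Inverted as 32 bits, then widened: the mask's upper
             * 32 bits are zero, so the & destroys the high word. */
            uint64_t bad = foff & ~(uint32_t)0xfff;

            /* Extended to 64 bits first, then inverted: correct. */
            uint64_t good = foff & ~(uint64_t)0xfff;

            printf("bad  = %#llx\n", (unsigned long long)bad);  /* 0x1000 */
            printf("good = %#llx\n", (unsigned long long)good); /* 0x100001000 */
            return (0);
    }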
In brelse(), the bogus_page fixup code was improperly throwing away the original contents of 'm' when it did the j-loop to fix the bogus pages. The result was that it would potentially invalidate parts of the *WRONG* page(!), leading to corruption.
There may still be cases where a background bitmap write is being duplicated, causing potential corruption. We have identified a potentially serious bug related to this but the fix is still TBD. So instead this patch contains a KASSERT to detect the problem and panic the machine rather than continue to corrupt the filesystem. The problem does not occur very often... it is very hard to reproduce, and it may or may not be the cause of the corruption people have reported.
Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>)
Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>
|
#
1d7e3e42 |
| 02-Nov-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Take VBLK devices further out of their misery.
This should fix the panic I introduced in my previous commit on this topic.
|
#
35e0e5b3 |
| 20-Oct-2000 |
John Baldwin <jhb@FreeBSD.org> |
Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
|
#
a18b1f1d |
| 04-Oct-2000 |
Jason Evans <jasone@FreeBSD.org> |
Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations; it corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
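The lifecycle pairing in miniature (priority and wmesg values are illustrative):

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/lockmgr.h>

    static struct lock example_lock;

    static void
    example_setup(void)
    {
            lockinit(&example_lock, PVFS, "exlock", 0, 0);
    }

    static void
    example_teardown(void)
    {
            /* New requirement: tear down the lock's mutex resources. */
            lockdestroy(&example_lock);
    }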
|
Revision tags: release/4.1.1_cvs |
|
#
9ff5ce6b |
| 12-Sep-2000 |
Boris Popov <bp@FreeBSD.org> |
Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT. They will be used by nullfs and other stacked filesystems to support full cache coherency.
Reviewed in general by: mckusick, dillon
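A call-shape sketch (argument lists approximated from the era's VOP conventions):

    /* Ensure the underlying vnode has a VM object, then fetch it so a
     * stacked filesystem can share pages coherently with the layer below. */
    static int
    ensure_vobject(struct vnode *vp, struct ucred *cred, struct proc *p,
        struct vm_object **objp)
    {
            int error;

            error = VOP_CREATEVOBJECT(vp, cred, p);
            if (error == 0)
                    error = VOP_GETVOBJECT(vp, objp);
            return (error);
    }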
|
#
0384fff8 |
| 07-Sep-2000 |
Jason Evans <jasone@FreeBSD.org> |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
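The flavor of the conversion in miniature (illustrative fragment; which mutex replaces which spl level is a per-subsystem decision):

    static void
    example(void)
    {
            /*
             * Before: raise the interrupt priority level.
             *
             *      int s = splbio();
             *      ... touch I/O state ...
             *      splx(s);
             */

            /* After: take a mutex instead; see mutex(9). */
            mtx_lock(&Giant);
            /* ... touch I/O state ... */
            mtx_unlock(&Giant);
    }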
|
Revision tags: release/4.1.0 |
|
#
54e53ebd |
| 25-Jul-2000 |
Kirk McKusick <mckusick@FreeBSD.org> |
Now that buffer locks can be recursive, we need to delete the panics that complain about them.
Obtained from: Brian Fundakowski Feldman <green@FreeBSD.org>
|
#
f2a2857b |
| 12-Jul-2000 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem-modifying system calls already in progress to complete, then syncs the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed.
Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot, so for brevity and timeliness they are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
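A gating sketch for a snapshot-style consumer:

    #include <sys/param.h>
    #include <sys/mount.h>

    static void
    with_suspended_writes(struct mount *mp)
    {
            vfs_write_suspend(mp);
            /* New writes are gated and in-flight modifications have
             * drained; on-disk state is consistent (take the snapshot). */
            vfs_write_resume(mp);
    }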
|