#
4d4f9323 |
| 13-Aug-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
s/v_specinfo/v_rdev/
|
#
0ef1c826 |
| 08-Aug-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>.
Add a few fields to struct specinfo, paving the way for the fun part.
|
#
67452993 |
| 26-Jul-1999 |
Alan Cox <alc@FreeBSD.org> |
Add sysctl and support code to allow directories to be VMIO'd. The default setting for the sysctl is OFF, which is the historical operation.
Submitted by: dillon
|
#
698bfad7 |
| 20-Jul-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Now a dev_t is a pointer to struct specinfo which is shared by all specdev vnodes referencing this device.
Details: cdevsw->d_parms has been removed, the specinfo is available now (=
Now a dev_t is a pointer to struct specinfo which is shared by all specdev vnodes referencing this device.
Details: cdevsw->d_parms has been removed, the specinfo is available now (== dev_t) and the driver should modify it directly when applicable, and the only driver doing so, does so: vn.c. I am not sure the logic in checking for "<" was right before, and it looks even less so now.
An intial pool of 50 struct specinfo are depleted during early boot, after that malloc had better work. It is likely that fewer than 50 would do.
Hashing is done from udev_t to dev_t with a prime number remainder hash, experiments show no better hash available for decent cost (MD5 is only marginally better) The prime number used should not be close to a power of two, we use 83 for now.
Add new checkalias2() to get around the loss of info from dev2udev() in bdevvp();
The aliased vnodes are hung on a list straight of the dev_t, and speclisth[SPECSZ] is unused. The sharing of struct specinfo means that the v_specnext moves into the vnode which grows by 4 bytes.
Don't use a VBLK dev_t which doesn't make sense in MFS, now we hang a dummy cdevsw on B/Cmaj 253 so that things look sane.
Storage overhead from all of this is O(50k).
Bump __FreeBSD_version to 400009
The next step will add the stuff needed so device-drivers can start to hang things from struct specinfo
show more ...
|
#
3de280c4 |
| 19-Jul-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
[click] Now all dev_t's in the kernel have their char device major.
Only know casualy of this is swapinfo/pstat which should be fixes the right way: Store the actual pathname in the kernel like mo
[click] Now all dev_t's in the kernel have their char device major.
Only know casualy of this is swapinfo/pstat which should be fixes the right way: Store the actual pathname in the kernel like mount does. [Volounteers sought for this task]
The road map from here is roughly: expand struct specinfo into struct based dev_t. Add dev_t registration facilities for device drivers and start to use them.
show more ...
|
#
6ca54864 |
| 18-Jul-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Introduce the vn_todev(struct vnode*) function, which returns the dev_t corresponding to a VBLK or VCHR node, or NODEV.
|
#
c7119ea7 |
| 17-Jul-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix 2nd arg to udev2dev().
|
#
f008cfcc |
| 17-Jul-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
I have not one single time remembered the name of this function correctly so obviously I gave it the wrong name. s/umakedev/makeudev/g
|
#
e7647e6c |
| 12-Jul-1999 |
Kris Kennaway <kris@FreeBSD.org> |
Correct a couple of spelling errors in comments.
|
#
ad8ac923 |
| 08-Jul-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
These changes appear to give us benefits with both small (32MB) and large (1G) memory machine configurations. I was able to run 'dbench 32' on a 32MB system without bring the machine to a grinding h
These changes appear to give us benefits with both small (32MB) and large (1G) memory machine configurations. I was able to run 'dbench 32' on a 32MB system without bring the machine to a grinding halt.
* buffer cache hash table now dynamically allocated. This will have no effect on memory consumption for smaller systems and will help scale the buffer cache for larger systems.
* minor enhancement to pmap_clearbit(). I noticed that all the calls to it used constant arguments. Making it an inline allows the constants to propogate to deeper inlines and should produce better code.
* removal of inherent vfs_ioopt support through the emplacement of appropriate #ifdef's, with John's permission. If we do not find a use for it by the end of the year we will remove it entirely.
* removal of getnewbufloops* counters & sysctl's - no longer necessary for debugging, getnewbuf() is now optimal.
* buffer hash table functions removed from sys/buf.h and localized to vfs_bio.c
* VFS_BIO_NEED_DIRTYFLUSH flag and support code added ( bwillwrite() ), allowing processes to block when too many dirty buffers are present in the system.
* removal of a softdep test in bdwrite() that is no longer necessary now that bdwrite() no longer attempts to flush dirty buffers.
* slight optimization added to bqrelse() - there is no reason to test for available buffer space on B_DELWRI buffers.
* addition of reverse-scanning code to vfs_bio_awrite(). vfs_bio_awrite() will attempt to locate clusterable areas in both the forward and reverse direction relative to the offset of the buffer passed to it. This will probably not make much of a difference now, but I believe we will start to rely on it heavily in the future if we decide to shift some of the burden of the clustering closer to the actual I/O initiation.
* Removal of the newbufcnt and lastnewbuf counters that Kirk added. They do not fix any race conditions that haven't already been fixed by the gbincore() test done after the only call to getnewbuf(). getnewbuf() is a static, so there is no chance of it being misused by other modules. ( Unless Kirk can think of a specific thing that this code fixes. I went through it very carefully and didn't see anything ).
* removal of VOP_ISLOCKED() check in flushbufqueues(). I do not think this check is necessary, the buffer should flush properly whether the vnode is locked or not. ( yes? ).
* removal of extra arguments passed to getnewbuf() that are not necessary.
* missed cluster_wbuild() that had to be a cluster_wbuild_wb() in vfs_cluster.c
* vn_write() now calls bwillwrite() *PRIOR* to locking the vnode, which should greatly aid flushing operations in heavy load situations - both the pageout and update daemons will be able to operate more efficiently.
* removal of b_usecount. We may add it back in later but for now it is useless. Prior implementations of the buffer cache never had enough buffers for it to be useful, and current implementations which make more buffers available might not benefit relative to the amount of sophistication required to implement a b_usecount. Straight LRU should work just as well, especially when most things are VMIO backed. I expect that (even though John will not like this assumption) directories will become VMIO backed some point soon.
Submitted by: Matthew Dillon <dillon@backplane.com> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
show more ...
|
#
e929c00d |
| 04-Jul-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
The buffer queue mechanism has been reformulated. Instead of having QUEUE_AGE, QUEUE_LRU, and QUEUE_EMPTY we instead have QUEUE_CLEAN, QUEUE_DIRTY, QUEUE_EMPTY, and QUEUE_EMPTYKVA. With this patch
The buffer queue mechanism has been reformulated. Instead of having QUEUE_AGE, QUEUE_LRU, and QUEUE_EMPTY we instead have QUEUE_CLEAN, QUEUE_DIRTY, QUEUE_EMPTY, and QUEUE_EMPTYKVA. With this patch clean and dirty buffers have been separated. Empty buffers with KVM assignments have been separated from truely empty buffers. getnewbuf() has been rewritten and now operates in a 100% optimal fashion. That is, it is able to find precisely the right kind of buffer it needs to allocate a new buffer, defragment KVM, or to free-up an existing buffer when the buffer cache is full (which is a steady-state situation for the buffer cache).
Buffer flushing has been reorganized. Previously buffers were flushed in the context of whatever process hit the conditions forcing buffer flushing to occur. This resulted in processes blocking on conditions unrelated to what they were doing. This also resulted in inappropriate VFS stacking chains due to multiple processes getting stuck trying to flush dirty buffers or due to a single process getting into a situation where it might attempt to flush buffers recursively - a situation that was only partially fixed in prior commits. We have added a new daemon called the buf_daemon which is responsible for flushing dirty buffers when the number of dirty buffers exceeds the vfs.hidirtybuffers limit. This daemon attempts to dynamically adjust the rate at which dirty buffers are flushed such that getnewbuf() calls (almost) never block.
The number of nbufs and amount of buffer space is now scaled past the 8MB limit that was previously imposed for systems with over 64MB of memory, and the vfs.{lo,hi}dirtybuffers limits have been relaxed somewhat. The number of physical buffers has been increased with the intention that we will manage physical I/O differently in the future.
reassignbuf previously attempted to keep the dirtyblkhd list sorted which could result in non-deterministic operation under certain conditions, such as when a large number of dirty buffers are being managed. This algorithm has been changed. reassignbuf now keeps buffers locally sorted if it can do so cheaply, and otherwise gives up and adds buffers to the head of the dirtyblkhd list. The new algorithm is deterministic but not perfect. The new algorithm greatly reduces problems that previously occured when write_behind was turned off in the system.
The P_FLSINPROG proc->p_flag bit has been replaced by the more descriptive P_BUFEXHAUST bit. This bit allows processes working with filesystem buffers to use available emergency reserves. Normal processes do not set this bit and are not allowed to dig into emergency reserves. The purpose of this bit is to avoid low-memory deadlocks.
A small race condition was fixed in getpbuf() in vm/vm_pager.c.
Submitted by: Matthew Dillon <dillon@apollo.backplane.com> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
show more ...
|
#
8947a90a |
| 02-Jul-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make sure that stat(2) and friends always return a valid st_dev field.
Pseudo-FS need not fill in the va_fsid anymore, the syscall code will use the first half of the fsid, which now looks like a ud
Make sure that stat(2) and friends always return a valid st_dev field.
Pseudo-FS need not fill in the va_fsid anymore, the syscall code will use the first half of the fsid, which now looks like a udev_t with major 255.
show more ...
|
#
9c8b8baa |
| 01-Jul-1999 |
Peter Wemm <peter@FreeBSD.org> |
Slight reorganization of kernel thread/process creation. Instead of using SYSINIT_KT() etc (which is a static, compile-time procedure), use a NetBSD-style kthread_create() interface. kproc_start is
Slight reorganization of kernel thread/process creation. Instead of using SYSINIT_KT() etc (which is a static, compile-time procedure), use a NetBSD-style kthread_create() interface. kproc_start is still available as a SYSINIT() hook. This allowed simplification of chunks of the sysinit code in the process. This kthread_create() is our old kproc_start internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work the same as the NetBSD one.
One thing I'd like to do shortly is get rid of nfsiod as a user initiated process. It makes sense for the nfs client code to create them on the fly as needed up to a user settable limit. This means that nfsiod doesn't need to be in /sbin and is always "available". This is a fair bit easier to do outside of the SYSINIT_KT() framework.
show more ...
|
#
67812eac |
| 26-Jun-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK
Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
show more ...
|
#
f9c8cab5 |
| 17-Jun-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add a vnode argument to VOP_BWRITE to get rid of the last vnode operator special case. Delete special case code from vnode_if.sh, vnode_if.src, umap_vnops.c, and null_vnops.c.
|
#
e4ab40bc |
| 16-Jun-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
Get rid of the global variable rushjob and replace it with a function in kern/vfs_subr.c named speedup_syncer() which handles the speedup request. Change the various clients of rushjob to use the new
Get rid of the global variable rushjob and replace it with a function in kern/vfs_subr.c named speedup_syncer() which handles the speedup request. Change the various clients of rushjob to use the new function.
show more ...
|
#
2447bec8 |
| 31-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Simplify cdevsw registration.
The cdevsw_add() function now finds the major number(s) in the struct cdevsw passed to it. cdevsw_add_generic() is no longer needed, cdevsw_add() does the same thing.
Simplify cdevsw registration.
The cdevsw_add() function now finds the major number(s) in the struct cdevsw passed to it. cdevsw_add_generic() is no longer needed, cdevsw_add() does the same thing.
cdevsw_add() will print an message if the d_maj field looks bogus.
Remove nblkdev and nchrdev variables. Most places they were used bogusly. Instead check a dev_t for validity by seeing if devsw() or bdevsw() returns NULL.
Move bdevsw() and devsw() functions to kern/kern_conf.c
Bump __FreeBSD_version to 400006
This commit removes: 72 bogus makedev() calls 26 bogus SYSINIT functions
if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.
I4b and vinum not changed. Patches emailed to authors. LINT probably broken until they catch up.
show more ...
|
Revision tags: release/3.2.0 |
|
#
02013ff8 |
| 24-May-1999 |
John Birrell <jb@FreeBSD.org> |
Remove the test for bdevsw(dev) == NULL from bdevvp() because it fails if there is no character device associated with the block device. In this case that doesn't matter because bdevvp() doesn't use
Remove the test for bdevsw(dev) == NULL from bdevvp() because it fails if there is no character device associated with the block device. In this case that doesn't matter because bdevvp() doesn't use the character device structure.
I can use the pointy bit of the axe too.
show more ...
|
#
0ce54cbb |
| 14-May-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Legally acquire a major number for mfs.
|
#
eaea7a9e |
| 14-May-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
Previously directories were sync'ed every 10 seconds while bitmaps & inodes were synced every 15 seconds. This is now reversed as during directory create, we cannot commit the directory entry until i
Previously directories were sync'ed every 10 seconds while bitmaps & inodes were synced every 15 seconds. This is now reversed as during directory create, we cannot commit the directory entry until its inode has been written. With this switch, the inodes will be more likely to be written by the time that the directory is written thus reducing the number of directory rollbacks that are needed.
show more ...
|
#
cc5881cf |
| 12-May-1999 |
Peter Wemm <peter@FreeBSD.org> |
Fix (?) SPECHASH dev_t/major/minor/etc args
|
#
8bee45c4 |
| 12-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't peek into dev_t
|
#
bfbb9ce6 |
| 11-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland.
Provide functions to manipulate both types: major() umajor
Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland.
Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev()
For now they're functions, they will become in-line functions after one of the next two steps in this process.
Return major/minor/makedev to macro-hood for userland.
Register a name in cdevsw[] for the "filedescriptor" driver.
In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to a actual device.
In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few houskeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang).
A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that.
Without DEVT_FASCIST I belive this patch is a no-op.
Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result.
Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).
show more ...
|
#
1637aa4b |
| 08-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Fix some of the places where too much inside knowledge about major/minor layout and dev_t structure is being (ab)used.
|
#
4be2eb8c |
| 08-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.
Changed to the BDEV variant to this format as well: bdevsw(dev
I got tired of seeing all the cdevsw[major(foo)] all over the place.
Made a new (inline) function devsw(dev_t dev) and substituted it.
Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)
DEVFS will eventually benefit from this change too.
show more ...
|