#
bed927ee |
| 21-May-2013 |
Attilio Rao <attilio@FreeBSD.org> |
vm_object locking is not needed there as pages are already wired.
Sponsored by: EMC / Isilon storage division Submitted by: alc
|
#
e3ed7ff0 |
| 17-May-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Use readlocking now that assertions on vm_page_lookup() are relaxed.
Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: flo, pho
|
#
69e6d7b7 |
| 12-Apr-2013 |
Simon J. Gerraty <sjg@FreeBSD.org> |
sync from head
|
#
d1e99f43 |
| 27-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Add dev_strategy_csw() function, which is similar to dev_strategy() but assumes that a thread reference was already obtained on the passed device. Use the function from physio(), to avoid two extra dev_mtx lock and unlock operations. Note that physio() is always used as the cdevsw method, or is called from a cdevsw method, and the caller already owns the reference.
dev_strategy() is left to keep KPI intact, but now it is implemented as a wrapper around dev_strategy_csw().
Do some style cleanup in physio().
Requested and reviewed by: kan (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
88c8c0a7 |
| 27-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
On i386, double the default size of the bio transient map. With the maxbcache size fixed, the auto-tuned transient map is too small for real-world load on i386.
Tested by: David Wolfskill Sponsored by: The FreeBSD Foundation
|
#
7db07e1c |
| 21-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Only size and create the bio_transient_map when unmapped buffers are enabled. Now, disabling the unmapped buffers should result in the kernel memory map identical to pre-r248550.
Sponsored by: The FreeBSD Foundation
|
#
e3269b50 |
| 20-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
In bufwrite(), a dirty buffer is moved to the clean queue before the bufobj counter of the writes in progress is incremented. Another thread inspecting the bufobj would consider it clean.
For the regular vnodes, the vnode lock is typically held both by the thread performing the bufwrite() and by another thread doing syncing, which prevents the situation. On the other hand, writes to the VCHR vnodes are done without holding the vnode lock.
Increment the write ref counter for the buffer object before calling bundirty().
Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks
|
#
e81ff91e |
| 19-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not remap usermode pages into KVA for physio.
Sponsored by: The FreeBSD Foundation Tested by: pho
|
#
7d5365c7 |
| 19-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Add a helper function vfs_bio_bzero_buf() to zero the portion of the buffer, transparently handling mapped or unmapped buffers. Its intent is to replace the use of bzero(bp->b_data) in cases where the buffer might be unmapped, to avoid unneeded upgrades.
Sponsored by: The FreeBSD Foundation Tested by: pho
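The per-page zeroing idea behind such a helper can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel implementation: sketch_bzero_pages() and SKETCH_PAGE_SIZE are hypothetical names, plain memset() stands in for whatever the kernel uses to zero a page region, and the page array stands in for b_pages.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* hypothetical page size for this sketch */

/*
 * Zero `size` bytes starting at byte offset `base` of a buffer whose
 * backing store is an array of separately allocated pages.  No single
 * contiguous mapping of the buffer is assumed, so the range is split
 * at page boundaries and each piece is zeroed through its own page.
 */
void
sketch_bzero_pages(unsigned char **pages, size_t npages, size_t base,
    size_t size)
{
	/* The requested range must lie inside the page array. */
	assert(base + size <= npages * SKETCH_PAGE_SIZE);

	while (size > 0) {
		size_t idx = base / SKETCH_PAGE_SIZE;	/* page to touch */
		size_t off = base % SKETCH_PAGE_SIZE;	/* offset in page */
		size_t n = SKETCH_PAGE_SIZE - off;	/* room left in page */

		if (n > size)
			n = size;
		memset(pages[idx] + off, 0, n);
		base += n;
		size -= n;
	}
}
```

A caller that only holds the page array can clear a range this way without first obtaining one linear address for the whole buffer, which is the property the commit message relies on to avoid upgrading an unmapped buffer.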
|
#
ee75e7de |
| 19-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminates the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads.
The unmapped buffer should be explicitly requested by the GB_UNMAPPED flag by the consumer. For an unmapped buffer, no KVA reservation is performed at all. With the GB_KVAALLOC flag, the consumer might request an unmapped buffer which does have a KVA reserve, to manually map it without recursing into the buffer cache and blocking.
When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation.
An unmapped buffer is translated into an unmapped bio in g_vfs_strategy(). An unmapped bio carries a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitly specify a readiness to accept unmapped bios, otherwise the g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap.
The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests.
Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable; disabling it makes buffer (or cluster) creation requests ignore the GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.
In the rework, filesystem metadata is no longer subject to the maxbufspace limit. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, are accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is enforced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not work, because buffer_map fragmentation does not allow the limit to be reached.
At Jeff Roberson's request, the getnewbuf() function was split into smaller single-purpose functions.
Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks
|
#
876a84e8 |
| 18-Mar-2013 |
Martin Matuska <mm@FreeBSD.org> |
MFC @248461
|
#
70e198dd |
| 14-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Some style fixes.
Sponsored by: The FreeBSD Foundation
|
#
c535690b |
| 14-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Add currently unused flag argument to the cluster_read(), cluster_write() and cluster_wbuild() functions. The flags to be allowed are a subset of the GB_* flags for getblk().
Sponsored by: The FreeBSD Foundation Tested by: pho
|
#
a1143a3b |
| 14-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Rewrite the vfs_bio_clrbuf(9) to not access the b_data for B_VMIO buffers directly, use pmap_zero_page_area(9) for each zeroing page region instead.
Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks
|
#
a03fbc7e |
| 09-Mar-2013 |
Martin Matuska <mm@FreeBSD.org> |
MFC @248093
|
#
89f6b863 |
| 09-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Switch the vm_object mutex to be a rwlock. This will enable further optimizations in the future, where the vm_object lock will be held in read mode most of the time when the page cache resident pool of pages is accessed for reading purposes.
The change is mostly mechanical, but a few notes are reported:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
  - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
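The difference between the read and write modes named above can be illustrated with a toy, single-threaded model. This is a sketch under stated assumptions and not the kernel rwlock: struct sketch_object and the sketch_* functions are hypothetical, there is no atomicity or sleeping, and only the try-lock outcomes are modelled.

```c
#include <assert.h>

/* Hypothetical lock state: a count of readers and a single writer flag. */
struct sketch_object {
	int readers;
	int writer;
};

/* Read mode is shared: it succeeds unless a writer holds the lock. */
int
sketch_tryrlock(struct sketch_object *o)
{
	if (o->writer)
		return (0);
	o->readers++;
	return (1);
}

/* Write mode is exclusive: it fails if any reader or writer is present. */
int
sketch_trywlock(struct sketch_object *o)
{
	if (o->writer || o->readers > 0)
		return (0);
	o->writer = 1;
	return (1);
}

void
sketch_runlock(struct sketch_object *o)
{
	assert(o->readers > 0);
	o->readers--;
}

void
sketch_wunlock(struct sketch_object *o)
{
	assert(o->writer);
	o->writer = 0;
}
```

In this model two lookups can hold the object in read mode at once, while a modification has to wait for both, which is the kind of sharing the commit message anticipates for read-mostly page lookups.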
|
#
20f4e3e1 |
| 27-Feb-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Make recursive getblk() slightly more useful. Keep the buffer state intact if getblk() is done on the already owned buffer. Exit from brelse() early when the lock recursion is detected, otherwise brelse() might prematurely destroy the buffer under some circumstances.
Sponsored by: The FreeBSD Foundation Noted by: mckusick Tested by: pho MFC after: 2 weeks
|
#
d241a0e6 |
| 26-Feb-2013 |
Xin LI <delphij@FreeBSD.org> |
IFC @247348.
|
#
2bc1a1fe |
| 16-Feb-2013 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add barrier write capability to the VFS buffer interface. A barrier write is a disk write request that tells the disk that the buffer being written must be committed to the media along with any writes that preceded it before any future blocks may be written to the drive.
Barrier writes are provided by adding the functions bbarrierwrite (bwrite with barrier) and babarrierwrite (bawrite with barrier).
Following a bbarrierwrite the client knows that the requested buffer is on the media. It does not ensure that buffers written before that buffer are on the media. It only ensures that buffers written before that buffer will get to the media before any buffers written after that buffer. A flush command must be sent to the disk to ensure that all earlier written buffers are on the media.
Reviewed by: kib Tested by: Peter Holm
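The ordering guarantee described above can be modelled in a few lines of userland C. This is an illustrative sketch, not kernel code: sketch_epoch() and sketch_order_ok() are hypothetical names, writes are identified by their submission position, and the model assumes the constraint is exactly "no write submitted after a barrier reaches the media before the barrier or anything submitted ahead of it".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of barrier writes submitted strictly before position `pos`.
 * Writes with the same count are free to reach the media in any order.
 */
size_t
sketch_epoch(const int *is_barrier, size_t pos)
{
	size_t epoch = 0;

	for (size_t i = 0; i < pos; i++)
		if (is_barrier[i])
			epoch++;
	return (epoch);
}

/*
 * media_order[k] is the submission position of the k-th write to reach
 * the media.  The order respects the barriers when the epochs seen
 * along it never decrease.
 */
int
sketch_order_ok(const int *is_barrier, const size_t *media_order, size_t n)
{
	size_t prev = 0;

	for (size_t k = 0; k < n; k++) {
		assert(media_order[k] < n);
		size_t e = sketch_epoch(is_barrier, media_order[k]);

		if (e < prev)
			return (0);
		prev = e;
	}
	return (1);
}
```

With four writes where the third is a barrier, the model accepts an order in which the barrier lands before the two earlier writes, matching the note that completion of the barrier write alone does not prove the earlier buffers are on the media, and it rejects any order in which the fourth write lands before the barrier.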
|
#
d9a44755 |
| 08-Feb-2013 |
David E. O'Brien <obrien@FreeBSD.org> |
Sync with HEAD.
|
#
46b1c55d |
| 04-Jan-2013 |
Neel Natu <neel@FreeBSD.org> |
IFC @ r244983.
|
#
b1308d72 |
| 21-Dec-2012 |
Attilio Rao <attilio@FreeBSD.org> |
Fixup r218424: uio_yield() was scaling directly to userland priority. When kern_yield() was introduced with the possibility to specify a new priority, the behaviour changed by not lowering priority at all in the consumers, making the yielding mechanism highly ineffective for high priority kthreads like bufdaemon, syncer, vlrudaemon, etc. There is no evidence that consumers could bear such a change in semantics, and this situation could finally lead to bugs similar to the ones fixed in r244240. Re-specify userland pri for kthreads involved.
Tested by: pho Reviewed by: kib, mdf MFC after: 1 week
|
#
5d439a29 |
| 10-Dec-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not ignore zero address, possibly returned by the vm_map_find() call. The function indicates a failure by the TRUE return value. To be extra safe, assert that the return value from the following vm_map_insert() indicates success.
Fix style issues in the nearby lines, reformulate the comment.
Reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
17cb8cfc |
| 09-Dec-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove useless comment.
MFC after: 3 days
|
Revision tags: release/9.1.0 |
|
#
300675f6 |
| 27-Nov-2012 |
Alexander Motin <mav@FreeBSD.org> |
MFC
|