#
0cd95859 |
| 25-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
[2/3] Add an initial seal argument to kern_shm_open()
Now that flags may be set on posixshm, add an argument to kern_shm_open() for the initial seals. To maintain past behavior where callers of shm_open(2) are guaranteed to not have any seals applied to the fd they're given, apply F_SEAL_SEAL for existing callers of kern_shm_open. A special flag could be opened later for shm_open(2) to indicate that sealing should be allowed.
We currently restrict initial seals to F_SEAL_SEAL. We cannot error out if F_SEAL_SEAL is re-applied, as this would easily break calling shm_open() twice on a shmfd that already existed. A note has been added about the assumptions we've made here as a hint to anyone wanting to allow other seals to be applied at creation.
Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21392
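The initial-seal rule described above can be sketched as a small model. This is not the kernel code: `toy_shmfd`, `toy_open_apply_seals()`, `toy_add_seals()` and the `TOY_SEAL_*` values are hypothetical names invented for illustration, and the EINVAL/EPERM return values are assumptions about how such checks are usually reported.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical seal bits for this sketch; real F_SEAL_* values come from <fcntl.h>. */
#define TOY_SEAL_SEAL  0x01
#define TOY_SEAL_WRITE 0x08

/* Hypothetical stand-in for a shared memory object carrying a seal mask. */
struct toy_shmfd {
    int seals;
};

/* Model of applying initial seals at open time. */
int toy_open_apply_seals(struct toy_shmfd *shm, int initial_seals)
{
    assert(shm != NULL);
    /* Only the "no further seals" seal is accepted at creation (assumed EINVAL otherwise). */
    if ((initial_seals & ~TOY_SEAL_SEAL) != 0)
        return EINVAL;
    /* OR-ing is idempotent, so opening an existing object a second time still succeeds. */
    shm->seals |= initial_seals;
    return 0;
}

/* Model of a later request to add seals on an already open object. */
int toy_add_seals(struct toy_shmfd *shm, int seals)
{
    assert(shm != NULL);
    /* Once TOY_SEAL_SEAL is set, no further seals may be added (assumed EPERM). */
    if ((shm->seals & TOY_SEAL_SEAL) != 0)
        return EPERM;
    shm->seals |= seals;
    return 0;
}
```

In this model, opening the same object twice with TOY_SEAL_SEAL succeeds both times, while a later attempt to add TOY_SEAL_WRITE is refused, which matches the guarantee that existing shm_open(2) callers never see additional seals on their descriptor.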
|
#
af755d3e |
| 25-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
[1/3] Add mostly Linux-compatible file sealing support
File sealing applies protections against certain actions (currently: write, growth, shrink) at the inode level. New fileops are added to accommodate seals - EINVAL is returned by fcntl(2) if they are not implemented.
Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D21391
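A minimal sketch of the sealing behavior described in this commit, written as a userspace model rather than kernel code. All names (`toy_file`, `toy_fileops`, `toy_fcntl_add_seals()`, the `TOY_SEAL_*` bits) are hypothetical, and returning EPERM for a sealed operation is an assumption; only the EINVAL result for a file type without a seal operation is taken from the text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical seal bits for this sketch; real F_SEAL_* values come from <fcntl.h>. */
#define TOY_SEAL_SHRINK 0x02
#define TOY_SEAL_GROW   0x04
#define TOY_SEAL_WRITE  0x08

struct toy_file {
    size_t size;
    int seals;
};

/* Per-file-type operation table; add_seals is optional. */
struct toy_fileops {
    int (*add_seals)(struct toy_file *, int);
};

/* Seal operation for a file type that supports sealing. */
int toy_shm_add_seals(struct toy_file *f, int seals)
{
    f->seals |= seals;
    return 0;
}

/* Dispatch through the table; file types without the operation yield EINVAL. */
int toy_fcntl_add_seals(const struct toy_fileops *ops, struct toy_file *f, int seals)
{
    assert(ops != NULL && f != NULL);
    if (ops->add_seals == NULL)
        return EINVAL;
    return ops->add_seals(f, seals);
}

/* Writes are refused once the write seal is set (assumed EPERM). */
int toy_write(const struct toy_file *f)
{
    return (f->seals & TOY_SEAL_WRITE) != 0 ? EPERM : 0;
}

/* Resizing is refused in the sealed direction only. */
int toy_truncate(struct toy_file *f, size_t new_size)
{
    if (new_size > f->size && (f->seals & TOY_SEAL_GROW) != 0)
        return EPERM;
    if (new_size < f->size && (f->seals & TOY_SEAL_SHRINK) != 0)
        return EPERM;
    f->size = new_size;
    return 0;
}
```

The seal mask lives on the file object itself rather than on a descriptor, which is how the model reflects protections being applied "at the inode level": every descriptor referring to the same object sees the same seals.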
|
#
61c1328e |
| 13-Sep-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r352105 through r352307.
|
#
c7575748 |
| 10-Sep-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Replace redundant code with a few new vm_page_grab facilities:
- VM_ALLOC_NOCREAT will grab without creating a page.
- vm_page_grab_valid() will grab and page in if necessary.
- vm_page_busy_acquire() automates some busy acquire loops.
Discussed with: alc, kib, markj Tested by: pho (part of larger branch) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21546
|
#
fee2a2fa |
| 09-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Change synchronization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held, preventing the page from being freed back to the page allocator. In particular, holding the page's object lock is sufficient to prevent the page from being freed; holding the busy lock or a wiring is sufficient as well. These references are protected by the page lock, which must therefore be acquired for many per-page operations. This results in false sharing since the page locks are external to the vm_page structures themselves and each lock protects multiple structures.
Transition to using an atomically updated per-page reference counter. The object's reference is counted using a flag bit in the counter. A second flag bit is used to atomically block new references via pmap_extract_and_hold() while removing managed mappings of a page. Thus, the reference count of a page is guaranteed not to increase if the page is unbusied, unmapped, and the object's write lock is held. As a consequence of this, the page lock no longer protects a page's identity; operations which move pages between objects are now synchronized solely by the objects' locks.
The vm_page_wire() and vm_page_unwire() KPIs are changed. The former requires that either the object lock or the busy lock is held. The latter no longer has a return value and may free the page if it releases the last reference to that page. vm_page_unwire_noq() behaves the same as before; the caller is responsible for checking its return value and freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is introduced for use in pmap_extract_and_hold(). It fails if the page is concurrently being unmapped, typically triggering a fallback to the fault handler. vm_page_wire() no longer requires the page lock and vm_page_unwire() now internally acquires the page lock when releasing the last wiring of a page (since the page lock still protects a page's queue state). In particular, synchronization details are no longer leaked into the caller.
The change excises the page lock from several frequently executed code paths. In particular, vm_object_terminate() no longer bounces between page locks as it releases an object's pages, and direct I/O and sendfile(SF_NOCACHE) completions no longer require the page lock. In these latter cases we now get linear scalability in the common scenario where different threads are operating on different files.
__FreeBSD_version is bumped. The DRM ports have been updated to accommodate the KPI changes.
Reviewed by: jeff (earlier version) Tested by: gallatin (earlier version), pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20486
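The counter layout described in this commit (a reference count plus one flag bit for the object's reference and one flag bit that blocks new references) can be illustrated with C11 atomics. This is a simplified sketch, not the vm_page implementation: the `toy_*` names and bit positions are hypothetical, and locking, busying and queue state are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch. */
#define TOY_REF_OBJECT  0x80000000u /* the owning object holds a reference */
#define TOY_REF_BLOCKED 0x40000000u /* new mapping-based references are refused */
#define TOY_REF_COUNT(c) ((c) & ~(TOY_REF_OBJECT | TOY_REF_BLOCKED))

struct toy_page {
    _Atomic uint32_t ref_count;
};

/* Record or clear the object's reference as a flag bit instead of a count. */
void toy_set_object_ref(struct toy_page *p)   { atomic_fetch_or(&p->ref_count, TOY_REF_OBJECT); }
void toy_clear_object_ref(struct toy_page *p) { atomic_fetch_and(&p->ref_count, ~TOY_REF_OBJECT); }

/* Block or unblock new references while mappings are being removed. */
void toy_block_new_refs(struct toy_page *p)   { atomic_fetch_or(&p->ref_count, TOY_REF_BLOCKED); }
void toy_unblock_new_refs(struct toy_page *p) { atomic_fetch_and(&p->ref_count, ~TOY_REF_BLOCKED); }

/* Try to take a reference via a mapping lookup; fails while the blocked bit is set. */
bool toy_wire_mapped(struct toy_page *p)
{
    uint32_t old = atomic_load(&p->ref_count);
    do {
        if ((old & TOY_REF_BLOCKED) != 0)
            return false;
    } while (!atomic_compare_exchange_weak(&p->ref_count, &old, old + 1));
    return true;
}

/* Drop a reference; returns true when nothing references the page any more. */
bool toy_unwire(struct toy_page *p)
{
    uint32_t old = atomic_fetch_sub(&p->ref_count, 1);
    assert(TOY_REF_COUNT(old) > 0);
    /* Last reference: the count was 1 and the object flag was clear. */
    return (old & ~TOY_REF_BLOCKED) == 1;
}
```

Because the blocked bit is tested and the count incremented in a single compare-and-swap, a caller that has set the bit knows the count cannot grow through this path, which is the property the commit relies on when removing managed mappings.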
|
#
f993ed2f |
| 09-Sep-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r351732 through r352104.
|
#
dca52ab4 |
| 03-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
posixshm: start counting writeable mappings
r351650 switched posixshm to using OBJT_SWAP for shm_object
r351795 added support to the swap_pager for tracking writeable mappings
Take advantage of this and start tracking writeable mappings; fd sealing will use this to reject a seal on writing with EBUSY if any such mapping exists.
Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21456
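As a rough illustration of why the mapping count matters, the sketch below models a write seal that is refused with EBUSY while a writeable mapping is outstanding. The `toy_shm` structure and `toy_*` functions are hypothetical; refusing new writeable mappings with EPERM after sealing is an assumption, not something stated in the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a shared memory object that counts writeable mappings. */
struct toy_shm {
    bool write_sealed;
    unsigned int writemappings;
};

/* Map the object; writeable mappings are counted, and refused once sealed (assumed EPERM). */
int toy_map(struct toy_shm *shm, bool writeable)
{
    assert(shm != NULL);
    if (writeable) {
        if (shm->write_sealed)
            return EPERM;
        shm->writemappings++;
    }
    return 0;
}

/* Unmap; only writeable mappings affect the counter. */
void toy_unmap(struct toy_shm *shm, bool writeable)
{
    assert(shm != NULL);
    if (writeable) {
        assert(shm->writemappings > 0);
        shm->writemappings--;
    }
}

/* Seal against writes; fails with EBUSY while any writeable mapping exists. */
int toy_seal_write(struct toy_shm *shm)
{
    assert(shm != NULL);
    if (shm->writemappings > 0)
        return EBUSY;
    shm->write_sealed = true;
    return 0;
}
```

Without the counter, a write seal could be applied while another process still holds a writeable mapping and could keep modifying the contents, so the seal would not actually guarantee immutability.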
|
#
c5c3ba6b |
| 03-Sep-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r351317 through r351731.
|
#
32287ea7 |
| 01-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
posixshm: switch to OBJT_SWAP in advance of other changes
Future changes to posixshm will start tracking writeable mappings in order to support file sealing. Tracking writeable mappings for an OBJT_DEFAULT object is complicated as it may be swapped out and converted to an OBJT_SWAP. One may generically add this tracking for vm_object, but this is difficult to do without increasing memory footprint of vm_object and blowing up memory usage by a significant amount.
On the other hand, the swap pager can be expanded to track writeable mappings without increasing vm_object size. This change is currently in D21456. Switch over to OBJT_SWAP in advance of the other changes to the swap pager and posixshm.
|
#
b5d239cb |
| 28-Aug-2019 |
Mark Johnston <markj@FreeBSD.org> |
Wire pages in vm_page_grab() when appropriate.
uiomove_object_page() and exec_map_first_page() would previously wire a page after having grabbed it. Ask vm_page_grab() to perform the wiring instead: this removes some redundant code, and is cheaper in the case where the requested page is not resident since the page allocator can be asked to initialize the page as wired, whereas a separate vm_page_wire() call requires the page lock.
In vm_imgact_hold_page(), use vm_page_unwire_noq() instead of vm_page_unwire(PQ_NONE). The latter ensures that the page is dequeued before returning, but this is unnecessary since vm_page_free() will trigger a batched dequeue of the page.
Reviewed by: alc, kib Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21440
|
#
b5a7ac99 |
| 31-Jul-2019 |
Kyle Evans <kevans@FreeBSD.org> |
kern_shm_open: push O_CLOEXEC into caller control
The motivation for this change is to allow wrappers around shm to be written that don't set CLOEXEC. kern_shm_open currently accepts O_CLOEXEC but sets it unconditionally. kern_shm_open is used by the shm_open(2) syscall, which is mandated by POSIX to set CLOEXEC, and CloudABI's sys_fd_create1(). Presumably O_CLOEXEC is intended in the latter caller, but it's unclear from the context.
sys_shm_open() now unconditionally sets O_CLOEXEC to meet POSIX requirements, and a comment has been dropped in to kern_fd_open() to explain the situation and add a pointer to where O_CLOEXEC setting is maintained for shm_open(2) correctness. CloudABI's sys_fd_create1() also unconditionally sets O_CLOEXEC to match previous behavior.
This also has the side-effect of making flags correctly reflect the O_CLOEXEC status on this fd for the rest of kern_shm_open(), but a glance-over leads me to believe that it didn't really matter.
Reviewed by: kib, markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21119
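The division of responsibility described above can be sketched as follows. The functions and the `TOY_O_CLOEXEC` value are hypothetical stand-ins, not the kernel interfaces: the inner routine honors the flags it is given, and each entry point decides whether to force close-on-exec.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value for this sketch; the real O_CLOEXEC comes from <fcntl.h>. */
#define TOY_O_CLOEXEC 0x1000

/* Stand-in for the internal open routine: close-on-exec is set only if requested. */
bool toy_kern_open_is_cloexec(int flags)
{
    return (flags & TOY_O_CLOEXEC) != 0;
}

/* Stand-in for the shm_open(2) entry point: always forces close-on-exec. */
bool toy_sys_shm_open_is_cloexec(int flags)
{
    return toy_kern_open_is_cloexec(flags | TOY_O_CLOEXEC);
}

/* Hypothetical wrapper that leaves the choice to its caller. */
bool toy_wrapper_open_is_cloexec(int flags)
{
    return toy_kern_open_is_cloexec(flags);
}
```

With the flag handled by the callers, a new wrapper can obtain a descriptor without close-on-exec simply by not passing the flag, while the shm_open(2) path keeps its previous behavior.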
|
#
58df81b3 |
| 30-Jul-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead @350426
Sponsored by: The FreeBSD Foundation
|
#
91898857 |
| 29-Jul-2019 |
Mark Johnston <markj@FreeBSD.org> |
Avoid relying on header pollution from sys/refcount.h.
MFC after: 3 days Sponsored by: The FreeBSD Foundation
|
#
a63915c2 |
| 28-Jul-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead @r350386
Sponsored by: The FreeBSD Foundation
|
#
eeacb3b0 |
| 08-Jul-2019 |
Mark Johnston <markj@FreeBSD.org> |
Merge the vm_page hold and wire mechanisms.
The hold_count and wire_count fields of struct vm_page are separate reference counters with similar semantics. The remaining essential differences are that holds are not counted as a reference with respect to LRU, and holds have an implicit free-on-last unhold semantic whereas vm_page_unwire() callers must explicitly determine whether to free the page once the last reference to the page is released.
This change removes the KPIs which directly manipulate hold_count. Functions such as vm_fault_quick_hold_pages() now return wired pages instead. Since r328977 the overhead of maintaining LRU for wired pages is lower, and in many cases vm_fault_quick_hold_pages() callers would swap holds for wirings on the returned pages anyway, so with this change we remove a number of page lock acquisitions.
No functional change is intended. __FreeBSD_version is bumped.
Reviewed by: alc, kib Discussed with: jeff Discussed with: jhb, np (cxgbe) Tested by: pho (previous version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19247
|
Revision tags: release/11.3.0 |
|
#
0269ae4c |
| 06-Jun-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead @348740
Sponsored by: The FreeBSD Foundation
|
#
5c066cd2 |
| 30-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove TODO comment after posixshmcontrol(1) added.
Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
#
56d0e33e |
| 23-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Add a kern.ipc.posix_shm_list sysctl.
The sysctl provides a listing of the named, linked POSIX shared memory segments existing in the system.
Reuse shm_fill_kinfo() for filling individual struct kinfo_file. Remove unneeded lock around reading of shmfd->shm_mode.
Reviewed by: jilles, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20258
|
#
e4b77548 |
| 23-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Report ref count of the backing object as st_nlink for posix shm fd.
Unless there are transient references to the object, the ref count is equal to the number of the shared memory segment mappings plus one.
Reviewed by: jilles, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20258
|
#
2aaf9152 |
| 18-Mar-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead@r345275
|
#
b18a4cca |
| 05-Mar-2019 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r344786
|
#
844fc3e9 |
| 04-Mar-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r344549 through r344775.
|
#
2b64ab22 |
| 28-Feb-2019 |
Mark Johnston <markj@FreeBSD.org> |
Allow FIONBIO and FIOASYNC ioctls on POSIX shm descriptors.
They have no effect, as with filesystem file descriptors. This improves compatibility with some existing userspace code.
Submitted by: Greg V <greg@unrelenting.technology> Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D19330
|
#
cc426dd3 |
| 11-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove unused argument to priv_check_cred.
Patch mostly generated with coccinelle:
@@
expression E1,E2;
@@
- priv_check_cred(E1,E2,0)
+ priv_check_cred(E1,E2)
Sponsored by: The FreeBSD Foundation
|
Revision tags: release/12.0.0 |
|
#
3d5db455 |
| 24-Nov-2018 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r340427 through r340868.
|