| #
7bc5e747 |
| 23-May-2026 |
Mike Rapoport (Microsoft) <rppt@kernel.org> |
userfaultfd: merge fs/userfaultfd.c into mm/userfaultfd.c
Patch series "userfaultfd: merge fs/userfaultfd.c into mm/userfaultfd.c", v3.
These patches merge fs/userfaultfd.c into mm/userfaultfd.c an
userfaultfd: merge fs/userfaultfd.c into mm/userfaultfd.c
Patch series "userfaultfd: merge fs/userfaultfd.c into mm/userfaultfd.c", v3.
These patches merge fs/userfaultfd.c into mm/userfaultfd.c and make functions used only inside mm/userfaultfd.c static.
This patch (of 2):
Historically userfaultfd implementation has been split between fs/userfaultfd.c and mm/userfaultfd.c.
The mm/ part implemented memory management operations, while the fs/ part implemented file descriptor handling and called into the mm/ part for the actual memory management work.
This separation is quite artificial and fs/userfaultfd.c does not seem to belong to fs/ because it's only a user if vfs APIs and like for other users, for example, memfd and secretmem, the file descriptor handling could live in mm/ as well.
"Append" fs/userfaultfd.c to mm/userfaultfd and update fs/Makefile and MAINTAINERS accordingly.
No intended functional changes.
Link: https://lore.kernel.org/20260523173759.3964908-1-rppt@kernel.org Link: https://lore.kernel.org/20260523173759.3964908-2-rppt@kernel.org Assisted-by: Copilot:claude-opus-4-6 Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Christian Brauner (Amutable) <brauner@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Hildenbrand <david@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
47503f98 |
| 02-Feb-2026 |
Namjae Jeon <linkinjeon@kernel.org> |
ntfs: add Kconfig and Makefile
Introduce Kconfig and Makefile for remade ntfs. And this patch make ntfs and ntfs3 mutually exclusive so only one can be built-in(y), while both can still be built as
ntfs: add Kconfig and Makefile
Introduce Kconfig and Makefile for remade ntfs. And this patch make ntfs and ntfs3 mutually exclusive so only one can be built-in(y), while both can still be built as modules(m).
Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
show more ...
|
| #
c84bb79f |
| 09-Feb-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs nullfs update from Christian Brauner: "Add a completely catatonic minimal pseudo filesystem called "
Merge tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs nullfs update from Christian Brauner: "Add a completely catatonic minimal pseudo filesystem called "nullfs" and make pivot_root() work in the initramfs.
Currently pivot_root() does not work on the real rootfs because it cannot be unmounted. Userspace has to recursively delete initramfs contents manually before continuing boot, using the fragile switch_root sequence (overmount + chroot).
Add nullfs, a minimal immutable filesystem that serves as the true root of the mount hierarchy. The mutable rootfs (tmpfs/ramfs) is mounted on top of it. This allows userspace to simply:
chdir(new_root); pivot_root(".", "."); umount2(".", MNT_DETACH);
without the traditional switch_root workarounds. systemd already handles this correctly. It tries pivot_root() first and falls back to MS_MOVE only when that fails.
This also means rootfs mounts in unprivileged namespaces no longer need MNT_LOCKED, since the immutable nullfs guarantees nothing can be revealed by unmounting the covering mount.
nullfs is a single-instance filesystem (get_tree_single()) marked SB_NOUSER | SB_I_NOEXEC | SB_I_NODEV with an immutable empty root directory. This means sooner or later it can be used to overmount other directories to hide their contents without any additional protection needed.
We enable it unconditionally. If we see any real regression we'll hide it behind a boot option.
nullfs has extensions beyond this in the future. It will serve as a concept to support the creation of completely empty mount namespaces - which is work coming up in the next cycle"
* tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: use nullfs unconditionally as the real rootfs docs: mention nullfs fs: add immutable rootfs fs: add init_pivot_root() fs: ensure that internal tmpfs mount gets mount id zero
show more ...
|
| #
21945e6c |
| 13-Jan-2026 |
Darrick J. Wong <djwong@kernel.org> |
fs: report filesystem and file I/O errors to fsnotify
Create some wrapper code around struct super_block so that filesystems have a standard way to queue filesystem metadata and file I/O error repor
fs: report filesystem and file I/O errors to fsnotify
Create some wrapper code around struct super_block so that filesystems have a standard way to queue filesystem metadata and file I/O error reports to have them sent to fsnotify.
If a filesystem wants to provide an error number, it must supply only negative error numbers. These are stored internally as negative numbers, but they are converted to positive error numbers before being passed to fanotify, per the fanotify(7) manpage. Implementations of super_operations::report_error are passed the raw internal event data.
Note that we have to play some shenanigans with mempools and queue_work so that the error handling doesn't happen outside of process context, and the event handler functions (both ->report_error and fsnotify) can handle file I/O error messages without having to worry about whatever locks might be held. This asynchronicity requires that unmount wait for pending events to clear.
Add a new callback to the superblock operations structure so that filesystem drivers can themselves respond to file I/O errors if they so desire. This will be used for an upcoming self-healing patchset for XFS.
Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://patch.msgid.link/176826402610.3490369.4378391061533403171.stgit@frogsfrogsfrogs Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
576ee5df |
| 12-Jan-2026 |
Christian Brauner <brauner@kernel.org> |
fs: add immutable rootfs
Currently pivot_root() doesn't work on the real rootfs because it cannot be unmounted. Userspace has to do a recursive removal of the initramfs contents manually before cont
fs: add immutable rootfs
Currently pivot_root() doesn't work on the real rootfs because it cannot be unmounted. Userspace has to do a recursive removal of the initramfs contents manually before continuing the boot.
Really all we want from the real rootfs is to serve as the parent mount for anything that is actually useful such as the tmpfs or ramfs for initramfs unpacking or the rootfs itself. There's no need for the real rootfs to actually be anything meaningful or useful. Add a immutable rootfs called "nullfs" that can be selected via the "nullfs_rootfs" kernel command line option.
The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs and fire up userspace which mounts the rootfs and can then just do:
chdir(rootfs); pivot_root(".", "."); umount2(".", MNT_DETACH);
and be done with it. (Ofc, userspace can also choose to retain the initramfs contents by using something like pivot_root(".", "/initramfs") without unmounting it.)
Technically this also means that the rootfs mount in unprivileged namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed that the immutable rootfs remains permanently empty so there cannot be anything revealed by unmounting the covering mount.
In the future this will also allow us to create completely empty mount namespaces without risking to leak anything.
systemd already handles this all correctly as it tries to pivot_root() first and falls back to MS_MOVE only when that fails.
This goes back to various discussion in previous years and a LPC 2024 presentation about this very topic.
Link: https://patch.msgid.link/20260112-work-immutable-rootfs-v2-3-88dd1c34a204@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
0485a18d |
| 04-Nov-2025 |
Christian Brauner <brauner@kernel.org> |
fs: rename fs_types.h to fs_dirent.h
We will split out a bunch of types into a separate header. So free up the appropriate name for it.
Link: https://patch.msgid.link/20251104-work-fs-header-v1-1-f
fs: rename fs_types.h to fs_dirent.h
We will split out a bunch of types into a separate header. So free up the appropriate name for it.
Link: https://patch.msgid.link/20251104-work-fs-header-v1-1-fb39a2efe39e@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
f2c61db2 |
| 29-Sep-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Remove bcachefs core code
bcachefs was marked 'externally maintained' in 6.17 but the code remained to make the transition smoother.
It's now a DKMS module, making the in-kernel code stale, so remo
Remove bcachefs core code
bcachefs was marked 'externally maintained' in 6.17 but the code remained to make the transition smoother.
It's now a DKMS module, making the in-kernel code stale, so remove it to avoid any version confusion.
Link: https://lore.kernel.org/linux-bcachefs/yokpt2d2g2lluyomtqrdvmkl3amv3kgnipmenobkpgx537kay7@xgcgjviv3n7x/T/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
| #
2f952c9e |
| 30-Jun-2025 |
Andrey Albershteyn <aalbersh@kernel.org> |
fs: split fileattr related helpers into separate file
This patch moves function related to file extended attributes manipulations to separate file. Refactoring only.
Signed-off-by: Andrey Albershte
fs: split fileattr related helpers into separate file
This patch moves function related to file extended attributes manipulations to separate file. Refactoring only.
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-1-c4e3bc35227b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
bff70402 |
| 15-May-2025 |
James Morse <james.morse@arm.com> |
fs/resctrl: Add boiler plate for external resctrl code
Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL for the common parts of the resctrl interface and make X86_CPU_RESCTRL select
fs/resctrl: Add boiler plate for external resctrl code
Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL for the common parts of the resctrl interface and make X86_CPU_RESCTRL select this.
Adding an include of asm/resctrl.h to linux/resctrl.h allows the /fs/resctrl files to switch over to using this header instead.
Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-16-james.morse@arm.com
show more ...
|
| #
448fa701 |
| 20-Feb-2025 |
Jan Kara <jack@suse.cz> |
sysv: Remove the filesystem
Since 2002 (change "Replace BKL for chain locking with sysvfs-private rwlock") the sysv filesystem was doing IO under a rwlock in its get_block() function (yes, a non-sle
sysv: Remove the filesystem
Since 2002 (change "Replace BKL for chain locking with sysvfs-private rwlock") the sysv filesystem was doing IO under a rwlock in its get_block() function (yes, a non-sleepable lock hold over a function used to read inode metadata for all reads and writes). Nobody noticed until syzbot in 2023 [1]. This shows nobody is using the filesystem. Just drop it.
[1] https://lore.kernel.org/all/0000000000000ccf9a05ee84f5b0@google.com/
Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20250220163940.10155-2-jack@suse.cz Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
fb6f20ec |
| 17-Oct-2024 |
Jan Kara <jack@suse.cz> |
reiserfs: The last commit
Deprecation period of reiserfs ends with the end of this year so it is time to remove it from the kernel.
Acked-by: Darrick J. Wong <djwong@kernel.org> Acked-by: Christian
reiserfs: The last commit
Deprecation period of reiserfs ends with the end of this year so it is time to remove it from the kernel.
Acked-by: Darrick J. Wong <djwong@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
show more ...
|
| #
d08e2045 |
| 31-Jul-2024 |
Matt Bobrowski <mattbobrowski@google.com> |
bpf: introduce new VFS based BPF kfuncs
Add a new variant of bpf_d_path() named bpf_path_d_path() which takes the form of a BPF kfunc and enforces KF_TRUSTED_ARGS semantics onto its arguments.
This
bpf: introduce new VFS based BPF kfuncs
Add a new variant of bpf_d_path() named bpf_path_d_path() which takes the form of a BPF kfunc and enforces KF_TRUSTED_ARGS semantics onto its arguments.
This new d_path() based BPF kfunc variant is intended to address the legacy bpf_d_path() BPF helper's susceptability to memory corruption issues [0, 1, 2] by ensuring to only operate on supplied arguments which are deemed trusted by the BPF verifier. Typically, this means that only pointers to a struct path which have been referenced counted may be supplied.
In addition to the new bpf_path_d_path() BPF kfunc, we also add a KF_ACQUIRE based BPF kfunc bpf_get_task_exe_file() and KF_RELEASE counterpart BPF kfunc bpf_put_file(). This is so that the new bpf_path_d_path() BPF kfunc can be used more flexibily from within the context of a BPF LSM program. It's rather common to ascertain the backing executable file for the calling process by performing the following walk current->mm->exe_file while instrumenting a given operation from the context of the BPF LSM program. However, walking current->mm->exe_file directly is never deemed to be OK, and doing so from both inside and outside of BPF LSM program context should be considered as a bug. Using bpf_get_task_exe_file() and in turn bpf_put_file() will allow BPF LSM programs to reliably get and put references to current->mm->exe_file.
As of now, all the newly introduced BPF kfuncs within this patch are limited to BPF LSM program types. These can be either sleepable or non-sleepable variants of BPF LSM program types.
[0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/ [1] https://lore.kernel.org/bpf/20230606181714.532998-1-jolsa@kernel.org/ [2] https://lore.kernel.org/bpf/20220219113744.1852259-1-memxor@gmail.com/
Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240731110833.1834742-2-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
| #
b5683a37 |
| 11-Mar-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull pdfd updates from Christian Brauner:
- Until now pidfds could only be created for thread-group leaders but
Merge tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull pdfd updates from Christian Brauner:
- Until now pidfds could only be created for thread-group leaders but not for threads. There was no technical reason for this. We simply had no users that needed support for this. Now we do have users that need support for this.
This introduces a new PIDFD_THREAD flag for pidfd_open(). If that flag is set pidfd_open() creates a pidfd that refers to a specific thread.
In addition, we now allow clone() and clone3() to be called with CLONE_PIDFD | CLONE_THREAD which wasn't possible before.
A pidfd that refers to an individual thread differs from a pidfd that refers to a thread-group leader:
(1) Pidfds are pollable. A task may poll a pidfd and get notified when the task has exited.
For thread-group leader pidfds the polling task is woken if the thread-group is empty. In other words, if the thread-group leader task exits when there are still threads alive in its thread-group the polling task will not be woken when the thread-group leader exits but rather when the last thread in the thread-group exits.
For thread-specific pidfds the polling task is woken if the thread exits.
(2) Passing a thread-group leader pidfd to pidfd_send_signal() will generate thread-group directed signals like kill(2) does.
Passing a thread-specific pidfd to pidfd_send_signal() will generate thread-specific signals like tgkill(2) does.
The default scope of the signal is thus determined by the type of the pidfd.
Since use-cases exist where the default scope of the provided pidfd needs to be overriden the following flags are added to pidfd_send_signal():
- PIDFD_SIGNAL_THREAD Send a thread-specific signal.
- PIDFD_SIGNAL_THREAD_GROUP Send a thread-group directed signal.
- PIDFD_SIGNAL_PROCESS_GROUP Send a process-group directed signal.
The scope change will only work if the struct pid is actually used for this scope.
For example, in order to send a thread-group directed signal the provided pidfd must be used as a thread-group leader and similarly for PIDFD_SIGNAL_PROCESS_GROUP the struct pid must be used as a process group leader.
- Move pidfds from the anonymous inode infrastructure to a tiny pseudo filesystem. This will unblock further work that we weren't able to do simply because of the very justified limitations of anonymous inodes. Moving pidfds to a tiny pseudo filesystem allows for statx on pidfds to become useful for the first time. They can now be compared by inode number which are unique for the system lifetime.
Instead of stashing struct pid in file->private_data we can now stash it in inode->i_private. This makes it possible to introduce concepts that operate on a process once all file descriptors have been closed. A concrete example is kill-on-last-close. Another side-effect is that file->private_data is now freed up for per-file options for pidfds.
Now, each struct pid will refer to a different inode but the same struct pid will refer to the same inode if it's opened multiple times. In contrast to now where each struct pid refers to the same inode.
The tiny pseudo filesystem is not visible anywhere in userspace exactly like e.g., pipefs and sockfs. There's no lookup, there's no complex inode operations, nothing. Dentries and inodes are always deleted when the last pidfd is closed.
We allocate a new inode and dentry for each struct pid and we reuse that inode and dentry for all pidfds that refer to the same struct pid. The code is entirely optional and fairly small. If it's not selected we fallback to anonymous inodes. Heavily inspired by nsfs.
The dentry and inode allocation mechanism is moved into generic infrastructure that is now shared between nsfs and pidfs. The path_from_stashed() helper must be provided with a stashing location, an inode number, a mount, and the private data that is supposed to be used and it will provide a path that can be passed to dentry_open().
The helper will try retrieve an existing dentry from the provided stashing location. If a valid dentry is found it is reused. If not a new one is allocated and we try to stash it in the provided location. If this fails we retry until we either find an existing dentry or the newly allocated dentry could be stashed. Subsequent openers of the same namespace or task are then able to reuse it.
- Currently it is only possible to get notified when a task has exited, i.e., become a zombie and userspace gets notified with EPOLLIN. We now also support waiting until the task has been reaped, notifying userspace with EPOLLHUP.
- Ensure that ESRCH is reported for getfd if a task is exiting instead of the confusing EBADF.
- Various smaller cleanups to pidfd functions.
* tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) libfs: improve path_from_stashed() libfs: add stashed_dentry_prune() libfs: improve path_from_stashed() helper pidfs: convert to path_from_stashed() helper nsfs: convert to path_from_stashed() helper libfs: add path_from_stashed() pidfd: add pidfs pidfd: move struct pidfd_fops pidfd: allow to override signal scope in pidfd_send_signal() pidfd: change pidfd_send_signal() to respect PIDFD_THREAD signal: fill in si_code in prepare_kill_siginfo() selftests: add ESRCH tests for pidfd_getfd() pidfd: getfd should always report ESRCH if a task is exiting pidfd: clone: allow CLONE_THREAD | CLONE_PIDFD together pidfd: exit: kill the no longer used thread_group_exited() pidfd: change do_notify_pidfd() to use __wake_up(poll_to_key(EPOLLIN)) pid: kill the obsolete PIDTYPE_PID code in transfer_pid() pidfd: kill the no longer needed do_notify_pidfd() in de_thread() pidfd_poll: report POLLHUP when pid_task() == NULL pidfd: implement PIDFD_THREAD flag for pidfd_open() ...
show more ...
|
| #
50f4f2d1 |
| 12-Feb-2024 |
Christian Brauner <brauner@kernel.org> |
pidfd: move struct pidfd_fops
Move the pidfd file operations over to their own file in preparation of implementing pidfs and to isolate them from other mostly unrelated functionality in other files.
pidfd: move struct pidfd_fops
Move the pidfd file operations over to their own file in preparation of implementing pidfs and to isolate them from other mostly unrelated functionality in other files.
Link: https://lore.kernel.org/r/20240213-vfs-pidfd_fs-v1-1-f863f58cfce1@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
7ffa8f3d |
| 15-Jan-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
fs: Remove NTFS classic
The replacement, NTFS3, was merged over two years ago. It is now time to remove the original from the tree as it is the last user of several APIs, and it is not worth changi
fs: Remove NTFS classic
The replacement, NTFS3, was merged over two years ago. It is now time to remove the original from the tree as it is the last user of several APIs, and it is not worth changing.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20240115072025.2071931-1-willy@infradead.org Acked-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Dave Chinner <david@fromorbit.com> Cc: Anton Altaparmakov <anton@tuxera.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
16df6e07 |
| 19-Jan-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull netfs updates from Christian Brauner: "This extends the netfs helper library that network filesystems can use
Merge tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull netfs updates from Christian Brauner: "This extends the netfs helper library that network filesystems can use to replace their own implementations. Both afs and 9p are ported. cifs is ready as well but the patches are way bigger and will be routed separately once this is merged. That will remove lots of code as well.
The overal goal is to get high-level I/O and knowledge of the page cache and ouf of the filesystem drivers. This includes knowledge about the existence of pages and folios
The pull request converts afs and 9p. This removes about 800 lines of code from afs and 300 from 9p. For 9p it is now possible to do writes in larger than a page chunks. Additionally, multipage folio support can be turned on for 9p. Separate patches exist for cifs removing another 2000+ lines. I've included detailed information in the individual pulls I took.
Summary:
- Add NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to prevent these from happening at the same time.
- Support for direct and unbuffered I/O.
- Support for write-through caching in the page cache.
- O_*SYNC and RWF_*SYNC writes use write-through rather than writing to the page cache and then flushing afterwards.
- Support for write-streaming.
- Support for write grouping.
- Skip reads for which the server could only return zeros or EOF.
- The fscache module is now part of the netfs library and the corresponding maintainer entry is updated.
- Some helpers from the fscache subsystem are renamed to mark them as belonging to the netfs library.
- Follow-up fixes for the netfs library.
- Follow-up fixes for the 9p conversion"
* tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (50 commits) netfs: Fix wrong #ifdef hiding wait cachefiles: Fix signed/unsigned mixup netfs: Fix the loop that unmarks folios after writing to the cache netfs: Fix interaction between write-streaming and cachefiles culling netfs: Count DIO writes netfs: Mark netfs_unbuffered_write_iter_locked() static netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs" netfs: Rearrange netfs_io_subrequest to put request pointer first 9p: Use length of data written to the server in preference to error 9p: Do a couple of cleanups 9p: Fix initialisation of netfs_inode for 9p cachefiles: Fix __cachefiles_prepare_write() 9p: Use netfslib read/write_iter afs: Use the netfs write helpers netfs: Export the netfs_sreq tracepoint netfs: Optimise away reads above the point at which there can be no data netfs: Implement a write-through caching option netfs: Provide a launder_folio implementation netfs: Provide a writepages implementation netfs, cachefiles: Pass upper bound length to allow expansion ...
show more ...
|
| #
47757ea8 |
| 20-Nov-2023 |
David Howells <dhowells@redhat.com> |
netfs, fscache: Move fs/fscache/* into fs/netfs/
There's a problem with dependencies between netfslib and fscache as each wants to access some functions of the other. Deal with this by moving fs/fs
netfs, fscache: Move fs/fscache/* into fs/netfs/
There's a problem with dependencies between netfslib and fscache as each wants to access some functions of the other. Deal with this by moving fs/fscache/* into fs/netfs/ and renaming those files to begin with "fscache-".
For the moment, the moved files are changed as little as possible and an fscache module is still built. A subsequent patch will integrate them.
Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Christian Brauner <christian@brauner.io> cc: linux-fsdevel@vger.kernel.org cc: linux-cachefs@redhat.com
show more ...
|
| #
f91a704f |
| 02-Oct-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: prepare for stackable filesystems backing file helpers
In preparation for factoring out some backing file io helpers from overlayfs, move backing_file_open() into a new file fs/backing-file.c an
fs: prepare for stackable filesystems backing file helpers
In preparation for factoring out some backing file io helpers from overlayfs, move backing_file_open() into a new file fs/backing-file.c and header.
Add a MAINTAINERS entry for stackable filesystems and add a Kconfig FS_STACK which stackable filesystems need to select.
For now, the backing_file struct, the backing_file alloc/free functions and the backing_file_real_path() accessor remain internal to file_table.c. We may change that in the future.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
show more ...
|
| #
1c6fdbd8 |
| 17-Mar-2017 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Initial commit
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want.
Website: https://bcachefs.org
Signed-off-by
bcachefs: Initial commit
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want.
Website: https://bcachefs.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
show more ...
|
| #
925c86a1 |
| 01-Aug-2023 |
Christoph Hellwig <hch@lst.de> |
fs: add CONFIG_BUFFER_HEAD
Add a new config option that controls building the buffer_head code, and select it from all file systems and stacking drivers that need it.
For the block device nodes and
fs: add CONFIG_BUFFER_HEAD
Add a new config option that controls building the buffer_head code, and select it from all file systems and stacking drivers that need it.
For the block device nodes and alternative iomap based buffered I/O path is provided when buffer_head support is not enabled, and iomap needs a a small tweak to define the IOMAP_F_BUFFER_HEAD flag to 0 to not call into the buffer_head code when it doesn't exist.
Otherwise this is just Kconfig and ifdef changes.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20230801172201.1923299-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
| #
a0433f8c |
| 26-Jun-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull request via Keith: - Various cleanups all around (Irvin, Chaitanya, Christop
Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull request via Keith: - Various cleanups all around (Irvin, Chaitanya, Christophe) - Better struct packing (Christophe JAILLET) - Reduce controller error logs for optional commands (Keith) - Support for >=64KiB block sizes (Daniel Gomez) - Fabrics fixes and code organization (Max, Chaitanya, Daniel Wagner)
- bcache updates via Coly: - Fix a race at init time (Mingzhe Zou) - Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye)
- use page pinning in the block layer for dio (David)
- convert old block dio code to page pinning (David, Christoph)
- cleanups for pktcdvd (Andy)
- cleanups for rnbd (Guoqing)
- use the unchecked __bio_add_page() for the initial single page additions (Johannes)
- fix overflows in the Amiga partition handling code (Michael)
- improve mq-deadline zoned device support (Bart)
- keep passthrough requests out of the IO schedulers (Christoph, Ming)
- improve support for flush requests, making them less special to deal with (Christoph)
- add bdev holder ops and shutdown methods (Christoph)
- fix the name_to_dev_t() situation and use cases (Christoph)
- decouple the block open flags from fmode_t (Christoph)
- ublk updates and cleanups, including adding user copy support (Ming)
- BFQ sanity checking (Bart)
- convert brd from radix to xarray (Pankaj)
- constify various structures (Thomas, Ivan)
- more fine grained persistent reservation ioctl capability checks (Jingbo)
- misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan, Jordy, Li, Min, Yu, Zhong, Waiman)
* tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits) scsi/sg: don't grab scsi host module reference ext4: Fix warning in blkdev_put() block: don't return -EINVAL for not found names in devt_from_devname cdrom: Fix spectre-v1 gadget block: Improve kernel-doc headers blk-mq: don't insert passthrough request into sw queue bsg: make bsg_class a static const structure ublk: make ublk_chr_class a static const structure aoe: make aoe_class a static const structure block/rnbd: make all 'class' structures const block: fix the exclusive open mask in disk_scan_partitions block: add overflow checks for Amiga partition support block: change all __u32 annotations to __be32 in affs_hardblocks.h block: fix signed int overflow in Amiga partition support block: add capacity validation in bdev_add_partition() block: fine-granular CAP_SYS_ADMIN for Persistent Reservation block: disallow Persistent Reservation on partitions reiserfs: fix blkdev_put() warning from release_journal_dev() block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions() block: document the holder argument to blkdev_get_by_path ...
show more ...
|
| #
38c8a9a5 |
| 22-May-2023 |
Steve French <stfrench@microsoft.com> |
smb: move client and server files to common directory fs/smb
Move CIFS/SMB3 related client and server files (cifs.ko and ksmbd.ko and helper modules) to new fs/smb subdirectory:
fs/cifs --> fs/s
smb: move client and server files to common directory fs/smb
Move CIFS/SMB3 related client and server files (cifs.ko and ksmbd.ko and helper modules) to new fs/smb subdirectory:
fs/cifs --> fs/smb/client fs/ksmbd --> fs/smb/server fs/smbfs_common --> fs/smb/common
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
show more ...
|
| #
bda2795a |
| 08-May-2023 |
Christoph Hellwig <hch@lst.de> |
fs: remove the special !CONFIG_BLOCK def_blk_fops
def_blk_fops always returns -ENODEV, which dosn't match the return value of a non-existing block device with CONFIG_BLOCK, which is -ENXIO. Just rem
fs: remove the special !CONFIG_BLOCK def_blk_fops
def_blk_fops always returns -ENODEV, which dosn't match the return value of a non-existing block device with CONFIG_BLOCK, which is -ENXIO. Just remove the extra implementation and fall back to the default no_open_fops that always returns -ENXIO.
Fixes: 9361401eb761 ("[PATCH] BLOCK: Make it possible to disable the block layer [try #6]") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230508144405.41792-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
| #
342528ff |
| 04-May-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'uml-for-linus-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull uml updates from Richard Weinberger:
- Make stub data pages configurable
- Make it harder to mix
Merge tag 'uml-for-linus-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull uml updates from Richard Weinberger:
- Make stub data pages configurable
- Make it harder to mix user and kernel code by accident
* tag 'uml-for-linus-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: make stub data pages size tweakable um: prevent user code in modules um: further clean up user_syms um: don't export printf() um: hostfs: define our own API boundary um: add __weak for exported functions
show more ...
|
| #
8c617450 |
| 10-Feb-2023 |
Johannes Berg <johannes.berg@intel.com> |
um: hostfs: define our own API boundary
Instead of exporting the set of functions provided by glibc that are needed for hostfs_user.c, just build that into the kernel image whenever hostfs is built,
um: hostfs: define our own API boundary
Instead of exporting the set of functions provided by glibc that are needed for hostfs_user.c, just build that into the kernel image whenever hostfs is built, and then export _those_ functions cleanly, to be independent of the libc implementation.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
show more ...
|