#
2ed053cd |
| 06-Aug-2024 |
Jason A. Harmening <jah@FreeBSD.org> |
vfs: Add IGNOREWHITEOUT flag and adopt it in UFS/unionfs
This flag is meant to request that the VOP implementation ignore whiteout entries when processing directory contents.
Employ this flag (initially) in UFS when determining whether a directory is empty for the purpose of deleting it or renaming another directory over it. The previous UFS behavior was to always ignore whiteouts and to therefore always allow directories containing only whiteouts to be deleted or overwritten. This makes sense when the directory in question is being accessed through a unionfs view in which the whiteouts produce a unionfs directory that is logically empty, but it makes less sense when directly operating against the UFS directory in which case silently discarding the whiteouts may produce unexpected behavior in a current or future unionfs view. IGNOREWHITEOUT is therefore treated as opt-in and only specified by unionfs_rmdir() when invoking VOP_RMDIR() against the upper filesystem. IGNOREWHITEOUT is not currently used for unionfs rename operations, as the current implementation of unionfs_rename() simply forbids renaming over any existing upper filesystem directory in the first place.
Differential Revision: https://reviews.freebsd.org/D45987 Reviewed by: olce Tested by: pho
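The opt-in behavior described in this commit can be illustrated with a minimal sketch. It is not the UFS implementation: the entry kinds, the function name sketch_dirempty(), and the boolean parameter standing in for the IGNOREWHITEOUT flag are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry kinds; the real UFS code walks on-disk entries. */
enum sketch_kind { SK_DOT, SK_DOTDOT, SK_WHITEOUT, SK_REGULAR };

/*
 * Return true if the directory may be treated as empty.
 * "." and ".." never count as content. A whiteout counts as content
 * unless the caller opted in to ignoring whiteouts (the role played
 * by IGNOREWHITEOUT in the commit message).
 */
static bool
sketch_dirempty(const enum sketch_kind *ents, size_t n, bool ignore_whiteout)
{
	for (size_t i = 0; i < n; i++) {
		switch (ents[i]) {
		case SK_DOT:
		case SK_DOTDOT:
			break;
		case SK_WHITEOUT:
			if (!ignore_whiteout)
				return (false);
			break;
		default:
			return (false);
		}
	}
	return (true);
}
```

Under this sketch, a directory holding only whiteouts is removable for a caller that opts in (as unionfs_rmdir() does per the message) and non-empty for everyone else.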
|
Revision tags: release/14.1.0, release/13.3.0 |
|
#
35a30155 |
| 03-Dec-2023 |
Kirk McKusick <mckusick@FreeBSD.org> |
Increase UFS/FFS maximum link count from 32767 to 65530.
The link count for a UFS/FFS inode is stored in a signed 16-bit integer. Thus the maximum link count has been 32767.
This limit has been recently hit by the poudriere build system when doing a ports build as it needs one directory per port and the number of ports recently passed 32767.
A long-term solution would be to use one of the spare 32-bit fields in the inode to store the link count. However, the UFS1 format does not have a spare and adding the spare in UFS2 would make it hard to make it compatible when running on older kernels that use the original link count field. So this patch uses the much simpler approach of changing the existing link count field from a signed 16-bit value to an unsigned 16-bit value. It has the fewest lines of code changes. The only thing that changes is the type in the dinode and inode structures and the definition of UFS_LINK_MAX. It has the added benefit that it works with both UFS1 and UFS2.
It allows easy backward compatibility. Indeed it is backward compatibility that is the primary reason to go with this approach. If a filesystem with the new organization is mounted on an older kernel, it still needs to work. Thus if we move the new link count to a new field, we still need to maintain the old link count as best as possible even when running on a kernel that knows about the larger link counts. And we would have to carry this overhead for the indefinite future.
If we have a new link-count field, we will have to add a new filesystem flag to indicate that we are running with larger link counts. We will also need to add one of the new-feature flags to say that we have larger link counts. Older kernels clear the new-feature flags that they do not know about, so when a filesystem is used on an older kernel and then moved back to a newer one, the newer one will know that the new link counts have not been maintained and that it will be necessary to run a full fsck on the filesystem to correct the link counts before it can be mounted.
With this change, older kernels will generally work with the bigger counts. While an older kernel will not itself allow the link count to exceed 32767, it will have no problem working with inodes that have a link count greater than 32767. Since it tests that i_nlink <= UFS_LINK_MAX, counts that are bigger than 32767 will appear negative, so will still pass the test. Of course, if they ever drop below 32767, they will no longer be able to exceed 32767. The one issue is that if the link count ever exceeds 65535, it will wrap to zero and the older kernel will be none the wiser. But this corner case is likely to be very rare since these kernels and the applications running on them do not expect to be able to get link counts over 32767. And over time, the use of new filesystems on older kernels will become rarer and rarer.
Reported-by: Mark Millard running poudriere on the ports tree Reviewed-by: kib, olce.freebsd_certner.fr Tested-by: Peter Holm, Mark Millard MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D42767
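The signed-versus-unsigned argument above can be checked with a small standalone sketch. The names OLD_LINK_MAX, old_kernel_view(), old_kernel_limit_ok() and add_link_unchecked() are hypothetical and only model the arithmetic; they are not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define OLD_LINK_MAX 32767	/* limit known to an older kernel */

/* Reinterpret the 16 stored bits the way an older kernel does: signed. */
static int16_t
old_kernel_view(uint16_t stored)
{
	int16_t v;

	memcpy(&v, &stored, sizeof(v));
	return (v);
}

/* The older kernel's limit test, applied to the signed view. */
static bool
old_kernel_limit_ok(uint16_t stored)
{
	return (old_kernel_view(stored) <= OLD_LINK_MAX);
}

/* Adding a link with no upper-bound check wraps at 16 bits. */
static uint16_t
add_link_unchecked(uint16_t stored)
{
	return ((uint16_t)(stored + 1));
}
```

A stored count of 40000 is seen as a negative number by the signed view and therefore passes the `<= 32767` test, while incrementing 65535 without a check yields 0, which is the rare wraparound case the message mentions.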
|
#
29363fb4 |
| 23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree using automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script.
Sponsored by: Netflix
|
Revision tags: release/14.0.0 |
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
831b1ff7 |
| 28-Jul-2023 |
Kirk McKusick <mckusick@FreeBSD.org> |
UFS/FFS: Migrate to modern uintXX_t from u_intXX_t.
As per https://lists.freebsd.org/archives/freebsd-scsi/2023-July/000257.html move to the modern uintXX_t. While here also migrate u_char to uint8_t. Where other kernel interfaces allow, migrate u_long to uint64_t.
No functional changes intended.
MFC-after: 1 week Sponsored-by: The FreeBSD Foundation
|
Revision tags: release/13.2.0 |
|
#
06976709 |
| 18-Mar-2023 |
Kirk McKusick <mckusick@FreeBSD.org> |
Do not panic in case of corrupted UFS/FFS directory.
Historically the system panicked when it encountered a corrupt directory. This change recovers well enough to continue operations. This change is made in response to a similar change made in the ext2 filesystem as described in the cited Differential Revision.
MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D38503
|
Revision tags: release/12.4.0 |
|
#
5b5b7e2c |
| 17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: always retain path buffer after lookup
This removes some of the complexity needed to maintain HASBUF and allows removing the injection of SAVENAME by filesystems.
Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D36542
|
#
064e6b43 |
| 13-Jul-2022 |
Kirk McKusick <mckusick@FreeBSD.org> |
Rewrite function definitions in the UFS/FFS code base with identifier lists.
The days of K&R-style function definitions in UFS and elsewhere in the tree are numbered, as this syntax is removed in C2x proposal N2432:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2432.pdf
Though running to nearly 6000 lines of diffs this update should cause no functional change to the code.
Requested by: Warner Losh MFC after: 2 weeks
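For readers unfamiliar with the two definition styles, here is a minimal sketch of the kind of mechanical rewrite described. The function sketch_add() is hypothetical and not taken from the UFS sources; the old form is kept in a comment because compilers targeting the newer standard reject it.

```c
/*
 * Old K&R (identifier-list) form of a hypothetical function:
 *
 *	static int
 *	sketch_add(a, b)
 *		int a;
 *		int b;
 *	{
 *		return (a + b);
 *	}
 */

/* The same function as a prototype-style definition. */
static int
sketch_add(int a, int b)
{
	return (a + b);
}
```

Only the parameter declarations move; the body is untouched, which is consistent with the message's claim of no functional change.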
|
Revision tags: release/13.1.0 |
|
#
0cdc6033 |
| 20-Jan-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
ufs: Use IS_SNAPSHOT()
Reviewed by: markj, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34072
|
Revision tags: release/12.3.0 |
|
#
b4a58fbf |
| 01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove cn_thread
It is always curthread.
Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D32453
|
#
8df4bc48 |
| 06-Aug-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
ufs rename: ensure that the result of ufs_checkpath() is stable
ufs_rename() calls ufs_checkpath() to ensure that the target directory is not a child of the source. If it were, the rename would create a loop. For instance, given source->X1->X2->target, moving source under target corrupts the filesystem. Suppose that we initially have source->X1 .... and X2->target, where X1 is not on the path from the root to X2. Then ufs_checkpath() accepts the inodes, but nothing prevents a parallel rename from moving X2 under X1 after ufs_checkpath() has finished.
Ensure stability of ufs_checkpath() result by taking a per-mount sx in ufs_rename right before ufs_checkpath() and till the end.
Reviewed by: chs, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
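The loop check itself can be sketched as an ancestor walk. This is only a model under stated assumptions: struct sketch_dir with an in-memory parent pointer and the function sketch_would_loop() are hypothetical, and the real ufs_checkpath() works on inodes and ".." entries instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory directory node. */
struct sketch_dir {
	struct sketch_dir *parent;	/* NULL at the root */
};

/*
 * Return true if 'source' is 'target' itself or one of its ancestors,
 * in which case moving source under target would create a loop.
 */
static bool
sketch_would_loop(const struct sketch_dir *source,
    const struct sketch_dir *target)
{
	for (const struct sketch_dir *d = target; d != NULL; d = d->parent) {
		if (d == source)
			return (true);
	}
	return (false);
}
```

The answer is only meaningful while no other rename re-parents a node on the walked path, which is why the commit holds a per-mount sx lock from just before the check until the rename completes. The sketch deliberately omits that locking.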
|
#
2e2212b4 |
| 01-Aug-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Style: wrap the long line, definition of ufs_checkpath()
Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
#
f784da88 |
| 18-May-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Move mnt_maxsymlinklen into appropriate fs mount data structures
Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week X-MFC-Note: struct mount layout Differential revision: https://reviews.freebsd.org/D30325
|
Revision tags: release/13.0.0 |
|
#
06f2918a |
| 29-Jan-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
ufs_direnter: directory truncation does not need special case for rename
In ufs_rename case, tdvp is locked from the place where ufs_direnter() is done till VOP_VPUT_PAIR(), which means that we no longer need to specially handle rename in ufs_direnter(). Truncation, if possible, is done in the same way in ffs_vput_pair() both for rename and other VOPs calling ufs_direnter(). Remove isrename argument and set IN_ENDOFF if ufs_direnter() succeeded and directory needs truncation.
In ffs_vput_pair(), stop verifying the condition that directory needs truncation when IN_ENDOFF is set, instead assert that the condition is true.
Suggested by: mckusick Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
|
#
74a3652f |
| 27-Jan-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
ufs_direnter: move directory truncation to ffs_vput_pair().
VOP_VPUT_PAIR() provides the hook to do the truncation right before unlock, which is required since truncation might need to fsync(), which itself might unlock the directory vnode.
Set new flag IN_ENDOFF which indicates that i_endoff is valid and should be checked against inode size. Excessive size is chomped, but this operation is advisory and failure to truncate should not result in the failure of the main VOP.
Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
|
#
08c2dc28 |
| 23-Jan-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
ufs_direnter/SU: unconditionally UFS_UPDATE inode when extending directory
for all kinds of async/SU mount variants.
Submitted by: mckusick Reviewed by: chs Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
|
#
e94f2f1b |
| 28-Jan-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
ffs: call ufsdirhash_dirtrunc() right after setting directory size
Later processing of ffs_truncate() might temporarily unlock the directory vnode, causing unsynchronized dirhash and inode sizes if the update is postponed to UFS_TRUNCATE() callers.
Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
|
#
2c7ada99 |
| 06-Dec-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
ufs: handle two more cases of possible VNON vnode returned from VFS_VGET().
Reported by: kevans Reviewed by: mckusick, mjg Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27457
|
#
8a1509e4 |
| 14-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Handle LoR in flush_pagedep_deps().
When operating in SU or SU+J mode, ffs_syncvnode() might need to instantiate another vnode by inode number while owning the syncing vnode's lock. Typically this other vnode is the parent of our vnode, but due to renames occurring right before fsync (or during fsync when we drop the syncing vnode lock, see below) it might no longer be the parent.
Moreover, the called function flush_pagedep_deps() needs to lock the other vnode while owning the lock for the vnode which owns the buffer for which the dependencies are flushed. This creates another instance of the same LoR as was fixed in softdep_sync().
Put the generic code for safe relocking into new SU helper get_parent_vp() and use it in flush_pagedep_deps(). The case for safe relocking of two vnodes with undefined lock order was extracted into vn helper vn_lock_pair().
Due to the call sequence ffs_syncvnode()->softdep_sync_buf()->flush_pagedep_deps(), ffs_syncvnode() indicates with ERELOOKUP that the passed vnode was unlocked in the process, and can return ENOENT if the passed vnode was reclaimed. All callers of the function were inspected.
Because UFS namei lookups store auxiliary information about the directory entry in the in-memory directory inode, and this information is then used by the UFS code that creates/removes the directory entry in the actual mutating VOPs, it is critical that the directory vnode lock is not dropped between lookup and VOP. For softdep_prelink(), which ensures that a later link/unlink operation can proceed without overflowing the journal, calls were moved to the place where it is safe to drop processing of the VOP because mutations are not yet applied. Then, ERELOOKUP causes restart of the whole VFS operation (typically a VFS syscall) at top level, including the re-lookup of the involved paths. [Note that we already do the same restart for failing calls to vn_start_write(), so formally this patch does not introduce new behavior.]
Similarly, unsafe calls to fsync in snapshot creation code were plugged. A possible view on these failures is that it does not make sense to continue creating snapshot if the snapshot vnode was reclaimed due to forced unmount.
It is possible that the relock/ERELOOKUP situation occurs in ffs_truncate() called from ufs_inactive(). In this case, dropping the vnode lock is not safe. Detect the situation with VI_DOINGINACT and reschedule inactivation by setting VI_OWEINACT. ufs_inactive() rechecks VI_OWEINACT and avoids reclaiming the vnode if truncation failed this way.
In ffs_truncate(), allocation of the EOF block for partial truncation is re-done after vnode is synced, since we cannot leave the buffer locked through ffs_syncvnode().
In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136
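The top-level restart on ERELOOKUP can be pictured with the following sketch. Everything in it is hypothetical: SKETCH_ERELOOKUP is a placeholder value, and sketch_run_with_restart() and sketch_flaky_op() merely model "redo the lookup and the VOP until no relock happened"; the kernel's actual restart logic lives in the syscall paths and is not reproduced here.

```c
/* Placeholder for the kernel-internal error code; the value is assumed. */
#define SKETCH_ERELOOKUP (-5)

typedef int (*sketch_op_t)(void *arg);

/*
 * Run the whole operation (lookup plus mutating VOP) and start over
 * from the top whenever it reports that a vnode lock was dropped.
 */
static int
sketch_run_with_restart(sketch_op_t op, void *arg)
{
	int error;

	do {
		error = op(arg);
	} while (error == SKETCH_ERELOOKUP);
	return (error);
}

/* Example operation: needs a relock '*arg' times before succeeding. */
static int
sketch_flaky_op(void *arg)
{
	int *relocks_left = arg;

	if (*relocks_left > 0) {
		(*relocks_left)--;
		return (SKETCH_ERELOOKUP);
	}
	return (0);
}
```

Restarting from the lookup is what keeps the cached directory-entry information consistent: nothing derived from a lookup is reused after the directory lock has been dropped.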
|
#
61846fc4 |
| 14-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Add a framework that tracks exclusive vnode lock generation count for UFS.
This count is memoized together with the lookup metadata in directory inode, and we assert that accesses to lookup metadata are done under the same lock generation as they were stored. Enabled under DIAGNOSTICS.
UFS saves additional data for parent dirent when doing lookup (i_offset, i_count, i_endoff), and this data is used later by VOPs operating on dirents. If parent vnode exclusive lock is dropped and re-acquired between lookup and the VOP call, we corrupt directories.
Framework asserts that corruption cannot occur that way, by tracking vnode lock generation counter. Updates to inode dirent members also save the counter, while users compare current and saved counters values.
Also, fix a case in ufs_lookup_ino() where i_offset and i_count could be updated under shared lock. It is not a bug on its own since dvp i_offset results from such lookup cannot be used, but it causes false positive in the checker.
In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136
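The generation-count idea can be sketched as follows. The structure and function names are hypothetical, and the real framework asserts inside the kernel rather than returning a boolean; the sketch only shows why a drop and re-acquire of the exclusive lock invalidates saved lookup results.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical directory node carrying cached lookup results. */
struct sketch_dirnode {
	uint64_t lock_gen;	/* bumped on each exclusive lock acquisition */
	uint64_t saved_gen;	/* generation when lookup data was stored */
	long	 offset;	/* stands in for i_offset */
};

/* Acquire the exclusive lock: every acquisition starts a new generation. */
static void
sketch_lock_excl(struct sketch_dirnode *dp)
{
	dp->lock_gen++;
}

/* Lookup stores its result together with the current generation. */
static void
sketch_save_lookup(struct sketch_dirnode *dp, long offset)
{
	dp->offset = offset;
	dp->saved_gen = dp->lock_gen;
}

/* A later VOP may use the result only under the same lock generation. */
static bool
sketch_lookup_valid(const struct sketch_dirnode *dp)
{
	return (dp->saved_gen == dp->lock_gen);
}
```

If the lock is released and taken again between the lookup and the VOP, the generations differ and the stale offset is detected instead of being used to modify the directory.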
|
Revision tags: release/12.2.0, release/11.4.0 |
|
#
52488b51 |
| 05-Jun-2020 |
Kirk McKusick <mckusick@FreeBSD.org> |
Further evaluation of the POSIX spec for fdatasync() shows that it requires that new data on growing files be accessible. Thus, the fdatasync() system call must update the on-disk inode when the size of the file has changed.
This commit adds another inode update flag, IN_SIZEMOD, that gets set any time that the file size changes. If either the IN_IBLKDATA or the IN_SIZEMOD flag is set when fdatasync() is called, the associated inode is synchronously written to disk. We could have overloaded the IN_IBLKDATA flag to also track size changes since the only (current) use case for these flags is fdatasync(), but it does seem useful for possible future uses to separately track the file size changes and the inode block pointer changes.
Reviewed by: kib MFC with: -r361785 Differential revision: https://reviews.freebsd.org/D25072
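The decision rule stated above reduces to a flag test, sketched here. The flag values and the names SK_IN_IBLKDATA, SK_IN_SIZEMOD and sketch_fdatasync_needs_inode_write() are placeholders chosen for illustration, not the kernel's definitions.

```c
#include <stdbool.h>

/* Placeholder bit values; the real flags are defined in the UFS headers. */
#define SK_IN_IBLKDATA	0x1u	/* inode block pointers changed */
#define SK_IN_SIZEMOD	0x2u	/* file size changed */

/*
 * fdatasync() must write the inode synchronously if either the block
 * pointers or the size changed since the last write.
 */
static bool
sketch_fdatasync_needs_inode_write(unsigned int flags)
{
	return ((flags & (SK_IN_IBLKDATA | SK_IN_SIZEMOD)) != 0);
}
```

Keeping two separate bits costs nothing in this test and preserves the distinction between size changes and block pointer changes for possible future users, as the message argues.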
|
#
ac4ec141 |
| 13-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
ufs: add a setter for inode i_flag field
This will be used later to add vnodes to the lazy list.
Reviewed by: kib (previous version), jeff Tested by: pho (in a larger patch) Differential Revision: https://reviews.freebsd.org/D22994
|
#
27a62571 |
| 11-Jan-2020 |
Kirk McKusick <mckusick@FreeBSD.org> |
When a read error occurs while fetching a directory block to delete or rename an entry in it, properly reset the link count of the inode associated with the entry that was to have been changed.
Tested by: Peter Holm MFC after: 7 days
|
#
b249ce48 |
| 03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro.
Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
abd80ddb |
| 08-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce v_irflag and make v_type smaller
The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time.
v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED.
Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715
|