Revision tags: v6.2-rc5, v6.2-rc4, v6.2-rc3 |
|
#
43592c46 |
| 04-Jan-2023 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
zonefs: Cache zone group directory inodes
Since looking up any zone file inode requires looking up first the inode for the directory representing the zone group of the file, ensuring that the zone g
zonefs: Cache zone group directory inodes
Since looking up any zone file inode requires looking up first the inode for the directory representing the zone group of the file, ensuring that the zone group inodes are always cached is desired. To do so, take an extra reference on the zone groups directory inodes on mount, thus avoiding the eviction of these inodes from the inode cache until the volume is unmounted.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
show more ...
|
Revision tags: v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8 |
|
#
d207794a |
| 30-Nov-2022 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
zonefs: Dynamically create file inodes when needed
Allocating and initializing all inodes and dentries for all files results in a very large memory usage with high capacity zoned block devices. For
zonefs: Dynamically create file inodes when needed
Allocating and initializing all inodes and dentries for all files results in a very large memory usage with high capacity zoned block devices. For instance, with a 26 TB SMR HDD with over 96000 zones, mounting the disk with zonefs results in about 130 MB of memory used, the vast majority of this space being used for vfs inodes and dentries.
However, since a user will rarely access all zones at the same time, dynamically creating file inodes and dentries on demand, similarly to regular file systems, can significantly reduce memory usage.
This patch modifies mount processing to not create the inodes and dentries for zone files. Instead, the directory inode operation zonefs_lookup() and directory file operation zonefs_readdir() are introduced to allocate and initialize inodes on-demand using the helper functions zonefs_get_dir_inode() and zonefs_get_zgroup_inode().
Implementation of these functions is simple, relying on the static nature of zonefs directories and files. Directory inodes are linked to the volume zone groups (struct zonefs_zone_group) they represent by using the directory inode i_private field. This simplifies the implementation of the lookup and readdir operations.
Unreferenced zone file inodes can be evicted from the inode cache at any time. In such case, the only inode information that cannot be recreated from the zone information that is saved in the zone group data structures attached to the volume super block is the inode uid, gid and access rights. These values may have been changed by the user. To keep these attributes for the life time of the mount, as before, the inode mode, uid and gid are saved in the inode zone information and the saved values are used to initialize regular file inodes when an inode lookup happens. The zone information mode, uid and gid are initialized in zonefs_init_zgroup() using the default values.
With these changes, the static minimal memory usage of a zonefs volume is mostly reduced to the array of zone information for each zone group. For the 26 TB SMR hard-disk mentioned above, the memory usage after mount becomes about 5.4 MB, a reduction by a factor of 24 from the initial 130 MB memory use.
Co-developed-by: Jorgen Hansen <Jorgen.Hansen@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
show more ...
|
Revision tags: v6.1-rc7, v6.1-rc6 |
|
#
aa7f243f |
| 16-Nov-2022 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
zonefs: Separate zone information from inode information
In preparation for adding dynamic inode allocation, separate an inode zone information from the zonefs inode structure. The new data structur
zonefs: Separate zone information from inode information
In preparation for adding dynamic inode allocation, separate an inode zone information from the zonefs inode structure. The new data structure zonefs_zone is introduced to store in memory information about a zone that must be kept throughout the lifetime of the device mount.
Linking between a zone file inode and its zone information is done by setting the inode i_private field to point to a struct zonefs_zone. Using the i_private pointer avoids the need for adding a pointer in struct zonefs_inode_info. Beside the vfs inode, this structure is reduced to a mutex and a write open counter.
One struct zonefs_zone is created per file inode on mount. These structures are organized in an array using the new struct zonefs_zone_group data structure to represent zone groups. The zonefs_zone arrays are indexed per file number (the index of a struct zonefs_zone in its array directly gives the file number/name for that zone file inode).
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
show more ...
|
#
34422914 |
| 24-Nov-2022 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
zonefs: Reduce struct zonefs_inode_info size
Instead of using the i_ztype field in struct zonefs_inode_info to indicate the zone type of an inode, introduce the new inode flag ZONEFS_ZONE_CNV to be
zonefs: Reduce struct zonefs_inode_info size
Instead of using the i_ztype field in struct zonefs_inode_info to indicate the zone type of an inode, introduce the new inode flag ZONEFS_ZONE_CNV to be set in the i_flags field of struct zonefs_inode_info to identify conventional zones. If this flag is not set, the zone of an inode is considered to be a sequential zone.
The helpers zonefs_zone_is_cnv(), zonefs_zone_is_seq(), zonefs_inode_is_cnv() and zonefs_inode_is_seq() are introduced to simplify testing the zone type of a struct zonefs_inode_info and of a struct inode.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
show more ...
|
#
46a9c526 |
| 25-Nov-2022 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
zonefs: Simplify IO error handling
Simplify zonefs_check_zone_condition() by moving the code that changes an inode access rights to the new function zonefs_inode_update_mode(). Furthermore, since on
zonefs: Simplify IO error handling
Simplify zonefs_check_zone_condition() by moving the code that changes an inode access rights to the new function zonefs_inode_update_mode(). Furthermore, since on mount an inode wpoffset is always zero when zonefs_check_zone_condition() is called during an inode initialization, the "mount" boolean argument is not necessary for the readonly zone case. This argument is thus removed.
zonefs_io_error_cb() is also modified to use the inode offline and zone state flags instead of checking the device zone condition. The multiple calls to zonefs_check_zone_condition() are reduced to the first call on entry, which allows removing the "warn" argument. zonefs_inode_update_mode() is also used to update an inode access rights as zonefs_io_error_cb() modifies the inode flags depending on the volume error handling mode (defined with a mount option). Since an inode mode change differs for read-only zones between mount time and IO error time, the flag ZONEFS_ZONE_INIT_MODE is used to differentiate both cases.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
show more ...
|
#
4008e2a0 |
| 25-Nov-2022 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
zonefs: Reorganize code
Move all code related to zone file operations from super.c to the new file.c file. Inode and zone management code remains in super.c.
Signed-off-by: Damien Le Moal <damien.l
zonefs: Reorganize code
Move all code related to zone file operations from super.c to the new file.c file. Inode and zone management code remains in super.c.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
show more ...
|
#
f89fd043 |
| 22-Jan-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 6.2-rc5 into driver-core-next
We need the driver core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
7a6aa989 |
| 22-Jan-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 6.2-rc5 into tty-next
We need the serial/tty changes into this branch as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
99ba2ad1 |
| 22-Jan-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 6.2-rc5 into char-misc-next
We need the char/misc driver fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
6f849817 |
| 19-Jan-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging into drm-misc-next to get DRM accelerator infrastructure, which is required by ipuv driver.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
f861646a |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
quota: port to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversio
quota: port to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
show more ...
|
#
f2d40141 |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port inode_init_owner() to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is ju
fs: port inode_init_owner() to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
show more ...
|
#
c1632a0f |
| 13-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->setattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just
fs: port ->setattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
show more ...
|
#
407da561 |
| 10-Jan-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.2-rc3' into next
Merge with mainline to bring in timer_shutdown_sync() API.
|
#
0d8eae7b |
| 02-Jan-2023 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync up with v6.2-rc1.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
b501d4dc |
| 30-Dec-2022 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Sync after v6.2-rc1 landed in drm-next.
We need to get some dependencies in place before we can merge the fixes series from Gwan-gyeong and Chris.
Referen
Merge drm/drm-next into drm-intel-gt-next
Sync after v6.2-rc1 landed in drm-next.
We need to get some dependencies in place before we can merge the fixes series from Gwan-gyeong and Chris.
References: https://lore.kernel.org/all/Y6x5JCDnh2rvh4lA@intel.com/ Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
show more ...
|
#
6599e683 |
| 28-Dec-2022 |
Mauro Carvalho Chehab <mchehab@kernel.org> |
Merge tag 'v6.2-rc1' into media_tree
Linux 6.2-rc1
* tag 'v6.2-rc1': (14398 commits) Linux 6.2-rc1 treewide: Convert del_timer*() to timer_shutdown*() pstore: Properly assign mem_type propert
Merge tag 'v6.2-rc1' into media_tree
Linux 6.2-rc1
* tag 'v6.2-rc1': (14398 commits) Linux 6.2-rc1 treewide: Convert del_timer*() to timer_shutdown*() pstore: Properly assign mem_type property pstore: Make sure CONFIG_PSTORE_PMSG selects CONFIG_RT_MUTEXES cfi: Fix CFI failure with KASAN perf python: Fix splitting CC into compiler and options afs: Stop implementing ->writepage() afs: remove afs_cache_netfs and afs_zap_permits() declarations afs: remove variable nr_servers afs: Fix lost servers_outstanding count ALSA: usb-audio: Add new quirk FIXED_RATE for JBL Quantum810 Wireless ALSA: azt3328: Remove the unused function snd_azf3328_codec_outl() gcov: add support for checksum field test_maple_tree: add test for mas_spanning_rebalance() on insufficient data maple_tree: fix mas_spanning_rebalance() on insufficient data hugetlb: really allocate vma lock for all sharable vmas kmsan: export kmsan_handle_urb kmsan: include linux/vmalloc.h mm/mempolicy: fix memory leak in set_mempolicy_home_node system call mm, mremap: fix mremap() expanding vma with addr inside vma ...
show more ...
|
#
1e5b3968 |
| 24-Nov-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get v6.1-rc6 into drm-misc-next.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
b3c588cd |
| 20-Jan-2023 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ipa/ipa_interrupt.c drivers/net/ipa/ipa_interrupt.h 9ec9b2a30853 ("net: ipa: disable ipa interrupt during suspend") 8e4
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ipa/ipa_interrupt.c drivers/net/ipa/ipa_interrupt.h 9ec9b2a30853 ("net: ipa: disable ipa interrupt during suspend") 8e461e1f092b ("net: ipa: introduce ipa_interrupt_enable()") d50ed3558719 ("net: ipa: enable IPA interrupt handlers separate from registration") https://lore.kernel.org/all/20230119114125.5182c7ab@canb.auug.org.au/ https://lore.kernel.org/all/79e46152-8043-a512-79d9-c3b905462774@tessares.net/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
081edded |
| 19-Jan-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'zonefs-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fix from Damien Le Moal:
- A single patch to fix sync write operations to detect and handle
Merge tag 'zonefs-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fix from Damien Le Moal:
- A single patch to fix sync write operations to detect and handle errors due to external zone corruptions resulting in writes at invalid location, from me.
* tag 'zonefs-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Detect append writes at invalid locations
show more ...
|
#
a608da3b |
| 06-Jan-2023 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
zonefs: Detect append writes at invalid locations
Using REQ_OP_ZONE_APPEND operations for synchronous writes to sequential files succeeds regardless of the zone write pointer position, as long as th
zonefs: Detect append writes at invalid locations
Using REQ_OP_ZONE_APPEND operations for synchronous writes to sequential files succeeds regardless of the zone write pointer position, as long as the target zone is not full. This means that if an external (buggy) application writes to the zone of a sequential file underneath the file system, subsequent file write() operation will succeed but the file size will not be correct and the file will contain invalid data written by another application.
Modify zonefs_file_dio_append() to check the written sector of an append write (returned in bio->bi_iter.bi_sector) and return -EIO if there is a mismatch with the file zone wp offset field. This change triggers a call to zonefs_io_error() and a zone check. Modify zonefs_io_error_cb() to not expose the unexpected data after the current inode size when the errors=remount-ro mode is used. Other error modes are correctly handled already.
Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
show more ...
|
#
2d78eb03 |
| 22-Dec-2022 |
Takashi Iwai <tiwai@suse.de> |
Merge branch 'for-next' into for-linus
|
#
1a931707 |
| 16-Dec-2022 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
Merge remote-tracking branch 'torvalds/master' into perf/core
To resolve a trivial merge conflict with c302378bc157f6a7 ("libbpf: Hashmap interface update to allow both long and void* keys/values"),
Merge remote-tracking branch 'torvalds/master' into perf/core
To resolve a trivial merge conflict with c302378bc157f6a7 ("libbpf: Hashmap interface update to allow both long and void* keys/values"), where a function present upstream was removed in the perf tools development tree.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
4f2c0a4a |
| 14-Dec-2022 |
Nick Terrell <terrelln@fb.com> |
Merge branch 'main' into zstd-linus
|
#
d69e8c63 |
| 09-Dec-2022 |
Jason Gunthorpe <jgg@nvidia.com> |
Merge tag 'v6.1-rc8' into rdma.git for-next
For dependencies in following patches
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|