History log of /linux/fs/btrfs/relocation.c (Results 1 – 25 of 2438)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.12-rc4
# be602cde 17-Oct-2024 Ingo Molnar <mingo@kernel.org>

Merge branch 'linus' into sched/urgent, to resolve conflict

Conflicts:
kernel/sched/ext.c

There's a context conflict between this upstream commit:

3fdb9ebcec10 sched_ext: Start schedulers with

Merge branch 'linus' into sched/urgent, to resolve conflict

Conflicts:
kernel/sched/ext.c

There's a context conflict between this upstream commit:

3fdb9ebcec10 sched_ext: Start schedulers with consistent p->scx.slice values

... and this fix in sched/urgent:

98442f0ccd82 sched: Fix delayed_dequeue vs switched_from_fair()

Resolve it.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


Revision tags: v6.12-rc3, v6.12-rc2
# c8d430db 06-Oct-2024 Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvmarm-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.12, take #1

- Fix pKVM error path on init, making sure we do not chang

Merge tag 'kvmarm-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.12, take #1

- Fix pKVM error path on init, making sure we do not change critical
system registers as we're about to fail

- Make sure that the host's vector length is at capped by a value
common to all CPUs

- Fix kvm_has_feat*() handling of "negative" features, as the current
code is pretty broken

- Promote Joey to the status of official reviewer, while James steps
down -- hopefully only temporarly

show more ...


# 79eb2c07 04-Oct-2024 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-6.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- in incremental send, fix invalid clone operation for file that got

Merge tag 'for-6.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- in incremental send, fix invalid clone operation for file that got
its size decreased

- fix __counted_by() annotation of send path cache entries, we do not
store the terminating NUL

- fix a longstanding bug in relocation (and quite hard to hit by
chance), drop back reference cache that can get out of sync after
transaction commit

- wait for fixup worker kthread before finishing umount

- add missing raid-stripe-tree extent for NOCOW files, zoned mode
cannot have NOCOW files but RST is meant to be a standalone feature

- handle transaction start error during relocation, avoid potential
NULL pointer dereference of relocation control structure (reported by
syzbot)

- disable module-wide rate limiting of debug level messages

- minor fix to tracepoint definition (reported by checkpatch.pl)

* tag 'for-6.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: disable rate limiting when debug enabled
btrfs: wait for fixup workers before stopping cleaner kthread during umount
btrfs: fix a NULL pointer dereference when failed to start a new trasacntion
btrfs: send: fix invalid clone operation for file that got its size decreased
btrfs: tracepoints: end assignment with semicolon at btrfs_qgroup_extent event class
btrfs: drop the backref cache during relocation if we commit
btrfs: also add stripe entries for NOCOW writes
btrfs: send: fix buffer overflow detection when copying path to cache entry

show more ...


# 0c436dfe 02-Oct-2024 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v6.12-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.12

A bunch of fixes here that came in during the merge window and t

Merge tag 'asoc-fix-v6.12-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.12

A bunch of fixes here that came in during the merge window and the first
week of release, plus some new quirks and device IDs. There's nothing
major here, it's a bit bigger than it might've been due to there being
no fixes sent during the merge window due to your vacation.

show more ...


Revision tags: v6.12-rc1
# c3b47f49 28-Sep-2024 Qu Wenruo <wqu@suse.com>

btrfs: fix a NULL pointer dereference when failed to start a new trasacntion

[BUG]
Syzbot reported a NULL pointer dereference with the following crash:

FAULT_INJECTION: forcing a failure.
star

btrfs: fix a NULL pointer dereference when failed to start a new trasacntion

[BUG]
Syzbot reported a NULL pointer dereference with the following crash:

FAULT_INJECTION: forcing a failure.
start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676
prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642
relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678
...
BTRFS info (device loop0): balance: ended with status: -12
Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667]
RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926
Call Trace:
<TASK>
commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496
btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430
del_balance_item fs/btrfs/volumes.c:3678 [inline]
reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742
btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574
btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

[CAUSE]
The allocation failure happens at the start_transaction() inside
prepare_to_relocate(), and during the error handling we call
unset_reloc_control(), which makes fs_info->balance_ctl to be NULL.

Then we continue the error path cleanup in btrfs_balance() by calling
reset_balance_state() which will call del_balance_item() to fully delete
the balance item in the root tree.

However during the small window between set_reloc_contrl() and
unset_reloc_control(), we can have a subvolume tree update and created a
reloc_root for that subvolume.

Then we go into the final btrfs_commit_transaction() of
del_balance_item(), and into btrfs_update_reloc_root() inside
commit_fs_roots().

That function checks if fs_info->reloc_ctl is in the merge_reloc_tree
stage, but since fs_info->reloc_ctl is NULL, it results a NULL pointer
dereference.

[FIX]
Just add extra check on fs_info->reloc_ctl inside
btrfs_update_reloc_root(), before checking
fs_info->reloc_ctl->merge_reloc_tree.

That DEAD_RELOC_TREE handling is to prevent further modification to the
reloc tree during merge stage, but since there is no reloc_ctl at all,
we do not need to bother that.

Reported-by: syzbot+283673dbc38527ef9f3d@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/66f6bfa7.050a0220.38ace9.0019.GAE@google.com/
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

show more ...


# db7e68b5 24-Sep-2024 Josef Bacik <josef@toxicpanda.com>

btrfs: drop the backref cache during relocation if we commit

Since the inception of relocation we have maintained the backref cache
across transaction commits, updating the backref cache with the ne

btrfs: drop the backref cache during relocation if we commit

Since the inception of relocation we have maintained the backref cache
across transaction commits, updating the backref cache with the new
bytenr whenever we COWed blocks that were in the cache, and then
updating their bytenr once we detected a transaction id change.

This works as long as we're only ever modifying blocks, not changing the
structure of the tree.

However relocation does in fact change the structure of the tree. For
example, if we are relocating a data extent, we will look up all the
leaves that point to this data extent. We will then call
do_relocation() on each of these leaves, which will COW down to the leaf
and then update the file extent location.

But, a key feature of do_relocation() is the pending list. This is all
the pending nodes that we modified when we updated the file extent item.
We will then process all of these blocks via finish_pending_nodes, which
calls do_relocation() on all of the nodes that led up to that leaf.

The purpose of this is to make sure we don't break sharing unless we
absolutely have to. Consider the case that we have 3 snapshots that all
point to this leaf through the same nodes, the initial COW would have
created a whole new path. If we did this for all 3 snapshots we would
end up with 3x the number of nodes we had originally. To avoid this we
will cycle through each of the snapshots that point to each of these
nodes and update their pointers to point at the new nodes.

Once we update the pointer to the new node we will drop the node we
removed the link for and all of its children via btrfs_drop_subtree().
This is essentially just btrfs_drop_snapshot(), but for an arbitrary
point in the snapshot.

The problem with this is that we will never reflect this in the backref
cache. If we do this btrfs_drop_snapshot() for a node that is in the
backref tree, we will leave the node in the backref tree. This becomes
a problem when we change the transid, as now the backref cache has
entire subtrees that no longer exist, but exist as if they still are
pointed to by the same roots.

In the best case scenario you end up with "adding refs to an existing
tree ref" errors from insert_inline_extent_backref(), where we attempt
to link in nodes on roots that are no longer valid.

Worst case you will double free some random block and re-use it when
there's still references to the block.

This is extremely subtle, and the consequences are quite bad. There
isn't a way to make sure our backref cache is consistent between
transid's.

In order to fix this we need to simply evict the entire backref cache
anytime we cross transid's. This reduces performance in that we have to
rebuild this backref cache every time we change transid's, but fixes the
bug.

This has existed since relocation was added, and is a pretty critical
bug. There's a lot more cleanup that can be done now that this
functionality is going away, but this patch is as small as possible in
order to fix the problem and make it easy for us to backport it to all
the kernels it needs to be backported to.

Followup series will dismantle more of this code and simplify relocation
drastically to remove this functionality.

We have a reproducer that reproduced the corruption within a few minutes
of running. With this patch it survives several iterations/hours of
running the reproducer.

Fixes: 3fd0a5585eb9 ("Btrfs: Metadata ENOSPC handling for balance")
CC: stable@vger.kernel.org
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>

show more ...


# 2cd86f02 01-Oct-2024 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes

Required for a panthor fix that broke when
FOP_UNSIGNED_OFFSET was added in place of FMODE_UNSIGNED_OFFSET.

Signed-off-by: Maarten L

Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes

Required for a panthor fix that broke when
FOP_UNSIGNED_OFFSET was added in place of FMODE_UNSIGNED_OFFSET.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

show more ...


# 3a39d672 27-Sep-2024 Paolo Abeni <pabeni@redhat.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts and no adjacent changes.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 36ec807b 20-Sep-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.12 merge window.


Revision tags: v6.11, v6.11-rc7
# f057b572 06-Sep-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'ib/6.11-rc6-matrix-keypad-spitz' into next

Bring in changes removing support for platform data from matrix-keypad
driver.


Revision tags: v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1
# 3daee2e4 16-Jul-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.10' into next

Sync up with mainline to bring in device_for_each_child_node_scoped()
and other newer APIs.


# 66e72a01 29-Jul-2024 Jerome Brunet <jbrunet@baylibre.com>

Merge tag 'v6.11-rc1' into clk-meson-next

Linux 6.11-rc1


# ee057c8c 14-Aug-2024 Steven Rostedt <rostedt@goodmis.org>

Merge tag 'v6.11-rc3' into trace/ring-buffer/core

The "reserve_mem" kernel command line parameter has been pulled into
v6.11. Merge the latest -rc3 to allow the persistent ring buffer memory to
be a

Merge tag 'v6.11-rc3' into trace/ring-buffer/core

The "reserve_mem" kernel command line parameter has been pulled into
v6.11. Merge the latest -rc3 to allow the persistent ring buffer memory to
be able to be mapped at the address specified by the "reserve_mem" command
line parameter.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

show more ...


# c8faf11c 30-Jul-2024 Tejun Heo <tj@kernel.org>

Merge tag 'v6.11-rc1' into for-6.12

Linux 6.11-rc1


# ed7171ff 16-Aug-2024 Lucas De Marchi <lucas.demarchi@intel.com>

Merge drm/drm-next into drm-xe-next

Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for
the display side. This resolves the current conflict for the
enable_display module parameter

Merge drm/drm-next into drm-xe-next

Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for
the display side. This resolves the current conflict for the
enable_display module parameter and allows further pending refactors.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

show more ...


# 5c61f598 12-Aug-2024 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next

Get drm-misc-next to the state of v6.11-rc2.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 3663e2c4 01-Aug-2024 Jani Nikula <jani.nikula@intel.com>

Merge drm/drm-next into drm-intel-next

Sync with v6.11-rc1 in general, and specifically get the new
BACKLIGHT_POWER_ constants for power states.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 4436e6da 02-Aug-2024 Thomas Gleixner <tglx@linutronix.de>

Merge branch 'linus' into x86/mm

Bring x86 and selftests up to date


# 7a40974f 16-Sep-2024 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
"This brings mostly refactoring, cleanups, minor performance
optimizati

Merge tag 'for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
"This brings mostly refactoring, cleanups, minor performance
optimizations and usual fixes. The folio API conversions are most
noticeable.

There's one less visible change that could have a high impact. The
extent lock scope for read is reduced, not held for the entire
operation. In the buffered read case it's left to page or inode lock,
some direct io read synchronization is still needed.

This used to prevent deadlocks induced by page faults during direct
io, so there was a 4K limitation on the requests, e.g. for io_uring.
In the future this will allow smoother integration with iomap where
the extent read lock was a major obstacle.

User visible changes:

- the FSTRIM ioctl updates the processed range even after an error or
interruption

- cleaner thread is woken up in SYNC ioctl instead of waking the
transaction thread that can take some delay before waking up the
cleaner, this can speed up cleaning of deleted subvolumes

- print an error message when opening a device fail, e.g. when it's
unexpectedly read-only

Core changes:

- improved extent map handling in various ways (locking, iteration, ...)

- new assertions and locking annotations

- raid-stripe-tree locking fixes

- use xarray for tracking dirty qgroup extents, switched from rb-tree

- turn the subpage test to compile-time condition if possible (e.g.
on x86_64 with 4K pages), this allows to skip a lot of ifs and
remove dead code

- more preparatory work for compression in subpage mode

Cleanups and refactoring

- folio API conversions, many simple cases where page is passed so
switch it to folios

- more subpage code refactoring, update page state bitmap processing

- introduce auto free for btrfs_path structure, use for the simple
cases"

* tag 'for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (110 commits)
btrfs: only unlock the to-be-submitted ranges inside a folio
btrfs: merge btrfs_folio_unlock_writer() into btrfs_folio_end_writer_lock()
btrfs: BTRFS_PATH_AUTO_FREE in orphan.c
btrfs: use btrfs_path auto free in zoned.c
btrfs: DEFINE_FREE for struct btrfs_path
btrfs: remove btrfs_folio_end_all_writers()
btrfs: constify more pointer parameters
btrfs: rework BTRFS_I as macro to preserve parameter const
btrfs: add and use helper to verify the calling task has locked the inode
btrfs: always update fstrim_range on failure in FITRIM ioctl
btrfs: convert copy_inline_to_page() to use folio
btrfs: convert btrfs_decompress() to take a folio
btrfs: convert zstd_decompress() to take a folio
btrfs: convert lzo_decompress() to take a folio
btrfs: convert zlib_decompress() to take a folio
btrfs: convert try_release_extent_mapping() to take a folio
btrfs: convert try_release_extent_state() to take a folio
btrfs: convert submit_eb_page() to take a folio
btrfs: convert submit_eb_subpage() to take a folio
btrfs: convert read_key_bytes() to take a folio
...

show more ...


# 04915240 31-Jul-2024 Johannes Thumshirn <jthumshirn@wdc.com>

btrfs: don't readahead the relocation inode on RST

On relocation we're doing readahead on the relocation inode, but if the
filesystem is backed by a RAID stripe tree we can get ENOENT (e.g. due to
p

btrfs: don't readahead the relocation inode on RST

On relocation we're doing readahead on the relocation inode, but if the
filesystem is backed by a RAID stripe tree we can get ENOENT (e.g. due to
preallocated extents not being mapped in the RST) from the lookup.

But readahead doesn't handle the error and submits invalid reads to the
device, causing an assertion in the scatter-gather list code:

BTRFS info (device nvme1n1): balance: start -d -m -s
BTRFS info (device nvme1n1): relocating block group 6480920576 flags data|raid0
BTRFS error (device nvme1n1): cannot find raid-stripe for logical [6481928192, 6481969152] devid 2, profile raid0
------------[ cut here ]------------
kernel BUG at include/linux/scatterlist.h:115!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 1012 Comm: btrfs Not tainted 6.10.0-rc7+ #567
RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
FS: 00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002cd11000 CR3: 00000001109ea001 CR4: 0000000000370eb0
Call Trace:
<TASK>
? __die_body.cold+0x14/0x25
? die+0x2e/0x50
? do_trap+0xca/0x110
? do_error_trap+0x65/0x80
? __blk_rq_map_sg+0x339/0x4a0
? exc_invalid_op+0x50/0x70
? __blk_rq_map_sg+0x339/0x4a0
? asm_exc_invalid_op+0x1a/0x20
? __blk_rq_map_sg+0x339/0x4a0
nvme_prep_rq.part.0+0x9d/0x770
nvme_queue_rq+0x7d/0x1e0
__blk_mq_issue_directly+0x2a/0x90
? blk_mq_get_budget_and_tag+0x61/0x90
blk_mq_try_issue_list_directly+0x56/0xf0
blk_mq_flush_plug_list.part.0+0x52b/0x5d0
__blk_flush_plug+0xc6/0x110
blk_finish_plug+0x28/0x40
read_pages+0x160/0x1c0
page_cache_ra_unbounded+0x109/0x180
relocate_file_extent_cluster+0x611/0x6a0
? btrfs_search_slot+0xba4/0xd20
? balance_dirty_pages_ratelimited_flags+0x26/0xb00
relocate_data_extent.constprop.0+0x134/0x160
relocate_block_group+0x3f2/0x500
btrfs_relocate_block_group+0x250/0x430
btrfs_relocate_chunk+0x3f/0x130
btrfs_balance+0x71b/0xef0
? kmalloc_trace_noprof+0x13b/0x280
btrfs_ioctl+0x2c2e/0x3030
? kvfree_call_rcu+0x1e6/0x340
? list_lru_add_obj+0x66/0x80
? mntput_no_expire+0x3a/0x220
__x64_sys_ioctl+0x96/0xc0
do_syscall_64+0x54/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fcc04514f9b
Code: Unable to access opcode bytes at 0x7fcc04514f71.
RSP: 002b:00007ffeba923370 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fcc04514f9b
RDX: 00007ffeba923460 RSI: 00000000c4009420 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000013 R09: 0000000000000001
R10: 00007fcc043fbba8 R11: 0000000000000246 R12: 00007ffeba924fc5
R13: 00007ffeba923460 R14: 0000000000000002 R15: 00000000004d4bb0
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
FS: 00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcc04514f71 CR3: 00000001109ea001 CR4: 0000000000370eb0
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception ]---

So in case of a relocation on a RAID stripe-tree based file system, skip
the readahead.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

show more ...


# a1ff5a7d 30-Jul-2024 Maxime Ripard <mripard@kernel.org>

Merge drm/drm-fixes into drm-misc-fixes

Let's start the new drm-misc-fixes cycle by bringing in 6.11-rc1.

Signed-off-by: Maxime Ripard <mripard@kernel.org>


# fbc90c04 22-Jul-2024 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

- In the series "mm: Avoid possible overflows in dirty throttlin

Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

- In the series "mm: Avoid possible overflows in dirty throttling" Jan
Kara addresses a couple of issues in the writeback throttling code.
These fixes are also targetted at -stable kernels.

- Ryusuke Konishi's series "nilfs2: fix potential issues related to
reserved inodes" does that. This should actually be in the
mm-nonmm-stable tree, along with the many other nilfs2 patches. My
bad.

- More folio conversions from Kefeng Wang in the series "mm: convert to
folio_alloc_mpol()"

- Kemeng Shi has sent some cleanups to the writeback code in the series
"Add helper functions to remove repeated code and improve readability
of cgroup writeback"

- Kairui Song has made the swap code a little smaller and a little
faster in the series "mm/swap: clean up and optimize swap cache
index".

- In the series "mm/memory: cleanly support zeropage in
vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
Hildenbrand has reworked the rather sketchy handling of the use of
the zeropage in MAP_SHARED mappings. I don't see any runtime effects
here - more a cleanup/understandability/maintainablity thing.

- Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling
of higher addresses, for aarch64. The (poorly named) series is
"Restructure va_high_addr_switch".

- The core TLB handling code gets some cleanups and possible slight
optimizations in Bang Li's series "Add update_mmu_tlb_range() to
simplify code".

- Jane Chu has improved the handling of our
fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in
the series "Enhance soft hwpoison handling and injection".

- Jeff Johnson has sent a billion patches everywhere to add
MODULE_DESCRIPTION() to everything. Some landed in this pull.

- In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang
has simplified migration's use of hardware-offload memory copying.

- Yosry Ahmed performs more folio API conversions in his series "mm:
zswap: trivial folio conversions".

- In the series "large folios swap-in: handle refault cases first",
Chuanhua Han inches us forward in the handling of large pages in the
swap code. This is a cleanup and optimization, working toward the end
objective of full support of large folio swapin/out.

- In the series "mm,swap: cleanup VMA based swap readahead window
calculation", Huang Ying has contributed some cleanups and a possible
fixlet to his VMA based swap readahead code.

- In the series "add mTHP support for anonymous shmem" Baolin Wang has
taught anonymous shmem mappings to use multisize THP. By default this
is a no-op - users must opt in vis sysfs controls. Dramatic
improvements in pagefault latency are realized.

- David Hildenbrand has some cleanups to our remaining use of
page_mapcount() in the series "fs/proc: move page_mapcount() to
fs/proc/internal.h".

- David also has some highmem accounting cleanups in the series
"mm/highmem: don't track highmem pages manually".

- Build-time fixes and cleanups from John Hubbard in the series
"cleanups, fixes, and progress towards avoiding "make headers"".

- Cleanups and consolidation of the core pagemap handling from Barry
Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
and utilize them".

- Lance Yang's series "Reclaim lazyfree THP without splitting" has
reduced the latency of the reclaim of pmd-mapped THPs under fairly
common circumstances. A 10x speedup is seen in a microbenchmark.

It does this by punting to aother CPU but I guess that's a win unless
all CPUs are pegged.

- hugetlb_cgroup cleanups from Xiu Jianfeng in the series
"mm/hugetlb_cgroup: rework on cftypes".

- Miaohe Lin's series "Some cleanups for memory-failure" does just that
thing.

- Someone other than SeongJae has developed a DAMON feature in Honggyu
Kim's series "DAMON based tiered memory management for CXL memory".
This adds DAMON features which may be used to help determine the
efficiency of our placement of CXL/PCIe attached DRAM.

- DAMON user API centralization and simplificatio work in SeongJae
Park's series "mm/damon: introduce DAMON parameters online commit
function".

- In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
David Hildenbrand does some maintenance work on zsmalloc - partially
modernizing its use of pageframe fields.

- Kefeng Wang provides more folio conversions in the series "mm: remove
page_maybe_dma_pinned() and page_mkclean()".

- More cleanup from David Hildenbrand, this time in the series
"mm/memory_hotplug: use PageOffline() instead of PageReserved() for
!ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline()
pages" and permits the removal of some virtio-mem hacks.

- Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
__folio_add_anon_rmap()" is a cleanup to the anon folio handling in
preparation for mTHP (multisize THP) swapin.

- Kefeng Wang's series "mm: improve clear and copy user folio"
implements more folio conversions, this time in the area of large
folio userspace copying.

- The series "Docs/mm/damon/maintaier-profile: document a mailing tool
and community meetup series" tells people how to get better involved
with other DAMON developers. From SeongJae Park.

- A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
that.

- David Hildenbrand sends along more cleanups, this time against the
migration code. The series is "mm/migrate: move NUMA hinting fault
folio isolation + checks under PTL".

- Jan Kara has found quite a lot of strangenesses and minor errors in
the readahead code. He addresses this in the series "mm: Fix various
readahead quirks".

- SeongJae Park's series "selftests/damon: test DAMOS tried regions and
{min,max}_nr_regions" adds features and addresses errors in DAMON's
self testing code.

- Gavin Shan has found a userspace-triggerable WARN in the pagecache
code. The series "mm/filemap: Limit page cache size to that supported
by xarray" addresses this. The series is marked cc:stable.

- Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
and cleanup" cleans up and slightly optimizes KSM.

- Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
code motion. The series (which also makes the memcg-v1 code
Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put
under config option" and "mm: memcg: put cgroup v1-specific memcg
data under CONFIG_MEMCG_V1"

- Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
adds an additional feature to this cgroup-v2 control file.

- The series "Userspace controls soft-offline pages" from Jiaqi Yan
permits userspace to stop the kernel's automatic treatment of
excessive correctable memory errors. In order to permit userspace to
monitor and handle this situation.

- Kefeng Wang's series "mm: migrate: support poison recover from
migrate folio" teaches the kernel to appropriately handle migration
from poisoned source folios rather than simply panicing.

- SeongJae Park's series "Docs/damon: minor fixups and improvements"
does those things.

- In the series "mm/zsmalloc: change back to per-size_class lock"
Chengming Zhou improves zsmalloc's scalability and memory
utilization.

- Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
pinning memfd folios" makes the GUP code use FOLL_PIN rather than
bare refcount increments. So these paes can first be moved aside if
they reside in the movable zone or a CMA block.

- Andrii Nakryiko has added a binary ioctl()-based API to
/proc/pid/maps for much faster reading of vma information. The series
is "query VMAs from /proc/<pid>/maps".

- In the series "mm: introduce per-order mTHP split counters" Lance
Yang improves the kernel's presentation of developer information
related to multisize THP splitting.

- Michael Ellerman has developed the series "Reimplement huge pages
without hugepd on powerpc (8xx, e500, book3s/64)". This permits
userspace to use all available huge page sizes.

- In the series "revert unconditional slab and page allocator fault
injection calls" Vlastimil Babka removes a performance-affecting and
not very useful feature from slab fault injection.

* tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits)
mm/mglru: fix ineffective protection calculation
mm/zswap: fix a white space issue
mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
mm/hugetlb: fix possible recursive locking detected warning
mm/gup: clear the LRU flag of a page before adding to LRU batch
mm/numa_balancing: teach mpol_to_str about the balancing mode
mm: memcg1: convert charge move flags to unsigned long long
alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting
lib: reuse page_ext_data() to obtain codetag_ref
lib: add missing newline character in the warning message
mm/mglru: fix overshooting shrinker memory
mm/mglru: fix div-by-zero in vmpressure_calc_level()
mm/kmemleak: replace strncpy() with strscpy()
mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC
mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB
mm: ignore data-race in __swap_writepage
hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr
mm: shmem: rename mTHP shmem counters
mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async()
mm/migrate: putback split folios when numa hint migration fails
...

show more ...


Revision tags: v6.10, v6.10-rc7, v6.10-rc6
# bb82ac31 25-Jun-2024 Jan Kara <jack@suse.cz>

readahead: drop index argument of page_cache_async_readahead()

The index argument of page_cache_async_readahead() is just folio->index so
there's no point in passing is separately. Drop it.

Link:

readahead: drop index argument of page_cache_async_readahead()

The index argument of page_cache_async_readahead() is just folio->index so
there's no point in passing is separately. Drop it.

Link: https://lkml.kernel.org/r/20240625101909.12234-5-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Tested-by: Zhang Peng <zhangpengpeng0808@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# a23e1966 15-Jul-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.11 merge window.


Revision tags: v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2
# 6f47c7ae 28-May-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.9' into next

Sync up with the mainline to bring in the new cleanup API.


12345678910>>...98