| #
13f24586 |
| 21-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull more arm64 updates from Catalin Marinas: "The main 'feature' is a workaround for C1-Pro erratum 4193714
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull more arm64 updates from Catalin Marinas: "The main 'feature' is a workaround for C1-Pro erratum 4193714 requiring IPIs during TLB maintenance if a process is running in user space with SME enabled.
The hardware acknowledges the DVMSync messages before completing in-flight SME accesses, with security implications. The workaround makes use of the mm_cpumask() to track the cores that need interrupting (arm64 hasn't used this mask before).
The rest are fixes for MPAM, CCA and generated header that turned up during the merging window or shortly before.
Summary:
Core features:
- Add workaround for C1-Pro erratum 4193714 - early CME (SME unit) DVMSync acknowledgement. The fix consists of sending IPIs on TLB maintenance to those CPUs running in user space with SME enabled
- Include kernel-hwcap.h in list of generated files (missed in a recent commit generating the KERNEL_HWCAP_* macros)
CCA:
- Fix RSI_INCOMPLETE error check in arm-cca-guest
MPAM:
- Fix an unmount->remount problem with the CDP emulation, uninitialised variable and checker warnings"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm_mpam: resctrl: Make resctrl_mon_ctx_waiters static arm_mpam: resctrl: Fix the check for no monitor components found arm_mpam: resctrl: Fix MBA CDP alloc_capable handling on unmount virt: arm-cca-guest: fix error check for RSI_INCOMPLETE arm64/hwcap: Include kernel-hwcap.h in list of generated files arm64: errata: Work around early CME DVMSync acknowledgement arm64: cputype: Add C1-Pro definitions arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish() arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance
show more ...
|
| #
858fbd72 |
| 20-Apr-2026 |
Catalin Marinas <catalin.marinas@arm.com> |
Merge branch 'for-next/c1-pro-erratum-4193714' into for-next/core
* for-next/c1-pro-erratum-4193714: : Work around C1-Pro erratum 4193714 (CVE-2026-0995) arm64: errata: Work around early CME DVM
Merge branch 'for-next/c1-pro-erratum-4193714' into for-next/core
* for-next/c1-pro-erratum-4193714: : Work around C1-Pro erratum 4193714 (CVE-2026-0995) arm64: errata: Work around early CME DVMSync acknowledgement arm64: cputype: Add C1-Pro definitions arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish() arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance
show more ...
|
| #
87768582 |
| 17-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping updates from Marek Szyprowski:
- added support for batched cache sync, wh
Merge tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping updates from Marek Szyprowski:
- added support for batched cache sync, what improves performance of dma_map/unmap_sg() operations on ARM64 architecture (Barry Song)
- introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory used in confidential computing (Jiri Pirko)
- refactored spaghetti-like code in drivers/of/of_reserved_mem.c and its clients (Marek Szyprowski, shared branch with device-tree updates to avoid merge conflicts)
- prepared Contiguous Memory Allocator related code for making dma-buf drivers modularized (Maxime Ripard)
- added support for benchmarking dma_map_sg() calls to tools/dma utility (Qinxin Xia)
* tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: (24 commits) dma-buf: heaps: system: document system_cc_shared heap dma-buf: heaps: system: add system_cc_shared heap for explicitly shared memory dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory mm: cma: Export cma_alloc(), cma_release() and cma_get_name() dma: contiguous: Export dev_get_cma_area() dma: contiguous: Make dma_contiguous_default_area static dma: contiguous: Make dev_get_cma_area() a proper function dma: contiguous: Turn heap registration logic around of: reserved_mem: rework fdt_init_reserved_mem_node() of: reserved_mem: clarify fdt_scan_reserved_mem*() functions of: reserved_mem: rearrange code a bit of: reserved_mem: replace CMA quirks by generic methods of: reserved_mem: switch to ops based OF_DECLARE() of: reserved_mem: use -ENODEV instead of -ENOENT of: reserved_mem: remove fdt node from the structure dma-mapping: fix false kernel-doc comment marker dma-mapping: Support batch mode for dma_direct_{map,unmap}_sg dma-mapping: Separate DMA sync issuing and completion waiting arm64: Provide dcache_inval_poc_nosync helper arm64: Provide dcache_clean_poc_nosync helper ...
show more ...
|
| #
334fbe73 |
| 15-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "maple_tree: Replace big node with maple copy" (Liam Howlett)
Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "maple_tree: Replace big node with maple copy" (Liam Howlett)
Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement.
- "mm, swap: swap table phase III: remove swap_map" (Kairui Song)
Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups.
- "mm: memfd_luo: preserve file seals" (Pratyush Yadav)
File seal preservation to LUO's memfd code
- "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen)
Additional userspace stats reportng to zswap
- "arch, mm: consolidate empty_zero_page" (Mike Rapoport)
Some cleanups for our handling of ZERO_PAGE() and zero_pfn
- "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han)
A robustness improvement and some cleanups in the kmemleak code
- "Improve khugepaged scan logic" (Vernon Yang)
Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently
- "Make KHO Stateless" (Jason Miu)
Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel
- "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt)
Enhance vmscan's tracepointing
- "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas)
Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation
- "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)
Fix a WARN() which can be emitted the KHO restores a vmalloc area
- "mm: Remove stray references to pagevec" (Tal Zussman)
Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago
- "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau)
Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page
- "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park)
Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used
- "mm/damon: strictly respect min_nr_regions" (SeongJae Park)
Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter
- "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)
The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued
- "mm: cleanups around unmapping / zapping" (David Hildenbrand)
A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions
- "support batched checking of the young flag for MGLRU" (Baolin Wang)
Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64
- "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)
memcg cleanup and robustness improvements
- "Allow order zero pages in page reporting" (Yuvraj Sakshith)
Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory.
- "mm: vma flag tweaks" (Lorenzo Stoakes)
Cleanup work following from the recent conversion of the VMA flags to a bitmap
- "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park)
Add some more developer-facing debug checks into DAMON core
- "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park)
An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling
- "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park)
Fix a hard-to-hit time overflow issue in DAMON core
- "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park)
A batch of misc/minor improvements and fixups for DAMON
- "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand)
Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required.
- "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)
A somewhat random mix of fixups, recompression cleanups and improvements in the zram code
- "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park)
Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select
- "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)
Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged
- "mm: improve map count checks" (Lorenzo Stoakes)
Provide some cleanups and slight fixes in the mremap, mmap and vma code
- "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park)
Extend the use of DAMON core's addr_unit tunable
- "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)
Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support
- "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)
Code movement and cleanups in the memhotplug and sparsemem code
- "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand)
Rationalize some memhotplug Kconfig support
- "change young flag check functions to return bool" (Baolin Wang)
Cleanups to change all young flag check functions to return bool
- "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park)
Fix a few potential DAMON bugs
- "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes)
Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code.
- "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)
Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers
- "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)
Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed.
* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ...
show more ...
|
| #
c43267e6 |
| 15-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas: "The biggest changes are MPAM enablement in drivers/resctrl and new
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas: "The biggest changes are MPAM enablement in drivers/resctrl and new PMU support under drivers/perf.
On the core side, FEAT_LSUI lets futex atomic operations with EL0 permissions, avoiding PAN toggling.
The rest is mostly TLB invalidation refactoring, further generic entry work, sysreg updates and a few fixes.
Core features:
- Add support for FEAT_LSUI, allowing futex atomic operations without toggling Privileged Access Never (PAN)
- Further refactor the arm64 exception handling code towards the generic entry infrastructure
- Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis through it
Memory management:
- Refactor the arm64 TLB invalidation API and implementation for better control over barrier placement and level-hinted invalidation
- Enable batched TLB flushes during memory hot-unplug
- Fix rodata=full block mapping support for realm guests (when BBML2_NOABORT is available)
Perf and PMU:
- Add support for a whole bunch of system PMUs featured in NVIDIA's Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers for CPU/C2C memory latency PMUs)
- Clean up iomem resource handling in the Arm CMN driver
- Fix signedness handling of AA64DFR0.{PMUVer,PerfMon}
MPAM (Memory Partitioning And Monitoring):
- Add architecture context-switch and hiding of the feature from KVM
- Add interface to allow MPAM to be exposed to user-space using resctrl
- Add errata workaround for some existing platforms
- Add documentation for using MPAM and what shape of platforms can use resctrl
Miscellaneous:
- Check DAIF (and PMR, where relevant) at task-switch time
- Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode (only relevant to asynchronous or asymmetric tag check modes)
- Remove a duplicate allocation in the kexec code
- Remove redundant save/restore of SCS SP on entry to/from EL0
- Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap descriptions
- Add kselftest coverage for cmpbr_sigill()
- Update sysreg definitions"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (109 commits) arm64: rsi: use linear-map alias for realm config buffer arm64: Kconfig: fix duplicate word in CMDLINE help text arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12 arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps arm64: kexec: Remove duplicate allocation for trans_pgd ACPI: AGDI: fix missing newline in error message arm64: Check DAIF (and PMR) at task-switch time arm64: entry: Use split preemption logic arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode() arm64: entry: Consistently prefix arm64-specific wrappers arm64: entry: Don't preempt with SError or Debug masked entry: Split preemption from irqentry_exit_to_kernel_mode() entry: Split kernel mode logic from irqentry_{enter,exit}() entry: Move irqentry_enter() prototype later entry: Remove local_irq_{enable,disable}_exit_to_user() ...
show more ...
|
| #
26ff9699 |
| 13-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure:
- Bump the minimum Rust version to 1.85.0 (
Merge tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure:
- Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1).
As proposed in LPC 2025 and the Maintainers Summit [1], we are going to follow Debian Stable's Rust versions as our minimum versions.
Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and 'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g. kernel developers to upgrade.
Other major distributions support a Rust version that is high enough as well, including:
+ Arch Linux. + Fedora Linux. + Gentoo Linux. + Nix. + openSUSE Slowroll and openSUSE Tumbleweed. + Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using their versioned packages.
The merged patch series comes with the associated cleanups and simplifications treewide that can be performed thanks to both bumps, as well as documentation updates.
In addition, start using 'bindgen''s '--with-attribute-custom-enum' feature to set the 'cfi_encoding' attribute for the 'lru_status' enum used in Binder.
Link: https://lwn.net/Articles/1050174/ [1]
- Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that inlines C helpers into Rust.
Essentially, it performs a step similar to LTO, but just for the helpers, i.e. very local and fast.
It relies on 'llvm-link' and its '--internalize' flag, and requires a compatible LLVM between Clang and 'rustc' (i.e. same major version, 'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled for two architectures for now.
The result is a measurable speedup in different workloads that different users have tested. For instance, for the null block driver, it amounts to a 2%.
- Support global per-version flags.
While we already have per-version flags in many places, we didn't have a place to set global ones that depend on the compiler version, i.e. in 'rust_common_flags', which sometimes is needed to e.g. tweak the lints set per version.
Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0, since it had a change in behavior.
- Support overriding the crate name and apply it to Rust Binder, which wanted the module to be called 'rust_binder'.
- Add the remaining '__rust_helper' annotations (started in the previous cycle).
'kernel' crate:
- Introduce the 'const_assert!' macro: a more powerful version of 'static_assert!' that can refer to generics inside functions or implementation bodies, e.g.:
fn f<const N: usize>() { const_assert!(N > 1); }
fn g<T>() { const_assert!(size_of::<T>() > 0, "T cannot be ZST"); }
In addition, reorganize our set of build-time assertion macros ('{build,const,static_assert}!') to live in the 'build_assert' module.
Finally, improve the docs as well to clarify how these are different from one another and how to pick the right one to use, and their equivalence (if any) to the existing C ones for extra clarity.
- 'sizes' module: add 'SizeConstants' trait.
This gives us typed 'SZ_*' constants (avoiding casts) for use in device address spaces where the address width depends on the hardware (e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.), e.g.:
let gpu_heap = 14 * u64::SZ_1M; let mmio_window = u32::SZ_16M;
- 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus simplify the users in Tyr and PWM.
- 'ptr' module: add 'const_align_up'.
- 'str' module: improve the documentation of the 'c_str!' macro to explain that one should only use it for non-literal cases (for the other case we instead use C string literals, e.g. 'c"abc"').
- Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such use in the 'task' module.
- 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted' outside of the 'types' module, i.e. update the last remaining instances and finally remove the re-exports.
- 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)', including runtime-tested examples.
The intention is to hopefully prevent UB that assumes the result of the function is not 'NULL' if successful. This originated from a case of UB I noticed in 'regulator' that created a 'NonNull' on it.
Timekeeping:
- Expand the example section in the 'HrTimer' documentation.
- Mark the 'ClockSource' trait as unsafe to ensure valid values for 'ktime_get()'.
- Add 'Delta::from_nanos()'.
'pin-init' crate:
- Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of 'ZeroableOption' for 'NonZero*'.
- Improve feature gate handling for unstable features.
- Declutter the documentation of implementations of 'Zeroable' for tuples.
- Replace uses of 'addr_of[_mut]!' with '&raw [mut]'.
rust-analyzer:
- Add type annotations to 'generate_rust_analyzer.py'.
- Add support for scripts written in Rust ('generate_rust_target.rs', 'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs').
- Refactor 'generate_rust_analyzer.py' to explicitly identify host and target crates, improve readability, and reduce duplication.
And some other fixes, cleanups and improvements"
* tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (79 commits) rust: sizes: add SizeConstants trait for device address space constants rust: kernel: update `file_with_nul` comment rust: kbuild: allow `clippy::precedence` for Rust < 1.86.0 rust: kbuild: support global per-version flags rust: declare cfi_encoding for lru_status docs: rust: general-information: use real example docs: rust: general-information: simplify Kconfig example docs: rust: quick-start: remove GDB/Binutils mention docs: rust: quick-start: remove Nix "unstable channel" note docs: rust: quick-start: remove Gentoo "testing" note docs: rust: quick-start: add Ubuntu 26.04 LTS and remove subsection title docs: rust: quick-start: update minimum Ubuntu version docs: rust: quick-start: update Ubuntu versioned packages docs: rust: quick-start: openSUSE provides `rust-src` package nowadays rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1 rust: kbuild: update `bindgen --rust-target` version and replace comment rust: rust_is_available: remove warning for `bindgen` < 0.69.5 && libclang >= 19.1 rust: rust_is_available: remove warning for `bindgen` 0.66.[01] rust: bump `bindgen` minimum supported version to 0.71.1 (Debian Trixie) rust: block: update `const_refs_to_static` MSRV TODO comment ...
show more ...
|
| #
0baba94a |
| 07-Apr-2026 |
Catalin Marinas <catalin.marinas@arm.com> |
arm64: errata: Work around early CME DVMSync acknowledgement
C1-Pro acknowledges DVMSync messages before completing the SME/CME memory accesses. Work around this by issuing an IPI to the affected CP
arm64: errata: Work around early CME DVMSync acknowledgement
C1-Pro acknowledges DVMSync messages before completing the SME/CME memory accesses. Work around this by issuing an IPI to the affected CPUs if they are running in EL0 with SME enabled.
Note that we avoid the local DSB in the IPI handler as the kernel runs with SCTLR_EL1.IESB=1. This is sufficient to complete SME memory accesses at EL0 on taking an exception to EL1. On the return to user path, no barrier is necessary either. See the comment in sme_set_active() and the more detailed explanation in the link below.
To avoid a potential IPI flood from malicious applications (e.g. madvise(MADV_PAGEOUT) in a tight loop), track where a process is active via mm_cpumask() and only interrupt those CPUs.
Link: https://lore.kernel.org/r/ablEXwhfKyJW1i7l@J2N7QTR9R3 Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Reviewed-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
| #
480a9e57 |
| 10-Apr-2026 |
Catalin Marinas <catalin.marinas@arm.com> |
Merge branches 'for-next/misc', 'for-next/tlbflush', 'for-next/ttbr-macros-cleanup', 'for-next/kselftest', 'for-next/feat_lsui', 'for-next/mpam', 'for-next/hotplug-batched-tlbi', 'for-next/bbml2-fixe
Merge branches 'for-next/misc', 'for-next/tlbflush', 'for-next/ttbr-macros-cleanup', 'for-next/kselftest', 'for-next/feat_lsui', 'for-next/mpam', 'for-next/hotplug-batched-tlbi', 'for-next/bbml2-fixes', 'for-next/sysreg', 'for-next/generic-entry' and 'for-next/acpi', remote-tracking branches 'arm64/for-next/perf' and 'arm64/for-next/read-once' into for-next/core
* arm64/for-next/perf: : Perf updates perf/arm-cmn: Fix resource_size_t printk specifier in arm_cmn_init_dtc() perf/arm-cmn: Fix incorrect error check for devm_ioremap() perf: add NVIDIA Tegra410 C2C PMU perf: add NVIDIA Tegra410 CPU Memory Latency PMU perf/arm_cspmu: nvidia: Add Tegra410 PCIE-TGT PMU perf/arm_cspmu: nvidia: Add Tegra410 PCIE PMU perf/arm_cspmu: Add arm_cspmu_acpi_dev_get perf/arm_cspmu: nvidia: Add Tegra410 UCF PMU perf/arm_cspmu: nvidia: Rename doc to Tegra241 perf/arm-cmn: Stop claiming entire iomem region arm64: cpufeature: Use pmuv3_implemented() function arm64: cpufeature: Make PMUVer and PerfMon unsigned KVM: arm64: Read PMUVer as unsigned
* arm64/for-next/read-once: : Fixes for __READ_ONCE() with CONFIG_LTO=y arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y arm64: Optimize __READ_ONCE() with CONFIG_LTO=y
* for-next/misc: : Miscellaneous cleanups/fixes arm64: rsi: use linear-map alias for realm config buffer arm64: Kconfig: fix duplicate word in CMDLINE help text arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps arm64: kexec: Remove duplicate allocation for trans_pgd arm64: mm: Use generic enum pgtable_level arm64: scs: Remove redundant save/restore of SCS SP on entry to/from EL0 arm64: remove ARCH_INLINE_*
* for-next/tlbflush: : Refactor the arm64 TLB invalidation API and implementation arm64: mm: __ptep_set_access_flags must hint correct TTL arm64: mm: Provide level hint for flush_tlb_page() arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range() arm64: mm: More flags for __flush_tlb_range() arm64: mm: Refactor __flush_tlb_range() to take flags arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid() arm64: mm: Simplify __flush_tlb_range_limit_excess() arm64: mm: Simplify __TLBI_RANGE_NUM() macro arm64: mm: Re-implement the __flush_tlb_range_op macro in C arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() arm64: mm: Push __TLBI_VADDR() into __tlbi_level() arm64: mm: Implicitly invalidate user ASID based on TLBI operation arm64: mm: Introduce a C wrapper for by-range TLB invalidation arm64: mm: Re-implement the __tlbi_level macro as a C function
* for-next/ttbr-macros-cleanup: : Cleanups of the TTBR1_* macros arm64/mm: Directly use TTBRx_EL1_CnP arm64/mm: Directly use TTBRx_EL1_ASID_MASK arm64/mm: Describe TTBR1_BADDR_4852_OFFSET
* for-next/kselftest: : arm64 kselftest updates selftests/arm64: Implement cmpbr_sigill() to hwcap test
* for-next/feat_lsui: : Futex support using FEAT_LSUI instructions to avoid toggling PAN arm64: armv8_deprecated: Disable swp emulation when FEAT_LSUI present arm64: Kconfig: Add support for LSUI KVM: arm64: Use CAST instruction for swapping guest descriptor arm64: futex: Support futex with FEAT_LSUI arm64: futex: Refactor futex atomic operation KVM: arm64: kselftest: set_id_regs: Add test for FEAT_LSUI KVM: arm64: Expose FEAT_LSUI to guests arm64: cpufeature: Add FEAT_LSUI
* for-next/mpam: (40 commits) : Expose MPAM to user-space via resctrl: : - Add architecture context-switch and hiding of the feature from KVM. : - Add interface to allow MPAM to be exposed to user-space using resctrl. : - Add errata workaoround for some existing platforms. : - Add documentation for using MPAM and what shape of platforms can use resctrl arm64: mpam: Add initial MPAM documentation arm_mpam: Quirk CMN-650's CSU NRDY behaviour arm_mpam: Add workaround for T241-MPAM-6 arm_mpam: Add workaround for T241-MPAM-4 arm_mpam: Add workaround for T241-MPAM-1 arm_mpam: Add quirk framework arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl arm64: mpam: Select ARCH_HAS_CPU_RESCTRL arm_mpam: resctrl: Add empty definitions for assorted resctrl functions arm_mpam: resctrl: Update the rmid reallocation limit arm_mpam: resctrl: Add resctrl_arch_rmid_read() arm_mpam: resctrl: Allow resctrl to allocate monitors arm_mpam: resctrl: Add support for csu counters arm_mpam: resctrl: Add monitor initialisation and domain boilerplate arm_mpam: resctrl: Add kunit test for control format conversions arm_mpam: resctrl: Add support for 'MB' resource arm_mpam: resctrl: Wait for cacheinfo to be ready arm_mpam: resctrl: Add rmid index helpers arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats arm_mpam: resctrl: Hide CDP emulation behind CONFIG_EXPERT ...
* for-next/hotplug-batched-tlbi: : arm64/mm: Enable batched TLB flush in unmap_hotplug_range() arm64/mm: Reject memory removal that splits a kernel leaf mapping arm64/mm: Enable batched TLB flush in unmap_hotplug_range()
* for-next/bbml2-fixes: : Fixes for realm guest and BBML2_NOABORT arm64: mm: Remove pmd_sect() and pud_sect() arm64: mm: Handle invalid large leaf mappings correctly arm64: mm: Fix rodata=full block mapping support for realm guests
* for-next/sysreg: : arm64 sysreg updates arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update SMIDR_EL1 to DDI0601 2025-06
* for-next/generic-entry: : More arm64 refactoring towards using the generic entry code arm64: Check DAIF (and PMR) at task-switch time arm64: entry: Use split preemption logic arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode() arm64: entry: Consistently prefix arm64-specific wrappers arm64: entry: Don't preempt with SError or Debug masked entry: Split preemption from irqentry_exit_to_kernel_mode() entry: Split kernel mode logic from irqentry_{enter,exit}() entry: Move irqentry_enter() prototype later entry: Remove local_irq_{enable,disable}_exit_to_user() entry: Fix stale comment for irqentry_enter()
* for-next/acpi: : arm64 ACPI updates ACPI: AGDI: fix missing newline in error message
show more ...
|
| #
74b63934 |
| 09-Apr-2026 |
Michael Ugrin <mugrinphoto@gmail.com> |
arm64: Kconfig: fix duplicate word in CMDLINE help text
Remove duplicate 'the' in the CMDLINE config help text.
Signed-off-by: Michael Ugrin <mugrinphoto@gmail.com> Signed-off-by: Catalin Marinas <
arm64: Kconfig: fix duplicate word in CMDLINE help text
Remove duplicate 'the' in the CMDLINE config help text.
Signed-off-by: Michael Ugrin <mugrinphoto@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
| #
b28711ac |
| 06-Apr-2026 |
Miguel Ojeda <ojeda@kernel.org> |
rust: simplify `RUSTC_VERSION` Kconfig conditions
With the Rust version bump in place, several Kconfig conditions based on `RUSTC_VERSION` are always true.
Thus simplify them.
The minimum supporte
rust: simplify `RUSTC_VERSION` Kconfig conditions
With the Rust version bump in place, several Kconfig conditions based on `RUSTC_VERSION` are always true.
Thus simplify them.
The minimum supported major LLVM version by our new Rust minimum version is now LLVM 18, instead of LLVM 16. However, there are no possible cleanups for `RUSTC_LLVM_VERSION`.
Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-9-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
show more ...
|
| #
078f80f9 |
| 19-Mar-2026 |
David Hildenbrand (Arm) <david@kernel.org> |
mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE
Patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION".
While working on memory hotplug code cleanups, I realized
mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE
Patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION".
While working on memory hotplug code cleanups, I realized that CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE is not really required anymore.
Changing that revealed some rather nasty looking CONFIG_MIGRATION handling.
Let's clean that up by introducing a dedicated CONFIG_NUMA_MIGRATION option and reducing the dependencies that CONFIG_MIGRATION has.
This patch (of 2):
All architectures that select CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE also select CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG. So we can just remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE.
For CONFIG_MIGRATION, make it depend on CONFIG_MEMORY_HOTREMOVE instead, and make CONFIG_MEMORY_HOTREMOVE select CONFIG_MIGRATION (just like CONFIG_CMA and CONFIG_COMPACTION already do).
We'll clean up CONFIG_MIGRATION next.
Link: https://lkml.kernel.org/r/20260319-config_migration-v1-0-42270124966f@kernel.org Link: https://lkml.kernel.org/r/20260319-config_migration-v1-1-42270124966f@kernel.org Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Byungchul Park <byungchul@sk.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
54ac9ff8 |
| 31-Mar-2026 |
Ard Biesheuvel <ardb@kernel.org> |
arm64: Use static call trampolines when kCFI is enabled
Implement arm64 support for the 'unoptimized' static call variety, which routes all calls through a trampoline that performs a tail call to th
arm64: Use static call trampolines when kCFI is enabled
Implement arm64 support for the 'unoptimized' static call variety, which routes all calls through a trampoline that performs a tail call to the chosen function, and wire it up for use when kCFI is enabled. This works around an issue with kCFI and generic static calls, where the prototypes of default handlers such as __static_call_nop() and __static_call_ret0() don't match the expected prototype of the call site, resulting in kCFI false positives [0].
Since static call targets may be located in modules loaded out of direct branching range, this needs an ADRP/LDR pair to load the branch target into R16 and a branch-to-register (BR) instruction to perform an indirect call.
Unlike on x86, there is no pressing need on arm64 to avoid indirect calls at all cost, but hiding it from the compiler as is done here does have some benefits: - the literal is located in .rodata, which gives us the same robustness advantage that code patching does; - no D-cache pollution from fetching hash values from .text sections.
From an execution speed PoV, this is unlikely to make any difference at all.
Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Kees Cook <kees@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will McVicker <willmcvicker@google.com> Reported-by: Carlos Llamas <cmllamas@google.com> Closes: https://lore.kernel.org/all/20260311225822.1565895-1-cmllamas@google.com/ [0] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
| #
4aab135b |
| 13-Mar-2026 |
James Morse <james.morse@arm.com> |
arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
Enough MPAM support is present to enable ARCH_HAS_CPU_RESCTRL. Let it rip^Wlink!
ARCH_HAS_CPU_RESCTRL indicates resctrl can be enabled. It is enabled by th
arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
Enough MPAM support is present to enable ARCH_HAS_CPU_RESCTRL. Let it rip^Wlink!
ARCH_HAS_CPU_RESCTRL indicates resctrl can be enabled. It is enabled by the arch code simply because it has 'arch' in its name.
This removes ARM_CPU_RESCTRL as a mimic of X86_CPU_RESCTRL. While here, move the ACPI dependency to the driver's Kconfig file.
Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: James Morse <james.morse@arm.com>
show more ...
|
| #
c544f00a |
| 13-Mar-2026 |
Ben Horgan <ben.horgan@arm.com> |
arm64: mpam: Drop the CONFIG_EXPERT restriction
In anticipation of MPAM being useful remove the CONFIG_EXPERT restriction.
This was done to prevent the driver being enabled before the user-space in
arm64: mpam: Drop the CONFIG_EXPERT restriction
In anticipation of MPAM being useful remove the CONFIG_EXPERT restriction.
This was done to prevent the driver being enabled before the user-space interface was wired up.
Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> [ morse: Added second paragraph ] Signed-off-by: James Morse <james.morse@arm.com>
show more ...
|
| #
8e06d04f |
| 13-Mar-2026 |
James Morse <james.morse@arm.com> |
arm64: mpam: Context switch the MPAM registers
MPAM allows traffic in the SoC to be labeled by the OS, these labels are used to apply policy in caches and bandwidth regulators, and to monitor traffi
arm64: mpam: Context switch the MPAM registers
MPAM allows traffic in the SoC to be labeled by the OS, these labels are used to apply policy in caches and bandwidth regulators, and to monitor traffic in the SoC. The label is made up of a PARTID and PMG value. The x86 equivalent calls these CLOSID and RMID, but they don't map precisely.
MPAM has two CPU system registers that is used to hold the PARTID and PMG values that traffic generated at each exception level will use. These can be set per-task by the resctrl file system. (resctrl is the defacto interface for controlling this stuff).
Add a helper to switch this.
struct task_struct's separate CLOSID and RMID fields are insufficient to implement resctrl using MPAM, as resctrl can change the PARTID (CLOSID) and PMG (sort of like the RMID) separately. On x86, the rmid is an independent number, so a race that writes a mismatched closid and rmid into hardware is benign. On arm64, the pmg bits extend the partid. (i.e. partid-5 has a pmg-0 that is not the same as partid-6's pmg-0). In this case, mismatching the values will 'dirty' a pmg value that resctrl believes is clean, and is not tracking with its 'limbo' code.
To avoid this, the partid and pmg are always read and written as a pair. This requires a new u64 field. In struct task_struct there are two u32, rmid and closid for the x86 case, but as we can't use them here do something else. Add this new field, mpam_partid_pmg, to struct thread_info to avoid adding more architecture specific code to struct task_struct. Always use READ_ONCE()/WRITE_ONCE() when accessing this field.
Resctrl allows a per-cpu 'default' value to be set, this overrides the values when scheduling a task in the default control-group, which has PARTID 0. The way 'code data prioritisation' gets emulated means the register value for the default group needs to be a variable.
The current system register value is kept in a per-cpu variable to avoid writing to the system register if the value isn't going to change. Writes to this register may reset the hardware state for regulating bandwidth.
Finally, there is no reason to context switch these registers unless there is a driver changing the values in struct task_struct. Hide the whole thing behind a static key. This also allows the driver to disable MPAM in response to errors reported by hardware. Move the existing static key to belong to the arch code, as in the future the MPAM driver may become a loadable module.
All this should depend on whether there is an MPAM driver, hide it behind CONFIG_ARM64_MPAM.
Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> CC: Amit Singh Tomar <amitsinght@marvell.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: James Morse <james.morse@arm.com>
show more ...
|
| #
377609ae |
| 14-Mar-2026 |
Yeoreum Yun <yeoreum.yun@arm.com> |
arm64: Kconfig: Add support for LSUI
Since Armv9.6, FEAT_LSUI supplies the load/store instructions for previleged level to access to access user memory without clearing PSTATE.PAN bit.
Add Kconfig
arm64: Kconfig: Add support for LSUI
Since Armv9.6, FEAT_LSUI supplies the load/store instructions for previleged level to access to access user memory without clearing PSTATE.PAN bit.
Add Kconfig option entry for FEAT_LSUI.
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
| #
d7eafe65 |
| 28-Feb-2026 |
Barry Song <baohua@kernel.org> |
dma-mapping: Separate DMA sync issuing and completion waiting
Currently, arch_sync_dma_for_cpu and arch_sync_dma_for_device always wait for the completion of each DMA buffer. That is, issuing the DM
dma-mapping: Separate DMA sync issuing and completion waiting
Currently, arch_sync_dma_for_cpu and arch_sync_dma_for_device always wait for the completion of each DMA buffer. That is, issuing the DMA sync and waiting for completion is done in a single API call.
For scatter-gather lists with multiple entries, this means issuing and waiting is repeated for each entry, which can hurt performance. Architectures like ARM64 may be able to issue all DMA sync operations for all entries first and then wait for completion together.
To address this, arch_sync_dma_for_* now batches DMA operations and performs a flush afterward. On ARM64, the flush is implemented with a dsb instruction in arch_sync_dma_flush(). On other architectures, arch_sync_dma_flush() is currently a nop.
Cc: Leon Romanovsky <leon@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Ada Couprie Diaz <ada.coupriediaz@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Reviewed-by: Juergen Gross <jgross@suse.com> # drivers/xen/swiotlb-xen.c Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com> Signed-off-by: Barry Song <baohua@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260228221316.59934-1-21cnbao@gmail.com
show more ...
|
| #
6712fcde |
| 20-Feb-2026 |
Jisheng Zhang <jszhang@kernel.org> |
arm64: remove ARCH_INLINE_*
Since commit 7dadeaa6e851 ("sched: Further restrict the preemption modes"), arm64 only has two preemption models: full and lazy. Both implies PREEMPTION, so !PREEMPTION i
arm64: remove ARCH_INLINE_*
Since commit 7dadeaa6e851 ("sched: Further restrict the preemption modes"), arm64 only has two preemption models: full and lazy. Both implies PREEMPTION, so !PREEMPTION is always false for arm64, it's time to remove ARCH_INLINE_*.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show more ...
|
| #
4cff5c05 |
| 12-Feb-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "powerpc/64s: do not re-activate batched TLB flush" makes a
Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)
It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process.
- "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky)
- "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand)
- "memcg cleanups" tidies up some memcg code (Chen Ridong)
- "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park)
- "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang)
- "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu)
- "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai)
- "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg)
- "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport)
- "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes)
- "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka)
- "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace (Shakeel Butt)
- "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang)
- "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park)
- "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)
- "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov)
- "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park)
- "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khupaged and fixes a scan limit accounting issue (Shivank Garg)
- "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification (David Hildenbrand)
- "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen)
- "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari)
- "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky)
- "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang)
- "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park)
- "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park)
- "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park)
- "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes)
- "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song)
- "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. Various cleanups were performed along the way (Qi Zheng)
* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...
show more ...
|
| #
57cb8450 |
| 11-Feb-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 paravirt updates from Borislav Petkov:
- A nice cleanup to the paravirt code containing a un
Merge tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 paravirt updates from Borislav Petkov:
- A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross)
* tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ...
show more ...
|
| #
2f8aed5e |
| 29-Jan-2026 |
Will Deacon <will@kernel.org> |
Merge branch 'for-next/errata' into for-next/core
* for-next/errata: arm64: errata: Workaround for SI L1 downstream coherency issue
|
| #
3fed7e00 |
| 14-Jan-2026 |
Lucas Wei <lucaswei@google.com> |
arm64: errata: Workaround for SI L1 downstream coherency issue
When software issues a Cache Maintenance Operation (CMO) targeting a dirty cache line, the CPU and DSU cluster may optimize the operati
arm64: errata: Workaround for SI L1 downstream coherency issue
When software issues a Cache Maintenance Operation (CMO) targeting a dirty cache line, the CPU and DSU cluster may optimize the operation by combining the CopyBack Write and CMO into a single combined CopyBack Write plus CMO transaction presented to the interconnect (MCN). For these combined transactions, the MCN splits the operation into two separate transactions, one Write and one CMO, and then propagates the write and optionally the CMO to the downstream memory system or external Point of Serialization (PoS). However, the MCN may return an early CompCMO response to the DSU cluster before the corresponding Write and CMO transactions have completed at the external PoS or downstream memory. As a result, stale data may be observed by external observers that are directly connected to the external PoS or downstream memory.
This erratum affects any system topology in which the following conditions apply: - The Point of Serialization (PoS) is located downstream of the interconnect. - A downstream observer accesses memory directly, bypassing the interconnect.
Conditions: This erratum occurs only when all of the following conditions are met: 1. Software executes a data cache maintenance operation, specifically, a clean or clean&invalidate by virtual address (DC CVAC or DC CIVAC), that hits on unique dirty data in the CPU or DSU cache. This results in a combined CopyBack and CMO being issued to the interconnect. 2. The interconnect splits the combined transaction into separate Write and CMO transactions and returns an early completion response to the CPU or DSU before the write has completed at the downstream memory or PoS. 3. A downstream observer accesses the affected memory address after the early completion response is issued but before the actual memory write has completed. This allows the observer to read stale data that has not yet been updated at the PoS or downstream memory.
The implementation of workaround put a second loop of CMOs at the same virtual address whose operation meet erratum conditions to wait until cache data be cleaned to PoC. This way of implementation mitigates performance penalty compared to purely duplicate original CMO.
Signed-off-by: Lucas Wei <lucaswei@google.com> Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
| #
018a231b |
| 07-Jan-2026 |
Marc Zyngier <maz@kernel.org> |
arm64: Unconditionally enable PAN support
FEAT_PAN has been around since ARMv8.1 (over 11 years ago), has no compiler dependency (we have our own accessors), and is a great security benefit.
Drop C
arm64: Unconditionally enable PAN support
FEAT_PAN has been around since ARMv8.1 (over 11 years ago), has no compiler dependency (we have our own accessors), and is a great security benefit.
Drop CONFIG_ARM64_PAN, and make the support unconditionnal.
Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
| #
6191b25d |
| 07-Jan-2026 |
Marc Zyngier <maz@kernel.org> |
arm64: Unconditionally enable LSE support
LSE atomics have been in the architecture since ARMv8.1 (released in 2014), and are hopefully supported by all modern toolchains.
Drop the optional nature
arm64: Unconditionally enable LSE support
LSE atomics have been in the architecture since ARMv8.1 (released in 2014), and are hopefully supported by all modern toolchains.
Drop the optional nature of LSE support in the kernel, and always compile the support in, as this really is very little code. LL/SC still is the default, and the switch to LSE is done dynamically.
Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
show more ...
|
| #
7303ecbf |
| 15-Dec-2025 |
Kevin Brodsky <kevin.brodsky@arm.com> |
mm: introduce CONFIG_ARCH_HAS_LAZY_MMU_MODE
Architectures currently opt in for implementing lazy_mmu helpers by defining __HAVE_ARCH_ENTER_LAZY_MMU_MODE.
In preparation for introducing a generic la
mm: introduce CONFIG_ARCH_HAS_LAZY_MMU_MODE
Architectures currently opt in for implementing lazy_mmu helpers by defining __HAVE_ARCH_ENTER_LAZY_MMU_MODE.
In preparation for introducing a generic lazy_mmu layer that will require storage in task_struct, let's switch to a cleaner approach: instead of defining a macro, select a CONFIG option.
This patch introduces CONFIG_ARCH_HAS_LAZY_MMU_MODE and has each arch select it when it implements lazy_mmu helpers. __HAVE_ARCH_ENTER_LAZY_MMU_MODE is removed and <linux/pgtable.h> relies on the new CONFIG instead.
On x86, lazy_mmu helpers are only implemented if PARAVIRT_XXL is selected. This creates some complications in arch/x86/boot/, because a few files manually undefine PARAVIRT* options. As a result <asm/paravirt.h> does not define the lazy_mmu helpers, but this breaks the build as <linux/pgtable.h> only defines them if !CONFIG_ARCH_HAS_LAZY_MMU_MODE. There does not seem to be a clean way out of this - let's just undefine that new CONFIG too.
Link: https://lkml.kernel.org/r/20251215150323.2218608-7-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Acked-by: Andreas Larsson <andreas@gaisler.com> [sparc] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|