| #
509d3f45 |
| 06-Dec-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "panic: sys_info: Refactor and fix a potential issue
Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in ib/sys_info.c
- "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions
- "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger
- "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire
- "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations
- "rbree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster
- "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients
- "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO
- "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter:
This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition.
As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot.
Mike Rappaport merits a mention here, for his extensive review and testing work.
- "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day
- "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions
* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits) calibrate: update header inclusion Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()" vmcoreinfo: track and log recoverable hardware errors kho: fix restoring of contiguous ranges of order-0 pages kho: kho_restore_vmalloc: fix initialization of pages array MAINTAINERS: TPM DEVICE DRIVER: update the W-tag init: replace simple_strtoul with kstrtoul to improve lpj_setup KHO: fix boot failure due to kmemleak access to non-PRESENT pages Documentation/ABI: new kexec and kdump sysfs interface Documentation/ABI: mark old kexec sysfs deprecated kexec: move sysfs entries to /sys/kernel/kexec test_kho: always print restore status kho: free chunks using free_page() instead of kfree() selftests/liveupdate: add kexec test for multiple and empty sessions selftests/liveupdate: add simple kexec-based selftest for LUO selftests/liveupdate: add userspace API selftests docs: add documentation for memfd preservation via LUO mm: memfd_luo: allow preserving memfd liveupdate: luo_file: add private argument to store runtime state mm: shmem: export some functions to internal.h ...
show more ...
|
|
Revision tags: v6.18 |
|
| #
7b71205a |
| 25-Nov-2025 |
Mike Rapoport (Microsoft) <rppt@kernel.org> |
kho: fix restoring of contiguous ranges of order-0 pages
When contiguous ranges of order-0 pages are restored, kho_restore_page() calls prep_compound_page() with the first page in the range and orde
kho: fix restoring of contiguous ranges of order-0 pages
When contiguous ranges of order-0 pages are restored, kho_restore_page() calls prep_compound_page() with the first page in the range and order as parameters and then kho_restore_pages() calls split_page() to make sure all pages in the range are order-0.
However, since split_page() is not intended to split compound pages and with VM_DEBUG enabled it will trigger a VM_BUG_ON_PAGE().
Update kho_restore_page() so that it will use prep_compound_page() when it restores a folio and make sure it properly sets page count for both large folios and ranges of order-0 pages.
Link: https://lkml.kernel.org/r/20251125110917.843744-3-rppt@kernel.org Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reported-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
4bc84cd5 |
| 25-Nov-2025 |
Mike Rapoport (Microsoft) <rppt@kernel.org> |
kho: kho_restore_vmalloc: fix initialization of pages array
Patch series "kho: fixes for vmalloc restoration".
Pratyush reported off-list that when kho_restore_vmalloc() is used to restore a vmallo
kho: kho_restore_vmalloc: fix initialization of pages array
Patch series "kho: fixes for vmalloc restoration".
Pratyush reported off-list that when kho_restore_vmalloc() is used to restore a vmalloc_huge() allocation it hits VM_BUG_ON() when we reconstruct the struct pages in kho_restore_pages().
These patches fix the issue.
This patch (of 2):
In case a preserved vmalloc allocation was using huge pages, all pages in the array of pages added to vm_struct during kho_restore_vmalloc() are wrongly set to the same page.
Fix the indexing when assigning pages to that array.
Link: https://lkml.kernel.org/r/20251125110917.843744-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20251125110917.843744-2-rppt@kernel.org Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
|
Revision tags: v6.18-rc7 |
|
| #
40cd0e8d |
| 22-Nov-2025 |
Ran Xiaokai <ran.xiaokai@zte.com.cn> |
KHO: fix boot failure due to kmemleak access to non-PRESENT pages
When booting with debug_pagealloc=on while having: CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n the sy
KHO: fix boot failure due to kmemleak access to non-PRESENT pages
When booting with debug_pagealloc=on while having: CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=y CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n the system fails to boot due to page faults during kmemleak scanning.
This occurs because: With debug_pagealloc is enabled, __free_pages() invokes debug_pagealloc_unmap_pages(), clearing the _PAGE_PRESENT bit for freed pages in the kernel page table. KHO scratch areas are allocated from memblock and noted by kmemleak. But these areas don't remain reserved but released later to the page allocator using init_cma_reserved_pageblock(). This causes subsequent kmemleak scans access non-PRESENT pages, leading to fatal page faults.
Mark scratch areas with kmemleak_ignore_phys() after they are allocated from memblock to exclude them from kmemleak scanning before they are released to buddy allocator to fix this.
[ran.xiaokai@zte.com.cn: add comment] Link: https://lkml.kernel.org/r/20251127122700.103927-1-ranxiaokai627@163.com Link: https://lkml.kernel.org/r/20251122182929.92634-1-ranxiaokai627@163.com Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
b1551515 |
| 18-Nov-2025 |
Pratyush Yadav <pratyush@kernel.org> |
kho: free chunks using free_page() instead of kfree()
Before commit fa759cd75bce5 ("kho: allocate metadata directly from the buddy allocator"), the chunks were allocated from the slab allocator usin
kho: free chunks using free_page() instead of kfree()
Before commit fa759cd75bce5 ("kho: allocate metadata directly from the buddy allocator"), the chunks were allocated from the slab allocator using kzalloc(). Those were rightly freed using kfree().
When the commit switched to using the buddy allocator directly, it missed updating kho_mem_ser_free() to use free_page() instead of kfree().
Link: https://lkml.kernel.org/r/20251118182218.63044-1-pratyush@kernel.org Fixes: fa759cd75bce5 ("kho: allocate metadata directly from the buddy allocator") Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Matlack <dmatlack@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
|
Revision tags: v6.18-rc6 |
|
| #
7bd3643f |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: add Kconfig option to enable KHO by default
Currently, Kexec Handover must be explicitly enabled via the kernel command line parameter `kho=on`.
For workloads that rely on KHO as a foundationa
kho: add Kconfig option to enable KHO by default
Currently, Kexec Handover must be explicitly enabled via the kernel command line parameter `kho=on`.
For workloads that rely on KHO as a foundational requirement (such as the upcoming Live Update Orchestrator), requiring an explicit boot parameter adds redundant configuration steps.
Introduce CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT. When selected, KHO defaults to enabled. This is equivalent to passing kho=on at boot. The behavior can still be disabled at runtime by passing kho=off.
Link: https://lkml.kernel.org/r/20251114190002.3311679-14-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
de51999e |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: allow memory preservation state updates after finalization
Currently, kho_preserve_* and kho_unpreserve_* return -EBUSY if KHO is finalized. This enforces a rigid "freeze" on the KHO memory st
kho: allow memory preservation state updates after finalization
Currently, kho_preserve_* and kho_unpreserve_* return -EBUSY if KHO is finalized. This enforces a rigid "freeze" on the KHO memory state.
With the introduction of re-entrant finalization, this restriction is no longer necessary. Users should be allowed to modify the preservation set (e.g., adding new pages or freeing old ones) even after an initial finalization.
The intended workflow for updates is now: 1. Modify state (preserve/unpreserve). 2. Call kho_finalize() again to refresh the serialized metadata.
Remove the kho_out.finalized checks to enable this dynamic behavior.
This also allows to convert kho_unpreserve_* functions to void, as they do not return any error anymore.
Link: https://lkml.kernel.org/r/20251114190002.3311679-13-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
d7255959 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: allow kexec load before KHO finalization
Currently, kho_fill_kimage() checks kho_out.finalized and returns early if KHO is not yet finalized. This enforces a strict ordering where userspace mu
kho: allow kexec load before KHO finalization
Currently, kho_fill_kimage() checks kho_out.finalized and returns early if KHO is not yet finalized. This enforces a strict ordering where userspace must finalize KHO *before* loading the kexec image.
This is restrictive, as standard workflows often involve loading the target kernel early in the lifecycle and finalizing the state (FDT) only immediately before the reboot.
Since the KHO FDT resides at a physical address allocated during boot (kho_init), its location is stable. We can attach this stable address to the kimage regardless of whether the content has been finalized yet.
Relax the check to only require kho_enable, allowing kexec_file_load to proceed at any time.
Link: https://lkml.kernel.org/r/20251114190002.3311679-12-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
8e068a28 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: update FDT dynamically for subtree addition/removal
Currently, sub-FDTs were tracked in a list (kho_out.sub_fdts) and the final FDT is constructed entirely from scratch during kho_finalize().
kho: update FDT dynamically for subtree addition/removal
Currently, sub-FDTs were tracked in a list (kho_out.sub_fdts) and the final FDT is constructed entirely from scratch during kho_finalize().
We can maintain the FDT dynamically: 1. Initialize a valid, empty FDT in kho_init(). 2. Use fdt_add_subnode and fdt_setprop in kho_add_subtree to update the FDT immediately when a subsystem registers. 3. Use fdt_del_node in kho_remove_subtree to remove entries.
This removes the need for the intermediate sub_fdts list and the reconstruction logic in kho_finalize(). kho_finalize() now only needs to trigger memory map serialization.
Link: https://lkml.kernel.org/r/20251114190002.3311679-11-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
9a4301f7 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: remove abort functionality and support state refresh
Previously, KHO required a dedicated kho_abort() function to clean up state before kho_finalize() could be called again. This was necessary
kho: remove abort functionality and support state refresh
Previously, KHO required a dedicated kho_abort() function to clean up state before kho_finalize() could be called again. This was necessary to handle complex unwind paths when using notifiers.
With the shift to direct memory preservation, the explicit abort step is no longer strictly necessary.
Remove kho_abort() and refactor kho_finalize() to handle re-entry. If kho_finalize() is called while KHO is already finalized, it will now automatically clean up the previous memory map and state before generating a new one. This allows the KHO state to be updated/refreshed simply by triggering finalize again.
Update debugfs to return -EINVAL if userspace attempts to write 0 to the finalize attribute, as explicit abort is no longer supported.
Link: https://lkml.kernel.org/r/20251114190002.3311679-10-pasha.tatashin@soleen.com Suggested-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
efa3a977 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: remove global preserved_mem_map and store state in FDT
Currently, the serialized memory map is tracked via kho_out.preserved_mem_map and copied to the FDT during finalization. This double trac
kho: remove global preserved_mem_map and store state in FDT
Currently, the serialized memory map is tracked via kho_out.preserved_mem_map and copied to the FDT during finalization. This double tracking is redundant.
Remove preserved_mem_map from kho_out. Instead, maintain the physical address of the head chunk directly in the preserved-memory-map FDT property.
Introduce kho_update_memory_map() to manage this property. This function handles: 1. Retrieving and freeing any existing serialized map (handling the abort/retry case). 2. Updating the FDT property with the new chunk address.
This establishes the FDT as the single source of truth for the handover state.
Link: https://lkml.kernel.org/r/20251114190002.3311679-9-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
71960fe1 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: simplify serialization and remove __kho_abort
Currently, __kho_finalize() performs memory serialization in the middle of FDT construction. If FDT construction fails later, the function must ma
kho: simplify serialization and remove __kho_abort
Currently, __kho_finalize() performs memory serialization in the middle of FDT construction. If FDT construction fails later, the function must manually clean up the serialized memory via __kho_abort().
Refactor __kho_finalize() to perform kho_mem_serialize() only after the FDT has been successfully constructed and finished. This reordering has two benefits: 1. It avoids expensive serialization work if FDT generation fails. 2. It removes the need for cleanup in the FDT error path.
As a result, the internal helper __kho_abort() is no longer needed for internal error handling. Inline its remaining logic (cleanup of the preserved memory map) directly into kho_abort() and remove the helper.
Link: https://lkml.kernel.org/r/20251114190002.3311679-8-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
e268689a |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: always expose output FDT in debugfs
Currently, the output FDT is added to debugfs only when KHO is finalized and removed when aborted.
There is no need to hide the FDT based on the state. Alw
kho: always expose output FDT in debugfs
Currently, the output FDT is added to debugfs only when KHO is finalized and removed when aborted.
There is no need to hide the FDT based on the state. Always expose it starting from initialization. This aids the transition toward removing the explicit abort functionality and converting KHO to be fully stateless.
Link: https://lkml.kernel.org/r/20251114190002.3311679-7-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
53f8f064 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: verify deserialization status and fix FDT alignment access
During boot, kho_restore_folio() relies on the memory map having been successfully deserialized. If deserialization fails or no map i
kho: verify deserialization status and fix FDT alignment access
During boot, kho_restore_folio() relies on the memory map having been successfully deserialized. If deserialization fails or no map is present, attempting to restore the FDT folio is unsafe.
Update kho_mem_deserialize() to return a boolean indicating success. Use this return value in kho_memory_init() to disable KHO if deserialization fails. Also, the incoming FDT folio is never used, there is no reason to restore it.
Additionally, use get_unaligned() to retrieve the memory map pointer from the FDT. FDT properties are not guaranteed to be naturally aligned, and accessing a 64-bit value via a pointer that is only 32-bit aligned can cause faults.
Link: https://lkml.kernel.org/r/20251114190002.3311679-6-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
85de0090 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: preserve FDT folio only once during initialization
Currently, the FDT folio is preserved inside __kho_finalize(). If the user performs multiple finalize/abort cycles, kho_preserve_folio() is c
kho: preserve FDT folio only once during initialization
Currently, the FDT folio is preserved inside __kho_finalize(). If the user performs multiple finalize/abort cycles, kho_preserve_folio() is called repeatedly for the same FDT folio.
Since the FDT folio is allocated once during kho_init(), it should be marked for preservation at the same time. Move the preservation call to kho_init() to align the preservation state with the object's lifecycle and simplify the finalize path.
Also, pre-zero the FDT tree so we do not expose random bits to the user and to the next kernel by using the new kho_alloc_preserve() api.
Link: https://lkml.kernel.org/r/20251114190002.3311679-5-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
4c205677 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: introduce high-level memory allocation API
Currently, clients of KHO must manually allocate memory (e.g., via alloc_pages), calculate the page order, and explicitly call kho_preserve_folio().
kho: introduce high-level memory allocation API
Currently, clients of KHO must manually allocate memory (e.g., via alloc_pages), calculate the page order, and explicitly call kho_preserve_folio(). Similarly, cleanup requires separate calls to unpreserve and free the memory.
Introduce a high-level API to streamline this common pattern:
- kho_alloc_preserve(size): Allocates physically contiguous, zeroed memory and immediately marks it for preservation. - kho_unpreserve_free(ptr): Unpreserves and frees the memory in the current kernel. - kho_restore_free(ptr): Restores the struct page state of preserved memory in the new kernel and immediately frees it to the page allocator.
[pasha.tatashin@soleen.com: build fixes] Link: https://lkml.kernel.org/r/CA+CK2bBgXDhrHwTVgxrw7YTQ-0=LgW0t66CwPCgG=C85ftz4zw@mail.gmail.com Link: https://lkml.kernel.org/r/20251114190002.3311679-4-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
8c3819f6 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: convert __kho_abort() to return void
The internal helper __kho_abort() always returns 0 and has no failure paths. Its return value is ignored by __kho_finalize and checked needlessly by kho_ab
kho: convert __kho_abort() to return void
The internal helper __kho_abort() always returns 0 and has no failure paths. Its return value is ignored by __kho_finalize and checked needlessly by kho_abort.
Change the return type to void to reflect that this function cannot fail, and simplify kho_abort by removing dead error handling code.
Link: https://lkml.kernel.org/r/20251114190002.3311679-3-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
077a4851 |
| 14-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: fix misleading log message in kho_populate()
Patch series "kho: simplify state machine and enable dynamic updates", v2.
This patch series refactors the Kexec Handover subsystem to transition f
kho: fix misleading log message in kho_populate()
Patch series "kho: simplify state machine and enable dynamic updates", v2.
This patch series refactors the Kexec Handover subsystem to transition from a rigid, state-locked model to a dynamic, re-entrant architecture. It also introduces usability improvements.
Motivation Currently, KHO relies on a strict state machine where memory preservation is locked upon finalization. If a change is required, the user must explicitly "abort" to reset the state. Additionally, the kexec image cannot be loaded until KHO is finalized, and the FDT is rebuilt from scratch on every finalization.
This series simplifies this workflow to support "load early, finalize late" scenarios.
Key Changes
State Machine Simplification: - Removed kho_abort(). kho_finalize() is now re-entrant; calling it a second time automatically flushes the previous serialized state and generates a fresh one.
- Removed kho_out.finalized checks from preservation APIs, allowing drivers to add/remove pages even after an initial finalization.
- Decoupled kexec_file_load from KHO finalization. The KHO FDT physical address is now stable from boot, allowing the kexec image to be loaded before the handover metadata is finalized.
FDT Management: - The FDT is now updated in-place dynamically when subtrees are added or removed, removing the need for complex reconstruction logic.
- The output FDT is always exposed in debugfs (initialized and zeroed at boot), improving visibility and debugging capabilities throughout the system lifecycle.
- Removed the redundant global preserved_mem_map pointer, establishing the FDT property as the single source of truth.
New Features & API Enhancements: - High-Level Allocators: Introduced kho_alloc_preserve() and friends to reduce boilerplate for drivers that need to allocate, preserve, and eventually restore simple memory buffers.
- Configuration: Added CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT to allow KHO to be active by default without requiring the kho=on command line parameter.
Fixes: - Fixed potential alignment faults when accessing 64-bit FDT properties.
- Fixed the lifecycle of the FDT folio preservation (now preserved once at init).
This patch (of 13):
The log message in kho_populate() currently states "Will skip init for some devices". This implies that Kexec Handover always involves skipping device initialization.
However, KHO is a generic mechanism used to preserve kernel memory across reboot for various purposes, such as memfd, telemetry, or reserve_mem. Skipping device initialization is a specific property of live update drivers using KHO, not a property of the mechanism itself.
Remove the misleading suffix to accurately reflect the generic nature of KHO discovery.
Link: https://lkml.kernel.org/r/20251114190002.3311679-2-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Vasilevsky <dave@vasilevsky.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Kees Cook <kees@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
|
Revision tags: v6.18-rc5, v6.18-rc4 |
|
| #
8db839ca |
| 01-Nov-2025 |
Zhu Yanjun <yanjun.zhu@linux.dev> |
liveupdate: kho: use %pe format specifier for error pointer printing
Make pr_xxx() call to use the %pe format specifier instead of %d. The %pe specifier prints a symbolic error string (e.g., -ENOME
liveupdate: kho: use %pe format specifier for error pointer printing
Make pr_xxx() call to use the %pe format specifier instead of %d. The %pe specifier prints a symbolic error string (e.g., -ENOMEM, -EINVAL) when given an error pointer created with ERR_PTR(err).
This change enhances the clarity and diagnostic value of the error message by showing a descriptive error name rather than a numeric error code.
Note, that some err are still printed by value, as those errors might come from libfdt and not regular errnos.
Link: https://lkml.kernel.org/r/20251101142325.1326536-10-pasha.tatashin@soleen.com Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Co-developed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: "Mike Rapoport (Microsoft)" <rppt@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| #
48a1b232 |
| 01-Nov-2025 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: kho: move to kernel/liveupdate
Move KHO to kernel/liveupdate/ in preparation of placing all Live Update core kernel related files to the same place.
[pasha.tatashin@soleen.com: disable
liveupdate: kho: move to kernel/liveupdate
Move KHO to kernel/liveupdate/ in preparation of placing all Live Update core kernel related files to the same place.
[pasha.tatashin@soleen.com: disable the menu when DEFERRED_STRUCT_PAGE_INIT] Link: https://lkml.kernel.org/r/CA+CK2bAvh9Oa2SLfsbJ8zztpEjrgr_hr-uGgF1coy8yoibT39A@mail.gmail.com Link: https://lkml.kernel.org/r/20251101142325.1326536-8-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Simon Horman <horms@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|