| 9ec95329 | 10-Apr-2026 |
Breno Leitao <leitao@debian.org> |
kho: fix error handling in kho_add_subtree()
Fix two error handling issues in kho_add_subtree(), where it doesn't handle the error path correctly.
1. If fdt_setprop() fails after the subnode has be
kho: fix error handling in kho_add_subtree()
Fix two error handling issues in kho_add_subtree(), where it doesn't handle the error path correctly.
1. If fdt_setprop() fails after the subnode has been created, the subnode is not removed. This leaves an incomplete node in the FDT (missing "preserved-data" or "blob-size" properties).
2. The fdt_setprop() return value (an FDT error code) is stored directly in err and returned to the caller, which expects -errno.
Fix both by storing fdt_setprop() results in fdt_err, jumping to a new out_del_node label that removes the subnode on failure, and only setting err = 0 on the success path, otherwise returning -ENOMEM (instead of FDT_ERR_ errors that would come from fdt_setprop).
No user-visible changes. This patch fixes error handling in the KHO (Kexec HandOver) subsystem, which is used to preserve data across kexec reboots. The fix only affects a rare failure path during kexec preparation — specifically when the kernel runs out of space in the Flattened Device Tree buffer while registering preserved memory regions.
In the unlikely event that this error path was triggered, the old code would leave a malformed node in the device tree and return an incorrect error code to the calling subsystem, which could lead to confusing log messages or incorrect recovery decisions. With this fix, the incomplete node is properly cleaned up and the appropriate errno value is propagated, this error code is not returned to the user.
Link: https://lore.kernel.org/20260410-kho_fix_send-v2-1-1b4debf7ee08@debian.org Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers") Signed-off-by: Breno Leitao <leitao@debian.org> Suggested-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Breno Leitao <leitao@debian.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 68750e82 | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: defer file handler module refcounting to active sessions
Stop pinning modules indefinitely upon file handler registration. Instead, dynamically increment the module reference count only
liveupdate: defer file handler module refcounting to active sessions
Stop pinning modules indefinitely upon file handler registration. Instead, dynamically increment the module reference count only when a live update session actively uses the file handler (e.g., during preservation or deserialization), and release it when the session ends.
This allows modules providing live update handlers to be gracefully unloaded when no live update is in progress.
Link: https://lore.kernel.org/20260327033335.696621-11-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 2ab7207e | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: make unregister functions return void
Change liveupdate_unregister_file_handler and liveupdate_unregister_flb to return void instead of an error code. This follows the design principle
liveupdate: make unregister functions return void
Change liveupdate_unregister_file_handler and liveupdate_unregister_flb to return void instead of an error code. This follows the design principle that unregistration during module unload should not fail, as the unload cannot be stopped at that point.
Link: https://lore.kernel.org/20260327033335.696621-10-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 07448800 | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: remove liveupdate_test_unregister()
Now that file handler unregistration automatically unregisters all associated file handlers (FLBs), the liveupdate_test_unregister() function is no lo
liveupdate: remove liveupdate_test_unregister()
Now that file handler unregistration automatically unregisters all associated file handlers (FLBs), the liveupdate_test_unregister() function is no longer needed. Remove it along with its usages and declarations.
Link: https://lore.kernel.org/20260327033335.696621-9-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 5ee1c7d6 | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: auto unregister FLBs on file handler unregistration
To ensure that unregistration is always successful and doesn't leave dangling resources, introduce auto-unregistration of FLBs: when a
liveupdate: auto unregister FLBs on file handler unregistration
To ensure that unregistration is always successful and doesn't leave dangling resources, introduce auto-unregistration of FLBs: when a file handler is unregistered, all FLBs associated with it are automatically unregistered.
Introduce a new helper luo_flb_unregister_all() which unregisters all FLBs linked to the given file handler.
Link: https://lore.kernel.org/20260327033335.696621-8-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 118c3908 | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: remove luo_session_quiesce()
Now that FLB module references are handled dynamically during active sessions, we can safely remove the luo_session_quiesce() and luo_session_resume() mechan
liveupdate: remove luo_session_quiesce()
Now that FLB module references are handled dynamically during active sessions, we can safely remove the luo_session_quiesce() and luo_session_resume() mechanism.
Link: https://lore.kernel.org/20260327033335.696621-7-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 76be9983 | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: defer FLB module refcounting to active sessions
Stop pinning modules indefinitely upon FLB registration. Instead, dynamically take a module reference when the FLB is actively used in a
liveupdate: defer FLB module refcounting to active sessions
Stop pinning modules indefinitely upon FLB registration. Instead, dynamically take a module reference when the FLB is actively used in a session (e.g., during preserve and retrieve) and release it when the session concludes.
This allows modules providing FLB operations to be cleanly unloaded when not in active use by the live update orchestrator.
Link: https://lore.kernel.org/20260327033335.696621-6-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 6b2b22f7 | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: protect FLB lists with luo_register_rwlock
Because liveupdate FLB objects will soon drop their persistent module references when registered, list traversals must be protected against con
liveupdate: protect FLB lists with luo_register_rwlock
Because liveupdate FLB objects will soon drop their persistent module references when registered, list traversals must be protected against concurrent module unloading.
To provide this protection, utilize the global luo_register_rwlock. It protects the global registry of FLBs and the handler's specific list of FLB dependencies.
Read locks are used during concurrent list traversals (e.g., during preservation and serialization). Write locks are taken during registration and unregistration.
Link: https://lore.kernel.org/20260327033335.696621-5-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 9e1e1858 | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: protect file handler list with rwsem
Because liveupdate file handlers will no longer hold a module reference when registered, we must ensure that the access to the handler list is protec
liveupdate: protect file handler list with rwsem
Because liveupdate file handlers will no longer hold a module reference when registered, we must ensure that the access to the handler list is protected against concurrent module unloading.
Utilize the global luo_register_rwlock to protect the global registry of file handlers. Read locks are taken during list traversals in luo_preserve_file() and luo_file_deserialize(). Write locks are taken during registration and unregistration.
Link: https://lore.kernel.org/20260327033335.696621-4-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 38fb71ac | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: synchronize lazy initialization of FLB private state
The luo_flb_get_private() function, which is responsible for lazily initializing the private state of FLB objects, can be called conc
liveupdate: synchronize lazy initialization of FLB private state
The luo_flb_get_private() function, which is responsible for lazily initializing the private state of FLB objects, can be called concurrently from multiple threads. This creates a data race on the 'initialized' flag and can lead to multiple executions of mutex_init() and INIT_LIST_HEAD() on the same memory.
Introduce a static spinlock (luo_flb_init_lock) local to the function to synchronize the initialization path. Use smp_load_acquire() and smp_store_release() for memory ordering between the fast path and the slow path.
Link: https://lore.kernel.org/20260327033335.696621-3-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 277f4e5e | 27-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: safely print untrusted strings
Patch series "liveupdate: Fix module unloading and unregister API", v3.
This patch series addresses an issue with how LUO handles module reference countin
liveupdate: safely print untrusted strings
Patch series "liveupdate: Fix module unloading and unregister API", v3.
This patch series addresses an issue with how LUO handles module reference counting and unregistration during a module unload (e.g., via rmmod).
Currently, modules that register live update file handlers are pinned for the entire duration they are registered. This prevents the modules from being unloaded gracefully, even when no live update session is in progress.
Furthermore, if a module is forcefully unloaded, the unregistration functions return an error (e.g. -EBUSY) if a session is active, which is ignored by the kernel's module unload path, leaving dangling pointers in the LUO global lists.
To resolve these issues, this series introduces the following changes: 1. Adds a global read-write semaphore (luo_register_rwlock) to protect the registration lists for both file handlers and FLBs. 2. Reduces the scope of module reference counting for file handlers and FLBs. Instead of pinning modules indefinitely upon registration, references are now taken only when they are actively used in a live update session (e.g., during preservation, retrieval, or deserialization). 3. Removes the global luo_session_quiesce() mechanism since module unload behavior now handles active sessions implicitly. 4. Introduces auto-unregistration of FLBs during file handler unregistration to prevent leaving dangling resources. 5. Changes the unregistration functions to return void instead of an error code. 6. Fixes a data race in luo_flb_get_private() by introducing a spinlock for thread-safe lazy initialization. 7. Strengthens security by using %.*s when printing untrusted deserialized compatible strings and session names to prevent out-of-bounds reads.
This patch (of 10):
Deserialized strings from KHO data (such as file handler compatible strings and session names) are provided by the previous kernel and might not be null-terminated if the data is corrupted or maliciously crafted.
When printing these strings in error messages, use the %.*s format specifier with the maximum buffer size to prevent out-of-bounds reads into adjacent kernel memory.
Link: https://lore.kernel.org/20260327033335.696621-1-pasha.tatashin@soleen.com Link: https://lore.kernel.org/20260327033335.696621-2-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 00d0b372 | 26-Mar-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
liveupdate: prevent double management of files
Patch series "liveupdate: prevent double preservation", v4.
Currently, LUO does not prevent the same file from being managed twice across different ac
liveupdate: prevent double management of files
Patch series "liveupdate: prevent double preservation", v4.
Currently, LUO does not prevent the same file from being managed twice across different active sessions.
Because LUO preserves files of absolutely different types: memfd, and upcoming vfiofd [1], iommufd [2], guestmefd (and possible kvmfd/cpufd). There is no common private data or guarantee on how to prevent that the same file is not preserved twice beside using inode or some slower and expensive method like hashtables.
This patch (of 4)
Currently, LUO does not prevent the same file from being managed twice across different active sessions.
Use a global xarray luo_preserved_files to keep track of file identifiers being preserved by LUO. Update luo_preserve_file() to check and insert the file identifier into this xarray when it is preserved, and erase it in luo_file_unpreserve_files() when it is released.
To allow handlers to define what constitutes a "unique" file (e.g., different struct file objects pointing to the same hardware resource), add a get_id() callback to struct liveupdate_file_ops. If not provided, the default identifier is the struct file pointer itself.
This ensures that the same file (or resource) cannot be managed by multiple sessions. If another session attempts to preserve an already managed file, it will now fail with -EBUSY.
Link: https://lore.kernel.org/20260326163943.574070-1-pasha.tatashin@soleen.com Link: https://lore.kernel.org/20260326163943.574070-2-pasha.tatashin@soleen.com Link: https://lore.kernel.org/all/20260129212510.967611-1-dmatlack@google.com [1] Link: https://lore.kernel.org/all/20260203220948.2176157-1-skhawaja@google.com [2] Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 76aa46b9 | 16-Mar-2026 |
Breno Leitao <leitao@debian.org> |
kho: kexec-metadata: track previous kernel chain
Use Kexec Handover (KHO) to pass the previous kernel's version string and the number of kexec reboots since the last cold boot to the next kernel, an
kho: kexec-metadata: track previous kernel chain
Use Kexec Handover (KHO) to pass the previous kernel's version string and the number of kexec reboots since the last cold boot to the next kernel, and print it at boot time.
Example output: [ 0.000000] KHO: exec from: 6.19.0-rc4-next-20260107 (count 1)
Motivation ==========
Bugs that only reproduce when kexecing from specific kernel versions are difficult to diagnose. These issues occur when a buggy kernel kexecs into a new kernel, with the bug manifesting only in the second kernel.
Recent examples include the following commits:
* commit eb2266312507 ("x86/boot: Fix page table access in 5-level to 4-level paging transition") * commit 77d48d39e991 ("efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption") * commit 64b45dd46e15 ("x86/efi: skip memattr table on kexec boot")
As kexec-based reboots become more common, these version-dependent bugs are appearing more frequently. At scale, correlating crashes to the previous kernel version is challenging, especially when issues only occur in specific transition scenarios.
Implementation ==============
The kexec metadata is stored as a plain C struct (struct kho_kexec_metadata) rather than FDT format, for simplicity and direct field access. It is registered via kho_add_subtree() as a separate subtree, keeping it independent from the core KHO ABI. This design choice:
- Keeps the core KHO ABI minimal and stable - Allows the metadata format to evolve independently - Avoids requiring version bumps for all KHO consumers (LUO, etc.) when the metadata format changes
The struct kho_kexec_metadata contains two fields: - previous_release: The kernel version that initiated the kexec - kexec_count: Number of kexec boots since last cold boot
On cold boot, kexec_count starts at 0 and increments with each kexec. The count helps identify issues that only manifest after multiple consecutive kexec reboots.
[leitao@debian.org: call kho_kexec_metadata_init() for both boot paths] Link: https://lore.kernel.org/all/20260309-kho-v8-5-c3abcf4ac750@debian.org/ [1] Link: https://lore.kernel.org/20260409-kho_fix_merge_issue-v1-1-710c84ceaa85@debian.org Link: https://lore.kernel.org/20260316-kho-v9-5-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: SeongJae Park <sj@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 062dd306 | 16-Mar-2026 |
Breno Leitao <leitao@debian.org> |
kho: fix kho_in_debugfs_init() to handle non-FDT blobs
kho_in_debugfs_init() calls fdt_totalsize() to determine blob sizes, which assumes all blobs are FDTs. This breaks for non-FDT blobs like stru
kho: fix kho_in_debugfs_init() to handle non-FDT blobs
kho_in_debugfs_init() calls fdt_totalsize() to determine blob sizes, which assumes all blobs are FDTs. This breaks for non-FDT blobs like struct kho_kexec_metadata.
Fix this by reading the "blob-size" property from the FDT (persisted by kho_add_subtree()) instead of calling fdt_totalsize(). Also rename local variables from fdt_phys/sub_fdt to blob_phys/blob for consistency with the non-FDT-specific naming.
Link: https://lore.kernel.org/20260316-kho-v9-4-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 85e41392 | 16-Mar-2026 |
Breno Leitao <leitao@debian.org> |
kho: persist blob size in KHO FDT
kho_add_subtree() accepts a size parameter but only forwards it to debugfs. The size is not persisted in the KHO FDT, so it is lost across kexec. This makes it im
kho: persist blob size in KHO FDT
kho_add_subtree() accepts a size parameter but only forwards it to debugfs. The size is not persisted in the KHO FDT, so it is lost across kexec. This makes it impossible for the incoming kernel to determine the blob size without understanding the blob format.
Store the blob size as a "blob-size" property in the KHO FDT alongside the "preserved-data" physical address. This allows the receiving kernel to recover the size for any blob regardless of format.
Also extend kho_retrieve_subtree() with an optional size output parameter so callers can learn the blob size without needing to understand the blob format. Update all callers to pass NULL for the new parameter.
Link: https://lore.kernel.org/20260316-kho-v9-3-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 4916ae38 | 16-Mar-2026 |
Breno Leitao <leitao@debian.org> |
kho: rename fdt parameter to blob in kho_add/remove_subtree()
Since kho_add_subtree() now accepts arbitrary data blobs (not just FDTs), rename the parameter from 'fdt' to 'blob' to better reflect it
kho: rename fdt parameter to blob in kho_add/remove_subtree()
Since kho_add_subtree() now accepts arbitrary data blobs (not just FDTs), rename the parameter from 'fdt' to 'blob' to better reflect its purpose. Apply the same rename to kho_remove_subtree() for consistency.
Also rename kho_debugfs_fdt_add() and kho_debugfs_fdt_remove() to kho_debugfs_blob_add() and kho_debugfs_blob_remove() respectively, with the same parameter rename from 'fdt' to 'blob'.
Link: https://lore.kernel.org/20260316-kho-v9-2-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| d9e4142e | 16-Mar-2026 |
Breno Leitao <leitao@debian.org> |
kho: add size parameter to kho_add_subtree()
Patch series "kho: history: track previous kernel version and kexec boot count", v9.
Use Kexec Handover (KHO) to pass the previous kernel's version stri
kho: add size parameter to kho_add_subtree()
Patch series "kho: history: track previous kernel version and kexec boot count", v9.
Use Kexec Handover (KHO) to pass the previous kernel's version string and the number of kexec reboots since the last cold boot to the next kernel, and print it at boot time.
Example ======= [ 0.000000] Linux version 6.19.0-rc3-upstream-00047-ge5d992347849 ... [ 0.000000] KHO: exec from: 6.19.0-rc4-next-20260107upstream-00004-g3071b0dc4498 (count 1)
Motivation ==========
Bugs that only reproduce when kexecing from specific kernel versions are difficult to diagnose. These issues occur when a buggy kernel kexecs into a new kernel, with the bug manifesting only in the second kernel.
Recent examples include:
* eb2266312507 ("x86/boot: Fix page table access in 5-level to 4-level paging transition") * 77d48d39e991 ("efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption") * 64b45dd46e15 ("x86/efi: skip memattr table on kexec boot")
As kexec-based reboots become more common, these version-dependent bugs are appearing more frequently. At scale, correlating crashes to the previous kernel version is challenging, especially when issues only occur in specific transition scenarios.
Some bugs manifest only after multiple consecutive kexec reboots. Tracking the kexec count helps identify these cases (this metric is already used by live update sub-system).
KHO provides a reliable mechanism to pass information between kernels. By carrying the previous kernel's release string and kexec count forward, we can print this context at boot time to aid debugging.
The goal of this feature is to have this information being printed in early boot, so, users can trace back kernel releases in kexec. Systemd is not helpful because we cannot assume that the previous kernel has systemd or even write access to the disk (common when using Linux as bootloaders)
This patch (of 6):
kho_add_subtree() assumes the fdt argument is always an FDT and calls fdt_totalsize() on it in the debugfs code path. This assumption will break if a caller passes arbitrary data instead of an FDT.
When CONFIG_KEXEC_HANDOVER_DEBUGFS is enabled, kho_debugfs_fdt_add() calls __kho_debugfs_fdt_add(), which executes:
f->wrapper.size = fdt_totalsize(fdt);
Fix this by adding an explicit size parameter to kho_add_subtree() so callers specify the blob size. This allows subtrees to contain arbitrary data formats, not just FDTs. Update all callers:
- memblock.c: use fdt_totalsize(fdt) - luo_core.c: use fdt_totalsize(fdt_out) - test_kho.c: use fdt_totalsize() - kexec_handover.c (root fdt): use fdt_totalsize(kho_out.fdt)
Also update __kho_debugfs_fdt_add() to receive the size explicitly instead of computing it internally via fdt_totalsize(). In kho_in_debugfs_init(), pass fdt_totalsize() for the root FDT and sub-blobs since all current users are FDTs. A subsequent patch will persist the size in the KHO FDT so the incoming side can handle non-FDT blobs correctly.
Link: https://lore.kernel.org/20260323110747.193569-1-duanchenghao@kylinos.cn Link: https://lore.kernel.org/20260316-kho-v9-1-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Suggested-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 22bdab8e | 09-Mar-2026 |
Pratyush Yadav <pratyush@kernel.org> |
kho: drop restriction on maximum page order
KHO currently restricts the maximum order of a restored page to the maximum order supported by the buddy allocator. While this works fine for much of the
kho: drop restriction on maximum page order
KHO currently restricts the maximum order of a restored page to the maximum order supported by the buddy allocator. While this works fine for much of the data passed across kexec, it is possible to have pages larger than MAX_PAGE_ORDER.
For one, it is possible to get a larger order when using kho_preserve_pages() if the number of pages is large enough, since it tries to combine multiple aligned 0-order preservations into one higher order preservation.
For another, upcoming support for hugepages can have gigantic hugepages being preserved over KHO.
There is no real reason for this limit. The KHO preservation machinery can handle any page order. Remove this artificial restriction on max page order.
Link: https://lkml.kernel.org/r/20260309123410.382308-2-pratyush@kernel.org Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 019fc368 | 25-Feb-2026 |
Pasha Tatashin <pasha.tatashin@soleen.com> |
kho: fix KASAN support for restored vmalloc regions
Restored vmalloc regions are currently not properly marked for KASAN, causing KASAN to treat accesses to these regions as out-of-bounds.
Fix this
kho: fix KASAN support for restored vmalloc regions
Restored vmalloc regions are currently not properly marked for KASAN, causing KASAN to treat accesses to these regions as out-of-bounds.
Fix this by properly unpoisoning the restored vmalloc area using kasan_unpoison_vmalloc(). This requires setting the VM_UNINITIALIZED flag during the initial area allocation and clearing it after the pages have been mapped and unpoisoned, using the clear_vm_uninitialized_flag() helper.
Link: https://lkml.kernel.org/r/20260225223857.1714801-3-pasha.tatashin@soleen.com Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations") Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reported-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org> Tested-by: Pratyush Yadav (Google) <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|
| 6b0dd42d | 06-Feb-2026 |
Jason Miu <jasonmiu@google.com> |
kho: remove finalize state and clients
Eliminate the `kho_finalize()` function and its associated state from the KHO subsystem. The transition to a radix tree for memory tracking makes the explicit
kho: remove finalize state and clients
Eliminate the `kho_finalize()` function and its associated state from the KHO subsystem. The transition to a radix tree for memory tracking makes the explicit "finalize" state and its serialization step obsolete.
Remove the `kho_finalize()` and `kho_finalized()` APIs and their stub implementations. Update KHO client code and the debugfs interface to no longer call or depend on the `kho_finalize()` mechanism.
Complete the move towards a stateless KHO, simplifying the overall design by removing unnecessary state management.
Link: https://lkml.kernel.org/r/20260206021428.3386442-3-jasonmiu@google.com Signed-off-by: Jason Miu <jasonmiu@google.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
show more ...
|