.. SPDX-License-Identifier: GPL-2.0-or-later
.. _kho-concepts:

=======================
Kexec Handover Concepts
=======================

Kexec HandOver (KHO) is a mechanism that allows Linux to preserve memory
regions, which could contain serialized system states, across kexec.

It introduces multiple concepts:

KHO FDT
=======

Every KHO kexec carries a KHO-specific flattened device tree (FDT) blob
that describes preserved memory regions. These regions contain either
serialized subsystem states or in-memory data that shall not be touched
across kexec. After a KHO kexec, subsystems can retrieve and restore
preserved memory regions from the KHO FDT.

KHO only uses the FDT container format and the libfdt library, but does not
adhere to the same property semantics that normal device trees do: properties
are passed in native endianness, and standardized properties like ``reg`` and
``ranges`` do not exist, hence there are no ``#...-cells`` properties.
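
The difference in byte order can be illustrated with a small user space
sketch. It is not taken from the KHO implementation: the ``sketch_*`` helpers
are hypothetical names, and the code only contrasts a value stored in the byte
order of the running CPU with the big-endian encoding that conventional device
tree cells use.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>
   #include <string.h>

   /* Hypothetical helper: store a 64-bit value in the CPU's own byte order. */
   static void sketch_put_native_u64(unsigned char buf[8], uint64_t val)
   {
       assert(buf != NULL);
       memcpy(buf, &val, sizeof(val));
   }

   /* Hypothetical helper: read back a value stored in native byte order. */
   static uint64_t sketch_get_native_u64(const unsigned char buf[8])
   {
       uint64_t val;

       assert(buf != NULL);
       memcpy(&val, buf, sizeof(val));
       return val;
   }

   /* For contrast: conventional device tree cells are big-endian. */
   static void sketch_put_be_u64(unsigned char buf[8], uint64_t val)
   {
       for (int i = 0; i < 8; i++)
           buf[i] = (unsigned char)(val >> (56 - 8 * i));
   }

   /* Decode a big-endian 64-bit value, most significant byte first. */
   static uint64_t sketch_get_be_u64(const unsigned char buf[8])
   {
       uint64_t val = 0;

       for (int i = 0; i < 8; i++)
           val = (val << 8) | buf[i];
       return val;
   }

A native-endian value can be read back with a plain copy and needs no byte
swapping, but the stored bytes are only meaningful to a reader with the same
byte order as the writer.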

KHO is still under development. The FDT schema is unstable and may change
in the future.

Scratch Regions
===============

To boot into kexec, we need to have a physically contiguous memory range that
contains no handed over memory. Kexec then places the target kernel and initrd
into that region. The new kernel exclusively uses this region for memory
allocations during boot, up to the initialization of the page allocator.

We guarantee that we always have such regions through the scratch regions: On
first boot, KHO allocates several physically contiguous memory regions. Since
these regions will be used by early memory allocations after kexec, there is a
scratch region per NUMA node plus a scratch region to satisfy allocation
requests that do not require a particular NUMA node assignment.

By default, the size of the scratch regions is calculated based on the amount
of memory allocated during boot. The ``kho_scratch`` kernel command line
option may be used to explicitly define the size of the scratch regions.

The scratch regions are declared as CMA when the page allocator is initialized
so that their memory can be used during system lifetime. CMA gives us the
guarantee that no handover pages land in those regions, because handover pages
must be at a static physical memory location and CMA enforces that only
movable pages can be located inside.
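
The layout described above, one scratch region per NUMA node plus one
node-agnostic region, can be modelled in a few lines of user space C. This is
only a sketch under stated assumptions: ``struct sketch_scratch``,
``SKETCH_NO_NODE`` and both helpers are hypothetical names, not kernel
interfaces. The overlap check merely states the invariant that preserved
memory must stay outside the scratch regions; in the kernel that guarantee
comes from CMA, as explained above.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   #define SKETCH_NO_NODE (-1) /* hypothetical marker for "any NUMA node" */

   /* Hypothetical descriptor; not the kernel's own data structure. */
   struct sketch_scratch {
       uint64_t base; /* physical start address */
       uint64_t size; /* length in bytes */
       int nid;       /* owning NUMA node, or SKETCH_NO_NODE */
   };

   /* Pick the region for @nid, falling back to the node-agnostic region. */
   static const struct sketch_scratch *
   sketch_pick_scratch(const struct sketch_scratch *tbl, size_t nr, int nid)
   {
       const struct sketch_scratch *global = NULL;

       assert(tbl != NULL);
       for (size_t i = 0; i < nr; i++) {
           if (nid != SKETCH_NO_NODE && tbl[i].nid == nid)
               return &tbl[i];
           if (tbl[i].nid == SKETCH_NO_NODE)
               global = &tbl[i];
       }
       return global;
   }

   /* True if [base, base + size) intersects any scratch region. */
   static bool sketch_overlaps_scratch(const struct sketch_scratch *tbl,
                                       size_t nr, uint64_t base, uint64_t size)
   {
       assert(tbl != NULL);
       for (size_t i = 0; i < nr; i++) {
           if (base < tbl[i].base + tbl[i].size &&
               tbl[i].base < base + size)
               return true;
       }
       return false;
   }

In this model, a request for a node without its own entry is served from the
node-agnostic region, and a candidate handover range is rejected as soon as it
touches any scratch region.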

After a KHO kexec, we ignore the ``kho_scratch`` kernel command line option
and instead reuse the exact same regions that were originally allocated. This
allows us to recursively execute any number of KHO kexecs. Because we used
these regions for boot memory allocations and as target memory for kexec
blobs, some parts of that memory may be reserved. These reservations are
irrelevant for the next KHO kexec, because kexec can overwrite even the
original kernel.

.. _kho-finalization-phase:

KHO finalization phase
======================

To enable a user space based kexec file loader, the kernel needs to be able to
provide the FDT that describes the current kernel's state before
performing the actual kexec. The process of generating that FDT is
called serialization. Once the FDT is generated, some properties
of the system may become immutable because they are already written down
in the FDT. That state is called the KHO finalization phase.
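
The effect of finalization can be pictured as a small state machine. The
following user space sketch is an illustration only and does not reflect the
actual KHO code: every name in it (``struct sketch_kho_state``,
``sketch_preserve()``, ``sketch_finalize()``) and the chosen error codes are
assumptions made for the example. It shows one plausible way that state
already written into the FDT could be treated as immutable.

.. code-block:: c

   #include <assert.h>
   #include <errno.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   #define SKETCH_MAX_REGIONS 8 /* arbitrary limit for the sketch */

   struct sketch_region {
       uint64_t base;
       uint64_t size;
   };

   /* Hypothetical bookkeeping; not the kernel's own data structure. */
   struct sketch_kho_state {
       struct sketch_region regions[SKETCH_MAX_REGIONS];
       size_t nr;
       bool finalized; /* set once the FDT has been generated */
   };

   /* Record a region to preserve; refused once the state is finalized. */
   static int sketch_preserve(struct sketch_kho_state *st,
                              uint64_t base, uint64_t size)
   {
       assert(st != NULL);
       if (st->finalized)
           return -EBUSY;
       if (st->nr == SKETCH_MAX_REGIONS)
           return -ENOSPC;
       st->regions[st->nr].base = base;
       st->regions[st->nr].size = size;
       st->nr++;
       return 0;
   }

   /* Stand-in for serialization: freeze the recorded state. */
   static int sketch_finalize(struct sketch_kho_state *st)
   {
       assert(st != NULL);
       if (st->finalized)
           return -EEXIST;
       st->finalized = true;
       return 0;
   }

In this model, a caller that tries to preserve more memory after
``sketch_finalize()`` gets an error instead of silently diverging from what
the FDT already describes.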

Public API
==========

.. kernel-doc:: kernel/kexec_handover.c
   :export: