| 856bc8bb | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/tdx: Refresh TDX module version after update
The kernel exposes the TDX module version through sysfs so userspace can check update compatibility. That information needs to remain accurate across runtime updates.
A runtime update may change the module's update_version, so refresh the cached version right after a successful update.
Drop __ro_after_init from tdx_sysinfo because it is now updated at runtime.
Do not refresh the rest of tdx_sysinfo, even if some values change across updates. TDX module updates are backward compatible, so existing tdx_sysinfo consumers, such as KVM, can continue to operate without seeing the new values.
[ dhansen: trim changelog ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://patch.msgid.link/20260520133909.409394-22-chao.gao@intel.com
|
| 6e97c234 | 22-May-2026 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86/virt/seamldr: Add module update locking
TDX metadata like the version number changes during a module update. Add functions to lock out module updates.
The current stop_machine() implementation uses worker threads. The scheduler actually does a full, normal context switch over to that thread. preempt_disable() obviously inhibits that context switch and thus, locks out stop_machine() users like the module update.
Thanks to Chao for the idea of using preempt_disable().
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
|
| b344c50a | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/tdx: Restore TDX module state
After per-CPU initialization, the module is nearly functional. It is in a similar state to TDX initialization before TDH.SYS.CONFIG.
At this point, the kernel _could_ just repeat the boot-time sequence, but that would land the new module in a slightly different state than the old module. This would leave old TDs unrunnable, which is not a good outcome.
Thankfully, the "handoff" data saved during module shutdown should contain all the information needed to restore the TDX module state to exactly what it was before the update.
Restore TDX module state. The TDX module only needs a single copy so only do this on the lead CPU.
Restoration errors can theoretically be handled in a few ways. For instance, userspace could try to load a different TDX module version. Or, the kernel could give up on the handoff process and just reinitialize the new module from scratch, which would lose all existing TDs.
Simply propagate errors to userspace. Ignore the idea of a TD-destroying reinitialization. It would destroy data like a reboot and if things have gone that wrong a reboot is probably the best option anyway.
Note: the location and the format of handoff data is defined by the TDX module. The new module knows where to get handoff data and how to parse it. The kernel does not touch it at all.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://patch.msgid.link/20260520133909.409394-21-chao.gao@intel.com
|
| bf7c0ed2 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Initialize the newly-installed TDX module
Continue fleshing out the update process. At this point the new module is sitting in memory but has never been called and is not usable. It is in a similar state to when the system first boots.
Leave the P-SEAMLDR behind. Stop making calls to it. Transition to calling the new TDX module itself to set up both global and per-cpu state.
Share tdx_cpu_enable() with the fresh-boot module initialization code. Export it and invoke it on all CPUs.
Note: "TDX global initialization" needs to be done once before "TDX per-CPU initialization". It would be a great fit for the new runtime update "is_lead_cpu" logic. But tdx_cpu_enable() already has some logic to do the global initialization properly. Just use it directly to maximize fresh-boot and runtime update code sharing.
== Background ==
The boot-time and post-update initialization flows share the same first steps:
- TDX global initialization
- TDX per-CPU initialization
After that, they diverge:
- Fresh boot:
    Prepare TDMRs/PAMTs
    Configure the TDX module
    Configure the global KeyID
    Initialize TDMRs
- Runtime update:
    Restore TDX module state from handoff data
Future changes will consume the handoff data.
[ dhansen: major changelog munging ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://patch.msgid.link/20260520133909.409394-20-chao.gao@intel.com
|
| 2bfb2ef8 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Install a new TDX module
Continue fleshing out the update process. The old module is shut down and the system is ready for the new module image. Run the SEAMLDR.INSTALL SEAMCALL on all CPUs.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://patch.msgid.link/20260520133909.409394-19-chao.gao@intel.com
|
| 65a6542a | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/tdx: Reset software states during TDX module shutdown
The TDX module requires a one-time global initialization (TDH.SYS.INIT) and per-CPU initialization (TDH.SYS.LP.INIT) before use. These initializations are guarded by software flags to prevent repetition.
Reset all software flags guarding the initialization flows to allow the global and per-CPU initializations to be triggered again after updates.
[ dhansen: trim down changelog ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://patch.msgid.link/20260520133909.409394-18-chao.gao@intel.com
|
| 9ea06080 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Shut down the current TDX module
The first step of TDX module updates is shutting down the current TDX module. This step also packs state information that needs to be preserved across updates, called "handoff data". This handoff data is consumed by the updated module and stored internally in the SEAM range and hidden from the kernel.
Since the handoff data layout may change between modules, the handoff data is versioned. Each module has a native handoff version and provides backward support for several older versions.
The complete handoff versioning protocol is complex as it supports both module upgrades and downgrades. See details in "Intel Trust Domain Extensions (Intel TDX) Module Base Architecture Specification", Chapter "Handoff Versioning".
Ideally, the kernel needs to retrieve the handoff versions supported by the current module and the new module and select a version supported by both. But since this implementation only supports module upgrades, simply request handoff data from the current module using its highest supported version. That is sufficient for this upgrade-only implementation.
Retrieve the module's handoff version from TDX global metadata and add an update step to shut down the module. Module shutdown only needs to run on one CPU.
Don't cache the handoff information in tdx_sysinfo. It is used only for module shutdown, and is present only when the TDX module supports updates. Caching it in get_tdx_sys_info() would require extra update-support guards and refreshing the cached value across module updates.
[ dhansen: fix up function variables, remove 'cpu'. Return from tdx_module_shutdown() early if handoff call fails. ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Link: https://patch.msgid.link/20260520133909.409394-17-chao.gao@intel.com
|
| be4efe63 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Abort updates after a failed step
A TDX module update is a multi-step process, and any step can fail.
The current update flow continues to later steps after an error. Continuing after a failure can cause the TDX module to enter an unrecoverable state.
But certain failures during the initial module shutdown step should simply return an error to userspace, so the update can be retried cleanly.
To preserve that recoverability, one option would be to abort the update only for those failures, since they occur before any TDX module state is changed. But special-casing specific failures in specific steps would complicate the do-while() update loop for no benefit.
Simply abort update on any failure, at any step.
Track failures for each step, stop the update loop once a failure is observed, and do not advance the state machine to the next step.
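The abort-on-any-failure policy can be illustrated with a small single-threaded sketch. This is not the kernel code: the step names, the step_fn type, and the stub step functions are hypothetical, and the real flow runs on all CPUs under stop_machine(). It only shows a do-while loop that records the failure and does not advance past the failed step.

```c
/* Hypothetical step names, loosely following the update flow in this series. */
enum update_step { STEP_SHUTDOWN, STEP_INSTALL, STEP_INIT, STEP_RESTORE, STEP_DONE };

typedef int (*step_fn)(void);

struct update_result {
	enum update_step stopped_at;	/* step that failed, or STEP_DONE */
	int err;			/* 0 on success, else the step's error */
};

/* Stub steps for illustration: count calls and succeed or fail. */
static int calls;
static int step_ok(void)   { calls++; return 0; }
static int step_fail(void) { calls++; return -5; }

static struct update_result run_update(step_fn steps[STEP_DONE])
{
	struct update_result res = { .stopped_at = STEP_SHUTDOWN, .err = 0 };

	do {
		res.err = steps[res.stopped_at]();
		if (res.err)
			break;	/* stop here: never advance past a failed step */
		res.stopped_at = (enum update_step)(res.stopped_at + 1);
	} while (res.stopped_at != STEP_DONE);

	return res;
}
```

With a failing second step, the loop runs two steps and reports the error at STEP_INSTALL; no step-specific special cases are needed.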
[ dhansen: style nits ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Link: https://lore.kernel.org/linux-coco/aQFmOZCdw64z14cJ@google.com/ # [1]
Link: https://patch.msgid.link/20260520133909.409394-16-chao.gao@intel.com
|
| ab6be116 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Introduce skeleton for TDX module updates
tl;dr: Use stop_machine() and a state machine based on the "MULTI_STOP" pattern to implement core TDX module update logic.
Long version:
TDX module updates require careful synchronization with other TDX operations. The requirements are (#1/#2 reflect current behavior that must be preserved):
1. SEAMCALLs need to be callable from both process and IRQ contexts.
2. SEAMCALLs need to be able to run concurrently across CPUs.
3. During updates, only update-related SEAMCALLs are permitted; all other SEAMCALLs shouldn't be called.
4. During updates, all online CPUs must participate in the update work.
No single lock primitive satisfies all requirements. For instance, rwlock_t handles #1/#2 but fails #4: CPUs spinning with IRQs disabled cannot be directed to perform update work.
Use stop_machine() as it is the only well-understood mechanism that can meet all requirements.
And TDX module updates consist of several steps (See Intel Trust Domain Extensions (Intel TDX) Module Base Architecture Specification, Chapter "TD-Preserving TDX module Update"). Ordering requirements between steps mandate lockstep synchronization across all CPUs.
multi_cpu_stop() provides a good example of executing a multi-step task in lockstep across CPUs, but it does not synchronize the individual steps inside the callback itself.
Implement a similar state machine as the skeleton for TDX module updates. Each state represents one step in the update flow, and the state advances only after all CPUs acknowledge completion of the current step. This acknowledgment mechanism provides the required lockstep execution.
The update flow is intentionally simpler than multi_cpu_stop() in two ways:
a) use a spinlock to protect the control data instead of atomic_t and explicit memory barriers.
b) omit touch_nmi_watchdog() and rcu_momentary_eqs(), which exist there for debugging and are not strictly needed for this update flow
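The acknowledgment mechanism can be sketched in a few lines of plain C. This is a simplified, single-threaded model under stated assumptions: the state names, struct update_ctl, ctl_init(), and ctl_ack() are hypothetical, and the kernel version would take the spinlock mentioned above around the counter update. The point is only that the state advances when the last CPU acknowledges the current step.

```c
/* Hypothetical states; each one stands for a step of the update flow. */
enum update_state { UPDATE_START, UPDATE_SHUTDOWN, UPDATE_INSTALL, UPDATE_DONE };

struct update_ctl {
	enum update_state state;	/* step all CPUs are currently working on */
	int num_cpus;			/* number of participating CPUs */
	int acks_pending;		/* CPUs that have not finished this step */
};

static void ctl_init(struct update_ctl *ctl, int num_cpus)
{
	ctl->state = UPDATE_START;
	ctl->num_cpus = num_cpus;
	ctl->acks_pending = num_cpus;
}

/*
 * Called by a CPU when it has finished the current step. In real code this
 * would run with a lock held. Returns 1 if this acknowledgment was the last
 * one and moved every CPU on to the next step, 0 otherwise.
 */
static int ctl_ack(struct update_ctl *ctl)
{
	if (--ctl->acks_pending > 0)
		return 0;

	ctl->state = (enum update_state)(ctl->state + 1);
	ctl->acks_pending = ctl->num_cpus;	/* re-arm for the next step */
	return 1;
}
```

Each CPU would poll ctl->state, do the work for a newly observed state, and then call ctl_ack(); no CPU can start step N+1 before all CPUs have finished step N.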
Potential alternative to stop_machine()
=======================================
An alternative approach is to lock all KVM entry points and kick all vCPUs. Here, KVM entry points refer to KVM VM/vCPU ioctl entry points, implemented in KVM common code (virt/kvm). Adding a locking mechanism there would affect all architectures KVM supports. And to lock only TDX vCPUs, new logic would be needed to identify TDX vCPUs, which the KVM common code currently lacks. This would add significant complexity and maintenance overhead to KVM for this TDX-specific use case, so don't take this approach.
[ dhansen: normal changelog/style munging ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://patch.msgid.link/20260520133909.409394-15-chao.gao@intel.com
|
| 23a81e6c | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Allocate and populate a module update request
There are two important ABIs here:
'struct tdx_image' - The on-disk and in-memory format for a TDX module update image.
'struct seamldr_params' - The in-memory ABI passed to the TDX module loader. Points to a single 'struct tdx_image' broken up into 4k pages.
Userspace supplies the update image in 'struct tdx_image' format. The image consists of a header followed by a sigstruct and the module binary. P-SEAMLDR, however, consumes 'struct seamldr_params' rather than the image directly.
Parse the 'struct tdx_image' provided by userspace and populate a matching 'struct seamldr_params'.
The 'tdx_image' ABI is versioned. Two public versions exist today: 0x100 and 0x200. This kernel only accepts 0x200. The older 0x100 format is being deprecated and is intentionally not supported here. Future versions of the module might be able to use the same ABIs (user/kernel and kernel/SEAMLDR) but they will not be able to use this kernel code.
Reject module images without that specific version. This ensures that the kernel is able to understand the passed-in format.
Validate the 'struct tdx_image' header before using it, because the header is consumed solely by the kernel to locate the sigstruct and module within the image. Do not validate the payload itself. The sigstruct and module pages are passed through to P-SEAMLDR, which validates them as part of the update.
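A header check of this kind might look like the sketch below. The structure layout, field names, and the specific checks are assumptions made for illustration; they are not the real 'struct tdx_image' ABI. Only the accepted version (0x200) and the 4k page granularity come from the text above.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K		4096u
#define IMAGE_VERSION_SUPPORTED	0x200u	/* the only version accepted, per the changelog */

/* Hypothetical header: field names and layout are illustrative only. */
struct image_header {
	uint16_t version;
	uint32_t header_len;	/* bytes occupied by the header */
	uint32_t sigstruct_len;	/* bytes occupied by the sigstruct */
	uint32_t module_len;	/* bytes occupied by the module binary */
};

/* Returns 1 if the header is usable for locating the payload, else 0. */
static int image_header_valid(const struct image_header *h, size_t total_len)
{
	uint64_t sum;

	/* Reject any format this code does not understand. */
	if (h->version != IMAGE_VERSION_SUPPORTED)
		return 0;

	/* Every region must be a whole number of 4k pages. */
	if (h->header_len % PAGE_SIZE_4K || h->sigstruct_len % PAGE_SIZE_4K ||
	    h->module_len % PAGE_SIZE_4K)
		return 0;

	/* The payload regions must not be empty. */
	if (!h->sigstruct_len || !h->module_len)
		return 0;

	/* The regions must account for the whole image; 64-bit sum avoids overflow. */
	sum = (uint64_t)h->header_len + h->sigstruct_len + h->module_len;
	return sum == total_len;
}
```

The sketch deliberately looks only at the header. The sigstruct and module contents are left alone, matching the split described above where P-SEAMLDR validates the payload.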
sigstruct_pages_pa_list currently has only one entry, but it will grow to four pages in the future. Keep it as an array for symmetry with module_pages_pa_list and for extensibility.
[ dhansen: normal changelog clarification/munging ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20260520133909.409394-14-chao.gao@intel.com
|
| c3e70c5e | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
coco/tdx-host: Implement firmware upload sysfs ABI for TDX module updates
tl;dr: Select fw_upload for doing TDX module updates. The process of selecting among available update images is complicated and nuanced. Punt the selection process out to userspace. One existing userspace implementation today is the script in the Intel TDX Module Binaries repository[1].
Long Version:
The kernel supports two primary firmware update mechanisms:
1. request_firmware() - used by microcode, SEV firmware, hundreds of other drivers
2. 'struct fw_upload' - used by CXL, FPGA updates, dozens of others
The key difference is that request_firmware() loads a named file from the filesystem where the filename is kernel-controlled, while fw_upload accepts firmware data directly from userspace.
TDX module firmware update selection policy is too complex for the kernel. Leave it to userspace and use fw_upload.
Add a skeleton fw_upload implementation to be fleshed out in subsequent patches.
Refactor the sysfs visibility attribute function so it can be used as a more generic flag for the presence of viable runtime update support.
Why fw_upload instead of request_firmware()?
============================================
Selecting a TDX module update image is not a simple "load the latest" decision. Userspace needs to choose an image that is compatible with both the platform and the currently running module.
Some constraints are hard requirements:
a. Module version series are platform-specific. For example, the 1.5.x series runs on Sapphire Rapids but not Granite Rapids, which needs 2.0.x.
b. Updates are also constrained by version distance. A 1.5.6 module might permit updates to 1.5.7 but not to 1.5.50.
There may also be userspace policy choices:
c. Decide the update direction: upgrade or downgrade
d. Choose whether to optimize for fewer updates or smaller version steps, for example, 1.2.3=>1.2.5 versus 1.2.3=>1.2.4=>1.2.5.
Given that complexity, leave module selection to userspace and use fw_upload.
1. https://github.com/intel/confidential-computing.tdx.tdx-module.binaries/blob/main/version_select_and_load.py
[ dhansen: add version script link, add more explanation of code moves, fix some minor whitespace issues ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Link: https://lore.kernel.org/kvm/01fc8946-eb84-46fa-9458-f345dd3f6033@intel.com/
Link: https://patch.msgid.link/20260520133909.409394-13-chao.gao@intel.com
|
| 5ce9cc5a | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
coco/tdx-host: Don't expose P-SEAMLDR information on CPUs with erratum
TDX-capable CPUs clobber the current VMCS on P-SEAMLDR calls. Clearing the current VMCS behind KVM's back breaks KVM.
Future CPUs will fix this by preserving the current VMCS across P-SEAMLDR calls. A future specification update will describe the VMCS-clearing behavior as an erratum and state that it does not occur when IA32_VMX_BASIC[60] is set.
Add a CPU bug bit and refuse to expose P-SEAMLDR information on affected CPUs.
Use a CPU bug bit to stay consistent with X86_BUG_TDX_PW_MCE. As a bonus, the bug bit is visible to userspace, which allows userspace to determine why these sysfs files are not exposed, and it can also be checked by other kernel components in the future if needed.
== Alternatives ==
Two workarounds were considered but both were rejected:
1. Save/restore the current VMCS around P-SEAMLDR calls. This produces ugly assembly code [1] and doesn't play well with #MCE or #NMI if they need to use the current VMCS.
2. Move KVM's VMCS tracking logic to the TDX core code, which would break the boundary between KVM and the TDX core code [2].
[ dhansen: comment and changelog munging. Add seamldr_call() bug check. ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/kvm/fedb3192-e68c-423c-93b2-a4dc2f964148@intel.com/ # [1]
Link: https://lore.kernel.org/kvm/aYIXFmT-676oN6j0@google.com/ # [2]
Link: https://patch.msgid.link/20260520133909.409394-12-chao.gao@intel.com
|
| 0988bf69 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Add a helper to retrieve P-SEAMLDR information
P-SEAMLDR reports its state via SEAMLDR.INFO, including its version and the number of remaining runtime updates.
This information is useful for userspace. For example, userspace can use the P-SEAMLDR version to determine whether a candidate TDX module is compatible with the running loader, and can use the remaining update count to determine whether another runtime update is still possible.
Add a helper to retrieve P-SEAMLDR information in preparation for exposing P-SEAMLDR version and other necessary information to userspace. Export the new kAPI for use by the "tdx_host" device.
Note that there are two distinct P-SEAMLDR APIs with similar names:
"SEAMLDR.INFO" is metadata about the loader. It's metadata for the update process.
"SEAMLDR.SEAMINFO" is metadata about SEAM mode. It is for the module init process, not for the update process.
Use SEAMLDR.INFO here.
For details, see "Intel Trust Domain Extensions - SEAM Loader (SEAMLDR) Interface Specification".
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20260520133909.409394-10-chao.gao@intel.com
|
| 9bc3ce2c | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/seamldr: Introduce a wrapper for P-SEAMLDR SEAMCALLs
The TDX architecture uses the "SEAMCALL" instruction to communicate with SEAM mode software. Right now, the only SEAM mode software that the kernel communicates with is the TDX module. But, there is actually another component that runs in SEAM mode but it is separate from the TDX module: the persistent SEAM loader or "P-SEAMLDR". Right now, the only component that communicates with it is the BIOS which loads the TDX module itself at boot. But, to support updating the TDX module, the kernel now needs to be able to talk to it.
P-SEAMLDR SEAMCALLs differ from TDX module SEAMCALLs in areas such as concurrency requirements.
Add a P-SEAMLDR wrapper to handle these differences and prepare for implementing concrete functions.
Use seamcall_prerr() (not '_ret') because current P-SEAMLDR calls do not use any output registers other than RAX.
Note: Despite the similar name, the NP-SEAMLDR ("Non-Persistent") is an authenticated code module (ACM) invoked exclusively by the BIOS at boot rather than a component running in SEAM mode. The kernel cannot call it at runtime. It exposes no SEAMCALL interface.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://cdrdv2.intel.com/v1/dl/getContent/733582 # [1]
Link: https://patch.msgid.link/20260520133909.409394-9-chao.gao@intel.com
|
| 6fe8a33f | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
coco/tdx-host: Expose TDX module version
For TDX module updates, userspace needs to select compatible update versions based on the current module version.
For example, the 1.5.x series runs on Sapphire Rapids but not Granite Rapids, which needs 2.0.x. Updates are also constrained by version distance, so a 1.5.6 module might permit updates to 1.5.7 but not to 1.5.20.
Start the process of punting the version selection logic to userspace. Expose the TDX module version in the new faux device.
Define TDX_VERSION_FMT macro for the TDX version format since it will be used multiple times. Also convert an existing print statement to use it.
== Background ==
For posterity, here's what other firmware mechanisms do:
1. AMD SEV leverages an existing PCI device for the PSP to expose metadata. TDX uses a faux device as it doesn't have a PCI device in its architecture.
2. Microcode uses per-CPU virtual devices to report microcode revisions because CPUs can have different revisions. But, there is only a single TDX module, so exposing the TDX module version through a global TDX faux device is appropriate.
3. ARM's CCA implementation isn't in-tree yet, but will likely follow a similar faux device approach, though it's unclear whether they need to expose firmware version information.
[ dhansen: trim changelog ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/2025073035-bulginess-rematch-b92e@gregkh/ # [1]
Link: https://patch.msgid.link/20260520133909.409394-8-chao.gao@intel.com
|
| 59783353 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
coco/tdx-host: Introduce a "tdx_host" device
TDX depends on a platform firmware module that runs on the CPU. Unlike other CoCo architectures, TDX has no hardware "device" running the show, just a blob on the CPU.
Create a virtual device to anchor interactions with this platform firmware. This lets later code:
- expose metadata: TDX module version, seamldr version, to userspace as device attributes
- implement firmware uploader APIs (which are tied to a device) to support TDX module runtime updates
Use a faux device because the TDX module is singular within the system and has no platform resources. Using a faux device eliminates the need to create a stub bus.
The call to tdx_get_sysinfo() ensures that the TDX module is ready to provide services.
Note that AMD has a PCI device for the PSP for SEV and ARM CCA will likely have a faux device [1].
Thanks to Dan and Yilun for all the help on this one.
[ dhansen: trim changelog ]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/all/2025073035-bulginess-rematch-b92e@gregkh/ # [1]
Link: https://patch.msgid.link/20260520133909.409394-7-chao.gao@intel.com
|
| 2e41297b | 21-May-2026 |
Kai Huang <kai.huang@intel.com> |
x86/virt/tdx: Move low level SEAMCALL helpers out of <asm/tdx.h>
TDX host core code implements three seamcall*() helpers to make SEAMCALLs to the TDX module. Currently, they are implemented in <asm/tdx.h> and are exposed to other kernel code which includes <asm/tdx.h>.
However, other than the TDX host core, seamcall*() are not expected to be used by other kernel code directly. For instance, for all SEAMCALLs that are used by KVM, the TDX host core exports a wrapper function for each of them.
Move seamcall*() and related code out of <asm/tdx.h> and make them only visible to TDX host core.
Since TDX host core tdx.c is already very heavy, don't put the low level seamcall*() code there; put it in a new dedicated "seamcall_internal.h" instead. Also, tdx.c currently has seamcall_prerr*() helpers which additionally print an error message when calling seamcall*() fails. Move them to "seamcall_internal.h" as well. This way, all low level SEAMCALL helpers are in a dedicated place, which is much more readable.
Copy the copyright notice from the original files and consolidate the date ranges to:
Copyright (C) 2021-2023 Intel Corporation
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Vishal Annapurve <vannapurve@google.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20260520133909.409394-6-chao.gao@intel.com
|
| 77525820 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/tdx: Move TDX_FEATURES0 bits to asm/tdx.h
Future changes will add support for new TDX features exposed as TDX_FEATURES0 bits. The presence of these features will need to be checked outside of arch/x86/virt. The feature query helpers and the TDX_FEATURES0 defines they reference will need to live in the widely accessible asm/tdx.h header. Move the existing TDX_FEATURES0 to asm/tdx.h so that they can all be kept together.
Opportunistically switch to BIT_ULL() since TDX_FEATURES0 is 64-bit.
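The effect of the BIT_ULL() switch can be sketched in user space. This is only an illustration: the macro below is a stand-in for the kernel's BIT_ULL(), and EXAMPLE_FEATURES0_HIGH_BIT and example_feature_supported() are hypothetical names, not real TDX_FEATURES0 defines or kernel helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-in for the kernel's BIT_ULL(): always a 64-bit constant. */
#define BIT_ULL(nr) (1ULL << (nr))

/* Hypothetical feature bit, for illustration only. */
#define EXAMPLE_FEATURES0_HIGH_BIT BIT_ULL(40)

/* Return true if the 64-bit features word has the example bit set. */
static inline bool example_feature_supported(uint64_t features0)
{
	return !!(features0 & EXAMPLE_FEATURES0_HIGH_BIT);
}
```

Defining the bits with BIT_ULL() makes the 64-bit width explicit, so a bit above 31 is well defined regardless of the width of unsigned long.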
No functional change intended.
[ dhansen: grammar fixups ]
Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/kvm/20260427152854.101171-17-chao.gao@intel.com/ # [1] Link: https://lore.kernel.org/kvm/20251121005125.417831-16-rick.p.edgecombe@intel.com/ # [2] Link: https://patch.msgid.link/20260520133909.409394-5-chao.gao@intel.com
|
| 451735bf | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/tdx: Consolidate TDX global initialization states
The kernel uses several global flags to guard one-time TDX initialization flows and prevent them from being repeated.
When the TDX module is updated, all of those states must be reset so that the module can be initialized again. Today those states are kept as separate global variables, which makes the reset path awkward and easy to miss when a new state is added.
Group the states into a single structure so they can be reset together, for example with memset(), and so a newly added state won't be missed.
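A minimal user-space sketch of the grouping idea follows. The structure layout and all names (example_init_state, its fields, example_reset_init_state()) are hypothetical and chosen for illustration; they are not the kernel's actual definitions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical layout: every one-time-init state lives in one structure. */
struct example_init_state {
	bool global_init_done;    /* global initialization was attempted */
	int  global_init_ret;     /* cached result of that attempt */
	bool module_initialized;  /* module initialization completed */
};

static struct example_init_state example_state;

/* Reset all states together, e.g. after a module update. */
static void example_reset_init_state(void)
{
	memset(&example_state, 0, sizeof(example_state));
}
```

Because the reset covers the whole structure, a field added later is cleared automatically without touching the reset path.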
Drop the __ro_after_init annotation from tdx_module_initialized because the other two states do not have it. And with TDX module update support, all the states need to be writable at runtime.
Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20260520133909.409394-4-chao.gao@intel.com
|
| 9c0c6870 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/tdx: Move TDX global initialization states to file scope
TDX module global initialization is executed only once. The first call caches both the result and the "done" state, and later callers reuse the saved result. A lock protects those cached states.
Those states and the lock are currently kept as function-local statics because they are used only by try_init_module_global().
TDX module updates need to reset the cached states so TDX global initialization can be run again after an update. That will add another access site in the same file.
Move the cached states to file scope so they are accessible outside try_init_module_global(), and move the lock along with the states it protects.
No functional change intended.
Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20260520133909.409394-3-chao.gao@intel.com
|
| 1ffa6a10 | 21-May-2026 |
Chao Gao <chao.gao@intel.com> |
x86/virt/tdx: Clarify try_init_module_global() result caching
TDX module global initialization is executed only once. The first call caches both the return code and the "done" state in static function variables. Later callers read the variables. A lock protects the saved state and serializes callers.
These variables will soon be moved to a global structure. Prepare for that by treating the variables as a unit. Assign them together and access them only while the lock is held.
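The caching pattern described above can be sketched in user space as follows. This is an assumption-laden illustration, not the kernel code: a pthread mutex stands in for the kernel lock, and example_do_global_init(), example_try_init_global() and example_cache are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

/* Lock protecting the cached state below (stand-in for the kernel lock). */
static pthread_mutex_t example_lock = PTHREAD_MUTEX_INITIALIZER;

/* The "done" flag and the return code are treated as one unit. */
static struct {
	bool done;
	int  ret;
} example_cache;

/* Counts how often the real initialization ran (for demonstration). */
static int example_init_calls;

/* Hypothetical one-time initialization; always succeeds here. */
static int example_do_global_init(void)
{
	example_init_calls++;
	return 0;
}

/* Run the initialization once; later callers get the cached result. */
static int example_try_init_global(void)
{
	int ret;

	pthread_mutex_lock(&example_lock);
	if (!example_cache.done) {
		/* Assign both fields together, under the lock. */
		example_cache.ret = example_do_global_init();
		example_cache.done = true;
	}
	ret = example_cache.ret;
	pthread_mutex_unlock(&example_lock);

	return ret;
}
```

Calling example_try_init_global() repeatedly runs the initialization only on the first call. Every read and write of the cached state happens while the lock is held.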
[ dhansen: mostly rewrite changelog ]
Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20260520133909.409394-2-chao.gao@intel.com
|
| 3f330fbb | 30-Apr-2026 |
Yan Zhao <yan.y.zhao@intel.com> |
x86/virt/tdx: Move mk_keyed_paddr() to tdx.c due to no external users
Move mk_keyed_paddr() from tdx.h to tdx.c to avoid unnecessary header inclusion and improve encapsulation since there are no users outside of tdx.c.
No functional change intended.
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Acked-by: Kiryl Shutsemau <kas@kernel.org> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20260430015014.24261-1-yan.y.zhao@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
|
| 4a72a6dc | 30-Apr-2026 |
Yan Zhao <yan.y.zhao@intel.com> |
x86/tdx: Drop exported function tdx_quirk_reset_page()
KVM invokes tdx_quirk_reset_page() to reset TDX control pages (including S-EPT pages, TDR page, etc.), as all those pages are allocated by KVM TDX and thus always have struct page.
However, it's also reasonable for KVM to reset those TDX control pages via tdx_quirk_reset_paddr() directly, eliminating the need to export two parallel APIs. Keeping tdx_quirk_reset_page() as a one-line helper in the header file is also unnecessary.
No functional change intended.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Acked-by: Kiryl Shutsemau <kas@kernel.org> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Ackerley Tng <ackerleytng@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20260430015001.24242-1-yan.y.zhao@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
|
| 4c7a1247 | 30-Apr-2026 |
Sean Christopherson <seanjc@google.com> |
x86/tdx: Use PFN directly for unmapping guest private memory
Remove struct page assumptions/constraints in APIs for unmapping guest private memory and have them take physical address directly.
Having core TDX make assumptions that guest private memory must be backed by struct page (and/or folio) will create subtle dependencies on how KVM/guest_memfd allocates/manages memory (e.g., whether it uses memory allocated from core MM, if the memory is refcounted, or if the folio is split) that are easily avoided [1].
KVM's MMUs work with PFNs. This is very much an intentional design choice. It ensures that the KVM MMUs remain flexible and are not too tightly tied to the regular CPU MMUs and the kernel code around them. Using "struct page" for TDX guest memory is not a good fit anywhere near the KVM MMU code [2].
Therefore, for unmapping guest private memory: export tdx_quirk_reset_paddr() for direct KVM invocation, and convert the SEAMCALL wrapper API tdh_phymem_page_wbinvd_hkid() to take PFN as input (thus updating mk_keyed_paddr() and tdh_phymem_page_wbinvd_tdr()).
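As a rough sketch of what a PFN-based mk_keyed_paddr() might look like, the user-space function below builds a keyed physical address from a PFN instead of a struct page. It assumes 4 KiB pages and that the KeyID sits in the bits just above the physical address width; the names example_mk_keyed_paddr() and EXAMPLE_PAGE_SHIFT and the phys_bits parameter are hypothetical, and the real helper may differ.

```c
#include <stdint.h>

/* Assumed 4 KiB page size, as on x86. */
#define EXAMPLE_PAGE_SHIFT 12

/*
 * Combine a host KeyID with the physical address of a page frame.
 * Assumption: the KeyID occupies the bits starting at phys_bits.
 */
static inline uint64_t example_mk_keyed_paddr(uint16_t hkid, uint64_t pfn,
					      unsigned int phys_bits)
{
	uint64_t paddr = pfn << EXAMPLE_PAGE_SHIFT;

	paddr |= (uint64_t)hkid << phys_bits;

	return paddr;
}
```

Taking a PFN means the caller needs no struct page for the memory, which is the point of the conversion described above.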
Intentionally have KVM pass PAGE_SIZE (rather than KVM_HPAGE_SIZE(level)) to tdx_quirk_reset_paddr() in tdx_sept_remove_private_spte() to avoid mixing in huge page changes. The KVM_BUG_ON() check for !PG_LEVEL_4K in tdx_sept_remove_private_spte() justifies using PAGE_SIZE.
Do not convert tdx_reclaim_page() to use PFN as input since it currently does not remove guest private memory.
Use "kvm_pfn_t pfn" for type safety. Using this KVM type is appropriate since APIs tdh_phymem_page_wbinvd_hkid() and tdx_quirk_reset_paddr() are exported to KVM only.
[ Yan: use kvm_pfn_t, exclude tdx_reclaim_page(), use tdx_quirk_reset_paddr() ]
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/all/aWgyhmTJphGQqO0Y@google.com [1] Link: https://lore.kernel.org/all/ac7V0g2q2hN3dU5u@google.com [2] Acked-by: Kiryl Shutsemau <kas@kernel.org> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Ackerley Tng <ackerleytng@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20260430014948.24226-1-yan.y.zhao@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
|