History log of /linux/arch/x86/kvm/x86.c (Results 1 – 25 of 6762)
Revision | Date | Author | Comments
# 0c436dfe 02-Oct-2024 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v6.12-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.12

A bunch of fixes here that came in during the merge window and the first
week of release, plus some new quirks and device IDs. There's nothing
major here; it's a bit bigger than it might have been because no fixes
were sent during the merge window due to your vacation.


# 2cd86f02 01-Oct-2024 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes

Required for a panthor fix that broke when
FOP_UNSIGNED_OFFSET was added in place of FMODE_UNSIGNED_OFFSET.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


Revision tags: v6.12-rc1
# 3efc5736 28-Sep-2024 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull x86 kvm updates from Paolo Bonzini:
"x86:

- KVM currently invalidates the entirety of the page tables, not just
those for the memslot being touched, when a memslot is moved or
deleted.

This does not traditionally have particularly noticeable overhead,
but Intel's TDX will require the guest to re-accept private pages
if they are dropped from the secure EPT, which is a non-starter.

Actually, the only reason why this is not already being done is a
bug which was never fully investigated and caused VM instability
with assigned GeForce GPUs, so allow userspace to opt into the new
behavior.

- Advertise AVX10.1 to userspace (effectively prep work for the
"real" AVX10 functionality that is on the horizon)

- Rework common MSR handling code to suppress errors on userspace
accesses to unsupported-but-advertised MSRs

This will allow removing (almost?) all of KVM's exemptions for
userspace access to MSRs that shouldn't exist based on the vCPU
model (the actual cleanup is non-trivial future work)

- Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC)
splits the 64-bit value into the legacy ICR and ICR2 storage,
whereas Intel (APICv) stores the entire 64-bit value at the ICR
offset

- Fix a bug where KVM would fail to exit to userspace if one was
triggered by a fastpath exit handler

- Add fastpath handling of HLT VM-Exit to expedite re-entering the
guest when there's already a pending wake event at the time of the
exit

- Fix a WARN caused by RSM entering a nested guest from SMM with
invalid guest state, by forcing the vCPU out of guest mode prior to
signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the
nested guest)

- Overhaul the "unprotect and retry" logic to more precisely identify
cases where retrying is actually helpful, and to harden all retry
paths against putting the guest into an infinite retry loop

- Add support for yielding, e.g. to honor NEED_RESCHED, when zapping
rmaps in the shadow MMU

- Refactor pieces of the shadow MMU related to aging SPTEs in
preparation for adding multi-generation LRU support in KVM

- Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is
enabled, i.e. when the CPU has already flushed the RSB

- Trace the per-CPU host save area as a VMCB pointer to improve
readability and cleanup the retrieval of the SEV-ES host save area

- Remove unnecessary accounting of temporary nested VMCB related
allocations

- Set FINAL/PAGE in the page fault error code for EPT violations if
and only if the GVA is valid. If the GVA is NOT valid, there is no
guest-side page table walk and so stuffing paging related metadata
is nonsensical

- Fix a bug where KVM would incorrectly synthesize a nested VM-Exit
instead of emulating posted interrupt delivery to L2

- Add a lockdep assertion to detect unsafe accesses of vmcs12
structures

- Harden eVMCS loading against an impossible NULL pointer deref
(really truly should be impossible)

- Minor SGX fix and a cleanup

- Misc cleanups

Generic:

- Register KVM's cpuhp and syscore callbacks when enabling
virtualization in hardware, as the sole purpose of said callbacks
is to disable and re-enable virtualization as needed

- Enable virtualization when KVM is loaded, not right before the
first VM is created

Together with the previous change, this greatly simplifies the logic
of the callbacks, because their very existence implies
virtualization is enabled

- Fix a bug that results in KVM prematurely exiting to userspace for
coalesced MMIO/PIO in many cases, clean up the related code, and
add a testcase

- Fix a bug in kvm_clear_guest() where it would trigger a buffer
overflow _if_ the gpa+len crosses a page boundary, which thankfully
is guaranteed to not happen in the current code base. Add WARNs in
more helpers that read/write guest memory to detect similar bugs

Selftests:

- Fix a goof that caused some Hyper-V tests to be skipped when run on
bare metal, i.e. NOT in a VM

- Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES
guest

- Explicitly include one-off assets in .gitignore. Past Sean was
completely wrong about not being able to detect missing .gitignore
entries

- Verify userspace single-stepping works when KVM happens to handle a
VM-Exit in its fastpath

- Misc cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
Documentation: KVM: fix warning in "make htmldocs"
s390: Enable KVM_S390_UCONTROL config in debug_defconfig
selftests: kvm: s390: Add VM run test case
KVM: SVM: let alternatives handle the cases when RSB filling is required
KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid
KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent
KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals
KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()
KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper
KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed
KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot
KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps()
KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range()
KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn
KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list
KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version
KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure()
KVM: x86: Update retry protection fields when forcing retry on emulation failure
KVM: x86: Apply retry protection to "unprotect on failure" path
KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn
...


Revision tags: v6.11
# 3f8df628 14-Sep-2024 Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvm-x86-vmx-6.12' of https://github.com/kvm-x86/linux into HEAD

KVM VMX changes for 6.12:

- Set FINAL/PAGE in the page fault error code for EPT Violations if and only
if the GVA is valid. If the GVA is NOT valid, there is no guest-side page
table walk and so stuffing paging related metadata is nonsensical.

- Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of
emulating posted interrupt delivery to L2.

- Add a lockdep assertion to detect unsafe accesses of vmcs12 structures.

- Harden eVMCS loading against an impossible NULL pointer deref (really truly
should be impossible).

- Minor SGX fix and a cleanup.


# 43d97b2e 14-Sep-2024 Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvm-x86-pat_vmx_msrs-6.12' of https://github.com/kvm-x86/linux into HEAD

KVM VMX and x86 PAT MSR macro cleanup for 6.12:

- Add common defines for the x86 architectural memory types, i.e. the types
that are shared across PAT, MTRRs, VMCSes, and EPTPs (see the sketch after
this list).

- Clean up the various VMX MSR macros to make the code self-documenting
(inasmuch as possible), and to make it less painful to add new macros.
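
For context, a sketch of what such shared defines can look like. The
encodings themselves are architecturally fixed (they are the PAT/MTRR/EPT
memory-type values from the SDM); the macro names below are illustrative
and may not match the series exactly:

    #define X86_MEMTYPE_UC       0  /* Uncacheable */
    #define X86_MEMTYPE_WC       1  /* Write-Combining */
    #define X86_MEMTYPE_WT       4  /* Write-Through */
    #define X86_MEMTYPE_WP       5  /* Write-Protected */
    #define X86_MEMTYPE_WB       6  /* Write-Back */
    #define X86_MEMTYPE_UC_MINUS 7  /* UC-; valid for PAT, not MTRRs */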


# 5d55a052 14-Sep-2024 Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvm-x86-mmu-6.12' of https://github.com/kvm-x86/linux into HEAD

KVM x86 MMU changes for 6.12:

- Overhaul the "unprotect and retry" logic to more precisely identify cases
where retrying is actually helpful, and to harden all retry paths against
putting the guest into an infinite retry loop.

- Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in
the shadow MMU.

- Refactor pieces of the shadow MMU related to aging SPTEs in preparation for
adding MGLRU support in KVM.

- Misc cleanups


# 41786cc5 14-Sep-2024 Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvm-x86-misc-6.12' of https://github.com/kvm-x86/linux into HEAD

KVM x86 misc changes for 6.12

- Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10
functionality that is on the horizon).

- Rework common MSR handling code to suppress errors on userspace accesses to
unsupported-but-advertised MSRs. This will allow removing (almost?) all of
KVM's exemptions for userspace access to MSRs that shouldn't exist based on
the vCPU model (the actual cleanup is non-trivial future work).

- Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the
64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv)
stores the entire 64-bit value at the ICR offset (see the sketch after
this list).

- Fix a bug where KVM would fail to exit to userspace if one was triggered by
a fastpath exit handler.

- Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when
there's already a pending wake event at the time of the exit.

- Finally fix the RSM vs. nested VM-Enter WARN by forcing the vCPU out of
guest mode prior to signalling SHUTDOWN (architecturally, the SHUTDOWN is
supposed to hit L1, not L2).
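
To make the ICR difference concrete, here is an illustrative user-space C
sketch (not KVM source; the helper names are hypothetical, the register
offsets follow the APIC spec):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define APIC_ICR  0x300   /* low 32 bits of ICR on AMD x2AVIC  */
    #define APIC_ICR2 0x310   /* high 32 bits of ICR on AMD x2AVIC */

    static uint32_t regs[0x400 / 4];   /* fake APIC register page */

    /* AMD x2AVIC: reassemble the 64-bit ICR from two 32-bit halves. */
    static uint64_t icr_read_amd(void)
    {
        return ((uint64_t)regs[APIC_ICR2 / 4] << 32) | regs[APIC_ICR / 4];
    }

    /* Intel APICv: the full 64-bit value lives at the ICR offset. */
    static uint64_t icr_read_intel(void)
    {
        uint64_t val;
        memcpy(&val, &regs[APIC_ICR / 4], sizeof(val));
        return val;
    }

    int main(void)
    {
        uint64_t full = 0x00000003000008fdULL;

        /* AMD layout: the value is split across ICR and ICR2. */
        regs[APIC_ICR / 4]  = (uint32_t)full;
        regs[APIC_ICR2 / 4] = (uint32_t)(full >> 32);
        printf("AMD view:   %#llx\n", (unsigned long long)icr_read_amd());

        /* Intel layout: the same value is stored contiguously. */
        memcpy(&regs[APIC_ICR / 4], &full, sizeof(full));
        printf("Intel view: %#llx\n", (unsigned long long)icr_read_intel());
        return 0;
    }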


# c09dd2bb 12-Sep-2024 Paolo Bonzini <pbonzini@redhat.com>

Merge branch 'kvm-redo-enable-virt' into HEAD

Register KVM's cpuhp and syscore callbacks when enabling virtualization in
hardware, as the sole purpose of said callbacks is to disable and re-enable
virtualization as needed.

The primary motivation for this series is to simplify dealing with enabling
virtualization for Intel's TDX, which needs to enable virtualization
when kvm-intel.ko is loaded, i.e. long before the first VM is created.

That said, this is a nice cleanup on its own. By registering the callbacks
on-demand, the callbacks themselves don't need to check kvm_usage_count,
because their very existence implies a non-zero count.

Patch 1 (re)adds a dedicated lock for kvm_usage_count. This avoids a
lock ordering issue between cpus_read_lock() and kvm_lock. The lock
ordering issue still exists in very rare cases, and will be fixed for
good by switching vm_list to an (S)RCU-protected list.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
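
A minimal sketch of the on-demand registration idea, in kernel style.
This is simplified and partly hypothetical: hardware_enable_nolock() /
hardware_disable_nolock() stand in for the real per-CPU enable/disable
paths, and error handling is elided.

    static DEFINE_MUTEX(kvm_usage_lock);   /* dedicated lock, per patch 1 */
    static int kvm_usage_count;

    static int kvm_online_cpu(unsigned int cpu)
    {
        /* The callback is registered only while virtualization is
         * enabled, so no kvm_usage_count check is needed here. */
        return hardware_enable_nolock();
    }

    static int kvm_offline_cpu(unsigned int cpu)
    {
        hardware_disable_nolock();
        return 0;
    }

    static int kvm_enable_virtualization(void)
    {
        int r = 0;

        mutex_lock(&kvm_usage_lock);
        if (!kvm_usage_count++) {
            /* Registering the cpuhp state also invokes
             * kvm_online_cpu() on every currently-online CPU. */
            r = cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
                                  kvm_online_cpu, kvm_offline_cpu);
            if (r)
                kvm_usage_count--;
        }
        mutex_unlock(&kvm_usage_lock);
        return r;
    }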


Revision tags: v6.11-rc7, v6.11-rc6
# 2876624e 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure()

Rename reexecute_instruction() to kvm_unprotect_and_retry_on_failure() to
make the intent and purpose of the helper much more obvious.

No functional change intended.

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 4df68566 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Update retry protection fields when forcing retry on emulation failure

When retrying the faulting instruction after emulation failure, refresh
the infinite loop protection fields even if no shadow pages were zapped,
i.e. avoid hitting an infinite loop even when retrying the instruction as
a last-ditch effort to avoid terminating the guest.

Link: https://lore.kernel.org/r/20240831001538.336683-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# dabc4ff7 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Apply retry protection to "unprotect on failure" path

Use kvm_mmu_unprotect_gfn_and_retry() in reexecute_instruction() to pick
up protection against infinite loops, e.g. if KVM somehow manages to
encounter an unsupported instruction and unprotecting the gfn doesn't
allow the vCPU to make forward progress. Other than that, the retry-on-
failure logic is a functionally equivalent, open coded version of
kvm_mmu_unprotect_gfn_and_retry().

Note, the emulation failure path still isn't fully protected, as KVM
won't update the retry protection fields if no shadow pages are zapped
(but this change is still a step forward). That flaw will be addressed
in a future patch.

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 19ab2c8b 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn

Don't bother unprotecting the target gfn if EMULTYPE_WRITE_PF_TO_SP is
set, as KVM will simply report the emulation failure to userspace. This
will allow converting reexecute_instruction() to use
kvm_mmu_unprotect_gfn_and_retry() instead of kvm_mmu_unprotect_page().

Link: https://lore.kernel.org/r/20240831001538.336683-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 62052573 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Remove manual pfn lookup when retrying #PF after failed emulation

Drop the manual pfn lookup when retrying an instruction that KVM failed to
emulate in response to a #PF due to a write-protected gfn. Now that KVM
sets EMULTYPE_ALLOW_RETRY_PF if and only if the page fault hit a write-
protected gfn, i.e. if and only if there's a writable memslot, there's no
need to redo the lookup to avoid retrying an instruction that failed on
emulated MMIO (no slot, or a write to a read-only slot).

I.e. KVM will never attempt to retry an instruction that failed on
emulated MMIO, whereas that was not the case prior to the introduction of
RET_PF_WRITE_PROTECTED.

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 2df354e3 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Fold retry_instruction() into x86_emulate_instruction()

Now that retry_instruction() is reasonably tiny, fold it into its sole
caller, x86_emulate_instruction(). In addition to getting rid of the
absurdly confusing retry_instruction() name, handling the retry in
x86_emulate_instruction() pairs it back up with the code that resets
last_retry_{eip,address}.

No functional change intended.

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 41e6e367 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Move EMULTYPE_ALLOW_RETRY_PF to x86_emulate_instruction()

Move the sanity checks for EMULTYPE_ALLOW_RETRY_PF to the top of
x86_emulate_instruction(). In addition to deduplicating a small amount
of code, this makes the connection between EMULTYPE_ALLOW_RETRY_PF and
EMULTYPE_PF even more explicit, and will allow dropping retry_instruction()
entirely.

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 01dd4d31 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86/mmu: Apply retry protection to "fast nTDP unprotect" path

Move the anti-infinite-loop protection provided by last_retry_{eip,addr}
into kvm_mmu_write_protect_fault() so that it guards unprotect+retry that
never hits the emulator, as well as reexecute_instruction(), which is the
last ditch "might as well try it" logic that kicks in when emulation fails
on an instruction that faulted on a write-protected gfn.

Add a new helper, kvm_mmu_unprotect_gfn_and_retry(), to set the retry
fields and deduplicate other code (with more to come).

Link: https://lore.kernel.org/r/20240831001538.336683-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
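
A sketch of the new helper's shape (simplified, not verbatim KVM source;
the field and function names follow the commit message):

    static bool kvm_mmu_unprotect_gfn_and_retry(struct kvm_vcpu *vcpu,
                                                gpa_t cr2_or_gpa)
    {
        /* Record where the fault hit, so a repeat fault at the same
         * RIP+address is recognized as a potential infinite loop. */
        vcpu->arch.last_retry_eip  = kvm_rip_read(vcpu);
        vcpu->arch.last_retry_addr = cr2_or_gpa;

        /* Retrying helps only if unprotecting actually zapped a
         * write-protected shadow page. */
        return kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa));
    }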


# 9c19129e 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Store gpa as gpa_t, not unsigned long, when unprotecting for retry

Store the gpa used to unprotect the faulting gfn for retry as a gpa_t, not
an unsigned long. This fixes a bug where 32-bit KVM would unprotect and
retry the wrong gfn if the gpa had bits 63:32!=0. In practice, this bug
is functionally benign, as unprotecting the wrong gfn is purely a
performance issue (thanks to the anti-infinite-loop logic). And of course,
almost no one runs 32-bit KVM these days.

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
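
A standalone demo of the truncation (compile with -m32 to get a 32-bit
unsigned long; the field names are illustrative):

    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t gpa_t;

    int main(void)
    {
        gpa_t gpa = 0x100000000ULL | 0x5000;   /* bits 63:32 != 0 */
        unsigned long old_field = gpa;         /* buggy field type */
        gpa_t new_field = gpa;                 /* fixed field type */

        /* gfn = gpa >> PAGE_SHIFT; with a 32-bit unsigned long the
         * high bits are lost, so the wrong gfn (0x5 instead of
         * 0x100005) would be unprotected. */
        printf("gfn via unsigned long: %#lx\n", old_field >> 12);
        printf("gfn via gpa_t:         %#llx\n",
               (unsigned long long)(new_field >> 12));
        return 0;
    }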


# 019f3f84 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Get RIP from vCPU state when storing it to last_retry_eip

Read RIP from vCPU state instead of pulling it from the emulation context
when filling last_retry_eip, which is part of the anti-infinite-loop
protection used when unprotecting and retrying instructions that hit a
write-protected gfn.

This will allow reusing the anti-infinite-loop protection in flows that
never make it into the emulator.

No functional change intended, as ctxt->eip is set to kvm_rip_read() in
init_emulate_ctxt(), and EMULTYPE_PF emulation is mutually exclusive with
EMULTYPE_NO_DECODE and EMULTYPE_SKIP, i.e. always goes through
x86_decode_emulated_instruction() and hasn't advanced ctxt->eip (yet).

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# c1edcc41 31-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Retry to-be-emulated insn in "slow" unprotect path iff sp is zapped

Resume the guest and thus skip emulation of a non-PTE-writing instruction
if and only if unprotecting the gfn actually zapped at least one shadow
page. If the gfn is write-protected for some reason other than shadow
paging, attempting to unprotect the gfn will effectively fail, and thus
retrying the instruction is all but guaranteed to be pointless. This bug
has existed for a long time, but was effectively fudged around by the
retry RIP+address anti-loop detection.

Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 3f6821aa 06-Sep-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Forcibly leave nested if RSM to L2 hits shutdown

Leave nested mode before synthesizing shutdown (a.k.a. TRIPLE_FAULT) if
RSM fails when resuming L2 (a.k.a. guest mode). Architecturally, shutdown
on RSM occurs _before_ the transition back to guest mode on both Intel and
AMD.

On Intel, per the SDM pseudocode, SMRAM state is loaded before critical
VMX state:

    restore state normally from SMRAM;
    ...
    CR4.VMXE := value stored internally;
    IF internal storage indicates that the logical processor had been in
       VMX operation (root or non-root)
    THEN
        enter VMX operation (root or non-root);
        restore VMX-critical state as defined in Section 32.14.1;
        ...
        restore current VMCS pointer;
    FI;

AMD's APM is both less clearcut and more explicit. Because AMD CPUs save
VMCB and guest state in SMRAM itself, given the lack of anything in the
APM to indicate a shutdown in guest mode is possible, a straightforward
reading of the clause on invalid state is that _what_ state is invalid is
irrelevant, i.e. all roads lead to shutdown.

    An RSM causes a processor shutdown if an invalid-state condition is
    found in the SMRAM state-save area.

This fixes a bug found by syzkaller where synthesizing shutdown for L2
led to a nested VM-Exit (if L1 is intercepting shutdown), which in turn
caused KVM to complain about trying to cancel a nested VM-Enter (see
commit 759cbd59674a ("KVM: x86: nSVM/nVMX: set nested_run_pending on VM
entry which is a result of RSM").

Note, Paolo pointed out that KVM shouldn't set nested_run_pending until
after loading SMRAM state. But as above, that's only half the story, KVM
shouldn't transition to guest mode either. Unfortunately, fixing that
mess requires rewriting the nVMX and nSVM RSM flows to not piggyback
their nested VM-Enter flows, as executing the nested VM-Enter flows after
loading state from SMRAM would clobber much of said state.

For now, add a FIXME to call out that transitioning to guest mode before
loading state from SMRAM is wrong.

Link: https://lore.kernel.org/all/CABgObfYaUHXyRmsmg8UjRomnpQ0Jnaog9-L2gMjsjkqChjDYUQ@mail.gmail.com
Reported-by: syzbot+988d9efcdf137bc05f66@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/0000000000007a9acb06151e1670@google.com
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Closes: https://lore.kernel.org/all/CAMhUBjmXMYsEoVYw_M8hSZjBMHh24i88QYm-RY6HDta5YZ7Wgw@mail.gmail.com
Analyzed-by: Michal Wilczynski <michal.wilczynski@intel.com>
Cc: Kishen Maloor <kishen.maloor@intel.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20240906161337.1118412-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
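
The shape of the fix, sketched (the helper name is hypothetical; the real
change lives in the RSM emulation path):

    /* On RSM failure while resuming L2: leave guest mode *before*
     * synthesizing shutdown, so the TRIPLE_FAULT lands on L1 instead
     * of being morphed into a bogus nested VM-Exit. */
    static int rsm_fail_to_shutdown(struct kvm_vcpu *vcpu)
    {
        /* FIXME (per the commit): transitioning to guest mode before
         * all SMRAM state is loaded is itself wrong. */
        kvm_leave_nested(vcpu);
        kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
        return X86EMUL_UNHANDLEABLE;
    }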


# 590b09b1 30-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Register "emergency disable" callbacks when virt is enabled

Register the "disable virtualization in an emergency" callback just
before KVM enables virtualization in hardware, as there is no functional
need to keep the callbacks registered while KVM happens to be loaded, but
is inactive, i.e. if KVM hasn't enabled virtualization.

Note, unregistering the callback every time the last VM is destroyed could
have measurable latency due to the synchronize_rcu() needed to ensure all
references to the callback are dropped before KVM is unloaded. But the
latency should be a small fraction of the total latency of disabling
virtualization across all CPUs, and userspace can set enable_virt_at_load
to completely eliminate the runtime overhead.

Add a pointer in kvm_x86_ops to allow vendor code to provide its callback.
There is no reason to force vendor code to do the registration, and either
way KVM would need a new kvm_x86_ops hook.

Suggested-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240830043600.127750-11-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 0617a769 30-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Rename virtualization {en,dis}abling APIs to match common KVM

Rename x86's per-CPU vendor hooks used to enable virtualization in
hardware to align with the recently renamed arch hooks.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-ID: <20240830043600.127750-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 071f24ad 30-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: Rename arch hooks related to per-CPU virtualization enabling

Rename the per-CPU hooks used to enable virtualization in hardware to
align with the KVM-wide helpers in kvm_main.c, and to better capture that
the callbacks are invoked on every online CPU.

No functional change intended.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-ID: <20240830043600.127750-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2
# 1876dd69 02-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Add fastpath handling of HLT VM-Exits

Add a fastpath for HLT VM-Exits by immediately re-entering the guest if
it has a pending wake event. When virtual interrupt delivery is enabled,
i.e. when KVM doesn't need to manually inject interrupts, this allows KVM
to stay in the fastpath run loop when a vIRQ arrives between the guest
doing CLI and STI;HLT. Without AMD's Idle HLT-intercept support, the CPU
generates a HLT VM-Exit even though KVM will immediately resume the guest.

Note, on bare metal, it's relatively uncommon for a modern guest kernel to
actually trigger this scenario, as the window between the guest checking
for a wake event and committing to HLT is quite small. But in a nested
environment, the timings change significantly, e.g. rudimentary testing
showed that ~50% of HLT exits where HLT-polling was successful would be
serviced by this fastpath, i.e. ~50% of the time that a nested vCPU gets
a wake event before KVM schedules out the vCPU, the wake event was pending
even before the VM-Exit.

Link: https://lore.kernel.org/all/20240528041926.3989-3-manali.shukla@amd.com
Link: https://lore.kernel.org/r/20240802195120.325560-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
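
A sketch of the fastpath's logic (close in spirit to, but not verbatim,
the KVM change):

    static fastpath_t handle_fastpath_hlt(struct kvm_vcpu *vcpu)
    {
        /* No wake event pending: take the slow path and emulate HLT
         * (HLT-polling, blocking, etc.). */
        if (!kvm_vcpu_has_events(vcpu))
            return EXIT_FASTPATH_NONE;

        /* A wake event is already pending: skip past the HLT and
         * immediately re-enter the guest from the inner run loop.
         * If skipping fails (e.g. userspace is single-stepping),
         * bail out to userspace instead. */
        if (!kvm_skip_emulated_instruction(vcpu))
            return EXIT_FASTPATH_EXIT_USERSPACE;

        return EXIT_FASTPATH_REENTER_GUEST;
    }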


# 70cdd238 02-Aug-2024 Sean Christopherson <seanjc@google.com>

KVM: x86: Reorganize code in x86.c to co-locate vCPU blocking/running helpers

Shuffle code around in x86.c so that the various helpers related to vCPU
blocking/running logic are (a) located near each other and (b) ordered so
that HLT emulation can use kvm_vcpu_has_events() in a future path.

No functional change intended.

Link: https://lore.kernel.org/r/20240802195120.325560-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
