Commit b0c39fd

Revert "KVM: x86/mmu: Introduce a quirk to control memslot zap behavior"

Remove KVM_X86_QUIRK_SLOT_ZAP_ALL, as the code is broken for shadow MMUs,
and the underlying premise is dodgy.

As was tried in commit 4e10313 ("KVM: x86/mmu: Zap only the relevant
pages when removing a memslot"), all shadow pages, i.e. non-leaf SPTEs,
need to be zapped. All of the accounting for a shadow page is tied to the
memslot, i.e. the shadow page holds a reference to the memslot, for all
intents and purposes. Deleting the memslot without removing all relevant
shadow pages, as is done when KVM_X86_QUIRK_SLOT_ZAP_ALL is disabled,
results in NULL pointer derefs when tearing down the VM.

  BUG: kernel NULL pointer dereference, address: 00000000000000b0
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 6085f43067 P4D 608c080067 PUD 608c081067 PMD 0
  Oops: Oops: 0000 [#1] SMP NOPTI
  CPU: 79 UID: 0 PID: 187063 Comm: set_memory_regi Tainted: G W 6.11.0-smp--24867312d167-cpl #395
  Tainted: [W]=WARN
  Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
  RIP: 0010:__kvm_mmu_prepare_zap_page+0x3a9/0x7b0 [kvm]
  Code: <48> 8b 8e b0 00 00 00 48 8b 96 e0 00 00 00 48 c1 e9 09 48 29 c8 8b
  RSP: 0018:ff314a25b19f7c28 EFLAGS: 00010212
  Call Trace:
   <TASK>
   kvm_arch_flush_shadow_all+0x7a/0xf0 [kvm]
   kvm_mmu_notifier_release+0x6c/0xb0 [kvm]
   mmu_notifier_unregister+0x85/0x140
   kvm_put_kvm+0x263/0x410 [kvm]
   kvm_vm_release+0x21/0x30 [kvm]
   __fput+0x8d/0x2c0
   __se_sys_close+0x71/0xc0
   do_syscall_64+0x83/0x160
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

Rather than trying to get things functional for shadow MMUs (which
includes nested TDP), scrap the quirk idea, at least for now. In addition
to the function bug, it's not clear that unconditionally doing a targeted
zap for all non-default VM types is actually desirable. E.g. it's entirely
possible that SEV-ES and SNP VMs would exhibit worse performance than KVM's
current "zap all" behavior, or that it's better to do a targeted zap only
in specific situations, etc.

This reverts commit aa8d1f4.

Cc: Kai Huang <[email protected]>
Cc: Rick Edgecombe <[email protected]>
Cc: Yan Zhao <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>

1 parent 54415c6 commit b0c39fd
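
The failure mode described in the commit message can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the structures are not KVM's real ones, and the "reference" from a shadow page to its memslot is modelled as a lookup by guest frame number that returns NULL once the slot is gone. Instead of dereferencing that NULL pointer (as the oops above does), the model merely counts the shadow pages that would trip over it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins; NOT KVM's real structures. */
struct toy_memslot {
	uint64_t base_gfn;
	uint64_t npages;
	bool valid;		/* cleared when the slot is deleted */
};

struct toy_shadow_page {
	uint64_t gfn;		/* guest frame the page table shadows */
	bool present;		/* cleared when the shadow page is zapped */
};

#define TOY_NR_SLOTS 2
#define TOY_NR_PAGES 3

struct toy_vm {
	struct toy_memslot slots[TOY_NR_SLOTS];
	struct toy_shadow_page pages[TOY_NR_PAGES];
};

static bool toy_slot_contains(const struct toy_memslot *s, uint64_t gfn)
{
	return gfn >= s->base_gfn && gfn < s->base_gfn + s->npages;
}

/* Return the live slot covering gfn, or NULL if none does. */
static const struct toy_memslot *toy_gfn_to_slot(const struct toy_vm *vm,
						 uint64_t gfn)
{
	for (size_t i = 0; i < TOY_NR_SLOTS; i++)
		if (vm->slots[i].valid && toy_slot_contains(&vm->slots[i], gfn))
			return &vm->slots[i];
	return NULL;
}

/*
 * Delete slot idx. With zap_shadow_pages == false only leaf mappings would
 * be dropped (not modelled here), so shadow pages in the range survive.
 */
static void toy_delete_slot(struct toy_vm *vm, size_t idx, bool zap_shadow_pages)
{
	assert(idx < TOY_NR_SLOTS);
	if (zap_shadow_pages)
		for (size_t i = 0; i < TOY_NR_PAGES; i++)
			if (toy_slot_contains(&vm->slots[idx], vm->pages[i].gfn))
				vm->pages[i].present = false;
	vm->slots[idx].valid = false;
}

/* Count surviving shadow pages whose slot lookup would now yield NULL. */
static size_t toy_count_orphans(const struct toy_vm *vm)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		if (vm->pages[i].present && !toy_gfn_to_slot(vm, vm->pages[i].gfn))
			n++;
	return n;
}

/* Two slots; two shadow pages in slot 0, one in slot 1. */
static struct toy_vm toy_vm_example(void)
{
	struct toy_vm vm = {
		.slots = { { 0x100, 0x10, true }, { 0x200, 0x10, true } },
		.pages = { { 0x100, true }, { 0x108, true }, { 0x200, true } },
	};
	return vm;
}
```

Calling `toy_delete_slot(&vm, 0, false)` on the example VM leaves both slot-0 shadow pages orphaned, which is the state a later teardown would stumble over; passing `true` leaves none.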

4 files changed, +2 -44 lines changed

Documentation/virt/kvm/api.rst

Lines changed: 0 additions & 8 deletions
@@ -8097,14 +8097,6 @@ KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS By default, KVM emulates MONITOR/MWAIT (if
                                     guest CPUID on writes to MISC_ENABLE if
                                     KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is
                                     disabled.
-
-KVM_X86_QUIRK_SLOT_ZAP_ALL          By default, KVM invalidates all SPTEs in
-                                    fast way for memslot deletion when VM type
-                                    is KVM_X86_DEFAULT_VM.
-                                    When this quirk is disabled or when VM type
-                                    is other than KVM_X86_DEFAULT_VM, KVM zaps
-                                    only leaf SPTEs that are within the range of
-                                    the memslot being deleted.
 =================================== ============================================

 7.32 KVM_CAP_MAX_VCPU_ID

arch/x86/include/asm/kvm_host.h

Lines changed: 1 addition & 2 deletions
@@ -2358,8 +2358,7 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
 	 KVM_X86_QUIRK_OUT_7E_INC_RIP | \
 	 KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT | \
 	 KVM_X86_QUIRK_FIX_HYPERCALL_INSN | \
-	 KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS | \
-	 KVM_X86_QUIRK_SLOT_ZAP_ALL)
+	 KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS)

 /*
  * KVM previously used a u32 field in kvm_run to indicate the hypercall was

arch/x86/include/uapi/asm/kvm.h

Lines changed: 0 additions & 1 deletion
@@ -439,7 +439,6 @@ struct kvm_sync_regs {
 #define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
 #define KVM_X86_QUIRK_FIX_HYPERCALL_INSN (1 << 5)
 #define KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS (1 << 6)
-#define KVM_X86_QUIRK_SLOT_ZAP_ALL (1 << 7)

 #define KVM_STATE_NESTED_FORMAT_VMX 0
 #define KVM_STATE_NESTED_FORMAT_SVM 1
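
To make the effect of dropping bit 7 concrete, here is a minimal user-space sketch of how a quirk bitmask of this shape behaves. It rests on two assumptions that the diff itself does not show: that a quirk counts as active unless its bit is set in a per-VM "disabled" mask, and that a request to disable bits outside the valid mask is rejected. The `toy_*` and `TOY_*` names are hypothetical, and the mask below contains only the three bits visible in this hunk, not the full set defined in the kernel header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the uapi hunk. */
#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT	(1 << 4)
#define KVM_X86_QUIRK_FIX_HYPERCALL_INSN	(1 << 5)
#define KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS	(1 << 6)

/* The bit removed by this commit, under a hypothetical name for comparison. */
#define TOY_REMOVED_QUIRK_SLOT_ZAP_ALL		(1 << 7)

/* Partial valid mask after the revert: only the bits shown in the hunk. */
#define TOY_VALID_AFTER_REVERT			\
	((uint64_t)(KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT |	\
		    KVM_X86_QUIRK_FIX_HYPERCALL_INSN |		\
		    KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS))

/* Assumed semantics: a quirk is active unless its bit has been disabled. */
static bool toy_has_quirk(uint64_t disabled_quirks, uint64_t quirk)
{
	return !(disabled_quirks & quirk);
}

/*
 * Assumed semantics: reject (return -1) any request naming a bit outside
 * the valid mask, leaving *disabled untouched; otherwise store the request.
 */
static int toy_disable_quirks(uint64_t valid_mask, uint64_t *disabled,
			      uint64_t request)
{
	if (request & ~valid_mask)
		return -1;
	*disabled = request;
	return 0;
}
```

Under these assumptions, a caller that still tries to disable bit 7 after the revert gets an error rather than a silently ignored request.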

arch/x86/kvm/mmu/mmu.c

Lines changed: 1 addition & 33 deletions
@@ -7049,42 +7049,10 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
 	kvm_mmu_zap_all(kvm);
 }

-/*
- * Zapping leaf SPTEs with memslot range when a memslot is moved/deleted.
- *
- * Zapping non-leaf SPTEs, a.k.a. not-last SPTEs, isn't required, worst
- * case scenario we'll have unused shadow pages lying around until they
- * are recycled due to age or when the VM is destroyed.
- */
-static void kvm_mmu_zap_memslot_leafs(struct kvm *kvm, struct kvm_memory_slot *slot)
-{
-	struct kvm_gfn_range range = {
-		.slot = slot,
-		.start = slot->base_gfn,
-		.end = slot->base_gfn + slot->npages,
-		.may_block = true,
-	};
-
-	write_lock(&kvm->mmu_lock);
-	if (kvm_unmap_gfn_range(kvm, &range))
-		kvm_flush_remote_tlbs_memslot(kvm, slot);
-
-	write_unlock(&kvm->mmu_lock);
-}
-
-static inline bool kvm_memslot_flush_zap_all(struct kvm *kvm)
-{
-	return kvm->arch.vm_type == KVM_X86_DEFAULT_VM &&
-	       kvm_check_has_quirk(kvm, KVM_X86_QUIRK_SLOT_ZAP_ALL);
-}
-
 void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 				   struct kvm_memory_slot *slot)
 {
-	if (kvm_memslot_flush_zap_all(kvm))
-		kvm_mmu_zap_all_fast(kvm);
-	else
-		kvm_mmu_zap_memslot_leafs(kvm, slot);
+	kvm_mmu_zap_all_fast(kvm);
 }

 void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
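
The behavioral change in `kvm_arch_flush_shadow_memslot()` can be summarized as a small decision table. The sketch below is a user-space model, not kernel code: the `toy_*` and `TOY_*` names are hypothetical, the VM type constants are placeholders, and it assumes the quirk check returns true when the quirk bit has not been disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two zap strategies named in the diff. */
enum toy_zap {
	TOY_ZAP_ALL_FAST,	/* models kvm_mmu_zap_all_fast() */
	TOY_ZAP_MEMSLOT_LEAFS,	/* models the removed kvm_mmu_zap_memslot_leafs() */
};

/* Placeholder VM types; only "default" versus "anything else" matters here. */
#define TOY_DEFAULT_VM		0u
#define TOY_OTHER_VM		1u

/* Bit position taken from the removed uapi define. */
#define TOY_QUIRK_SLOT_ZAP_ALL	(1u << 7)

/* Pre-revert: zap everything only for default VMs with the quirk active. */
static enum toy_zap toy_flush_before_revert(unsigned int vm_type,
					    uint64_t disabled_quirks)
{
	bool quirk_active = !(disabled_quirks & TOY_QUIRK_SLOT_ZAP_ALL);

	if (vm_type == TOY_DEFAULT_VM && quirk_active)
		return TOY_ZAP_ALL_FAST;
	return TOY_ZAP_MEMSLOT_LEAFS;
}

/* Post-revert: the inputs no longer matter; every deletion zaps everything. */
static enum toy_zap toy_flush_after_revert(unsigned int vm_type,
					   uint64_t disabled_quirks)
{
	(void)vm_type;
	(void)disabled_quirks;
	return TOY_ZAP_ALL_FAST;
}
```

In this model, the only inputs that reached the leaf-only path before the revert were a non-default VM type or a disabled quirk, and both now collapse to the fast zap-all path.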
