kernel_optimize_test/Documentation
Sean Christopherson 771dfb3c53 KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish
[ Upstream commit b318e8decf6b9ef1bcf4ca06fae6d6a2cb5d5c5c ]

Fix a plethora of issues with MSR filtering by installing the resulting
filter as an atomic bundle instead of updating the live filter one range
at a time.  The KVM_X86_SET_MSR_FILTER ioctl() isn't truly atomic, as
the hardware MSR bitmaps won't be updated until the next VM-Enter, but
the relevant software struct is atomically updated, which is what KVM
really needs.

Similar to the approach used for modifying memslots, make arch.msr_filter
a SRCU-protected pointer, do all the work configuring the new filter
outside of kvm->lock, and then acquire kvm->lock only when the new filter
has been vetted and created.  That way vCPU readers either see the old
filter or the new filter in their entirety, not some half-baked state.

Yuan Yao pointed out a use-after-free in ksm_msr_allowed() due to a
TOCTOU bug, but that's just the tip of the iceberg...

  - Nothing is __rcu annotated, making it nigh impossible to audit the
    code for correctness.
  - kvm_add_msr_filter() has an unpaired smp_wmb().  Violation of kernel
    coding style aside, the lack of a smb_rmb() anywhere casts all code
    into doubt.
  - kvm_clear_msr_filter() has a double free TOCTOU bug, as it grabs
    count before taking the lock.
  - kvm_clear_msr_filter() also has memory leak due to the same TOCTOU bug.

The entire approach of updating the live filter is also flawed.  While
installing a new filter is inherently racy if vCPUs are running, fixing
the above issues also makes it trivial to ensure certain behavior is
deterministic, e.g. KVM can provide deterministic behavior for MSRs with
identical settings in the old and new filters.  An atomic update of the
filter also prevents KVM from getting into a half-baked state, e.g. if
installing a filter fails, the existing approach would leave the filter
in a half-baked state, having already committed whatever bits of the
filter were already processed.

[*] https://lkml.kernel.org/r/20210312083157.25403-1-yaoyuan0329os@gmail.com

Fixes: 1a155254ff ("KVM: x86: Introduce MSR filtering")
Cc: stable@vger.kernel.org
Cc: Alexander Graf <graf@amazon.com>
Reported-by: Yuan Yao <yaoyuan0329os@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210316184436.2544875-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30 14:31:53 +02:00
..
ABI drivers/base/memory: don't store phys_device in memory blocks 2021-03-17 17:06:25 +01:00
accounting
admin-guide drivers/base/memory: don't store phys_device in memory blocks 2021-03-17 17:06:25 +01:00
arm documentation: arm: sunxi: add Allwinner H6 documents 2020-10-28 11:19:24 -06:00
arm64 arm64: Add workaround for Arm Cortex-A77 erratum 1508412 2020-10-29 12:56:01 +00:00
block block-5.10-2020-10-24 2020-10-24 12:46:42 -07:00
bpf bpf: Migrate from patchwork.ozlabs.org to patchwork.kernel.org. 2020-10-11 22:05:47 +02:00
cdrom
core-api dma-mapping: document dma_{alloc,free}_pages 2020-10-23 12:07:46 +02:00
cpu-freq
crypto
dev-tools linux-kselftest-kunit-fixes-5.10-rc5 2020-11-18 11:57:55 -08:00
devicetree dt-bindings: net: btusb: DT fix s/interrupt-name/interrupt-names/ 2021-03-07 12:34:08 +01:00
doc-guide docs: kerneldoc.py: add support for kerneldoc -nosymbol 2020-10-15 07:49:38 +02:00
driver-api media: vidtv.rst: add kernel-doc markups 2020-11-26 08:05:24 +01:00
fault-injection A handful of late-arriving documentation fixes. 2020-10-23 17:13:53 -07:00
fb drm fixes (round two) for 5.10-rc1 2020-10-23 13:56:34 -07:00
features s390 updates for the 5.10 merge window 2020-10-16 12:36:38 -07:00
filesystems seq_file: document how per-entry resources are managed. 2021-03-04 11:38:37 +01:00
firmware_class
firmware-guide Documentation: ACPI: fix spelling mistakes 2020-11-10 18:48:56 +01:00
fpga
gpu drm: Use USB controller's DMA mask when importing dmabufs 2021-03-17 17:06:19 +01:00
hid
hwmon docs: hwmon: mp2975.rst: address some html build warnings 2020-10-28 11:26:10 -06:00
i2c Documentation: i2c: add testunit docs to index 2020-10-05 22:57:45 +02:00
ia64
ide
iio
infiniband
input
isdn
kbuild kbuild: remove unused OBJSIZE 2020-11-02 11:31:00 +09:00
kernel-hacking
leds docs: leds: index.rst: add a missing file 2020-11-02 13:45:37 +01:00
litmus-tests
livepatch
locking Documentation: seqlock: s/LOCKTYPE/LOCKNAME/g 2020-12-30 11:54:11 +01:00
m68k
maintainer
mhi
mips dt: Remove booting-without-of.rst 2020-10-13 13:33:16 -05:00
misc-devices Documentation: remove mic/index from misc-devices/index.rst 2020-11-04 11:38:32 +01:00
netlabel
networking docs: networking: drop special stable handling 2021-03-17 17:06:13 +01:00
nios2
nvdimm
openrisc
parisc
PCI Documentation: better locations for sysfs-pci, sysfs-tagging 2020-10-09 09:33:23 -06:00
pcmcia
power
powerpc docs updates for v5.10-rc1 2020-10-16 15:02:21 -07:00
process docs: networking: drop special stable handling 2021-03-17 17:06:13 +01:00
RCU Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2020-10-09 08:21:56 +02:00
riscv
s390
scheduler
scsi scsi: libsas: Introduce a _gfp() variant of event notifiers 2021-03-25 09:04:11 +01:00
security watch_queue: Drop references to /dev/watch_queue 2021-03-04 11:37:59 +01:00
sh dt: Remove booting-without-of.rst 2020-10-13 13:33:16 -05:00
sound ALSA: doc: Fix reference to mixart.rst 2021-01-19 18:27:17 +01:00
sparc
sphinx A small number of fixes, plus a build tweak to respect the desire for 2020-11-03 09:57:30 -08:00
sphinx-static
spi
staging
target
timers
trace docs updates for v5.10-rc1 2020-10-16 15:02:21 -07:00
translations net: switch to the kernel.org patchwork instance 2020-11-11 17:12:00 -08:00
usb
userspace-api docs: userspace-api: add iommu.rst to the index file 2020-10-28 11:26:10 -06:00
virt KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish 2021-03-30 14:31:53 +02:00
vm A handful of late-arriving documentation fixes. 2020-10-23 17:13:53 -07:00
w1 docs: w1: w1_therm: Fix broken xref, mistakes, clarify text 2020-10-08 09:47:15 +02:00
watchdog
x86 x86/CPU/AMD: Save AMD NodeId as cpu_die_id 2020-12-30 11:54:29 +01:00
xtensa xtensa: fix TLBTEMP area placement 2020-11-16 02:13:15 -08:00
.gitignore
asm-annotations.rst x86/entry: Emit a symbol for register restoring thunk 2021-02-03 23:28:40 +01:00
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py This pull contains a series of warning fixes from Mauro; once applied, the 2020-11-03 13:14:14 -08:00
COPYING-logo
docutils.conf
dontdiff
index.rst
Kconfig docs: Kconfig/Makefile: add a check for broken ABI files 2020-10-30 13:08:07 +01:00
logo.gif
Makefile A small number of fixes, plus a build tweak to respect the desire for 2020-11-03 09:57:30 -08:00
memory-barriers.txt
SubmittingPatches
watch_queue.rst