Paolo Bonzini [Wed, 22 Feb 2023 01:00:44 +0000 (20:00 -0500)]
Merge tag 'kvm-x86-apic-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM x86 APIC changes for 6.3:
- Remove a superfluous variables from apic_get_tmcct()
- Fix various edge cases in x2APIC MSR emulation
- Mark APIC timer as expired if its in one-shot mode and the count
underflows while the vCPU task was being migrated
- Reset xAPIC when userspace forces "impossible" x2APIC => xAPIC transition
Paolo Bonzini [Mon, 20 Feb 2023 11:12:42 +0000 (06:12 -0500)]
Merge tag 'kvmarm-6.3' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.3
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogenous systems. Non secure
software has no practical need to traverse the caches by set/way in
the first place.
- Add support for taking stage-2 access faults in parallel. This was an
accidental omission in the original parallel faults implementation,
but should provide a marginal improvement to machines w/o FEAT_HAFDBS
(such as hardware from the fruit company).
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception handling
and masking unsupported features for nested guests.
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM.
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at reducing
the trap overhead of running nested.
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems.
- Avoid VM-wide stop-the-world operations when a vCPU accesses its own
redistributor.
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions
in the host.
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add [Oliver]
as co-maintainer
This also drags in arm64's 'for-next/sme2' branch, because both it and
the PSCI relay changes touch the EL2 initialization code.
Marc Zyngier [Sat, 18 Feb 2023 14:06:10 +0000 (14:06 +0000)]
Merge tag ' https://github.com/oupton/linux tags/kvmarm-6.3' from into kvmarm-master/next
Merge Oliver's kvmarm-6.3 tag:
KVM/arm64 updates for 6.3
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogenous systems. Non secure
software has no practical need to traverse the caches by set/way in
the first place.
- Add support for taking stage-2 access faults in parallel. This was an
accidental omission in the original parallel faults implementation,
but should provide a marginal improvement to machines w/o FEAT_HAFDBS
(such as hardware from the fruit company).
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception handling
and masking unsupported features for nested guests.
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM.
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at reducing
the trap overhead of running nested.
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems.
- Avoid VM-wide stop-the-world operations when a vCPU accesses its own
redistributor.
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions
in the host.
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add myself as
co-maintainer
This also drags in a couple of branches to avoid conflicts:
- The shared 'kvm-hw-enable-refactor' branch that reworks
initialization, as it conflicted with the virtual cache topology
changes.
- arm64's 'for-next/sme2' branch, as the PSCI relay changes, as both
touched the EL2 initialization code.
Signed-off-by: Marc Zyngier <maz@kernel.org>
David Matlack [Mon, 13 Feb 2023 21:28:44 +0000 (13:28 -0800)]
KVM: x86/mmu: Make tdp_mmu_allowed static
Make tdp_mmu_allowed static since it is only ever used within
arch/x86/kvm/mmu/mmu.c.
Link: https://lore.kernel.org/kvm/202302072055.odjDVd5V-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20230213212844.3062733-1-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 15 Feb 2023 17:35:26 +0000 (12:35 -0500)]
Merge tag 'kvm-s390-next-6.3-1' of https://git./linux/kernel/git/kvms390/linux into HEAD
* Two more V!=R patches
* The last part of the cmpxchg patches
* A few fixes
Paolo Bonzini [Wed, 15 Feb 2023 17:33:28 +0000 (12:33 -0500)]
Merge tag 'kvm-riscv-6.3-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.3
- Fix wrong usage of PGDIR_SIZE to check page sizes
- Fix privilege mode setting in kvm_riscv_vcpu_trap_redirect()
- Redirect illegal instruction traps to guest
- SBI PMU support for guest
Paolo Bonzini [Wed, 15 Feb 2023 17:23:19 +0000 (12:23 -0500)]
Merge tag 'kvm-x86-vmx-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM VMX changes for 6.3:
- Handle NMI VM-Exits before leaving the noinstr region
- A few trivial cleanups in the VM-Enter flows
- Stop enabling VMFUNC for L1 purely to document that KVM doesn't support
EPTP switching (or any other VM function) for L1
- Fix a crash when using eVMCS's enlighted MSR bitmaps
Paolo Bonzini [Wed, 15 Feb 2023 17:23:06 +0000 (12:23 -0500)]
Merge tag 'kvm-x86-svm-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM SVM changes for 6.3:
- Fix a mostly benign overflow bug in SEV's send|receive_update_data()
- Move the SVM-specific "host flags" into vcpu_svm (extracted from the
vNMI enabling series)
- A handful for fixes and cleanups
Paolo Bonzini [Wed, 15 Feb 2023 13:34:32 +0000 (08:34 -0500)]
Merge tag 'kvm-x86-selftests-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.3:
- Cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct
hypercall instruction instead of relying on KVM to patch in VMMCALL
- A variety of one-off cleanups and fixes
Paolo Bonzini [Wed, 15 Feb 2023 13:23:24 +0000 (08:23 -0500)]
Merge tag 'kvm-x86-pmu-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM x86 PMU changes for 6.3:
- Add support for created masked events for the PMU filter to allow
userspace to heavily restrict what events the guest can use without
needing to create an absurd number of events
- Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
support is disabled
- Add PEBS support for Intel SPR
Paolo Bonzini [Wed, 15 Feb 2023 13:22:44 +0000 (08:22 -0500)]
Merge tag 'kvm-x86-mmu-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM x86 MMU changes for 6.3:
- Fix and cleanup the range-based TLB flushing code, used when KVM is
running on Hyper-V
- A few one-off cleanups
Paolo Bonzini [Wed, 15 Feb 2023 13:22:09 +0000 (08:22 -0500)]
Merge tag 'kvm-x86-misc-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM x86 changes for 6.3:
- Advertise support for Intel's fancy new fast REP string features
- Fix a double-shootdown issue in the emergency reboot code
- Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM
similar treatment to VMX
- Update Xen's TSC info CPUID sub-leaves as appropriate
- Add support for Hyper-V's extended hypercalls, where "support" at this
point is just forwarding the hypercalls to userspace
- Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and
MSR filters
- One-off fixes and cleanups
Paolo Bonzini [Wed, 15 Feb 2023 12:30:43 +0000 (07:30 -0500)]
Merge tag 'kvm-x86-generic-6.3' of https://github.com/kvm-x86/linux into HEAD
Common KVM changes for 6.3:
- Account allocations in generic kvm_arch_alloc_vm()
- Fix a typo and a stale comment
- Fix a memory leak if coalesced MMIO unregistration fails
Oliver Upton [Mon, 13 Feb 2023 23:33:41 +0000 (23:33 +0000)]
Merge branch kvm-arm64/nv-prefix into kvmarm/next
* kvm-arm64/nv-prefix:
: Preamble to NV support, courtesy of Marc Zyngier.
:
: This brings in a set of prerequisite patches for supporting nested
: virtualization in KVM/arm64. Of course, there is a long way to go until
: NV is actually enabled in KVM.
:
: - Introduce cpucap / vCPU feature flag to pivot the NV code on
:
: - Add support for EL2 vCPU register state
:
: - Basic nested exception handling
:
: - Hide unsupported features from the ID registers for NV-capable VMs
KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
KVM: arm64: nv: Filter out unsupported features from ID regs
KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
KVM: arm64: nv: Handle SMCs taken from virtual EL2
KVM: arm64: nv: Handle trapped ERET from virtual EL2
KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
KVM: arm64: nv: Support virtual EL2 exceptions
KVM: arm64: nv: Handle HCR_EL2.NV system register traps
KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
KVM: arm64: nv: Add EL2 system registers to vcpu context
KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
KVM: arm64: nv: Introduce nested virtualization VCPU feature
KVM: arm64: Use the S2 MMU context to iterate over S2 table
arm64: Add ARM64_HAS_NESTED_VIRT cpufeature
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 23:32:59 +0000 (23:32 +0000)]
Merge branch kvm-arm64/misc into kvmarm/next
* kvm-arm64/misc:
: Miscellaneous updates
:
: - Convert CPACR_EL1_TTA to the new, generated system register
: definitions.
:
: - Serialize toggling CPACR_EL1.SMEN to avoid unexpected exceptions when
: accessing SVCR in the host.
:
: - Avoid quiescing the guest if a vCPU accesses its own redistributor's
: SGIs/PPIs, eliminating the need to IPI. Largely an optimization for
: nested virtualization, as the L1 accesses the affected registers
: rather often.
:
: - Conversion to kstrtobool()
:
: - Common definition of INVALID_GPA across architectures
:
: - Enable CONFIG_USERFAULTFD for CI runs of KVM selftests
KVM: arm64: Fix non-kerneldoc comments
KVM: selftests: Enable USERFAULTFD
KVM: selftests: Remove redundant setbuf()
arm64/sysreg: clean up some inconsistent indenting
KVM: MMU: Make the definition of 'INVALID_GPA' common
KVM: arm64: vgic-v3: Use kstrtobool() instead of strtobool()
KVM: arm64: vgic-v3: Limit IPI-ing when accessing GICR_{C,S}ACTIVER0
KVM: arm64: Synchronize SMEN on vcpu schedule out
KVM: arm64: Kill CPACR_EL1_TTA definition
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 23:31:23 +0000 (23:31 +0000)]
Merge branch kvm-arm64/apple-vgic-mi into kvmarm/next
* kvm-arm64/apple-vgic-mi:
: VGIC maintenance interrupt support for the AIC, courtesy of Marc Zyngier.
:
: The AIC provides a non-maskable VGIC maintenance interrupt, which until
: now was not supported by KVM. This series (1) allows the registration of
: a non-maskable maintenance interrupt and (2) wires in support for this
: with the AIC driver.
irqchip/apple-aic: Correctly map the vgic maintenance interrupt
irqchip/apple-aic: Register vgic maintenance interrupt with KVM
KVM: arm64: vgic: Allow registration of a non-maskable maintenance interrupt
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 23:30:37 +0000 (23:30 +0000)]
Merge branch kvm-arm64/psci-relay-fixes into kvmarm/next
* kvm-arm64/psci-relay-fixes:
: Fixes for CPU on/resume with pKVM, courtesy Quentin Perret.
:
: A consequence of deprivileging the host is that pKVM relays PSCI calls
: on behalf of the host. pKVM's CPU initialization failed to fully
: initialize the CPU's EL2 state, which notably led to unexpected SVE
: traps resulting in a hyp panic.
:
: The issue is addressed by reusing parts of __finalise_el2 to restore CPU
: state in the PSCI relay.
KVM: arm64: Finalise EL2 state from pKVM PSCI relay
KVM: arm64: Use sanitized values in __check_override in nVHE
KVM: arm64: Introduce finalise_el2_state macro
KVM: arm64: Provide sanitized SYS_ID_AA64SMFR0_EL1 to nVHE
Oliver Upton [Mon, 13 Feb 2023 23:26:21 +0000 (23:26 +0000)]
Merge branch kvm-arm64/nv-timer-improvements into kvmarm/next
* kvm-arm64/nv-timer-improvements:
: Timer emulation improvements, courtesy of Marc Zyngier.
:
: - Avoid re-arming an hrtimer for a guest timer that is already pending
:
: - Only reload the affected timer context when emulating a sysreg access
: instead of both the virtual/physical timers.
KVM: arm64: timers: Don't BUG() on unhandled timer trap
KVM: arm64: Reduce overhead of trapped timer sysreg accesses
KVM: arm64: Don't arm a hrtimer for an already pending timer
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 23:25:50 +0000 (23:25 +0000)]
Merge branch kvm-arm64/MAINTAINERS into kvmarm/next
* kvm-arm64/MAINTAINERS:
: KVM/arm64 MAINTAINERS updates
:
: - Drop the old columbia.edu mailing list (you will be missed!)
:
: - Move Oliver up to co-maintainer w/ Marc
KVM: arm64: Drop Columbia-hosted mailing list
MAINTAINERS: Add Oliver Upton as co-maintainer of KVM/arm64
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 22:33:10 +0000 (22:33 +0000)]
Merge branch kvm-arm64/parallel-access-faults into kvmarm/next
* kvm-arm64/parallel-access-faults:
: Parallel stage-2 access fault handling
:
: The parallel faults changes that went in to 6.2 covered most stage-2
: aborts, with the exception of stage-2 access faults. Building on top of
: the new infrastructure, this series adds support for handling access
: faults (i.e. updating the access flag) in parallel.
:
: This is expected to provide a performance uplift for cores that do not
: implement FEAT_HAFDBS, such as those from the fruit company.
KVM: arm64: Condition HW AF updates on config option
KVM: arm64: Handle access faults behind the read lock
KVM: arm64: Don't serialize if the access flag isn't set
KVM: arm64: Return EAGAIN for invalid PTE in attr walker
KVM: arm64: Ignore EAGAIN for walks outside of a fault
KVM: arm64: Use KVM's pte type/helpers in handle_access_fault()
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 22:32:14 +0000 (22:32 +0000)]
Merge branch kvm-arm64/virtual-cache-geometry into kvmarm/next
* kvm-arm64/virtual-cache-geometry:
: Virtualized cache geometry for KVM guests, courtesy of Akihiko Odaki.
:
: KVM/arm64 has always exposed the host cache geometry directly to the
: guest, even though non-secure software should never perform CMOs by
: Set/Way. This was slightly wrong, as the cache geometry was derived from
: the PE on which the vCPU thread was running and not a sanitized value.
:
: All together this leads to issues migrating VMs on heterogeneous
: systems, as the cache geometry saved/restored could be inconsistent.
:
: KVM/arm64 now presents 1 level of cache with 1 set and 1 way. The cache
: geometry is entirely controlled by userspace, such that migrations from
: older kernels continue to work.
KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
KVM: arm64: Normalize cache configuration
KVM: arm64: Mask FEAT_CCIDX
KVM: arm64: Always set HCR_TID2
arm64/cache: Move CLIDR macro definitions
arm64/sysreg: Add CCSIDR2_EL1
arm64/sysreg: Convert CCSIDR_EL1 to automatic generation
arm64: Allow the definition of UNKNOWN system register fields
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 22:30:07 +0000 (22:30 +0000)]
Merge branch arm64/for-next/sme2 into kvmarm/next
Merge the SME2 branch to fix up a rather annoying conflict due to the
EL2 finalization refactor.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Mon, 13 Feb 2023 22:28:30 +0000 (22:28 +0000)]
Merge branch kvm/kvm-hw-enable-refactor into kvmarm/next
Merge the kvm_init() + hardware enable rework to avoid conflicts
with kvmarm.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Oliver Upton [Sat, 11 Feb 2023 19:07:42 +0000 (19:07 +0000)]
KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
Avoid open-coding and just use the helper to encode the ID from the
sysreg table entry.
No functional change intended.
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230211190742.49843-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Christoffer Dall [Thu, 9 Feb 2023 17:58:20 +0000 (17:58 +0000)]
KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
So far we were flushing almost the entire universe whenever a VM would
load/unload the SCTLR_EL1 and the two versions of that register had
different MMU enabled settings. This turned out to be so slow that it
prevented forward progress for a nested VM, because a scheduler timer
tick interrupt would always be pending when we reached the nested VM.
To avoid this problem, we consider the SCTLR_EL2 when evaluating if
caches are on or off when entering virtual EL2 (because this is the
value that we end up shadowing onto the hardware EL1 register).
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-19-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Marc Zyngier [Thu, 9 Feb 2023 17:58:19 +0000 (17:58 +0000)]
KVM: arm64: nv: Filter out unsupported features from ID regs
As there is a number of features that we either can't support,
or don't want to support right away with NV, let's add some
basic filtering so that we don't advertize silly things to the
EL2 guest.
Whilst we are at it, advertize FEAT_TTL as well as FEAT_GTG, which
the NV implementation will implement.
Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-18-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Jintack Lim [Thu, 9 Feb 2023 17:58:18 +0000 (17:58 +0000)]
KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
With HCR_EL2.NV bit set, accesses to EL12 registers in the virtual EL2
trap to EL2. Handle those traps just like we do for EL1 registers.
One exception is CNTKCTL_EL12. We don't trap on CNTKCTL_EL1 for non-VHE
virtual EL2 because we don't have to. However, accessing CNTKCTL_EL12
will trap since it's one of the EL12 registers controlled by HCR_EL2.NV
bit. Therefore, add a handler for it and don't treat it as a
non-trap-registers when preparing a shadow context.
These registers, being only a view on their EL1 counterpart, are
permanently hidden from userspace.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
[maz: EL12_REG(), register visibility]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-17-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Marc Zyngier [Thu, 9 Feb 2023 17:58:17 +0000 (17:58 +0000)]
KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
So far, we never needed to distinguish between registers hidden
from userspace and being hidden from a guest (they are always
either visible to both, or hidden from both).
With NV, we have the ugly case of the EL02 and EL12 registers,
which are only a view on the EL0 and EL1 registers. It makes
absolutely no sense to expose them to userspace, since it
already has the canonical view.
Add a new visibility flag (REG_HIDDEN_USER) and a new helper that
checks for it and REG_HIDDEN when checking whether to expose
a sysreg to userspace. Subsequent patches will make use of it.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-16-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Marc Zyngier [Thu, 9 Feb 2023 17:58:16 +0000 (17:58 +0000)]
KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
We can no longer blindly copy the VCPU's PSTATE into SPSR_EL2 and return
to the guest and vice versa when taking an exception to the hypervisor,
because we emulate virtual EL2 in EL1 and therefore have to translate
the mode field from EL2 to EL1 and vice versa.
This requires keeping track of the state we enter the guest, for which
we transiently use a dedicated flag.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-15-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Jintack Lim [Thu, 9 Feb 2023 17:58:15 +0000 (17:58 +0000)]
KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
For the same reason we trap virtual memory register accesses at virtual
EL2, we need to trap SPSR_EL1, ELR_EL1 and VBAR_EL1 accesses. ARM v8.3
introduces the HCR_EL2.NV1 bit to be able to trap on those register
accesses in EL1. Do not set this bit until the whole nesting support is
completed, which happens further down the line...
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Jintack Lim [Thu, 9 Feb 2023 17:58:14 +0000 (17:58 +0000)]
KVM: arm64: nv: Handle SMCs taken from virtual EL2
Non-nested guests have used the hvc instruction to initiate SMCCC
calls into KVM. This is quite a poor fit for NV as hvc exceptions are
always taken to EL2. In other words, KVM needs to unconditionally
forward the hvc exception back into vEL2 to uphold the architecture.
Instead, treat the smc instruction from vEL2 as we would a guest
hypercall, thereby allowing the vEL2 to interact with KVM's hypercall
surface. Note that on NV-capable hardware HCR_EL2.TSC causes smc
instructions executed in non-secure EL1 to trap to EL2, even if EL3 is
not implemented.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-13-maz@kernel.org
[Oliver: redo commit message, only handle smc from vEL2]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Christoffer Dall [Thu, 9 Feb 2023 17:58:13 +0000 (17:58 +0000)]
KVM: arm64: nv: Handle trapped ERET from virtual EL2
When a guest hypervisor running virtual EL2 in EL1 executes an ERET
instruction, we will have set HCR_EL2.NV which traps ERET to EL2, so
that we can emulate the exception return in software.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Jintack Lim [Thu, 9 Feb 2023 17:58:12 +0000 (17:58 +0000)]
KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
As we expect all PSCI calls from the L1 hypervisor to be performed
using SMC when nested virtualization is enabled, it is clear that
all HVC instruction from the VM (including from the virtual EL2)
are supposed to handled in the virtual EL2.
Forward these to EL2 as required.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
[maz: add handling of HCR_EL2.HCD]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Jintack Lim [Thu, 9 Feb 2023 17:58:11 +0000 (17:58 +0000)]
KVM: arm64: nv: Support virtual EL2 exceptions
Support injecting exceptions and performing exception returns to and
from virtual EL2. This must be done entirely in software except when
taking an exception from vEL0 to vEL2 when the virtual HCR_EL2.{E2H,TGE}
== {1,1} (a VHE guest hypervisor).
[maz: switch to common exception injection framework, illegal exeption
return handling]
Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Jintack Lim [Thu, 9 Feb 2023 17:58:10 +0000 (17:58 +0000)]
KVM: arm64: nv: Handle HCR_EL2.NV system register traps
ARM v8.3 introduces a new bit in the HCR_EL2, which is the NV bit. When
this bit is set, accessing EL2 registers in EL1 traps to EL2. In
addition, executing the following instructions in EL1 will trap to EL2:
tlbi, at, eret, and msr/mrs instructions to access SP_EL1. Most of the
instructions that trap to EL2 with the NV bit were undef at EL1 prior to
ARM v8.3. The only instruction that was not undef is eret.
This patch sets up a handler for EL2 registers and SP_EL1 register
accesses at EL1. The host hypervisor keeps those register values in
memory, and will emulate their behavior.
This patch doesn't set the NV bit yet. It will be set in a later patch
once nested virtualization support is completed.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
[maz: EL2_REG() macros]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Christoffer Dall [Thu, 9 Feb 2023 17:58:09 +0000 (17:58 +0000)]
KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
When running a nested hypervisor we commonly have to figure out if
the VCPU mode is running in the context of a guest hypervisor or guest
guest, or just a normal guest.
Add convenient primitives for this.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-8-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Marc Zyngier [Thu, 9 Feb 2023 17:58:08 +0000 (17:58 +0000)]
KVM: arm64: nv: Add EL2 system registers to vcpu context
Add the minimal set of EL2 system registers to the vcpu context.
Nothing uses them just yet.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Christoffer Dall [Thu, 9 Feb 2023 17:58:07 +0000 (17:58 +0000)]
KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
We were not allowing userspace to set a more privileged mode for the VCPU
than EL1, but we should allow this when nested virtualization is enabled
for the VCPU.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Christoffer Dall [Thu, 9 Feb 2023 17:58:06 +0000 (17:58 +0000)]
KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
Reset the VCPU with PSTATE.M = EL2h when the nested virtualization
feature is enabled on the VCPU.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
[maz: rework register reset not to use empty data structures]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Christoffer Dall [Thu, 9 Feb 2023 17:58:05 +0000 (17:58 +0000)]
KVM: arm64: nv: Introduce nested virtualization VCPU feature
Introduce the feature bit and a primitive that checks if the feature is
set behind a static key check based on the cpus_have_const_cap check.
Checking vcpu_has_nv() on systems without nested virt enabled
should have negligible overhead.
We don't yet allow userspace to actually set this feature.
Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Marc Zyngier [Thu, 9 Feb 2023 17:58:04 +0000 (17:58 +0000)]
KVM: arm64: Use the S2 MMU context to iterate over S2 table
Most of our S2 helpers take a kvm_s2_mmu pointer, but quickly
revert back to using the kvm structure. By doing so, we lose
track of which S2 MMU context we were initially using, and fallback
to the "canonical" context.
If we were trying to unmap a S2 context managed by a guest hypervisor,
we end-up parsing the wrong set of page tables, and bad stuff happens
(as this is often happening on the back of a trapped TLBI from the
guest hypervisor).
Instead, make sure we always use the provided MMU context all the way.
This has no impact on non-NV, as we always pass the canonical MMU
context.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20230209175820.1939006-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Jintack Lim [Thu, 9 Feb 2023 17:58:03 +0000 (17:58 +0000)]
arm64: Add ARM64_HAS_NESTED_VIRT cpufeature
Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the
CPU has the ARMv8.3 nested virtualization capability, together
with the 'kvm-arm.mode=nested' command line option.
This will be used to support nested virtualization in KVM.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
[maz: moved the command-line option to kvm-arm.mode]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Shaoqin Huang [Wed, 8 Feb 2023 07:18:00 +0000 (15:18 +0800)]
KVM: selftests: Remove duplicate macro definition
The KVM_GUEST_PAGE_TABLE_MIN_PADDR macro has been defined in
include/kvm_util_base.h. So remove the duplicate definition in
lib/kvm_util.c.
Fixes:
cce0c23dd944 ("KVM: selftests: Add wrapper to allocate page table page")
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
Link: https://lore.kernel.org/r/20230208071801.68620-1-shahuang@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Michal Luczaj [Mon, 6 Feb 2023 20:24:30 +0000 (21:24 +0100)]
KVM: selftests: Clean up misnomers in xen_shinfo_test
As discussed[*], relabel the poorly named structs to align with the
current KVM nomenclature.
Old names are a leftover from before commit
52491a38b2c2 ("KVM:
Initialize gfn_to_pfn_cache locks in dedicated helper"), which i.a.
introduced kvm_gpc_init() and renamed kvm_gfn_to_pfn_cache_init()/
_destroy() to kvm_gpc_activate()/_deactivate(). Partly in an effort
to avoid implying that the cache really is destroyed/freed.
While at it, get rid of #define GPA_INVALID, which being used as a GFN,
is not only misnamed, but also unnecessarily reinvents a UAPI constant.
No functional change intended.
[*] https://lore.kernel.org/r/Y5yZ6CFkEMBqyJ6v@google.com
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230206202430.1898057-1-mhal@rbox.co
Signed-off-by: Sean Christopherson <seanjc@google.com>
Shaoqin Huang [Thu, 2 Feb 2023 02:57:15 +0000 (10:57 +0800)]
selftests: KVM: Replace optarg with arg in guest_modes_cmdline
The parameter arg in guest_modes_cmdline not being used now, and the
optarg should be replaced with arg in guest_modes_cmdline.
And this is the chance to change strtoul() to atoi_non_negative(), since
guest mode ID will never be negative.
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
Fixes:
e42ac777d661 ("KVM: selftests: Factor out guest mode code")
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Reviewed-by: Vipin Sharma <vipinsh@google.com>
Link: https://lore.kernel.org/r/20230202025716.216323-1-shahuang@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Alexander Gordeev [Fri, 29 Jan 2021 11:29:06 +0000 (12:29 +0100)]
s390/virtio: sort out physical vs virtual pointers usage
This does not fix a real bug, since virtual addresses
are currently indentical to physical ones.
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Nico Boehr [Mon, 7 Nov 2022 08:57:27 +0000 (09:57 +0100)]
KVM: s390: GISA: sort out physical vs virtual pointers usage
Fix virtual vs physical address confusion (which currently are the same).
In chsc_sgib(), do the virtual-physical conversion in the caller since
the caller needs to make sure it is a 31-bit address and zero has a
special meaning (disassociating the GIB).
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Link: https://lore.kernel.org/r/20221107085727.1533792-1-nrb@linux.ibm.com
Message-Id: <
20221107085727.1533792-1-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Wang Yong [Thu, 2 Feb 2023 08:13:42 +0000 (08:13 +0000)]
KVM: update code comment in struct kvm_vcpu
Commit
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
changed kvm->vcpus array to a xarray, so update the code comment of
kvm_vcpu->vcpu_idx accordingly.
Signed-off-by: Wang Yong <yongw.kernel@gmail.com>
Link: https://lore.kernel.org/r/20230202081342.856687-1-yongw.kernel@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Gavin Shan [Wed, 18 Jan 2023 09:21:33 +0000 (17:21 +0800)]
KVM: selftests: Assign guest page size in sync area early in memslot_perf_test
The guest page size in the synchronization area is needed by all test
cases. So it's reasonable to set it in the unified preparation function
(prepare_vm()).
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/r/20230118092133.320003-3-gshan@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Gavin Shan [Wed, 18 Jan 2023 09:21:32 +0000 (17:21 +0800)]
KVM: selftests: Remove duplicate VM creation in memslot_perf_test
Remove a spurious call to __vm_create_with_one_vcpu() that was introduced
by a merge gone sideways.
Fixes:
eb5618911af0 ("Merge tag 'kvmarm-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/r/20230118092133.320003-2-gshan@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Peter Gonda [Tue, 7 Feb 2023 17:13:54 +0000 (09:13 -0800)]
KVM: SVM: Fix potential overflow in SEV's send|receive_update_data()
KVM_SEV_SEND_UPDATE_DATA and KVM_SEV_RECEIVE_UPDATE_DATA have an integer
overflow issue. Params.guest_len and offset are both 32 bits wide, with a
large params.guest_len the check to confirm a page boundary is not
crossed can falsely pass:
/* Check if we are crossing the page boundary *
offset = params.guest_uaddr & (PAGE_SIZE - 1);
if ((params.guest_len + offset > PAGE_SIZE))
Add an additional check to confirm that params.guest_len itself is not
greater than PAGE_SIZE.
Note, this isn't a security concern as overflow can happen if and only if
params.guest_len is greater than 0xfffff000, and the FW spec says these
commands fail with lengths greater than 16KB, i.e. the PSP will detect
KVM's goof.
Fixes:
15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command")
Fixes:
d3d1af85e2c7 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command")
Reported-by: Andy Nguyen <theflow@google.com>
Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230207171354.4012821-1-pgonda@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Janis Schoetterl-Glausch [Tue, 7 Feb 2023 16:42:25 +0000 (17:42 +0100)]
KVM: s390: selftest: memop: Add cmpxchg tests
Test successful exchange, unsuccessful exchange, storage key protection
and invalid arguments.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230207164225.2114706-1-scgl@linux.ibm.com
Message-Id: <
20230207164225.2114706-1-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:46:01 +0000 (17:46 +0100)]
Documentation: KVM: s390: Describe KVM_S390_MEMOP_F_CMPXCHG
Describe the semantics of the new KVM_S390_MEMOP_F_CMPXCHG flag for
absolute vm write memops which allows user space to perform (storage key
checked) cmpxchg operations on guest memory.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-14-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-14-scgl@linux.ibm.com>
[frankja@de.ibm.com: Removed a line from an earlier version]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:46:00 +0000 (17:46 +0100)]
KVM: s390: Extend MEM_OP ioctl by storage key checked cmpxchg
User space can use the MEM_OP ioctl to make storage key checked reads
and writes to the guest, however, it has no way of performing atomic,
key checked, accesses to the guest.
Extend the MEM_OP ioctl in order to allow for this, by adding a cmpxchg
op. For now, support this op for absolute accesses only.
This op can be used, for example, to set the device-state-change
indicator and the adapter-local-summary indicator atomically.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-13-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-13-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:59 +0000 (17:45 +0100)]
KVM: s390: Refactor vcpu mem_op function
Remove code duplication with regards to the CHECK_ONLY flag.
Decrease the number of indents.
No functional change indented.
Suggested-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-12-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-12-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:58 +0000 (17:45 +0100)]
KVM: s390: Refactor absolute vm mem_op function
Remove code duplication with regards to the CHECK_ONLY flag.
Decrease the number of indents.
No functional change indented.
Suggested-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-11-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-11-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:57 +0000 (17:45 +0100)]
KVM: s390: Dispatch to implementing function at top level of vm mem_op
Instead of having one function covering all mem_op operations,
have a function implementing absolute access and dispatch to that
function in its caller, based on the operation code.
This way additional future operations can be implemented by adding an
implementing function without changing existing operations.
Suggested-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-10-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-10-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:56 +0000 (17:45 +0100)]
KVM: s390: Move common code of mem_op functions into function
The vcpu and vm mem_op ioctl implementations share some functionality.
Move argument checking into a function and call it from both
implementations. This allows code reuse in case of additional future
mem_op operations.
Suggested-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-9-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-9-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:55 +0000 (17:45 +0100)]
KVM: s390: selftest: memop: Fix integer literal
The address is a 64 bit value, specifying a 32 bit value can crash the
guest. In this case things worked out with -O2 but not -O0.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Fixes:
1bb873495a9e ("KVM: s390: selftests: Add more copy memop tests")
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20230206164602.138068-8-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-8-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:54 +0000 (17:45 +0100)]
KVM: s390: selftest: memop: Fix wrong address being used in test
The guest code sets the key for mem1 only. In order to provoke a
protection exception the test codes needs to address mem1.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-7-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-7-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:53 +0000 (17:45 +0100)]
KVM: s390: selftest: memop: Fix typo
"
acceeded" isn't a word, should be "exceeded".
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-6-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-6-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:52 +0000 (17:45 +0100)]
KVM: s390: selftest: memop: Add bad address test
Add a test that tries a real write to a bad address.
The existing CHECK_ONLY test doesn't cover all paths.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-5-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-5-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:51 +0000 (17:45 +0100)]
KVM: s390: selftest: memop: Move testlist into main
This allows checking if the necessary requirements for a test case are
met via an arbitrary expression. In particular, it is easy to check if
certain bits are set in the memop extension capability.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-4-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-4-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:50 +0000 (17:45 +0100)]
KVM: s390: selftest: memop: Replace macros by functions
Replace the DEFAULT_* test helpers by functions, as they don't
need the extra flexibility.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-3-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-3-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janis Schoetterl-Glausch [Mon, 6 Feb 2023 16:45:49 +0000 (17:45 +0100)]
KVM: s390: selftest: memop: Pass mop_desc via pointer
The struct is quite large, so this seems nicer.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-2-scgl@linux.ibm.com
Message-Id: <
20230206164602.138068-2-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Nina Schoetterl-Glausch [Fri, 27 Jan 2023 17:45:52 +0000 (18:45 +0100)]
KVM: selftests: Compile s390 tests with -march=z10
The guest used in s390 kvm selftests is not be set up to handle all
instructions the compiler might emit, i.e. vector instructions, leading
to crashes.
Limit what the compiler emits to the oldest machine model currently
supported by Linux.
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20230127174552.3370169-1-nsg@linux.ibm.com
Message-Id: <
20230127174552.3370169-1-nsg@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Nico Boehr [Fri, 27 Jan 2023 14:05:32 +0000 (15:05 +0100)]
KVM: s390: disable migration mode when dirty tracking is disabled
Migration mode is a VM attribute which enables tracking of changes in
storage attributes (PGSTE). It assumes dirty tracking is enabled on all
memslots to keep a dirty bitmap of pages with changed storage attributes.
When enabling migration mode, we currently check that dirty tracking is
enabled for all memslots. However, userspace can disable dirty tracking
without disabling migration mode.
Since migration mode is pointless with dirty tracking disabled, disable
migration mode whenever userspace disables dirty tracking on any slot.
Also update the documentation to clarify that dirty tracking must be
enabled when enabling migration mode, which is already enforced by the
code in kvm_s390_vm_start_migration().
Also highlight in the documentation for KVM_S390_GET_CMMA_BITS that it
can now fail with -EINVAL when dirty tracking is disabled while
migration mode is on. Move all the error codes to a table so this stays
readable.
To disable migration mode, slots_lock should be held, which is taken
in kvm_set_memory_region() and thus held in
kvm_arch_prepare_memory_region().
Restructure the prepare code a bit so all the sanity checking is done
before disabling migration mode. This ensures migration mode isn't
disabled when some sanity check fails.
Cc: stable@vger.kernel.org
Fixes:
190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20230127140532.230651-2-nrb@linux.ibm.com
Message-Id: <
20230127140532.230651-2-nrb@linux.ibm.com>
[frankja@linux.ibm.com: fixed commit message typo, moved api.rst error table upwards]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Janosch Frank [Tue, 7 Feb 2023 17:04:23 +0000 (18:04 +0100)]
Merge remote-tracking branch 'l390-korg/cmpxchg_user_key' into kvm-next
Alexandru Matei [Mon, 23 Jan 2023 22:12:08 +0000 (00:12 +0200)]
KVM: VMX: Fix crash due to uninitialized current_vmcs
KVM enables 'Enlightened VMCS' and 'Enlightened MSR Bitmap' when running as
a nested hypervisor on top of Hyper-V. When MSR bitmap is updated,
evmcs_touch_msr_bitmap function uses current_vmcs per-cpu variable to mark
that the msr bitmap was changed.
vmx_vcpu_create() modifies the msr bitmap via vmx_disable_intercept_for_msr
-> vmx_msr_bitmap_l01_changed which in the end calls this function. The
function checks for current_vmcs if it is null but the check is
insufficient because current_vmcs is not initialized. Because of this, the
code might incorrectly write to the structure pointed by current_vmcs value
left by another task. Preemption is not disabled, the current task can be
preempted and moved to another CPU while current_vmcs is accessed multiple
times from evmcs_touch_msr_bitmap() which leads to crash.
The manipulation of MSR bitmaps by callers happens only for vmcs01 so the
solution is to use vmx->vmcs01.vmcs instead of current_vmcs.
BUG: kernel NULL pointer dereference, address:
0000000000000338
PGD
4e1775067 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
...
RIP: 0010:vmx_msr_bitmap_l01_changed+0x39/0x50 [kvm_intel]
...
Call Trace:
vmx_disable_intercept_for_msr+0x36/0x260 [kvm_intel]
vmx_vcpu_create+0xe6/0x540 [kvm_intel]
kvm_arch_vcpu_create+0x1d1/0x2e0 [kvm]
kvm_vm_ioctl_create_vcpu+0x178/0x430 [kvm]
kvm_vm_ioctl+0x53f/0x790 [kvm]
__x64_sys_ioctl+0x8a/0xc0
do_syscall_64+0x5c/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes:
ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Link: https://lore.kernel.org/r/20230123221208.4964-1-alexandru.matei@uipath.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Atish Patra [Tue, 7 Feb 2023 09:55:29 +0000 (01:55 -0800)]
RISC-V: KVM: Increment firmware pmu events
KVM supports firmware events now. Invoke the firmware event increment
function from appropriate places.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Tue, 7 Feb 2023 09:55:28 +0000 (01:55 -0800)]
RISC-V: KVM: Support firmware events
SBI PMU extension defines a set of firmware events which can provide
useful information to guests about the number of SBI calls. As
hypervisor implements the SBI PMU extension, these firmware events
correspond to ecall invocations between VS->HS mode. All other firmware
events will always report zero if monitored as KVM doesn't implement them.
This patch adds all the infrastructure required to support firmware
events.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Tue, 7 Feb 2023 12:14:28 +0000 (17:44 +0530)]
RISC-V: KVM: Implement perf support without sampling
RISC-V SBI PMU & Sscofpmf ISA extension allows supporting perf in
the virtualization enviornment as well. KVM implementation
relies on SBI PMU extension for the most part while trapping
& emulating the CSRs read for counter access.
This patch doesn't have the event sampling support yet.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Tue, 7 Feb 2023 09:55:26 +0000 (01:55 -0800)]
RISC-V: KVM: Implement trap & emulate for hpmcounters
As the KVM guests only see the virtual PMU counters, all hpmcounter
access should trap and KVM emulates the read access on behalf of guests.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Tue, 7 Feb 2023 09:55:25 +0000 (01:55 -0800)]
RISC-V: KVM: Disable all hpmcounter access for VS/VU mode
Any guest must not get access to any hpmcounter including cycle/instret
without any checks. We achieve that by disabling all the bits except TM
bit in hcounteren.
However, instret and cycle access for guest user space can be enabled
upon explicit request (via ONE REG) or on first trap from VU mode
to maintain ABI requirement in the future. This patch doesn't support
that as ONE REG interface is not settled yet.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Tue, 7 Feb 2023 09:55:24 +0000 (01:55 -0800)]
RISC-V: KVM: Make PMU functionality depend on Sscofpmf
The privilege mode filtering feature must be available in the host so
that the host can inhibit the counters while the execution is in HS mode.
Otherwise, the guests may have access to critical guest information.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Tue, 7 Feb 2023 09:55:23 +0000 (01:55 -0800)]
RISC-V: KVM: Add SBI PMU extension support
SBI PMU extension allows KVM guests to configure/start/stop/query
about the PMU counters in virtualized enviornment as well.
In order to allow that, KVM implements the entire SBI PMU extension.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Tue, 7 Feb 2023 09:55:22 +0000 (01:55 -0800)]
RISC-V: KVM: Add skeleton support for perf
This patch only adds barebone structure of perf implementation. Most
of the function returns zero at this point and will be implemented
fully in the future.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Sun, 5 Feb 2023 01:15:07 +0000 (17:15 -0800)]
RISC-V: KVM: Modify SBI extension handler to return SBI error code
Currently, the SBI extension handle is expected to return Linux error code.
The top SBI layer converts the Linux error code to SBI specific error code
that can be returned to guest invoking the SBI calls. This model works
as long as SBI error codes have 1-to-1 mappings between them.
However, that may not be true always. This patch attempts to disassociate
both these error codes by allowing the SBI extension implementation to
return SBI specific error codes as well.
The extension will continue to return the Linux error specific code which
will indicate any problem *with* the extension emulation while the
SBI specific error will indicate the problem *of* the emulation.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Sun, 5 Feb 2023 01:15:06 +0000 (17:15 -0800)]
RISC-V: KVM: Return correct code for hsm stop function
According to the SBI specification, the stop function can only
return error code SBI_ERR_FAILED. However, currently it returns
-EINVAL which will be mapped SBI_ERR_INVALID_PARAM.
Return an linux error code that maps to SBI_ERR_FAILED i.e doesn't map
to any other SBI error code. While EACCES is not the best error code
to describe the situation, it is close enough and will be replaced
with SBI error codes directly anyways.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Sun, 5 Feb 2023 01:15:05 +0000 (17:15 -0800)]
RISC-V: KVM: Define a probe function for SBI extension data structures
Currently the probe function just checks if an SBI extension is
registered or not. However, the extension may not want to advertise
itself depending on some other condition.
An additional extension specific probe function will allow
extensions to decide if they want to be advertised to the caller or
not. Any extension that does not require additional dependency checks
can avoid implementing this function.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Sun, 5 Feb 2023 01:15:04 +0000 (17:15 -0800)]
RISC-V: Improve SBI PMU extension related definitions
This patch fixes/improve few minor things in SBI PMU extension
definition.
1. Align all the firmware event names.
2. Add macros for bit positions in cache event ID & ops.
The changes were small enough to combine them together instead
of creating 1 liner patches.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Sun, 5 Feb 2023 01:15:03 +0000 (17:15 -0800)]
perf: RISC-V: Improve privilege mode filtering for perf
Currently, the host driver doesn't have any method to identify if the
requested perf event is from kvm or bare metal. As KVM runs in HS
mode, there are no separate hypervisor privilege mode to distinguish
between the attributes for guest/host.
Improve the privilege mode filtering by using the event specific
config1 field.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Atish Patra [Sun, 5 Feb 2023 01:15:02 +0000 (17:15 -0800)]
perf: RISC-V: Define helper functions expose hpm counter width and count
KVM module needs to know how many hardware counters and the counter
width that the platform supports. Otherwise, it will not be able to show
optimal value of virtual counters to the guest. The virtual hardware
counters also need to have the same width as the logical hardware
counters for simplicity. However, there shouldn't be mapping between
virtual hardware counters and logical hardware counters. As we don't
support hetergeneous harts or counters with different width as of now,
the implementation relies on the counter width of the first available
programmable counter.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Andy Chiu [Fri, 27 Jan 2023 05:58:55 +0000 (11:28 +0530)]
RISC-V: KVM: Redirect illegal instruction traps to guest
The M-mode redirects an unhandled illegal instruction trap back
to S-mode. However, KVM running in HS-mode terminates the VS-mode
software when it receives illegal instruction trap. Instead, KVM
should redirect the illegal instruction trap back to VS-mode, and
let VS-mode trap handler decide the next step. This futher allows
guest kernel to implement on-demand enabling of vector extension
for a guest user space process upon first-use.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Anup Patel [Sat, 28 Jan 2023 07:48:25 +0000 (13:18 +0530)]
RISC-V: KVM: Fix privilege mode setting in kvm_riscv_vcpu_trap_redirect()
The kvm_riscv_vcpu_trap_redirect() should set guest privilege mode
to supervisor mode because guest traps/interrupts are always handled
in virtual supervisor mode.
Fixes:
9f7013265112 ("RISC-V: KVM: Handle MMIO exits for VCPU")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Alexandre Ghiti [Mon, 23 Jan 2023 09:29:28 +0000 (10:29 +0100)]
KVM: RISC-V: Fix wrong usage of PGDIR_SIZE to check page sizes
At the moment, riscv only supports PMD and PUD hugepages. For sv39,
PGDIR_SIZE == PUD_SIZE but not for sv48 and sv57. So fix this by changing
PGDIR_SIZE into PUD_SIZE.
Fixes:
9d05c1fee837 ("RISC-V: KVM: Implement stage2 page table programming")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Oliver Upton [Mon, 6 Feb 2023 23:52:29 +0000 (23:52 +0000)]
KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
Generally speaking, any memory allocations that can be associated with a
particular VM should be charged to the cgroup of its process.
Nonetheless, there are a couple spots in KVM/arm64 that aren't currently
accounted:
- the ccsidr array containing the virtualized cache hierarchy
- the cpumask of supported cpus, for use of the vPMU on heterogeneous
systems
Go ahead and set __GFP_ACCOUNT for these allocations.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230206235229.4174711-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Marc Zyngier [Tue, 7 Feb 2023 09:43:21 +0000 (09:43 +0000)]
KVM: arm64: Fix non-kerneldoc comments
The robots amongts us have started spitting out irritating emails about
random errors such as:
<quote>
arch/arm64/kvm/arm.c:2207: warning: expecting prototype for Initialize Hyp().
Prototype was for kvm_arm_init() instead
</quote>
which makes little sense until you finally grok what they are on about:
comments that look like a kerneldoc, but that aren't.
Let's address this before I get even more irritated... ;-)
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/63e139e1.J5AHO6vmxaALh7xv%25lkp@intel.com
Link: https://lore.kernel.org/r/20230207094321.1238600-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Yu Zhang [Wed, 9 Nov 2022 07:54:13 +0000 (15:54 +0800)]
KVM: nVMX: Simplify the setting of SECONDARY_EXEC_ENABLE_VMFUNC for nested.
Values of base settings for nested proc-based VM-Execution control MSR come
from the ones for non-nested. And for SECONDARY_EXEC_ENABLE_VMFUNC flag,
KVM currently a) first mask off it from vmcs_conf->cpu_based_2nd_exec_ctrl;
b) then check it against the same source; c) and reset it again if host has
it.
So just simplify this, by not masking off SECONDARY_EXEC_ENABLE_VMFUNC in
the first place.
No functional change.
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Link: https://lore.kernel.org/r/20221109075413.1405803-3-yu.c.zhang@linux.intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Yu Zhang [Wed, 9 Nov 2022 07:54:12 +0000 (15:54 +0800)]
KVM: VMX: Do not trap VMFUNC instructions for L1 guests.
Explicitly disable VMFUNC in vmcs01 to document that KVM doesn't support
any VM-Functions for L1. WARN in the dedicated VMFUNC handler if an exit
occurs while L1 is active, but keep the existing handlers as fallbacks to
avoid killing the VM as an unexpected VMFUNC VM-Exit isn't fatal
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Link: https://lore.kernel.org/r/20221109075413.1405803-2-yu.c.zhang@linux.intel.com
[sean: don't kill the VM on an unexpected VMFUNC from L1, reword changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Mark Brown [Thu, 2 Feb 2023 21:01:36 +0000 (21:01 +0000)]
KVM: selftests: Enable USERFAULTFD
The page_fault_test KVM selftest requires userfaultfd but the config
fragment for the KVM selftests does not enable it, meaning that those tests
are skipped in CI systems that rely on appropriate settings in the config
fragments except on S/390 which happens to have it in defconfig. Enable
the option in the config fragment so that the tests get run.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230202-kvm-selftest-userfaultfd-v1-1-8186ac5a33a5@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Marc Zyngier [Mon, 6 Feb 2023 09:24:40 +0000 (09:24 +0000)]
arm64/sme: Fix __finalise_el2 SMEver check
When checking for ID_AA64SMFR0_EL1.SMEver, __check_override assumes
that the ID_AA64SMFR0_EL1 value is in x1, and the intent of the code
is to reuse value read a few lines above.
However, as the comment says at the beginning of the macro, x1 will
be clobbered, and the checks always fails.
The easiest fix is just to reload the id register before checking it.
Fixes:
f122576f3533 ("arm64/sme: Enable host kernel to access ZT0")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Sun, 5 Feb 2023 21:13:28 +0000 (13:13 -0800)]
Linux 6.2-rc7
Linus Torvalds [Sun, 5 Feb 2023 20:19:55 +0000 (12:19 -0800)]
Merge tag 'usb-6.2-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes that resolve some reported problems.
These include:
- gadget driver fixes
- dwc3 driver fix
- typec driver fix
- MAINTAINERS file update.
All of these have been in linux-next with no reported problems"
* tag 'usb-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: ucsi: Don't attempt to resume the ports before they exist
usb: gadget: udc: do not clear gadget driver.bus
usb: gadget: f_uac2: Fix incorrect increment of bNumEndpoints
usb: gadget: f_fs: Fix unbalanced spinlock in __ffs_ep0_queue_wait
usb: dwc3: qcom: enable vbus override when in OTG dr-mode
MAINTAINERS: Add myself as UVC Gadget Maintainer
Linus Torvalds [Sun, 5 Feb 2023 20:06:29 +0000 (12:06 -0800)]
Merge tag 'tty-6.2-rc7' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small serial and vt fixes. These include:
- 8250 driver fixes relating to dma issues
- stm32 serial driver fix for threaded irqs
- vc_screen bugfix for reported problems.
All have been in linux-next for a while with no reported problems"
* tag 'tty-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vc_screen: move load of struct vc_data pointer in vcs_read() to avoid UAF
serial: 8250_dma: Fix DMA Rx rearm race
serial: 8250_dma: Fix DMA Rx completion race
serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler
Linus Torvalds [Sun, 5 Feb 2023 19:52:23 +0000 (11:52 -0800)]
Merge tag 'char-misc-6.2-rc7' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a number of small char/misc/whatever driver fixes. They
include:
- IIO driver fixes for some reported problems
- nvmem driver fixes
- fpga driver fixes
- debugfs memory leak fix in the hv_balloon and irqdomain code
(irqdomain change was acked by the maintainer)
All have been in linux-next with no reported problems"
* tag 'char-misc-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (33 commits)
kernel/irq/irqdomain.c: fix memory leak with using debugfs_lookup()
HV: hv_balloon: fix memory leak with using debugfs_lookup()
nvmem: qcom-spmi-sdam: fix module autoloading
nvmem: core: fix return value
nvmem: core: fix cell removal on error
nvmem: core: fix device node refcounting
nvmem: core: fix registration vs use race
nvmem: core: fix cleanup after dev_set_name()
nvmem: core: remove nvmem_config wp_gpio
nvmem: core: initialise nvmem->id early
nvmem: sunxi_sid: Always use 32-bit MMIO reads
nvmem: brcm_nvram: Add check for kzalloc
iio: imu: fxos8700: fix MAGN sensor scale and unit
iio: imu: fxos8700: remove definition FXOS8700_CTRL_ODR_MIN
iio: imu: fxos8700: fix failed initialization ODR mode assignment
iio: imu: fxos8700: fix incorrect ODR mode readback
iio: light: cm32181: Fix PM support on system with 2 I2C resources
iio: hid: fix the retval in gyro_3d_capture_sample
iio: hid: fix the retval in accel_3d_capture_sample
iio: imu: st_lsm6dsx: fix build when CONFIG_IIO_TRIGGERED_BUFFER=m
...
Linus Torvalds [Sun, 5 Feb 2023 19:43:00 +0000 (11:43 -0800)]
Merge tag 'fbdev-for-6.2-rc7' of git://git./linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes from Helge Deller:
- fix fbcon to prevent fonts bigger than 32x32 pixels to avoid
overflows reported by syzbot
- switch omapfb to use kstrtobool()
- switch some fbdev drivers to use the backlight helpers
* tag 'fbdev-for-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbcon: Check font dimension limits
fbdev: omapfb: Use kstrtobool() instead of strtobool()
fbdev: fbmon: fix function name in kernel-doc
fbdev: atmel_lcdfb: Rework backlight status updates
fbdev: riva: Use backlight helper
fbdev: omapfb: panel-dsi-cm: Use backlight helper
fbdev: nvidia: Use backlight helper
fbdev: mx3fb: Use backlight helper
fbdev: radeon: Use backlight helper
fbdev: atyfb: Use backlight helper
fbdev: aty128fb: Use backlight helper
Linus Torvalds [Sun, 5 Feb 2023 19:28:42 +0000 (11:28 -0800)]
Merge tag 'x86_urgent_for_v6.2_rc7' of git://git./linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
- Prevent the compiler from reordering accesses to debug regs which
could cause a #VC exception in SEV-ES guests at the wrong place in
the NMI handling path
* tag 'x86_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/debug: Fix stack recursion caused by wrongly ordered DR7 accesses
Linus Torvalds [Sun, 5 Feb 2023 19:03:56 +0000 (11:03 -0800)]
Merge tag 'perf_urgent_for_v6.2_rc7' of git://git./linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:
- Lock the proper critical section when dealing with perf event context
* tag 'perf_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix perf_event_pmu_context serialization
Linus Torvalds [Sun, 5 Feb 2023 02:40:51 +0000 (18:40 -0800)]
Merge tag 'powerpc-6.2-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"It's a bit of a big batch for rc6, but just because I didn't send any
fixes the last week or two while I was on vacation, next week should
be quieter:
- Fix a few objtool warnings since we recently enabled objtool.
- Fix a deadlock with the hash MMU vs perf record.
- Fix perf profiling of asynchronous interrupt handlers.
- Revert the IMC PMU nest_init_lock to being a mutex.
- Two commits fixing problems with the kexec_file FDT size
estimation.
- Two commits fixing problems with strict RWX vs kernels running at
non-zero.
- Reconnect tlb_flush() to hash__tlb_flush()
Thanks to Kajol Jain, Nicholas Piggin, Sachin Sant Sathvika Vasireddy,
and Sourabh Jain"
* tag 'powerpc-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()
powerpc/kexec_file: Count hot-pluggable memory in FDT estimate
powerpc/64s/radix: Fix RWX mapping with relocated kernel
powerpc/64s/radix: Fix crash with unaligned relocated kernel
powerpc/kexec_file: Fix division by zero in extra size estimation
powerpc/imc-pmu: Revert nest_init_lock to being a mutex
powerpc/64: Fix perf profiling asynchronous interrupt handlers
powerpc/64s: Fix local irq disable when PMIs are disabled
powerpc/kvm: Fix unannotated intra-function call warning
powerpc/85xx: Fix unannotated intra-function call warning