Pekka Enberg [Mon, 9 May 2011 19:48:54 +0000 (22:48 +0300)]
KVM: Add documentation for KVM_CAP_NR_VCPUS
Document KVM_CAP_NR_VCPUS that can be used by the userspace to determine
maximum number of VCPUs it can create with the KVM_CREATE_VCPU ioctl.
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Wed, 4 May 2011 13:31:04 +0000 (16:31 +0300)]
KVM: make guest mode entry to be rcu quiescent state
KVM does not hold any references to rcu protected data when it switches
CPU into a guest mode. In fact switching to a guest mode is very similar
to exiting to userspase from rcu point of view. In addition CPU may stay
in a guest mode for quite a long time (up to one time slice). Lets treat
guest mode as quiescent state, just like we do with user-mode execution.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 11 May 2011 09:56:53 +0000 (05:56 -0400)]
Merge commit '
29ce831000081dd757d3116bf774aafffc4b6b20' into next
* commit '
29ce831000081dd757d3116bf774aafffc4b6b20': (34 commits)
rcu: provide rcu_virt_note_context_switch() function.
rcu: get rid of signed overflow in check_cpu_stall()
rcu: optimize rcutiny
rcu: prevent call_rcu() from diving into rcu core if irqs disabled
rcu: further lower priority in rcu_yield()
rcu: introduce kfree_rcu()
rcu: fix spelling
rcu: call __rcu_read_unlock() in exit_rcu for tree RCU
rcu: Converge TINY_RCU expedited and normal boosting
rcu: remove useless ->boosted_this_gp field
rcu: code cleanups in TINY_RCU priority boosting.
rcu: Switch to this_cpu() primitives
rcu: Use WARN_ON_ONCE for DEBUG_OBJECTS_RCU_HEAD warnings
rcu: mark rcutorture boosting callback as being on-stack
rcu: add DEBUG_OBJECTS_RCU_HEAD check for alignment
rcu: Enable DEBUG_OBJECTS_RCU_HEAD from !PREEMPT
rcu: Add forward-progress diagnostic for per-CPU kthreads
rcu: add grace-period age and more kthread state to tracing
rcu: fix tracing bug thinko on boost-balk attribution
rcu: update tracing documentation for new rcutorture and rcuboost
...
Pulling in rcu_virt_note_context_switch().
Signed-off-by: Avi Kivity <avi@redhat.com>
* commit '
29ce831000081dd757d3116bf774aafffc4b6b20': (34 commits)
rcu: provide rcu_virt_note_context_switch() function.
rcu: get rid of signed overflow in check_cpu_stall()
rcu: optimize rcutiny
rcu: prevent call_rcu() from diving into rcu core if irqs disabled
rcu: further lower priority in rcu_yield()
rcu: introduce kfree_rcu()
rcu: fix spelling
rcu: call __rcu_read_unlock() in exit_rcu for tree RCU
rcu: Converge TINY_RCU expedited and normal boosting
rcu: remove useless ->boosted_this_gp field
rcu: code cleanups in TINY_RCU priority boosting.
rcu: Switch to this_cpu() primitives
rcu: Use WARN_ON_ONCE for DEBUG_OBJECTS_RCU_HEAD warnings
rcu: mark rcutorture boosting callback as being on-stack
rcu: add DEBUG_OBJECTS_RCU_HEAD check for alignment
rcu: Enable DEBUG_OBJECTS_RCU_HEAD from !PREEMPT
rcu: Add forward-progress diagnostic for per-CPU kthreads
rcu: add grace-period age and more kthread state to tracing
rcu: fix tracing bug thinko on boost-balk attribution
rcu: update tracing documentation for new rcutorture and rcuboost
...
Takuya Yoshikawa [Sun, 1 May 2011 17:30:48 +0000 (02:30 +0900)]
KVM: x86 emulator: Make jmp far emulation into a separate function
We introduce em_jmp_far().
We also call this from em_grp45() to stop treating modrm_reg == 5 case
separately in the group 5 emulation.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sun, 1 May 2011 17:29:17 +0000 (02:29 +0900)]
KVM: x86 emulator: Rename emulate_grpX() to em_grpX()
The prototypes are changed appropriately.
We also replaces "goto grp45;" with simple em_grp45() call.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sun, 1 May 2011 17:27:55 +0000 (02:27 +0900)]
KVM: x86 emulator: Remove unused arg from emulate_pop()
The opt of emulate_grp1a() is also removed.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sun, 1 May 2011 17:26:23 +0000 (02:26 +0900)]
KVM: x86 emulator: Remove unused arg from writeback()
Remove inline at this chance.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sun, 1 May 2011 17:25:07 +0000 (02:25 +0900)]
KVM: x86 emulator: Remove unused arg from read_descriptor()
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sun, 1 May 2011 17:23:13 +0000 (02:23 +0900)]
KVM: x86 emulator: Remove unused arg from seg_override()
In addition, one comma at the end of a statement is replaced with a
semicolon.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sat, 7 May 2011 07:35:38 +0000 (16:35 +0900)]
KVM: Validate userspace_addr of memslot when registered
This way, we can avoid checking the user space address many times when
we read the guest memory.
Although we can do the same for write if we check which slots are
writable, we do not care write now: reading the guest memory happens
more often than writing.
[avi: change VERIFY_READ to VERIFY_WRITE]
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sat, 7 May 2011 07:31:36 +0000 (16:31 +0900)]
KVM: MMU: Clean up gpte reading with copy_from_user()
When we optimized walk_addr_generic() by not using the generic guest
memory reader, we replaced copy_from_user() with get_user():
commit
e30d2a170506830d5eef5e9d7990c5aedf1b0a51
KVM: MMU: Optimize guest page table walk
commit
15e2ac9a43d4d7d08088e404fddf2533a8e7d52e
KVM: MMU: Fix 64-bit paging breakage on x86_32
But as Andi pointed out later, copy_from_user() does the same as
get_user() as long as we give a constant size to it.
So we use copy_from_user() to clean up the code.
The only, noticeable, regression introduced by this is 64-bit gpte
reading on x86_32 hosts needed for PAE guests.
But this can be mitigated by implementing 8-byte get_user() for x86_32,
if needed.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Scott Wood [Wed, 27 Apr 2011 22:24:21 +0000 (17:24 -0500)]
KVM: PPC: booke: add sregs support
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Wed, 27 Apr 2011 22:24:10 +0000 (17:24 -0500)]
KVM: PPC: booke: save/restore VRSAVE (a.k.a. USPRG0)
Linux doesn't use USPRG0 (now renamed VRSAVE in the architecture, even
when Altivec isn't involved), but a guest might.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Stuart Yoder [Mon, 28 Mar 2011 20:01:56 +0000 (15:01 -0500)]
KVM: PPC: use ticks, not usecs, for exit timing
Convert to microseconds when displaying
(with fix from Bharat Bhushan <Bharat.Bhushan@freescale.com>).
This reduces rounding error with large quantities of short exits.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Mon, 28 Mar 2011 20:01:24 +0000 (15:01 -0500)]
KVM: PPC: fix exit accounting for SPRs, tlbwe, tlbsx
The exit type setting for mfspr/mtspr is moved from 44x to toplevel SPR
emulation. This enables it on e500, and makes sure that all SPRs
are covered.
Exit accounting for tlbwe and tlbsx is added to e500.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Tue, 29 Mar 2011 21:49:10 +0000 (16:49 -0500)]
KVM: PPC: e500: emulate SVR
Return the actual host SVR for now, as we already do for PVR. Eventually
we may support Qemu overriding PVR/SVR if the situation is appropriate,
once we implement KVM_SET_SREGS on e500.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Avi Kivity [Wed, 27 Apr 2011 16:42:18 +0000 (19:42 +0300)]
KVM: VMX: Cache vmcs segment fields
Since the emulator now checks segment limits and access rights, it
generates a lot more accesses to the vmcs segment fields. Undo some
of the performance hit by cacheing those fields in a read-only cache
(the entire cache is invalidated on any write, or on guest exit).
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 27 Apr 2011 10:20:30 +0000 (13:20 +0300)]
KVM: x86 emulator: consolidate segment accessors
Instead of separate accessors for the segment selector and cached descriptor,
use one accessor for both. This simplifies the code somewhat.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 28 Apr 2011 12:59:33 +0000 (15:59 +0300)]
KVM: VMX: Avoid reading %rip unnecessarily when handling exceptions
Avoids a VMREAD.
Signed-off-by: Avi Kivity <avi@redhat.com>
Joe Perches [Mon, 25 Apr 2011 05:00:50 +0000 (22:00 -0700)]
KVM: SVM: Make dump_vmcb static, reduce text
dump_vmcb isn't used outside this module, make it static.
Shrink text and object by ~1% by standardizing formats.
$ size arch/x86/kvm/svm.o*
text data bss dec hex filename
52910 580 10072 63562 f84a arch/x86/kvm/svm.o.new
53563 580 10072 64215 fad7 arch/x86/kvm/svm.o.old
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Wed, 27 Apr 2011 22:08:36 +0000 (07:08 +0900)]
KVM: MMU: Fix 64-bit paging breakage on x86_32
Fix regression introduced by
commit
e30d2a170506830d5eef5e9d7990c5aedf1b0a51
KVM: MMU: Optimize guest page table walk
On x86_32, get_user() does not support 64-bit values and we fail to
build KVM at the point of 64-bit paging.
This patch fixes this by using get_user() twice for that condition.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jeff Mahoney [Wed, 27 Apr 2011 18:06:07 +0000 (14:06 -0400)]
KVM: ia64: fix sparse warnings
This patch fixes some sparse warning about "dubious one-bit signed bitfield."
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Originally-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
BrillyWu@viatech.com.cn [Mon, 25 Apr 2011 05:55:15 +0000 (13:55 +0800)]
KVM: Add CPUID support for VIA CPU
The CPUIDs for Centaur are added, and then the features of
PadLock hardware engine on VIA CPU, such as "ace", "ace_en"
and so on, can be passed into the kvm guest.
Signed-off-by: Brilly Wu <brillywu@viatech.com.cn>
Signed-off-by: Kary Jin <karyjin@viatech.com.cn>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Tue, 12 Apr 2011 09:36:25 +0000 (12:36 +0300)]
KVM: call cache_all_regs() only once during instruction emulation
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Tue, 12 Apr 2011 09:36:24 +0000 (12:36 +0300)]
KVM: Fix compound mmio
mmio_index should be taken into account when copying data from
userspace.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Tue, 12 Apr 2011 09:36:23 +0000 (12:36 +0300)]
KVM: emulator: Propagate fault in far jump emulation
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Tue, 12 Apr 2011 09:36:21 +0000 (12:36 +0300)]
KVM: mmio_fault_cr2 is not used
Remove unused variable mmio_fault_cr2.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 24 Apr 2011 11:09:59 +0000 (14:09 +0300)]
KVM: x86 emulator: consolidate group handling
Move all groups into a single field and handle them in a single place. This
saves bits when we add more group types (3 bits -> 7 groups types).
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 24 Apr 2011 09:25:50 +0000 (12:25 +0300)]
KVM: MMU: Add unlikely() annotations to walk_addr_generic()
walk_addr_generic() is a hot path and is also hard for the cpu to predict -
some of the parameters (fetch_fault in particular) vary wildly from
invocation to invocation.
Add unlikely() annotations where appropriate; all walk failures are
considered unlikely, as are cases where we have to mark the accessed or
dirty bit, as they are slow paths both in kvm and on real processors.
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sat, 23 Apr 2011 09:52:56 +0000 (18:52 +0900)]
KVM: x86 emulator: Use opcode::execute for PUSHF/POPF (9C/9D)
For this, em_pushf/popf() are introduced.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sat, 23 Apr 2011 09:51:07 +0000 (18:51 +0900)]
KVM: x86 emulator: Use opcode::execute for PUSHA/POPA (60/61)
For this, emulate_pusha/popa() are converted to em_pusha/popa().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sat, 23 Apr 2011 09:49:40 +0000 (18:49 +0900)]
KVM: x86 emulator: Use opcode::execute for POP reg (58-5F)
In addition, the RET emulation is changed to call em_pop() to remove
the pop_instruction label.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sat, 23 Apr 2011 09:48:02 +0000 (18:48 +0900)]
KVM: x86 emulator: Use opcode::execute for Group 1, CMPS and SCAS
The following instructions are changed to use opcode::execute.
Group 1 (80-83)
ADD (00-05), OR (08-0D), ADC (10-15), SBB (18-1D), AND (20-25),
SUB (28-2D), XOR (30-35), CMP (38-3D)
CMPS (A6-A7), SCAS (AE-AF)
The last two do the same as CMP in the emulator, so em_cmp() is used.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Thu, 21 Apr 2011 15:34:44 +0000 (00:34 +0900)]
KVM: MMU: Optimize guest page table walk
This patch optimizes the guest page table walk by using get_user()
instead of copy_from_user().
With this patch applied, paging64_walk_addr_generic() has become
about 0.5us to 1.0us faster on my Phenom II machine with NPT on.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:35:41 +0000 (12:35 +0300)]
KVM: SVM: Get rid of x86_intercept_map::valid
By reserving 0 as an invalid x86_intercept_stage, we no longer
need to store a valid flag in x86_intercept_map.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:21:50 +0000 (12:21 +0300)]
KVM: x86 emulator: Use opcode::execute for 0F 01 opcode
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:17:13 +0000 (12:17 +0300)]
KVM: x86 emulator: Don't force #UD for 0F 01 /5
While it isn't defined, no need to force a #UD. If it becomes defined
in the future this can cause wierd problems for the guest.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:07:59 +0000 (12:07 +0300)]
KVM: x86 emulator: move 0F 01 sub-opcodes into their own functions
Signed-off-by: Avi Kivity <avi@redhat.com>
Randy Dunlap [Thu, 21 Apr 2011 16:09:22 +0000 (09:09 -0700)]
KVM: x86 emulator: fix const value warning on i386 in svm insn RAX check
arch/x86/kvm/emulate.c:2598: warning: integer constant is too large for 'long' type
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Clemens Noss [Thu, 21 Apr 2011 19:16:05 +0000 (21:16 +0200)]
KVM: x86 emulator: avoid calling wbinvd() macro
Commit
0b56652e33c72092956c651ab6ceb9f0ad081153 fails to build:
CC [M] arch/x86/kvm/emulate.o
arch/x86/kvm/emulate.c: In function 'x86_emulate_insn':
arch/x86/kvm/emulate.c:4095:25: error: macro "wbinvd" passed 1 arguments, but takes just 0
arch/x86/kvm/emulate.c:4095:3: warning: statement with no effect
make[2]: *** [arch/x86/kvm/emulate.o] Error 1
make[1]: *** [arch/x86/kvm] Error 2
make: *** [arch/x86] Error 2
Work around this for now.
Signed-off-by: Clemens Noss <cnoss@gmx.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Liu Yuan [Thu, 21 Apr 2011 06:53:57 +0000 (14:53 +0800)]
KVM: ioapic: Fix an error field reference
Function ioapic_debug() in the ioapic_deliver() misnames
one filed by reference. This patch correct it.
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Roedel, Joerg [Wed, 20 Apr 2011 13:33:16 +0000 (15:33 +0200)]
KVM: MMU: Make cmpxchg_gpte aware of nesting too
This patch makes the cmpxchg_gpte() function aware of the
difference between l1-gfns and l2-gfns when nested
virtualization is in use. This fixes a potential
data-corruption problem in the l1-guest and makes the code
work correct (at least as correct as the hardware which is
emulated in this code) again.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:56:20 +0000 (15:56 +0300)]
KVM: x86 emulator: drop x86_emulate_ctxt::vcpu
No longer used.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:55:40 +0000 (15:55 +0300)]
KVM: Avoid using x86_emulate_ctxt.vcpu
We can use container_of() instead.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:53:23 +0000 (15:53 +0300)]
KVM: x86 emulator: add new ->wbinvd() callback
Instead of calling kvm_emulate_wbinvd() directly.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:47:13 +0000 (15:47 +0300)]
KVM: x86 emulator: add ->fix_hypercall() callback
Artificial, but needed to remove direct calls to KVM.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:43:05 +0000 (15:43 +0300)]
KVM: x86 emulator: add new ->halt() callback
Instead of reaching into vcpu internals.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:38:44 +0000 (15:38 +0300)]
KVM: x86 emulator: make emulate_invlpg() an emulator callback
Removing direct calls to KVM.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:32:49 +0000 (15:32 +0300)]
KVM: x86 emulator: emulate CLTS internally
Avoid using ctxt->vcpu; we can do everything with ->get_cr() and ->set_cr().
A side effect is that we no longer activate the fpu on emulated CLTS; but that
should be very rare.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:24:32 +0000 (15:24 +0300)]
KVM: x86 emulator: Replace calls to is_pae() and is_paging with ->get_cr()
Avoid use of ctxt->vcpu.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:21:35 +0000 (15:21 +0300)]
KVM: x86 emulator: drop use of is_long_mode()
Requires ctxt->vcpu, which is to be abolished. Replace with open calls
to get_msr().
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:12:00 +0000 (15:12 +0300)]
KVM: x86 emulator: add and use new callbacks set_idt(), set_gdt()
Replacing direct calls to realmode_lgdt(), realmode_lidt().
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:01:23 +0000 (15:01 +0300)]
KVM: x86 emulator: avoid using ctxt->vcpu in check_perm() callbacks
Unneeded for register access.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from intercept callback
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from cr/dr/cpl/msr callbacks
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from segment/gdt/idt callbacks
Making the emulator caller agnostic.
[Takuya Yoshikawa: fix typo leading to LDT failures]
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from pio callbacks
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from memory read/write callbacks
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:12:27 +0000 (13:12 +0300)]
KVM: x86 emulator: whitespace cleanups
Clean up lines longer than 80 columns. No code changes.
Signed-off-by: Avi Kivity <avi@redhat.com>
Nelson Elhage [Mon, 18 Apr 2011 16:05:53 +0000 (12:05 -0400)]
KVM: emulator: Use linearize() when fetching instructions
Since segments need to be handled slightly differently when fetching
instructions, we add a __linearize helper that accepts a new 'fetch' boolean.
[avi: fix oops caused by wrong segmented_address initialization order]
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 18 Apr 2011 09:42:53 +0000 (11:42 +0200)]
KVM: X86: Update last_guest_tsc in vcpu_put
The last_guest_tsc is used in vcpu_load to adjust the
tsc_offset since tsc-scaling is merged. So the
last_guest_tsc needs to be updated in vcpu_put instead of
the the last_host_tsc. This is fixed with this patch.
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 18 Apr 2011 09:42:52 +0000 (11:42 +0200)]
KVM: SVM: Fix nested sel_cr0 intercept path with decode-assists
This patch fixes a bug in the nested-svm path when
decode-assists is available on the machine. After a
selective-cr0 intercept is detected the rip is advanced
unconditionally. This causes the l1-guest to continue
running with an l2-rip.
This bug was with the sel_cr0 unit-test on decode-assists
capable hardware.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Nelson Elhage [Wed, 13 Apr 2011 15:44:13 +0000 (11:44 -0400)]
KVM: x86 emulator: Handle wraparound in (cs_base + offset) when fetching insns
Currently, setting a large (i.e. negative) base address for %cs does not work on
a 64-bit host. The "JOS" teaching operating system, used by MIT and other
universities, relies on such segments while bootstrapping its way to full
virtual memory management.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Duan Jiong [Mon, 11 Apr 2011 04:44:06 +0000 (12:44 +0800)]
KVM: remove useless function declaration kvm_inject_pit_timer_irqs()
Just remove useless function define kvm_inject_pit_timer_irqs() from
file arch/x86/kvm/i8254.h
Signed-off-by:Duan Jiong<djduanjiong@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Duan Jiong [Mon, 11 Apr 2011 04:56:01 +0000 (12:56 +0800)]
KVM: remove useless function declarations from file arch/x86/kvm/irq.h
Just remove useless function define kvm_pic_clear_isr_ack() and
pit_has_pending_timer()
Signed-off-by: Duan Jiong<djduanjiong@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jeff Mahoney [Wed, 13 Apr 2011 01:30:17 +0000 (21:30 -0400)]
KVM: Fix off by one in kvm_for_each_vcpu iteration
This patch avoids gcc issuing the following warning when KVM_MAX_VCPUS=1:
warning: array subscript is above array bounds
kvm_for_each_vcpu currently checks to see if the index for the vcpu is
valid /after/ loading it. We don't run into problems because the address
is still inside the enclosing struct kvm and we never deference or write
to it, so this isn't a security issue.
The warning occurs when KVM_MAX_VCPUS=1 because the increment portion of
the loop will *always* cause the loop to load an invalid location since
++idx will always be > 0.
This patch moves the load so that the check occurs before the load and
we don't run into the compiler warning.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Serge E. Hallyn [Wed, 13 Apr 2011 14:12:54 +0000 (09:12 -0500)]
KVM: fix push of wrong eip when doing softint
When doing a soft int, we need to bump eip before pushing it to
the stack. Otherwise we'll do the int a second time.
[apw@canonical.com: merged eip update as per Jan's recommendation.]
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Tue, 12 Apr 2011 15:31:23 +0000 (00:31 +0900)]
KVM: x86 emulator: Use em_push() instead of emulate_push()
em_push() is a simple wrapper of emulate_push(). So this patch replaces
emulate_push() with em_push() and removes the unnecessary former.
In addition, the unused ops arguments are removed from emulate_pusha()
and emulate_grp45().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Tue, 12 Apr 2011 15:29:09 +0000 (00:29 +0900)]
KVM: x86 emulator: Make emulate_push() store the value directly
PUSH emulation stores the value by calling writeback() after setting
the dst operand appropriately in emulate_push().
This writeback() using dst is not needed at all because we know the
target is the stack. So this patch makes emulate_push() call, newly
introduced, segmented_write() directly.
By this, many inlined writeback()'s are removed.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Tue, 12 Apr 2011 15:24:55 +0000 (00:24 +0900)]
KVM: x86 emulator: Disable writeback for CMP emulation
This stops "CMP r/m, reg" to write back the data into memory.
Pointed out by Avi.
The writeback suppression now covers CMP, CMPS, SCAS.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Tue, 12 Apr 2011 23:27:55 +0000 (01:27 +0200)]
KVM: VMX: Ensure that vmx_create_vcpu always returns proper error
In case certain allocations fail, vmx_create_vcpu may return 0 as error
instead of a negative value encoded via ERR_PTR. This causes a NULL
pointer dereferencing later on in kvm_vm_ioctl_vcpu_create.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Gleb Natapov [Thu, 31 Mar 2011 10:06:41 +0000 (12:06 +0200)]
KVM: emulator: do not needlesly sync registers from emulator ctxt to vcpu
Currently we sync registers back and forth before/after exiting
to userspace for IO, but during IO device model shouldn't need to
read/write the registers, so we can as well skip those sync points. The
only exaception is broken vmware backdor interface. The new code sync
registers content during IO only if registers are read from/written to
by userspace in the middle of the IO operation and this almost never
happens in practise.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 09:32:09 +0000 (12:32 +0300)]
KVM: x86 emulator: implement segment permission checks
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 11:08:51 +0000 (14:08 +0300)]
KVM: x86 emulator: move desc_limit_scaled()
For reuse later.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 09:33:12 +0000 (12:33 +0300)]
KVM: x86 emulator: move linearize() downwards
So it can call emulate_gp() without forward declarations.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 08:31:19 +0000 (11:31 +0300)]
KVM: x86 emulator: pass access size and read/write intent to linearize()
Needed for segment read/write checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 31 Mar 2011 16:54:30 +0000 (18:54 +0200)]
KVM: x86 emulator: change address linearization to return an error code
Preparing to add segment checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 31 Mar 2011 16:48:09 +0000 (18:48 +0200)]
KVM: x86 emulator: move invlpg emulation into a function
It's going to get more complicated soon.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 31 Mar 2011 14:52:26 +0000 (16:52 +0200)]
KVM: x86 emulator: Add helpers for memory access using segmented addresses
Will help later adding proper segment checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Wed, 6 Apr 2011 10:30:03 +0000 (12:30 +0200)]
KVM: SVM: Fix fault-rip on vmsave/vmload emulation
When the emulation of vmload or vmsave fails because the
guest passed an unsupported physical address it gets an #GP
with rip pointing to the instruction after vmsave/vmload.
This is a bug and fixed by this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:51 +0000 (09:44 +0100)]
KVM: X86: Implement userspace interface to set virtual_tsc_khz
This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:50 +0000 (09:44 +0100)]
KVM: X86: Delegate tsc-offset calculation to architecture code
With TSC scaling in SVM the tsc-offset needs to be
calculated differently. This patch propagates this
calculation into the architecture specific modules so that
this complexity can be handled there.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:49 +0000 (09:44 +0100)]
KVM: X86: Implement call-back to propagate virtual_tsc_khz
This patch implements a call-back into the architecture code
to allow the propagation of changes to the virtual tsc_khz
of the vcpu.
On SVM it updates the tsc_ratio variable, on VMX it does
nothing.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:48 +0000 (09:44 +0100)]
KVM: X86: Make tsc_delta calculation a function of guest tsc
The calculation of the tsc_delta value to ensure a
forward-going tsc for the guest is a function of the
host-tsc. This works as long as the guests tsc_khz is equal
to the hosts tsc_khz. With tsc-scaling hardware support this
is not longer true and the tsc_delta needs to be calculated
using guest_tsc values.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:47 +0000 (09:44 +0100)]
KVM: X86: Let kvm-clock report the right tsc frequency
This patch changes the kvm_guest_time_update function to use
TSC frequency the guest actually has for updating its clock.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:46 +0000 (09:44 +0100)]
KVM: SVM: Implement infrastructure for TSC_RATE_MSR
This patch enhances the kvm_amd module with functions to
support the TSC_RATE_MSR which can be used to set a given
tsc frequency for the guest vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 5 Apr 2011 13:25:20 +0000 (16:25 +0300)]
KVM: x86 emulator: Drop EFER.SVME requirement from VMMCALL
VMMCALL requires EFER.SVME to be enabled in the host, not in the guest, which
is what check_svme() checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 5 Apr 2011 13:21:58 +0000 (16:21 +0300)]
KVM: x86 emulator: Re-add VendorSpecific tag to VMMCALL insn
VMMCALL needs the VendorSpecific tag so that #UD emulation
(called if a guest running on AMD was migrated to an Intel host)
is allowed to process the instruction.
Signed-off-by: Avi Kivity <avi@redhat.com>
Bharat Bhushan [Fri, 25 Mar 2011 05:02:13 +0000 (10:32 +0530)]
KVM: PPC: Fix issue clearing exit timing counters
Following dump is observed on host when clearing the exit timing counters
[root@p1021mds kvm]# echo -n 'c' > vm1200_vcpu0_timing
INFO: task echo:1276 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
echo D
0ff5bf94 0 1276 1190 0x00000000
Call Trace:
[
c2157e40] [
c0007908] __switch_to+0x9c/0xc4
[
c2157e50] [
c040293c] schedule+0x1b4/0x3bc
[
c2157e90] [
c04032dc] __mutex_lock_slowpath+0x74/0xc0
[
c2157ec0] [
c00369e4] kvmppc_init_timing_stats+0x20/0xb8
[
c2157ed0] [
c0036b00] kvmppc_exit_timing_write+0x84/0x98
[
c2157ef0] [
c00b9f90] vfs_write+0xc0/0x16c
[
c2157f10] [
c00ba284] sys_write+0x4c/0x90
[
c2157f40] [
c000e320] ret_from_syscall+0x0/0x3c
The vcpu->mutex is used by kvm_ioctl_* (KVM_RUN etc) and same was
used when clearing the stats (in kvmppc_init_timing_stats()). What happens
is that when the guest is idle then it held the vcpu->mutx. While the
exiting timing process waits for guest to release the vcpu->mutex and
a hang state is reached.
Now using seprate lock for exit timing stats.
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 28 Mar 2011 02:29:27 +0000 (10:29 +0800)]
KVM: MMU: remove mmu_seq verification on pte update path
The mmu_seq verification can be removed since we get the pfn in the
protection of mmu_lock.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 28 Mar 2011 14:57:49 +0000 (16:57 +0200)]
KVM: x86 emulator: do not open code return values from the emulator
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Justin P. Mattock [Wed, 30 Mar 2011 16:54:47 +0000 (09:54 -0700)]
KVM: Remove base_addresss in kvm_pit since it is unused
The patch below removes unsigned long base_addresss; in i8254.h
since it is unused.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:36 +0000 (12:39 +0200)]
KVM: SVM: Remove nested sel_cr0_write handling code
This patch removes all the old code which handled the nested
selective cr0 write intercepts. This code was only in place
as a work-around until the instruction emulator is capable
of doing the same. This is the case with this patch-set and
so the code can be removed.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:35 +0000 (12:39 +0200)]
KVM: SVM: Add checks for IO instructions
This patch adds code to check for IOIO intercepts on
instructions decoded by the KVM instruction emulator.
[avi: fix build error due to missing #define D2bvIP]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:34 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for one-byte instructions
This patch add intercept checks for emulated one-byte
instructions to the KVM instruction emulation path.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:33 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for remaining twobyte instructions
This patch adds intercepts checks for the remaining twobyte
instructions to the KVM instruction emulator.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:32 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for remaining group7 instructions
This patch implements the emulator intercept checks for the
RDTSCP, MONITOR, and MWAIT instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:31 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for SVM instructions
This patch adds the necessary code changes in the
instruction emulator and the extensions to svm.c to
implement intercept checks for the svm instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:30 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for descriptor table accesses
This patch add intercept checks into the KVM instruction
emulator to check for the 8 instructions that access the
descriptor table addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:29 +0000 (12:39 +0200)]
KVM: SVM: Add intercept check for accessing dr registers
This patch adds the intercept checks for instruction
accessing the debug registers.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>