Geoff Levand [Fri, 7 Jun 2013 01:02:54 +0000 (18:02 -0700)]
arm/kvm: Cleanup KVM_ARM_MAX_VCPUS logic
Commit
d21a1c83c7595e387545632e44cd7797b76e19cc (ARM: KVM: define KVM_ARM_MAX_VCPUS
unconditionally) changed the Kconfig logic for KVM_ARM_MAX_VCPUS to work around a
build error arising from the use of KVM_ARM_MAX_VCPUS when CONFIG_KVM=n. The
resulting Kconfig logic is a bit awkward and leaves a KVM_ARM_MAX_VCPUS always
defined in the kernel config file.
This change reverts the Kconfig logic back and adds a simple preprocessor
conditional in kvm_host.h to handle when CONFIG_KVM_ARM_MAX_VCPUS is undefined.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 21 Jun 2013 12:08:48 +0000 (13:08 +0100)]
ARM: KVM: clear exclusive monitor on all exception returns
Make sure we clear the exclusive monitor on all exception returns,
which otherwise could lead to lock corruptions.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 21 Jun 2013 12:08:47 +0000 (13:08 +0100)]
ARM: KVM: add missing dsb before invalidating Stage-2 TLBs
When performing a Stage-2 TLB invalidation, it is necessary to
make sure the write to the page tables is observable by all CPUs.
For this purpose, add a dsb instruction to __kvm_tlb_flush_vmid_ipa
before doing the TLB invalidation itself.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 21 Jun 2013 12:08:46 +0000 (13:08 +0100)]
ARM: KVM: perform save/restore of PAR
Not saving PAR is an unfortunate oversight. If the guest performs
an AT* operation and gets scheduled out before reading the result
of the translation from PAR, it could become corrupted by another
guest or the host.
Saving this register is made slightly more complicated as KVM also
uses it on the permission fault handling path, leading to an ugly
"stash and restore" sequence. Fortunately, this is already a slow
path so we don't really care. Also, Linux doesn't do any AT*
operation, so Linux guests are not impacted by this bug.
[ Slightly tweaked to use an even register as first operand to ldrd
and strd operations in interrupts_head.S - Christoffer ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 14 May 2013 11:11:39 +0000 (12:11 +0100)]
ARM: KVM: get rid of S2_PGD_SIZE
S2_PGD_SIZE defines the number of pages used by a stage-2 PGD
and is unused, except for a VM_BUG_ON check that missuses the
define.
As the check is very unlikely to ever triggered except in
circumstances where KVM is the least of our worries, just kill
both the define and the VM_BUG_ON check.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Marc Zyngier [Tue, 14 May 2013 11:11:38 +0000 (12:11 +0100)]
ARM: KVM: don't special case PC when doing an MMIO
Admitedly, reading a MMIO register to load PC is very weird.
Writing PC to a MMIO register is probably even worse. But
the architecture doesn't forbid any of these, and injecting
a Prefetch Abort is the wrong thing to do anyway.
Remove this check altogether, and let the adventurous guest
wander into LaLaLand if they feel compelled to do so.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Marc Zyngier [Tue, 14 May 2013 11:11:37 +0000 (12:11 +0100)]
ARM: KVM: use phys_addr_t instead of unsigned long long for HYP PGDs
HYP PGDs are passed around as phys_addr_t, except just before calling
into the hypervisor init code, where they are cast to a rather weird
unsigned long long.
Just keep them around as phys_addr_t, which is what makes the most
sense.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Marc Zyngier [Tue, 14 May 2013 11:11:35 +0000 (12:11 +0100)]
ARM: KVM: remove dead prototype for __kvm_tlb_flush_vmid
__kvm_tlb_flush_vmid has been renamed to __kvm_tlb_flush_vmid_ipa,
and the old prototype should have been removed when the code was
modified.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Dave P Martin [Wed, 1 May 2013 16:49:28 +0000 (17:49 +0100)]
ARM: KVM: Don't handle PSCI calls via SMC
Currently, kvmtool unconditionally declares that HVC should be used
to call PSCI, so the function numbers in the DT tell the guest
nothing about the function ID namespace or calling convention for
SMC.
We already assume that the guest will examine and honour the DT,
since there is no way it could possibly guess the KVM-specific PSCI
function IDs otherwise. So let's not encourage guests to violate
what's specified in the DT by using SMC to make the call.
[ Modified to apply to top of kvm/arm tree - Christoffer ]
Signed-off-by: Dave P Martin <Dave.Martin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Anup Patel [Tue, 30 Apr 2013 06:32:15 +0000 (12:02 +0530)]
ARM: KVM: Allow host virt timer irq to be different from guest timer virt irq
The arch_timer irq numbers (or PPI numbers) are implementation dependent,
so the host virtual timer irq number can be different from guest virtual
timer irq number.
This patch ensures that host virtual timer irq number is read from DTB and
guest virtual timer irq is determined based on vcpu target type.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Thomas Huth [Thu, 20 Jun 2013 15:22:05 +0000 (17:22 +0200)]
KVM: s390: Fixed priority of execution in STSI
Added some missing validity checks for the operands and fixed the
priority of exceptions for some function codes according to the
"Principles of Operation" document.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thomas Huth [Thu, 20 Jun 2013 15:22:04 +0000 (17:22 +0200)]
KVM: s390: Reworked LCTL and LCTLG instructions
LCTL and LCTLG are also privileged instructions, thus there is no need for
treating them separately from the other instructions in priv.c. So this
patch moves these two instructions to priv.c, adds a check for supervisor
state and simplifies the "handle_eb" instruction decoding by merging the
two eb_handlers jump tables from intercept.c and priv.c into one table only.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thomas Huth [Thu, 20 Jun 2013 15:22:03 +0000 (17:22 +0200)]
KVM: s390: Check for access exceptions during TPI
When a guest calls the TPI instruction, the second operand address could
point to an invalid location. In this case the problem should be signaled
to the guest by throwing an access exception.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thomas Huth [Thu, 20 Jun 2013 15:22:02 +0000 (17:22 +0200)]
KVM: s390: Check for PSTATE when handling DIAGNOSE
DIAGNOSE is a privileged instruction and thus we must make sure that we are
in supervisor mode before taking any other actions.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thomas Huth [Thu, 20 Jun 2013 15:22:01 +0000 (17:22 +0200)]
KVM: s390: Privileged operation checks moved to instruction handlers
We need more fine-grained control about the point in time when we check
for privileged instructions, since the exceptions that can happen during
an instruction have a well-defined priority. For example, for the PFMF
instruction, the check for PGM_PRIVILEGED_OP must happen after the check
for PGM_OPERATION since the latter has a higher precedence - thus the
check for privileged operation must not be done in kvm_s390_handle_b9()
already.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thomas Huth [Thu, 20 Jun 2013 15:22:00 +0000 (17:22 +0200)]
KVM: s390: Privileged operation check for TPROT
TPROT is a privileged instruction and thus should generate a privileged
operation exception when the problem state bit is not cleared in the PSW.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thomas Huth [Thu, 20 Jun 2013 15:21:59 +0000 (17:21 +0200)]
KVM: s390: Renamed PGM_PRIVILEGED_OPERATION
Renamed the PGM_PRIVILEGED_OPERATION define to PGM_PRIVILEGED_OP since this
define was way longer than the other PGM_* defines and caused the code often
to exceed the 80 columns limit when not split to multiple lines.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiao Guangrong [Wed, 19 Jun 2013 09:09:19 +0000 (17:09 +0800)]
KVM: MMU: update the documentation for reverse mapping of parent_pte
Update the document to match the current reverse mapping of
parent_pte
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Alexey Kardashevskiy [Wed, 19 Jun 2013 01:42:07 +0000 (11:42 +1000)]
kvm api doc: fix section numbers
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thomas Huth [Wed, 12 Jun 2013 11:54:57 +0000 (13:54 +0200)]
KVM: s390: Fix epsw instruction decoding
The handle_epsw() function calculated the first register in the wrong way,
so that it always used r0 by mistake. Now the code uses the common helper
function for decoding the registers of rre functions instead to avoid such
mistakes.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Heinz Graalfs [Wed, 12 Jun 2013 11:54:56 +0000 (13:54 +0200)]
KVM: s390,perf: Detect if perf samples belong to KVM host or guest
This patch is based on an original patch of David Hildenbrand.
The perf core implementation calls architecture specific code in order
to ask for specific information for a particular sample:
perf_instruction_pointer()
When perf core code asks for the instruction pointer, architecture
specific code must detect if a KVM guest was running when the sample
was taken. A sample can be associated with a KVM guest when the PSW
supervisor state bit is set and the PSW instruction pointer part
contains the address of 'sie_exit'.
A KVM guest's instruction pointer information is then retrieved via
gpsw entry pointed to by the sie control-block.
perf_misc_flags()
perf code code calls this function in order to associate the kernel
vs. user state infomation with a particular sample. Architecture
specific code must also first detectif a KVM guest was running
at the time the sample was taken.
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Christian Borntraeger [Wed, 12 Jun 2013 11:54:55 +0000 (13:54 +0200)]
KVM: s390: Use common waitqueue
Lets use the common waitqueue for kvm cpus on s390. By itself it is
just a cleanup, but it should also improve the accuracy of diag 0x44
which is implemented via kvm_vcpu_on_spin. kvm_vcpu_on_spin has
an explicit check for waiting on the waitqueue to optimize the
yielding.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Michael Mueller [Wed, 12 Jun 2013 11:54:54 +0000 (13:54 +0200)]
KVM: s390: code cleanup to use common vcpu slab cache
cleanup of arch specific code to use common code provided vcpu slab cache
instead of kzalloc() provided memory
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Christian Borntraeger [Wed, 12 Jun 2013 11:54:53 +0000 (13:54 +0200)]
KVM: s390: guest large pages
This patch enables kvm to give large pages to the guest. The heavy
lifting is done by the hardware, the host only has to take care
of the PFMF instruction, which is also part of EDAT-1.
We also support the non-quiescing key setting facility if the host
supports it, to behave similar to the interpretation of sske.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Christian Borntraeger [Wed, 12 Jun 2013 11:54:52 +0000 (13:54 +0200)]
KVM: s390: Provide function for setting the guest storage key
From time to time we need to set the guest storage key. Lets
provide a helper function that handles the changes with all the
right locking and checking.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Marcelo Tosatti [Wed, 12 Jun 2013 02:31:12 +0000 (23:31 -0300)]
KVM: x86: handle idiv overflow at kvm_write_tsc
Its possible that idivl overflows (due to large delta stored in usdiff,
valid scenario).
Create an exception handler to catch the overflow exception (division by zero
is protected by vcpu->arch.virtual_tsc_khz check), and interpret it accordingly
(delta is larger than USEC_PER_SEC).
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=969644
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Gleb Natapov [Fri, 31 May 2013 00:36:30 +0000 (08:36 +0800)]
KVM: MMU: reduce KVM_REQ_MMU_RELOAD when root page is zapped
Quote Gleb's mail:
| why don't we check for sp->role.invalid in
| kvm_mmu_prepare_zap_page before calling kvm_reload_remote_mmus()?
and
| Actually we can add check for is_obsolete_sp() there too since
| kvm_mmu_invalidate_all_pages() already calls kvm_reload_remote_mmus()
| after incrementing mmu_valid_gen.
[ Xiao: add some comments and the check of is_obsolete_sp() ]
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:29 +0000 (08:36 +0800)]
KVM: MMU: reclaim the zapped-obsolete page first
As Marcelo pointed out that
| "(retention of large number of pages while zapping)
| can be fatal, it can lead to OOM and host crash"
We introduce a list, kvm->arch.zapped_obsolete_pages, to link all
the pages which are deleted from the mmu cache but not actually
freed. When page reclaiming is needed, we always zap this kind of
pages first.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:28 +0000 (08:36 +0800)]
KVM: MMU: collapse TLB flushes when zap all pages
kvm_zap_obsolete_pages uses lock-break technique to zap pages,
it will flush tlb every time when it does lock-break
We can reload mmu on all vcpus after updating the generation
number so that the obsolete pages are not used on any vcpus,
after that we do not need to flush tlb when obsolete pages
are zapped
It will do kvm_mmu_prepare_zap_page many times and use one
kvm_mmu_commit_zap_page to collapse tlb flush, the side-effects
is that causes obsolete pages unlinked from active_list but leave
on hash-list, so we add the comment around the hash list walker
Note: kvm_mmu_commit_zap_page is still needed before free
the pages since other vcpus may be doing locklessly shadow
page walking
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:27 +0000 (08:36 +0800)]
KVM: MMU: zap pages in batch
Zap at lease 10 pages before releasing mmu-lock to reduce the overload
caused by requiring lock
After the patch, kvm_zap_obsolete_pages can forward progress anyway,
so update the comments
[ It improves the case 0.6% ~ 1% that do kernel building meanwhile read
PCI ROM. ]
Note: i am not sure that "10" is the best speculative value, i just
guessed that '10' can make vcpu do not spend long time on
kvm_zap_obsolete_pages and do not cause mmu-lock too hungry.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:26 +0000 (08:36 +0800)]
KVM: MMU: do not reuse the obsolete page
The obsolete page will be zapped soon, do not reuse it to
reduce future page fault
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:25 +0000 (08:36 +0800)]
KVM: MMU: add tracepoint for kvm_mmu_invalidate_all_pages
It is good for debug and development
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:24 +0000 (08:36 +0800)]
KVM: MMU: show mmu_valid_gen in shadow page related tracepoints
Show sp->mmu_valid_gen
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:23 +0000 (08:36 +0800)]
KVM: x86: use the fast way to invalidate all pages
Replace kvm_mmu_zap_all by kvm_mmu_invalidate_zap_all_pages
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:22 +0000 (08:36 +0800)]
KVM: MMU: fast invalidate all pages
The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
walk and zap all shadow pages one by one, also it need to zap all guest
page's rmap and all shadow page's parent spte list. Particularly, things
become worse if guest uses more memory or vcpus. It is not good for
scalability
In this patch, we introduce a faster way to invalidate all shadow pages.
KVM maintains a global mmu invalid generation-number which is stored in
kvm->arch.mmu_valid_gen and every shadow page stores the current global
generation-number into sp->mmu_valid_gen when it is created
When KVM need zap all shadow pages sptes, it just simply increase the
global generation-number then reload root shadow pages on all vcpus.
Vcpu will create a new shadow page table according to current kvm's
generation-number. It ensures the old pages are not used any more.
Then the obsolete pages (sp->mmu_valid_gen != kvm->arch.mmu_valid_gen)
are zapped by using lock-break technique
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:21 +0000 (08:36 +0800)]
KVM: MMU: drop unnecessary kvm_reload_remote_mmus
It is the responsibility of kvm_mmu_zap_all that keeps the
consistent of mmu and tlbs. And it is also unnecessary after
zap all mmio sptes since no mmio spte exists on root shadow
page and it can not be cached into tlb
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Xiao Guangrong [Fri, 31 May 2013 00:36:20 +0000 (08:36 +0800)]
KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall
Quote Gleb's mail:
| Back then kvm->lock protected memslot access so code like:
|
| mutex_lock(&vcpu->kvm->lock);
| kvm_mmu_zap_all(vcpu->kvm);
| mutex_unlock(&vcpu->kvm->lock);
|
| which is what
7aa81cc0 does was enough to guaranty that no vcpu will
| run while code is patched. This is no longer the case and
| mutex_lock(&vcpu->kvm->lock); is gone from that code path long time ago,
| so now kvm_mmu_zap_all() there is useless and the code is incorrect.
So we drop it and it will be fixed later
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Amos Kong [Fri, 24 May 2013 22:44:15 +0000 (06:44 +0800)]
kvm: exclude ioeventfd from counting kvm_io_range limit
We can easily reach the 1000 limit by start VM with a couple
hundred I/O devices (multifunction=on). The hardcode limit
already been adjusted 3 times (6 ~ 200 ~ 300 ~ 1000).
In userspace, we already have maximum file descriptor to
limit ioeventfd count. But kvm_io_bus devices also are used
for pit, pic, ioapic, coalesced_mmio. They couldn't be limited
by maximum file descriptor.
Currently only ioeventfds take too much kvm_io_bus devices,
so just exclude it from counting kvm_io_range limit.
Also fixed one indent issue in kvm_host.h
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cornelia Huck [Mon, 27 May 2013 16:42:33 +0000 (18:42 +0200)]
KVM: s390: Add "devname:kvm" alias.
Providing a "devname:kvm" module alias enables automatic loading of
the kvm module when /dev/kvm is opened.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:51 +0000 (11:31 +0200)]
KVM: x86 emulator: convert XADD to fastop
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:50 +0000 (11:31 +0200)]
KVM: x86 emulator: drop unused old-style inline emulation
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:49 +0000 (11:31 +0200)]
KVM: x86 emulator: convert DIV/IDIV to fastop
Since DIV and IDIV can generate exceptions, we need an additional output
parameter indicating whether an execption has occured. To avoid increasing
register pressure on i386, we use %rsi, which is already allocated for
the fastop code pointer.
Gleb: added comment about fop usage as exception indication.
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:48 +0000 (11:31 +0200)]
KVM: x86 emulator: convert single-operand MUL/IMUL to fastop
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:47 +0000 (11:31 +0200)]
KVM: x86 emulator: Switch fastop src operand to RDX
This makes OpAccHi useful.
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:46 +0000 (11:31 +0200)]
KVM: x86 emulator: switch MUL/DIV to DstXacc
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:45 +0000 (11:31 +0200)]
KVM: x86 emulator: decode extended accumulator explicity
Single-operand MUL and DIV access an extended accumulator: AX for byte
instructions, and DX:AX, EDX:EAX, or RDX:RAX for larger-sized instructions.
Add support for fetching the extended accumulator.
In order not to change things too much, RDX is loaded into Src2, which is
already loaded by fastop(). This avoids increasing register pressure on
i386.
Gleb: disable src writeback for ByteOp div/mul.
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Avi Kivity [Sat, 9 Feb 2013 09:31:44 +0000 (11:31 +0200)]
KVM: x86 emulator: add support for writing back the source operand
Some instructions write back the source operand, not just the destination.
Add support for doing this via the decode flags.
Gleb: add BUG_ON() to prevent source to be memory operand.
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Christian Borntraeger [Fri, 17 May 2013 12:41:38 +0000 (14:41 +0200)]
s390: fix gmap_ipte_notifier vs. software dirty pages
On heavy paging load some guest cpus started to loop in gmap_ipte_notify.
This was visible as stalled cpus inside the guest. The gmap_ipte_notifier
tries to map a user page and then made sure that the pte is valid and
writable. Turns out that with the software change bit tracking the pte
can become read-only (and only software writable) if the page is clean.
Since we loop in this code, the page would stay clean and, therefore,
be never writable again.
Let us just use fixup_user_fault, that guarantees to call handle_mm_fault.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Martin Schwidefsky [Fri, 17 May 2013 12:41:37 +0000 (14:41 +0200)]
s390/kvm: avoid automatic sie reentry
Do not automatically restart the sie instruction in entry64.S after an
interrupt, return to the caller with a reason code instead. That allows
to deal with RCU and other conditions in C code.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Christian Borntraeger [Fri, 17 May 2013 12:41:36 +0000 (14:41 +0200)]
s390/kvm: Kick guests out of sie if prefix page host pte is touched
The guest prefix pages must be mapped writeable all the time
while SIE is running, otherwise the guest might see random
behaviour. (pinned at the pte level) Turns out that mlocking is
not enough, the page table entry (not the page) might change or
become r/o. This patch uses the gmap notifiers to kick guest
cpus out of SIE.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Christian Borntraeger [Fri, 17 May 2013 12:41:35 +0000 (14:41 +0200)]
s390/kvm: Provide a way to prevent reentering SIE
Lets provide functions to prevent KVM from reentering SIE and
to kick cpus out of SIE. We cannot use the common kvm_vcpu_kick code,
since we need to kick out guests in places that hold architecture
specific locks (e.g. pgste lock) which might be necessary on the
other cpus - so no waiting possible.
So lets provide a bit in a private field of the sie control block
that acts as a gate keeper, after we claimed we are in SIE.
Please note that we do not reuse prog0c, since we want to access
that bit without atomic ops.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Christian Borntraeger [Fri, 17 May 2013 12:41:34 +0000 (14:41 +0200)]
s390/kvm: Mark if a cpu is in SIE
Lets track in a private bit if the sie control block is active.
We want to track this as closely as possible, so we also have to
instrument the interrupt and program check handler. Lets use the
existing HANDLE_SIE_INTERCEPT macro.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Martin Schwidefsky [Fri, 17 May 2013 12:41:33 +0000 (14:41 +0200)]
s390/kvm: rename RCP_xxx defines to PGSTE_xxx
The RCP byte is a part of the PGSTE value, the existing RCP_xxx names
are inaccurate. As the defines describe bits and pieces of the PGSTE,
the names should start with PGSTE_. The KVM_UR_BIT and KVM_UC_BIT are
part of the PGSTE as well, give them better names as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Martin Schwidefsky [Fri, 17 May 2013 12:41:32 +0000 (14:41 +0200)]
s390/kvm: fix psw rewinding in handle_skey
The PSW can wrap if the guest has been running in the 24 bit or 31 bit
addressing mode. Use __rewind_psw to find the correct address.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Christian Borntraeger [Fri, 17 May 2013 12:41:31 +0000 (14:41 +0200)]
s390/pgtable: fix ipte notify bit
Dont use the same bit as user referenced.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Marc Zyngier [Tue, 14 May 2013 13:31:02 +0000 (14:31 +0100)]
KVM: get rid of $(addprefix ../../../virt/kvm/, ...) in Makefiles
As requested by the KVM maintainers, remove the addprefix used to
refer to the main KVM code from the arch code, and replace it with
a KVM variable that does the same thing.
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Marc Zyngier [Tue, 14 May 2013 13:31:01 +0000 (14:31 +0100)]
ARM: KVM: move GIC/timer code to a common location
As KVM/arm64 is looming on the horizon, it makes sense to move some
of the common code to a single location in order to reduce duplication.
The code could live anywhere. Actually, most of KVM is already built
with a bunch of ugly ../../.. hacks in the various Makefiles, so we're
not exactly talking about style here. But maybe it is time to start
moving into a less ugly direction.
The include files must be in a "public" location, as they are accessed
from non-KVM files (arch/arm/kernel/asm-offsets.c).
For this purpose, introduce two new locations:
- virt/kvm/arm/ : x86 and ia64 already share the ioapic code in
virt/kvm, so this could be seen as a (very ugly) precedent.
- include/kvm/ : there is already an include/xen, and while the
intent is slightly different, this seems as good a location as
any
Eventually, we should probably have independant Makefiles at every
levels (just like everywhere else in the kernel), but this is just
the first step.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Gleb Natapov [Thu, 16 May 2013 08:55:51 +0000 (11:55 +0300)]
KVM: MMU: clenaup locking in mmu_free_roots()
Do locking around each case separately instead of having one lock and two
unlocks. Move root_hpa assignment out of the lock.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Marcelo Tosatti [Thu, 9 May 2013 23:21:41 +0000 (20:21 -0300)]
KVM: x86: limit difference between kvmclock updates
kvmclock updates which are isolated to a given vcpu, such as vcpu->cpu
migration, should not allow system_timestamp from the rest of the vcpus
to remain static. Otherwise ntp frequency correction applies to one
vcpu's system_timestamp but not the others.
So in those cases, request a kvmclock update for all vcpus. The worst
case for a remote vcpu to update its kvmclock is then bounded by maximum
nohz sleep latency.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sun, 28 Apr 2013 12:00:41 +0000 (14:00 +0200)]
KVM: x86: Remove support for reporting coalesced APIC IRQs
Since the arrival of posted interrupt support we can no longer guarantee
that coalesced IRQs are always reported to the IRQ source. Moreover,
accumulated APIC timer events could cause a busy loop when a VCPU should
rather be halted. The consensus is to remove coalesced tracking from the
LAPIC.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Takuya Yoshikawa [Thu, 9 May 2013 06:45:12 +0000 (15:45 +0900)]
KVM: MMU: Use kvm_mmu_sync_roots() in kvm_mmu_load()
No need to open-code this function.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Wei Yongjun [Sun, 5 May 2013 12:03:40 +0000 (20:03 +0800)]
KVM: add missing misc_deregister() on error in kvm_init()
Add the missing misc_deregister() before return from kvm_init()
in the debugfs init error handling case.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Linus Torvalds [Sun, 12 May 2013 00:14:08 +0000 (17:14 -0700)]
Linux 3.10-rc1
Linus Torvalds [Sun, 12 May 2013 00:04:59 +0000 (17:04 -0700)]
Merge tag 'trace-fixes-v3.10' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing/kprobes update from Steven Rostedt:
"The majority of these changes are from Masami Hiramatsu bringing
kprobes up to par with the latest changes to ftrace (multi buffering
and the new function probes).
He also discovered and fixed some bugs in doing so. When pulling in
his patches, I also found a few minor bugs as well and fixed them.
This also includes a compile fix for some archs that select the ring
buffer but not tracing.
I based this off of the last patch you took from me that fixed the
merge conflict error, as that was the commit that had all the changes
I needed for this set of changes."
* tag 'trace-fixes-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Support soft-mode disabling
tracing/kprobes: Support ftrace_event_file base multibuffer
tracing/kprobes: Pass trace_probe directly from dispatcher
tracing/kprobes: Increment probe hit-count even if it is used by perf
tracing/kprobes: Use bool for retprobe checker
ftrace: Fix function probe when more than one probe is added
ftrace: Fix the output of enabled_functions debug file
ftrace: Fix locking in register_ftrace_function_probe()
tracing: Add helper function trace_create_new_event() to remove duplicate code
tracing: Modify soft-mode only if there's no other referrer
tracing: Indicate enabled soft-mode in enable file
tracing/kprobes: Fix to increment return event probe hit-count
ftrace: Cleanup regex_lock and ftrace_lock around hash updating
ftrace, kprobes: Fix a deadlock on ftrace_regex_lock
ftrace: Have ftrace_regex_write() return either read or error
tracing: Return error if register_ftrace_function_probe() fails for event_enable_func()
tracing: Don't succeed if event_enable_func did not register anything
ring-buffer: Select IRQ_WORK
Linus Torvalds [Sat, 11 May 2013 23:19:30 +0000 (16:19 -0700)]
Merge tag 'stable/for-linus-3.10-rc0-tag-two' of git://git./linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
- More fixes in the vCPU PVHVM hotplug path.
- Add more documentation.
- Fix various ARM related issues in the Xen generic drivers.
- Updates in the xen-pciback driver per Bjorn's updates.
- Mask the x2APIC feature for PV guests.
* tag 'stable/for-linus-3.10-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pci: Used cached MSI-X capability offset
xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST
xen: mask x2APIC feature in PV
xen: SWIOTLB is only used on x86
xen/spinlock: Fix check from greater than to be also be greater or equal to.
xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_info
xen/vcpu: Document the xen_vcpu_info and xen_vcpu
xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.
Linus Torvalds [Sat, 11 May 2013 22:24:22 +0000 (15:24 -0700)]
Merge tag 'scsi-for-linus' of git://git./linux/kernel/git/jejb/scsi
Pull second SCSI update from James "Jaj B" Bottomley:
"This is the final round of SCSI patches for the merge window. It
consists mostly of driver updates (bnx2fc, ibmfc, fnic, lpfc,
be2iscsi, pm80xx, qla4x and ipr).
There's also the power management updates that complete the patches in
Jens' tree, an iscsi refcounting problem fix from the last pull, some
dif handling in scsi_debug fixes, a few nice code cleanups and an
error handling busy bug fix."
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (92 commits)
[SCSI] qla2xxx: Update firmware link in Kconfig file.
[SCSI] iscsi class, qla4xxx: fix sess/conn refcounting when find fns are used
[SCSI] sas: unify the pointlessly separated enums sas_dev_type and sas_device_type
[SCSI] pm80xx: thermal, sas controller config and error handling update
[SCSI] pm80xx: NCQ error handling changes
[SCSI] pm80xx: WWN Modification for PM8081/88/89 controllers
[SCSI] pm80xx: Changed module name and debug messages update
[SCSI] pm80xx: Firmware flash memory free fix, with addition of new memory region for it
[SCSI] pm80xx: SPC new firmware changes for device id 0x8081 alone
[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files
[SCSI] pm80xx: MSI-X implementation for using 64 interrupts
[SCSI] pm80xx: Updated common functions common for SPC and SPCv/ve
[SCSI] pm80xx: Multiple inbound/outbound queue configuration
[SCSI] pm80xx: Added SPCv/ve specific ids, variables and modify for SPC
[SCSI] lpfc: fix up Kconfig dependencies
[SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd
[SCSI] sd: change to auto suspend mode
[SCSI] sd: use REQ_PM in sd's runtime suspend operation
[SCSI] qla4xxx: Fix iocb_cnt calculation in qla4xxx_send_mbox_iocb()
[SCSI] ufs: Correct the expected data transfersize
...
Linus Torvalds [Sat, 11 May 2013 22:23:17 +0000 (15:23 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux
Pull idle update from Len Brown:
"Add support for new Haswell-ULT CPU idle power states"
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
intel_idle: initial C8, C9, C10 support
tools/power turbostat: display C8, C9, C10 residency
Linus Torvalds [Sat, 11 May 2013 21:29:11 +0000 (14:29 -0700)]
Merge git://git.infradead.org/users/eparis/audit
Pull audit changes from Eric Paris:
"Al used to send pull requests every couple of years but he told me to
just start pushing them to you directly.
Our touching outside of core audit code is pretty straight forward. A
couple of interface changes which hit net/. A simple argument bug
calling audit functions in namei.c and the removal of some assembly
branch prediction code on ppc"
* git://git.infradead.org/users/eparis/audit: (31 commits)
audit: fix message spacing printing auid
Revert "audit: move kaudit thread start from auditd registration to kaudit init"
audit: vfs: fix audit_inode call in O_CREAT case of do_last
audit: Make testing for a valid loginuid explicit.
audit: fix event coverage of AUDIT_ANOM_LINK
audit: use spin_lock in audit_receive_msg to process tty logging
audit: do not needlessly take a lock in tty_audit_exit
audit: do not needlessly take a spinlock in copy_signal
audit: add an option to control logging of passwords with pam_tty_audit
audit: use spin_lock_irqsave/restore in audit tty code
helper for some session id stuff
audit: use a consistent audit helper to log lsm information
audit: push loginuid and sessionid processing down
audit: stop pushing loginid, uid, sessionid as arguments
audit: remove the old depricated kernel interface
audit: make validity checking generic
audit: allow checking the type of audit message in the user filter
audit: fix build break when AUDIT_DEBUG == 2
audit: remove duplicate export of audit_enabled
Audit: do not print error when LSMs disabled
...
Linus Torvalds [Fri, 10 May 2013 16:28:55 +0000 (09:28 -0700)]
Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Small fixes for two bugs and two warnings"
* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error
SUNRPC: fix decoding of optional gss-proxy xdr fields
SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning
nfsd4: don't allow owner override on 4.1 CLAIM_FH opens
Linus Torvalds [Fri, 10 May 2013 16:27:40 +0000 (09:27 -0700)]
Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform drivers from Matthew Garrett:
"Small set of updates, mainly trivial bugfixes and some small updates
to deal with newer hardware.
There's also a new driver that allows qemu guests to notify the
hypervisor that they've just paniced, which seems useful."
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
Add support for fan button on Ideapad Z580
pvpanic: pvpanic device driver
asus-nb-wmi: set wapf=4 for ASUSTeK COMPUTER INC. X75A
drivers: platform: x86: Use PTR_RET function
sony-laptop: SVS151290S kbd backlight and gfx switch support
hp-wmi: add more definitions for new event_id's
dell-laptop: Fix krealloc() misuse in parse_da_table()
hp_accel: Ignore the error from lis3lv02d_poweron() at resume
dell: add new dell WMI format for the AIO machines
Linus Torvalds [Fri, 10 May 2013 16:21:05 +0000 (09:21 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/signal
Pull stray syscall bits from Al Viro:
"Several syscall-related commits that were missing from the original"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
switch compat_sys_sysctl to COMPAT_SYSCALL_DEFINE
unicore32: just use mmap_pgoff()...
unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE
x86, vm86: fix VM86 syscalls: use SYSCALL_DEFINEx(...)
Linus Torvalds [Fri, 10 May 2013 16:20:01 +0000 (09:20 -0700)]
Merge tag 'ecryptfs-3.10-rc1-ablkcipher' of git://git./linux/kernel/git/tyhicks/ecryptfs
Pull eCryptfs update from Tyler Hicks:
"Improve performance when AES-NI (and most likely other crypto
accelerators) is available by moving to the ablkcipher crypto API.
The improvement is more apparent on faster storage devices.
There's no noticeable change when hardware crypto is not available"
* tag 'ecryptfs-3.10-rc1-ablkcipher' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: Use the ablkcipher crypto API
Linus Torvalds [Fri, 10 May 2013 16:09:47 +0000 (09:09 -0700)]
Merge tag 'for-linus-
20130509' of git://git.infradead.org/~dwmw2/random-2.6
Pull misc fixes from David Woodhouse:
"This is some miscellaneous cleanups that don't really belong anywhere
else (or were ignored), that have been sitting in linux-next for some
time. Two of them are fixes resulting from my audit of krealloc()
usage that don't seem to have elicited any response when I posted
them, and the other three are patches from Artem removing dead code."
* tag 'for-linus-
20130509' of git://git.infradead.org/~dwmw2/random-2.6:
pcmcia: remove RPX board stuff
m68k: remove rpxlite stuff
pcmcia: remove Motorola MBX860 support
params: Fix potential memory leak in add_sysfs_param()
dell-laptop: Fix krealloc() misuse in parse_da_table()
Linus Torvalds [Fri, 10 May 2013 16:08:21 +0000 (09:08 -0700)]
Merge tag 'kvm-3.10-2' of git://git./virt/kvm/kvm
Pull kvm fixes from Gleb Natapov:
"Most of the fixes are in the emulator since now we emulate more than
we did before for correctness sake we see more bugs there, but there
is also an OOPS fixed and corruption of xcr0 register."
* tag 'kvm-3.10-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: emulator: emulate SALC
KVM: emulator: emulate XLAT
KVM: emulator: emulate AAM
KVM: VMX: fix halt emulation while emulating invalid guest sate
KVM: Fix kvm_irqfd_init initialization
KVM: x86: fix maintenance of guest/host xcr0 state
Linus Torvalds [Fri, 10 May 2013 16:02:50 +0000 (09:02 -0700)]
Merge tag 'dm-3.10-changes-2' of git://git./linux/kernel/git/agk/linux-dm
Pull device-mapper updates from Alasdair Kergon:
"Allow devices that hold metadata for the device-mapper thin
provisioning target to be extended easily; allow WRITE SAME on
multipath devices; an assortment of little fixes and clean-ups."
* tag 'dm-3.10-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: (21 commits)
dm cache: set config value
dm cache: move config fns
dm thin: generate event when metadata threshold passed
dm persistent metadata: add space map threshold callback
dm persistent data: add threshold callback to space map
dm thin: detect metadata device resizing
dm persistent data: support space map resizing
dm thin: open dev read only when possible
dm thin: refactor data dev resize
dm cache: replace memcpy with struct assignment
dm cache: fix typos in comments
dm cache policy: fix description of lookup fn
dm: document iterate_devices
dm persistent data: fix error message typos
dm cache: tune migration throttling
dm mpath: enable WRITE SAME support
dm table: fix write same support
dm bufio: avoid a possible __vmalloc deadlock
dm snapshot: fix error return code in snapshot_ctr
dm cache: fix error return code in cache_create
...
Linus Torvalds [Fri, 10 May 2013 16:00:39 +0000 (09:00 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- fix usage of sleeping lock in atomic context from Jiri Kosina
- build fix for hid-steelseries under certain .config setups by Simon Wood
- simple mismerge fix from Fernando Luis Vázquez Cao
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: debug: fix RCU preemption issue
HID: hid-steelseries fix led class build issue
HID: reintroduce fix-up for certain Sony RF receivers
James Bottomley [Fri, 10 May 2013 14:54:01 +0000 (07:54 -0700)]
Merge branch 'postmerge' into for-linus
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
James Bottomley [Fri, 10 May 2013 14:53:40 +0000 (07:53 -0700)]
Merge branch 'misc' into for-linus
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Linus Torvalds [Fri, 10 May 2013 14:51:56 +0000 (07:51 -0700)]
Merge tag 'sound-3.10' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This contains small fixes since the previous pull request:
- A few regression fixes and small updates of HD-audio
- Yet another fix for Haswell HDMI audio
- A copule of trivial fixes in ASoC McASP, DPAM and WM8994"
* tag 'sound-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
Revert "ALSA: hda - Don't set up active streams twice"
ALSA: Add comment for control TLV API
ALSA: hda - Apply pin-enablement workaround to all Haswell HDMI codecs
ALSA: HDA: Fix Oops caused by dereference NULL pointer
ALSA: mips/sgio2audio: Remove redundant platform_set_drvdata()
ALSA: mips/hal2: Remove redundant platform_set_drvdata()
ALSA: hda - Fix 3.9 regression of EAPD init on Conexant codecs
sound: Fix make allmodconfig on MIPS
ALSA: hda - Fix system panic when DMA > 40 bits for Nvidia audio controllers
ALSA: atmel: Remove redundant platform_set_drvdata()
ASoC: McASP: Fix receive clock polarity in DAIFMT_NB_NF mode.
ASoC: wm8994: missing break in wm8994_aif3_hw_params()
ASoC: McASP: Add pins output direction for rx clocks when configured in CBS_CFS format
ASoC: dapm: use clk_prepare_enable and clk_disable_unprepare
Linus Torvalds [Fri, 10 May 2013 14:48:05 +0000 (07:48 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
- More work on DT support for various platforms
- Various fixes that were to late to make it straight into 3.9
- Improved platform support, in particular the Netlogic XLR and
BCM63xx, and the SEAD3 and Malta eval boards.
- Support for several Ralink SOC families.
- Complete support for the microMIPS ASE which basically reencodes the
existing MIPS32/MIPS64 ISA to use non-constant size instructions.
- Some fallout from LTO work which remove old cruft and will generally
make the MIPS kernel easier to maintain and resistant to compiler
optimization, even in absence of LTO.
- KVM support. While MIPS has announced hardware virtualization
extensions this KVM extension uses trap and emulate mode for
virtualization of MIPS32. More KVM work to add support for VZ
hardware virtualizaiton extensions and MIPS64 will probably already
be merged for 3.11.
Most of this has been sitting in -next for a long time. All defconfigs
have been build or run time tested except three for which fixes are being
sent by other maintainers.
Semantic conflict with kvm updates done as per Ralf
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (118 commits)
MIPS: Add new GIC clockevent driver.
MIPS: Formatting clean-ups for clocksources.
MIPS: Refactor GIC clocksource code.
MIPS: Move 'gic_frequency' to common location.
MIPS: Move 'gic_present' to common location.
MIPS: MIPS16e: Add unaligned access support.
MIPS: MIPS16e: Support handling of delay slots.
MIPS: MIPS16e: Add instruction formats.
MIPS: microMIPS: Optimise 'strnlen' core library function.
MIPS: microMIPS: Optimise 'strlen' core library function.
MIPS: microMIPS: Optimise 'strncpy' core library function.
MIPS: microMIPS: Optimise 'memset' core library function.
MIPS: microMIPS: Add configuration option for microMIPS kernel.
MIPS: microMIPS: Disable LL/SC and fix linker bug.
MIPS: microMIPS: Add vdso support.
MIPS: microMIPS: Add unaligned access support.
MIPS: microMIPS: Support handling of delay slots.
MIPS: microMIPS: Add support for exception handling.
MIPS: microMIPS: Floating point support.
MIPS: microMIPS: Fix macro naming in micro-assembler.
...
Chad Dupuis [Thu, 7 Mar 2013 15:30:22 +0000 (10:30 -0500)]
[SCSI] qla2xxx: Update firmware link in Kconfig file.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Mike Christie [Mon, 6 May 2013 17:06:56 +0000 (12:06 -0500)]
[SCSI] iscsi class, qla4xxx: fix sess/conn refcounting when find fns are used
This fixes a bug where the iscsi class/driver did not do a put_device
when a sess/conn device was found. This also simplifies the interface
by not having to pass in some arguments that were duplicated and did
not need to be exported.
Reported-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
James Bottomley [Tue, 7 May 2013 21:44:06 +0000 (14:44 -0700)]
[SCSI] sas: unify the pointlessly separated enums sas_dev_type and sas_device_type
These enums have been separate since the dawn of SAS, mainly because the
latter is a procotol only enum and the former includes additional state
for libsas. The dichotomy causes endless confusion about which one you
should use where and leads to pointless warnings like this:
drivers/scsi/mvsas/mv_sas.c: In function 'mvs_update_phyinfo':
drivers/scsi/mvsas/mv_sas.c:1162:34: warning: comparison between 'enum sas_device_type' and 'enum sas_dev_type' [-Wenum-compare]
Fix by eliminating one of them. The one kept is effectively the sas.h
one, but call it sas_device_type and make sure the enums are all
properly namespaced with the SAS_ prefix.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Tue, 19 Mar 2013 12:38:40 +0000 (18:08 +0530)]
[SCSI] pm80xx: thermal, sas controller config and error handling update
Modified thermal configuration to happen after interrupt registration
Added SAS controller configuration during initialization
Added error handling logic to handle I_T_Nexus errors and variants
[jejb: fix up tabs and spaces issues]
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Tue, 19 Mar 2013 12:38:08 +0000 (18:08 +0530)]
[SCSI] pm80xx: NCQ error handling changes
Handled NCQ errors in the low level driver as the FW
is not providing the faulty tag for NCQ errors for libsas
to recover.
[jejb: fix checkpatch issues]
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Tue, 19 Mar 2013 12:37:35 +0000 (18:07 +0530)]
[SCSI] pm80xx: WWN Modification for PM8081/88/89 controllers
Individual WWN read operations based on controller.
PM8081 - Read WWN from Flash VPD.
PM8088/89 - Read WWN from EEPROM.
PM8001 - Read WWN from NVM.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Tue, 19 Mar 2013 12:37:09 +0000 (18:07 +0530)]
[SCSI] pm80xx: Changed module name and debug messages update
Changed name in driver to pm80xx. Updated debug messages.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Tue, 19 Mar 2013 12:36:40 +0000 (18:06 +0530)]
[SCSI] pm80xx: Firmware flash memory free fix, with addition of new memory region for it
Performing pci_free_consistent in tasklet had result in a core dump. So
allocated a new memory region for it. Fix for passing proper address
and operation in firmware flash update.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Tue, 19 Mar 2013 12:35:55 +0000 (18:05 +0530)]
[SCSI] pm80xx: SPC new firmware changes for device id 0x8081 alone
Additional bar shift for new SPC firmware, applicable to device
id 0x8081 only.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Wed, 17 Apr 2013 11:07:02 +0000 (16:37 +0530)]
[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files
Implementation of SPCv/ve specific hardware functionality and
macros. Changing common functionalities wrt SPCv/ve operations.
Conditional checks for SPC specific operations.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Tue, 19 Mar 2013 12:26:17 +0000 (17:56 +0530)]
[SCSI] pm80xx: MSI-X implementation for using 64 interrupts
Implementation of interrupt handlers and tasklets to support
upto 64 interrupt for the device.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Wed, 27 Feb 2013 14:57:43 +0000 (20:27 +0530)]
[SCSI] pm80xx: Updated common functions common for SPC and SPCv/ve
Update of function prototype for common function to SPC and SPCv/ve.
Multiple queues implementation for IO.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Wed, 27 Feb 2013 14:55:25 +0000 (20:25 +0530)]
[SCSI] pm80xx: Multiple inbound/outbound queue configuration
Memory allocation and configuration of multiple inbound and
outbound queues.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sakthivel K [Wed, 17 Apr 2013 10:56:36 +0000 (16:26 +0530)]
[SCSI] pm80xx: Added SPCv/ve specific ids, variables and modify for SPC
Updated pci id table with device, vendor, subdevice and subvendor ids
for 8081, 8088, 8089 SAS/SATA controllers. Added SPCv/ve related macros.
Updated macros, hba info structure and other structures for SPCv/ve.
Update of structure and variable names for SPC hardware functionalities.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
James Bottomley [Mon, 6 May 2013 16:49:25 +0000 (09:49 -0700)]
[SCSI] lpfc: fix up Kconfig dependencies
lpfc uses the generic checksum as well as the T10DIF one from the lib/
directory, so make sure they're selected.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Hannes Reinecke [Thu, 25 Apr 2013 06:10:00 +0000 (08:10 +0200)]
[SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd
scsi_send_eh_cmnd() is calling queuecommand() directly, so
it needs to check the return value here.
The only valid return codes for queuecommand() are 'busy'
states, so we need to wait for a bit to allow the LLDD
to recover.
Based on an earlier patch from Wen Xiong.
[jejb: fix confusion between msec and jiffies values and other issues]
[bvanassche: correct stall_for interval]
Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Linus Torvalds [Fri, 10 May 2013 14:24:14 +0000 (07:24 -0700)]
Merge tag 'arc-v3.10-rc1-part2' of git://git./linux/kernel/git/vgupta/arc
Pull second set of arc arch updates from Vineet Gupta:
"Aliasing VIPT dcache support for ARC
I'm satisified with testing, specially with fuse which has
historically given grief to VIPT arches (ARM/PARISC...)"
* tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [TB10x] Remove GENERIC_GPIO
ARC: [mm] Aliasing VIPT dcache support 4/4
ARC: [mm] Aliasing VIPT dcache support 3/4
ARC: [mm] Aliasing VIPT dcache support 2/4
ARC: [mm] Aliasing VIPT dcache support 1/4
ARC: [mm] refactor the core (i|d)cache line ops loops
ARC: [mm] serious bug in vaddr based icache flush
Linus Torvalds [Fri, 10 May 2013 14:22:35 +0000 (07:22 -0700)]
Merge branch 'for-next' of git://git./linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:
"The bulk of the changes are generalizing the ColdFire v3 core support
and adding in 537x CPU support. Also a couple of other bug fixes, one
to fix a reintroduction of a past bug in the romfs filesystem nommu
support."
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: enable Timer on coldfire 532x
m68knommu: fix ColdFire 5373/5329 QSPI base address
m68knommu: add support for configuring a Freescale M5373EVB board
m68knommu: add support for the ColdFire 537x family of CPUs
m68knommu: make ColdFire M532x platform support more v3 generic
m68knommu: create and use a common M53xx ColdFire class of CPUs
m68k: remove unused asm/dbg.h
m68k: Set ColdFire ACR1 cache mode depending on kernel configuration
romfs: fix nommu map length to keep inside filesystem
m68k: clean up unused "config ROMVECSIZE"
Linus Torvalds [Fri, 10 May 2013 14:21:16 +0000 (07:21 -0700)]
Merge tag 'for-linus' of git://github.com/realmz/blackfin-linux
Pull blackfin updates from Steven Miao.
* tag 'for-linus' of git://github.com/realmz/blackfin-linux:
bfin cache: dcplb map: add 16M dcplb map for BF60x
blackfin: smp: fix smp build after drop asm/system.h
blackfin: fix bootup core clock and system clock display
Platform Nand: Set the GPIO for NAND read as input
blackfin: rename vmImage to uImage after we move to buildroot
blackfin: twi: Remove bogus #endif
bf609: rsi: Add bf609 rsi MMR macro and board platform data.
blackfin: dmc: Improve DDR2 write through in DMC effict controller.
Linus Torvalds [Fri, 10 May 2013 14:19:52 +0000 (07:19 -0700)]
Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
Pull microblaze updates from Michal Simek.
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Enable IRQ in arch_cpu_idle
microblaze: Fix uaccess_ok macro
microblaze: Add support for new cpu versions and target architecture
microblaze: Do not select OPT_LIB_ASM by default
microblaze: Fix initrd support
microblaze: Do not use r6 in head.S
microblaze: pci: Remove duplicated header
microblaze: Set the default irq_domain
microblaze: pci: Remove duplicated include from pci-common.c