sdk/emulator/qemu.git
8 years agoblock/sheepdog: fix argument passed to qemu_strtoul()
Jeff Cody [Wed, 2 Mar 2016 16:24:42 +0000 (11:24 -0500)]
block/sheepdog: fix argument passed to qemu_strtoul()

The function qemu_strtoul() reads 'unsigned long' sized data,
which is larger than uint32_t on 64-bit machines.

Even though the snap_id field in the header is 32-bits, we must
accommodate the full size in qemu_strtoul().

This patch also adds more meaningful error handling to the
qemu_strtoul() call, and subsequent results.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Message-id: e56fc50abedd9a112e0683342c8eafda063cd2f9.1456935548.git.jcody@redhat.com

8 years agoutil/base64.c: Clean includes
Peter Maydell [Tue, 23 Feb 2016 14:18:32 +0000 (14:18 +0000)]
util/base64.c: Clean includes

Remove unnecessary include of config-host.h.
(This was missed by the clean-includes script because of the
incorrect use of <> for a QEMU header.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-5-git-send-email-peter.maydell@linaro.org

8 years agoupdate-linux-headers.sh: Fake types.h doesn't need to include anything
Peter Maydell [Tue, 23 Feb 2016 14:18:31 +0000 (14:18 +0000)]
update-linux-headers.sh: Fake types.h doesn't need to include anything

We have a fake linux/types.h which we create in update-linux-headers.h.
Now that every QEMU source file includes osdep.h, this fake header
doesn't need to include anything at all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-4-git-send-email-peter.maydell@linaro.org

8 years agoinclude/config.h: Remove
Peter Maydell [Tue, 23 Feb 2016 14:18:30 +0000 (14:18 +0000)]
include/config.h: Remove

include/config.h just includes config-target.h (and used to also
include config-host.h).
It is now obsolete and unused, because osdep.h does this job, so
remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-3-git-send-email-peter.maydell@linaro.org

8 years agoslirp/slirp.h: Remove now-empty #ifdefs
Peter Maydell [Tue, 23 Feb 2016 14:18:29 +0000 (14:18 +0000)]
slirp/slirp.h: Remove now-empty #ifdefs

After automatic cleanup to remove unnecessary #includes of headers that
osdep.h provides, slirp.h has a few now unnecessary #ifdef/#endif pairs;
remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-2-git-send-email-peter.maydell@linaro.org

8 years agoMerge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-03-16' into staging
Peter Maydell [Wed, 16 Mar 2016 11:09:36 +0000 (11:09 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-03-16' into staging

Error reporting patches for 2016-03-16

# gpg: Signature made Wed 16 Mar 2016 09:57:00 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-error-2016-03-16:
  error: ensure errno detail is printed with error_abort

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2016-03-16' into staging
Peter Maydell [Wed, 16 Mar 2016 10:38:14 +0000 (10:38 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2016-03-16' into staging

Monitor patches for 2016-03-16

# gpg: Signature made Wed 16 Mar 2016 09:47:23 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-monitor-2016-03-16:
  qdev-monitor: add missing aliases for virtio device classes
  qdev-monitor: sort alias table by typename
  qdev-monitor: improve error message when alias device is unavailable

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160316' into staging
Peter Maydell [Wed, 16 Mar 2016 10:09:26 +0000 (10:09 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160316' into staging

ppc patch queue for 2016-03-16

Accumulated patches for target-ppc, pseries machine type and related
devices.  As we are now in soft freeze, these are mostly fixes.
   * Fix KVM migration for several SPRs that qemu didn't handle
   * Clean up handling of SDR1, which allows a fix to the gdbstub
   * Fix a race in spapr_rng
   * Fix a bug with multifunction hotplug

The exception is the 7 patches to allow EEH on spapr-pci-host-bridge
devices (rather than the special and poorly designed
spapr-vfio-pci-host-bridge device).  I believe these are low risk of
breaking non-EEH cases, and EEH cases were little used in practice
previously (since libvirt did not support the special device amongst
other things).  It did have a draft posted before the soft freeze,
removes a very ugly VFIO interface, and removes device we'd like to
deprecate sooner rather than later.  So, I'm hoping we can squeeze
these in during the soft freeze.

This includes two patches to the VFIO code, which Alex Williamson has
indicated he's ok with coming through my tree.

# gpg: Signature made Wed 16 Mar 2016 05:04:52 GMT using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160316:
  vfio: Eliminate vfio_container_ioctl()
  spapr_pci: Remove finish_realize hook
  spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
  spapr_pci: Allow EEH on spapr-pci-host-bridge
  spapr_pci: Eliminate class callbacks
  spapr_pci: Switch to vfio_eeh_as_op() interface
  vfio: Start improving VFIO/EEH interface
  spapr_rng: fix race with main loop
  target-ppc: Eliminate kvmppc_kern_htab global
  target-ppc: Add helpers for updating a CPU's SDR1 and external HPT
  target-ppc: Split out SREGS get/put functions
  spapr_pci: fix multifunction hotplug
  target-ppc: Add PVR for POWER8NVL processor
  ppc: Add a few more P8 PMU SPRs
  ppc: Fix migration of the TAR SPR
  ppc: Define the PSPB register on POWER8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoerror: ensure errno detail is printed with error_abort
Daniel P. Berrange [Wed, 9 Mar 2016 17:28:24 +0000 (17:28 +0000)]
error: ensure errno detail is printed with error_abort

When &error_abort is passed in, the error reporting code
will print the current error message and then abort() the
process. Unfortunately at the time it aborts, we've not
yet appended the errno detail. This makes debugging certain
problems significantly harder as the log is incomplete.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1457544504-8548-22-git-send-email-berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
Peter Maydell [Wed, 16 Mar 2016 09:27:58 +0000 (09:27 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

acpi: minor fix

Since previous pull acpi test triggers warnings,
fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 15 Mar 2016 21:26:38 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  acpi-test: update UID for GSI links

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoqdev-monitor: add missing aliases for virtio device classes
Sascha Silbe [Thu, 18 Feb 2016 21:44:14 +0000 (22:44 +0100)]
qdev-monitor: add missing aliases for virtio device classes

virtio-{blk,balloon,net,serial} are aliases for their actual,
architecture-dependent implementations (*-ccw on s390x, *-pci on other
architectures supporting virtio). This makes it a lot easier to craft
qemu invocations that work on all supported architectures. Complete
the set to cover all existing non-abstract virtio device classes.

For virtio-balloon, only the CCW implementation was missing.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1455831854-49013-4-git-send-email-silbe@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
8 years agoqdev-monitor: sort alias table by typename
Sascha Silbe [Thu, 18 Feb 2016 21:44:13 +0000 (22:44 +0100)]
qdev-monitor: sort alias table by typename

Sort the alias table by typename so it's easier to see which aliases
exist.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1455831854-49013-3-git-send-email-silbe@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
8 years agoqdev-monitor: improve error message when alias device is unavailable
Sascha Silbe [Thu, 18 Feb 2016 21:44:12 +0000 (22:44 +0100)]
qdev-monitor: improve error message when alias device is unavailable

When trying to instantiate an alias that points to a device class that
doesn't exist, the error message looks like qemu misunderstood the
request:

$ s390x-softmmu/qemu-system-s390x -device virtio-gpu
qemu-system-s390x: -device virtio-gpu: 'virtio-gpu-ccw' is not a valid
device model name

Special-case the error message to make it explicit that alias
expansion is going on:

$ s390x-softmmu/qemu-system-s390x -device virtio-gpu
qemu-system-s390x: -device virtio-gpu: 'virtio-gpu' (alias
'virtio-gpu-ccw') is not a valid device model name

Suggested-By: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1455831854-49013-2-git-send-email-silbe@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
8 years agovfio: Eliminate vfio_container_ioctl()
David Gibson [Wed, 9 Mar 2016 00:57:20 +0000 (11:57 +1100)]
vfio: Eliminate vfio_container_ioctl()

vfio_container_ioctl() was a bad interface that bypassed abstraction
boundaries, had semantics that sat uneasily with its name, and was unsafe
in many realistic circumstances.  Now that spapr-pci-vfio-host-bridge has
been folded into spapr-pci-host-bridge, there are no more users, so remove
it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
8 years agospapr_pci: Remove finish_realize hook
David Gibson [Mon, 29 Feb 2016 06:20:00 +0000 (17:20 +1100)]
spapr_pci: Remove finish_realize hook

Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
only one implementation of the finish_realize hook in sPAPRPHBClass.  So,
we can fold that implementation into its (single) caller, and remove the
hook.  That's the last thing left in sPAPRPHBClass, so that can go away as
well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agospapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
David Gibson [Mon, 29 Feb 2016 06:19:50 +0000 (17:19 +1100)]
spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge

Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
    1. automatically sizes its DMA window to match the host IOMMU
    2. checks if the attached VFIO container is backed by the
       VFIO_SPAPR_TCE_IOMMU type on the host

(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.

Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea.  It's not
definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
support VFIO devices which can't be migrated anyway, but still.

(2) is not really useful, because if a guest tries to configure EEH on a
different host IOMMU, the first call will fail and that will be that.

It's possible there are scripts or tools out there which expect
spapr-pci-vfio-host-bridge, so we don't remove it entirely.  This patch
reduces it to just a stub for backwards compatibility.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agospapr_pci: Allow EEH on spapr-pci-host-bridge
David Gibson [Mon, 29 Feb 2016 06:19:42 +0000 (17:19 +1100)]
spapr_pci: Allow EEH on spapr-pci-host-bridge

Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead.  We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.

Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.

However, it's not really any worse than the current situation.  Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agospapr_pci: Eliminate class callbacks
David Gibson [Mon, 29 Feb 2016 06:45:05 +0000 (17:45 +1100)]
spapr_pci: Eliminate class callbacks

The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agospapr_pci: Switch to vfio_eeh_as_op() interface
David Gibson [Mon, 29 Feb 2016 03:00:34 +0000 (14:00 +1100)]
spapr_pci: Switch to vfio_eeh_as_op() interface

This switches all EEH on VFIO operations in spapr_pci_vfio.c from the
broken vfio_container_ioctl() interface to the new vfio_as_eeh_op()
interface.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agovfio: Start improving VFIO/EEH interface
David Gibson [Wed, 9 Mar 2016 00:56:06 +0000 (11:56 +1100)]
vfio: Start improving VFIO/EEH interface

At present the code handling IBM's Enhanced Error Handling (EEH) interface
on VFIO devices operates by bypassing the usual VFIO logic with
vfio_container_ioctl().  That's a poorly designed interface with unclear
semantics about exactly what can be operated on.

In particular it operates on a single vfio container internally (hence the
name), but takes an address space and group id, from which it deduces the
container in a rather roundabout way.  groupids are something that code
outside vfio shouldn't even be aware of.

This patch creates new interfaces for EEH operations.  Internally we
have vfio_eeh_container_op() which takes a VFIOContainer object
directly.  For external use we have vfio_eeh_as_ok() which determines
if an AddressSpace is usable for EEH (at present this means it has a
single container with exactly one group attached), and vfio_eeh_as_op()
which will perform an operation on an AddressSpace in the unambiguous case,
and otherwise returns an error.

This interface still isn't great, but it's enough of an improvement to
allow a number of cleanups in other places.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
8 years agospapr_rng: fix race with main loop
Greg Kurz [Fri, 11 Mar 2016 18:48:47 +0000 (19:48 +0100)]
spapr_rng: fix race with main loop

Since commit "60253ed1e6ec rng: add request queue support to rng-random",
the use of a spapr_rng device may hang vCPU threads.

The following path is taken without holding the lock to the main loop mutex:

h_random()
  rng_backend_request_entropy()
    rng_random_request_entropy()
      qemu_set_fd_handler()

The consequence is that entropy_available() may be called before the vCPU
thread could even queue the request: depending on the scheduling, it may
happen that entropy_available() does not call random_recv()->qemu_sem_post().
The vCPU thread will then sleep forever in h_random()->qemu_sem_wait().

This could not happen before 60253ed1e6ec because entropy_available() used
to call random_recv() unconditionally.

This patch ensures the lock is held to avoid the race.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agotarget-ppc: Eliminate kvmppc_kern_htab global
David Gibson [Tue, 8 Mar 2016 00:35:15 +0000 (11:35 +1100)]
target-ppc: Eliminate kvmppc_kern_htab global

fa48b43 "target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM"
purports to remove a hack in the handling of hash page tables (HPTs)
managed by KVM instead of qemu.  However, it actually went in the wrong
direction.

That patch requires anything looking for an external HPT (that is one not
managed by the guest itself) to check both env->external_htab (for a qemu
managed HPT) and kvmppc_kern_htab (for a KVM managed HPT).  That's a
problem because kvmppc_kern_htab is local to mmu-hash64.c, but some places
which need to check for an external HPT are outside that, such as
kvm_arch_get_registers().  The latter was subtly broken by the earlier
patch such that gdbstub can no longer access memory.

Basically a KVM managed HPT is much more like a qemu managed HPT than it is
like a guest managed HPT, so the original "hack" was actually on the right
track.

This partially reverts fa48b43, so we again mark a KVM managed external HPT
by putting a special but non-NULL value in env->external_htab.  It then
goes further, using that marker to eliminate the kvmppc_kern_htab global
entirely.  The ppc_hash64_set_external_hpt() helper function is extended
to set that marker if passed a NULL value (if you're setting an external
HPT, but don't have an actual HPT to set, the assumption is that it must
be a KVM managed HPT).

This also has some flow-on changes to the HPT access helpers, required by
the above changes.

Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
8 years agotarget-ppc: Add helpers for updating a CPU's SDR1 and external HPT
David Gibson [Tue, 8 Mar 2016 00:33:46 +0000 (11:33 +1100)]
target-ppc: Add helpers for updating a CPU's SDR1 and external HPT

When a Power cpu with 64-bit hash MMU has it's hash page table (HPT)
pointer updated by a write to the SDR1 register we need to update some
derived variables.  Likewise, when the cpu is configured for an external
HPT (one not in the guest memory space) some derived variables need to be
updated.

Currently the logic for this is (partially) duplicated in ppc_store_sdr1()
and in spapr_cpu_reset().  In future we're going to need it in some other
places, so make some common helpers for this update.

In addition the new ppc_hash64_set_external_hpt() helper also updates
SDR1 in KVM - it's not updated by the normal runtime KVM <-> qemu CPU
synchronization.  In a sense this belongs logically in the
ppc_hash64_set_sdr1() helper, but that is called from
kvm_arch_get_registers() so can't itself call cpu_synchronize_state()
without infinite recursion.  In practice this doesn't matter because
the only other caller is TCG specific.

Currently there aren't situations where updating SDR1 at runtime in KVM
matters, but there are going to be in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agotarget-ppc: Split out SREGS get/put functions
David Gibson [Wed, 9 Mar 2016 00:58:33 +0000 (11:58 +1100)]
target-ppc: Split out SREGS get/put functions

Currently the getting and setting of Power MMU registers (sregs) take up
large inline chunks of the kvm_arch_get_registers() and
kvm_arch_put_registers() functions.  Especially since there are two
variants (for Book-E and Book-S CPUs), only one of which will be used in
practice, this is pretty hard to read.

This patch splits these out into helper functions for clarity.  No
functional change is expected.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
8 years agospapr_pci: fix multifunction hotplug
Michael Roth [Thu, 3 Mar 2016 21:55:36 +0000 (15:55 -0600)]
spapr_pci: fix multifunction hotplug

Since 3f1e147, QEMU has adopted a convention of supporting function
hotplug by deferring hotplug events until func 0 is hotplugged.
This is likely how management tools like libvirt would expose
such support going forward.

Since sPAPR guests rely on per-func events rather than
slot-based, our protocol has been to hotplug func 0 *first* to
avoid cases where devices appear within guests without func 0
present to avoid undefined behavior.

To remain compatible with new convention, defer hotplug in a
similar manner, but then generate events in 0-first order as we
did in the past. Once func 0 present, fail any attempts to plug
additional functions (as we do with PCIe).

For unplug, defer unplug operations in a similar manner, but
generate unplug events such that function 0 is removed last in guest.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agotarget-ppc: Add PVR for POWER8NVL processor
Alexey Kardashevskiy [Thu, 3 Mar 2016 00:08:19 +0000 (11:08 +1100)]
target-ppc: Add PVR for POWER8NVL processor

This adds a new POWER8+NVLink CPU PVR which core is identical to POWER8
but has a different PVR. The only available machine now has PVR
pvr 004c 0100 so this defines "POWER8NVL" alias as v1.0.

The corresponding kernel commit is
https://github.com/torvalds/linux/commit/ddee09c099c3
"powerpc: Add PVR for POWER8NVL processor"

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoppc: Add a few more P8 PMU SPRs
Benjamin Herrenschmidt [Wed, 2 Mar 2016 20:19:22 +0000 (21:19 +0100)]
ppc: Add a few more P8 PMU SPRs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoppc: Fix migration of the TAR SPR
Thomas Huth [Wed, 2 Mar 2016 20:19:21 +0000 (21:19 +0100)]
ppc: Fix migration of the TAR SPR

The TAR special purpose register currently does not get migrated
under KVM because it does not get synchronized with the kernel.
Use spr_register_kvm() instead of spr_register() to fix this issue.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoppc: Define the PSPB register on POWER8
Thomas Huth [Wed, 2 Mar 2016 20:19:20 +0000 (21:19 +0100)]
ppc: Define the PSPB register on POWER8

POWER8 / PowerISA 2.07 has a new special purpose register called PSPB
("Problem State Priority Boost Register"). The contents of this register
are currently lost during migration. To be able to migrate this register,
too, we've got to define this SPR along with the other SPRs of POWER8.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoacpi-test: update UID for GSI links
Michael S. Tsirkin [Tue, 15 Mar 2016 21:23:16 +0000 (23:23 +0200)]
acpi-test: update UID for GSI links

Update acpi test data to match
commit 6a991e07bb8eeb7d7799a949c0528dffb84b2a98
("hw/acpi: fix GSI links UID").

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Peter Maydell [Tue, 15 Mar 2016 17:56:14 +0000 (17:56 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Miscellaneous exec.c fixes (Markus, myself)
* Q35 support for -machine kernel_irqchip=split (Rita)
* Chardev replay support (Pavel)
* icount "warping" cleanups (Pavel)

# gpg: Signature made Tue 15 Mar 2016 17:24:08 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  icount: decouple warp calls
  icount: remove obsolete warp call
  replay: character devices
  exec: fix early return from ram_block_add
  exec: Fix memory allocation when memory path isn't on hugetlbfs
  exec: Fix memory allocation when memory path names new file
  update-linux-headers: Add userfaultfd.h
  kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoicount: decouple warp calls
Pavel Dovgalyuk [Thu, 10 Mar 2016 11:56:09 +0000 (14:56 +0300)]
icount: decouple warp calls

qemu_clock_warp function is called to update virtual clock when CPU
is sleeping. This function includes replay checkpoint to make execution
deterministic in icount mode.
Record/replay module flushes async event queue at checkpoints.
Some of the events (e.g., block devices operations) include interaction
with hardware. E.g., APIC polled by block devices sets one of IRQ flags.
Flag to be set depends on currently executed thread (CPU or iothread).
Therefore in replay mode we have to process the checkpoints in the same thread
as they were recorded.
qemu_clock_warp function (and its checkpoint) may be called from different
thread. This patch decouples two different execution cases of this function:
call when CPU is sleeping from iothread and call from cpu thread to update
virtual clock.
First task is performed by qemu_start_warp_timer function. It sets warp
timer event to the moment of nearest pending virtual timer.
Second function (qemu_account_warp_timer) is called from cpu thread
before execution of the code. It advances virtual clock by adding the length
of period while CPU was sleeping.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160310115609.4812.44986.stgit@PASHA-ISP>
[Update docs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoicount: remove obsolete warp call
Pavel Dovgalyuk [Thu, 10 Mar 2016 11:56:03 +0000 (14:56 +0300)]
icount: remove obsolete warp call

qemu_clock_warp call in qemu_tcg_wait_io_event function is not needed
anymore, because it is called in every iteration of main_loop_wait.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160310115603.4812.67559.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoreplay: character devices
Pavel Dovgalyuk [Mon, 14 Mar 2016 07:44:36 +0000 (10:44 +0300)]
replay: character devices

This patch implements record and replay of character devices.
It records chardevs communication in replay mode. Recorded information
include data read from backend and counter of bytes written
from frontend to backend to preserve frontend internal state.
If character device was configured through the command line in record mode,
then in replay mode it should be also added to command line. Backend of
the character device could be changed in replay mode.
Replaying of devices that perform ioctl and get_msgfd operations is not
supported.
gdbstub which also acts as a backend is not recorded to allow controlling
the replaying through gdb. Monitor backends are also not recorded.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160314074436.4980.83856.stgit@PASHA-ISP>
[Add stubs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoexec: fix early return from ram_block_add
Paolo Bonzini [Wed, 9 Mar 2016 17:14:01 +0000 (18:14 +0100)]
exec: fix early return from ram_block_add

After reporting an error, ram_block_add was going on with the registration
of the RAMBlock.  The visible effect is that it unlocked the ramlist
mutex twice.

Fixes: 528f46af6ecd1e300db18684969104d4067b867b
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoexec: Fix memory allocation when memory path isn't on hugetlbfs
Markus Armbruster [Mon, 7 Mar 2016 19:25:14 +0000 (20:25 +0100)]
exec: Fix memory allocation when memory path isn't on hugetlbfs

gethugepagesize() works reliably only when its argument is on
hugetlbfs.  When it's not, it returns the filesystem's "optimal
transfer block size", which may or may not be the actual page size
you'll get when you mmap().

If the value is too small or not a power of two, we fail
qemu_ram_mmap()'s assertions.  These were added in commit 794e8f3
(v2.5.0).  The bug's impact before that is currently unknown.  Seems
fairly unlikely at least when the normal page size is 4KiB.

Else, if the value is too large, we align more strictly than
necessary.

gethugepagesize() goes back to commit c902760 (v0.13).  That commit
clearly intended gethugepagesize() to be used on hugetlbfs only.  Not
only was it named accordingly, it also printed a warning when used on
anything else.  However, the commit neglected to spell out the
restriction in user documentation of -mem-path.

Commit bfc2a1a (v2.5.0) dropped the warning as bogus "because QEMU
functions perfectly well with the path on a regular tmpfs filesystem".
It sure does when you're sufficiently lucky.  In my testing, I was
lucky, too.

Fix by switching to qemu_fd_getpagesize().  Rename the variable
holding its result from hpagesize to page_size.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1457378754-21649-3-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoexec: Fix memory allocation when memory path names new file
Markus Armbruster [Mon, 7 Mar 2016 19:25:13 +0000 (20:25 +0100)]
exec: Fix memory allocation when memory path names new file

Commit 8d31d6b extended file_ram_alloc() to accept file names in
addition to directory names.  Even though it passes O_CREAT to open(),
it actually works only for existing files.  Reproducer adapted from
the commit's qemu-doc.texi update:

    $ qemu-system-x86_64 -object memory-backend-file,size=2M,mem-path=/dev/hugepages/my-shmem-file,id=mb1
    qemu-system-x86_64: -object memory-backend-file,size=2M,mem-path=/dev/hugepages/my-shmem-file,id=mb1: failed to get page size of file /dev/hugepages/my-shmem-file: No such file or directory

This is because we first get the page size for @path, then open the
actual file.  Unwise even before the flawed commit, because the
directory could change in between, invalidating the page size.
Unlikely to bite in practice.

Rearrange the code to create the file (if necessary) before getting
its page size.  Carefully avoid TOCTTOU conditions with a method
suggested by Paolo Bonzini.

While there, replace "hugepages" by "guest RAM" in error messages,
because host memory backends can be used for purposes other than huge
pages, e.g. /dev/shm/ shared memory.  Help text of -mem-path agrees.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1457378754-21649-2-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoupdate-linux-headers: Add userfaultfd.h
Alexey Kardashevskiy [Mon, 15 Feb 2016 04:59:41 +0000 (15:59 +1100)]
update-linux-headers: Add userfaultfd.h

userfailtfd.h is used by post-copy migration so include it to
the update-linux-headers.sh as we want it updated altogether with
other kernel headers.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <1455512381-15271-1-git-send-email-aik@ozlabs.ru>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agokvm: x86: q35: Add support for -machine kernel_irqchip=split for q35
Rita Sinha [Mon, 7 Mar 2016 19:22:05 +0000 (00:52 +0530)]
kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35

The split IRQ chip mode via KVM_CAP_SPLIT_IRQCHIP was introduced with commit
15eafc2e60 but was broken for q35. This patch makes kernel_irqchip=split
functional for q35.

Signed-off-by: Rita Sinha <rita.sinha89@gmail.com>
Message-Id: <1457378525-16455-1-git-send-email-rita.sinha89@gmail.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
Peter Maydell [Tue, 15 Mar 2016 17:09:52 +0000 (17:09 +0000)]
Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging

slirp: Adding IPv6 support to Qemu -net user mode

# gpg: Signature made Tue 15 Mar 2016 16:06:03 GMT using RSA key ID FB6B2F1D
# gpg: Good signature from "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: F632 74CD C630 0873 CB3D  29D9 E3E5 1CE8 FB6B 2F1D

* remotes/thibault/tags/samuel-thibault:
  slirp: Add IPv6 support to the TFTP code
  qapi-schema, qemu-options & slirp: Adding Qemu options for IPv6 addresses
  slirp: Adding IPv6 address for DNS relay
  slirp: Handle IPv6 in TCP functions
  slirp: Reindent after refactoring
  slirp: Generalizing and neutralizing various TCP functions before adding IPv6 stuff
  slirp: Factorizing tcpiphdr structure with an union
  slirp: Adding IPv6 UDP support
  slirp: Adding ICMPv6 error sending
  slirp: Fix ICMP error sending
  slirp: Adding IPv6, ICMPv6 Echo and NDP autoconfiguration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
Peter Maydell [Tue, 15 Mar 2016 16:43:48 +0000 (16:43 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

vhost, virtio, pci, pc, acpi

nvdimm work
sparse cpu id rework
ipmi enhancements
fixes all over the place
pxb option to tweak chassis number

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 15 Mar 2016 14:33:10 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (51 commits)
  hw/acpi: fix GSI links UID
  ipmi: add some local variables in ipmi_sdr_init
  ipmi: remove the need of an ending record in the SDR table
  ipmi: use a function to initialize the SDR table
  ipmi: add a realize function to the device class
  ipmi: add rsp_buffer_set_error() helper
  ipmi: remove IPMI_CHECK_RESERVATION() macro
  ipmi: replace IPMI_ADD_RSP_DATA() macro with inline helpers
  ipmi: remove IPMI_CHECK_CMD_LEN() macro
  MAINTAINERS: machine core
  MAINTAINERS: Add an entry for virtio header files
  pc: acpi: clarify why possible LAPIC entries must be present in MADT
  pc: acpi: drop cpu->found_cpus bitmap
  pc: acpi: create Processor and Notify objects only for valid lapics
  pc: acpi: create MADT.lapic entries only for valid lapics
  pc: acpi: SRAT: create only valid processor lapic entries
  pc: acpi: cleanup qdev_get_machine() calls
  machine: introduce MachineClass.possible_cpu_arch_ids() hook
  pc: init pcms->apic_id_limit once and use it throughout pc.c
  pc: acpi: remove NOP assignment
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoslirp: Add IPv6 support to the TFTP code
Thomas Huth [Tue, 15 Mar 2016 09:31:23 +0000 (10:31 +0100)]
slirp: Add IPv6 support to the TFTP code

Add the handler code for incoming TFTP packets to udp6_input(),
and make sure that the TFTP code can send packets with both,
udp_output() and udp6_output() by introducing a wrapper function
called tftp_udp_output().

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
8 years agoMerge remote-tracking branch 'remotes/berrange/tags/pull-io-next-2016-03-15-1' into...
Peter Maydell [Tue, 15 Mar 2016 15:51:06 +0000 (15:51 +0000)]
Merge remote-tracking branch 'remotes/berrange/tags/pull-io-next-2016-03-15-1' into staging

Merge I/O fixes

# gpg: Signature made Tue 15 Mar 2016 14:42:43 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-io-next-2016-03-15-1:
  io: stronger check for support for IPv4/6

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agohw/acpi: fix GSI links UID
Marcel Apfelbaum [Sun, 13 Mar 2016 11:40:29 +0000 (13:40 +0200)]
hw/acpi: fix GSI links UID

According to the ACPI spec, each UID must be unique.
Use the irq number as UID for GSI links.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agoio: stronger check for support for IPv4/6
Daniel P. Berrange [Mon, 14 Mar 2016 18:15:35 +0000 (14:15 -0400)]
io: stronger check for support for IPv4/6

Instead of just checking for bind(), also check whether
getaddrinfo can resolve IPv6 addresses. This catches
failure when travis runs QEMU builds inside minimal
docker containers

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
Peter Maydell [Tue, 15 Mar 2016 11:05:37 +0000 (11:05 +0000)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

X86 fixes

# gpg: Signature made Mon 14 Mar 2016 20:26:25 GMT using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"

* remotes/ehabkost/tags/x86-pull-request:
  kvm: Remove x2apic feature from CPU model when kernel_irqchip is off
  hyperv: cpu hotplug fix with HyperV enabled

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/rth/tags/pull-i386-20160314' into staging
Peter Maydell [Tue, 15 Mar 2016 10:08:12 +0000 (10:08 +0000)]
Merge remote-tracking branch 'remotes/rth/tags/pull-i386-20160314' into staging

target-i386 fixes

# gpg: Signature made Mon 14 Mar 2016 17:54:06 GMT using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-i386-20160314:
  target-i386: Dump unknown opcodes with -d unimp
  target-i386: Fix inhibit irq mask handling
  target-i386: Use gen_nop_modrm for prefetch instructions
  target-i386: Fix addr16 prefix
  target-i386: Fix SMSW for 64-bit mode
  target-i386: Fix SMSW and LMSW from/to register
  target-i386: Avoid repeated calls to the bnd_jmp helper

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoqapi-schema, qemu-options & slirp: Adding Qemu options for IPv6 addresses
Yann Bordenave [Tue, 15 Mar 2016 09:31:22 +0000 (10:31 +0100)]
qapi-schema, qemu-options & slirp: Adding Qemu options for IPv6 addresses

This patch adds parameters to manage some new options in the qemu -net
command.
Slirp IPv6 address, network prefix, and DNS IPv6 address can be given in
argument to the qemu command.
Defaults parameters are respectively fec0::2, fec0::, /64 and fec0::3.

Signed-off-by: Yann Bordenave <meow@meowstars.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Adding IPv6 address for DNS relay
Guillaume Subiron [Tue, 15 Mar 2016 09:31:22 +0000 (10:31 +0100)]
slirp: Adding IPv6 address for DNS relay

This patch adds an IPv6 address to the DNS relay. in6_equal_dns() is
developed using this Slirp attribute.
sotranslate_in/out/accept() are also updated to manage the IPv6 case so the
guest can be able to join the host using one of the Slirp addresses.

For now this only points to localhost. Further development will be needed to
automatically fetch the IPv6 address from resolv.conf, and announce this via
RDNSS.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Handle IPv6 in TCP functions
Guillaume Subiron [Tue, 15 Mar 2016 09:31:21 +0000 (10:31 +0100)]
slirp: Handle IPv6 in TCP functions

This patch adds IPv6 case in TCP functions refactored by the last
patches.
This also adds IPv6 pseudo-header in tcpiphdr structure.
Finally, tcp_input() is called by ip6_input().

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Reindent after refactoring
Guillaume Subiron [Tue, 15 Mar 2016 09:31:21 +0000 (10:31 +0100)]
slirp: Reindent after refactoring

No code change.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Generalizing and neutralizing various TCP functions before adding IPv6 stuff
Guillaume Subiron [Tue, 15 Mar 2016 09:31:21 +0000 (10:31 +0100)]
slirp: Generalizing and neutralizing various TCP functions before adding IPv6 stuff

Basically, this patch adds some switch in various TCP functions to
prepare them for the IPv6 case.

To have something to "switch" in tcp_input() and tcp_respond(), a new
argument is used to give them the sa_family of the addresses they are
working on.

This patch does not include the entailed reindentation, to make proofread
easier. Reindentation is adressed in the following no-op patch.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Factorizing tcpiphdr structure with an union
Guillaume Subiron [Tue, 15 Mar 2016 09:31:20 +0000 (10:31 +0100)]
slirp: Factorizing tcpiphdr structure with an union

This patch factorizes the tcpiphdr structure to put the IPv4 fields in
an union, for addition of version 6 in further patch.
Using some macros, retrocompatibility of the existing code is assured.

This patch also fixes the SLIRP_MSIZE and margin computation in various
functions, and makes them compatible with the new tcpiphdr structure,
whose size will be bigger than sizeof(struct tcphdr) + sizeof(struct ip)

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Adding IPv6 UDP support
Guillaume Subiron [Tue, 15 Mar 2016 09:31:20 +0000 (10:31 +0100)]
slirp: Adding IPv6 UDP support

This adds the sin6 case in the fhost and lhost unions and related macros.
It adds udp6_input() and udp6_output().
It adds the IPv6 case in sorecvfrom().
Finally, udp_input() is called by ip6_input().

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Adding ICMPv6 error sending
Yann Bordenave [Tue, 15 Mar 2016 09:31:19 +0000 (10:31 +0100)]
slirp: Adding ICMPv6 error sending

Adding icmp6_send_error to send ICMPv6 Error messages. This function is
simpler than the v4 version.
Adding some calls in various functions to send ICMP errors, when a
received packet is too big, or when its hop limit is 0.

Signed-off-by: Yann Bordenave <meow@meowstars.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Fix ICMP error sending
Yann Bordenave [Tue, 15 Mar 2016 09:31:19 +0000 (10:31 +0100)]
slirp: Fix ICMP error sending

Disambiguation : icmp_error is renamed into icmp_send_error, since it
doesn't manage errors, but only sends ICMP Error messages.

Signed-off-by: Yann Bordenave <meow@meowstars.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoslirp: Adding IPv6, ICMPv6 Echo and NDP autoconfiguration
Guillaume Subiron [Tue, 15 Mar 2016 09:31:19 +0000 (10:31 +0100)]
slirp: Adding IPv6, ICMPv6 Echo and NDP autoconfiguration

This patch adds the functions needed to handle IPv6 packets. ICMPv6 and
NDP headers are implemented.

Slirp is now able to send NDP Router or Neighbor Advertisement when it
receives Router or Neighbor Solicitation. Using a 64bit-sized IPv6
prefix, the guest is now able to perform stateless autoconfiguration
(SLAAC) and to compute its IPv6 address.

This patch adds an ndp_table, mainly inspired by arp_table, to keep an
NDP cache and manage network address resolution.
Slirp regularly sends NDP Neighbor Advertisement, as recommended by the
RFC, to make the guest refresh its route.

This also adds ip6_cksum() to compute ICMPv6 checksums using IPv6
pseudo-header.

Some #define ETH_* are moved upper in slirp.h to make them accessible to
other slirp/*.h

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Peter Maydell [Tue, 15 Mar 2016 09:13:06 +0000 (09:13 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Mon 14 Mar 2016 16:36:52 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (40 commits)
  iotests: Add test for QMP event rates
  monitor: Use QEMU_CLOCK_VIRTUAL for the event queue in qtest mode
  monitor: Separate QUORUM_REPORT_BAD events according to the node name
  quorum: Fix crash in quorum_aio_cb()
  iotests: Correct 081's reference output
  block: Remove unused typedef of BlockDriverDirtyHandler
  block: Move block dirty bitmap code to separate files
  typedefs: Add BdrvDirtyBitmap
  block: Include hbitmap.h in block.h
  backup: Use Bitmap to replace "s->bitmap"
  vpc: Use BB functions in .bdrv_create()
  vmdk: Use BB functions in .bdrv_create()
  vhdx: Use BB functions in .bdrv_create()
  vdi: Use BB functions in .bdrv_create()
  sheepdog: Use BB functions in .bdrv_create()
  qed: Use BB functions in .bdrv_create()
  qcow2: Use BB functions in .bdrv_create()
  qcow: Use BB functions in .bdrv_create()
  parallels: Use BB functions in .bdrv_create()
  block: Introduce blk_set_allow_write_beyond_eof()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agokvm: Remove x2apic feature from CPU model when kernel_irqchip is off
Lan Tianyu [Thu, 25 Feb 2016 15:15:12 +0000 (23:15 +0800)]
kvm: Remove x2apic feature from CPU model when kernel_irqchip is off

x2apic feature is in the kvm_default_props and automatically added to all
CPU models when KVM is enabled. But userspace devices don't support x2apic
which can't be enabled without the in-kernel irqchip. It will trigger
warning of "host doesn't support requested feature: CPUID.01H:ECX.x2apic
[bit 21]" when kernel_irqchip is off. This patch is to fix it via removing
x2apic feature when kernel_irqchip is off.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
8 years agohyperv: cpu hotplug fix with HyperV enabled
Denis V. Lunev [Mon, 22 Feb 2016 09:13:02 +0000 (12:13 +0300)]
hyperv: cpu hotplug fix with HyperV enabled

With Hyper-V enabled CPU hotplug stops working. The CPU appears
in device manager on Windows but does not appear in peformance
monitor and control panel.

The root of the problem is the following. Windows checks
HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE bit in CPUID. The
presence of this bit is enough to cure the situation.

The bit should be set when CPU hotplug is allowed for HyperV VM.
The check that hot_add_cpu callback is defined is enough from the
protocol point of view. Though this callback is defined almost
always thus there is no need to export that knowledge in the
other way.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Andreas Färber" <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
8 years agotarget-i386: Dump unknown opcodes with -d unimp
Richard Henderson [Wed, 2 Mar 2016 00:53:18 +0000 (16:53 -0800)]
target-i386: Dump unknown opcodes with -d unimp

We discriminate here between opcodes that are illegal in the current
cpu mode or with illegal arguments (such as modrm.mod == 3) and
encodings that are unknown (such as an unimplemented isa extension).

Signed-off-by: Richard Henderson <rth@twiddle.net>
8 years agotarget-i386: Fix inhibit irq mask handling
Richard Henderson [Thu, 3 Mar 2016 05:16:51 +0000 (21:16 -0800)]
target-i386: Fix inhibit irq mask handling

The patch in 7f0b714 was too simplistic, in that we wound up setting
the flag and then resetting it immediately in gen_eob.

Fixes the reported boot problem with Windows XP.

Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
8 years agotarget-i386: Use gen_nop_modrm for prefetch instructions
Richard Henderson [Wed, 2 Mar 2016 18:31:35 +0000 (10:31 -0800)]
target-i386: Use gen_nop_modrm for prefetch instructions

Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
8 years agotarget-i386: Fix addr16 prefix
Paolo Bonzini [Wed, 2 Mar 2016 15:04:38 +0000 (16:04 +0100)]
target-i386: Fix addr16 prefix

While ADDSEG will only be false in 16-bit mode for LEA, it can be
false even in other cases when 16-bit addresses are obtained via
the 67h prefix in 32-bit mode.  In this case, gen_lea_v_seg forgets
to add a nonzero FS or GS base if CS/DS/ES/SS are all zero.  This
case is pretty rare but happens when booting Windows 95/98, and
this patch fixes it.

The bug is visible since commit d6a291498, but it was introduced
together with gen_lea_v_seg and it probably could be reproduced
with a "addr16 gs movsb" instruction as early as in commit
ca2f29f555805d07fb0b9ebfbbfc4e3656530977.

Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1456931078-21635-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
8 years agotarget-i386: Fix SMSW for 64-bit mode
Richard Henderson [Tue, 1 Mar 2016 16:59:32 +0000 (08:59 -0800)]
target-i386: Fix SMSW for 64-bit mode

In non-64-bit modes, the instruction always stores 16 bits.
But in 64-bit mode, when the destination is a register, the
instruction can write 32 or 64 bits.

Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
8 years agotarget-i386: Fix SMSW and LMSW from/to register
Paolo Bonzini [Tue, 1 Mar 2016 15:12:14 +0000 (16:12 +0100)]
target-i386: Fix SMSW and LMSW from/to register

SMSW and LMSW accept register operands, but commit 1906b2a ("target-i386:
Rearrange processing of 0F 01", 2016-02-13) did not account for that.

Fixes: 1906b2af7c2345037d9b2fdf484b457b5acd09d1
Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1456845134-18812-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
8 years agotarget-i386: Avoid repeated calls to the bnd_jmp helper
Paolo Bonzini [Tue, 1 Mar 2016 15:12:25 +0000 (16:12 +0100)]
target-i386: Avoid repeated calls to the bnd_jmp helper

Two flags were tested the wrong way.

Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1456845145-18891-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Fixed enable test as well.]

8 years agoMerge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-03-14-v2' into...
Kevin Wolf [Mon, 14 Mar 2016 16:36:31 +0000 (17:36 +0100)]
Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-03-14-v2' into queue-block

Block patches for pi day, v2.

# gpg: Signature made Mon Mar 14 17:35:29 2016 CET using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"

* mreitz/tags/pull-block-for-kevin-2016-03-14-v2:
  iotests: Add test for QMP event rates
  monitor: Use QEMU_CLOCK_VIRTUAL for the event queue in qtest mode
  monitor: Separate QUORUM_REPORT_BAD events according to the node name
  quorum: Fix crash in quorum_aio_cb()
  iotests: Correct 081's reference output
  block: Remove unused typedef of BlockDriverDirtyHandler
  block: Move block dirty bitmap code to separate files
  typedefs: Add BdrvDirtyBitmap
  block: Include hbitmap.h in block.h
  backup: Use Bitmap to replace "s->bitmap"

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoiotests: Add test for QMP event rates
Alberto Garcia [Thu, 10 Mar 2016 11:55:27 +0000 (13:55 +0200)]
iotests: Add test for QMP event rates

This test verifies that the rate-limited QMP events are emitted at a
maximum rate of 1 per second as defined in monitor_qapi_event_conf in
monitor.c

It also checks that QUORUM_REPORT_BAD events generated from different
nodes are kept in separate queues so they don't mask each other.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 0dbd3ee88a59a6363042ad81cfb345037bfbf612.1457610443.git.berto@igalia.com
[mreitz@redhat.com: Renamed test from 146 to 148]
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agomonitor: Use QEMU_CLOCK_VIRTUAL for the event queue in qtest mode
Alberto Garcia [Thu, 10 Mar 2016 11:55:26 +0000 (13:55 +0200)]
monitor: Use QEMU_CLOCK_VIRTUAL for the event queue in qtest mode

This allows us to perform tests on the monitor queues to verify that
the rate limits are enforced.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: dde511809e954a5c32d5b648bb184c03c89ed5d5.1457610443.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agomonitor: Separate QUORUM_REPORT_BAD events according to the node name
Alberto Garcia [Thu, 10 Mar 2016 11:55:25 +0000 (13:55 +0200)]
monitor: Separate QUORUM_REPORT_BAD events according to the node name

The QUORUM_REPORT_BAD event is emitted whenever there's an I/O error
in a child of a Quorum device. This event is emitted at a maximum rate
of 1 per second. This means that an error in one of the children will
mask errors in the other children if they happen within the same 1
second interval.

This patch modifies qapi_event_throttle_equal() so QUORUM_REPORT_BAD
events are kept separately if they come from different children.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: b989c0cb3755bc4b6696e796fa8ed2ef6c56606a.1457610443.git.berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agoquorum: Fix crash in quorum_aio_cb()
Alberto Garcia [Thu, 10 Mar 2016 11:55:24 +0000 (13:55 +0200)]
quorum: Fix crash in quorum_aio_cb()

quorum_aio_cb() emits the QUORUM_REPORT_BAD event if there's
an I/O error in a Quorum child. However sacb->aiocb must be
correctly initialized for this to happen. read_quorum_children() and
read_fifo_child() are not doing this, which results in a QEMU crash.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 8138570d071ba7e25db3736979234a1fd71dbd05.1457610443.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agoiotests: Correct 081's reference output
Max Reitz [Fri, 11 Mar 2016 14:14:47 +0000 (15:14 +0100)]
iotests: Correct 081's reference output

The newly added type parameter for the QUORUM_REPORT_BAD event changed
the output of iotest 081, so the reference should be amended
accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1457705687-27122-1-git-send-email-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
8 years agoblock: Remove unused typedef of BlockDriverDirtyHandler
Fam Zheng [Tue, 8 Mar 2016 04:44:56 +0000 (12:44 +0800)]
block: Remove unused typedef of BlockDriverDirtyHandler

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-6-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Move block dirty bitmap code to separate files
Fam Zheng [Tue, 8 Mar 2016 04:44:55 +0000 (12:44 +0800)]
block: Move block dirty bitmap code to separate files

The only code change is making bdrv_dirty_bitmap_truncate public. It is
used in block.c.

Also two long lines (bdrv_get_dirty) are wrapped.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-5-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agotypedefs: Add BdrvDirtyBitmap
Fam Zheng [Tue, 8 Mar 2016 04:44:54 +0000 (12:44 +0800)]
typedefs: Add BdrvDirtyBitmap

Following patches to refactor and move block dirty bitmap code could use
this.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-4-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Include hbitmap.h in block.h
Fam Zheng [Tue, 8 Mar 2016 04:44:53 +0000 (12:44 +0800)]
block: Include hbitmap.h in block.h

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-3-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agobackup: Use Bitmap to replace "s->bitmap"
Fam Zheng [Tue, 8 Mar 2016 04:44:52 +0000 (12:44 +0800)]
backup: Use Bitmap to replace "s->bitmap"

"s->bitmap" tracks done sectors, we only check bit states without using any
iterator which HBitmap is good for. Switch to "Bitmap" which is simpler and
more memory efficient.

Meanwhile, rename it to done_bitmap, to reflect the intention.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-2-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Peter Maydell [Mon, 14 Mar 2016 16:22:17 +0000 (16:22 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

# gpg: Signature made Mon 14 Mar 2016 11:27:01 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  trace: separate MMIO tracepoints from TB-access tracepoints
  trace: include CPU index in trace_memory_region_*()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agovpc: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
vpc: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agovmdk: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
vmdk: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agovhdx: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
vhdx: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agovdi: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
vdi: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agosheepdog: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
sheepdog: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoqed: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
qed: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoqcow2: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
qcow2: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoqcow: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
qcow: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoparallels: Use BB functions in .bdrv_create()
Kevin Wolf [Tue, 8 Mar 2016 14:57:05 +0000 (15:57 +0100)]
parallels: Use BB functions in .bdrv_create()

All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Introduce blk_set_allow_write_beyond_eof()
Kevin Wolf [Tue, 8 Mar 2016 15:39:49 +0000 (16:39 +0100)]
block: Introduce blk_set_allow_write_beyond_eof()

We check that the guest can't write beyond the end of its disk, but for
other internal users it can make sense to allow growing a file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Use writeback in .bdrv_create() implementations
Kevin Wolf [Fri, 4 Mar 2016 13:53:50 +0000 (14:53 +0100)]
block: Use writeback in .bdrv_create() implementations

There's no reason to use a writethrough cache mode while creating an
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agohmp: Extend drive_del to delete nodes without BB
Kevin Wolf [Tue, 23 Feb 2016 16:50:37 +0000 (17:50 +0100)]
hmp: Extend drive_del to delete nodes without BB

Now that we can use drive_add to create new nodes without a BB, we also
want to be able to delete such nodes again.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agohmp: 'drive_add -n' for creating a node without BB
Kevin Wolf [Tue, 23 Feb 2016 16:33:24 +0000 (17:33 +0100)]
hmp: 'drive_add -n' for creating a node without BB

This patch adds an option to the drive_add HMP command to create only a
BlockDriverState without a BlockBackend on top.

The motivation for this is that libvirt needs to specify options to a
migration target (specifically, detect-zeroes). drive-mirror doesn't
allow specifying options, and the proper way to do this is to create the
target BDS separately with blockdev-add (where you can specify options)
and then use blockdev-mirror to that BDS.

However, libvirt can't use blockdev-add as long as it is still
experimental, and we're expecting that it will still take some time, so
we need to resort to drive_add.

The problem with drive_add is that so far it always created a BB, and
BDSes with a BB can't be used as a mirroring target as long as we don't
support multiple BBs per BDS - and while we're working towards that
goal, it's another thing that will still take some time.

So to achieve the goal, the simplest solution to provide the
functionality now without adding one-off options to the mirror QMP
commands is to extend drive_add to create nodes without BBs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agovmdk: Switch to heap arrays for vmdk_parent_open
Fam Zheng [Tue, 8 Mar 2016 08:24:36 +0000 (16:24 +0800)]
vmdk: Switch to heap arrays for vmdk_parent_open

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agovmdk: Switch to heap arrays for vmdk_read_cid
Fam Zheng [Tue, 8 Mar 2016 08:24:35 +0000 (16:24 +0800)]
vmdk: Switch to heap arrays for vmdk_read_cid

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agovmdk: Switch to heap arrays for vmdk_write_cid
Fam Zheng [Tue, 8 Mar 2016 08:24:34 +0000 (16:24 +0800)]
vmdk: Switch to heap arrays for vmdk_write_cid

It is only called once for each opened image, so we can do it the easy
way.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Fix cache mode defaults in bds_tree_init()
Kevin Wolf [Mon, 7 Mar 2016 13:23:04 +0000 (14:23 +0100)]
block: Fix cache mode defaults in bds_tree_init()

Without setting explicit defaults in the options, blockdev-add without
an ID ended up defaulting to writethrough. It should be writeback as
documented.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
8 years agoblock: Fix snapshot=on cache modes
Kevin Wolf [Mon, 7 Mar 2016 12:02:15 +0000 (13:02 +0100)]
block: Fix snapshot=on cache modes

Since commit 91a097e, we end up with a somewhat weird cache mode
configuration with snapshot=on: The commit broke the cache mode
inheritance for the snapshot overlay so that it is opened as
writethrough instead of unsafe now. The following bdrv_append() call to
put it on top of the tree swaps the WCE flag with the snapshot's backing
file (i.e. the originally given file), so what we eventually get is
cache=writeback on the temporary overlay and
cache=writethrough,cache.no-flush=on on the real image file.

This patch changes things so that the temporary overlay gets
cache=unsafe again like it used to, and the real images get whatever the
user specified. This means that cache.direct is now respected even with
snapshot=on, and in the case of committing changes, the final flush is
no longer ignored except explicitly requested by the user.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblockdev: Snapshotting must not open second instance of old top
Kevin Wolf [Wed, 2 Mar 2016 11:16:44 +0000 (12:16 +0100)]
blockdev: Snapshotting must not open second instance of old top

Calling bdrv_img_create() with a size of -1 means that it determines the
size automatically by opening the backing file. However, in the case of
live snapshots, the backing file is already opened and we must avoid
opening the same image twice at the same time. Apart from that, just
getting the size from the already existing BDS is a lot less overhead
than opening a new instance.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
8 years agoquorum: modify vote rules for flush operation
Changlong Xie [Fri, 26 Feb 2016 01:39:02 +0000 (09:39 +0800)]
quorum: modify vote rules for flush operation

Keep flush interface the same logic as quorum read/write, Otherwise in
following scenario, we'll encounter unexpected errors.

Quorum has two children(A, B). A do flush sucessfully, but B flush failed.
This cause the filesystem of guest become read-only with following errors:

end_request: I/O error, dev vda, sector 11159960
Aborting journal on device vda3-8
EXT4-fs error (device vda3): ext4_journal_start_sb:327: Detected abort journal
EXT4-fs (vda3): Remounting filesystem read-only

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoqmp event: Refactor QUORUM_REPORT_BAD
Changlong Xie [Fri, 26 Feb 2016 01:39:01 +0000 (09:39 +0800)]
qmp event: Refactor QUORUM_REPORT_BAD

Introduce QuorumOpType, and make QUORUM_REPORT_BAD compatible
with it.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>