sdk/emulator/qemu.git
8 years agotarget-i386: Enable "check" mode by default
Eduardo Habkost [Wed, 26 Aug 2015 16:25:44 +0000 (13:25 -0300)]
target-i386: Enable "check" mode by default

Current default behavior of QEMU is to silently disable features that
are not supported by the host when a CPU model is requested in the
command-line. This means that in addition to risking breaking guest ABI
by default, we are silent about it.

I would like to enable "enforce" by default, but this can easily break
existing production systems because of the way libvirt makes assumptions
about CPU models today (this will change in the future, once QEMU
provide a proper interface for checking if a CPU model is runnable).

But there's no reason we should be silent about it. So, change
target-i386 to enable "check" mode by default so at least we have some
warning printed to stderr (and hopefully logged somewhere) when QEMU
disables a feature that is not supported by the host system.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
8 years agotarget-i386: Don't left shift negative constant
Eduardo Habkost [Tue, 29 Sep 2015 20:34:23 +0000 (17:34 -0300)]
target-i386: Don't left shift negative constant

Left shift of negative values is undefined behavior. Detected by clang:
  qemu/target-i386/translate.c:2423:26: runtime error:
    left shift of negative value -8

This changes the code to reverse the sign after the left shift.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
Peter Maydell [Tue, 27 Oct 2015 10:10:46 +0000 (10:10 +0000)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Tue 27 Oct 2015 05:47:28 GMT using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net: free the string returned by object_get_canonical_path_component
  net: make iov_to_buf take right size argument in nc_sendv_compat()
  net: Remove duplicate data from query-rx-filter on multiqueue net devices
  vmxnet3: Do not fill stats if device is inactive
  options: Add documentation for filter-dump
  net/dump: Provide the dumping facility as a net-filter
  net/dump: Separate the NetClientState from the DumpState
  net/dump: Rework net-dump init functions
  net/dump: Add support for receive_iov function
  net: cadence_gem: Set initial MAC address

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agonet: free the string returned by object_get_canonical_path_component
Yang Hongyang [Tue, 20 Oct 2015 01:51:26 +0000 (09:51 +0800)]
net: free the string returned by object_get_canonical_path_component

The value returned from object_get_canonical_path_component
must be freed.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agonet: make iov_to_buf take right size argument in nc_sendv_compat()
Yang Hongyang [Tue, 20 Oct 2015 01:51:25 +0000 (09:51 +0800)]
net: make iov_to_buf take right size argument in nc_sendv_compat()

We want "buf, sizeof(buf)" here.  sizeof(buffer) is the size of a
pointer, which is wrong.
Thanks to Paolo for pointing it out.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agonet: Remove duplicate data from query-rx-filter on multiqueue net devices
Vladislav Yasevich [Mon, 19 Oct 2015 13:04:38 +0000 (09:04 -0400)]
net: Remove duplicate data from query-rx-filter on multiqueue net devices

When responding to a query-rx-filter command on a multiqueue
netdev, qemu reports the data for each queue.  The data, however,
is not per-queue, but per device and the same data is reported
multiple times.  This causes confusion and may also cause extra
unnecessary processing when looking at the data.

Commit 638fb14169 (net: Make qmp_query_rx_filter() with name argument
more obvious) partially addresses this issue, by limiting the output
when the name is specified.  However, when the name is not specified,
the issue still persists.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agovmxnet3: Do not fill stats if device is inactive
Shmulik Ladkani [Thu, 15 Oct 2015 10:54:30 +0000 (13:54 +0300)]
vmxnet3: Do not fill stats if device is inactive

Guest OS may issue VMXNET3_CMD_GET_STATS even before device was
activated (for example in linux, after insmod but prior net-dev open).

Accessing shared descriptors prior device activation is illegal as the
VMXNET3State structures have not been fully initialized.

As a result, guest memory gets corrupted and may lead to guest OS
crashes.

Fix, by not filling the stats descriptors if device is inactive.

Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Dana Rubin <dana.rubin@ravellosystems.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agooptions: Add documentation for filter-dump
Thomas Huth [Tue, 13 Oct 2015 10:40:02 +0000 (12:40 +0200)]
options: Add documentation for filter-dump

Add a short description for the filter-dump command line options.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agonet/dump: Provide the dumping facility as a net-filter
Thomas Huth [Tue, 13 Oct 2015 10:40:01 +0000 (12:40 +0200)]
net/dump: Provide the dumping facility as a net-filter

Use the net-filter infrastructure to provide the dumping
functions for netdev devices, too.

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agonet/dump: Separate the NetClientState from the DumpState
Thomas Huth [Tue, 13 Oct 2015 10:40:00 +0000 (12:40 +0200)]
net/dump: Separate the NetClientState from the DumpState

With the upcoming dumping-via-netfilter patch, the DumpState
should not be related to NetClientState anymore, so move the
related information to a new struct called DumpNetClient.

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agonet/dump: Rework net-dump init functions
Thomas Huth [Tue, 13 Oct 2015 10:39:59 +0000 (12:39 +0200)]
net/dump: Rework net-dump init functions

Move the creation of the dump client from net_dump_init() into
net_init_dump(), so we can later use the former function for
dump via netfilter, too. Also rename net_dump_init() to
net_dump_state_init() to make it easier distinguishable from
net_init_dump().

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agonet/dump: Add support for receive_iov function
Thomas Huth [Tue, 13 Oct 2015 10:39:58 +0000 (12:39 +0200)]
net/dump: Add support for receive_iov function

Adding a proper receive_iov function to the net dump module.
This will make it easier to support the dump filter feature for
the -netdev option in later patches.

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agonet: cadence_gem: Set initial MAC address
Sebastian Huber [Mon, 12 Oct 2015 08:25:01 +0000 (10:25 +0200)]
net: cadence_gem: Set initial MAC address

Set initial MAC address to the one specified by the command line.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/sstabellini/tags/xen-2015-10-26' into staging
Peter Maydell [Mon, 26 Oct 2015 13:13:38 +0000 (13:13 +0000)]
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-2015-10-26' into staging

Xen 2015-10-26

# gpg: Signature made Mon 26 Oct 2015 11:32:50 GMT using RSA key ID 70E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"

* remotes/sstabellini/tags/xen-2015-10-26:
  xen-platform: Replace assert() with appropriate error reporting
  xen_platform: switch to realize
  Qemu/Xen: Fix early freeing MSIX MMIO memory region

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoxen-platform: Replace assert() with appropriate error reporting
Eduardo Habkost [Wed, 21 Oct 2015 15:46:50 +0000 (13:46 -0200)]
xen-platform: Replace assert() with appropriate error reporting

Commit dbb7405d8caad0814ceddd568cb49f163a847561 made it possible to
trigger an assert using "-device xen-platform". Replace it with
appropriate error reporting.

Before:

  $ qemu-system-x86_64 -device xen-platform
  qemu-system-x86_64: hw/i386/xen/xen_platform.c:391: xen_platform_initfn: Assertion `xen_enabled()' failed.
  Aborted (core dumped)
  $

After:

  $ qemu-system-x86_64 -device xen-platform
  qemu-system-x86_64: -device xen-platform: xen-platform device requires the Xen accelerator
  $

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
8 years agoxen_platform: switch to realize
Stefano Stabellini [Wed, 21 Oct 2015 15:46:49 +0000 (13:46 -0200)]
xen_platform: switch to realize

Use realize to initialize the xen_platform device

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/elmarco/tags/ivshmem-pull-request' into staging
Peter Maydell [Mon, 26 Oct 2015 11:32:20 +0000 (11:32 +0000)]
Merge remote-tracking branch 'remotes/elmarco/tags/ivshmem-pull-request' into staging

ivshmem series

# gpg: Signature made Mon 26 Oct 2015 09:27:46 GMT using RSA key ID 75969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/ivshmem-pull-request: (51 commits)
  doc: document ivshmem & hugepages
  ivshmem: use little-endian int64_t for the protocol
  ivshmem: use kvm irqfd for msi notifications
  ivshmem: rename MSI eventfd_table
  ivshmem: remove EventfdEntry.vector
  ivshmem: add hostmem backend
  ivshmem: use qemu_strtosz()
  ivshmem: do not keep shm_fd open
  tests: add ivshmem qtest
  qtest: add qtest_add_abrt_handler()
  msix: implement pba write (but read-only)
  contrib: remove unnecessary strdup()
  ivshmem: add check on protocol version in QEMU
  docs: update ivshmem device spec
  ivshmem-server: fix hugetlbfs support
  ivshmem-server: use a uint16 for client ID
  ivshmem-client: check the number of vectors
  contrib: add ivshmem client and server
  util: const event_notifier_get_fd() argument
  ivshmem: reset mask on device reset
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoQemu/Xen: Fix early freeing MSIX MMIO memory region
Lan Tianyu [Sun, 11 Oct 2015 15:19:24 +0000 (23:19 +0800)]
Qemu/Xen: Fix early freeing MSIX MMIO memory region

msix->mmio is added to XenPCIPassthroughState's object as property.
object_finalize_child_property is called for XenPCIPassthroughState's
object, which calls object_property_del_all, which is going to try to
delete msix->mmio. object_finalize_child_property() will access
msix->mmio's obj. But the whole msix struct has already been freed
by xen_pt_msix_delete. This will cause segment fault when msix->mmio
has been overwritten.

This patch is to fix the issue.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
8 years agodoc: document ivshmem & hugepages
Marc-André Lureau [Wed, 7 Oct 2015 14:31:47 +0000 (16:31 +0200)]
doc: document ivshmem & hugepages

Document and give some examples of hugepages support with ivshmem device
and server.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
8 years agoivshmem: use little-endian int64_t for the protocol
Marc-André Lureau [Thu, 24 Sep 2015 10:55:01 +0000 (12:55 +0200)]
ivshmem: use little-endian int64_t for the protocol

The current ivshmem protocol uses 'long' for integers. But the
sizeof(long) depends on the host and the endianess is not defined, which
may cause portability troubles.

Instead, switch to using little-endian int64_t. This breaks the
protocol, except on x64 little-endian host where this change
should be compatible.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: use kvm irqfd for msi notifications
Marc-André Lureau [Thu, 9 Jul 2015 13:50:13 +0000 (15:50 +0200)]
ivshmem: use kvm irqfd for msi notifications

Use irqfd for improving context switch when notifying the guest.
If the host doesn't support kvm irqfd, regular msi notifications are
still supported.

Note: the ivshmem implementation doesn't allow switching between MSI and
IO interrupts, this patch doesn't either.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoivshmem: rename MSI eventfd_table
Marc-André Lureau [Mon, 27 Jul 2015 10:59:19 +0000 (12:59 +0200)]
ivshmem: rename MSI eventfd_table

The array is used to have vector specific data, so use a more
descriptive name.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: remove EventfdEntry.vector
Marc-André Lureau [Fri, 24 Jul 2015 16:52:19 +0000 (18:52 +0200)]
ivshmem: remove EventfdEntry.vector

No need to store an extra int for the vector number when it can be
computed easily by looking at the position in the array.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: add hostmem backend
Marc-André Lureau [Mon, 29 Jun 2015 22:10:16 +0000 (00:10 +0200)]
ivshmem: add hostmem backend

Instead of handling allocation, teach ivshmem to use a memory backend.
This allows to use hugetlbfs backed memory now.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: use qemu_strtosz()
Marc-André Lureau [Mon, 29 Jun 2015 22:06:03 +0000 (00:06 +0200)]
ivshmem: use qemu_strtosz()

Use the common qemu utility function to parse the memory size.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: do not keep shm_fd open
Marc-André Lureau [Mon, 29 Jun 2015 22:04:19 +0000 (00:04 +0200)]
ivshmem: do not keep shm_fd open

Remove shm_fd from device state, closing it as early as possible to avoid leaks.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agotests: add ivshmem qtest
Marc-André Lureau [Wed, 2 Apr 2014 14:57:48 +0000 (16:57 +0200)]
tests: add ivshmem qtest

Adds 4 ivshmemtests:
- single qemu instance and basic IO
- pair of instances, check memory sharing
- pair of instances with server, and MSIX
- hot plug/unplug

A temporary shm is created as well as a directory to place server
socket, both should be clear on exit and abort.

Cc: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
8 years agoqtest: add qtest_add_abrt_handler()
Marc-André Lureau [Fri, 19 Jun 2015 16:45:14 +0000 (18:45 +0200)]
qtest: add qtest_add_abrt_handler()

Allow a test to add abort handlers, use GHook for all handlers.

There is currently no way to remove a handler, but it could be
later added if needed.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agomsix: implement pba write (but read-only)
Marc-André Lureau [Fri, 26 Jun 2015 12:25:29 +0000 (14:25 +0200)]
msix: implement pba write (but read-only)

qpci_msix_pending() writes on pba region, causing qemu to SEGV:

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0x7ffff7fba8c0 (LWP 25882)]
  0x0000000000000000 in ?? ()
  (gdb) bt
  #0  0x0000000000000000 in  ()
  #1  0x00005555556556c5 in memory_region_oldmmio_write_accessor (mr=0x5555579f3f80, addr=0, value=0x7fffffffbf68, size=4, shift=0, mask=4294967295, attrs=...) at /home/elmarco/src/qemu/memory.c:434
  #2  0x00005555556558e1 in access_with_adjusted_size (addr=0, value=0x7fffffffbf68, size=4, access_size_min=1, access_size_max=4, access=0x55555565563e <memory_region_oldmmio_write_accessor>, mr=0x5555579f3f80, attrs=...) at /home/elmarco/src/qemu/memory.c:506
  #3  0x00005555556581eb in memory_region_dispatch_write (mr=0x5555579f3f80, addr=0, data=0, size=4, attrs=...) at /home/elmarco/src/qemu/memory.c:1176
  #4  0x000055555560b6f9 in address_space_rw (as=0x555555eff4e0 <address_space_memory>, addr=3759147008, attrs=..., buf=0x7fffffffc1b0 "", len=4, is_write=true) at /home/elmarco/src/qemu/exec.c:2439
  #5  0x000055555560baa2 in cpu_physical_memory_rw (addr=3759147008, buf=0x7fffffffc1b0 "", len=4, is_write=1) at /home/elmarco/src/qemu/exec.c:2534
  #6  0x000055555564c005 in cpu_physical_memory_write (addr=3759147008, buf=0x7fffffffc1b0, len=4) at /home/elmarco/src/qemu/include/exec/cpu-common.h:80
  #7  0x000055555564cd9c in qtest_process_command (chr=0x55555642b890, words=0x5555578de4b0) at /home/elmarco/src/qemu/qtest.c:378
  #8  0x000055555564db77 in qtest_process_inbuf (chr=0x55555642b890, inbuf=0x55555641b340) at /home/elmarco/src/qemu/qtest.c:569
  #9  0x000055555564dc07 in qtest_read (opaque=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", size=22) at /home/elmarco/src/qemu/qtest.c:581
  #10 0x000055555574ce3e in qemu_chr_be_write (s=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", len=22) at qemu-char.c:306
  #11 0x0000555555751263 in tcp_chr_read (chan=0x55555642bcf0, cond=G_IO_IN, opaque=0x55555642b890) at qemu-char.c:2876
  #12 0x00007ffff64c9a8a in g_main_context_dispatch (context=0x55555641c400) at gmain.c:3122

(without this patch, this can be reproduced with the ivshmem qtest)

Implement an empty mmio write to avoid the crash.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agocontrib: remove unnecessary strdup()
Marc-André Lureau [Wed, 24 Jun 2015 11:33:32 +0000 (13:33 +0200)]
contrib: remove unnecessary strdup()

getopt() optarg points to argv memory, no need to dup those values,
fixes small leaks detected by clang-analyzer.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
8 years agoivshmem: add check on protocol version in QEMU
David Marchand [Tue, 16 Jun 2015 15:43:34 +0000 (17:43 +0200)]
ivshmem: add check on protocol version in QEMU

Send a protocol version as the first message from server, clients must
close communication if they don't support this protocol version.  Older
QEMUs should be fine with this change in the protocol since they
overrides their own vm_id on reception of an id associated to no
eventfd.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[use fifo_update_and_get()]
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agodocs: update ivshmem device spec
David Marchand [Mon, 8 Sep 2014 09:17:49 +0000 (11:17 +0200)]
docs: update ivshmem device spec

Add some notes on the parts needed to use ivshmem devices: more specifically,
explain the purpose of an ivshmem server and the basic concept to use the
ivshmem devices in guests.
Move some parts of the documentation and re-organise it.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
8 years agoivshmem-server: fix hugetlbfs support
Marc-André Lureau [Mon, 29 Jun 2015 17:53:15 +0000 (19:53 +0200)]
ivshmem-server: fix hugetlbfs support

As pointed out on the ML by Andrew Jones, glibc no longer permits
creating POSIX shm on hugetlbfs directly. When given a hugetlbfs path,
create a shareable file there.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
8 years agoivshmem-server: use a uint16 for client ID
Marc-André Lureau [Tue, 23 Jun 2015 15:09:59 +0000 (17:09 +0200)]
ivshmem-server: use a uint16 for client ID

In practice, the number of VM is limited to MAXUINT16 in ivshmem, so use
the same limit on the server (removes a theorical infinite loop)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem-client: check the number of vectors
Marc-André Lureau [Tue, 23 Jun 2015 14:41:58 +0000 (16:41 +0200)]
ivshmem-client: check the number of vectors

Check the number of vectors received from the server, to avoid
out of bound array access.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agocontrib: add ivshmem client and server
David Marchand [Mon, 8 Sep 2014 09:17:48 +0000 (11:17 +0200)]
contrib: add ivshmem client and server

When using ivshmem devices, notifications between guests can be sent as
interrupts using a ivshmem-server (typical use described in documentation).
The client is provided as a debug tool.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
[fix a valgrind warning, option and server_close() segvs, extra server
headers includes, getopt() return type, out-of-tree build, use qemu
event_notifier instead of eventfd, fix x86/osx warnings - Marc-André]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
8 years agoutil: const event_notifier_get_fd() argument
Marc-André Lureau [Tue, 13 Oct 2015 10:12:16 +0000 (12:12 +0200)]
util: const event_notifier_get_fd() argument

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
8 years agoivshmem: reset mask on device reset
Marc-André Lureau [Tue, 23 Jun 2015 12:13:08 +0000 (14:13 +0200)]
ivshmem: reset mask on device reset

The interrupt mask is a state value, it should be reset, like the
interrupt status.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: error on too many eventfd received
Marc-André Lureau [Tue, 23 Jun 2015 12:07:11 +0000 (14:07 +0200)]
ivshmem: error on too many eventfd received

The number of eventfd that can be handled per peer is limited by the
number of vectors. Return an error when receiving too many of them.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: replace 'guest' for 'peer' appropriately
Marc-André Lureau [Tue, 23 Jun 2015 11:38:46 +0000 (13:38 +0200)]
ivshmem: replace 'guest' for 'peer' appropriately

The terms 'guest' and 'peer' are used sometime interchangeably which may
be confusing. Instead, use 'peer' for the remote instances of ivshmem
clients, and 'guest' for the local VM.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: fix pci_ivshmem_exit()
Marc-André Lureau [Tue, 23 Jun 2015 10:57:16 +0000 (12:57 +0200)]
ivshmem: fix pci_ivshmem_exit()

Free all objects owned by the device, making sure the device is free,
fixing hot-unplug.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: add device description
Marc-André Lureau [Tue, 23 Jun 2015 11:01:40 +0000 (13:01 +0200)]
ivshmem: add device description

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: check shm isn't already initialized
Marc-André Lureau [Tue, 23 Jun 2015 10:55:41 +0000 (12:55 +0200)]
ivshmem: check shm isn't already initialized

The server should not change the shm, and this isn't handled by qemu and
we should should verify this in qemu.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: shmfd can be 0
Marc-André Lureau [Tue, 23 Jun 2015 10:53:42 +0000 (12:53 +0200)]
ivshmem: shmfd can be 0

0 is a valid fd value, so change conditions and set -1 value early

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: migrate with VMStateDescription
Marc-André Lureau [Thu, 18 Jun 2015 12:05:46 +0000 (14:05 +0200)]
ivshmem: migrate with VMStateDescription

load_state_old() is used to keep compatibility with version 0.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: use common is_power_of_2()
Marc-André Lureau [Thu, 18 Jun 2015 14:10:33 +0000 (16:10 +0200)]
ivshmem: use common is_power_of_2()

The common version correctly checks for 0 value case.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: use common return
Marc-André Lureau [Fri, 19 Jun 2015 10:21:46 +0000 (12:21 +0200)]
ivshmem: use common return

Both if branches return, move this out to common end.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: simplify a bit the code
Marc-André Lureau [Fri, 19 Jun 2015 10:19:55 +0000 (12:19 +0200)]
ivshmem: simplify a bit the code

Use some more explicit variables to simplify the code.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: print error on invalid peer id
Marc-André Lureau [Tue, 23 Jun 2015 11:34:09 +0000 (13:34 +0200)]
ivshmem: print error on invalid peer id

The server shouldn't send invalid peer id, so print an error if it's the
case.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: improve error handling
Marc-André Lureau [Thu, 18 Jun 2015 12:39:49 +0000 (14:39 +0200)]
ivshmem: improve error handling

The test whether the chardev is an AF_UNIX socket rejects
"-chardev socket,id=chr0,path=/tmp/foo,server,nowait -device
ivshmem,chardev=chr0", but fails to explain why.

Use an explicit error on why a chardev may be rejected.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: improve debug messages
Marc-André Lureau [Thu, 18 Jun 2015 13:04:13 +0000 (15:04 +0200)]
ivshmem: improve debug messages

Some misc improvements to ivshmem debug.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: remove max_peer field
Marc-André Lureau [Fri, 19 Jun 2015 10:17:26 +0000 (12:17 +0200)]
ivshmem: remove max_peer field

max_peer isn't really useful, it tracks the maximum received VM id, but
that quickly matches nb_peers, the size of the peers array. Since VM
come and go, there might be sparse peers so it doesn't help much in
general to have this value around.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: initialize max_peer to -1
Marc-André Lureau [Thu, 25 Jun 2015 11:49:09 +0000 (13:49 +0200)]
ivshmem: initialize max_peer to -1

There is no peer when device is initialized, do not let doorbell for
inexisting peer 0.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: remove useless ivshmem_update_irq() val argument
Marc-André Lureau [Thu, 18 Jun 2015 13:00:52 +0000 (15:00 +0200)]
ivshmem: remove useless ivshmem_update_irq() val argument

val isn't used in ivshmem_update_irq() function.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: allocate eventfds in resize_peers()
Marc-André Lureau [Tue, 15 Sep 2015 15:23:07 +0000 (17:23 +0200)]
ivshmem: allocate eventfds in resize_peers()

It simplifies a bit the code to allocate the array when setting the
number of peers instead of lazily when receiving the first vector.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: simplify around increase_dynamic_storage()
Marc-André Lureau [Tue, 15 Sep 2015 15:21:37 +0000 (17:21 +0200)]
ivshmem: simplify around increase_dynamic_storage()

Set the number of peers and array allocation in a single place. Rename
to better reflect the function content.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: limit maximum number of peers to G_MAXUINT16
Marc-André Lureau [Tue, 15 Sep 2015 14:55:10 +0000 (16:55 +0200)]
ivshmem: limit maximum number of peers to G_MAXUINT16

Limit the maximum number of peers to MAXUINT16. This is more realistic
and better matches the limit of the doorbell register.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: remove last exit(1)
Marc-André Lureau [Mon, 22 Jun 2015 10:55:16 +0000 (12:55 +0200)]
ivshmem: remove last exit(1)

Failing to create a chardev shouldn't be fatal.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: more qdev conversion
Marc-André Lureau [Thu, 18 Jun 2015 12:59:28 +0000 (14:59 +0200)]
ivshmem: more qdev conversion

Use the latest qemu device modeling API, in particular, convert to
realize to fix the error handling; right now a botched device_add
ivhsmem command kills the VM.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: remove useless doorbell field
Marc-André Lureau [Thu, 18 Jun 2015 14:17:48 +0000 (16:17 +0200)]
ivshmem: remove useless doorbell field

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: remove superflous ivshmem_attr field
Marc-André Lureau [Thu, 18 Jun 2015 14:24:33 +0000 (16:24 +0200)]
ivshmem: remove superflous ivshmem_attr field

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: remove unnecessary dup()
Marc-André Lureau [Mon, 22 Jun 2015 10:38:34 +0000 (12:38 +0200)]
ivshmem: remove unnecessary dup()

qemu_chr_fe_get_msgfd() transfers ownership, there is no need to dup the
fd.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: factor out the incoming fifo handling
Marc-André Lureau [Tue, 23 Jun 2015 15:56:37 +0000 (17:56 +0200)]
ivshmem: factor out the incoming fifo handling

Make a new function fifo_update_and_get() that can be reused by other
functions (in next commits).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivshmem: fix number of bytes to push to fifo
Marc-André Lureau [Tue, 23 Jun 2015 15:53:46 +0000 (17:53 +0200)]
ivshmem: fix number of bytes to push to fifo

If the fifo has 0 bytes, and the read is of size 1, the call to
fifo8_push_all() will copy off boundary data.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agoivhsmem: read do not accept more than sizeof(long)
Marc-André Lureau [Fri, 19 Jun 2015 11:00:32 +0000 (13:00 +0200)]
ivhsmem: read do not accept more than sizeof(long)

ivshmem_read() only reads sizeof(long) from the input buffer.  Accepting
more could lead to fifo8 abort() on 32bit systems if fifo is not empty.

A following patch will change the protocol to 64-bit little-endian
instead.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agomsix: add VMSTATE_MSIX_TEST
Marc-André Lureau [Thu, 18 Jun 2015 12:05:13 +0000 (14:05 +0200)]
msix: add VMSTATE_MSIX_TEST

ivshmem is going to use MSIX state conditionally.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agochar: add qemu_chr_free()
Marc-André Lureau [Mon, 22 Jun 2015 16:20:18 +0000 (18:20 +0200)]
char: add qemu_chr_free()

If a chardev is allowed to be created outside of QMP, then it must be
also possible to free it. This is useful for ivshmem that creates
chardev anonymously and must be able to free them.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
8 years agotests: Add ivshmem qtest
Andreas Färber [Sat, 10 Oct 2015 22:18:32 +0000 (00:18 +0200)]
tests: Add ivshmem qtest

Note that it launches two instances, as sharing memory is the purpose of
ivshmem.

Cc: Cam Macdonell <cam@cs.ualberta.ca>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[ Remove Nahanni codename, add test to pci set - Marc-André ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
8 years agoconfig: enable ivshmem on POSIX
Marc-André Lureau [Mon, 12 Oct 2015 13:25:55 +0000 (15:25 +0200)]
config: enable ivshmem on POSIX

ivshmem doesn't actually require kvm, so enable it when POSIX is
enabled. (it is required however when ioeventfd is enabled)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Peter Maydell [Fri, 23 Oct 2015 17:14:42 +0000 (18:14 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Fri 23 Oct 2015 17:59:56 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (37 commits)
  tests: Add test case for aio_disable_external
  block: Add "drained begin/end" for internal snapshot
  block: Add "drained begin/end" for transactional blockdev-backup
  block: Add "drained begin/end" for transactional backup
  block: Add "drained begin/end" for transactional external snapshot
  block: Introduce "drained begin/end" API
  aio: introduce aio_{disable,enable}_external
  dataplane: Mark host notifiers' client type as "external"
  nbd: Mark fd handlers client type as "external"
  aio: Add "is_external" flag for event handlers
  throttle: Remove throttle_group_lock/unlock()
  blockdev: Allow more options for BB-less BDS tree
  blockdev: Pull out blockdev option extraction
  blockdev: Do not create BDS for empty drive
  block: Prepare for NULL BDS
  block: Add blk_insert_bs()
  block: Prepare remaining BB functions for NULL BDS
  block: Fail requests to empty BlockBackend
  block: Make some BB functions fall back to BBRS
  block: Add BlockBackendRootState
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotests: Add test case for aio_disable_external
Fam Zheng [Fri, 23 Oct 2015 03:08:14 +0000 (11:08 +0800)]
tests: Add test case for aio_disable_external

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Add "drained begin/end" for internal snapshot
Fam Zheng [Fri, 23 Oct 2015 03:08:13 +0000 (11:08 +0800)]
block: Add "drained begin/end" for internal snapshot

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

state->bs is assigned right after bdrv_drained_begin. Because it was
used as the flag for deletion or not in abort, now we need a separate
flag - InternalSnapshotState.created.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Add "drained begin/end" for transactional blockdev-backup
Fam Zheng [Fri, 23 Oct 2015 03:08:12 +0000 (11:08 +0800)]
block: Add "drained begin/end" for transactional blockdev-backup

Similar to the previous patch, make sure that external events are not
dispatched during transaction operations.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Add "drained begin/end" for transactional backup
Fam Zheng [Fri, 23 Oct 2015 03:08:11 +0000 (11:08 +0800)]
block: Add "drained begin/end" for transactional backup

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Move the assignment to state->bs up right after bdrv_drained_begin, so
that we can use it in the clean callback. The abort callback will still
check bs->job and state->job, so it's OK.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Add "drained begin/end" for transactional external snapshot
Fam Zheng [Fri, 23 Oct 2015 03:08:10 +0000 (11:08 +0800)]
block: Add "drained begin/end" for transactional external snapshot

This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Introduce "drained begin/end" API
Fam Zheng [Fri, 23 Oct 2015 03:08:09 +0000 (11:08 +0800)]
block: Introduce "drained begin/end" API

The semantics is that after bdrv_drained_begin(bs), bs will not get new external
requests until the matching bdrv_drained_end(bs).

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoaio: introduce aio_{disable,enable}_external
Fam Zheng [Fri, 23 Oct 2015 03:08:08 +0000 (11:08 +0800)]
aio: introduce aio_{disable,enable}_external

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agodataplane: Mark host notifiers' client type as "external"
Fam Zheng [Fri, 23 Oct 2015 03:08:07 +0000 (11:08 +0800)]
dataplane: Mark host notifiers' client type as "external"

They will be excluded by type in the nested event loops in block layer,
so that unwanted events won't be processed there.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agonbd: Mark fd handlers client type as "external"
Fam Zheng [Fri, 23 Oct 2015 03:08:06 +0000 (11:08 +0800)]
nbd: Mark fd handlers client type as "external"

So we could distinguish it from internal used fds, thus avoid handling
unwanted events in nested aio polls.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoaio: Add "is_external" flag for event handlers
Fam Zheng [Fri, 23 Oct 2015 03:08:05 +0000 (11:08 +0800)]
aio: Add "is_external" flag for event handlers

All callers pass in false, and the real external ones will switch to
true in coming patches.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agothrottle: Remove throttle_group_lock/unlock()
Alberto Garcia [Wed, 21 Oct 2015 18:36:05 +0000 (21:36 +0300)]
throttle: Remove throttle_group_lock/unlock()

The group throttling code was always meant to handle its locking
internally. However, bdrv_swap() was touching the ThrottleGroup
structure directly and therefore needed an API for that.

Now that bdrv_swap() no longer exists there's no need for the
throttle_group_lock() API anymore.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblockdev: Allow more options for BB-less BDS tree
Max Reitz [Mon, 19 Oct 2015 15:53:32 +0000 (17:53 +0200)]
blockdev: Allow more options for BB-less BDS tree

Most of the options which blockdev_init() parses for both the
BlockBackend and the root BDS are valid for just the root BDS as well
(e.g. read-only). This patch allows specifying these options even if not
creating a BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblockdev: Pull out blockdev option extraction
Max Reitz [Mon, 19 Oct 2015 15:53:31 +0000 (17:53 +0200)]
blockdev: Pull out blockdev option extraction

Extract some of the blockdev option extraction code from blockdev_init()
into its own function. This simplifies blockdev_init() and will allow
reusing the code in a different function added in a follow-up patch.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblockdev: Do not create BDS for empty drive
Max Reitz [Mon, 19 Oct 2015 15:53:30 +0000 (17:53 +0200)]
blockdev: Do not create BDS for empty drive

Do not use "rudimentary" BDSs for empty drives any longer (for
freshly created drives).

After a follow-up patch, empty drives will generally use a NULL BDS, not
only the freshly created drives.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Prepare for NULL BDS
Max Reitz [Mon, 19 Oct 2015 15:53:29 +0000 (17:53 +0200)]
block: Prepare for NULL BDS

blk_bs() will not necessarily return a non-NULL value any more (unless
blk_is_available() is true or it can be assumed to otherwise, e.g.
because it is called immediately after a successful blk_new_with_bs() or
blk_new_open()).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Add blk_insert_bs()
Max Reitz [Mon, 19 Oct 2015 15:53:28 +0000 (17:53 +0200)]
block: Add blk_insert_bs()

This function associates the given BlockDriverState with the given
BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Prepare remaining BB functions for NULL BDS
Max Reitz [Mon, 19 Oct 2015 15:53:27 +0000 (17:53 +0200)]
block: Prepare remaining BB functions for NULL BDS

There are several BlockBackend functions which, in theory, cannot fail.
This patch makes them cope with the BlockDriverState pointer being NULL
by making them fall back to some default action like ignoring the value
in setters and returning the default in getters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Fail requests to empty BlockBackend
Max Reitz [Mon, 19 Oct 2015 15:53:26 +0000 (17:53 +0200)]
block: Fail requests to empty BlockBackend

If there is no BlockDriverState in a BlockBackend or if the tray of the
guest device is open, fail all requests (where that is possible) with
-ENOMEDIUM.

The reason the status of the guest device is taken into account is
because once the guest device's tray is opened, any request on the same
BlockBackend as the guest uses should fail. If the BDS tree is supposed
to be usable even after ejecting it from the guest, a different
BlockBackend must be used.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Make some BB functions fall back to BBRS
Max Reitz [Mon, 19 Oct 2015 15:53:25 +0000 (17:53 +0200)]
block: Make some BB functions fall back to BBRS

If there is no BDS tree attached to a BlockBackend, functions that can
do so should fall back to the BlockBackendRootState structure.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Add BlockBackendRootState
Max Reitz [Mon, 19 Oct 2015 15:53:24 +0000 (17:53 +0200)]
block: Add BlockBackendRootState

This structure will store some of the state of the root BDS if the BDS
tree is removed, so that state can be restored once a new BDS tree is
inserted.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock/throttle-groups: Make incref/decref public
Max Reitz [Mon, 19 Oct 2015 15:53:23 +0000 (17:53 +0200)]
block/throttle-groups: Make incref/decref public

Throttle groups are not necessarily referenced by BDSs alone; a later
patch will essentially allow BBs to reference them, too. Make the
ref/unref functions public so that reference can be properly accounted
for.

Their interface is slightly adjusted in that they return and take a
ThrottleState pointer, respectively, instead of a ThrottleGroup pointer.
Functionally, they are equivalent, but since ThrottleGroup is not meant
to be used outside of block/throttle-groups.c, ThrottleState is easier
to handle.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Move I/O status and error actions into BB
Max Reitz [Mon, 19 Oct 2015 15:53:22 +0000 (17:53 +0200)]
block: Move I/O status and error actions into BB

These options are only relevant for the user of a whole BDS tree (like a
guest device or a block job) and should thus be moved into the
BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Move BlockAcctStats into BlockBackend
Max Reitz [Mon, 19 Oct 2015 15:53:21 +0000 (17:53 +0200)]
block: Move BlockAcctStats into BlockBackend

As the comment above bdrv_get_stats() says, BlockAcctStats is something
which belongs to the device instead of each BlockDriverState. This patch
therefore moves it into the BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Remove wr_highest_sector from BlockAcctStats
Max Reitz [Mon, 19 Oct 2015 15:53:20 +0000 (17:53 +0200)]
block: Remove wr_highest_sector from BlockAcctStats

BlockAcctStats contains statistics about the data transferred from and
to the device; wr_highest_sector does not fit in with the rest.

Furthermore, those statistics are supposed to be specific for a certain
device and not necessarily for a BDS (see the comment above
bdrv_get_stats()); on the other hand, wr_highest_sector may be a rather
important information to know for each BDS. When BlockAcctStats is
finally removed from the BDS, we will want to keep wr_highest_sector in
the BDS.

Finally, wr_highest_sector is renamed to wr_highest_offset and given the
appropriate meaning. Externally, it is represented as an offset so there
is no point in doing something different internally. Its definition is
changed to match that in qapi/block-core.json which is "the offset after
the greatest byte written to". Doing so should not cause any harm since
if external programs tried to calculate the volume usage by
(wr_highest_offset + 512) / volume_size, after this patch they will just
assume the volume to be full slightly earlier than before.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Move guest_block_size into BlockBackend
Max Reitz [Mon, 19 Oct 2015 15:53:19 +0000 (17:53 +0200)]
block: Move guest_block_size into BlockBackend

guest_block_size is a guest device property so it should be moved into
the interface between block layer and guest devices, which is the
BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Fix BB AIOCB AioContext without BDS
Max Reitz [Mon, 19 Oct 2015 15:53:18 +0000 (17:53 +0200)]
block: Fix BB AIOCB AioContext without BDS

Fix the BlockBackend's AIOCB AioContext for aborting AIO in case there
is no BDS. If there is no implementation of AIOCBInfo::get_aio_context()
the AioContext is derived from the BDS the AIOCB belongs to. If that BDS
is NULL (because it has been removed from the BB) this will not work.

This patch makes blk_get_aio_context() fall back to the main loop
context if the BDS pointer is NULL and implements
AIOCBInfo::get_aio_context() (blk_aiocb_get_aio_context()) which invokes
blk_get_aio_context().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agohw/usb-storage: Check whether BB is inserted
Max Reitz [Mon, 19 Oct 2015 15:53:17 +0000 (17:53 +0200)]
hw/usb-storage: Check whether BB is inserted

Only call bdrv_add_key() on the BlockDriverState if it is not NULL.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agohw/block/fdc: Implement tray status
Max Reitz [Mon, 19 Oct 2015 15:53:16 +0000 (17:53 +0200)]
hw/block/fdc: Implement tray status

The tray of an FDD is open iff there is no medium inserted (there are
only two states for an FDD: "medium inserted" or "no medium inserted").

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock: Invoke change media CB before NULLing drv
Max Reitz [Mon, 19 Oct 2015 15:53:15 +0000 (17:53 +0200)]
block: Invoke change media CB before NULLing drv

In order to handle host device passthrough, some guest device models
may call blk_is_inserted() to check whether the medium is inserted on
the host, when checking the guest tray status.

This tray status is inquired by blk_dev_change_media_cb(); because
bdrv_is_inserted() (invoked by blk_is_inserted()) always returns false
for BDS with drv set to NULL, blk_dev_change_media_cb() should therefore
be called before drv is set to NULL.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoblock/raw_bsd: Drop raw_is_inserted()
Max Reitz [Mon, 19 Oct 2015 15:53:14 +0000 (17:53 +0200)]
block/raw_bsd: Drop raw_is_inserted()

With the new automatically-recursive implementation of
bdrv_is_inserted() checking by default whether all the children of a BDS
are inserted, we can drop raw's own implementation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>