sdk/emulator/qemu.git
10 years agoqemu-options: add missing -drive discard option to cmdline help
Peter Lieven [Mon, 28 Jul 2014 19:53:02 +0000 (21:53 +0200)]
qemu-options: add missing -drive discard option to cmdline help

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoparallels: 2TB+ parallels images support
Denis V. Lunev [Mon, 28 Jul 2014 16:23:55 +0000 (20:23 +0400)]
parallels: 2TB+ parallels images support

Parallels has released in the recent updates of Parallels Server 5/6
new addition to his image format. Images with signature WithouFreSpacExt
have offsets in the catalog coded not as offsets in sectors (multiple
of 512 bytes) but offsets coded in blocks (i.e. header->tracks * 512)

In this case all 64 bits of header->nb_sectors are used for image size.

This patch implements support of this for qemu-img and also adds specific
check for an incorrect image. Images with block size greater than
INT_MAX/513 are not supported. The biggest available Parallels image
cluster size in the field is 1 Mb. Thus this limit will not hurt
anyone.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoparallels: split check for parallels format in parallels_open
Denis V. Lunev [Mon, 28 Jul 2014 16:23:54 +0000 (20:23 +0400)]
parallels: split check for parallels format in parallels_open

and rework error path a bit. There is no difference at the moment, but
the code will be definitely shorter when additional processing will
be required for WithouFreSpacExt

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoparallels: replace tabs with spaces in block/parallels.c
Denis V. Lunev [Mon, 28 Jul 2014 16:23:53 +0000 (20:23 +0400)]
parallels: replace tabs with spaces in block/parallels.c

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoparallels: extend parallels format header with actual data values
Denis V. Lunev [Mon, 28 Jul 2014 16:23:52 +0000 (20:23 +0400)]
parallels: extend parallels format header with actual data values

Parallels image format has several additional fields inside:
- nb_sectors is actually 64 bit wide. Upper 32bits are not used for
  images with signature "WithoutFreeSpace" and must be explicitly
  zeroed according to Parallels. They will be used for images with
  signature "WithouFreSpacExt"
- inuse is magic which means that the image is currently opened for
  read/write or was not closed correctly, the magic is 0x746f6e59
- data_off is the location of the first data block. It can be zero
  and in this case data starts just beyond the header aligned to
  512 bytes. Though this field does not matter for read-only driver

This patch adds these values to struct parallels_header and adds
proper handling of nb_sectors for currently supported WithoutFreeSpace
images.

WithouFreSpacExt will be covered in next patches.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Jeff Cody <jcody@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agodataplane: stop trying on notifier error
Cornelia Huck [Fri, 25 Jul 2014 12:10:48 +0000 (14:10 +0200)]
dataplane: stop trying on notifier error

If we fail to set up guest or host notifiers, there's no use trying again
every time the guest kicks, so disable dataplane in that case.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agodataplane: fail notifier setting gracefully
Cornelia Huck [Fri, 25 Jul 2014 12:10:47 +0000 (14:10 +0200)]
dataplane: fail notifier setting gracefully

The dataplane code is currently doing a hard exit if it fails to set
up either guest or host notifiers. In practice, this may mean that a
guest suddenly dies after a dataplane device failed to come up (e.g.,
when a file descriptor limit is hit for tne nth device).

Let's just try to unwind the setup instead and return.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agodataplane: print why starting failed
Cornelia Huck [Fri, 25 Jul 2014 12:10:46 +0000 (14:10 +0200)]
dataplane: print why starting failed

Setting up guest or host notifiers may fail, but the user will have
no idea why: Let's print the error returned by the callback.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agochannel-posix: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Gonglei [Mon, 11 Aug 2014 09:34:21 +0000 (17:34 +0800)]
channel-posix: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)

Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.

Using the qemu_set_nonblock() wrapper.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Wangxin <wangxinxin.wang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqemu-char: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Gonglei [Mon, 11 Aug 2014 09:34:20 +0000 (17:34 +0800)]
qemu-char: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)

Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.

Using the qemu_set_nonblock() wrapper.

Signed-off-by: Wangxin <wangxinxin.wang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agocmd646: synchronise UDMA interrupt status with DMA interrupt status
Mark Cave-Ayland [Fri, 8 Aug 2014 16:23:36 +0000 (17:23 +0100)]
cmd646: synchronise UDMA interrupt status with DMA interrupt status

Make sure that both registers are synchronised when being accessed through
PCI configuration space.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agocmd646: allow MRDMODE interrupt status bits clearing from PCI config space
Mark Cave-Ayland [Fri, 8 Aug 2014 16:23:35 +0000 (17:23 +0100)]
cmd646: allow MRDMODE interrupt status bits clearing from PCI config space

Make sure that we also update the normal DMA interrupt status bits at the
same time, and alter the IRQ if being cleared accordingly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agocmd646: switch cmd646_update_irq() to accept PCIDevice instead of PCIIDEState
Mark Cave-Ayland [Fri, 8 Aug 2014 16:23:34 +0000 (17:23 +0100)]
cmd646: switch cmd646_update_irq() to accept PCIDevice instead of PCIIDEState

This is in preparation for adding configuration space accessors which accept
PCIDevice as a parameter.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agocmd646: synchronise DMA interrupt status with UDMA interrupt status
Mark Cave-Ayland [Fri, 8 Aug 2014 16:23:33 +0000 (17:23 +0100)]
cmd646: synchronise DMA interrupt status with UDMA interrupt status

Make sure that the standard DMA interrupt status bits reflect any changes made
to the UDMA interrupt status bits. The CMD646U2 datasheet claims that these
bits are equivalent, and they must be synchronised for guests that manipulate
both registers.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agocmd646: add constants for CNTRL register access
Mark Cave-Ayland [Fri, 8 Aug 2014 16:23:32 +0000 (17:23 +0100)]
cmd646: add constants for CNTRL register access

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqtest/ide: Fix small memory leak
John Snow [Mon, 4 Aug 2014 21:11:25 +0000 (17:11 -0400)]
qtest/ide: Fix small memory leak

For libqos debugging purposes, it's nice to
be able to assert that tests and associated libraries
have no memory leaks. To that end, free up the
trivial cmdline leak.

The remaining leaks caused by pc_alloc_init are fixed
instead by my first-fit pc_alloc implementation already
on the qemu-devel mailing list.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agolibqos: allow qpci_iomap to return BAR mapping size
John Snow [Mon, 4 Aug 2014 21:11:24 +0000 (17:11 -0400)]
libqos: allow qpci_iomap to return BAR mapping size

This patch allows qpci_iomap to return the size of the
BAR mapping that it created, to allow driver applications
(e.g, ahci-test) to make determinations about the suitability
or the mapping size, or in the specific case of AHCI, how
many ports are supported by the HBA.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agolibqos: Fixes a small memory leak.
John Snow [Mon, 4 Aug 2014 21:11:23 +0000 (17:11 -0400)]
libqos: Fixes a small memory leak.

Allow users the chance to clean up the QPCIBusPC structure
by adding a small cleanup routine. Helps clear up small
memory leaks during setup/teardown, to allow for cleaner
debug output messages.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agolibqtest: Correct small memory leak.
John Snow [Mon, 4 Aug 2014 21:11:22 +0000 (17:11 -0400)]
libqtest: Correct small memory leak.

Fixes a small memory leak inside of libqtest.
After we produce a test path and glib copies the string
for itself, we should clean up our temporary copy.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agolibqos: Correct memory leak
John Snow [Mon, 4 Aug 2014 21:11:21 +0000 (17:11 -0400)]
libqos: Correct memory leak

Fix a small memory leak inside of libqos, in the pc_alloc_init routine.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqtest: Adding qtest_memset and qmemset.
John Snow [Mon, 4 Aug 2014 21:11:20 +0000 (17:11 -0400)]
qtest: Adding qtest_memset and qmemset.

Currently, libqtest allows for memread and memwrite, but
does not offer a simple way to zero out regions of memory.
This patch adds a simple function to do so.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoq35: Enable the ioapic device to be seen by qtest.
John Snow [Mon, 4 Aug 2014 21:11:19 +0000 (17:11 -0400)]
q35: Enable the ioapic device to be seen by qtest.

Currently, the ioapic device can not be found in a qtest environment
when requesting "irq_interrupt_in ioapic" via the qtest socket.

By mirroring how the ioapic is added in i44ofx (hw/i440/pc_piix.c),
as a child of "q35," the device is able to be seen by qtest.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoahci: construct PIO Setup FIS for PIO commands
Paolo Bonzini [Mon, 4 Aug 2014 21:11:18 +0000 (17:11 -0400)]
ahci: construct PIO Setup FIS for PIO commands

PIO commands should put a PIO Setup FIS in the receive area when data
transfer ends.  Currently QEMU does not do this and only places the
D2H FIS at the end of the operation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: make all commands go through cmd_done
Paolo Bonzini [Mon, 4 Aug 2014 21:11:17 +0000 (17:11 -0400)]
ide: make all commands go through cmd_done

AHCI has code to fill in the D2H FIS trigger the IRQ all over the place.
Centralize this in a single cmd_done callback by generalizing the existing
async_cmd_done callback.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: stop PIO transfer on errors
Paolo Bonzini [Mon, 4 Aug 2014 21:11:16 +0000 (17:11 -0400)]
ide: stop PIO transfer on errors

This will provide a hook for sending the result of the command via the
FIS receive area.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoahci: remove duplicate PORT_IRQ_* constants
Paolo Bonzini [Mon, 4 Aug 2014 21:11:15 +0000 (17:11 -0400)]
ahci: remove duplicate PORT_IRQ_* constants

These are defined twice, just use one set consistently.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: move retry constants out of BM_STATUS_* namespace
Paolo Bonzini [Mon, 4 Aug 2014 21:11:14 +0000 (17:11 -0400)]
ide: move retry constants out of BM_STATUS_* namespace

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: move BM_STATUS bits to pci.[ch]
Paolo Bonzini [Mon, 4 Aug 2014 21:11:13 +0000 (17:11 -0400)]
ide: move BM_STATUS bits to pci.[ch]

They are not used by AHCI, and should not be even available there.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: fold add_status callback into set_inactive
Paolo Bonzini [Mon, 4 Aug 2014 21:11:12 +0000 (17:11 -0400)]
ide: fold add_status callback into set_inactive

It is now called only after the set_inactive callback.  Put the two together.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: remove wrong setting of BM_STATUS_INT
Paolo Bonzini [Mon, 4 Aug 2014 21:11:11 +0000 (17:11 -0400)]
ide: remove wrong setting of BM_STATUS_INT

Similar to the case removed in commit 69c38b8 (ide/core: Remove explicit
setting of BM_STATUS_INT, 2011-05-19), the only remaining use of
add_status(..., BM_STATUS_INT) is for short PRDs.  The flag should
not be raised in this case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: wrap start_dma callback
Paolo Bonzini [Mon, 4 Aug 2014 21:11:10 +0000 (17:11 -0400)]
ide: wrap start_dma callback

Make it optional and prepare for the next patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: simplify start_transfer callbacks
Paolo Bonzini [Mon, 4 Aug 2014 21:11:09 +0000 (17:11 -0400)]
ide: simplify start_transfer callbacks

Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: simplify async_cmd_done callbacks
Paolo Bonzini [Mon, 4 Aug 2014 21:11:08 +0000 (17:11 -0400)]
ide: simplify async_cmd_done callbacks

Drop the unused return value.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: simplify set_inactive callbacks
Paolo Bonzini [Mon, 4 Aug 2014 21:11:07 +0000 (17:11 -0400)]
ide: simplify set_inactive callbacks

Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: simplify reset callbacks
Paolo Bonzini [Mon, 4 Aug 2014 21:11:06 +0000 (17:11 -0400)]
ide: simplify reset callbacks

Drop the unused return value and make the callback optional.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide: stash aiocb for flushes
Paolo Bonzini [Mon, 4 Aug 2014 21:11:05 +0000 (17:11 -0400)]
ide: stash aiocb for flushes

This ensures that operations are completed after a reset

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoide-test: add test for werror=stop
Paolo Bonzini [Mon, 4 Aug 2014 21:11:04 +0000 (17:11 -0400)]
ide-test: add test for werror=stop

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agolibqtest: add QTEST_LOG for debugging qtest testcases
Paolo Bonzini [Mon, 4 Aug 2014 21:11:03 +0000 (17:11 -0400)]
libqtest: add QTEST_LOG for debugging qtest testcases

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblkdebug: report errors on flush too
Paolo Bonzini [Mon, 4 Aug 2014 21:11:02 +0000 (17:11 -0400)]
blkdebug: report errors on flush too

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoMerge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Peter Maydell [Fri, 15 Aug 2014 15:37:17 +0000 (16:37 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

Tracing pull request

* remotes/stefanha/tags/tracing-pull-request:
  virtio-rng: add some trace events
  trace: add some tcg tracing support
  trace: teach lttng backend to use format strings
  trace: [tcg] Include TCG-tracing header on all targets
  trace: [tcg] Include event definitions in "trace.h"
  trace: [tcg] Generate TCG tracing routines
  trace: [tcg] Include TCG-tracing helpers
  trace: [tcg] Define TCG tracing helper routine wrappers
  trace: [tcg] Define TCG tracing helper routines
  trace: [tcg] Declare TCG tracing helper routines
  trace: [tcg] Add 'tcg' event property
  trace: [tcg] Argument type transformation machinery
  trace: [tcg] Argument type transformation rules
  trace: [tcg] Add documentation
  trace: install simpletrace SystemTap tapset
  simpletrace: add simpletrace.py --no-header option
  trace: add tracetool simpletrace_stap format
  trace: extract stap_escape() function for reuse

Conflicts:
Makefile.objs

10 years agoMerge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Peter Maydell [Fri, 15 Aug 2014 13:49:50 +0000 (14:49 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block patches

# gpg: Signature made Fri 15 Aug 2014 14:07:42 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (59 commits)
  block: Catch !bs->drv in bdrv_check()
  iotests: Add test for image header overlap
  qcow2: Catch !*host_offset for data allocation
  qcow2: Return useful error code in refcount_init()
  mirror: Handle failure for potentially large allocations
  vpc: Handle failure for potentially large allocations
  vmdk: Handle failure for potentially large allocations
  vhdx: Handle failure for potentially large allocations
  vdi: Handle failure for potentially large allocations
  rbd: Handle failure for potentially large allocations
  raw-win32: Handle failure for potentially large allocations
  raw-posix: Handle failure for potentially large allocations
  qed: Handle failure for potentially large allocations
  qcow2: Handle failure for potentially large allocations
  qcow1: Handle failure for potentially large allocations
  parallels: Handle failure for potentially large allocations
  nfs: Handle failure for potentially large allocations
  iscsi: Handle failure for potentially large allocations
  dmg: Handle failure for potentially large allocations
  curl: Handle failure for potentially large allocations
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
10 years agoblock: Catch !bs->drv in bdrv_check()
Max Reitz [Thu, 7 Aug 2014 20:47:55 +0000 (22:47 +0200)]
block: Catch !bs->drv in bdrv_check()

qemu-img check calls bdrv_check() twice if the first run repaired some
inconsistencies. If the first run however again triggered corruption
prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoiotests: Add test for image header overlap
Max Reitz [Thu, 7 Aug 2014 20:47:54 +0000 (22:47 +0200)]
iotests: Add test for image header overlap

Add a test for an image with an unallocated image header; instead of an
assertion, this should result in the image being marked corrupt.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqcow2: Catch !*host_offset for data allocation
Max Reitz [Thu, 7 Aug 2014 20:47:53 +0000 (22:47 +0200)]
qcow2: Catch !*host_offset for data allocation

qcow2_alloc_cluster_offset() uses host_offset == 0 as "no preferred
offset" for the (data) cluster range to be allocated. However, this
offset is actually valid and may be allocated on images with a corrupted
refcount table or first refcount block.

In this case, the corruption prevention should normally catch that
write anyway (because it would overwrite the image header). But since 0
is a special value here, the function assumes that nothing has been
allocated at all which it asserts against.

Because this condition is not qemu's fault but rather that of a broken
image, it shouldn't throw an assertion but rather mark the image corrupt
and show an appropriate message, which this patch does by calling the
corruption check earlier than it would be called normally (before the
assertion).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqcow2: Return useful error code in refcount_init()
Max Reitz [Wed, 28 May 2014 22:19:54 +0000 (00:19 +0200)]
qcow2: Return useful error code in refcount_init()

If bdrv_pread() returns an error, it is very unlikely that it was
ENOMEM. In this case, the return value should be passed along; as
bdrv_pread() will always either return the number of bytes read or a
negative value (the error code), the condition for checking whether
bdrv_pread() failed can be simplified (and clarified) as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agomirror: Handle failure for potentially large allocations
Kevin Wolf [Wed, 21 May 2014 16:16:21 +0000 (18:16 +0200)]
mirror: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the mirror block job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agovpc: Handle failure for potentially large allocations
Kevin Wolf [Wed, 21 May 2014 16:08:38 +0000 (18:08 +0200)]
vpc: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vpc block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agovmdk: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:56:27 +0000 (13:56 +0200)]
vmdk: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vmdk block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agovhdx: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:55:50 +0000 (13:55 +0200)]
vhdx: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vhdx block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agovdi: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:25:43 +0000 (13:25 +0200)]
vdi: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the vdi block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agorbd: Handle failure for potentially large allocations
Kevin Wolf [Wed, 21 May 2014 16:11:48 +0000 (18:11 +0200)]
rbd: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the rbd block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoraw-win32: Handle failure for potentially large allocations
Kevin Wolf [Wed, 21 May 2014 16:05:47 +0000 (18:05 +0200)]
raw-win32: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the raw-win32 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoraw-posix: Handle failure for potentially large allocations
Kevin Wolf [Wed, 21 May 2014 16:02:42 +0000 (18:02 +0200)]
raw-posix: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the raw-posix block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqed: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:39:57 +0000 (13:39 +0200)]
qed: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qed block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoqcow2: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 15:12:47 +0000 (17:12 +0200)]
qcow2: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow2 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqcow1: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:36:05 +0000 (13:36 +0200)]
qcow1: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow1 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoparallels: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:32:14 +0000 (13:32 +0200)]
parallels: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the parallels block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agonfs: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:31:20 +0000 (13:31 +0200)]
nfs: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the nfs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoiscsi: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:30:49 +0000 (13:30 +0200)]
iscsi: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the iscsi block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
10 years agodmg: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:28:14 +0000 (13:28 +0200)]
dmg: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the dmg block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agocurl: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:26:40 +0000 (13:26 +0200)]
curl: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the curl block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agocloop: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:22:38 +0000 (13:22 +0200)]
cloop: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the cloop block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agobochs: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:21:26 +0000 (13:21 +0200)]
bochs: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the bochs block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Handle failure for potentially large allocations
Kevin Wolf [Tue, 20 May 2014 11:16:51 +0000 (13:16 +0200)]
block: Handle failure for potentially large allocations

Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses bounce buffer allocations in block.c. While at it,
convert bdrv_commit() from plain g_malloc() to qemu_try_blockalign().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Introduce qemu_try_blockalign()
Kevin Wolf [Tue, 20 May 2014 10:24:05 +0000 (12:24 +0200)]
block: Introduce qemu_try_blockalign()

This function returns NULL instead of aborting when an allocation fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: iotest - update 084 to test static VDI image creation
Jeff Cody [Wed, 23 Jul 2014 21:23:01 +0000 (17:23 -0400)]
block: iotest - update 084 to test static VDI image creation

This updates the VDI corruption test to also test static VDI image
creation, as well as the default dynamic image creation.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: vpc - use block layer ops in vpc_create, instead of posix calls
Jeff Cody [Wed, 23 Jul 2014 21:23:00 +0000 (17:23 -0400)]
block: vpc - use block layer ops in vpc_create, instead of posix calls

Use the block layer to create, and write to, the image file in the VPC
.bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: use the standard 'ret' instead of 'result'
Jeff Cody [Wed, 23 Jul 2014 21:22:59 +0000 (17:22 -0400)]
block: use the standard 'ret' instead of 'result'

Most QEMU code uses 'ret' for function return values. The VDI driver
uses a mix of 'result' and 'ret'.  This cleans that up, switching over
to the standard 'ret' usage.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: vdi - use block layer ops in vdi_create, instead of posix calls
Jeff Cody [Wed, 23 Jul 2014 21:22:58 +0000 (17:22 -0400)]
block: vdi - use block layer ops in vdi_create, instead of posix calls

Use the block layer to create, and write to, the image file in the
VDI .bdrv_create() operation.

This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.

Also some minor cleanup for error handling.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: allow bdrv_unref() to be passed NULL pointers
Jeff Cody [Wed, 23 Jul 2014 21:22:57 +0000 (17:22 -0400)]
block: allow bdrv_unref() to be passed NULL pointers

If bdrv_unref() is passed a NULL BDS pointer, it is safe to
exit with no operation.  This will allow cleanup code to blindly
call bdrv_unref() on a BDS that has been initialized to NULL.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agotest-coroutine: add baseline test that times the cost of function calls
Paolo Bonzini [Wed, 6 Aug 2014 09:33:41 +0000 (11:33 +0200)]
test-coroutine: add baseline test that times the cost of function calls

This can be used to compute the cost of coroutine operations.  In the
end the cost of the function call is a few clock cycles, so it's pretty
cheap for now, but it may become more relevant as the coroutine code
is optimized.

For example, here are the results on my machine:

   Function call 100000000 iterations: 0.173884 s
   Yield 100000000 iterations: 8.445064 s
   Lifecycle 1000000 iterations: 0.098445 s
   Nesting 10000 iterations of 1000 depth each: 7.406431 s

One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
one coroutine allocation takes (roughly, since some of the allocations
in the nesting test do hit the pool) 739 nanoseconds:

   (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
   (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
   (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: VHDX endian fixes
Jeff Cody [Wed, 6 Aug 2014 19:54:58 +0000 (15:54 -0400)]
block: VHDX endian fixes

This patch contains several changes for endian conversion fixes for
VHDX, particularly for big-endian machines (multibyte values in VHDX are
all on disk in LE format).

Tests were done with existing qemu-iotests on an IBM POWER7 (8406-71Y).
This includes sample images created by Hyper-V, both with dirty logs and
without.

In addition, VHDX image files created (and written to) on a BE machine
were tested on a LE machine, and vice-versa.

Reported-by: Markus Armburster <armbru@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: vhdx - add error check
Jeff Cody [Wed, 6 Aug 2014 19:54:57 +0000 (15:54 -0400)]
block: vhdx - add error check

This add an error check for an invalid descriptor entry signature,
when flushing the log descriptor entries.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agothread-pool: avoid deadlock in nested aio_poll() calls
Stefan Hajnoczi [Tue, 15 Jul 2014 14:44:26 +0000 (16:44 +0200)]
thread-pool: avoid deadlock in nested aio_poll() calls

The thread pool has a race condition if two elements complete before
thread_pool_completion_bh() runs:

  If element A's callback waits for element B using aio_poll() it will
  deadlock since pool->completion_bh is not marked scheduled when the
  nested aio_poll() runs.

Fix this by marking the BH scheduled while thread_pool_completion_bh()
is executing.  This way any nested aio_poll() loops will enter
thread_pool_completion_bh() and complete the remaining elements.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agothread-pool: avoid per-thread-pool EventNotifier
Stefan Hajnoczi [Tue, 15 Jul 2014 14:44:25 +0000 (16:44 +0200)]
thread-pool: avoid per-thread-pool EventNotifier

EventNotifier is implemented using an eventfd or pipe.  It therefore
consumes file descriptors, which can be limited by rlimits and should
therefore be used sparingly.

Switch from EventNotifier to QEMUBH in thread-pool.c.  Originally
EventNotifier was used because qemu_bh_schedule() was not thread-safe
yet.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: bump coroutine pool size for drives
Stefan Hajnoczi [Mon, 7 Jul 2014 13:15:53 +0000 (15:15 +0200)]
block: bump coroutine pool size for drives

When a BlockDriverState is associated with a storage controller
DeviceState we expect guest I/O.  Use this opportunity to bump the
coroutine pool size by 64.

This patch ensures that the coroutine pool size scales with the number
of drives attached to the guest.  It should increase coroutine pool
usage (which makes qemu_coroutine_create() fast) without hogging too
much memory when fewer drives are attached.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
10 years agocoroutine: make pool size dynamic
Stefan Hajnoczi [Mon, 7 Jul 2014 13:15:52 +0000 (15:15 +0200)]
coroutine: make pool size dynamic

Allow coroutine users to adjust the pool size.  For example, if the
guest has multiple emulated disk drives we should keep around more
coroutines.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
10 years agoqemu-iotests: add support for Archipelago protocol
Chrysostomos Nanakos [Wed, 23 Jul 2014 14:07:33 +0000 (17:07 +0300)]
qemu-iotests: add support for Archipelago protocol

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoQMP: Add support for Archipelago
Chrysostomos Nanakos [Wed, 30 Jul 2014 17:59:09 +0000 (20:59 +0300)]
QMP: Add support for Archipelago

Introduce new enum BlockdevOptionsArchipelago.

@volume:              #Name of the Archipelago volume image

@mport:               #'mport' is the port number on which mapperd is
                      listening. This is optional and if not specified,
                      QEMU will make Archipelago to use the default port.

@vport:               #'vport' is the port number on which vlmcd is
                      listening. This is optional and if not specified,
                      QEMU will make Archipelago to use the default port.

@segment:             #optional The name of the shared memory segment
                      Archipelago stack is using. This is optional
                      and if not specified, QEMU will make Archipelago
                      use the default value, 'archipelago'.

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock/archipelago: Add support for creating images
Chrysostomos Nanakos [Wed, 23 Jul 2014 14:07:31 +0000 (17:07 +0300)]
block/archipelago: Add support for creating images

qemu-img archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>]
 [:segment=<segment_name>]] [size]

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock/archipelago: Implement bdrv_parse_filename()
Chrysostomos Nanakos [Wed, 23 Jul 2014 14:07:30 +0000 (17:07 +0300)]
block/archipelago: Implement bdrv_parse_filename()

VM Image on Archipelago volume can also be specified like this:

file=archipelago:<volumename>[/mport=<mapperd_port>[:vport=<vlmcd_port>][:
segment=<segment_name>]]

Examples:

file=archipelago:my_vm_volume
file=archipelago:my_vm_volume/mport=123
file=archipelago:my_vm_volume/mport=123:vport=1234
file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Support Archipelago as a QEMU block backend
Chrysostomos Nanakos [Mon, 4 Aug 2014 14:35:32 +0000 (17:35 +0300)]
block: Support Archipelago as a QEMU block backend

VM Image on Archipelago volume is specified like this:

file.driver=archipelago,file.volume=<volumename>[,file.mport=<mapperd_port>[,
file.vport=<vlmcd_port>][,file.segment=<segment_name>]]

'archipelago' is the protocol.

'mport' is the port number on which mapperd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'vport' is the port number on which vlmcd is listening. This is optional
and if not specified, QEMU will make Archipelago to use the default port.

'segment' is the name of the shared memory segment Archipelago stack is using.
This is optional and if not specified, QEMU will make Archipelago to use the
default value, 'archipelago'.

Examples:

file.driver=archipelago,file.volume=my_vm_volume
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234,file.segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqemu-img info: show nocow info
Chunyan Liu [Wed, 30 Jul 2014 02:55:06 +0000 (10:55 +0800)]
qemu-img info: show nocow info

Add nocow info in 'qemu-img info' output to show whether the file
currently has NOCOW flag set or not.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agovmdk: Optimize cluster allocation
Fam Zheng [Wed, 30 Jul 2014 06:39:10 +0000 (14:39 +0800)]
vmdk: Optimize cluster allocation

This drops the unnecessary bdrv_truncate() from, and also improves,
cluster allocation code path.

Before, when we need a new cluster, get_cluster_offset truncates the
image to bdrv_getlength() + cluster_size, and returns the offset of
added area, i.e. the image length before truncating.

This is not efficient, so it's now rewritten as:

  - Save the extent file length when opening.

  - When allocating cluster, use the saved length as cluster offset.

  - Don't truncate image, because we'll anyway write data there: just
    write any data at the EOF position, in descending priority:

    * New user data (cluster allocation happens in a write request).

    * Filling data in the beginning and/or ending of the new cluster, if
      not covered by user data: either backing file content (COW), or
      zero for standalone images.

One major benifit of this change is, on host mounted NFS images, even
over a fast network, ftruncate is slow (see the example below). This
change significantly speeds up cluster allocation. Comparing by
converting a cirros image (296M) to VMDK on an NFS mount point, over
1Gbe LAN:

    $ time qemu-img convert cirros-0.3.1.img /mnt/a.raw -O vmdk

    Before:
        real    0m21.796s
        user    0m0.130s
        sys     0m0.483s

    After:
        real    0m2.017s
        user    0m0.047s
        sys     0m0.190s

We also get rid of unchecked bdrv_getlength() and bdrv_truncate(), and
get a little more documentation in function comments.

Tested that this passes qemu-iotests for all VMDK subformats.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqemu-iotests: Add data pattern in version3 VMDK sample image in 059
Fam Zheng [Wed, 30 Jul 2014 06:39:09 +0000 (14:39 +0800)]
qemu-iotests: Add data pattern in version3 VMDK sample image in 059

It's possible that we diverge from the specification with our
implementation.  Having a reference image in the test cases may detect
such problems when we introduce a bug that can read what it creates, but
can't handle a real VMDK.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqdev-monitor: include QOM properties in -device FOO, help output
Stefan Hajnoczi [Wed, 9 Jul 2014 12:01:32 +0000 (14:01 +0200)]
qdev-monitor: include QOM properties in -device FOO, help output

Update -device FOO,help to include QOM properties in addition to qdev
properties.  Devices are gradually adding more QOM properties that are
not reflected as qdev properties.

It is important to report all device properties since management tools
like libvirt use this information (and device-list-properties QMP) to
detect the presence of QEMU features.

This patch reuses the device-list-properties QMP machinery to avoid code
duplication.

Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
10 years agoqmp: hide "hotplugged" device property from device-list-properties
Stefan Hajnoczi [Wed, 9 Jul 2014 12:01:31 +0000 (14:01 +0200)]
qmp: hide "hotplugged" device property from device-list-properties

The "hotplugged" device property was not reported before commit
f4eb32b590bf58c1c67570775eb78beb09964fad ("qmp: show QOM properties in
device-list-properties").  Fix this difference.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
10 years agodocs/multiple-iothreads.txt: add documentation on IOThread programming
Stefan Hajnoczi [Wed, 23 Jul 2014 11:55:32 +0000 (12:55 +0100)]
docs/multiple-iothreads.txt: add documentation on IOThread programming

This document explains how IOThreads and the main loop are related,
especially how to write code that can run in an IOThread.  Currently
only virtio-blk-data-plane uses these techniques.  The next obvious
target is virtio-scsi; there has also been work on virtio-net.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
10 years agoxen_disk: fix possible null-ptr dereference
Gonglei (Arei) [Mon, 28 Jul 2014 06:03:45 +0000 (06:03 +0000)]
xen_disk:  fix possible null-ptr dereference

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoconfigure: explicitly state version requirements to devel packages
Hu Tao [Thu, 26 Jun 2014 09:34:50 +0000 (17:34 +0800)]
configure: explicitly state version requirements to devel packages

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agodocs: Make the recommendation for the backing file name position a requirement
Maria Kustova [Mon, 21 Jul 2014 11:16:33 +0000 (15:16 +0400)]
docs: Make the recommendation for the backing file name position a requirement

The current version of the qcow2 specification recommends to save the backing
file name in the end of the first cluster. It follows that the backing file
name can be saved somewhere in the image, but the first cluster, which
contradicts the current QEMU implementation.

The patch makes the backing file name required to be placed after the header
extensions in the first image cluster.

Signed-off-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: Avoid bdrv_get_geometry() where errors should be detected
Markus Armbruster [Thu, 26 Jun 2014 11:23:25 +0000 (13:23 +0200)]
block: Avoid bdrv_get_geometry() where errors should be detected

bdrv_get_geometry() hides errors.  Use bdrv_nb_sectors() or
bdrv_getlength() instead where that's obviously inappropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoqemu-img: Make img_convert() get image size just once per image
Markus Armbruster [Thu, 26 Jun 2014 11:23:24 +0000 (13:23 +0200)]
qemu-img: Make img_convert() get image size just once per image

Chiefly so I don't have to do the error checking in quadruplicate in
the next commit.  Moreover, replacing the frequently updated
bs_sectors by an array assigned just once makes the code easier to
understand.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Drop superfluous aligning of bdrv_getlength()'s value
Markus Armbruster [Thu, 26 Jun 2014 11:23:23 +0000 (13:23 +0200)]
block: Drop superfluous aligning of bdrv_getlength()'s value

It returns a multiple of the sector size.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Use bdrv_nb_sectors() where sectors, not bytes are wanted
Markus Armbruster [Thu, 26 Jun 2014 11:23:22 +0000 (13:23 +0200)]
block: Use bdrv_nb_sectors() where sectors, not bytes are wanted

Instead of bdrv_getlength().

Aside: a few of these callers don't handle errors.  I didn't
investigate whether they should.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Use bdrv_nb_sectors() in img_convert()
Markus Armbruster [Thu, 26 Jun 2014 11:23:21 +0000 (13:23 +0200)]
block: Use bdrv_nb_sectors() in img_convert()

Instead of bdrv_getlength().  Replace variable output_length by
output_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Use bdrv_nb_sectors() in bdrv_co_get_block_status()
Markus Armbruster [Thu, 26 Jun 2014 11:23:20 +0000 (13:23 +0200)]
block: Use bdrv_nb_sectors() in bdrv_co_get_block_status()

Instead of bdrv_getlength().

Replace variables length, length2 by total_sectors, nb_sectors2.
Bonus: use total_sectors instead of the slightly unclean
bs->total_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Use bdrv_nb_sectors() in bdrv_aligned_preadv()
Markus Armbruster [Thu, 26 Jun 2014 11:23:19 +0000 (13:23 +0200)]
block: Use bdrv_nb_sectors() in bdrv_aligned_preadv()

Instead of bdrv_getlength().  Eliminate variable len.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: Use bdrv_nb_sectors() in bdrv_make_zero()
Markus Armbruster [Thu, 26 Jun 2014 11:23:18 +0000 (13:23 +0200)]
block: Use bdrv_nb_sectors() in bdrv_make_zero()

Instead of bdrv_getlength().

Variable target_size is initially in bytes, then changes meaning to
sectors.  Ugh.  Replace by target_sectors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10 years agoblock: New bdrv_nb_sectors()
Markus Armbruster [Thu, 26 Jun 2014 11:23:17 +0000 (13:23 +0200)]
block: New bdrv_nb_sectors()

A call to retrieve the image size converts between bytes and sectors
several times:

* BlockDriver method bdrv_getlength() returns bytes.

* refresh_total_sectors() converts to sectors, rounding up, and stores
  in total_sectors.

* bdrv_getlength() converts total_sectors back to bytes (now rounded
  up to a multiple of the sector size).

* Callers wanting sectors rather bytes convert it right back.
  Example: bdrv_get_geometry().

bdrv_nb_sectors() provides a way to omit the last two conversions.
It's exactly bdrv_getlength() with the conversion to bytes omitted.
It's functionally like bdrv_get_geometry() without its odd error
handling.

Reimplement bdrv_getlength() and bdrv_get_geometry() on top of
bdrv_nb_sectors().

The next patches will convert some users of bdrv_getlength() to
bdrv_nb_sectors().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>