review.tizen.org Git - sdk/emulator/qemu.git/log

target-i386: QOM'ify CPU reset

Move code from cpu_state_reset() into QOM x86_cpu_reset(),
fixing style issues for FPU init.

Signed-off-by: Andreas Färber <afaerber@suse.de>

target-i386: QOM'ify CPU init

Move code from cpu_x86_init() to new QOM x86_cpu_initfn().
Also move mce_init() to cpu.c since it's used nowhere else.

Signed-off-by: Andreas Färber <afaerber@suse.de>

target-i386: QOM'ify CPU

Embed CPUX86State as first member of X86CPU.
Distinguish between "x86_64-cpu" and "i386-cpu".
Drop cpu_x86_close() in favor of calling object_delete() directly.

For now let CPUClass::reset() call cpu_state_reset().

Signed-off-by: Andreas Färber <afaerber@suse.de>

target-i386: Rename cpuid.c

Name it cpu.c to align with other QOM'ified targets.

Signed-off-by: Andreas Färber <afaerber@suse.de>

Merge commit 'ff71f2e8cacefae99179993204172bc65e4303df' into staging

* commit 'ff71f2e8cacefae99179993204172bc65e4303df': (21 commits)
  rtl8139: do the network/host communication only in normal operating mode
  rtl8139: correctly check the opmode
  net: move compute_mcast_idx() to net.h
  rtl8139: support byte read to TxStatus registers
  rtl8139: remove unused marco
  rtl8139: limit transmission buffer size in c+ mode
  pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE
  virtio-net: add DATA_VALID flag
  pci_bridge: upper 32 bit are long registers
  pci: fix bridge IO/BASE
  pcie: drop functionality moved to core
  pci: set memory type for memory behind the bridge
  pci: add standard bridge device
  slotid: add slot id capability
  shpc: standard hot plug controller
  pci_bridge: user-friendly default bus name
  pci: make another unused extern function static
  pci: don't export an internal function
  pci_regs: Fix value of PCI_EXP_TYPE_RC_EC.
  pci: Do not check if a bus exist in pci_parse_devaddr.
  ...

Merge remote-tracking branch 'qmp/queue/qmp' into staging

* qmp/queue/qmp:
qapi: convert device_del
qdev: qdev_unplug(): use error_set()

Merge remote-tracking branch 'kwolf/for-anthony' into staging

* kwolf/for-anthony: (46 commits)
  qed: remove incoming live migration blocker
  qed: honor BDRV_O_INCOMING for incoming live migration
  migration: clear BDRV_O_INCOMING flags on end of incoming live migration
  qed: add bdrv_invalidate_cache to be called after incoming live migration
  blockdev: open images with BDRV_O_INCOMING on incoming live migration
  block: add a function to clear incoming live migration flags
  block: Add new BDRV_O_INCOMING flag to notice incoming live migration
  block stream: close unused files and update ->backing_hd
  qemu-iotests: Fix call syntax for qemu-io
  qemu-iotests: Fix call syntax for qemu-img
  qemu-iotests: Test unknown qcow2 header extensions
  qemu-iotests: qcow2.py
  sheepdog: fix send req helpers
  sheepdog: implement SD_OP_FLUSH_VDI operation
  block: bdrv_append() fixes
  qed: track dirty flag status
  qemu-img: add dirty flag status
  qed: image fragmentation statistics
  qemu-img: add image fragmentation statistics
  block: document job API
  ...

Merge remote-tracking branch 'stefanha/trivial-patches' into staging

* stefanha/trivial-patches:
  make: fix clean rule by removing build file in qom/
  configure: Link qga against UST tracing related libraries
  configure: Link QEMU against 'liburcu-bp'
  main-loop: make qemu_event_handle static
  block/curl: Replace usleep by g_usleep
  qtest: Add missing GCC_FMT_ATTR
  w32: Undefine error constants before their redefinition
  configure: fix mingw32 libs_qga typo

petalogix_s3adsp1800: deleted bad FIXME comment

This FIXME has already been actioned. Deleted comment.

Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>

qapi: convert device_del

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

qdev: qdev_unplug(): use error_set()

It currently uses qerror_report(), but next commit will convert
the drive_del command to the QAPI and this requires using
error_set().

One particularity of qerror_report() is that it knows when it's
running on monitor context or command-line context and prints the
error message accordingly. error_set() doesn't do this, so we
have to be careful not to drop error messages.

qdev_unplug() has three kinds of usages:

1. It's called when hot adding a device fails, to undo anything
    that has been done before hitting the error

2. It's called by function monitor functions like device_del(),
    to unplug a device

3. It's used by xen_platform.c in a way that doesn't _seem_ to
    be in monitor context

Only item 2 can print an error message to the user, this commit
maintains that.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Merge branch 'memory/core' of git://git./virt/kvm/qemu-kvm

* 'memory/core' of git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm:
  memory: check address space when a listener is registered
  memory: print aliased IO ranges in info mtree
  ioport: use INT64_MAX for IO ranges

Add QEMU_NORETURN to function cpu_io_recompile

cpu_io_recompile terminates by calling either cpu_abort or
cpu_resume_from_signal which both never return.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Add QEMU_NORETURN to function cpu_resume_from_signal

cpu_resume_from_signal terminates by calling longjmp.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Replace Qemu by QEMU in comments

The official spelling is QEMU.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
[blauwirbel@gmail.com: fixed comment style in hw/sun4m.c]
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Replace Qemu by QEMU in w32 installation path (prefix)

The official spelling is QEMU.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Replace Qemu by QEMU in internal documentation

The official spelling is QEMU.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Replace Qemu by QEMU in user visible documentation

The official spelling is QEMU.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

remove useless comments in dma

This comment is useless, just removes it and makes the codes clear.

Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

tci: Support targets with CONFIG_TCG_PASS_AREG0 (fix broken build)

Builds with --enable-tcg-interpreter failed because more and more
targets (currently alpha and sparc) replaced the global env in AREG0
by function parameters.

Convert the TCG interpreter to use the new helper functions and add
defines for those targets which still use AREG0.

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Merge branch 'linux-user-for-upstream' of git://git.linaro.org/people/rikuvoipio/qemu

* 'linux-user-for-upstream' of git://git.linaro.org/people/rikuvoipio/qemu:
  Userspace ARM BE8 support
  elf.h: Update EF_ARM_ constants to newer ABI versions
  arm-linux-user: fix elfload.c's AT_HWCAP to reflect cpu features.
  linux-user/arm/syscall_nr.h: Add syscall number for ppoll
  linux-user: Add support for prctl PR_GET_NAME and PR_SET_NAME
  linux-user/syscall.c: Fix indentation in prctl handling
  linux-user: reserve 4GB of vmem for 32-on-64
  linux-user: resolve reserved_va vma downwards
  linux-user: take RESERVED_VA into account for g2h_valid()
  linux-user: fix fallocate
  linux-user: Add ioctl for BLKBSZGET
  linux-user: add BLKSSZGET ioctl wrapper
  linux-user: fix BLK ioctl arguments
  linux-user: add struct old_dev_t compat
  linux-user: implement device mapper ioctls
  linux-user: target_argv is placed on ts->bprm->argv and can't be freed()
  linux-user: improve fake /proc/self/stat making `ps` not segfault.

w64: Fix data type of tb_next and other variables used for host addresses

QEMU host addresses must use uintptr_t to be portable for hosts with
an unusual size of long (w64).

tb_jmp_offset is an uint16_t value, therefore the local variable offset
in function tb_set_jmp_target was changed from unsigned long to uint16_t.

The type cast to long in function tb_add_jump now also uses uintptr_t.
For the bit operation used here, the signedness of the type cast does
not matter.

Some remaining unsigned long values are either only used for ARM assembler
code or will be fixed in a later patch for PPC.

v2:
Fix signature of tb_find_pc in exec.c, too (hint from Blue Swirl, thanks).
There remain lots of other long / unsigned long in exec.c which must be
replaced by uintptr_t. This will be done in a separate patch. Here
only one of these type casts is fixed.

v3:
Also fix signature of page_unprotect.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

softfloat: roundAndPackInt{32, 64}: Don't assume int32 is 32 bits

Fix code in roundAndPackInt32 that assumed that int32 was only
32 bits, by simply using int32_t instead. Fix the parallel bug
in roundAndPackInt64 as well, although that one is only theoretical
since it's unlikely that int64 will ever be more than 64 bits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

softfloat: float*_to_int32_round_to_zero: don't assume int32 is 32 bits

Code in the float64_to_int32_round_to_zero() function was assuming
that int32 would not be wider than 32 bits; this meant it might
not correctly detect the overflow case. We take the simple approach
of using int32_t. Also fix equivalent issues in the functions
for other float sizes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

configure: require glib 2.12, 2.20 for mingw32

These are pretty sane requirements to move forward with glib usage.
2.12 is the version found in RHEL/CentOS 5, and 2.20 is the
first version to support g_poll. Without g_poll, we cannot
integrate well with the glib main loop.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

main-loop: integrate glib sources for w32

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

main-loop: replace WaitForMultipleObjects with g_poll

On w32, glib implements g_poll using WaitForMultipleObjects
or MsgWaitForMultipleObjects. This means that we can simplify
our code by switching to g_poll, and at the same time prepare for
adding back glib sources.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

main-loop: interrupt wait when data arrives on a socket

Right now, the main loop is not interrupted when data arrives on a
socket.  To fix this, register each socket to interrupt the main loop
with WSAEventSelect.  This does not replace select, it only communicates
a change in socket state that requires a select call.

Since the interrupt fires only once per recv call, or only once
after a send call returns EWOULDBLOCK we can activate it on all events
unconditionally.  If QEMU is momentarily uninterested on some condition,
the main loop will not busy wait.  Instead, it may get one extra wakeup,
but then it will ignore the condition until progress occurs and/or
qemu_set_fd_handler is called to set a callback.  At this point the
condition will be tested via select and the callback will be invoked
even if it is still disabled on the event.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

main-loop: disable fd_set-based glib integration under w32

Using select with glib pollfds is wrong under w32. Restrict
the code to the POSIX case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

main loop: use msec-based timeout in glib_select_fill

The timeval-based timeout is not needed until we actually invoke select,
so compute it only then. Also group the two calls that modify the
timeout, glib_select_fill and os_host_main_loop_wait.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

target-sparc: QOM'ify CPU

Embed CPUSPARCState as first member of SPARCCPU.
Drop cpu_sparc_close() in favor of object_delete() and a finalizer.
Let cpu_state_reset() call cpu_reset().

Make TYPE_SPARC_CPU non-abstract for now.
Distinguish between "sparc-cpu" and "sparc64-cpu".

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

target-sparc: Rename cpu_init.c

Align QOM'ified targets, with a view to simplify Makefile.target.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Userspace ARM BE8 support

Add support for ARM BE8 userspace binaries.
i.e. big-endian data and little-endian code.
In principle LE8 mode is also possible, but AFAIK has never actually
been implemented/used.

System emulation doesn't have any useable big-endian board models,
but should in principle work once you fix that.
Dynamic endianness switching requires messing with data accesses,
preferably with TCG cooperation, and is orthogonal to BE8 support.

Signed-off-by: Paul Brook <paul@codesourcery.com>
[PMM: various changes, mostly as per my suggestions in code review:
* rebase
* use EF_ defines rather than hardcoded constants
* make bswap_code a bool for future VMSTATE macro compatibility
* update comment in cpu.h about TB flags bit field usage
* factor out load-code-and-swap into arm_ld*_code functions and
get_user_code* macros
* fix stray trailing space at end of line
* added braces in disas.c to satisfy checkpatch
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

elf.h: Update EF_ARM_ constants to newer ABI versions

Update the EF_ARM_* constants (for the ELF header e_flags field)
to include the newer flags specified for later versions of the ABI.
(This set of constants is from include/elf/arm.h from binutils-2.17
and so licensed under GPL-v2-or-later.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

arm-linux-user: fix elfload.c's AT_HWCAP to reflect cpu features.

The cpu capabilities passed by the elf loader in AT_HWCAP where
a constant.
Make AT_HWCAP reflect the emulated cpu features in order to give
correct clues to eglibc.

Riku Voipio: fixed to apply to current head

Fix : [Bug 887516] [NEW] VFP support reported for the PXA270

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user/arm/syscall_nr.h: Add syscall number for ppoll

The list of ARM syscall numbers was missing the entry for ppoll,
which meant we were accidentally not providing it. (This wasn't
causing any practical issues beyond warnings about unimplemented
syscalls, because glibc will fall back to another code path if the
syscall isn't present.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Add support for prctl PR_GET_NAME and PR_SET_NAME

Add support for the prctl options PR_GET_NAME and PR_SET_NAME,
which take or return a name in a 16 byte buffer pointed to by arg2.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user/syscall.c: Fix indentation in prctl handling

Clean up the odd indentation of this switch statement before
we double its size by adding new cases to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: reserve 4GB of vmem for 32-on-64

When running 32-on-64 bit guests, we should always reserve as much
virtual memory as we possibly can for the guest process, so it can
never overlap with QEMU address space.

Fortunately we already have the infrastructure for that. All that's
missing is some sane default value to also make use of it!

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: resolve reserved_va vma downwards

After consulting with Paul Brook, we concluded that it's best to search
the VMA space downwards, so that we don't even get the chance to conflict
with the brk range.

This patch resolves a bunch of allocation conflicts when using -R.

Signed-off-by: Alexander Graf <agraf@suse.de>
[minor changes to get it to apply -- PMM]

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: take RESERVED_VA into account for g2h_valid()

When running with -R (RESERVED_VA > 0) all guest virtual addresses
are within the [0..RESERVED_VA] range. Reflect this with g2h_valid()
too so we can safely check for boundaries of our guest address space.

This is required to have the /proc/self/maps code not show maps that
aren't accessible from the guest process's point of view.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: fix fallocate

Fallocate gets off_t parameters passed in, so we should also read them out
accordingly.

Signed-off-by: Alexander Graf <agraf@suse.de>
---

v1 -> v2:

- unbreak 64-bit guests

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Add ioctl for BLKBSZGET

This patch adds the ioctl wrapper definition for BLKBSZGET.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: add BLKSSZGET ioctl wrapper

This patch adds an ioctl definition for BLKSSZGET.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: fix BLK ioctl arguments

Some BLK ioctls passed sizeof(x) into a macro that already did sizeof() on
the passed in argument, rendering the size information inside the ioctl be
the size of the host default integer type.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: add struct old_dev_t compat

The compat LOOP_SET_STATUS ioctl uses struct old_dev_t in its passed
struct. That variable type is vastly different between different
architectures. Implement wrapping around it so we can use it.

This fixes running arm kpartx on an x86_64 host for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: implement device mapper ioctls

This patch implements all ioctls currently implemented by device mapper,
enabling us to run dmsetup and kpartx inside of linux-user.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: target_argv is placed on ts->bprm->argv and can't be freed()

TaskState contains linux_bprm struct which encapsulates argv among
other things.
argv might be used around the code and is expected to contain valid
data. Before this patch, ts->bprm->argv was NULL due to it being
freed right after loader_exec().

Signed-off-by: Fabio Erculiani <lxnay@sabayon.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: improve fake /proc/self/stat making `ps` not segfault.

With the current fake /proc/self/stat implementation `ps` is
segfaulting because it expects to read PID and argv[0] as first and
second field respectively, with the latter being enclosed between
backets.

Reproducing is as easy as running: `ps` inside qemu-user chroot
with /proc mounted.

Signed-off-by: Fabio Erculiani <lxnay@sabayon.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

qed: remove incoming live migration blocker

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qed: honor BDRV_O_INCOMING for incoming live migration

From original commit with Patchwork-id: 31108 by
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

"The QED image format includes a file header bit to mark images dirty.
QED normally checks dirty images on open and fixes inconsistent
metadata. This is undesirable during live migration since the dirty bit
may be set if the source host is modifying the image file. The check
should be postponed until migration completes.

Skip operations that modify the image file if the BDRV_O_INCOMING flag
is set."

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

migration: clear BDRV_O_INCOMING flags on end of incoming live migration

Signed-off-by: Benoît Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qed: add bdrv_invalidate_cache to be called after incoming live migration

The QED image is reopened to flush metadata and check consistency.

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: open images with BDRV_O_INCOMING on incoming live migration

Open images with BDRV_O_INCOMING in order to inform block drivers
that an incoming live migration is coming.

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add a function to clear incoming live migration flags

This function will clear all BDRV_O_INCOMING flags.

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add new BDRV_O_INCOMING flag to notice incoming live migration

From original patch with Patchwork-id: 31110 by
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

"Add a flag to indicate that incoming migration is pending and care needs
to be taken for data consistency.  Block drivers should not modify the
image file before incoming migration is complete since the migration
source host is still using the image file."

The rationale for not using bdrv->read_only is the following.

"Unfortunately this is not possible because too many other places in QEMU
test bdrv_is_read_only() and use it for their own evil purposes.  For
example, ide_init_drive() will error out because read-only harddisks are
not supported.  We're mixing guest and host side read-only concepts so
this simpler alternative does not work."

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block stream: close unused files and update ->backing_hd

Close the now unused images that were part of the previous backing file
chain and adjust ->backing_hd, backing_filename and backing_format
properly.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=801449

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Fix call syntax for qemu-io

qemu-io requires options first, then fixed parameters.

GNU getopt also allows options at the end, but POSIX getopt
doesn't. Try "export POSIXLY_CORRECT=y" to get the POSIX
behaviour with GNU getopt, too.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Fix call syntax for qemu-img

qemu-img requires first options, then file name, then size.

GNU getopt also allows options at the end, but POSIX getopt
doesn't. Try "export POSIXLY_CORRECT=y" to get the POSIX
behaviour with GNU getopt, too.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Test unknown qcow2 header extensions

The immportant thing here is that header extensions don't get silently
dropped when the header is rewritten, e.g. during a rebase.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: qcow2.py

This adds a tool that is meant to inspect and edit qcow2 files in a
low-level way, that wouldn't be possible with qemu-img/io, for example
by adding yet unknown extensions or flags. This way we can test whether
qemu deals properly with future backwards compatible extensions.

For now, let's start with the image header and header extensions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

sheepdog: fix send req helpers

We should return if reading of the header fails.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

sheepdog: implement SD_OP_FLUSH_VDI operation

Flush operation is supposed to flush the write-back cache of
sheepdog cluster.

By issuing flush operation, we can assure the Guest of data
reaching the sheepdog cluster storage.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: bdrv_append() fixes

A few fixups for bdrv_append():

The new bs (bs_new) passed into bdrv_append() should be anonymous. Rather
than call bdrv_make_anon() to enforce this, use an assert to catch when a caller
is passing in a bs_new that is not anonymous.

Also, the new top layer should have its backing_format reflect the original
top's format.

And last, after the swap of bs contents, the device_name will have been copied
down. This needs to be cleared to reflect the anonymity of the bs that was
pushed down.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qed: track dirty flag status

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: add dirty flag status

Some block drivers can verify their image files are clean or not. So we can show
it while using "qemu-img info".

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qed: image fragmentation statistics

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: add image fragmentation statistics

Discussion can be found at:
http://patchwork.ozlabs.org/patch/128730/

This patch add image fragmentation statistics while using qemu-img check.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: document job API

I am not sure that these are really proper GtkDoc, but they follow
the existing documentation in block_int.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: set job->speed in block_set_speed

There is no need to do this in every implementation of set_speed
(even though there is only one right now).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: fix streaming/closing race

Streaming can issue I/O while qcow2_close is running.  This causes the
L2 caches to become very confused or, alternatively, could cause a
segfault when the streaming coroutine is reentered after closing its
block device.  The fix is to cancel streaming jobs when closing their
underlying device.

The cancellation must be synchronous, on the other hand qemu_aio_wait
will not restart a coroutine that is sleeping in co_sleep.  So add
a flag saying whether streaming has in-flight I/O.  If the busy flag
is false, the coroutine is quiescent and, when cancelled, will not
issue any new I/O.

This protects streaming against closing, but not against deleting.
We have a reference count protecting us against concurrent deletion,
but I still added an assertion to ensure nothing bad happens.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: cancel jobs when a device is ready to go away

We do not want jobs to keep a device busy for a possibly very long
time, and management could become confused because they thought a
device was not even there anymore. So, cancel long-running jobs
as soon as their device is going to disappear.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: disable I/O throttling on sync api

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Use DMADirection type for dma_bdrv_io

Currently dma_bdrv_io() takes a 'to_dev' boolean parameter to
determine the direction of DMA it is emulating. We already have a
DMADirection enum designed specifically to encode DMA directions.
This patch uses it for dma_bdrv_io() as well. This involves removing
the DMADirection definition from the #ifdef it was inside, but since that
only existed to protect the definition of dma_addr_t from places where
config.h is not included, there wasn't any reason for it to be there in
the first place.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vdi: change goto to loop

Finally reindent all code and change goto statements to a loop.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vdi: do not create useless iovecs

Reads and writes to the underlying file can also occur with the simple
non-vectored I/O interfaces.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vdi: leave bounce buffering to block layer

vdi.c really works as if it implemented bdrv_read and bdrv_write. However,
because only vector I/O is supported by the asynchronous callbacks, it
went through extra pain to bounce-buffer the I/O. This can be handled
by the block layer now that the format is coroutine-based.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vdi: move aiocb fields to locals

Most of the AIOCB really holds local variables that need to persist
across callback invocation. It can go away now.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vdi: merge aio_read_cb and aio_write_cb into callers

Now inline the former AIO callbacks into vdi_co_readv and vdi_co_writev.
While many cleanups are possible, the code now really looks synchronous.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vdi: move end-of-I/O handling at the end

The next step is to take code that only triggers after the first operation,
and move it at the end of vdi_aio_read_cb and vdi_aio_write_cb.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vdi: basic conversion to coroutines

Even a basic conversion changing the bdrv_aio_readv/bdrv_aio_writev calls
to bdrv_co_readv/bdrv_co_writev, and callbacks to goto statements can
eliminate a lot of code. This is because error handling is simplified
and indirections through bottom halves can go away.

After this patch, I/O to the underlying file already happens via
coroutines, but the code still looks a lot like if asynchronous I/O was
being used.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: enforce constraints on block size properties

Nicolae Mogoreanu <mogo@google.com> noticed that I/O requests can lead
to QEMU crashes when the logical_block_size property is smaller than 512
bytes.

Using the new "blocksize" property we can properly enforce constraints
on the block size such that QEMU's block layer is able to operate
correctly.

Reported-by: Nicolae Mogoreanu <mogo@google.com>
Reported-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qdev: add blocksize property type

Storage interfaces like virtio-blk can be configured with block size
information so that the guest can take advantage of efficient I/O
request sizes.

According to the SCSI Block Commands (SBC) standard a device's block
size is "almost always greater than one byte and may be a multiple of
512 bytes".  QEMU currently has a 512 byte minimum block size because
the block layer functions work at that granularity.  Furthermore, the
block size should be a power of 2 because QEMU calculates bitmasks from
the value.

Introduce a "blocksize" property type so devices can enforce these
constraints on block size values.  If the constraints are relaxed in the
future then this property can be updated.

Introduce the new PropertyValueNotPowerOf2 QError so QMP clients know
exactly why a block size value was rejected.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qerror: fix QERR_PROPERTY_VALUE_OUT_OF_RANGE description

Fix a typo in the description for QERR_PROPERTY_VALUE_OUT_OF_RANGE where
"'" was used instead of ")".

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vpc: write checksum back to footer after check

After validation check, the 'checksum' is not written back
to footer, which leave it with zero.

This results in errors while loadding it under Microsoft's
Hyper-V environment, and also errors from utilities like
Citrix's vhd-util.

Signed-off-by: Zhang Shengju <sean_zhang@trendmicro.com.cn>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ide: Adds wwn=hex qdev option

Allow the user to specify a disk's World Wide Name.

Linux guests can address disks by their unique World Wide Name number
(e.g. /dev/disk/by-id/wwn-0x5001517959123522). This patch adds support
for assigning a World Wide Name number to a virtual IDE disk.

Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ide: Change serial number strncpy() to pstrcpy()

strncpy may not null-terminate the destination string.

Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ide: Add "model=s" qdev option

Allow the user to override the default disk model name "QEMU HARDDISK".

Some Linux distributions use the /dev/disk/by-id/scsi-SATA_name-of-disk-
model_serial addressing scheme when refering to partitions in /etc/fstab
and elsewhere. This causes problems when starting a disk image taken from
an existing physical server under qemu, because when running under qemu
name-of-disk-model is always "QEMU HARDDISK".

This patch introduces a model=s option which in combination with the
existing serial=s option can be used to fake the disk the operating
system was previously on, allowing the OS to boot properly.

Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ide: IDENTIFY word 86 bit 14 is reserved

Reserved bits should be cleared to zero.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

aio: move BlockDriverAIOCB to qemu-aio.h

And remove several block_int.h inclusions that should not be there.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: push recursive flushing up from drivers

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-io: add option to enable tracing

It can be useful to enable QEMU tracing when trying out block layer
interfaces via qemu-io.  Tracing can be enabled using the new -T FILE
option where the given file contains a list of trace events to enable
(just like the qemu --trace events=FILE option).

  $ echo qemu_vfree >my-events
  $ ./qemu-io -T my-events ...

Remember to use ./configure --enable-trace-backend=BACKEND when building
qemu-io.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Remove unused parameter in get_cluster_table()

Since everything goes through the cache, callers don't use the L2 table
offset any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

tracetool: Forbid argument name 'next'

It has happened more than once that patches that look perfectly sane
and work with simpletrace broke systemtap because they use 'next' as an
argument name for a tracing function. However, 'next' is a keyword for
systemtap, so we shouldn't use it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

trace-events: Rename 'next' argument

'next' is a systemtap keyword, so it's a bad idea to use it as an
argument name.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

memory: check address space when a listener is registered

This patch resolves a bug in memory listener registration.
"range_add" callback was called on each section of the both
address space (IO and memory space) even if it doesn't match
the address space filter.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

Merge branch 's390-for-upstream' of git://repo.or.cz/qemu/agraf

* 's390-for-upstream' of git://repo.or.cz/qemu/agraf:
  target-s390x: Update s390x_{tod,cpu}_timer() to use S390CPU
  target-s390x: QOM'ify CPU init
  target-s390x: QOM'ify CPU reset
  target-s390x: QOM'ify CPU

Improve interrupt handling priority

The vector interrupt has higher priority than interrupt_level_n.
Also check only interrupt_level_n concurency when TL > 0, the traps of
other types may be nested.

Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

Fix vector interrupt handling

Don't produce stray irq 5, don't overwrite ivec_data if still busy with
processing of the previous interrupt.

Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

target-s390x: Update s390x_{tod,cpu}_timer() to use S390CPU

In place of CPUS390XState pass S390CPU as opaque from the new initfn.
cpu_interrupt() is anticipated to take a CPUState in the future.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>