review.tizen.org Git - sdk/emulator/qemu.git/log

block: remove BLOCK_OPT_NOCOW from vpc_create_opts

In commit fef6070, the need for NOCOW was removed from the vpc driver,
as we removed the the posix calls. However, the BLOCK_OPT_NOCOW was not
removed from vpc_create_opts. This was a mistake - remove the opt from
there as well.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 8ba076fa725fed681cde7d8afc4fb239ae06a9c6.1417620301.git.jcody@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: remove BLOCK_OPT_NOCOW from vdi_create_opts

In commit 7074786, the need for NOCOW was removed from the vdi driver,
as we removed the the posix calls. However, the BLOCK_OPT_NOCOW was not
removed from vdi_create_opts. This was a mistake - remove the opt from
there as well.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: e189364de11929d8fa04722f5d845de0a9834d44.1417620301.git.jcody@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Skip 099 for VMDK subformats with desc file

VMDK extent parsing code doesn't handle the JSON file name, so the case
fails for these subformats. Disabled them.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1417571370-19495-1-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/raw-posix: Fix ret in raw_open_common()

The return value must be negative on error; there is one place in
raw_open_common() where errp is set, but ret remains 0. Fix it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Respect bdrv_truncate() error

bdrv_truncate() may fail and qcow2_write_compressed() should return the
error code in that case.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Flushing the caches in qcow2_close may fail

qcow2_cache_flush() may fail; if one of the caches failed to be flushed
successfully to disk in qcow2_close() the image should not be marked
clean, and we should emit a warning.

This breaks the (qcow2-specific) iotests 026, 071 and 089; change their
output accordingly.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Prevent numerical overflow

In qcow2_alloc_cluster_offset(), *num is limited to
INT_MAX >> BDRV_SECTOR_BITS by all callers. However, since remaining is
of type uint64_t, we might as well cast *num to that type before
performing the shift.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Add test for unsupported image creation

Add a test for creating and amending images (amendment uses the creation
options) with formats not supporting creation over protocols not
supporting creation.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Only kill NBD server if it runs

There may be NBD tests which do not create a sample image and simply
test whether wrong usage of the protocol is rejected as expected. In
this case, there will be no NBD server and trying to kill it during
clean-up will fail.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: Check create_opts before image amendment

The image options which can be amended are described by the .create_opts
field for every driver. This field must therefore be non-NULL so that
anything can be amended in the first place. Check that this holds true
before going into qemu_opts_create() (because if .create_opts is NULL,
the create_opts pointer in img_amend() will be NULL after
qemu_opts_append()).

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: Check create_opts before image creation

If a driver supports image creation, it needs to set the .create_opts
field. We can use that to make sure .create_opts for both drivers
involved is not NULL for the target image in qemu-img convert, which is
important so that the create_opts pointer in img_convert() is not NULL
after the qemu_opts_append() calls and when going into
qemu_opts_create().

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Check create_opts before image creation

If a driver supports image creation, it needs to set the .create_opts
field. We can use that to make sure .create_opts for both drivers
involved is not NULL in bdrv_img_create(), which is important so that
the create_opts pointer in that function is not NULL after the
qemu_opts_append() calls and when going into qemu_opts_create().

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/nfs: Add create_opts

The nfs protocol driver is capable of creating images, but did not
specify any creation options. Fix it.

A way to test this issue is the following:

$ qemu-img create -f nfs nfs://127.0.0.1/foo.qcow2 64M

Without this patch, it segfaults. With this patch, it does not. However,
this is not something that should really work; qemu-img should check
whether the parameter for the -f option (and -O for convert) is indeed a
format, and error out if it is not. Therefore, I am not making it an
iotest.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vvfat: qcow driver may not be found

Although virtually impossible right now, bdrv_find_format("qcow") may
fail. The vvfat block driver should heed that case.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Omit bdrv_find_format for essential drivers

We can always assume raw, file and qcow2 being available; so do not use
bdrv_find_format() to locate their BlockDriver objects but statically
reference the respective objects.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Make essential BlockDriver objects public

There are some block drivers which are essential to QEMU and may not be
removed: These are raw, file and qcow2 (as the default non-raw format).
Make their BlockDriver objects public so they can be directly referenced
throughout the block layer without needing to call bdrv_find_format()
and having to deal with an error at runtime, while the real problem
occurred during linking (where raw, file or qcow2 were not linked into
qemu).

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Specify qcow2 format for qemu-io in 059

There are two instances of iotest 059 using qemu-io on a qcow2 image. As
of "qemu-iotests: Use qemu-io -f $IMGFMT" the iotests can no longer rely
on $QEMU_IO doing probing, therefore the qcow2 format has to be
specified explicitly here.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ide: Check validity of logical block size

Our IDE emulation can't handle logical block sizes other than 512. Check
for it.

The original assumption was that other values would silently be ignored
(which is bad enough), but it's not quite true: The physical block size
is exposed in IDENTIFY DEVICE as a multiple of the logical block size.
Setting a logical block size therefore also corrupts the physical block
size (4096/4096 doesn't silently downgrade to 4096/512, but 512/512).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

nvme: 64kB page size fixes

Initialise our maximum page size capability to 64kB and increase
the page_size variable from 16 to 32 bits.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: 082: Filter the real disk size

The real on-disk size of an image depends on things like the host
filesystem. _img_info already filters it out, use the function in 082.

Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: 060: Filter the real disk size

The real on-disk size of an image depends on things like the host
filesystem. _img_info already filters it out, use the function in 060.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

block: do not use get_clock()

Use the external qemu-timer API instead.

No one else should be calling cpu_get_clock(), get_clock() and
get_clock_realtime() directly; they are internal functions and they
should be confined to qemu-timer.c and cpus.c (where the icount
implementation resides). All accesses should go through
qemu_clock_get_ns.

Cc: kwolf@redhat.com
Cc: stefanha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1417010463-3527-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Don't probe for unknown backing file format

If a qcow2 image specifies a backing file format that doesn't correspond
to any format driver that qemu knows, we shouldn't fall back to probing,
but simply error out.

Not looking up the backing file driver in bdrv_open_backing_file(), but
just filling in the "driver" option if it isn't there moves us closer to
the goal of having everything in QDict options and gets us the error
handling of bdrv_open(), which correctly refuses unknown drivers.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416935562-7760-4-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2.py: Add required padding for header extensions

The qcow2 specification requires that the header extension data be
padded to round up the extension size to the next multiple of 8 bytes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416935562-7760-3-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Fix header extension size check

After reading the extension header, offset is incremented, but not
checked against end_offset any more. This way an integer overflow could
happen when checking whether the extension end is within the allowed
range, effectively disabling the check.

This patch adds the missing check and a test case for it.

Cc: qemu-stable@nongnu.org
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416935562-7760-2-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: check for BLOCK_OP_TYPE_INTERNAL_SNAPSHOT

The BLOCK_OP_TYPE_INTERNAL_SNAPSHOT op blocker exists but was never
used! Let's fix that so internal snapshots can be blocked.

[Fixed s/external/internal/ typo as pointed out by Paolo Bonzini and Max
Reitz.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416566940-4430-5-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: acquire AioContext in QMP 'transaction' actions

The transaction QMP command performs operations atomically on a group of
drives.  This command needs to acquire AioContext in order to work
safely when virtio-blk dataplane IOThreads are accessing drives.

The transactional nature of the command means that actions are split
into prepare, commit, abort, and clean functions.  Acquire the
AioContext in prepare and don't release it until one of the other
functions is called.  This prevents the IOThread from running the
AioContext before the transaction has completed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416566940-4430-4-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: drop unnecessary DriveBackupState field assignment

drive_backup_prepare() assigns DriveBackupState fields to NULL in the
error path. This is unnecessary because the DriveBackupState is
allocated using g_malloc0() and other functions like
external_snapshot_prepare() already rely on this.

Do not explicitly assign fields to NULL so that the error path is
concise and does not require modification when fields are added to
DriveBackupState.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416566940-4430-3-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: update outdated qmp_transaction() comments

Originally the transaction QMP command was just for taking snapshots.
The command became more general when drive-backup and abort were added.

It is more accurate to say the command is about performing operations on
an atomic group than to say it is about snapshots.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416566940-4430-2-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Test writing non-raw image headers to raw image

This is forbidden if the raw driver was probed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-10-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Fix stderr handling in common.qemu

The original intention was to pipe stderr of qemu into $fifo_out.
However, the redirections were specified in the wrong order for this.
This patch fixes it.

Now qemu's output on stderr can be retrieved with _send_qemu_cmd, which
applies several useful filters on the output that were missing before.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-9-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

raw: Prohibit dangerous writes for probed images

If the user neglects to specify the image format, QEMU probes the
image to guess it automatically, for convenience.

Relying on format probing is insecure for raw images (CVE-2008-2004).
If the guest writes a suitable header to the device, the next probe
will recognize a format chosen by the guest.  A malicious guest can
abuse this to gain access to host files, e.g. by crafting a QCOW2
header with backing file /etc/shadow.

Commit 1e72d3b (April 2008) provided -drive parameter format to let
users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
optionally store the backing file format, to let users disable backing
file probing.  QED has had a flag to suppress probing since the
beginning (2010), set whenever a raw backing file is assigned.

All of these additions that allow to avoid format probing have to be
specified explicitly. The default still allows the attack.

In order to fix this, commit 79368c8 (July 2010) put probed raw images
in a restricted mode, in which they wouldn't be able to overwrite the
first few bytes of the image so that they would identify as a different
image. If a write to the first sector would write one of the signatures
of another driver, qemu would instead zero out the first four bytes.
This patch was later reverted in commit 8b33d9e (September 2010) because
it didn't get the handling of unaligned qiov members right.

Today's block layer that is based on coroutines and has qiov utility
functions makes it much easier to get this functionality right, so this
patch implements it.

The other differences of this patch to the old one are that it doesn't
silently write something different than the guest requested by zeroing
out some bytes (it fails the request instead) and that it doesn't
maintain a list of signatures in the raw driver (it calls the usual
probe function instead).

Note that this change doesn't introduce new breakage for false positive
cases where the guest legitimately writes data into the first sector
that matches the signatures of an image format (e.g. for nested virt):
These cases were broken before, only the failure mode changes from
corruption after the next restart (when the wrong format is probed) to
failing the problematic write request.

Also note that like in the original patch, the restrictions only apply
if the image format has been guessed by probing. Explicitly specifying a
format allows guests to write anything they like.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1416497234-29880-8-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Read only one sector for format probing

The only image format driver that even potentially accesses anything
after 512 bytes in its bdrv_probe() implementation is VMDK, which reads
a plain-text descriptor file. In practice, the field it's looking for
seems to come first and will be well within the first 512 bytes, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-7-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Factor bdrv_probe_all() out of find_image_format()

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-6-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qtests: Specify image format explicitly

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-5-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Add qemu-io format option in Python tests

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-4-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Use qemu-io -f $IMGFMT

This patch changes $QEMU_IO so that all tests by default pass a format
argument to qemu-io.

There are a few cases where -f $IMGFMT is not wanted because it selects
the wrong driver or json: filenames including a driver are used. They
are changed to use $QEMU_IO_PROG, which doesn't include any options.

Tests 071 and 081 have output changes because now the actual request
fails instead of reading the 2k probing buffer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-3-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-io: Allow explicitly specifying format

This adds a -f option to qemu-io which allows to explicitly specify the
block driver to use for the given image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-2-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

tests: Use "command -v" instead of which(1) in shell scripts

When which(1) is not installed, we would complain "perl not found"
because it's the first set_prog_path check. The error message is
wrong.

Fix it by using "command -v", a native way to query the existence of a
command.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1416380832-9697-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-nbd: Use BlockBackend where reasonable

Because qemu-nbd creates the BlockBackend by itself, it should create
the according BlockDriverState tree by itself as well; that means, it
has call bdrv_open() on its own. This is one of the places where
qemu-nbd still needs to use a BlockDriverState directly (the root BDS
below the BB); other places are the configuration of zero detection
(which may be lifted into the BB eventually, but is not yet) and
temporarily loading a snapshot.

Everywhere else, though, qemu-nbd can and thus should use BlockBackend.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416309679-333-7-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd: Use BlockBackend internally

With all externally visible functions changed to use BlockBackend, this
patch makes nbd use BlockBackend for everything internally as well.

While touching them, substitute 512 by BDRV_SECTOR_SIZE in the calls to
blk_read(), blk_write() and blk_co_discard().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416309679-333-6-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd: Change external interface to BlockBackend

Substitute BlockDriverState by BlockBackend in every globally visible
function provided by nbd.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416309679-333-5-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add blk_add_close_notifier() for BB

Adding something like a "delete notifier" to a BlockBackend would not
make much sense, because whoever is interested in registering there will
probably hold a reference to that BlockBackend; therefore, the notifier
will never be called (or only when the notifiee already relinquished its
reference and thus most probably is no longer interested in that
notification).

Therefore, this patch just passes through the close notifier interface
of the root BDS. This will be called when the device is ejected, for
instance, and therefore does make sense.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416309679-333-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add AioContextNotifier functions to BB

Because all BlockDriverStates behind a single BlockBackend reside in a
single AioContext, it is fine to just pass these functions
(blk_add_aio_context_notifier() and blk_remove_aio_context_notifier())
through to the root BlockDriverState.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416309679-333-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Lift more functions into BlockBackend

There are already some blk_aio_* functions, so we might as well have
blk_co_* functions (as far as we need them). This patch adds
blk_co_flush(), blk_co_discard(), and also blk_invalidate_cache() (which
is not a blk_co_* function but is needed nonetheless).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416309679-333-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ahci: replace SATA FIS type magic numbers with constants

SATA 3.0 "10.3.1 FIS Type values" defines the constants used to
differentiate between FIS types.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1415874281-7371-3-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ahci: avoid #ifdef DEBUG_AHCI bitrot

Debug code using #ifdef is susceptible to bitrot because the compiler
never checks the debug code.

This is easy to avoid, change the DPRINTF() macro to use if (DEBUG_AHCI)
and always give it a 0 or 1 value.

This also allows us to drop an #ifdef DEBUG_AHCI in ahci_start_dma()
since the compiler can now see the local variable is used.

The motivation for this change is a recent DEBUG_AHCI build failure due
to an outdated DPRINTF() format string. From now on the compiler will
catch these errors.

Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1415874281-7371-2-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Plain blkdebug filename generation

Add one test whether blkdebug is able to generate a plain filename if
given a configuration file and a file to be tested only; and add another
test whether blkdebug is able to do the same without being given a
configuration file.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1415697825-26678-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blkdebug: Simplify and improve filename generation

Instead of actually recreating the options from scratch, just reuse the
options given for creating the BDS, which are the configuration file
name and additional options. In case there are no additional options we
can thus create a plain filename.

This obviously results in a different output for qemu-iotest 099 which
exactly tests this filename generation. Fix it up as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1415697825-26678-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

monitor: Fix HMP tab completion

Commands with multiple boolean flag options (like 'info block') didn't
provide correct completion because only the first one was skipped.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/hmp: Allow node-name in 'info block'

The optional parameter specifying a block device allows now to use a
node-name instead of a drive name (and therefore to inspect any node in
the graph). The new -n options allows listing all named nodes instead of
BlockBackends.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/hmp: Allow info = NULL in print_block_info()

This allows printing infos of BlockDriverStates that aren't at the root
of the graph (and logically implementing a BlockBackend).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/hmp: Factor out print_block_info()

The new function prints the info for a single BlockDriverState.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/qapi: Add cache information to query-block

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

blockdev: acquire AioContext in change-backing-file

Add dataplane support to the change-backing-file QMP commands. By
acquiring the AioContext we avoid race conditions with the dataplane
thread which may also be accessing the BlockDriverState.

Note that this command operates on both bs and a node in its chain
(image_bs). The bdrv_chain_contains(bs, image_bs) check guarantees that
bs and image_bs are in the same AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: acquire AioContext in eject, change, and block_passwd

By acquiring the AioContext we avoid race conditions with the dataplane
thread which may also be accessing the BlockDriverState.

Fix up eject, change, and block_passwd in a single patch because
qmp_eject() and qmp_change_blockdev() both call eject_device(). Also
fix block_passwd while we're tackling a command that takes a block
encryption password.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: check for BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE

The BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE op blocker exists but was
never used! Let's fix that so snapshot delete can be blocked.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

blockdev: acquire AioContext in blockdev-snapshot-delete-internal-sync

Add dataplane support to the blockdev-snapshot-delete-internal-sync QMP
command. By acquiring the AioContext we avoid race conditions with the
dataplane thread which may also be accessing the BlockDriverState.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Use -qmp-pretty in 067

067 invokes query-block, resulting in a reference output with really
long lines (which may pose a problem in email patches and always poses a
problem when the output changes, because it is hard to see what has
actually changed). Use -qmp-pretty to mitigate this issue.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: _filter_qmp for pretty JSON output

_filter_qmp should be able to correctly filter out the QMP version
object for pretty JSON output.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

chardev: Add -qmp-pretty

Add a command line option for adding a QMP monitor using pretty JSON
formatting.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qjson: Drop trailing space for pretty formatting

For the pretty formatting, the functions converting QDicts and QLists to
JSON should not print a space after the comma separating objects,
because a newline will emitted immediately afterwards, making the
whitespace superfluous.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qmp: Add optional switch "query-nodes" in query-blockstats

This bool option will allow query all the node names. It iterates all
the BDSes that are assigned a name, also in this case don't query up the
backing chain.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Include "node-name" if present in query-blockstats

Node name is a better identifier of BDS.

We will want to query statistics of a BDS node buried in the BDS graph,
so reporting the node's name if there is one will do the trick.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add bdrv_get_node_name

This returns the node name of a BDS. Remove the TODO comment and expect
the callers to be explicit.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add bdrv_next_node

Similar to bdrv_next, this traverses through graph_bdrv_states. Will be
useful to enumerate all the named nodes.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Open 2.3 development tree

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v2.2.0 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v2.2.0-rc5 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-cve-2014-8106-20141204-1' into staging

cirrus: fix blit region check

# gpg: Signature made Thu 04 Dec 2014 11:54:57 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-cve-2014-8106-20141204-1:
  cirrus: don't overflow CirrusVGAState->cirrus_bltbuf
  cirrus: fix blit region check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v2.2.0-rc4 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

vhost: Fix vhostfd leak in error branch

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1417166789-1960-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

cirrus: don't overflow CirrusVGAState->cirrus_bltbuf

This is CVE-2014-8106.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

cirrus: fix blit region check

Issues:
* Doesn't check pitches correctly in case it is negative.
* Doesn't check width at all.

Turn macro into functions while being at it, also factor out the check
for one region which we then can simply call twice for src + dst.

This is CVE-2014-8106.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Fix for crash after migration in virtio-rng on bi-endian targets

VirtIO devices now remember which endianness they're operating in in order
to support targets which may have guests of either endianness, such as
powerpc.  This endianness state is transferred in a subsection of the
virtio device's information.

With virtio-rng this can lead to an abort after a loadvm hitting the
assert() in virtio_is_big_endian().  This can be reproduced by doing a
migrate and load from file on a bi-endian target with a virtio-rng device.
The actual guest state isn't particularly important to triggering this.

The cause is that virtio_rng_load_device() calls virtio_rng_process() which
accesses the ring and thus needs the endianness.  However,
virtio_rng_process() is called via virtio_load() before it loads the
subsections.  Essentially the ->load callback in VirtioDeviceClass should
only be used for actually reading the device state from the stream, not for
post-load re-initialization.

This patch fixes the bug by moving the virtio_rng_process() after the call
to virtio_load().  Better yet would be to convert virtio to use vmsd and
have the virtio_rng_process() as a post_load callback, but that's a bigger
project for another day.

This is bugfix, and should be considered for the 2.2 branch.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-id: 1417067290-20715-1-git-send-email-david@gibson.dropbear.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

virtio-net: fix unmap leak

virtio_net_handle_ctrl() and other functions that process control vq
request call iov_discard_front() which will shorten the iov. This will
lead unmapping in virtqueue_push() leaks mapping.

Fixes this by keeping the original iov untouched and using a temp variable
in those functions.

Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1417082643-23907-1-git-send-email-jasowang@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hmp: fix regression of HMP device_del auto-completion

The commits:
- 6a1fa9f5 (monitor: add del completion for peripheral device)
- 66e56b13 (qdev: add qdev_build_hotpluggable_device_list helper)

cause a QEMU crash when trying to use HMP device_del auto-completion.
It can be easily reproduced by:
    <qemu-bin> -enable-kvm  ~/images/fedora.qcow2 -monitor stdio -device virtio-net-pci,id=vnet

    (qemu) device_del
    /home/mapfelba/git/upstream/qemu/hw/core/qdev.c:941:qdev_build_hotpluggable_device_list: Object 0x7f6ce04e4fe0 is not an instance of type device
    Aborted (core dumped)

The root cause is qdev_build_hotpluggable_device_list going recursively over
all peripherals and their children assuming all are devices. It doesn't work
since PCI devices have at least on child which is a memory region (bus master).

Solved by observing that all devices appear as direct children of
/machine/peripheral container. No need of going recursively
over all the children.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reported-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1417002601-20799-1-git-send-email-marcel.a@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

qemu-timer: Avoid overflows when converting timeout to struct timespec

In qemu_poll_ns(), when we convert an int64_t nanosecond timeout into
a struct timespec, we may accidentally run into overflow problems if
the timeout is very long. This happens because the tv_sec field is a
time_t, which is signed, so we might end up setting it to a negative
value by mistake. This will result in what was intended to be a
near-infinite timeout turning into an instantaneous timeout, and we'll
busy loop. Cap the maximum timeout at INT32_MAX seconds (about 68 years)
to avoid this problem.

This specifically manifested on ARM hosts as an extreme slowdown on
guest shutdown (when the guest reprogrammed the PL031 RTC to not
generate alarms using a very long timeout) but could happen on other
hosts and guests too.

Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1416939705-1272-1-git-send-email-peter.maydell@linaro.org

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

The final 2.2 patches from me.

# gpg: Signature made Wed 26 Nov 2014 11:12:25 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  s390x/kvm: Fix compile error
  fw_cfg: fix boot order bug when dynamically modified via QOM
  -machine vmport=auto: Fix handling of VMWare ioport emulation for xen

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

s390x/kvm: Fix compile error

commit a2b257d6212a "memory: expose alignment used for allocating RAM
as MemoryRegion API" triggered a compile error on KVM/s390x.

Fix the prototype and the implementation of legacy_s390_alloc.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fw_cfg: fix boot order bug when dynamically modified via QOM

When we dynamically modify boot order, the length of
boot order will be changed, but we don't update
s->files->f[i].size with new length. This casuse
seabios read a wrong vale of qemu cfg file about
bootorder.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

-machine vmport=auto: Fix handling of VMWare ioport emulation for xen

c/s 9b23cfb76b3a5e9eb5cc899eaf2f46bc46d33ba4

or

c/s b154537ad07598377ebf98252fb7d2aff127983b

moved the testing of xen_enabled() from pc_init1() to
pc_machine_initfn().

xen_enabled() does not return the correct value in
pc_machine_initfn().

Changed vmport from a bool to an enum. Added the value "auto" to do
the old way. Move check of xen_enabled() back to pc_init1().

Acked-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Update version for v2.2.0-rc3 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

input: move input-send-event into experimental namespace

Ongoing discussions on how we are going to specify the console,
so tag the command as experiental so we can refine things in
the 2.3 development cycle.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1416923657-10614-1-git-send-email-armbru@redhat.com
[Spell out "not a stable API", and x- the QAPI schema, too]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc, pci, misc bugfixes

A bunch of bugfixes for 2.2.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 24 Nov 2014 18:59:47 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  pc: acpi: mark all possible CPUs as enabled in SRAT
  pcie: fix improper use of negative value
  pcie: fix typo in pcie_cap_deverr_init()
  target-i386: move generic memory hotplug methods to DSDTs
  acpi-build: mark RAM dirty on table update
  hw/pci: fix crash on shpc error flow
  pc: count in 1Gb hugepage alignment when sizing hotplug-memory container
  pc: explicitly check maxmem limit when adding DIMM
  pc: pc-dimm: use backend alignment during address auto allocation
  pc: align DIMM's address/size by backend's alignment value
  memory: expose alignment used for allocating RAM as MemoryRegion API
  pc: limit DIMM address and size to page aligned values
  pc: make pc_dimm_plug() more readble
  pc: kvm: check if KVM has free memory slots to avoid abort()
  qemu-char: fix tcp_get_fds

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

pc: acpi: mark all possible CPUs as enabled in SRAT

If QEMU is started with -numa ... Windows only notices that
CPU has been hot-added but it will not online such CPUs.

It's caused by the fact that possible CPUs are flagged as
not enabled in SRAT and Windows honoring that information
doesn't use corresponding CPU.

ACPI 5.0 Spec regarding to flag says:
"
Table 5-47 Local APIC Flags
...
Enabled: if zero, this processor is unusable, and the operating system
support will not attempt to use it.
"

Fix QEMU to adhere to spec and mark possible CPUs as enabled
in SRAT.

With that Windows onlines hot-added CPUs as expected.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

pcie: fix improper use of negative value

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

pcie: fix typo in pcie_cap_deverr_init()

Reported-by:
https://bugs.launchpad.net/qemu/+bug/1393440

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

target-i386: move generic memory hotplug methods to DSDTs

This makes it simpler to keep the SSDT byte-for-byte identical for a
given machine type, which is a goal we want to have for 2.2 and newer
types.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

acpi-build: mark RAM dirty on table update

acpi build modifies internal FW CFG RAM on first access
but we forgot to mark it dirty.
If this RAM has been migrated already, it won't be
migrated again, returning corrupted tables to guest.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/pci: fix crash on shpc error flow

If the pci bridge enters in error flow as part
of init process it will only delete the shpc mmio
subregion but not remove it from the properties list,
resulting in segmentation fault when the bridge runs
the exit function.

Example: add a pci bridge without specifing the chassis number:
    <qemu-bin> ... -device pci-bridge,id=p1
Result:
    (qemu) qemu-system-x86_64: -device pci-bridge,id=p1: Bridge chassis not specified. Each bridge is required to be assigned a unique chassis id > 0.
    qemu-system-x86_64: -device pci-bridge,id=p1: Device
    initialization failed.
    Segmentation fault (core dumped)

    if (child->class->unparent) {
    #0  0x00005555558d629b in object_finalize_child_property (obj=0x555556d2e830, name=0x555556d30630 "shpc-mmio[0]", opaque=0x555556a42fc8) at qom/object.c:1078
    #1  0x00005555558d4b1f in object_property_del_all (obj=0x555556d2e830) at qom/object.c:367
    #2  0x00005555558d4ca1 in object_finalize (data=0x555556d2e830) at qom/object.c:412
    #3  0x00005555558d55a1 in object_unref (obj=0x555556d2e830) at qom/object.c:720
    #4  0x000055555572c907 in qdev_device_add (opts=0x5555563544f0) at qdev-monitor.c:566
    #5  0x0000555555744f16 in device_init_func (opts=0x5555563544f0, opaque=0x0) at vl.c:2213
    #6  0x00005555559cf5f0 in qemu_opts_foreach (list=0x555555e0f8e0 <qemu_device_opts>, func=0x555555744efa <device_init_func>, opaque=0x0, abort_on_failure=1) at util/qemu-option.c:1057
    #7  0x000055555574a11b in main (argc=16, argv=0x7fffffffdde8, envp=0x7fffffffde70) at vl.c:423

Unparent the shpc mmio region as part of shpc cleanup.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>

pc: count in 1Gb hugepage alignment when sizing hotplug-memory container

if DIMMs with different size/alignment are interleaved
in creation order, it could lead to hotplug-memory
container fragmentation and following inability to use
all RAM upto maxmem.
For example:
    -m 4G,slots=3,maxmem=7G
    -object memory-backend-file,id=mem-1,size=256M,mem-path=/pagesize-2MB
    -device pc-dimm,id=mem1,memdev=mem-1
    -object memory-backend-file,id=mem-2,size=1G,mem-path=/pagesize-1GB
    -device pc-dimm,id=mem2,memdev=mem-2
    -object memory-backend-file,id=mem-3,size=256M,mem-path=/pagesize-2MB
    -device pc-dimm,id=mem3,memdev=mem-3

fragments hotplug-memory container and doesn't allow
to use 1GB hugepage backend to consume remainig 1Gb.

To ease managment factor count in max 1Gb alignment for
each memory slot when sizing hotplug-memory region so
that regadless of fragmentaion it would be possible to
add max aligned DIMM.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

pc: explicitly check maxmem limit when adding DIMM

Currently maxmem limit is not checked and depends on
hotplug region container not being able to fit more RAM
than maxmem. Do check explicitly so that it would
be possible to change hotplug container size later
to deal with fragmentation.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block patches for 2.2.0-rc3

# gpg: Signature made Mon 24 Nov 2014 12:52:23 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream:
Revert "qemu-img info: show nocow info"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

Three patches to fix ExtINT for the QEMU implementation of the local APIC.

# gpg: Signature made Mon 24 Nov 2014 13:38:36 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  apic: fix incorrect handling of ExtINT interrupts wrt processor priority
  apic: fix loss of IPI due to masked ExtINT
  apic: avoid getting out of halted state on masked PIC interrupts

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

apic: fix incorrect handling of ExtINT interrupts wrt processor priority

This fixes another failure with ExtINT, demonstrated by QNX. The failure
mode is as follows:
- IPI sent to cpu 0 (bit set in APIC irr)
- IPI accepted by cpu 0 (bit cleared in irr, set in isr)
- IPI sent to cpu 0 (bit set in both irr and isr)
- PIC interrupt sent to cpu 0

The PIC interrupt causes CPU_INTERRUPT_HARD to be set, but
apic_irq_pending observes that the highest pending APIC interrupt priority
(the IPI) is the same as the processor priority (since the IPI is still
being handled), so apic_get_interrupt returns a spurious interrupt rather
than the pending PIC interrupt. The result is an endless sequence of
spurious interrupts, since nothing will clear CPU_INTERRUPT_HARD.

Instead, ExtINT interrupts should have ignored the processor priority.
Calling apic_check_pic early in apic_get_interrupt ensures that
apic_deliver_pic_intr is called instead of delivering the spurious
interrupt. apic_deliver_pic_intr then clears CPU_INTERRUPT_HARD if needed.

Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

apic: fix loss of IPI due to masked ExtINT

This patch fixes an obscure failure of the QNX kernel on QEMU x86 SMP.
In QNX, all hardware interrupts come via the PIC, and are delivered by
the cpu 0 LAPIC in ExtINT mode, while IPIs are delivered by the LAPIC
in fixed mode.

This bug happens as follows:
- cpu 0 masks a particular PIC interrupt
- IPI sent to cpu 0 (CPU_INTERRUPT_HARD is set)
- before the IPI is accepted, the masked interrupt line is asserted by the
device

Since the interrupt is masked, apic_deliver_pic_intr will clear
CPU_INTERRUPT_HARD. The IPI will still be set in the APIC irr, but since
CPU_INTERRUPT_HARD is not set the cpu will not notice. Depending on the
scenario this can cause a system hang, i.e. if cpu 0 is expected to unmask
the interrupt.

In order to fix this, do a full check of the APIC before an EXTINT
is acknowledged. This can result in clearing CPU_INTERRUPT_HARD, but
can also result in delivering the lost IPI.

Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

apic: avoid getting out of halted state on masked PIC interrupts

After the next patch, if a masked PIC interrupts causes CPU_INTERRUPT_POLL
to be set, the CPU will spuriously get out of halted state.  While this
is technically valid, we should avoid that.

Make CPU_INTERRUPT_POLL run apic_update_irq in the right thread and then
look at CPU_INTERRUPT_HARD.  If CPU_INTERRUPT_HARD does not get set,
do not report the CPU as having work.

Also move the handling of software-disabled APIC from apic_update_irq
to apic_irq_pending, and always trigger CPU_INTERRUPT_POLL.  This will
be important once we will add a case that resets CPU_INTERRUPT_HARD
from apic_update_irq.  We want to run it even if we go through
CPU_INTERRUPT_POLL, and even if the local APIC is software disabled.

Reported-by: Richard Bilson <rbilson@qnx.com>
Tested-by: Richard Bilson <rbilson@qnx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Revert "qemu-img info: show nocow info"

This reverts commit 000c4dfff4d7686e2fba3066a477a1290ed60622.

The main reason for reverting this commit before the 2.2 release is that
it adds a QAPI interface that we don't want to keep: The 'nocow' flag
doesn't generally make sense for block nodes, but only for the raw-posix
driver. It should therefore be part of ImageInfoSpecific rather than
ImageInfo.

The commit contains more problems, but unlike the API stability issue
they wouldn't justify reverting it.

Conflicts:
block/qapi.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

pc: pc-dimm: use backend alignment during address auto allocation

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>