sdk/emulator/qemu.git
9 years agocoroutine: rewrite pool to avoid mutex
Paolo Bonzini [Tue, 2 Dec 2014 11:05:48 +0000 (12:05 +0100)]
coroutine: rewrite pool to avoid mutex

This patch removes the mutex by using fancy lock-free manipulation of
the pool.  Lock-free stacks and queues are not hard, but they can suffer
from the ABA problem so they are better avoided unless you have some
deferred reclamation scheme like RCU.  Otherwise you have to stick
with adding to a list, and emptying it completely.  This is what this
patch does, by coupling a lock-free global list of available coroutines
with per-CPU lists that are actually used on coroutine creation.

Whenever the destruction pool is big enough, the next thread that runs
out of coroutines will steal the whole destruction pool.  This is positive
in two ways:

1) the allocation does not have to do any atomic operation in the fast
path, it's entirely using thread-local storage.  Once every POOL_BATCH_SIZE
allocations it will do a single atomic_xchg.  Release does an atomic_cmpxchg
loop, that hopefully doesn't cause any starvation, and an atomic_inc.

A later patch will also remove atomic operations from the release path,
and try to avoid the atomic_xchg altogether---succeeding in doing so if
all devices either use ioeventfd or are not submitting requests actively.

2) in theory this should be completely adaptive.  The number of coroutines
around should be a little more than POOL_BATCH_SIZE * number of allocating
threads; so this also empties qemu_coroutine_adjust_pool_size.  (The previous
pool size was POOL_BATCH_SIZE * number of block backends, so it was a bit
more generous.  But if you actually have many high-iodepth disks, it's better
to put them in different iothreads, which will also use separate thread
pools and aio=native file descriptors).

This speeds up perf/cost (in tests/test-coroutine) by a factor of ~1.33.
No matter if we end with some kind of coroutine bypass scheme or not,
it cannot hurt to optimize hot code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1417518350-6167-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agoQSLIST: add lock-free operations
Paolo Bonzini [Tue, 2 Dec 2014 11:05:47 +0000 (12:05 +0100)]
QSLIST: add lock-free operations

These operations are trivial to implement and do not have ABA problems.
They are enough to implement simple multiple-producer, single consumer
lock-free lists or, as in the next patch, the multiple consumers can
steal a whole batch of elements and process them at their leisure.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1417518350-6167-5-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agotest-coroutine: avoid overflow on 32-bit systems
Paolo Bonzini [Tue, 2 Dec 2014 11:05:46 +0000 (12:05 +0100)]
test-coroutine: avoid overflow on 32-bit systems

unsigned long is not large enough to represent 1000000000 * duration there.
Just use floating point.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1417518350-6167-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agoqemu-thread: add per-thread atexit functions
Paolo Bonzini [Tue, 2 Dec 2014 11:05:45 +0000 (12:05 +0100)]
qemu-thread: add per-thread atexit functions

Destructors are the main additional feature of pthread TLS compared
to __thread.  If we were using C++ (hint, hint!) we could have used
thread-local objects with a destructor.  Since we are not, instead,
we add a simple Notifier-based API.

Note that the notifier must be per-thread as well.  We can add a
global list as well later, perhaps.

The Win32 implementation has some complications because a) detached
threads used not to have a QemuThreadData; b) the main thread does
not go through win32_start_routine, so we have to use atexit too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1417518350-6167-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agocoroutine-ucontext: use __thread
Paolo Bonzini [Tue, 2 Dec 2014 11:05:44 +0000 (12:05 +0100)]
coroutine-ucontext: use __thread

ELF thread local storage is about 10% faster on tests/test-coroutine's
perf/cost test.  The timing on my machine is 190ns per iteration with
pthread TLS, 170 with ELF TLS.

Based on a patch by Kevin Wolf and Peter Lieven, but redone to follow
the model of coroutine-win32.c (including the important "noinline"
attribute!).

Platforms without thread-local storage (OpenBSD probably?) will need
a new-enough GCC for this to compile, in order to use the same emutls
support that Windows already relies on.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1417518350-6167-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agoqemu-iotests: Add supported os parameter for python tests
Fam Zheng [Sun, 4 Jan 2015 01:53:52 +0000 (09:53 +0800)]
qemu-iotests: Add supported os parameter for python tests

If I understand correctly, qemu-iotests never meant to be portable. We
only support Linux for all the shell cases, but didn't specify it for
python tests. Now add this and default all the python tests as Linux
only. If we cares enough later, we can override the parameter in
individual cases.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agoqemu-iotests: Add "_supported_os Linux" to 058
Fam Zheng [Sun, 4 Jan 2015 01:53:49 +0000 (09:53 +0800)]
qemu-iotests: Add "_supported_os Linux" to 058

Other cases have this, and this test is not portable as well, as we want
to add "make check-block" to "make check", it shouldn't fail on Mac OS
X.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agoqemu-iotests: Replace "/bin/true" with "true"
Fam Zheng [Sun, 4 Jan 2015 01:53:48 +0000 (09:53 +0800)]
qemu-iotests: Replace "/bin/true" with "true"

The former is not portable because on Mac OSX it is /usr/bin/true.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years ago.gitignore: Ignore generated "common.env"
Fam Zheng [Sun, 4 Jan 2015 01:53:46 +0000 (09:53 +0800)]
.gitignore: Ignore generated "common.env"

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agolibqos: Convert malloc-pc allocator to a generic allocator
Marc Marí [Thu, 23 Oct 2014 08:12:42 +0000 (10:12 +0200)]
libqos: Convert malloc-pc allocator to a generic allocator

The allocator in malloc-pc has been extracted, so it can be used in every arch.
This operation showed that both the alloc and free functions can be also
generic.
Because of this, the QGuestAllocator has been removed from is function to wrap
the alloc and free function, and now just contains the allocator parameters.
As a result, only the allocator initalizer and unitializer are arch dependent.

Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agomigration/block: fix pending() return value
Vladimir Sementsov-Ogievskiy [Tue, 30 Dec 2014 10:04:16 +0000 (13:04 +0300)]
migration/block: fix pending() return value

Because of wrong return value of .save_live_pending() in
migration/block.c, migration finishes before the whole disk is
transferred. Such situation occurs when the migration process is fast
enough, for example when source and dest are on the same host.

If in the bulk phase we return something < max_size, we will skip
transferring the tail of the device. Currently we have "set pending to
BLOCK_SIZE if it is zero" for bulk phase, but there no guarantee, that
it will be < max_size.

True approach is to return, for example, max_size+1 when we are in the
bulk phase.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
Message-id: 1419933856-4018-2-git-send-email-vsementsov@parallels.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agoiotests: Filter out "I/O thread spun..." warning
Max Reitz [Fri, 19 Dec 2014 16:17:06 +0000 (17:17 +0100)]
iotests: Filter out "I/O thread spun..." warning

Filter out the "main loop: WARNING: I/O thread spun for..." warning from
qemu output (it hardly matters for code specifically testing I/O).

Furthermore, use _filter_qemu in all the custom functions which run
qemu.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqemu-iotests: Test blockdev-backup in 055
Fam Zheng [Thu, 18 Dec 2014 10:37:07 +0000 (18:37 +0800)]
qemu-iotests: Test blockdev-backup in 055

This applies cases on drive-backup on blockdev-backup, except cases with
target format and mode.

Also add a case to check source == target.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1418899027-8445-5-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
9 years agoblock: Add blockdev-backup to transaction
Fam Zheng [Thu, 18 Dec 2014 10:37:06 +0000 (18:37 +0800)]
block: Add blockdev-backup to transaction

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1418899027-8445-4-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
9 years agoqmp: Add command 'blockdev-backup'
Fam Zheng [Thu, 18 Dec 2014 10:37:05 +0000 (18:37 +0800)]
qmp: Add command 'blockdev-backup'

Similar to drive-backup, but this command uses a device id as target
instead of creating/opening an image file.

Also add blocker on target bs, since the target is also a named device
now.

Add check and report error for bs == target which became possible but is
an illegal case with introduction of blockdev-backup.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1418899027-8445-3-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
9 years agoqapi: Comment version info in TransactionAction
Fam Zheng [Thu, 18 Dec 2014 10:37:04 +0000 (18:37 +0800)]
qapi: Comment version info in TransactionAction

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1418899027-8445-2-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
9 years agoblock: fix spoiling all dirty bitmaps by mirror and migration
Vladimir Sementsov-Ogievskiy [Thu, 27 Nov 2014 09:40:46 +0000 (12:40 +0300)]
block: fix spoiling all dirty bitmaps by mirror and migration

Mirror and migration use dirty bitmaps for their purposes, and since
commit [block: per caller dirty bitmap] they use their own bitmaps, not
the global one. But they use old functions bdrv_set_dirty and
bdrv_reset_dirty, which change all dirty bitmaps.

Named dirty bitmaps series by Fam and Snow are affected: mirroring and
migration will spoil all (not related to this mirroring or migration)
named dirty bitmaps.

This patch fixes this by adding bdrv_set_dirty_bitmap and
bdrv_reset_dirty_bitmap, which change concrete bitmap. Also, to prevent
such mistakes in future, old functions bdrv_(set,reset)_dirty are made
static, for internal block usage.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
CC: John Snow <jsnow@redhat.com>
CC: Fam Zheng <famz@redhat.com>
CC: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1417081246-3593-1-git-send-email-vsementsov@parallels.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
9 years agoqapi: Fix document for BlockStats.node-name
Fam Zheng [Tue, 16 Dec 2014 01:40:24 +0000 (09:40 +0800)]
qapi: Fix document for BlockStats.node-name

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1418694024-26498-1-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
9 years agoiotests: Add test for relative backing file names
Max Reitz [Wed, 26 Nov 2014 16:20:29 +0000 (17:20 +0100)]
iotests: Add test for relative backing file names

Sometimes, qemu does not have a filename to work with, so it does not
know which directory to use for a backing file specified by a relative
filename. Add a test which tests that qemu exits with an appropriate
error message.

Additionally, add a test for qemu-img create with a backing filename
relative to the backed image's base directory while omitting the image
size.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock/vmdk: Relative backing file for creation
Max Reitz [Wed, 26 Nov 2014 16:20:28 +0000 (17:20 +0100)]
block/vmdk: Relative backing file for creation

When a vmdk image is created with a backing file, it is opened to check
whether it is indeed a vmdk file by letting qemu probe it. When doing
so, the backing filename is relative to the image's base directory so it
should be interpreted accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Relative backing file for image creation
Max Reitz [Wed, 26 Nov 2014 16:20:27 +0000 (17:20 +0100)]
block: Relative backing file for image creation

Relative backing filenames are always relative to the backed image's
directory; the same applies to image creation. Therefore, if the backing
file has to be opened for determining its size (in case the size has not
been explicitly specified) its filename should be interpreted relative
to the new image's base directory and not relative to qemu's working
directory.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: JSON filenames and relative backing files
Max Reitz [Wed, 26 Nov 2014 16:20:26 +0000 (17:20 +0100)]
block: JSON filenames and relative backing files

When using a relative backing file name, qemu needs to know the
directory of the top image file. For JSON filenames, such a directory
cannot be easily determined (e.g. how do you determine the directory of
a qcow2 BDS directly on top of a quorum BDS?). Therefore, do not allow
relative filenames for the backing file of BDSs only having a JSON
filename.

Furthermore, BDS::exact_filename should be used whenever possible. If
BDS::filename is not equal to BDS::exact_filename, the former will
always be a JSON object.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: Get full backing filename from string
Max Reitz [Wed, 26 Nov 2014 16:20:25 +0000 (17:20 +0100)]
block: Get full backing filename from string

Introduce bdrv_get_full_backing_filename_from_filename(), a function
which takes the name of the backed file and a potentially relative
backing filename to produce the full (absolute) backing filename.

Use this function from bdrv_get_full_backing_filename().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agocheckpatch: Brace handling on multi-line condition
Max Reitz [Wed, 26 Nov 2014 16:20:24 +0000 (17:20 +0100)]
checkpatch: Brace handling on multi-line condition

CODING_STYLE states the following about braces around blocks:

> The opening brace is on the line that contains the control flow
> statement that introduces the new block; [...]

This is obviously impossible with multi-line conditions. Therefore,
CODING_STYLE does not make any clear statement about where to put the
opening brace after a multi-line condition.

There is a reason to prefer to place the opening brace on an own line
after such a condition while still placing it on the same line as the
"control flow statement" if possible; that reason is that the last line
of a multi-line condition is indented, in the case of "if", it is often
indented by four spaces, just as much as the first statement in the
block will be indented. This is hard to read as there is no clearly
visible distinction between condition and block. Placing the opening
brace on a separate line solves this issue.

Also, there are cases where placing the opening brace on a separate line
is the only viable option; if the previous line had nearly 80 characters
and splitting it is not desirable, the opening brace is naturally placed
on an own line.

This patch fixes checkpatch.pl to not complain about braces on own lines
if the condition introducing the block spanned more than one line, or if
the previous line had 79 or 80 characters.

Furthermore, the warning about not having braces around a block is fixed
to mind braces not being on the last line of the condition.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: replace g_new0 with g_new for bottom half allocation.
Paolo Bonzini [Wed, 17 Dec 2014 15:10:00 +0000 (16:10 +0100)]
block: replace g_new0 with g_new for bottom half allocation.

This saves about 15% of the clock cycles spent on allocation.  Using the
slice allocator does not add a visible improvement; allocation is faster
than malloc, while freeing seems to be slower.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: do not allocate an iovec per read of a growable/zero_after_eof BDS
Paolo Bonzini [Wed, 17 Dec 2014 15:09:59 +0000 (16:09 +0100)]
block: do not allocate an iovec per read of a growable/zero_after_eof BDS

Most reads do not go past the end of the file, and they can use the
input QEMUIOVector instead of creating one.  This removes the
qemu_iovec_* functions from the profile.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoblock: mark AioContext as recursive
Paolo Bonzini [Wed, 17 Dec 2014 15:09:58 +0000 (16:09 +0100)]
block: mark AioContext as recursive

AioContext can be accessed recursively, in fact that's what we do with
aio_poll.  Marking the GSource as recursive avoids that GLib blocks it
and unblocks it around every call to aio_dispatch, which is a pretty
expensive operation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqemu-iotests: Speed up make check-block
Fam Zheng [Mon, 15 Dec 2014 05:07:18 +0000 (13:07 +0800)]
qemu-iotests: Speed up make check-block

Using /tmp, which is usually mounted as tmpfs, the quick group can be
quicker.

On my laptop (Lenovo T430s with Fedora 20), this reduces the time from
50s to 30s.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoqemu-iotests: Remove 091 from quick group
Fam Zheng [Mon, 15 Dec 2014 05:07:17 +0000 (13:07 +0800)]
qemu-iotests: Remove 091 from quick group

For the purpose of allowing running quick group on tmpfs.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9 years agoMerge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
Peter Maydell [Mon, 12 Jan 2015 11:13:24 +0000 (11:13 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging

# gpg: Signature made Mon 12 Jan 2015 10:27:41 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  hw/net/xen_nic.c: Set 'netdev->mac' to NULL after free it
  hw/net/xen_nic.c: Need free 'netdev->nic' in net_free() instead of net_disconnect()
  hw/net/xen_nic.c: Free 'netdev->txs' when map 'netdev->rxs' fails
  net: remove all cleanup methods from NIC NetClientInfos

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agohw/net/xen_nic.c: Set 'netdev->mac' to NULL after free it
Chen Gang [Tue, 16 Dec 2014 20:58:42 +0000 (04:58 +0800)]
hw/net/xen_nic.c: Set 'netdev->mac' to NULL after free it

Since net_init() checks whether 'netdev->mac' is NULL, before alloc it;
net_release() also need set 'netdev->mac' to NULL after free it.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agohw/net/xen_nic.c: Need free 'netdev->nic' in net_free() instead of net_disconnect()
Chen Gang [Tue, 16 Dec 2014 20:52:16 +0000 (04:52 +0800)]
hw/net/xen_nic.c: Need free 'netdev->nic' in net_free() instead of net_disconnect()

net_init() and net_free() are pairs, net_connect() and net_disconnect()
are pairs. net_init() creates 'netdev->nic', so also need free it in
net_free().

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agohw/net/xen_nic.c: Free 'netdev->txs' when map 'netdev->rxs' fails
Chen Gang [Tue, 16 Dec 2014 20:48:54 +0000 (04:48 +0800)]
hw/net/xen_nic.c: Free 'netdev->txs' when map 'netdev->rxs' fails

When map 'netdev->rxs' fails, need free the original resource, or will
cause resource leak.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agonet: remove all cleanup methods from NIC NetClientInfos
Paolo Bonzini [Tue, 23 Dec 2014 16:53:19 +0000 (17:53 +0100)]
net: remove all cleanup methods from NIC NetClientInfos

All NICs have a cleanup function that, in most cases, zeroes the pointer
to the NICState.  In some cases, it frees data belonging to the NIC.

However, this function is never called except when exiting from QEMU.
It is not necessary to NULL pointers and free data here; the right place
to do that would be in the device's unrealize function, after calling
qemu_del_nic.  Zeroing the NIC multiple times is also wrong for multiqueue
devices.

This cleanup function gets in the way of making the NetClientStates for
the NIC hold an object_ref reference to the object, so get rid of it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9 years agoMerge remote-tracking branch 'remotes/cohuck/tags/s390x-20150112-v3' into staging
Peter Maydell [Mon, 12 Jan 2015 10:09:41 +0000 (10:09 +0000)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20150112-v3' into staging

s390x patches for 2.3.

Highlight is support for PCI devices on s390x. Otherwise, performance
improvements (register sync) and small cleanups.

# gpg: Signature made Mon 12 Jan 2015 09:49:31 GMT using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"

* remotes/cohuck/tags/s390x-20150112-v3:
  kvm: extend kvm_irqchip_add_msi_route to work on s390
  s390: implement pci instructions
  s390: Add PCI bus support
  s390x/kvm: avoid syscalls by syncing registers with kvm_run
  s390x/kvm: sync register support helper function
  s390x/css: Clean up unnecessary CONFIG_USER_ONLY wrappers
  s390x/ccw: fix oddity in machine class init

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agokvm: extend kvm_irqchip_add_msi_route to work on s390
Frank Blaschka [Fri, 9 Jan 2015 08:04:40 +0000 (09:04 +0100)]
kvm: extend kvm_irqchip_add_msi_route to work on s390

on s390 MSI-X irqs are presented as thin or adapter interrupts
for this we have to reorganize the routing entry to contain
valid information for the adapter interrupt code on s390.
To minimize impact on existing code we introduce an architecture
function to fixup the routing entry.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390: implement pci instructions
Frank Blaschka [Fri, 9 Jan 2015 08:04:39 +0000 (09:04 +0100)]
s390: implement pci instructions

This patch implements the s390 pci instructions in qemu. It allows
to access and drive pci devices attached to the s390 pci bus.
Because of platform constrains devices using IO BARs are not
supported. Also a device has to support MSI/MSI-X to run on s390.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390: Add PCI bus support
Frank Blaschka [Fri, 9 Jan 2015 08:04:38 +0000 (09:04 +0100)]
s390: Add PCI bus support

This patch implements a pci bus for s390x together with infrastructure
to generate and handle hotplug events, to configure/unconfigure via
sclp instruction, to do iommu translations and provide s390 support for
MSI/MSI-X notification processing.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/kvm: avoid syscalls by syncing registers with kvm_run
David Hildenbrand [Wed, 3 Dec 2014 14:38:31 +0000 (15:38 +0100)]
s390x/kvm: avoid syscalls by syncing registers with kvm_run

We can avoid loads of syscalls when dropping to user space by storing the values
of more registers directly within kvm_run.

Support is added for:
- ARCH0: CPU timer, clock comparator, TOD programmable register,
         guest breaking-event register, program parameter
- PFAULT: pfault parameters (token, select, compare)

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/kvm: sync register support helper function
David Hildenbrand [Wed, 3 Dec 2014 14:38:30 +0000 (15:38 +0100)]
s390x/kvm: sync register support helper function

Let's unify the code to sync registers by moving the checks into a helper
function can_sync_regs().

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/css: Clean up unnecessary CONFIG_USER_ONLY wrappers
Thomas Huth [Wed, 3 Dec 2014 14:38:29 +0000 (15:38 +0100)]
s390x/css: Clean up unnecessary CONFIG_USER_ONLY wrappers

The css functions are only used from ioinst.c and other files that are
only built for CONFIG_SOFTMMU. So we do not need the dummy wrappers for
the CONFIG_USER_ONLY target in the cpu.h header.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@us.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agos390x/ccw: fix oddity in machine class init
Cornelia Huck [Wed, 3 Dec 2014 14:38:28 +0000 (15:38 +0100)]
s390x/ccw: fix oddity in machine class init

ccw_machine_class_init() uses ',' instead of ';' while initializing
the class' fields. This is almost certainly a copy/paste error and,
while legal C, rather on the unusual side. Just use ';' everywhere.

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
9 years agoMerge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150109.0' into...
Peter Maydell [Sat, 10 Jan 2015 22:29:09 +0000 (22:29 +0000)]
Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150109.0' into staging

VFIO fixes:
- Fix 32bit overflow in handling large PCI BARs (Alex Williamson)
- Fix interrupt shutdown ordering (Alex Williamson)

# gpg: Signature made Fri 09 Jan 2015 16:23:42 GMT using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20150109.0:
  vfio-pci: Fix interrupt disabling
  vfio-pci: Fix BAR size overflow

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoMerge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
Peter Maydell [Sat, 10 Jan 2015 21:02:23 +0000 (21:02 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc: resizeable ROM blocks

This makes ROM blocks resizeable.  This infrastructure is required for other
functionality we have queued.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 08 Jan 2015 11:19:24 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  acpi-build: make ROMs RAM blocks resizeable
  memory: API to allocate resizeable RAM MR
  arch_init: support resizing on incoming migration
  exec: qemu_ram_alloc_resizeable, qemu_ram_resize
  exec: split length -> used_length/max_length
  exec: cpu_physical_memory_set/clear_dirty_range
  memory: add memory_region_set_size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoMerge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Peter Maydell [Sat, 10 Jan 2015 19:50:21 +0000 (19:50 +0000)]
Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging

Patch queue for ppc - 2015-01-07

New year's release. This time's highlights:

  - E500: More RAM support
  - pseries: New SLOF release
  - Migration fixes
  - Simplify USB spawning logic, removes support for explicit usb=off
  - TCG: Simple untansactional TM emulation

# gpg: Signature made Wed 07 Jan 2015 15:19:37 GMT using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"

* remotes/agraf/tags/signed-ppc-for-upstream: (37 commits)
  hw/ppc/mac_newworld: simplify usb controller creation logic
  hw/ppc/spapr: simplify usb controller creation logic
  hw/ppc/mac_newworld: QOMified mac99 machines
  hw/usb: simplified usb_enabled
  hw/machine: added machine_usb wrapper
  hw/ppc: modified the condition for usb controllers to be created for some ppc machines
  target-ppc: Cast ssize_t to size_t before printing with %zx
  target-ppc: Mark SR() and gen_sync_exception() as !CONFIG_USER_ONLY
  PPC: e500: Fix GPIO controller interrupt number
  target-ppc: Introduce Privileged TM Noops
  target-ppc: Introduce tcheck
  target-ppc: Introduce TM Noops
  target-ppc: Introduce tbegin
  target-ppc: Introduce TEXASRU Bit Fields
  target-ppc: Power8 Supports Transactional Memory
  target-ppc: Introduce tm_enabled Bit to CPU State
  target-ppc: Introduce Feature Flag for Transactional Memory
  target-ppc: Introduce Instruction Type for Transactional Memory
  pseries: Update SLOF firmware image to 20141202
  PPC: Fix crash on spapr_tce_table_finalize()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoMerge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20150105' into staging
Peter Maydell [Sat, 10 Jan 2015 19:06:41 +0000 (19:06 +0000)]
Merge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20150105' into staging

seccomp branch queue

# gpg: Signature made Mon 05 Jan 2015 17:17:01 GMT using RSA key ID 12F8BD2F
# gpg: Can't check signature: public key not found

* remotes/otubo/tags/pull-seccomp-20150105:
  seccomp: add mbind() to the syscall whitelist
  seccomp: typo in configure error message

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoMerge remote-tracking branch 'remotes/amit-virtio-rng/tags/rng-for-2.3' into staging
Peter Maydell [Fri, 9 Jan 2015 18:55:29 +0000 (18:55 +0000)]
Merge remote-tracking branch 'remotes/amit-virtio-rng/tags/rng-for-2.3' into staging

Fixes an init-time check for parameter validity

# gpg: Signature made Mon 05 Jan 2015 08:34:05 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-virtio-rng/tags/rng-for-2.3:
  virtio-rng: fix check for period_ms validity

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoMerge remote-tracking branch 'remotes/amit/tags/for-2.3' into staging
Peter Maydell [Fri, 9 Jan 2015 17:59:16 +0000 (17:59 +0000)]
Merge remote-tracking branch 'remotes/amit/tags/for-2.3' into staging

Migration fix for virtio-serial devices on bi-endian targets by David
Gibson.

# gpg: Signature made Mon 05 Jan 2015 07:26:07 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit/tags/for-2.3:
  virtio-serial: Don't keep a persistent copy of config space
  virtio_serial: Don't use vser->config.max_nr_ports internally

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoMerge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Peter Maydell [Fri, 9 Jan 2015 16:29:36 +0000 (16:29 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

More migration fixes and more record/replay preparations.  Also moves
the sdhci-pci device id to make space for the rocker device.

# gpg: Signature made Sat 03 Jan 2015 08:22:36 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  pci: move REDHAT_SDHCI device ID to make room for Rocker
  block/iscsi: fix uninitialized variable
  pckbd: set bits 2-3-6-7 of the output port by default
  serial: refine serial_thr_ipending_needed
  gen-icount: check cflags instead of use_icount global
  translate: check cflags instead of use_icount global
  cpu-exec: add a new CF_USE_ICOUNT cflag
  target-ppc: pass DisasContext to SPR generator functions
  atomic: fix position of volatile qualifier

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agovfio-pci: Fix interrupt disabling
Alex Williamson [Fri, 9 Jan 2015 15:50:53 +0000 (08:50 -0700)]
vfio-pci: Fix interrupt disabling

When disabling MSI/X interrupts the disable functions will leave the
device in INTx mode (when available).  This matches how hardware
operates, INTx is enabled unless MSI/X is enabled (DisINTx is handled
separately).  Therefore when we really want to disable all interrupts,
such as when removing the device, and we start with the device in
MSI/X mode, we need to pass through INTx on our way to being
completely quiesced.

In well behaved situations, the guest driver will have shutdown the
device and it will start vfio_exitfn() in INTx mode, producing the
desired result.  If hot-unplug causes the guest to crash, we may get
the device in MSI/X state, which will leave QEMU with a bogus handler
installed.

Fix this by re-ordering our disable routine so that it should always
finish in VFIO_INT_NONE state, which is what all callers expect.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
9 years agovfio-pci: Fix BAR size overflow
Alex Williamson [Fri, 9 Jan 2015 15:50:53 +0000 (08:50 -0700)]
vfio-pci: Fix BAR size overflow

We use an unsigned int when working with the PCI BAR size, which can
obviously overflow if the BAR is 4GB or larger.  This needs to change
to a fixed length uint64_t.  A similar issue is possible, though even
more unlikely, when mapping the region above an MSI-X table.  The
start of the MSI-X vector table must be below 4GB, but the end, and
therefore the start of the next mapping region, could still land at
4GB.

Suggested-by: Nishank Trivedi <nishank.trivedi@netapp.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
9 years agoMerge remote-tracking branch 'remotes/mwalle/tags/lm32-fixes/20141229' into staging
Peter Maydell [Fri, 9 Jan 2015 15:38:20 +0000 (15:38 +0000)]
Merge remote-tracking branch 'remotes/mwalle/tags/lm32-fixes/20141229' into staging

lm32: milkymist fixes and MAINTAINER update

# gpg: Signature made Tue 30 Dec 2014 16:54:15 GMT using DSA key ID 3F98A378
# gpg: Can't check signature: public key not found

* remotes/mwalle/tags/lm32-fixes/20141229:
  MAINTAINERS: add myself to lm32 and milkymist
  milkymist: softmmu: fix event handling

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agohw/ppc/mac_newworld: simplify usb controller creation logic
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:17 +0000 (15:29 +0200)]
hw/ppc/mac_newworld: simplify usb controller creation logic

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-id: 1420550957-22337-7-git-send-email-marcel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agohw/ppc/spapr: simplify usb controller creation logic
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:16 +0000 (15:29 +0200)]
hw/ppc/spapr: simplify usb controller creation logic

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-id: 1420550957-22337-6-git-send-email-marcel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agohw/ppc/mac_newworld: QOMified mac99 machines
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:15 +0000 (15:29 +0200)]
hw/ppc/mac_newworld: QOMified mac99 machines

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-id: 1420550957-22337-5-git-send-email-marcel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agohw/usb: simplified usb_enabled
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:14 +0000 (15:29 +0200)]
hw/usb: simplified usb_enabled

The argument is not longer used and the implementation
uses now QOM instead of QemuOpts.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-id: 1420550957-22337-4-git-send-email-marcel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agohw/machine: added machine_usb wrapper
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:13 +0000 (15:29 +0200)]
hw/machine: added machine_usb wrapper

Following QOM convention, object properties should
not be accessed directly.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-id: 1420550957-22337-3-git-send-email-marcel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agohw/ppc: modified the condition for usb controllers to be created for some ppc machines
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:12 +0000 (15:29 +0200)]
hw/ppc: modified the condition for usb controllers to be created for some ppc machines

Some ppc machines create a default usb controller based on a 'machine condition'.
Until now the logic was: create the usb controller if:
 -  the usb option was supplied in cli and value is true or
 -  the usb option was absent and both set_defaults and the machine
    condition were true.

Modified the logic to:
Create the usb controller if:
 - the machine condition is true and defaults are enabled or
 - the usb option is supplied and true.

The main for this is to simplify the usb_enabled method.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-id: 1420550957-22337-2-git-send-email-marcel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
9 years agoacpi-build: make ROMs RAM blocks resizeable
Michael S. Tsirkin [Mon, 17 Nov 2014 05:51:50 +0000 (07:51 +0200)]
acpi-build: make ROMs RAM blocks resizeable

Use resizeable ram API so we can painlessly extend ROMs in the
future.  Note: migration is not affected, as we are
not actually changing the used length for RAM, which
is the part that's migrated.

Use this in acpi: reserve x16 more RAM space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agomemory: API to allocate resizeable RAM MR
Michael S. Tsirkin [Sun, 16 Nov 2014 22:24:36 +0000 (00:24 +0200)]
memory: API to allocate resizeable RAM MR

Add API to allocate resizeable RAM MR.

This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.

This used_length size can change across reboots.

Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.

Device is notified on resize, so it can adjust if necessary.

Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoarch_init: support resizing on incoming migration
Michael S. Tsirkin [Mon, 17 Nov 2014 15:55:43 +0000 (17:55 +0200)]
arch_init: support resizing on incoming migration

If block used_length does not match, try to resize it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoexec: qemu_ram_alloc_resizeable, qemu_ram_resize
Michael S. Tsirkin [Wed, 12 Nov 2014 12:27:41 +0000 (14:27 +0200)]
exec: qemu_ram_alloc_resizeable, qemu_ram_resize

Add API to allocate "resizeable" RAM.
This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.

This used_length size can change across reboots.

Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.

Device is notified on resize, so it can adjust if necessary.

qemu_ram_alloc_resizeable allocates this memory, qemu_ram_resize resizes
it.

Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoexec: split length -> used_length/max_length
Michael S. Tsirkin [Mon, 15 Dec 2014 20:55:32 +0000 (22:55 +0200)]
exec: split length -> used_length/max_length

This patch allows us to distinguish between two
length values for each block:
    max_length - length of memory block that was allocated
    used_length - length of block used by QEMU/guest

Currently, we set used_length - max_length, unconditionally.
Follow-up patches allow used_length <= max_length.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agoexec: cpu_physical_memory_set/clear_dirty_range
Michael S. Tsirkin [Mon, 17 Nov 2014 15:54:07 +0000 (17:54 +0200)]
exec: cpu_physical_memory_set/clear_dirty_range

Make cpu_physical_memory_set/clear_dirty_range
behave symmetrically.

To clear range for a given client type only, add
cpu_physical_memory_clear_dirty_range_type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agomemory: add memory_region_set_size
Michael S. Tsirkin [Tue, 16 Dec 2014 09:21:23 +0000 (11:21 +0200)]
memory: add memory_region_set_size

Add API to change MR size.
Will be used internally for RAM resize.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9 years agohw/ppc/mac_newworld: simplify usb controller creation logic
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:17 +0000 (15:29 +0200)]
hw/ppc/mac_newworld: simplify usb controller creation logic

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agohw/ppc/spapr: simplify usb controller creation logic
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:16 +0000 (15:29 +0200)]
hw/ppc/spapr: simplify usb controller creation logic

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agohw/ppc/mac_newworld: QOMified mac99 machines
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:15 +0000 (15:29 +0200)]
hw/ppc/mac_newworld: QOMified mac99 machines

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agohw/usb: simplified usb_enabled
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:14 +0000 (15:29 +0200)]
hw/usb: simplified usb_enabled

The argument is not longer used and the implementation
uses now QOM instead of QemuOpts.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agohw/machine: added machine_usb wrapper
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:13 +0000 (15:29 +0200)]
hw/machine: added machine_usb wrapper

Following QOM convention, object properties should
not be accessed directly.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agohw/ppc: modified the condition for usb controllers to be created for some ppc machines
Marcel Apfelbaum [Tue, 6 Jan 2015 13:29:12 +0000 (15:29 +0200)]
hw/ppc: modified the condition for usb controllers to be created for some ppc machines

Some ppc machines create a default usb controller based on a 'machine condition'.
Until now the logic was: create the usb controller if:
 -  the usb option was supplied in cli and value is true or
 -  the usb option was absent and both set_defaults and the machine
    condition were true.

Modified the logic to:
Create the usb controller if:
 - the machine condition is true and defaults are enabled or
 - the usb option is supplied and true.

The main for this is to simplify the usb_enabled method.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Cast ssize_t to size_t before printing with %zx
Peter Maydell [Tue, 23 Dec 2014 22:22:16 +0000 (22:22 +0000)]
target-ppc: Cast ssize_t to size_t before printing with %zx

The mingw32 compiler complains about trying to print variables of type
ssize_t with the %z format string specifier. Since we're printing it
as unsigned hex anyway, cast to size_t to silence the warning.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Mark SR() and gen_sync_exception() as !CONFIG_USER_ONLY
Peter Maydell [Tue, 23 Dec 2014 22:22:15 +0000 (22:22 +0000)]
target-ppc: Mark SR() and gen_sync_exception() as !CONFIG_USER_ONLY

The functions SR() and gen_sync_exception() are only used in softmmu
configs; wrap them in #ifndef CONFIG_USER_ONLY to suppress clang warnings
on the linux-user builds.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agoPPC: e500: Fix GPIO controller interrupt number
Amit Tomar [Fri, 19 Dec 2014 14:20:37 +0000 (14:20 +0000)]
PPC: e500: Fix GPIO controller interrupt number

The GPIO controller lives at IRQ 47, not 43 on real hardware. This is a problem
because IRQ 43 is occupied by the I2C controller which we want to implement
next, so we'd have a conflict on that IRQ number.

Move the GPIO controller to IRQ 47 where it belongs.

Signed-off-by: Amit Singh Tomar <amit.tomar@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce Privileged TM Noops
Tom Musta [Thu, 18 Dec 2014 16:34:37 +0000 (10:34 -0600)]
target-ppc: Introduce Privileged TM Noops

Add the supervisory Transactional Memory instructions treclaim. and
trechkpt.  The implementation is a degenerate one that simply
checks privileged state, TM availability and then sets CR[0] to
0b0000, just like the unprivileged noops.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce tcheck
Tom Musta [Thu, 18 Dec 2014 16:34:36 +0000 (10:34 -0600)]
target-ppc: Introduce tcheck

Add a degenerate implementation of the Transaction Check (tcheck)
instruction.  Since transaction always immediately fail, this
implementation simply sets CR[BF] to 0b1000, i.e. TDOOMED = 1
and MSR[TS] == 0.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce TM Noops
Tom Musta [Thu, 18 Dec 2014 16:34:35 +0000 (10:34 -0600)]
target-ppc: Introduce TM Noops

Add degenerate implementations of the non-privileged Transactional
Memory instructions tend., tabort*. and tsr.  This implementation
simply checks the MSR[TM] bit and then sets CR0 to 0b0000.  This
is a reasonable degenerate implementation since transactions are
never allowed to begin and hence MSR[TS] is always 0b00.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce tbegin
Tom Musta [Thu, 18 Dec 2014 16:34:34 +0000 (10:34 -0600)]
target-ppc: Introduce tbegin

Provide a degenerate implementation of the tbegin instruction.  This
implementation always fails the transaction, recording the failure
per Book II Section 5.3.2 of the Power ISA V2.07.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce TEXASRU Bit Fields
Tom Musta [Thu, 18 Dec 2014 16:34:33 +0000 (10:34 -0600)]
target-ppc: Introduce TEXASRU Bit Fields

Define mnemonics for the various bit fields in the Transaction
EXception And Summary Register (TEXASR).
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Power8 Supports Transactional Memory
Tom Musta [Thu, 18 Dec 2014 16:34:32 +0000 (10:34 -0600)]
target-ppc: Power8 Supports Transactional Memory

The Power8 processor implements the Transactional Memory Facility
as defined in Power ISA 2.07.  Update the initialization code to
indicate this.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce tm_enabled Bit to CPU State
Tom Musta [Thu, 18 Dec 2014 16:34:31 +0000 (10:34 -0600)]
target-ppc: Introduce tm_enabled Bit to CPU State

Add a bit (tm_enabled) to CPU state that mirrors the MSR[TM] bit.
This is analogous to the other "available" bits in the MSR (FP,
VSX, etc.).

NOTE: Since MSR[TM] occupies big-endian bit 31, the code is wrapped
with a PPC64 bit check.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce Feature Flag for Transactional Memory
Tom Musta [Thu, 18 Dec 2014 16:34:30 +0000 (10:34 -0600)]
target-ppc: Introduce Feature Flag for Transactional Memory

Add a flag (POWERPC_FLAG_TM) for the Transactional Memory
Facility introduced in Power ISA 2.07.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Introduce Instruction Type for Transactional Memory
Tom Musta [Thu, 18 Dec 2014 16:34:29 +0000 (10:34 -0600)]
target-ppc: Introduce Instruction Type for Transactional Memory

Add a category (PPC2_TM) for the Transactional Memory instructions
introduced in Power ISA 2.07.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agopseries: Update SLOF firmware image to 20141202
Alexey Kardashevskiy [Tue, 2 Dec 2014 04:32:46 +0000 (15:32 +1100)]
pseries: Update SLOF firmware image to 20141202

The changelog is:
  > version: update to 20141202
  > ipv4: Fix send packet across a subnet
  > pci: scan only type 0 and type 1
  > usb-xhci: support xhci extended capabilities
  > Fix term-io-key to also work when stdin has not been set yet
  > net-snk: llfw startup is using the wrong offset to handler
  > net-snk: Make call_client_interface() a bit more ABI compliant
  > net-snk: Remove custom printf version
  > net-snk: Sanitize our .lds file
  > net-snk: Avoid type clash for stdin & stdout
  > net-snk: use socket descriptor in the network stack
  > net-snk: Remove printk() in favor of printf()
  > net-snk: Remove redundant prototypes
  > net-snk: Remove unused timer functions
  > net-snk: Remove some unused PCI functions
  > net-snk: Remove module system
  > net-snk: Remove insmod/rmmod
  > net-snk: Remove snk_kernel_interface and related definitions
  > net-snk: Remove pci/vio_config gunk
  > js2x: Fix build
  > net-snk: Remoe some now unused "kernel" functions
  > rtas: Improve error handling in instantiate-rtas
  > version: update to 20140827
  > Add private HCALL to inform updated RTAS base and entry
  > xhci: fix port assignment

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agoPPC: Fix crash on spapr_tce_table_finalize()
David Gibson [Mon, 8 Dec 2014 02:48:02 +0000 (13:48 +1100)]
PPC: Fix crash on spapr_tce_table_finalize()

spapr_tce_table_finalize() can SEGV if the object was not previously
realized.  In particular this can be triggered by running
         qemu-system-ppc -device spapr-tce-table,?

The basic problem is that we have mismatched initialization versus
finalization: spapr_tce_table_finalize() is attempting to undo things that
are done in spapr_tce_table_realize(), not an instance_init function.

Therefore, replace spapr_tce_table_finalize() with
spapr_tce_table_unrealize().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agoppc: do not use get_clock_realtime()
Paolo Bonzini [Wed, 26 Nov 2014 14:01:01 +0000 (15:01 +0100)]
ppc: do not use get_clock_realtime()

Use the external qemu-timer API instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agodevice-tree: fix memory leak
Sergey Fedorov [Thu, 11 Dec 2014 15:45:05 +0000 (18:45 +0300)]
device-tree: fix memory leak

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agospapr: Fix stale HTAB during live migration (TCG)
Samuel Mendoza-Jonas [Mon, 17 Nov 2014 04:12:30 +0000 (15:12 +1100)]
spapr: Fix stale HTAB during live migration (TCG)

If a TCG guest reboots during a running migration HTAB entries are not
marked dirty, and the destination boots with an invalid HTAB.

When a reboot occurs, explicitly mark the current HTAB dirty after
clearing it.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agospapr: Fix integer overflow during migration (TCG)
Samuel Mendoza-Jonas [Mon, 17 Nov 2014 04:12:29 +0000 (15:12 +1100)]
spapr: Fix integer overflow during migration (TCG)

The n_valid and n_invalid fields are unsigned short integers but it is
possible to have more than 65535 entries in a contiguous hunk, overflowing
the field. This results in an incorrect HTAB being sent to the destination
during migration.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agospapr: Fix stale HTAB during live migration (KVM)
Samuel Mendoza-Jonas [Mon, 17 Nov 2014 04:12:28 +0000 (15:12 +1100)]
spapr: Fix stale HTAB during live migration (KVM)

If a guest reboots during a running migration, changes to the
hash page table are not necessarily updated on the destination.
Opening a new file descriptor to the HTAB forces the migration
handler to resend the entire table.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: explicitly save page table headers in big endian
Cédric Le Goater [Mon, 3 Nov 2014 15:14:50 +0000 (16:14 +0100)]
target-ppc: explicitly save page table headers in big endian

Currently, when the page tables are saved, the kvm_get_htab_header structs
and the ptes are assumed being big endian and dumped as a indistinct blob
in the statefile. This is no longer true when the host is little endian
and this breaks restoration.

This patch unfolds the kvmppc_save_htab routine to write explicitly the
kvm_get_htab_header structs in big endian. The ptes are left untouched.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Eliminate set_fprf Argument From helper_compute_fprf
Tom Musta [Wed, 12 Nov 2014 21:46:04 +0000 (15:46 -0600)]
target-ppc: Eliminate set_fprf Argument From helper_compute_fprf

The set_fprf argument to the helper_compute_fprf helper function
is no longer necessary -- the helper is only invoked when FPSCR[FPRF]
is going to be set.

Eliminate the unnecessary argument from the function signature and
its corresponding implementation.  Change the return value of the
helper to "void".  Update the name of the local variable "ret" to
"fprf", which now makes more sense.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Eliminate set_fprf Argument From gen_compute_fprf
Tom Musta [Wed, 12 Nov 2014 21:46:03 +0000 (15:46 -0600)]
target-ppc: Eliminate set_fprf Argument From gen_compute_fprf

The set_fprf argument to the gen_compute_fprf() utility is no longer
needed -- gen_compute_fprf() is now called only when FPRF is actually
computed and set.  Eliminate the obsolete argument.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Fully Migrate to gen_set_cr1_from_fpscr
Tom Musta [Wed, 12 Nov 2014 21:46:02 +0000 (15:46 -0600)]
target-ppc: Fully Migrate to gen_set_cr1_from_fpscr

Eliminate the set_rc argument from the gen_compute_fprf utility and
the corresponding (and incorrect) implementation.  Replace it with
calls to the gen_set_cr1_from_fpscr() utility.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: mffs. Should Set CR1 from FPSCR Bits
Tom Musta [Wed, 12 Nov 2014 21:46:01 +0000 (15:46 -0600)]
target-ppc: mffs. Should Set CR1 from FPSCR Bits

Update the Move From FPSCR (mffs.) instruction to correctly
set CR[1] from FPSCR[FX,FEX,VX,OX].

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Fix Floating Point Move Instructions That Set CR1
Tom Musta [Wed, 12 Nov 2014 21:46:00 +0000 (15:46 -0600)]
target-ppc: Fix Floating Point Move Instructions That Set CR1

The Floating Point Move instructions (fmr., fabs., fnabs., fneg.,
and fcpsgn.) incorrectly copy FPSCR[FPCC] instead of [FX,FEX,VX,OX].
Furthermore, the current code does this via a call to gen_compute_fprf,
which is awkward since these instructions do not actually set FPRF.

Change the code to use the gen_set_cr1_from_fpscr utility.

Signed-off-by: Tom Musta <tommusta@gmail.com>
[agraf: whitespace fixes]
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: VXSQRT Should Not Be Set for NaNs
Tom Musta [Wed, 12 Nov 2014 21:45:59 +0000 (15:45 -0600)]
target-ppc: VXSQRT Should Not Be Set for NaNs

The Power ISA square root instructions (fsqrt[s], frsqrte[s]) must
set the FPSCR[VXSQRT] flag when operating on a negative value.
However, NaNs have no sign and therefore this flag should not
be set when operating on one.

Change the order of the checks in the helper code.  Move the
SNaN-to-QNaN macro to the top of the file so that it can be
re-used.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agotarget-ppc: Load/Store Vector Element Storage Alignment
Tom Musta [Mon, 17 Nov 2014 20:58:31 +0000 (14:58 -0600)]
target-ppc: Load/Store Vector Element Storage Alignment

The Load Vector Element Indexed and Store Vector Element Indexed
instructions compute an effective address in the usual manner.
However, they truncate that address to the natural boundary.
For example, the lvewx instruction will ignore the least significant
two bits of the address and thus load the aligned word of storage.

Fix the generators for these instruction to properly perform this
truncation.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agoPPC: e500 pci host: Add support for ATMUs
Alexander Graf [Wed, 12 Nov 2014 21:44:52 +0000 (22:44 +0100)]
PPC: e500 pci host: Add support for ATMUs

The e500 PCI controller has configurable windows that allow a guest OS
to selectively map parts of the PCI bus space to CPU address space and
to selectively map parts of the CPU address space for DMA requests into
PCI visible address ranges.

So far, we've simply assumed that this mapping is 1:1 and ignored it.

However, the PCICSRBAR (CCSR mapped in PCI bus space) always has to live
inside the first 32bits of address space. This means if we always treat
all mappings as 1:1, this map will collide with our RAM map from the CPU's
point of view.

So this patch adds proper ATMU support which allows us to keep the PCICSRBAR
below 32bits local to the PCI bus and have another, different window to PCI
BARs at the upper end of address space. We leverage this on e500plat though,
mpc8544ds stays virtually 1:1 like it was before, but now also goes via ATMU.

With this patch, I can run guests with lots of RAM and not coincidently access
MSI-X mappings while I really want to access RAM.

Signed-off-by: Alexander Graf <agraf@suse.de>
9 years agoPPC: mpc8554ds: Tell user about exceeding RAM limits
Alexander Graf [Wed, 12 Nov 2014 21:35:33 +0000 (22:35 +0100)]
PPC: mpc8554ds: Tell user about exceeding RAM limits

The mpc8544ds board only supports up to 3GB of RAM due to its limited
address space.

When the user requests more, abort and tell him that he should use less.

Signed-off-by: Alexander Graf <agraf@suse.de>