Alexey Kardashevskiy [Thu, 26 May 2016 15:43:22 +0000 (09:43 -0600)]
vfio: Fix 128 bit handling when deleting region
7532d3cbf "vfio: Fix 128 bit handling" added support for 64bit IOMMU
memory regions when those are added to VFIO address space; however
removing code cannot cope with these as int128_get64() will fail on
1<<64.
This copies 128bit handling from region_add() to region_del().
Since the only machine type which is actually going to use 64bit IOMMU
is pseries and it never really removes them (instead it will dynamically
add/remove subregions), this should cause no behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:22 +0000 (09:43 -0600)]
vfio/pci: Add IGD documentation
Document the usage modes, host primary graphics considerations, usage,
and fw_cfg ABI required for IGD assignment with vfio.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:22 +0000 (09:43 -0600)]
vfio/pci: Add a separate option for IGD OpRegion support
The IGD OpRegion is enabled automatically when running in legacy mode,
but it can sometimes be useful in universal passthrough mode as well.
Without an OpRegion, output spigots don't work, and even though Intel
doesn't officially support physical outputs in UPT mode, it's a
useful feature. Note that if an OpRegion is enabled but a monitor is
not connected, some graphics features will be disabled in the guest
versus a headless system without an OpRegion, where they would work.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:21 +0000 (09:43 -0600)]
vfio/pci: Intel graphics legacy mode assignment
Enable quirks to support SandyBridge and newer IGD devices as primary
VM graphics. This requires new vfio-pci device specific regions added
in kernel v4.6 to expose the IGD OpRegion, the shadow ROM, and config
space access to the PCI host bridge and LPC/ISA bridge. VM firmware
support, SeaBIOS only so far, is also required for reserving memory
regions for IGD specific use. In order to enable this mode, IGD must
be assigned to the VM at PCI bus address 00:02.0, it must have a ROM,
it must be able to enable VGA, it must have or be able to create on
its own an LPC/ISA bridge of the proper type at PCI bus address
00:1f.0 (sorry, not compatible with Q35 yet), and it must have the
above noted vfio-pci kernel features and BIOS. The intention is that
to enable this mode, a user simply needs to assign 00:02.0 from the
host to 00:02.0 in the VM:
-device vfio-pci,host=0000:00:02.0,bus=pci.0,addr=02.0
and everything either happens automatically or it doesn't. In the
case that it doesn't, we leave error reports, but assume the device
will operate in universal passthrough mode (UPT), which doesn't
require any of this, but has a much more narrow window of supported
devices, supported use cases, and supported guest drivers.
When using IGD in this mode, the VM firmware is required to reserve
some VM RAM for the OpRegion (on the order or several 4k pages) and
stolen memory for the GTT (up to 8MB for the latest GPUs). An
additional option, x-igd-gms allows the user to specify some amount
of additional memory (value is number of 32MB chunks up to 512MB) that
is pre-allocated for graphics use. TBH, I don't know of anything that
requires this or makes use of this memory, which is why we don't
allocate any by default, but the specification suggests this is not
actually a valid combination, so the option exists as a workaround.
Please report if it's actually necessary in some environment.
See code comments for further discussion about the actual operation
of the quirks necessary to assign these devices.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:21 +0000 (09:43 -0600)]
vfio/pci: Setup BAR quirks after capabilities probing
Capability probing modifies wmask, which quirks may be interested in
changing themselves. Apply our BAR quirks after the capability scan
to make this possible.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:21 +0000 (09:43 -0600)]
vfio/pci: Consolidate VGA setup
Combine VGA discovery and registration. Quirks can have dependencies
on BARs, so the quirks push out until after we've scanned the BARs.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:20 +0000 (09:43 -0600)]
vfio/pci: Fix return of vfio_populate_vga()
This function returns success if either we setup the VGA region or
the host vfio doesn't return enough regions to support the VGA index.
This latter case doesn't make any sense. If we're asked to populate
VGA, fail if it doesn't exist and let the caller decide if that's
important.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:20 +0000 (09:43 -0600)]
vfio: Create device specific region info helper
Given a device specific region type and sub-type, find it. Also
cleanup return point on error in vfio_get_region_info() so that we
always return 0 with a valid pointer or -errno and NULL.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Alex Williamson [Thu, 26 May 2016 15:43:20 +0000 (09:43 -0600)]
vfio: Enable sparse mmap capability
The sparse mmap capability in a vfio region info allows vfio to tell
us which sub-areas of a region may be mmap'd. Thus rather than
assuming a single mmap covers the entire region and later frobbing it
ourselves for things like the PCI MSI-X vector table, we can read that
directly from vfio.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Peter Maydell [Thu, 26 May 2016 13:29:29 +0000 (14:29 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Wed 25 May 2016 18:32:40 BST using RSA key ID
C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (31 commits)
blockjob: Remove BlockJob.bs
commit: Use BlockBackend for I/O
backup: Use BlockBackend for I/O
backup: Remove bs parameter from backup_do_cow()
backup: Pack Notifier within BackupBlockJob
backup: Don't leak BackupBlockJob in error path
mirror: Use BlockBackend for I/O
mirror: Allow target that already has a BlockBackend
stream: Use BlockBackend for I/O
block: Make blk_co_preadv/pwritev() public
block: Convert block job core to BlockBackend
block: Default to enabled write cache in blk_new()
block: Cancel jobs first in bdrv_close_all()
block: keep a list of block jobs
block: Rename blk_write_zeroes()
dma-helpers: change BlockBackend to opaque value in DMAIOFunc
dma-helpers: change interface to byte-based
block: Propagate .drained_begin/end callbacks
block: Fix reconfiguring graph with drained nodes
block: Make bdrv_drain() use bdrv_drained_begin/end()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Andreas Färber [Mon, 13 Jul 2015 17:35:41 +0000 (19:35 +0200)]
qdev: Start disentangling bus from device
Move bus type and related APIs to a separate file bus.c.
This is a first step in breaking up qdev.c into more manageable chunks.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[AF: Rebased onto osdep.h]
Signed-off-by: Andreas Färber <afaerber@suse.de>
[PMM: added bus.o to link line for test-qdev-global-props]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Sergey Fedorov [Mon, 16 May 2016 13:13:00 +0000 (16:13 +0300)]
cpu-exec: Fix direct jump to TB spanning page
It is not safe to make a direct jump to a TB spanning two pages in
system emulation because the mapping for the second page can get changed
but we don't take care of direct jumps in this case.
However in user mode emulation, this is not the case because there's
only static address translation and TBs are always invalidated properly.
Fixes:
5b053a4a2827 ("tcg: Clean up direct block chaining safety checks")
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id:
1463404380-29302-1-git-send-email-sergey.fedorov@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 26 May 2016 11:41:12 +0000 (12:41 +0100)]
Merge remote-tracking branch 'remotes/afaerber/tags/maintainers-for-peter' into staging
Andreas stepping down from most maintainer positions
# gpg: Signature made Wed 25 May 2016 16:53:45 BST using RSA key ID
3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg: aka "Andreas Färber <afaerber@suse.com>"
* remotes/afaerber/tags/maintainers-for-peter:
MAINTAINERS: Drop Andreas as CPU maintainer
MAINTAINERS: Drop Andreas as 0.15 maintainer
MAINTAINERS: Drop Andreas as PReP maintainer
MAINTAINERS: Drop Andreas as Cocoa maintainer
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Kevin Wolf [Mon, 18 Apr 2016 15:30:17 +0000 (17:30 +0200)]
blockjob: Remove BlockJob.bs
There is a single remaining user in qemu-img, and another one in a test
case, both of which can be trivially converted to using BlockJob.blk
instead.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Kevin Wolf [Thu, 14 Apr 2016 11:09:53 +0000 (13:09 +0200)]
commit: Use BlockBackend for I/O
This changes the commit block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the commit code any more
afterwards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Thu, 14 Apr 2016 11:09:53 +0000 (13:09 +0200)]
backup: Use BlockBackend for I/O
This changes the backup block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the backup code any more
afterwards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Thu, 14 Apr 2016 13:56:02 +0000 (15:56 +0200)]
backup: Remove bs parameter from backup_do_cow()
Now that we pass the job to the function, bs is implied by that.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
John Snow [Tue, 26 Jan 2016 23:54:58 +0000 (18:54 -0500)]
backup: Pack Notifier within BackupBlockJob
Instead of relying on peeking at bs->job, we want to explicitly get
a reference to the job that was involved in this notifier callback.
Pack the Notifier inside of the BackupBlockJob so we can use
container_of to get a reference back to the BackupBlockJob object.
This cuts out one more case where we rely unnecessarily on bs->job.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Thu, 14 Apr 2016 10:59:55 +0000 (12:59 +0200)]
backup: Don't leak BackupBlockJob in error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Kevin Wolf [Tue, 12 Apr 2016 14:17:41 +0000 (16:17 +0200)]
mirror: Use BlockBackend for I/O
This changes the mirror block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the mirroring code any more
afterwards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Tue, 12 Apr 2016 14:20:59 +0000 (16:20 +0200)]
mirror: Allow target that already has a BlockBackend
We had to forbid mirroring to a target BDS that already had a BB
attached because the node swapping at job completion would add a second
BB and we didn't support multiple BBs on a single BDS at the time. Now
we do, so we can lift the restriction.
As we allow additional BlockBackends for the target, we must expect
other users to be sending requests. There may no requests be in flight
during the graph modification, so we have to drain those users now.
The core part of this patch is a revert of commit
40365552.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Tue, 12 Apr 2016 13:15:49 +0000 (15:15 +0200)]
stream: Use BlockBackend for I/O
This changes the streaming block job to use the job's BlockBackend for
performing the COR reads. job->bs isn't used by the streaming code any
more afterwards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Thu, 14 Apr 2016 14:40:16 +0000 (16:40 +0200)]
block: Make blk_co_preadv/pwritev() public
Also add trace points now that the function can be directly called.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Kevin Wolf [Fri, 8 Apr 2016 12:51:09 +0000 (14:51 +0200)]
block: Convert block job core to BlockBackend
This adds a new BlockBackend field to the BlockJob struct, which
coexists with the BlockDriverState while converting the individual jobs.
When creating a block job, a new BlockBackend is created on top of the
given BlockDriverState, and it is destroyed when the BlockJob ends. The
reference to the BDS is now held by the BlockBackend instead of calling
bdrv_ref/unref manually.
We have to be careful when we use bdrv_replace_in_backing_chain() in
block jobs because this changes the BDS that job->blk points to. At the
moment block jobs are too tightly coupled with their BDS, so that moving
a job to another BDS isn't easily possible; therefore, we need to just
manually undo this change afterwards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Tue, 19 Apr 2016 15:27:24 +0000 (17:27 +0200)]
block: Default to enabled write cache in blk_new()
The existing users of the function are:
1. blk_new_open(), which already enabled the write cache
2. Some test cases that don't care about the setting
3. blockdev_init() for empty drives, where the cache mode is overridden
with the value from the options when a medium is inserted
Therefore, this patch doesn't change the current behaviour. It will be
convenient, however, for additional users of blk_new() (like block
jobs) if the most sensible WCE setting is the default.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Kevin Wolf [Fri, 8 Apr 2016 16:26:37 +0000 (18:26 +0200)]
block: Cancel jobs first in bdrv_close_all()
So far, bdrv_close_all() first removed all root BlockDriverStates of
BlockBackends and monitor owned BDSes, and then assumed that the
remaining BDSes must be related to jobs and cancelled these jobs.
This order doesn't work that well any more when block jobs use
BlockBackends internally because then they will lose their BDS before
being cancelled.
This patch changes bdrv_close_all() to first cancel all jobs and then
remove all root BDSes from the remaining BBs.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Alberto Garcia [Mon, 4 Apr 2016 13:43:51 +0000 (16:43 +0300)]
block: keep a list of block jobs
The current way to obtain the list of existing block jobs is to
iterate over all root nodes and check which ones own a job.
Since we want to be able to support block jobs in other nodes as well,
this patch keeps a list of jobs that is updated every time one is
created or destroyed.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Eric Blake [Tue, 24 May 2016 22:25:20 +0000 (16:25 -0600)]
block: Rename blk_write_zeroes()
Commit
983a1600 changed the semantics of blk_write_zeroes() to
be byte-based rather than sector-based, but did not change the
name, which is an open invitation for other code to misuse the
function. Renaming to pwrite_zeroes() makes it more in line
with other byte-based interfaces, and will help make it easier
to track which remaining write_zeroes interfaces still need
conversion.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Paolo Bonzini [Mon, 23 May 2016 12:54:06 +0000 (14:54 +0200)]
dma-helpers: change BlockBackend to opaque value in DMAIOFunc
Callers of dma_blk_io have no way to pass extra data to the DMAIOFunc,
because the original callback and opaque are gone by the time DMAIOFunc
is called. On the other hand, the BlockBackend is usually derived
from those extra data that you could pass to the DMAIOFunc (in the
next patch, that would be the SCSIRequest).
So change DMAIOFunc's prototype, decoupling it from blk_aio_readv
and blk_aio_writev's. The new prototype loses the BlockBackend
and gains an extra opaque value which, in the case of dma_blk_readv
and dma_blk_writev, is of course used for the BlockBackend.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Mon, 23 May 2016 12:54:05 +0000 (14:54 +0200)]
dma-helpers: change interface to byte-based
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Mon, 23 May 2016 16:46:59 +0000 (18:46 +0200)]
block: Propagate .drained_begin/end callbacks
When draining intermediate nodes (i.e. nodes that aren't the root node
for at least one of their parents; with node references, the user can
always configure the graph to create this situation), we need to
propagate the .drained_begin/end callbacks all the way up to the root
for the drain to be effective.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Kevin Wolf [Tue, 17 May 2016 12:51:55 +0000 (14:51 +0200)]
block: Fix reconfiguring graph with drained nodes
When changing the BlockDriverState that a BdrvChild points to while the
node is currently drained, we must call the .drained_end() parent
callback. Conversely, when this means attaching a new node that is
already drained, we need to call .drained_begin().
bdrv_root_attach_child() takes now an opaque parameter, which is needed
because the callbacks must also be called if we're attaching a new child
to the BlockBackend when the root node is already drained, and they need
a way to identify the BlockBackend. Previously, child->opaque was set
too late and the callbacks would still see it as NULL.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Kevin Wolf [Mon, 23 May 2016 14:08:55 +0000 (16:08 +0200)]
block: Make bdrv_drain() use bdrv_drained_begin/end()
Until now, bdrv_drained_begin() used bdrv_drain() internally to drain
the queue. This is kind of backwards and caused quiescing code to be
duplicated because bdrv_drained_begin() had to ensure that no new
requests come in even after bdrv_drain() returns, whereas bdrv_drain()
had to have them because it could be called from other places.
Instead move the bdrv_drain() code to bdrv_drained_begin() and make
bdrv_drain() a simple wrapper around bdrv_drained_begin/end().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Kevin Wolf [Mon, 23 May 2016 13:52:26 +0000 (15:52 +0200)]
block: Introduce bdrv_replace_child()
This adds a common function that is called when attaching a new child to
a parent, removing a child from a parent and when reconfiguring the
graph so that an existing child points to a different node now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:34 +0000 (16:41 +0200)]
block: Drop errp parameter from blk_new()
blk_new() cannot fail so its Error ** parameter has become superfluous.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:33 +0000 (16:41 +0200)]
block: Drop bdrv_parent_cb_...() from bdrv_close()
bdrv_close() now asserts that the BDS's refcount is 0, therefore it
cannot have any parents and the bdrv_parent_cb_change_media() call is a
no-op.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:32 +0000 (16:41 +0200)]
block: Assert !bs->refcnt in bdrv_close()
The only caller of bdrv_close() left is bdrv_delete(). We may as well
assert that, in a way (there are some things in bdrv_close() that make
more sense under that assumption, such as the call to
bdrv_release_all_dirty_bitmaps() which in turn assumes that no frozen
bitmaps are attached to the BDS).
In addition, being called only in bdrv_delete() means that we can drop
bdrv_close()'s forward declaration at the top of block.c.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:31 +0000 (16:41 +0200)]
block: Make bdrv_open() return a BDS
There are no callers to bdrv_open() or bdrv_open_inherit() left that
pass a pointer to a non-NULL BDS pointer as the first argument of these
functions, so we can finally drop that parameter and just make them
return the new BDS.
Generally, the following pattern is applied:
bs = NULL;
ret = bdrv_open(&bs, ..., &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
...
}
by
bs = bdrv_open(..., errp);
if (!bs) {
ret = -EINVAL;
...
}
Of course, there are only a few instances where the pattern is really
pure.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:30 +0000 (16:41 +0200)]
block: Drop bdrv_new_root()
It is unused now, so we may just as well drop it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:29 +0000 (16:41 +0200)]
block: Drop blk_new_with_bs()
Its only caller is blk_new_open(), so we can just inline it there.
The bdrv_new_root() call is dropped in the process because we can just
let bdrv_open() create the BDS.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:28 +0000 (16:41 +0200)]
tests: Drop BDS from test-throttle.c
Now that throttling has been moved to the BlockBackend level, we do not
need to create a BDS along with the BB in the I/O throttling test.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:27 +0000 (16:41 +0200)]
block: Let bdrv_open_inherit() return the snapshot
If bdrv_open_inherit() creates a snapshot BDS and *pbs is NULL, that
snapshot BDS should be returned instead of the BDS under it.
This has worked so far because (nearly) all users of BDRV_O_SNAPSHOT use
blk_new_open() to create the BDS tree. bdrv_append() (which is called by
bdrv_append_temp_snapshot()) redirects pointers from parents (i.e. the
BB in this case) to the newly appended child (i.e. the overlay),
therefore, while bdrv_open_inherit() did not return the root BDS, the BB
still pointed to it.
The only instance where BDRV_O_SNAPSHOT is used but blk_new_open() is
not is in blockdev_init() if no BDS tree is created, and instead
blk_new() is used and the flags are stored in the BB root state.
However, qmp_blockdev_change_medium() filters the BDRV_O_SNAPSHOT flag
before invoking bdrv_open(), so it will not have any effect.
In any case, it would be nicer if bdrv_open_inherit() could just always
return the root of the BDS tree that has been created.
To this end, bdrv_append_temp_snapshot() now returns the snapshot BDS
instead of just appending it on top of the snapshotted BDS. Also, it
calls bdrv_ref() before bdrv_append() (which bdrv_open_inherit() has to
undo if not returning the overlay).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Tue, 17 May 2016 14:41:26 +0000 (16:41 +0200)]
block: Drop useless bdrv_new() call
bdrv_append_temp_snapshot() uses bdrv_new() to create an empty BDS
before invoking bdrv_open() on that BDS. This is probably a relict from
when it used to do some modifications on that empty BDS, but now that is
unnecessary, so we can just set bs_snapshot to NULL and let bdrv_open()
do the rest.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Fri, 20 May 2016 16:49:07 +0000 (18:49 +0200)]
block: Fix bdrv_next() memory leak
The bdrv_next() users all leaked the BdrvNextIterator after completing
the iteration. Simply changing bdrv_next() to free the iterator before
returning NULL at the end of list doesn't work because some callers exit
the loop before looking at all BDSes.
This patch moves the BdrvNextIterator from the heap to the stack of
the caller and switches to a bdrv_first()/bdrv_next() interface for
initialising the iterator.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Andreas Färber [Wed, 25 May 2016 15:21:59 +0000 (17:21 +0200)]
MAINTAINERS: Drop Andreas as CPU maintainer
Signed-off-by: Andreas Färber <afaerber@suse.de>
Andreas Färber [Wed, 25 May 2016 15:16:39 +0000 (17:16 +0200)]
MAINTAINERS: Drop Andreas as 0.15 maintainer
Downgrade to orphan status, like all other remaining stable entries.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Andreas Färber [Wed, 25 May 2016 15:18:14 +0000 (17:18 +0200)]
MAINTAINERS: Drop Andreas as PReP maintainer
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Andreas Färber [Wed, 25 May 2016 15:14:56 +0000 (17:14 +0200)]
MAINTAINERS: Drop Andreas as Cocoa maintainer
Peter has taken over Cocoa maintainership.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Peter Maydell [Tue, 24 May 2016 12:06:32 +0000 (13:06 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
X86 queue, 2016-05-23
# gpg: Signature made Mon 23 May 2016 23:48:27 BST using RSA key ID
984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
* remotes/ehabkost/tags/x86-pull-request:
target-i386: kvm: Eliminate kvm_msr_entry_set()
target-i386: kvm: Simplify MSR setting functions
target-i386: kvm: Simplify MSR array construction
target-i386: kvm: Increase MSR_BUF_SIZE
target-i386: kvm: Allocate kvm_msrs struct once per VCPU
target-i386: Call cpu_exec_init() on realize
target-i386: Move TCG initialization to realize time
target-i386: Move TCG initialization check to tcg_x86_init()
cpu: Eliminate cpudef_init(), cpudef_setup()
target-i386: Set constant model_id for qemu64/qemu32/athlon
pc: Set CPU model-id on compat_props for pc <= 2.4
osdep: Move default qemu_hw_version() value to a macro
target-i386: kvm: Use X86XSaveArea struct for xsave save/load
target-i386: Use xsave structs for ext_save_area
target-i386: Define structs for layout of xsave area
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 24 May 2016 11:21:07 +0000 (12:21 +0100)]
Merge remote-tracking branch 'remotes/amit-migration/tags/migration-2.7-1' into staging
migration fixes:
- ensure src block devices continue fine after a failed migration
- fail on migration blockers; helps 9p savevm/loadvm
- move autoconverge commands out of experimental state
- move the migration-specific qjson in migration/
# gpg: Signature made Mon 23 May 2016 18:15:09 BST using RSA key ID
657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
* remotes/amit-migration/tags/migration-2.7-1:
migration: regain control of images when migration fails to complete
savevm: fail if migration blockers are present
migration: Promote improved autoconverge commands out of experimental state
migration/qjson: Drop gratuitous use of QOM
migration: Move qjson.[ch] to migration/
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 24 May 2016 10:38:22 +0000 (11:38 +0100)]
Merge remote-tracking branch 'remotes/amit-virtio-rng/tags/rng-2.7-1' into staging
rng: rename RndRandom to RndRandom
# gpg: Signature made Mon 23 May 2016 16:44:58 BST using RSA key ID
657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
* remotes/amit-virtio-rng/tags/rng-2.7-1:
rng-random: rename RndRandom to RngRandom
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 24 May 2016 09:19:45 +0000 (10:19 +0100)]
Merge remote-tracking branch 'remotes/xtensa/tags/
20160523-opencores_eth' into staging
opencores_eth cleanups:
- use mii.h
- reduce stack usage in open_eth_start_xmit.
# gpg: Signature made Mon 23 May 2016 20:14:20 BST using RSA key ID
F83FA044
# gpg: Good signature from "Max Filippov <max.filippov@cogentembedded.com>"
# gpg: aka "Max Filippov <jcmvbkbc@gmail.com>"
* remotes/xtensa/tags/
20160523-opencores_eth:
hw/net/opencores_eth: Allocating Large sized arrays to heap
hw/net/opencores_eth: use mii.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Eduardo Habkost [Wed, 16 Dec 2015 19:06:46 +0000 (17:06 -0200)]
target-i386: kvm: Eliminate kvm_msr_entry_set()
Inline the function inside kvm_msr_entry_add().
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Wed, 16 Dec 2015 19:06:45 +0000 (17:06 -0200)]
target-i386: kvm: Simplify MSR setting functions
Simplify kvm_put_tscdeadline_msr() and
kvm_put_msr_feature_control() using kvm_msr_buf and the
kvm_msr_entry_add() helper.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Wed, 16 Dec 2015 19:06:44 +0000 (17:06 -0200)]
target-i386: kvm: Simplify MSR array construction
Add a helper function that appends new entries to the MSR buffer
and checks for the buffer size limit.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Wed, 16 Dec 2015 19:06:43 +0000 (17:06 -0200)]
target-i386: kvm: Increase MSR_BUF_SIZE
We are dangerously close to the array limits in kvm_put_msrs()
and kvm_get_msrs(): with the default mcg_cap configuration, we
can set up to 148 MSRs in kvm_put_msrs(), and if we allow mcg_cap
to be changed, we can write up to 236 MSRs.
Use 4096 bytes for the buffer, that can hold 255 kvm_msr_entry
structs.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Wed, 16 Dec 2015 19:06:42 +0000 (17:06 -0200)]
target-i386: kvm: Allocate kvm_msrs struct once per VCPU
Instead of using 2400 bytes in the stack for 150 MSR entries in
kvm_get_msrs() and kvm_put_msrs(), allocate a buffer once for
each VCPU.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Fri, 13 Feb 2015 01:04:50 +0000 (23:04 -0200)]
target-i386: Call cpu_exec_init() on realize
QOM instance_init functions are not supposed to have any side-effects,
as new objects may be created at any moment for querying property
information (see qmp_device_list_properties()).
Calling cpu_exec_init() also affects QEMU's ability to handle errors
during CPU creation, as some actions done by cpu_exec_init() can't be
reverted.
Move cpu_exec_init() call to realize so a simple object_new() won't
trigger it, and so that it is called after some basic validation of CPU
parameters.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Fri, 13 Feb 2015 00:57:44 +0000 (22:57 -0200)]
target-i386: Move TCG initialization to realize time
QOM instance_init functions are not supposed to have any side-effects,
as new objects may be created at any moment for querying property
information (see qmp_device_list_properties()).
Move TCG initialization to realize time so it won't be called when just
doing object_new() on a X86CPU subclass.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Thu, 5 Mar 2015 16:43:16 +0000 (13:43 -0300)]
target-i386: Move TCG initialization check to tcg_x86_init()
Instead of requiring cpu.c to check if TCG was already initialized,
simply let the function be called multiple times.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Fri, 30 Oct 2015 20:10:57 +0000 (18:10 -0200)]
cpu: Eliminate cpudef_init(), cpudef_setup()
x86_cpudef_init() doesn't do anything anymore, cpudef_init(),
cpudef_setup(), and x86_cpudef_init() can be finally removed.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Sat, 9 Apr 2016 19:44:20 +0000 (16:44 -0300)]
target-i386: Set constant model_id for qemu64/qemu32/athlon
Newer PC machines don't set hw_version, and older machines set
model-id on compat_props explicitly, so we don't need the
x86_cpudef_setup() code that sets model_id using
qemu_hw_version() anymore.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Zhou Jie [Wed, 27 Apr 2016 02:07:48 +0000 (10:07 +0800)]
hw/net/opencores_eth: Allocating Large sized arrays to heap
open_eth_start_xmit has a huge stack usage of 65536 bytes approx.
Moving large arrays to heap to reduce stack usage.
Reduce size of a buffer allocated on stack to 0x600 bytes, which is the
maximal frame length when HUGEN bit is not set in MODER, only allocate
buffer on heap when that is too small. Thus heap is not used in typical
use case.
Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Max Filippov [Sun, 3 Apr 2016 23:12:51 +0000 (02:12 +0300)]
hw/net/opencores_eth: use mii.h
Drop local definitions of MII registers and use constants from mii.h for
registers and register bits. No functional changes.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Greg Kurz [Wed, 18 May 2016 13:44:36 +0000 (15:44 +0200)]
migration: regain control of images when migration fails to complete
We currently have an error path during migration that can cause
the source QEMU to abort:
migration_thread()
migration_completion()
runstate_is_running() ----------------> true if guest is running
bdrv_inactivate_all() ----------------> inactivate images
qemu_savevm_state_complete_precopy()
... qemu_fflush()
socket_writev_buffer() --------> error because destination fails
qemu_fflush() -------------------> set error on migration stream
migration_completion() -----------------> set migrate state to FAILED
migration_thread() -----------------------> break migration loop
vm_start() -----------------------------> restart guest with inactive
images
and you get:
qemu-system-ppc64: socket_writev_buffer: Got err=104 for (32768/
18446744073709551615)
qemu-system-ppc64: /home/greg/Work/qemu/qemu-master/block/io.c:1342:bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
Aborted (core dumped)
If we try postcopy with a similar scenario, we also get the writev error
message but QEMU leaves the guest paused because entered_postcopy is true.
We could possibly do the same with precopy and leave the guest paused.
But since the historical default for migration errors is to restart the
source, this patch adds a call to bdrv_invalidate_cache_all() instead.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-Id: <
146357896785.6003.
11983081732454362715.stgit@bahia.huguette.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Eduardo Habkost [Sat, 9 Apr 2016 19:26:38 +0000 (16:26 -0300)]
pc: Set CPU model-id on compat_props for pc <= 2.4
Instead of relying on x86_cpudef_setup() calling
qemu_hw_version(), just make old machines set model-id explicitly
on compat_props for qemu64, qemu32, and athlon. This will allow
us to eliminate x86_cpudef_setup() later.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Sat, 9 Apr 2016 19:42:44 +0000 (16:42 -0300)]
osdep: Move default qemu_hw_version() value to a macro
The macro will be used by code that will stop calling
qemu_hw_version() at runtime and just need a constant value.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Mon, 23 Nov 2015 12:43:26 +0000 (10:43 -0200)]
target-i386: kvm: Use X86XSaveArea struct for xsave save/load
Instead of using offset macros and bit operations in a uint32_t
array, use the X86XSaveArea struct to perform the loading/saving
operations in kvm_put_xsave() and kvm_get_xsave().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Sat, 28 Nov 2015 16:32:26 +0000 (14:32 -0200)]
target-i386: Use xsave structs for ext_save_area
This doesn't introduce any change in the code, as the offsets and
struct sizes match what was present in the table. This can be
validated by the QEMU_BUILD_BUG_ON lines on target-i386/cpu.h,
which ensures the struct sizes and offsets match the existing
values in ext_save_area.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Eduardo Habkost [Thu, 19 Nov 2015 18:52:33 +0000 (16:52 -0200)]
target-i386: Define structs for layout of xsave area
Add structs that define the layout of the xsave areas used by
Intel processors. Add some QEMU_BUILD_BUG_ON lines to ensure the
structs match the XSAVE_* macros in target-i386/kvm.c and the
offsets and sizes at target-i386/cpu.c:ext_save_areas.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Greg Kurz [Wed, 4 May 2016 19:44:19 +0000 (21:44 +0200)]
savevm: fail if migration blockers are present
QEMU has currently two ways to prevent migration to occur:
- migration blocker when it depends on runtime state
- VMStateDescription.unmigratable when migration is not supported at all
This patch gathers all the logic into a single function to be called from
both the savevm and the migrate paths.
This fixes a bug with 9p, at least, where savevm would succeed and the
following would happen in the guest after loadvm:
$ ls /host
ls: cannot access /host: Protocol error
With this patch:
(qemu) savevm foo
Migration is disabled when VirtFS export path '/' is mounted in the guest
using mount_tag 'host'
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
146239057139.11271.
9011797645454781543.stgit@bahia.huguette.org>
[Update subject according to Paolo's suggestion - Amit]
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Peter Maydell [Mon, 23 May 2016 15:15:51 +0000 (16:15 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* NMI cleanups (Bandan)
* RAMBlock/Memory cleanups and fixes (Dominik, Gonglei, Fam, me)
* first part of linuxboot support for fw_cfg DMA (Richard)
* IOAPIC fix (Peter Xu)
* iSCSI SG_IO fix (Vadim)
* Various infrastructure bug fixes (Zhijian, Peter M., Stefan)
* CVE fixes (Prasad)
# gpg: Signature made Mon 23 May 2016 16:06:18 BST using RSA key ID
78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (24 commits)
cpus: call the core nmi injection function
nmi: remove x86 specific nmi handling
target-i386: add a generic x86 nmi handler
coccinelle: add g_assert_cmp* to macro file
iscsi: pass SCSI status back for SG_IO
esp: check dma length before reading scsi command(CVE-2016-4441)
esp: check command buffer length before write(CVE-2016-4439)
scripts/signrom.py: Check for magic in option ROMs.
scripts/signrom.py: Allow option ROM checksum script to write the size header.
Remove config-devices.mak on 'make clean'
cpus.c: Use pthread_sigmask() rather than sigprocmask()
memory: remove unnecessary masking of MemoryRegion ram_addr
memory: Drop FlatRange.romd_mode
memory: Remove code for mr->may_overlap
exec: adjust rcu_read_lock requirement
memory: drop find_ram_block()
vl: change runstate only if new state is different from current state
ioapic: clear remote irr bit for edge-triggered interrupts
ioapic: keep RO bits for IOAPIC entry
target-i386: key sfence availability on CPUID_SSE, not CPUID_SSE2
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Bandan Das [Fri, 20 May 2016 16:28:37 +0000 (12:28 -0400)]
cpus: call the core nmi injection function
We can call the common function here directly since
x86 specific actions will be taken care of by the arch
specific nmi handler
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <
1463761717-26558-4-git-send-email-bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Bandan Das [Fri, 20 May 2016 16:28:36 +0000 (12:28 -0400)]
nmi: remove x86 specific nmi handling
nmi_monitor_handle is wired to call the x86 nmi
handler. So, we can directly use it at call sites.
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <
1463761717-26558-3-git-send-email-bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Bandan Das [Fri, 20 May 2016 16:28:35 +0000 (12:28 -0400)]
target-i386: add a generic x86 nmi handler
Instead of having x86 ifdefs in core nmi code, this
change adds a arch specific handler that the nmi common
code can call.
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <
1463761717-26558-2-git-send-email-bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 18 May 2016 09:11:55 +0000 (11:11 +0200)]
coccinelle: add g_assert_cmp* to macro file
This helps applying semantic patches to unit tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vadim Rozenfeld [Fri, 13 May 2016 11:03:22 +0000 (13:03 +0200)]
iscsi: pass SCSI status back for SG_IO
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prasad J Pandit [Thu, 19 May 2016 10:39:31 +0000 (16:09 +0530)]
esp: check dma length before reading scsi command(CVE-2016-4441)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer.
Routine get_cmd() uses DMA to read scsi commands into this buffer.
Add check to validate DMA length against buffer size to avoid any
overrun.
Fixes CVE-2016-4441.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <
1463654371-11169-3-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prasad J Pandit [Thu, 19 May 2016 10:39:30 +0000 (16:09 +0530)]
esp: check command buffer length before write(CVE-2016-4439)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer. While
writing to this command buffer 's->cmdbuf[TI_BUFSZ=16]', a check
was missing to validate input length. Add check to avoid OOB write
access.
Fixes CVE-2016-4439.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <
1463654371-11169-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Richard W.M. Jones [Wed, 11 May 2016 21:06:46 +0000 (22:06 +0100)]
scripts/signrom.py: Check for magic in option ROMs.
Because of the risk that compilers might not emit the asm() block at
the beginning of the option ROM, check that the ROM contains the
required magic signature.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <
1463000807-18015-3-git-send-email-rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Richard W.M. Jones [Wed, 11 May 2016 21:06:45 +0000 (22:06 +0100)]
scripts/signrom.py: Allow option ROM checksum script to write the size header.
Modify the signrom.py script so that if the size byte in the header is
0 (ie. not set) then the script will set the size. If the size byte
is non-zero then we do the same as before, so this doesn't require
changes to any existing ROM sourcecode.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <
1463000807-18015-2-git-send-email-rjones@redhat.com>
Peter Maydell [Tue, 17 May 2016 11:27:31 +0000 (12:27 +0100)]
Remove config-devices.mak on 'make clean'
Our dependency mechanism works like this:
* on first build there is neither a .o nor a .d
* we create the .d as a side effect of creating the .o
* for rebuilds we know when we need to update the .o,
which also updates the .d
This system requires that you're never in a situation where there is
a .o file but no .d (because then we will never realise we need to
build the .d, and we will not have the dependency information about
when to rebuild the .o).
This is working fine for our object files, but we also try to use it
for $TARGET/config-devices.mak (where the dependency file is
in $TARGET-config-devices.mak.d). Unfortunately "make clean" doesn't
remove config-devices.mak, which means that it puts us in the
forbidden situation of "object file exists but not its .d file".
This in turn means that we will fail to notice when we need to rebuild:
mkdir build/depbug
(cd build/depbug && '../../configure')
make -C build/depbug -j8
make -C build/depbug clean
echo "CONFIG_CANARY = y" >> default-configs/arm-softmmu.mak
make -C build/depbug
grep CANARY build/depbug/aarch64-softmmu/config-devices.mak
The CANARY token should show up in config-devices.mak but does not.
Fix this bug by making "make clean" delete the config-devices.mak files.
config-all-devices.mak doesn't have the same problem since it has
no .d file, but delete it too, since it is created by "make" and
logically should be removed by "make clean".
(Note that it is important not to remove config-devices.mak until
after we have recursively run 'make clean' in the subdirectories.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <
1463484451-22979-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Maydell [Mon, 16 May 2016 17:33:59 +0000 (18:33 +0100)]
cpus.c: Use pthread_sigmask() rather than sigprocmask()
On Linux, sigprocmask() and pthread_sigmask() are in practice the
same thing (they only set the signal mask for the calling thread),
but the documentation states that the behaviour of sigprocmask() in a
multithreaded process is undefined. Use pthread_sigmask() instead
(which is what we do in almost all places in QEMU that alter the
signal mask already).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <
1463420039-29761-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 1 Mar 2016 09:44:50 +0000 (10:44 +0100)]
memory: remove unnecessary masking of MemoryRegion ram_addr
mr->ram_block->offset is already aligned to both host and target size
(see qemu_ram_alloc_internal). Remove further masking as it is
unnecessary.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fam Zheng [Fri, 25 Mar 2016 10:10:29 +0000 (18:10 +0800)]
memory: Drop FlatRange.romd_mode
Its value is alway set to mr->romd_mode, so the removed comparisons are
fully superseded by "a->mr == b->mr".
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <
1458900629-2334-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fam Zheng [Fri, 25 Mar 2016 10:10:28 +0000 (18:10 +0800)]
memory: Remove code for mr->may_overlap
The collision check does nothing and hasn't been used. Remove the
variable together with related code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <
1458900629-2334-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Gonglei [Tue, 10 May 2016 02:05:00 +0000 (10:05 +0800)]
exec: adjust rcu_read_lock requirement
qemu_ram_unset_idstr() doesn't need rcu lock anymore,
meanwhile make the range of rcu lock in
qemu_ram_set_idstr() as small as possible.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <
1462845901-89716-3-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Gonglei [Tue, 10 May 2016 02:04:59 +0000 (10:04 +0800)]
memory: drop find_ram_block()
On the one hand, we have already qemu_get_ram_block() whose function
is similar. On the other hand, we can directly use mr->ram_block but
searching RAMblock by ram_addr which is a kind of waste.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <
1462845901-89716-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Li Zhijian [Thu, 14 Apr 2016 03:25:52 +0000 (11:25 +0800)]
vl: change runstate only if new state is different from current state
Previously, qemu will abort at following scenario:
(qemu) stop
(qemu) system_reset
(qemu) system_reset
(qemu) 2016-04-13T20:54:38.979158Z qemu-system-x86_64: invalid runstate transition: 'prelaunch' -> 'prelaunch'
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
1460604352-18630-1-git-send-email-lizhijian@cn.fujitsu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Xu [Tue, 10 May 2016 10:21:22 +0000 (18:21 +0800)]
ioapic: clear remote irr bit for edge-triggered interrupts
This is to better emulate IOAPIC version 0x1X hardware. Linux kernel
leveraged this "feature" to do explicit EOI since EOI register is still
not introduced at that time. This will also fix the issue that level
triggered interrupts failed to work when IR enabled (tested with Linux
kernel version 4.5).
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <
1462875682-1349-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Xu [Tue, 10 May 2016 10:21:21 +0000 (18:21 +0800)]
ioapic: keep RO bits for IOAPIC entry
Currently IOAPIC RO bits can be written. To be better aligned with
hardware, we should let them read-only.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <
1462875682-1349-2-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 16 May 2016 09:11:29 +0000 (11:11 +0200)]
target-i386: key sfence availability on CPUID_SSE, not CPUID_SSE2
sfence was introduced before lfence and mfence. This fixes Linux
2.4's measurement of checksumming speeds for the pIII_sse
algorithm:
md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
8regs : 384.400 MB/sec
32regs : 259.200 MB/sec
invalid operand: 0000
CPU: 0
EIP: 0010:[<
c0240b2a>] Not tainted
EFLAGS:
00000246
eax:
c15d8000 ebx:
00000000 ecx:
00000000 edx:
c15d5000
esi:
8005003b edi:
00000004 ebp:
00000000 esp:
c15bdf50
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=
c15bd000)
Stack:
00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
00000000 00000206 c0241c6c 00001000 c15d4000 c15d7000 c15d4000
c15d4000
Call Trace: [<
c0241c6c>] [<
c0105000>] [<
c0241db4>] [<
c010503b>]
[<
c0105000>]
[<
c0107416>] [<
c0105030>]
Code: 0f ae f8 0f 10 04 24 0f 10 4c 24 10 0f 10 54 24 20 0f 10 5c
<0>Kernel panic: Attempted to kill init!
Reported-by: Stefan Weil <sw@weilnetz.de>
Fixes:
121f3157887f92268a3d6169e2d4601f9292020b
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stefan Weil [Thu, 28 Apr 2016 21:33:41 +0000 (23:33 +0200)]
configure: Allow builds with extra warnings
The clang compiler supports a useful compiler option -Weverything,
and GCC also has other warnings not enabled by -Wall.
If glib header files trigger a warning, however, testing glib with
-Werror will always fail. A size mismatch is also detected without
-Werror, so simply remove it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <
1461879221-13338-1-git-send-email-sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prasad J Pandit [Thu, 7 Apr 2016 07:20:08 +0000 (12:50 +0530)]
i386: kvmvapic: initialise imm32 variable
When processing Task Priorty Register(TPR) access, it could leak
automatic stack variable 'imm32' in patch_instruction().
Initialise the variable to avoid it.
Reported by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <
1460013608-16670-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pranith Kumar [Mon, 2 May 2016 14:20:52 +0000 (10:20 -0400)]
docs/atomics.txt: Update pointer to linux macro
Add a missing end brace and update doc to point to the latest access
macro. ACCESS_ONCE() is deprecated.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <
1462198852-28694-1-git-send-email-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Dominik Dingel [Mon, 25 Apr 2016 11:55:38 +0000 (13:55 +0200)]
exec.c: Ensure right alignment also for file backed ram
While in the anonymous ram case we already take care of the right alignment
such an alignment gurantee does not exist for file backed ram allocation.
Instead, pagesize is used for alignment. On s390 this is not enough for gmap,
as we need to satisfy an alignment up to segments.
Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Message-Id: <
1461585338-45863-1-git-send-email-dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Maydell [Mon, 23 May 2016 14:53:02 +0000 (15:53 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-
20160523-1' into staging
usb: add xen pvUSB backend, add num-ports check to ohci.
# gpg: Signature made Mon 23 May 2016 14:02:25 BST using RSA key ID
D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
* remotes/kraxel/tags/pull-usb-
20160523-1:
usb/ohci: Fix crash with when specifying too many num-ports
xen: add pvUSB backend
xen: write information about supported backends
xen: introduce dummy system device
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 23 May 2016 13:50:40 +0000 (14:50 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-
20160523-1' into staging
vga: fix CVE-2016-3712 regression, misc virtio-gpu fixes.
# gpg: Signature made Mon 23 May 2016 13:30:26 BST using RSA key ID
D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
* remotes/kraxel/tags/pull-vga-
20160523-1:
vga: add sr_vbe register set
virtio-gpu: fix ui idx check
virtio-gpu: use VIRTIO_GPU_MAX_SCANOUTS
virtio-gpu: check max_outputs only
virtio-gpu: check max_outputs value
virtio-vga: propagate on gpu realized error
virtio-gpu: check early scanout id
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Thomas Huth [Mon, 23 May 2016 09:23:07 +0000 (11:23 +0200)]
usb/ohci: Fix crash with when specifying too many num-ports
QEMU currently crashes when an OHCI controller is instantiated with
too many ports, e.g. "-device pci-ohci,num-ports=100,masterbus=1".
Thus add a proper check in usb_ohci_init() to make sure that we
do not use more than OHCI_MAX_PORTS = 15 ports here.
Ticket: https://bugs.launchpad.net/qemu/+bug/1581308
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id:
1463995387-11710-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Gerd Hoffmann [Tue, 17 May 2016 08:54:54 +0000 (10:54 +0200)]
vga: add sr_vbe register set
Commit "fd3c136 vga: make sure vga register setup for vbe stays intact
(CVE-2016-3712)." causes a regression. The win7 installer is unhappy
because it can't freely modify vga registers any more while in vbe mode.
This patch introduces a new sr_vbe register set. The vbe_update_vgaregs
will fill sr_vbe[] instead of sr[]. Normal vga register reads and
writes go to sr[]. Any sr register read access happens through a new
sr() helper function which will read from sr_vbe[] with vbe active and
from sr[] otherwise.
This way we can allow guests update sr[] registers as they want, without
allowing them disrupt vbe video modes that way.
Cc: qemu-stable@nongnu.org
Reported-by: Thomas Lamprecht <thomas@lamprecht.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id:
1463475294-14119-1-git-send-email-kraxel@redhat.com