sdk/emulator/qemu.git
8 years agoMAINTAINERS: add more devices to the PCI section
Paolo Bonzini [Tue, 22 Sep 2015 09:56:48 +0000 (11:56 +0200)]
MAINTAINERS: add more devices to the PCI section

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agoMAINTAINERS: add more devices to the PC section
Paolo Bonzini [Tue, 22 Sep 2015 09:56:47 +0000 (11:56 +0200)]
MAINTAINERS: add more devices to the PC section

For chipset devices, I can co-maintain it with Michael.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agovhost-user: add a new message to disable/enable a specific virt queue.
Changchun Ouyang [Wed, 23 Sep 2015 04:20:01 +0000 (12:20 +0800)]
vhost-user: add a new message to disable/enable a specific virt queue.

Add a new message, VHOST_USER_SET_VRING_ENABLE, to enable or disable
a specific virt queue, which is similar to attach/detach queue for
tap device.

virtio driver on guest doesn't have to use max virt queue pair, it
could enable any number of virt queue ranging from 1 to max virt
queue pair.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
8 years agovhost-user: add multiple queue support
Changchun Ouyang [Wed, 23 Sep 2015 04:20:00 +0000 (12:20 +0800)]
vhost-user: add multiple queue support

This patch is initially based a patch from Nikolay Nikolaev.

This patch adds vhost-user multiple queue support, by creating a nc
and vhost_net pair for each queue.

Qemu exits if find that the backend can't support the number of requested
queues (by providing queues=# option). The max number is queried by a
new message, VHOST_USER_GET_QUEUE_NUM, and is sent only when protocol
feature VHOST_USER_PROTOCOL_F_MQ is present first.

The max queue check is done at vhost-user initiation stage. We initiate
one queue first, which, in the meantime, also gets the max_queues the
backend supports.

In older version, it was reported that some messages are sent more times
than necessary. Here we came an agreement with Michael that we could
categorize vhost user messages to 2 types: non-vring specific messages,
which should be sent only once, and vring specific messages, which should
be sent per queue.

Here I introduced a helper function vhost_user_one_time_request(), which
lists following messages as non-vring specific messages:

        VHOST_USER_SET_OWNER
        VHOST_USER_RESET_DEVICE
        VHOST_USER_SET_MEM_TABLE
        VHOST_USER_GET_QUEUE_NUM

For above messages, we simply ignore them when they are not sent the first
time.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
8 years agovhost: introduce vhost_backend_get_vq_index method
Yuanhan Liu [Wed, 23 Sep 2015 04:19:59 +0000 (12:19 +0800)]
vhost: introduce vhost_backend_get_vq_index method

Minusing the idx with the base(dev->vq_index) for vhost-kernel, and
then adding it back for vhost-user doesn't seem right. Here introduces
a new method vhost_backend_get_vq_index() for getting the right vq
index for following vhost messages calls.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
8 years agovhost-user: add VHOST_USER_GET_QUEUE_NUM message
Yuanhan Liu [Wed, 23 Sep 2015 04:19:58 +0000 (12:19 +0800)]
vhost-user: add VHOST_USER_GET_QUEUE_NUM message

This is for querying how many queues the backend supports if it has mq
support(when VHOST_USER_PROTOCOL_F_MQ flag is set from the quried
protocol features).

vhost_net_get_max_queues() is the interface to export that value, and
to tell if the backend supports # of queues user requested, which is
done in the following patch.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
8 years agovhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
Yuanhan Liu [Wed, 23 Sep 2015 04:19:57 +0000 (12:19 +0800)]
vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE

Quote from Michael:

    We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
8 years agovhost-user: add protocol feature negotiation
Michael S. Tsirkin [Wed, 23 Sep 2015 04:19:56 +0000 (12:19 +0800)]
vhost-user: add protocol feature negotiation

Support a separate bitmask for vhost-user protocol features,
and messages to get/set protocol features.

Invoke them at init.

No features are defined yet.

[ leverage vhost_user_call for request handling -- Yuanhan Liu ]

Signed-off-by: Michael S. Tsirkin <address@hidden>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
8 years agovhost-user: use VHOST_USER_XXX macro for switch statement
Yuanhan Liu [Wed, 23 Sep 2015 04:19:55 +0000 (12:19 +0800)]
vhost-user: use VHOST_USER_XXX macro for switch statement

So that we could let vhost_user_call to handle extented requests,
such as VHOST_USER_GET/SET_PROTOCOL_FEATURES, instead of invoking
vhost_user_read/write and constructing the msg again by ourself.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
8 years agovirtio-ccw: enable virtio-1
Cornelia Huck [Fri, 11 Sep 2015 13:16:44 +0000 (15:16 +0200)]
virtio-ccw: enable virtio-1

Let's enable revision 1 for virtio-ccw devices. We can always offer
VERSION_1 as drivers in legacy mode won't be able to see it anyway.

We have to introduce a way to set a lower maximum revision for a device
to accommodate the following cases:
- compat machines (to enforce legacy only)
- virtio-blk with scsi support (version 1 + scsi is fenced by common
  code, with a user-configured max revision of 0 we can allow scsi
  via not offering VERSION_1)

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agovirtio-ccw: feature bits > 31 handling
Cornelia Huck [Fri, 11 Sep 2015 13:16:43 +0000 (15:16 +0200)]
virtio-ccw: feature bits > 31 handling

We currently switch off the VERSION_1 feature bit if the guest has
not negotiated at least revision 1. As no feature bits beyond 31 are
valid however unless VERSION_1 has been negotiated, make sure that
legacy guests never see a feature bit beyond 31.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agovirtio-ccw: support ring size changes
Cornelia Huck [Fri, 11 Sep 2015 13:16:42 +0000 (15:16 +0200)]
virtio-ccw: support ring size changes

Wire up changing the ring size for virtio-1 devices.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agovirtio: ring sizes vs. reset
Cornelia Huck [Fri, 11 Sep 2015 13:16:41 +0000 (15:16 +0200)]
virtio: ring sizes vs. reset

We allow guests to change the size of the virtqueue rings by supplying
a number of buffers that is different from the number of buffers the
device was initialized with. Current code has some problems, however,
since reset does not reset the ringsizes to the default values (as this
is not saved anywhere).

Let's extend the core code to keep track of the default ringsizes and
migrate them once the guest changed them for any of the virtqueues
for a device.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agopc: Introduce pc-*-2.5 machine classes
Eduardo Habkost [Fri, 11 Sep 2015 20:14:25 +0000 (17:14 -0300)]
pc: Introduce pc-*-2.5 machine classes

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agoq35: Move options common to all classes to pc_i440fx_machine_options()
Eduardo Habkost [Fri, 11 Sep 2015 20:14:24 +0000 (17:14 -0300)]
q35: Move options common to all classes to pc_i440fx_machine_options()

The existing default_machine_opts and default_display settings will
still apply to future machine classes. So it makes sense to move them to
pc_i440fx_machine_options() instead of keeping them in a
version-specific machine_options function.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agoq35: Move options common to all classes to pc_q35_machine_options()
Eduardo Habkost [Fri, 11 Sep 2015 20:14:23 +0000 (17:14 -0300)]
q35: Move options common to all classes to pc_q35_machine_options()

The existing default_machine_opts, default_display, no_floppy, and
no_tco settings will still apply to future machine classes. So it makes
sense to move them to pc_q35_machine_options() instead of keeping them
in a version-specific machine_options function.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agovirtio-net: unbreak self announcement and guest offloads after migration
Jason Wang [Fri, 11 Sep 2015 08:01:56 +0000 (16:01 +0800)]
virtio-net: unbreak self announcement and guest offloads after migration

After commit 019a3edbb25f1571e876f8af1ce4c55412939e5d ("virtio: make
features 64bit wide"). Device's guest_features was actually set after
vdc->load(). This breaks the assumption that device specific load()
function can check guest_features. For virtio-net, self announcement
and guest offloads won't work after migration.

Fixing this by defer them to virtio_net_load() where guest_features
were guaranteed to be set. Other virtio devices looks fine.

Fixes: 019a3edbb25f1571e876f8af1ce4c55412939e5d
       ("virtio: make features 64bit wide")
Cc: qemu-stable@nongnu.org
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
8 years agovirtio: right size for virtio_queue_get_avail_size
Pierre Morel [Thu, 10 Sep 2015 11:37:10 +0000 (13:37 +0200)]
virtio: right size for virtio_queue_get_avail_size

Being working on dataplane I notice something strange:

virtio_queue_get_avail_size() used a 64bit size index
for the calculation of the available ring size.

It is quite strange but it did work with the old calculation
of the avail ring, at most with performance penalty,
and I wonder where I missed something.

This patch let use a 16bit size as defined in virtio_ring.h

Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150923.0' into...
Peter Maydell [Wed, 23 Sep 2015 20:39:46 +0000 (21:39 +0100)]
Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150923.0' into staging

VFIO updates 2015-09-23

 - Tracing improvements to use common prefixes for functional areas
 - Quirks overhaul:
   - Split PCI quirks to separate file
   - Make them understandable and more extensible
   - Improve use of MemoryRegions and eliminate use of target pagesize
 - Eliminate build-time debugging, everything migrated to runtime opts

# gpg: Signature made Wed 23 Sep 2015 21:09:05 BST using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20150923.0:
  vfio/pci: Add emulated PCI IDs
  vfio/pci: Cache vendor and device ID
  vfio/pci: Move AMD device specific reset to quirks
  vfio/pci: Remove old config window and mirror quirks
  vfio/pci: Config mirror quirk
  vfio/pci: Config window quirks
  vfio/pci: Rework RTL8168 quirk
  vfio/pci: Cleanup Nvidia 0x3d0 quirk
  vfio/pci: Cleanup ATI 0x3c3 quirk
  vfio/pci: Foundation for new quirk structure
  vfio/pci: Cleanup ROM blacklist quirk
  vfio/pci: Split quirks to a separate file
  vfio/pci: Extract PCI structures to a separate header
  vfio: Change polarity of our no-mmap option
  vfio/pci: Make interrupt bypass runtime configurable
  vfio/pci: Rename MSI/X functions for easier tracing
  vfio/pci: Rename INTx functions for easier tracing
  vfio/pci: Cleanup vfio_early_setup_msix() error path
  vfio/pci: Cleanup RTL8168 quirk and tracing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agovfio/pci: Add emulated PCI IDs
Alex Williamson [Wed, 23 Sep 2015 19:04:49 +0000 (13:04 -0600)]
vfio/pci: Add emulated PCI IDs

Specifying an emulated PCI vendor/device ID can be useful for testing
various quirk paths, even though the behavior and functionality of
the device with bogus IDs is fully unsupportable.  We need to use a
uint32_t for the vendor/device IDs, even though the registers
themselves are only 16-bit in order to be able to determine whether
the value is valid and user set.

The same support is added for subsystem vendor/device ID, though these
have the possibility of being useful and supported for more than a
testing tool.  An emulated platform might want to impose their own
subsystem IDs or at least hide the physical subsystem ID.  Windows
guests will often reinstall drivers due to a change in subsystem IDs,
something that VM users may want to avoid.  Of course careful
attention would be required to ensure that guest drivers do not rely
on the subsystem ID as a basis for device driver quirks.

All of these options are added using the standard experimental option
prefix and should not be considered stable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Cache vendor and device ID
Alex Williamson [Wed, 23 Sep 2015 19:04:49 +0000 (13:04 -0600)]
vfio/pci: Cache vendor and device ID

Simplify access to commonly referenced PCI vendor and device ID by
caching it on the VFIOPCIDevice struct.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Move AMD device specific reset to quirks
Alex Williamson [Wed, 23 Sep 2015 19:04:49 +0000 (13:04 -0600)]
vfio/pci: Move AMD device specific reset to quirks

This is just another quirk, for reset rather than affecting memory
regions.  Move it to our new quirks file.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Remove old config window and mirror quirks
Alex Williamson [Wed, 23 Sep 2015 19:04:48 +0000 (13:04 -0600)]
vfio/pci: Remove old config window and mirror quirks

These are now unused.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Config mirror quirk
Alex Williamson [Wed, 23 Sep 2015 19:04:48 +0000 (13:04 -0600)]
vfio/pci: Config mirror quirk

Re-implement our mirror quirk using the new infrastructure.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Config window quirks
Alex Williamson [Wed, 23 Sep 2015 19:04:48 +0000 (13:04 -0600)]
vfio/pci: Config window quirks

Config windows make use of an address register and a data register.
In VGA cards, these are often used to provide real mode code in the
BIOS an easy way to access MMIO registers since the window often
resides in an I/O port register.  When the MMIO register has a mirror
of PCI config space, we need to trap those accesses and redirect them
to emulated config space.

The previous version of this functionality made use of a single
MemoryRegion and single match address.  This version uses separate
MemoryRegions for each of the address and data registers and allows
for multiple match addresses.  This is useful for Nvidia cards which
have two ranges which index into PCI config space.

The previous implementation is left for the follow-on patch for a more
reviewable diff.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Rework RTL8168 quirk
Alex Williamson [Wed, 23 Sep 2015 19:04:47 +0000 (13:04 -0600)]
vfio/pci: Rework RTL8168 quirk

Another rework of this quirk, this time to update to the new quirk
structure.  We can handle the address and data registers with
separate MemoryRegions and a quirk specific data structure, making the
code much more understandable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Cleanup Nvidia 0x3d0 quirk
Alex Williamson [Wed, 23 Sep 2015 19:04:47 +0000 (13:04 -0600)]
vfio/pci: Cleanup Nvidia 0x3d0 quirk

The Nvidia 0x3d0 quirk makes use of a two separate registers and gives
us our first chance to make use of separate memory regions for each to
simplify the code a bit.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Cleanup ATI 0x3c3 quirk
Alex Williamson [Wed, 23 Sep 2015 19:04:47 +0000 (13:04 -0600)]
vfio/pci: Cleanup ATI 0x3c3 quirk

This is an easy quirk that really doesn't need a data structure if
its own.  We can pass vdev as the opaque data and access to the
MemoryRegion isn't required.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Foundation for new quirk structure
Alex Williamson [Wed, 23 Sep 2015 19:04:46 +0000 (13:04 -0600)]
vfio/pci: Foundation for new quirk structure

VFIOQuirk hosts a single memory region and a fixed set of data fields
that try to handle all the quirk cases, but end up making those that
don't exactly match really confusing.  This patch introduces a struct
intended to provide more flexibility and simpler code.  VFIOQuirk is
stripped to its basics, an opaque data pointer for quirk specific
data and a pointer to an array of MemoryRegions with a counter.  This
still allows us to have common teardown routines, but adds much
greater flexibility to support multiple memory regions and quirk
specific data structures that are easier to maintain.  The existing
VFIOQuirk is transformed into VFIOLegacyQuirk, which further patches
will eliminate entirely.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Cleanup ROM blacklist quirk
Alex Williamson [Wed, 23 Sep 2015 19:04:45 +0000 (13:04 -0600)]
vfio/pci: Cleanup ROM blacklist quirk

Create a vendor:device ID helper that we'll also use as we rework the
rest of the quirks.  Re-reading the config entries, even if we get
more blacklist entries, is trivial overhead and only incurred during
device setup.  There's no need to typedef the blacklist structure,
it's a static private data type used once.  The elements get bumped
up to uint32_t to avoid future maintenance issues if PCI_ANY_ID gets
used for a blacklist entry (avoiding an actual hardware match).  Our
test loop is also crying out to be simplified as a for loop.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Split quirks to a separate file
Alex Williamson [Wed, 23 Sep 2015 19:04:45 +0000 (13:04 -0600)]
vfio/pci: Split quirks to a separate file

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Extract PCI structures to a separate header
Alex Williamson [Wed, 23 Sep 2015 19:04:44 +0000 (13:04 -0600)]
vfio/pci: Extract PCI structures to a separate header

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio: Change polarity of our no-mmap option
Alex Williamson [Wed, 23 Sep 2015 19:04:44 +0000 (13:04 -0600)]
vfio: Change polarity of our no-mmap option

The default should be to allow mmap and new drivers shouldn't need to
expose an option or set it to other than the allocation default in
their initfn.  Take advantage of the experimental flag to change this
option to the correct polarity.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Make interrupt bypass runtime configurable
Alex Williamson [Wed, 23 Sep 2015 19:04:44 +0000 (13:04 -0600)]
vfio/pci: Make interrupt bypass runtime configurable

Tracing is more effective when we can completely disable all KVM
bypass paths.  Make these runtime rather than build-time configurable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Rename MSI/X functions for easier tracing
Alex Williamson [Wed, 23 Sep 2015 19:04:43 +0000 (13:04 -0600)]
vfio/pci: Rename MSI/X functions for easier tracing

This allows vfio_msi* tracing.  The MSI/X interrupt tracing is also
pulled out of #ifdef DEBUG_VFIO to avoid a recompile for tracing this
path.  A few cycles to read the message is hardly anything if we're
already in QEMU.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Rename INTx functions for easier tracing
Alex Williamson [Wed, 23 Sep 2015 19:04:43 +0000 (13:04 -0600)]
vfio/pci: Rename INTx functions for easier tracing

Rename functions and tracing callbacks so that we can trace vfio_intx*
to see all the INTx related activities.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agovfio/pci: Cleanup vfio_early_setup_msix() error path
Alex Williamson [Wed, 23 Sep 2015 19:04:43 +0000 (13:04 -0600)]
vfio/pci: Cleanup vfio_early_setup_msix() error path

With the addition of the Chelsio quirk we have an error path out of
vfio_early_setup_msix() that doesn't free the allocated VFIOMSIXInfo
struct.  This doesn't introduce a leak as it still gets freed in the
vfio_put_device() path, but it's complicated and sloppy to rely on
that.  Restructure to free the allocated data on error and only link
it into the vdev on success.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
8 years agovfio/pci: Cleanup RTL8168 quirk and tracing
Alex Williamson [Wed, 23 Sep 2015 19:04:42 +0000 (13:04 -0600)]
vfio/pci: Cleanup RTL8168 quirk and tracing

There's quite a bit of cleanup that can be done to the RTL8168 quirk,
as well as the tracing to prevent a spew of uninteresting accesses
for anything else the driver might choose to use the window registers
for besides the MSI-X table.  There should be no functional change,
but it's now possible to get compact and useful traces by enabling
vfio_rtl8168_quirk*, ex:

vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f000
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f000
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0xfee0100c
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f004
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f004
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f008
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f008
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x49b1
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/dgibson/tags/spapr-next-20150923' into staging
Peter Maydell [Wed, 23 Sep 2015 15:52:54 +0000 (16:52 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/spapr-next-20150923' into staging

sPAPR Patch Queue: 2015-09-23

Highlights:
    * pseries-2.5 machine type
    * Memory hotplug for "pseries" guests
    * Fixes to the PAPR Dynamic Reconfiguration hotplug code
    * Several PAPR compliance fixes
    * New SLOF with:
        * GPT support
        * Much faster VGA handling

# gpg: Signature made Wed 23 Sep 2015 02:50:10 BST using DSA key ID FDDA6FC6
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: F730 2185 38B4 D13E FD80  34F2 6882 CAC6 FDDA 6FC6

* remotes/dgibson/tags/spapr-next-20150923: (36 commits)
  sPAPR: Enable EEH on VFIO PCI device only
  sPAPR: Revert don't enable EEH on emulated PCI devices
  ppc/spapr: Implement H_RANDOM hypercall in QEMU
  ppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory()
  spapr: Fix default NUMA node allocation for threads
  spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type
  spapr: Support hotplug by specifying DRC count
  spapr: Revert to memory@XXXX representation for non-hotplugged memory
  spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA
  spapr: Provide better error message when slots exceed max allowed
  spapr: Don't allow memory hotplug to memory less nodes
  spapr: Memory hotplug support
  spapr: Make hash table size a factor of maxram_size
  spapr: Support ibm,dynamic-reconfiguration-memory
  spapr: Add LMB DR connectors
  spapr: Use QEMU limit for maximum CPUs number
  spapr: Don't use QOM [*] syntax for DR connectors.
  spapr_drc: use RTAS return codes for methods called by RTAS
  spapr: Initialize hotplug memory address space
  spapr_drc: don't allow 'empty' DRCs to be unisolated or allocated
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agosPAPR: Enable EEH on VFIO PCI device only
Gavin Shan [Fri, 18 Sep 2015 07:30:44 +0000 (17:30 +1000)]
sPAPR: Enable EEH on VFIO PCI device only

This checks if the PCI device retrieved from the PCI device address
is VFIO PCI device when enabling EEH functionality. If it's not
VFIO PCI device, the EEH functonality isn't enabled.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agosPAPR: Revert don't enable EEH on emulated PCI devices
Gavin Shan [Fri, 18 Sep 2015 07:30:43 +0000 (17:30 +1000)]
sPAPR: Revert don't enable EEH on emulated PCI devices

This reverts commit 7cb18007 ("sPAPR: Don't enable EEH on emulated
PCI devices") as rtas_ibm_set_eeh_option() isn't the right place
to check if there has the corresponding PCI device for the input
address, which can be PE address, not PCI device address.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoppc/spapr: Implement H_RANDOM hypercall in QEMU
Thomas Huth [Thu, 17 Sep 2015 08:49:41 +0000 (10:49 +0200)]
ppc/spapr: Implement H_RANDOM hypercall in QEMU

The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.

This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:

 qemu-system-ppc64 -device spapr-rng,use-kvm=true

For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:

 qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
                   -device spapr-rng,rng=gid0 ...

See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other example of specifying RngBackends.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory()
Thomas Huth [Tue, 15 Sep 2015 19:34:20 +0000 (21:34 +0200)]
ppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory()

The buffer that is allocated in spapr_populate_drconf_memory()
is used for setting both, the "ibm,dynamic-memory" and the
"ibm,associativity-lookup-arrays" property. However, only the
size of the first one is taken into account when allocating the
memory. So if the length of the second property is larger than
the length of the first one, we run into a buffer overflow here!
Fix it by taking the length of the second property into account,
too.

Fixes: "spapr: Support ibm,dynamic-reconfiguration-memory" patch
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Fix default NUMA node allocation for threads
David Gibson [Tue, 8 Sep 2015 01:21:52 +0000 (11:21 +1000)]
spapr: Fix default NUMA node allocation for threads

At present, if guest numa nodes are requested, but the cpus in each node
are not specified, spapr just uses the default behaviour or assigning each
vcpu round-robin to nodes.

If smp_threads != 1, that will assign adjacent threads in a core to
different NUMA nodes.  As well as being just weird, that's a configuration
that can't be represented in the device tree we give to the guest, which
means the guest and qemu end up with different ideas of the NUMA topology.

This patch implements mc->cpu_index_to_socket_id in the spapr code to
make sure vcpus get assigned to nodes only at the socket granularity.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agospapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type
Bharata B Rao [Mon, 3 Aug 2015 05:35:43 +0000 (11:05 +0530)]
spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type

Till now memory hotplug used RTAS_LOG_V6_HP_ID_DRC_INDEX hotplug type
which meant that we generated one hotplug type of EPOW event for every
256MB (SPAPR_MEMORY_BLOCK_SIZE). This quickly overruns the kernel
rtas log buffer thus resulting in loss of memory hotplug events. Switch
to RTAS_LOG_V6_HP_ID_DRC_COUNT hotplug type for memory so that we
generate only one event per hotplug request.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Support hotplug by specifying DRC count
Bharata B Rao [Mon, 3 Aug 2015 05:35:42 +0000 (11:05 +0530)]
spapr: Support hotplug by specifying DRC count

Support hotplug identifier type RTAS_LOG_V6_HP_ID_DRC_COUNT that allows
hotplugging of DRCs by specifying the DRC count.

While we are here, rename

spapr_hotplug_req_add_event() to spapr_hotplug_req_add_by_index()
spapr_hotplug_req_remove_event() to spapr_hotplug_req_remove_by_index()

so that they match with spapr_hotplug_req_add_by_count().

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Revert to memory@XXXX representation for non-hotplugged memory
Bharata B Rao [Mon, 3 Aug 2015 05:35:41 +0000 (11:05 +0530)]
spapr: Revert to memory@XXXX representation for non-hotplugged memory

Don't represent non-hotluggable memory under drconf node. With this
we don't have to create DRC objects for them.

The effect of this patch is that we revert back to memory@XXXX representation
for all the memory specified with -m option and represent the cold
plugged memory and hot-pluggable memory under
ibm,dynamic-reconfiguration-memory.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA
Bharata B Rao [Mon, 3 Aug 2015 05:35:40 +0000 (11:05 +0530)]
spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA

When NUMA isn't configured explicitly, assume node 0 is present for
the purpose of creating ibm,associativity-lookup-arrays property
under ibm,dynamic-reconfiguration-memory DT node. This ensures that
the associativity index property is correctly updated in ibm,dynamic-memory
for the LMB that is hotplugged.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Provide better error message when slots exceed max allowed
Bharata B Rao [Mon, 3 Aug 2015 05:35:39 +0000 (11:05 +0530)]
spapr: Provide better error message when slots exceed max allowed

Currently when user specifies more slots than allowed max of
SPAPR_MAX_RAM_SLOTS (32), we error out like this:

qemu-system-ppc64: unsupported amount of memory slots: 64

Let the user know about the max allowed slots like this:

qemu-system-ppc64: Specified number of memory slots 64 exceeds max supported 32

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Don't allow memory hotplug to memory less nodes
Bharata B Rao [Mon, 29 Jun 2015 08:44:32 +0000 (14:14 +0530)]
spapr: Don't allow memory hotplug to memory less nodes

Currently PowerPC kernel doesn't allow hot-adding memory to memory-less
node, but instead will silently add the memory to the first node that has
some memory. This causes two unexpected behaviours for the user.

- Memory gets hotplugged to a different node than what the user specified.
- Since pc-dimm subsystem in QEMU still thinks that memory belongs to
  memory-less node, a reboot will set things accordingly and the previously
  hotplugged memory now ends in the right node. This appears as if some
  memory moved from one node to another.

So until kernel starts supporting memory hotplug to memory-less
nodes, just prevent such attempts upfront in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Memory hotplug support
Bharata B Rao [Tue, 1 Sep 2015 01:22:35 +0000 (11:22 +1000)]
spapr: Memory hotplug support

Make use of pc-dimm infrastructure to support memory hotplug
for PowerPC.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Make hash table size a factor of maxram_size
Bharata B Rao [Mon, 29 Jun 2015 08:44:30 +0000 (14:14 +0530)]
spapr: Make hash table size a factor of maxram_size

The hash table size is dependent on ram_size, but since with hotplug
the memory can grow till maxram_size. Hence make hash table size dependent
on maxram_size.

This allows to hotplug huge amounts of memory to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Support ibm,dynamic-reconfiguration-memory
Bharata B Rao [Mon, 13 Jul 2015 00:34:00 +0000 (10:34 +1000)]
spapr: Support ibm,dynamic-reconfiguration-memory

Parse ibm,architecture.vec table obtained from the guest and enable
memory node configuration via ibm,dynamic-reconfiguration-memory if guest
supports it. This is in preparation to support memory hotplug for
sPAPR guests.

This changes the way memory node configuration is done. Currently all
memory nodes are built upfront. But after this patch, only memory@0 node
for RMA is built upfront. Guest kernel boots with just that and rest of
the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory)
are built when guest does ibm,client-architecture-support call.

Note: This patch needs a SLOF enhancement which is already part of
SLOF binary in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Add LMB DR connectors
David Gibson [Wed, 12 Aug 2015 03:16:48 +0000 (13:16 +1000)]
spapr: Add LMB DR connectors

Enable memory hotplug for pseries 2.4 and add LMB DR connectors.
With memory hotplug, enforce RAM size, NUMA node memory size and maxmem
to be a multiple of SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the
granularity in which LMBs are represented and hot-added.

LMB DR connectors will be used by the memory hotplug code.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [spapr_drc_reset implementation]
[since this missed the 2.4 cutoff, changing to only enable for 2.5]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Use QEMU limit for maximum CPUs number
Alexey Kardashevskiy [Thu, 6 Aug 2015 03:37:24 +0000 (13:37 +1000)]
spapr: Use QEMU limit for maximum CPUs number

sPAPR uses hard coded limit of maximum 255 supported CPUs which is
exactly the same as QEMU-wide limit which is MAX_CPUMASK_BITS and also
defined as 255.

This makes use of a global CPU number limit for the "pseries" machine.

In order to anticipate future increase of the MAX_CPUMASK_BITS
(or to help debugging large systems), this also bumps the FDT_MAX_SIZE
limit from 256K to 1M assuming that 1 CPU core needs roughly 512 bytes
in the device tree so the new limit can cover up to 2048 CPU cores.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Don't use QOM [*] syntax for DR connectors.
David Gibson [Wed, 16 Sep 2015 06:57:51 +0000 (16:57 +1000)]
spapr: Don't use QOM [*] syntax for DR connectors.

The dynamic reconfiguration (hotplug) code for the pseries machine type
uses a "DR connector" QOM object for each resource it will be possible
to hotplug.  Each of these is added to its owner using
    object_property_add_child(owner, "dr-connector[*], ...);

That works ok, mostly, but it means that the property indices are
arbitrary, depending on the order in which the connectors are constructed.
That might line up to something useful, but it doesn't have to.

It will get worse once we add hotplug RAM support.  That will add a DR
connector object for every 256MB of potential memory.  So if maxmem=2T,
for example, there are 8192 objects under the same parent.

The QOM interfaces aren't really designed for this.  In particular
object_property_add() with [*] has O(n^2) time complexity (in the number of
existing children): first it has a linear search through array indices to
find a free slot, each of which is attempted to a recursive call to
object_property_add() with a specific [N].  Those calls are O(n) because
there's a linear search through all properties to check for duplicates.

By using a meaningful index value, which we already know is unique we can
avoid the [*] special behaviour.  That lets us reduce the total time for
creating the DR objects from O(n^3) to O(n^2).

O(n^2) is still kind of crappy, but it's enough to reduce the startup time
of qemu (with in-progress memory hotplug support) with maxmem=2T from ~20
minutes to ~4 seconds.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agospapr_drc: use RTAS return codes for methods called by RTAS
Michael Roth [Thu, 10 Sep 2015 21:11:02 +0000 (16:11 -0500)]
spapr_drc: use RTAS return codes for methods called by RTAS

Certain methods in sPAPRDRConnector objects are only ever called by
RTAS and in many cases are responsible for the logic that determines
the RTAS return codes.

Rather than having a level of indirection requiring RTAS code to
re-interpret return values from such methods to determine the
appropriate return code, just pass them through directly.

This requires changing method return types to uint32_t to match the
type of values currently passed to RTAS helpers.

In the case of read accesses like drc->entity_sense() where we weren't
previously reporting any errors, just the read value, we modify the
function to return RTAS return code, and pass the read value back via
reference.

Suggested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Initialize hotplug memory address space
Bharata B Rao [Mon, 29 Jun 2015 08:44:27 +0000 (14:14 +0530)]
spapr: Initialize hotplug memory address space

Initialize a hotplug memory region under which all the hotplugged
memory is accommodated. Also enable memory hotplug by setting
CONFIG_MEM_HOTPLUG.

Modelled on i386 memory hotplug.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr_drc: don't allow 'empty' DRCs to be unisolated or allocated
Michael Roth [Thu, 10 Sep 2015 21:11:03 +0000 (16:11 -0500)]
spapr_drc: don't allow 'empty' DRCs to be unisolated or allocated

Logical resources start with allocation-state:UNUSABLE /
isolation-state:ISOLATED. During hotplug, guests will transition
them to allocation-state:USABLE, and then to
isolation-state:UNISOLATED.

For cases where we cannot transition to allocation-state:USABLE,
in this case due to no device/resource being association with
the logical DRC, we should return an error -3.

For physical DRCs, we default to allocation-state:USABLE and stay
there, so in this case we should report an error -3 when the guest
attempts to make the isolation-state:ISOLATED transition for a DRC
with no device associated.

These are as documented in PAPR 2.7, 13.5.3.4.

We also ensure allocation-state:USABLE when the guest attempts
transition to isolation-state:UNISOLATED to deal with misbehaving
guests attempting to bring online an unallocated logical resource.

This is as documented in PAPR 2.7, 13.7.

Currently we implement no such error logic. Fix this by handling
these error cases as PAPR defines.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr_pci: fix device tree props for MSI/MSI-X
Michael Roth [Tue, 15 Sep 2015 21:34:59 +0000 (16:34 -0500)]
spapr_pci: fix device tree props for MSI/MSI-X

PAPR requires ibm,req#msi and ibm,req#msi-x to be present in the
device node to define the number of msi/msi-x interrupts the device
supports, respectively.

Currently we have ibm,req#msi-x hardcoded to a non-sensical constant
that happens to be 2, and are missing ibm,req#msi entirely. The result
of that is that msi-x capable devices get limited to 2 msi-x
interrupts (which can impact performance), and msi-only devices likely
wouldn't work at all. Additionally, if devices expect a minimum that
exceeds 2, the guest driver may fail to load entirely.

SLOF still owns the generation of these properties at boot-time
(although other device properties have since been offloaded to QEMU),
but for hotplugged devices we rely on the values generated by QEMU
and thus hit the limitations above.

Fix this by generating these properties in QEMU as expected by guests.

In the future it may make sense to modify SLOF to pass through these
values directly as we do with other props since we're duplicating SLOF
code.

Cc: qemu-ppc@nongnu.org
Cc: qemu-stable@nongnu.org
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Enable in-kernel H_SET_MODE handling
Alexey Kardashevskiy [Tue, 8 Sep 2015 01:25:13 +0000 (11:25 +1000)]
spapr: Enable in-kernel H_SET_MODE handling

For setting debug watchpoints, sPAPR guests use H_SET_MODE hypercall.
The existing QEMU H_SET_MODE handler does not support this but
the KVM handler in HV KVM does. However it is not enabled.

This enables the in-kernel H_SET_MODE handler which handles:
- Completed Instruction Address Breakpoint Register
- Watch point 0 registers.

The rest is still handled in QEMU.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agopseries: Fix incorrect calculation of threads per socket for chip-id
David Gibson [Tue, 8 Sep 2015 01:21:31 +0000 (11:21 +1000)]
pseries: Fix incorrect calculation of threads per socket for chip-id

The device tree presented to pseries machine type guests includes an
ibm,chip-id property which gives essentially the socket number of each
vcpu core (individual vcpu threads don't get a node in the device
tree).

To calculate this, it uses a vcpus_per_socket variable computed as
(smp_cpus / #sockets).  This is correct for the usual case where
smp_cpus == smp_threads * smp_cores * #sockets.

However, you can start QEMU with the number of cores and threads
mismatching the total number of vcpus (whether that _should_ be
permitted is a topic for another day).  It's a bit hard to say what
the "real" number of vcpus per socket here is, but for most purposes
(smp_threads * smp_cores) will more meaningfully match how QEMU
behaves with respect to socket boundaries.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agopseries: Update SLOF firmware image to qemu-slof-20150813
Alexey Kardashevskiy [Thu, 13 Aug 2015 09:24:16 +0000 (19:24 +1000)]
pseries: Update SLOF firmware image to qemu-slof-20150813

The changes are:
1. GPT support;
2. Much faster VGA support.

The full changelog is:
  > Add missing half word access case to _FASTRMOVE and _FASTMOVE
  > Remove unused RMOVE64 stub
  > fbuffer: Implement RFILL as an accelerated primitive
  > fbuffer: Implement MRMOVE as an accelerated primitive
  > fbuffer: Precalculate line length in bytes
  > terminal: Disable the terminal-write trace by default
  > boot: remove trailing ":" in the bootpath
  > ci: implement boot client interface
  > boot: bootpath should be complete device path
  > fbuffer: Use a smaller cursor
  > fbuffer: Improve invert-region helper
  > usb-hid: Caps is not always shift
  > cas: Increase FDT buffer size to accomodate larger ibm, cas node properties
  > README: Update with patch submittion note
  > disk-label: add support for booting from GPT FAT partition
  > disk-label: introduce helper to check fat filesystem
  > introduce 8-byte LE helpers
  > disk-label: simplify gpt-prep-partition? routine
  > fbuffer: introduce the invert-region-x helper
  > fbuffer: introduce the invert-region helper
  > fbuffer: simplify address computations in fb8-toggle-cursor

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agopseries: define coldplugged devices as "configured"
Laurent Vivier [Thu, 13 Aug 2015 12:53:02 +0000 (14:53 +0200)]
pseries: define coldplugged devices as "configured"

When a device is hotplugged, attach() sets "configured" to
false, waiting an action from the OS to configure it and then
to call ibm,configure-connector. On ibm,configure-connector,
the hypervisor sets "configured" to true.

In case of coldplugged device, attach() sets "configured" to
false, but firmware and OS never call the ibm,configure-connector
in this case, so it remains set to false.

It could be harmless, but when we unplug a device, hypervisor
waits the device becomes configured because for it, a not configured
device is a device being configured, so it waits the end of configuration
to unplug it... and it never happens, so it is never unplugged.

This patch set by default coldplugged device to "configured=true",
hotplugged device to "configured=false".

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agosPAPR: Introduce rtas_ldq()
Gavin Shan [Tue, 1 Sep 2015 01:05:12 +0000 (11:05 +1000)]
sPAPR: Introduce rtas_ldq()

This introduces rtas_ldq() to load 64-bits parameter from continuous
two 4-bytes memory chunk of RTAS parameter buffer, to simplify the
code.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr_rtas: Prevent QEMU crash during hotplug without a prior device_add
Bharata B Rao [Mon, 31 Aug 2015 23:53:52 +0000 (09:53 +1000)]
spapr_rtas: Prevent QEMU crash during hotplug without a prior device_add

If drmgr is used in the guest to hotplug a device before a device_add
has been issued via the QEMU monitor, QEMU segfaults in configure_connector
call. This occurs due to accessing of NULL FDT which otherwise would have
been created and associated with the DRC during device_add command.

Check for NULL FDT and return failure from configure_connector call.
As per PAPR+, an error value of -9003 seems appropriate for this failure.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoppc/spapr: Use qemu_log_mask() for hcall_dprintf()
Thomas Huth [Tue, 1 Sep 2015 01:29:02 +0000 (11:29 +1000)]
ppc/spapr: Use qemu_log_mask() for hcall_dprintf()

To see the output of the hcall_dprintf statements, you currently have
to enable the DEBUG_SPAPR_HCALLS macro in include/hw/ppc/spapr.h.
This is ugly because a) not every user who wants to debug guest
problems can or wants to recompile QEMU to be able to see such issues,
and b) since this macro is disabled by default, the code in the
hcall_dprintf() brackets tends to bitrot until somebody temporarily
enables that macro again.
Since the hcall_dprintf statements except one indicate guest
problems, let's always use qemu_log_mask(LOG_GUEST_ERROR, ...) for
this macro instead. One spot indicated an unimplemented host feature,
so this is changed into qemu_log_mask(LOG_UNIMP, ...) instead. Now
it's possible to see all those messages by simply adding the CLI
parameter "-d guest_errors,unimp", without the need to re-compile
the binary.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr_drc: Fix potential undefined behaviour
David Gibson [Thu, 3 Sep 2015 00:08:23 +0000 (10:08 +1000)]
spapr_drc: Fix potential undefined behaviour

The DRC_INDEX_ID_MASK macro does a left shift on ~0, which is a signed
quantity, and therefore undefined behaviour according to the C spec.  In
particular this causes warnings from the clang sanitizer.

This fixes it by calculating the same mask without using ~0 (I think the
new method is a more common idiom for generating masks anyway).  For good
measure I also use 1ULL to force the expression's type to unsigned long
long, which should be good for assigning to anything we're going to want
to.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
8 years agospapr: add dumpdtb support
Andrew Jones [Tue, 1 Sep 2015 01:25:35 +0000 (11:25 +1000)]
spapr: add dumpdtb support

dumpdtb (-machine dumpdtb=<file>) allows one to inspect the generated
device tree of machine types that generate device trees. This is
useful for a) seeing what's there b) debugging/testing device tree
generator patches. It can be used as follows

$QEMU_CMDLINE -machine dumpdtb=dtb
dtc -I dtb -O dts dtb

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: SPLPAR Characteristics
Sam Bobroff [Tue, 1 Sep 2015 01:24:37 +0000 (11:24 +1000)]
spapr: SPLPAR Characteristics

Improve the SPLPAR Characteristics information:

    Add MaxPlatProcs: set to max_cpus, the maximum CPUs that could be
    addded to the system.
    Add DesMem: set to the initial memory of the system.
    Add DesProcs: set to smp_cpus, the inital number of CPUs in the
    system.

These tokens and values are specified by PAPR.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Make ibm, change-msi respect 3 return values
Sam Bobroff [Tue, 1 Sep 2015 01:23:47 +0000 (11:23 +1000)]
spapr: Make ibm, change-msi respect 3 return values

Currently, rtas_ibm_change_msi() always returns four values even if
less are specified.

Correct this by only returning the fourth parameter if it was
requested.

This is specified by PAPR.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Add /rtas/ibm,change-msix-capable
Sam Bobroff [Tue, 1 Sep 2015 01:23:34 +0000 (11:23 +1000)]
spapr: Add /rtas/ibm,change-msix-capable

QEMU is MSI-X capable and makes it available via ibm,change-msi, so
we should indicate this by adding /rtas/ibm,change-msix-capable to the
device tree.

This is specificed by PAPR.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Add /ibm,partition-name
Sam Bobroff [Tue, 1 Sep 2015 01:23:19 +0000 (11:23 +1000)]
spapr: Add /ibm,partition-name

QEMU has a notion of the guest name, so if it's present we might as
well put that into the device tree as /ibm,partition-name.

This is specificed by PAPR.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Create pseries-2.5 machine
David Gibson [Wed, 12 Aug 2015 03:15:56 +0000 (13:15 +1000)]
spapr: Create pseries-2.5 machine

Add pseries-2.5 machine version.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[Altered to merge before memory hotplug -- dwg]
[Altered to work with b9f072d01 -- dwg]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agospapr: Provide an error message when migration fails due to htab_shift mismatch
Bharata B Rao [Thu, 30 Jul 2015 14:55:15 +0000 (20:25 +0530)]
spapr: Provide an error message when migration fails due to htab_shift mismatch

Include an error message when migration fails due to mismatch in
htab_shift values at source and target. This should provide a bit more
verbose message in addition to the current migration failure message
that reads like:

qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr/htab'

After this patch, the failure message will look like this:

qemu-system-ppc64: htab_shift mismatch: source 29 target 24
qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr/htab'

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
8 years agoMerge remote-tracking branch 'remotes/kraxel/tags/pull-ipxe-20150903-1' into staging
Peter Maydell [Tue, 22 Sep 2015 18:22:23 +0000 (19:22 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-ipxe-20150903-1' into staging

ipxe: update to 35c53797 to 4e03af8, build tweaks.

# gpg: Signature made Thu 03 Sep 2015 13:52:01 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-ipxe-20150903-1:
  ipxe: update binaries
  ipxe: use upstream configuration
  ipxe: don't override GITVERSION
  ipxe: update from 35c53797 to 4e03af8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2015-09-22' into staging
Peter Maydell [Tue, 22 Sep 2015 15:51:36 +0000 (16:51 +0100)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2015-09-22' into staging

Monitor patches

# gpg: Signature made Tue 22 Sep 2015 10:33:34 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-monitor-2015-09-22:
  hmp: Restore "info pci"
  monitor: allow device_del to accept QOM paths

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agohmp: Restore "info pci"
Paolo Bonzini [Fri, 18 Sep 2015 15:18:29 +0000 (17:18 +0200)]
hmp: Restore "info pci"

Dropped by commit da76ee76f78b9705e2a91e3c964aef28fecededb's
transition to hmp-commands-info.hx.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1442589509-10806-1-git-send-email-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
8 years agomonitor: allow device_del to accept QOM paths
Daniel P. Berrange [Fri, 11 Sep 2015 12:33:56 +0000 (13:33 +0100)]
monitor: allow device_del to accept QOM paths

Currently device_del requires that the client provide the
device short ID. device_add allows devices to be created
without giving an ID, at which point there is no way to
delete them with device_del. The QOM object path, however,
provides an alternative way to identify the devices.

Allowing device_del to accept an object path ensures all
devices are deletable regardless of whether they have an
ID.

 (qemu) device_add usb-mouse
 (qemu) qom-list /machine/peripheral-anon
 device[0] (child<usb-mouse>)
 type (string)
 (qemu) device_del /machine/peripheral-anon/device[0]

Devices are required to be marked as hotpluggable
otherwise an error is raised

 (qemu) device_del /machine/unattached/device[4]
 Device 'PIIX3' does not support hotplugging

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1441974836-17476-1-git-send-email-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message touched up, accidental white-space change dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/spice/tags/pull-spice-20150921-1' into staging
Peter Maydell [Mon, 21 Sep 2015 23:37:05 +0000 (00:37 +0100)]
Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20150921-1' into staging

spice: surface switch fast path requires same format too.

# gpg: Signature made Mon 21 Sep 2015 10:05:54 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20150921-1:
  spice: surface switch fast path requires same format too.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2015-09-21' into staging
Peter Maydell [Mon, 21 Sep 2015 21:33:51 +0000 (22:33 +0100)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2015-09-21' into staging

qapi: QMP introspection

# gpg: Signature made Mon 21 Sep 2015 08:59:17 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-qapi-2015-09-21: (26 commits)
  qapi-introspect: Hide type names
  qapi: New QMP command query-qmp-schema for QMP introspection
  qapi: Pseudo-type '**' is now unused, drop it
  qapi-schema: Fix up misleading specification of netdev_add
  qom: Don't use 'gen': false for qom-get, qom-set, object-add
  qapi: Introduce a first class 'any' type
  qapi: Make output visitor return qnull() instead of NULL
  qapi: Improve built-in type documentation
  qapi-commands: De-duplicate output marshaling functions
  qapi: De-duplicate parameter list generation
  qapi: Rename qmp_marshal_input_FOO() to qmp_marshal_FOO()
  qapi-commands: Rearrange code
  qapi-visit: Rearrange code a bit
  qapi: Clean up after recent conversions to QAPISchemaVisitor
  qapi: Replace dirty is_c_ptr() by method c_null()
  qapi-event: Convert to QAPISchemaVisitor, fixing data with base
  qapi-event: Eliminate global variable event_enum_value
  qapi: De-duplicate enum code generation
  qapi-commands: Convert to QAPISchemaVisitor
  qapi-visit: Convert to QAPISchemaVisitor, fixing bugs
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/aurel/tags/pull-tcg-mips-20150921' into staging
Peter Maydell [Mon, 21 Sep 2015 18:42:33 +0000 (19:42 +0100)]
Merge remote-tracking branch 'remotes/aurel/tags/pull-tcg-mips-20150921' into staging

TCG MIPS queue

- Fixes for 64-bit guests
- Small cleanups

# gpg: Signature made Sun 20 Sep 2015 23:33:15 BST using RSA key ID 1DDD8C9B
# gpg: Good signature from "Aurelien Jarno <aurelien@aurel32.net>"
# gpg:                 aka "Aurelien Jarno <aurelien@jarno.fr>"
# gpg:                 aka "Aurelien Jarno <aurel32@debian.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7746 2642 A9EF 94FD 0F77  196D BA9C 7806 1DDD 8C9B

* remotes/aurel/tags/pull-tcg-mips-20150921:
  tcg/mips: pass oi to tcg_out_tlb_load
  tcg/mips: move tcg_out_addsub2
  tcg/mips: Fix clobbering of qemu_ld inputs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoMerge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Peter Maydell [Mon, 21 Sep 2015 16:01:46 +0000 (17:01 +0100)]
Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging

Patch queue for ppc - 2015-09-20

Highlights this time around:

  - e500: Fix u-boot boot with -M virt by updating to new version
  - e500: fix ATMU reads
  - book3s: Fixes (unaligned exceptions, vector instructions)
  - yet another dbdma ide fix

I'm out taking care of my son for the next 2 months. During that time
please consider David Gibson the interim ppc queue maintainer. I'm sure
Aurelien will be more than happy to help him review patches as well ;-).

# gpg: Signature made Sun 20 Sep 2015 21:51:16 BST using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"

* remotes/agraf/tags/signed-ppc-for-upstream:
  target-ppc: fix xscmpodp and xscmpudp decoding
  target-ppc: fix vcipher, vcipherlast, vncipherlast and vpermxor
  PPC: E500: Update u-boot to commit 79c884d7e4
  target-ppc: Fix SRR0 when taking unaligned exceptions
  PPC: e500 pci host: Fix ATMUs register reads
  mac_dbdma: always clear FLUSH bit once DBDMA channel flush is complete
  kvm_ppc: remove kvmppc_timer_hack

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoqapi-introspect: Hide type names
Markus Armbruster [Wed, 16 Sep 2015 11:06:29 +0000 (13:06 +0200)]
qapi-introspect: Hide type names

To eliminate the temptation for clients to look up types by name
(which are not ABI), replace all type names by meaningless strings.

Reduces output of query-schema by 13 out of 85KiB.

As a debugging aid, provide option -u to suppress the hiding.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1442401589-24189-27-git-send-email-armbru@redhat.com>

8 years agoqapi: New QMP command query-qmp-schema for QMP introspection
Markus Armbruster [Wed, 16 Sep 2015 11:06:28 +0000 (13:06 +0200)]
qapi: New QMP command query-qmp-schema for QMP introspection

qapi/introspect.json defines the introspection schema.  It's designed
for QMP introspection, but should do for similar uses, such as QGA.

The introspection schema does not reflect all the rules and
restrictions that apply to QAPI schemata.  A valid QAPI schema has an
introspection value conforming to the introspection schema, but the
converse is not true.

Introspection lowers away a number of schema details, and makes
implicit things explicit:

* The built-in types are declared with their JSON type.

  All integer types are mapped to 'int', because how many bits we use
  internally is an implementation detail.  It could be pressed into
  external interface service as very approximate range information,
  but that's a bad idea.  If we need range information, we better do
  it properly.

* Implicit type definitions are made explicit, and given
  auto-generated names:

  - Array types, named by appending "List" to the name of their
    element type, like in generated C.

  - The enumeration types implicitly defined by simple union types,
    named by appending "Kind" to the name of their simple union type,
    like in generated C.

  - Types that don't occur in generated C.  Their names start with ':'
    so they don't clash with the user's names.

* All type references are by name.

* The struct and union types are generalized into an object type.

* Base types are flattened.

* Commands take a single argument and return a single result.

  Dictionary argument or list result is an implicit type definition.

  The empty object type is used when a command takes no arguments or
  produces no results.

  The argument is always of object type, but the introspection schema
  doesn't reflect that.

  The 'gen': false directive is omitted as implementation detail.

  The 'success-response' directive is omitted as well for now, even
  though it's not an implementation detail, because it's not used by
  QMP.

* Events carry a single data value.

  Implicit type definition and empty object type use, just like for
  commands.

  The value is of object type, but the introspection schema doesn't
  reflect that.

* Types not used by commands or events are omitted.

  Indirect use counts as use.

* Optional members have a default, which can only be null right now

  Instead of a mandatory "optional" flag, we have an optional default.
  No default means mandatory, default null means optional without
  default value.  Non-null is available for optional with default
  (possible future extension).

* Clients should *not* look up types by name, because type names are
  not ABI.  Look up the command or event you're interested in, then
  follow the references.

  TODO Should we hide the type names to eliminate the temptation?

New generator scripts/qapi-introspect.py computes an introspection
value for its input, and generates a C variable holding it.

It can generate awfully long lines.  Marked TODO.

A new test-qmp-input-visitor test case feeds its result for both
tests/qapi-schema/qapi-schema-test.json and qapi-schema.json to a
QmpInputVisitor to verify it actually conforms to the schema.

New QMP command query-qmp-schema takes its return value from that
variable.  Its reply is some 85KiBytes for me right now.

If this turns out to be too much, we have a couple of options:

* We can use shorter names in the JSON.  Not the QMP style.

* Optionally return the sub-schema for commands and events given as
  arguments.

  Right now qmp_query_schema() sends the string literal computed by
  qmp-introspect.py.  To compute sub-schema at run time, we'd have to
  duplicate parts of qapi-introspect.py in C.  Unattractive.

* Let clients cache the output of query-qmp-schema.

  It changes only on QEMU upgrades, i.e. rarely.  Provide a command
  query-qmp-schema-hash.  Clients can have a cache indexed by hash,
  and re-query the schema only when they don't have it cached.  Even
  simpler: put the hash in the QMP greeting.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
8 years agoqapi: Pseudo-type '**' is now unused, drop it
Markus Armbruster [Wed, 16 Sep 2015 11:06:27 +0000 (13:06 +0200)]
qapi: Pseudo-type '**' is now unused, drop it

'gen': false needs to stay for now, because netdev_add is still using
it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-25-git-send-email-armbru@redhat.com>

8 years agoqapi-schema: Fix up misleading specification of netdev_add
Markus Armbruster [Wed, 16 Sep 2015 11:06:26 +0000 (13:06 +0200)]
qapi-schema: Fix up misleading specification of netdev_add

It doesn't take a 'props' argument, let alone one in the format
"NAME=VALUE,..."

The bogus arguments specification doesn't matter due to 'gen': false.
Clean it up to be incomplete rather than wrong, and document the
incompleteness.

While there, improve netdev_add usage example in the manual: add a
device option to show how it's done.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-24-git-send-email-armbru@redhat.com>

8 years agoqom: Don't use 'gen': false for qom-get, qom-set, object-add
Markus Armbruster [Wed, 16 Sep 2015 11:06:25 +0000 (13:06 +0200)]
qom: Don't use 'gen': false for qom-get, qom-set, object-add

With the previous commit, the generated marshalers just work, and save
us a bit of handwritten code.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-23-git-send-email-armbru@redhat.com>

8 years agoqapi: Introduce a first class 'any' type
Markus Armbruster [Wed, 16 Sep 2015 11:06:24 +0000 (13:06 +0200)]
qapi: Introduce a first class 'any' type

It's first class, because unlike '**', it actually works, i.e. doesn't
require 'gen': false.

'**' will go away next.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoqapi: Make output visitor return qnull() instead of NULL
Markus Armbruster [Wed, 16 Sep 2015 11:06:23 +0000 (13:06 +0200)]
qapi: Make output visitor return qnull() instead of NULL

Before commit 1d10b44, it crashed.  Since then, it returns NULL, with
a FIXME comment.  The FIXME is valid: code that assumes QObject *
can't be null exists.  I'm not aware of a way to feed this problematic
return value to code that actually chokes on null in the current code,
but the next few commits will create one, failing "make check".

Commit 481b002 solved a very similar problem by introducing a special
null QObject.  Using this special null QObject is clearly the right
way to resolve this FIXME, so do that, and update the test
accordingly.

However, the patch isn't quite right: it messes up the reference
counting.  After about SIZE_MAX visits, the reference counter
overflows, failing the assertion in qnull_destroy_obj().  Because
that's many orders of magnitude more visits of nulls than we expect,
we take this patch despite its flaws, to get the QMP introspection
stuff in without further delay.  We'll want to fix it for real before
the release.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-21-git-send-email-armbru@redhat.com>

8 years agoqapi: Improve built-in type documentation
Markus Armbruster [Wed, 16 Sep 2015 11:06:22 +0000 (13:06 +0200)]
qapi: Improve built-in type documentation

Clarify how they map to JSON.  Add how they map to C.  Fix the
reference to StringInputVisitor.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-20-git-send-email-armbru@redhat.com>

8 years agoqapi-commands: De-duplicate output marshaling functions
Markus Armbruster [Wed, 16 Sep 2015 11:06:21 +0000 (13:06 +0200)]
qapi-commands: De-duplicate output marshaling functions

gen_marshal_output() uses its parameter name only for name of the
generated function.  Name it after the type being marshaled instead of
its caller, and drop duplicates.

Saves 7 copies of qmp_marshal_output_int() in qemu-ga, and one copy of
qmp_marshal_output_str() in qemu-system-*.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-19-git-send-email-armbru@redhat.com>

8 years agoqapi: De-duplicate parameter list generation
Markus Armbruster [Wed, 16 Sep 2015 11:06:20 +0000 (13:06 +0200)]
qapi: De-duplicate parameter list generation

Generated qapi-event.[ch] lose line breaks.  No change otherwise.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-18-git-send-email-armbru@redhat.com>

8 years agoqapi: Rename qmp_marshal_input_FOO() to qmp_marshal_FOO()
Markus Armbruster [Wed, 16 Sep 2015 11:06:19 +0000 (13:06 +0200)]
qapi: Rename qmp_marshal_input_FOO() to qmp_marshal_FOO()

These functions marshal both input and output.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-17-git-send-email-armbru@redhat.com>

8 years agoqapi-commands: Rearrange code
Markus Armbruster [Wed, 16 Sep 2015 11:06:18 +0000 (13:06 +0200)]
qapi-commands: Rearrange code

Rename gen_marshal_input() to gen_marshal(), because the generated
function marshals both arguments and results.

Rename gen_visitor_input_containers_decl() to gen_marshal_vars(), and
move the other variable declarations there, too.

Rename gen_visitor_input_block() to gen_marshal_input_visit(), and
rearrange its code slightly.

Rename gen_marshal_input_decl() to gen_marshal_proto(), because the
result isn't a full declaration, unlike gen_command_decl()'s.

New gen_marshal_decl() actually returns a full declaration.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-16-git-send-email-armbru@redhat.com>

8 years agoqapi-visit: Rearrange code a bit
Markus Armbruster [Wed, 16 Sep 2015 11:06:17 +0000 (13:06 +0200)]
qapi-visit: Rearrange code a bit

Move gen_visit_decl() to a better place.  Inline
generate_visit_struct_body().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-15-git-send-email-armbru@redhat.com>

8 years agoqapi: Clean up after recent conversions to QAPISchemaVisitor
Markus Armbruster [Wed, 16 Sep 2015 11:06:16 +0000 (13:06 +0200)]
qapi: Clean up after recent conversions to QAPISchemaVisitor

Generate just 'FOO' instead of 'struct FOO' when possible.

Drop helper functions that are now unused.

Make pep8 and pylint reasonably happy.

Rename generate_FOO() functions to gen_FOO() for consistency.

Use more consistent and sensible variable names.

Consistently use c_ for mapping keys when their value is a C
identifier or type.

Simplify gen_enum() and gen_visit_union()

Consistently use single quotes for C text string literals.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1442401589-24189-14-git-send-email-armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
8 years agoqapi: Replace dirty is_c_ptr() by method c_null()
Markus Armbruster [Wed, 16 Sep 2015 11:06:15 +0000 (13:06 +0200)]
qapi: Replace dirty is_c_ptr() by method c_null()

is_c_ptr() looks whether the end of the C text for the type looks like
a pointer.  Works, but is fragile.

We now have a better tool: use QAPISchemaType method c_null().  The
initializers for non-pointers become prettier: 0, false or the
enumeration constant with the value 0 instead of {0}.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-13-git-send-email-armbru@redhat.com>

8 years agoqapi-event: Convert to QAPISchemaVisitor, fixing data with base
Markus Armbruster [Wed, 16 Sep 2015 11:06:14 +0000 (13:06 +0200)]
qapi-event: Convert to QAPISchemaVisitor, fixing data with base

Fixes events whose data is struct with base to include the struct's
base members.  Test case is qapi-schema-test.json's event
__org.qemu_x-command:

    { 'event': '__ORG.QEMU_X-EVENT', 'data': '__org.qemu_x-Struct' }

    { 'struct': '__org.qemu_x-Struct', 'base': '__org.qemu_x-Base',
      'data': { '__org.qemu_x-member2': 'str' } }

    { 'struct': '__org.qemu_x-Base',
      'data': { '__org.qemu_x-member1': '__org.qemu_x-Enum' } }

Patch's effect on generated qapi_event_send___org_qemu_x_event():

    -void qapi_event_send___org_qemu_x_event(const char *__org_qemu_x_member2,
    +void qapi_event_send___org_qemu_x_event(__org_qemu_x_Enum __org_qemu_x_member1,
    +                                        const char *__org_qemu_x_member2,
                                             Error **errp)
     {
         QDict *qmp;
    @@ -224,6 +225,10 @@ void qapi_event_send___org_qemu_x_event(
             goto clean;
         }

    +    visit_type___org_qemu_x_Enum(v, &__org_qemu_x_member1, "__org.qemu_x-member1", &local_err);
    +    if (local_err) {
    +        goto clean;
    +    }
         visit_type_str(v, (char **)&__org_qemu_x_member2, "__org.qemu_x-member2", &local_err);
         if (local_err) {
             goto clean;

Code is generated in a different order now, but that doesn't matter.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoqapi-event: Eliminate global variable event_enum_value
Markus Armbruster [Wed, 16 Sep 2015 11:06:13 +0000 (13:06 +0200)]
qapi-event: Eliminate global variable event_enum_value

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442401589-24189-11-git-send-email-armbru@redhat.com>