Olga Krishtal [Tue, 30 Jun 2015 10:25:22 +0000 (13:25 +0300)]
qga: added bus type and disk location path
According to Microsoft disk location path can be obtained via
IOCTL_SCSI_GET_ADDRESS. Unfortunately this ioctl can not be used for all
devices. There are certain bus types which could be obtained with this
API. Please, refer to the following link for more details
https://technet.microsoft.com/en-us/library/
ee851589(v=ws.10).aspx
Bus type could be obtained using IOCTL_STORAGE_QUERY_PROPERTY. Enum
STORAGE_BUS_TYPE describes all buses supported by OS.
Windows defines more bus types than Linux. Thus some values have been added
to GuestDiskBusType.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Michael Roth <mdroth@linux.vnet.ibm.com>
* fixed warning in CreateFile due to use of NULL instead of 0
* only provide disk info when CONFIG_QGA_NTDDSCSI=y
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Michael Roth [Tue, 7 Jul 2015 23:10:09 +0000 (18:10 -0500)]
configure: add configure check for ntdddisk.h
This header file provides w32 ioctl definitions for working with disk
devices. Older versions of mingw do not expose this in a useable way,
so add a configure check and report it via CONFIG_QGA_NTDDSCSI.
Subsequent patches will use this macro to stub out functionality that
relies on this in cases where it's not available.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Olga Krishtal [Tue, 30 Jun 2015 10:25:21 +0000 (13:25 +0300)]
qga: added mountpoint and filesystem type for single volume
We should use GetVolumeXXX api to work with volumes. This will help us to
resolve the situation with volumes without drive letter, i.e. when the
volume is mounted as a folder. Such volume is called mounted folder.
This volume is a regular mounted volume from all other points of view.
The information about non mounted volume is reported as System Reserved.
This volume is not mounted and thus it is not writable.
GuestDiskAddressList API is not used because operations are performed with
volumes but no with disks. This means that spanned disk will
be counted and handled as a single volume. It is worth mentioning
that the information about every disk in the volume can be queried
via IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Olga Krishtal [Tue, 30 Jun 2015 10:25:20 +0000 (13:25 +0300)]
qga: added empty qmp_quest_get_fsinfo functionality.
We need qmp_quest_get_fsinfo togather with vss-provider, which works with
volumes. The call to this function is implemented via
FindFirst/NextVolumes. Moreover, volumes in Windows OS are filesystem unit,
so it will be more effective to work with them rather with devices.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Marc-André Lureau [Sun, 5 Jul 2015 14:28:58 +0000 (16:28 +0200)]
qga: fail early for invalid time
It's possible to set system time with dates after 2070, however, it's
not possible to set the RTC. It has limitation to up to year
2070 (1970+100). In order to keep both clock in sync and before the
kernel complains on invalid values, bail out early.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kirk Allan [Tue, 2 Jun 2015 17:41:07 +0000 (11:41 -0600)]
qga: win32 qmp_guest_network_get_interfaces implementation
By default, IPv4 prefixes will be derived by matching the address
to those returned by GetAdaptersInfo. IPv6 prefixes can not be
matched this way due to the unpredictable order of entries.
In Windows Vista/2008 guests and newer, both IPv4 and IPv6 prefixes
can be retrieved from OnLinkPrefixLength. Setting --extra-cflags
in the build configuration to "-D_WIN32_WINNT=0x600"
or greater makes OnLinkPrefixLength available. Setting --extra-cflags
is not required and if not set, the default approach to get the prefix
will be taken.
Signed-off-by: Kirk Allan <kallan@suse.com>
* drop ws2ipdef.h, it's missing on old mingw, and ws2tcpip.h already
includes it automatically on new builds
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kirk Allan [Tue, 2 Jun 2015 17:41:06 +0000 (11:41 -0600)]
qga: add win32 library iphlpapi
Add the iphlpapi library to use APIs such as GetAdaptersInfo and
GetAdaptersAddresses.
Signed-off-by: Kirk Allan <kallan@suse.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Markus Armbruster [Wed, 27 May 2015 17:53:49 +0000 (19:53 +0200)]
Revert "guest agent: remove g_strcmp0 usage"
Since we now require GLib 2.22+ (commit f40685c), we don't have to
work around lack of g_strcmp0() anymore.
This reverts commit
8f4774789947bc4bc4c8d026a289fe980d3d2ee1.
Conflicts:
qemu-ga.c
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Justin Ossevoort [Mon, 11 May 2015 06:58:45 +0000 (08:58 +0200)]
qga/qmp_guest_fstrim: Return per path fstrim result
The current guest-fstrim support only returns an error if some
mountpoint was unable to be trimmed, skipping any possible additional
mountpoints. The result of the TRIM operation itself is also discarded.
This change returns a per mountpoint result of the TRIM operation. If an
error occurs on some mountpoints that error is returned and the
guest-fstrim continue with any additional mountpoints.
The returned values for errors, minimum and trimmed are dependant on the
filesystem, storage stacks and kernel version.
Signed-off-by: Justin Ossevoort <justin@quarantainenet.nl>
* s/type/struct/ in schema type definitions
* moved version annotation for new guest-fstrim return field to
the field itself rather than applying to the entire command
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Justin Ossevoort [Mon, 11 May 2015 06:58:44 +0000 (08:58 +0200)]
qga/commands-posix: Fix bug in guest-fstrim
The FITRIM ioctl updates the fstrim_range structure it receives. This
way the caller can determine how many bytes were trimmed. The
guest-fstrim logic reuses the same fstrim_range for each filesystem,
effectively limiting each filesystem to trim at most as much as the
previous was able to trim.
If a previous filesystem would have trimmed 0 bytes, than the next
filesystem would report an error 'Invalid argument' because a FITRIM
request with length 0 is not valid.
This change resets the fstrim_range structure for each filesystem.
Signed-off-by: Justin Ossevoort <justin@quarantainenet.nl>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Peter Maydell [Tue, 7 Jul 2015 20:16:06 +0000 (21:16 +0100)]
Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2015-07-07
A few last minute PPC changes for 2.4:
- spapr: Update SLOF
- spapr: Fix a few bugs
- spapr: Preparation for hotplug
- spapr: Minor code cleanups
- linux-user: Add mftb handling
- kvm: Enable hugepage support with memory-backend-file
- mac99: Remove nonexistent interrupt pin (Mac OS 9 fix)
# gpg: Signature made Tue Jul 7 16:48:41 2015 BST using RSA key ID
03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg: aka "Alexander Graf <alex@csgraf.de>"
* remotes/agraf/tags/signed-ppc-for-upstream: (30 commits)
sPAPR: Clear stale MSIx table during EEH reset
sPAPR: Reenable EEH functionality on reboot
sPAPR: Don't enable EEH on emulated PCI devices
spapr-vty: Use TYPE_ definition instead of hardcoding
spapr_vty: lookup should only return valid VTY objects
spapr_pci: drop redundant args in spapr_[populate, create]_pci_child_dt
spapr_pci: populate ibm,loc-code
spapr_pci: enumerate and add PCI device tree
xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
ppc: Update cpu_model in MachineState
spapr: Consolidate cpu init code into a routine
spapr: Reorganize CPU dt generation code
cpus: Add a macro to walk CPUs in reverse
spapr: Support ibm, lrdr-capacity device tree property
spapr: Consider max_cpus during xics initialization
Revert "hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 (g_hash_table_iter_*)"
spapr_iommu: translate sPAPRTCEAccess to IOMMUAccessFlags
spapr_iommu: drop erroneous check in h_put_tce_indirect()
spapr_pci: set device node unit address as hex
spapr_pci: encode class code including Prog IF register
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 7 Jul 2015 19:12:55 +0000 (20:12 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
X86 queue, 2015-07-07
Patch "target-i386: emulate CPUID level of real hardware" was removed after the
2015-07-03 pull request.
# gpg: Signature made Tue Jul 7 15:46:23 2015 BST using RSA key ID
984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
target-i386: avoid overflow in the tsc-frequency property
i386: Introduce ARAT CPU feature
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 7 Jul 2015 18:12:45 +0000 (19:12 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
v2:
* Drop block/nfs patch since it exposes an unfinished QAPI interface [kwolf]
# gpg: Signature made Tue Jul 7 14:29:47 2015 BST using RSA key ID
81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request:
blockjob: add block_job_release function
block/raw-posix: Don't think /dev/fd/<NN> is a floppy drive.
block: Use bdrv_drain to replace uncessary bdrv_drain_all
block: Initialize local_err in bdrv_append_temp_snapshot
block: update bdrv_drain_all()/bdrv_drain() comments
qcow2: remove unnecessary check
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 7 Jul 2015 16:19:59 +0000 (17:19 +0100)]
Merge remote-tracking branch 'remotes/juanquintela/tags/migration/
20150707' into staging
migration/next for
20150707
# gpg: Signature made Tue Jul 7 13:56:30 2015 BST using RSA key ID
5872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
* remotes/juanquintela/tags/migration/
20150707: (28 commits)
migration: extend migration_bitmap
migration: protect migration_bitmap
check_section_footers: Check the correct section_id
migration: Add migration events on target side
migration: Make events a capability
migration: create migration event
migration: No need to call trace_migrate_set_state()
migration: Use always helper to set state
migration: ensure we start in NONE state
migration: Use cmpxchg correctly
migration: Add configuration section
vmstate: Create optional sections
global_state: Make section optional
migration: create new section to store global state
runstate: migration allows more transitions now
runstate: Add runstate store
Fix older machine type compatibility on power with section footers
Fail more cleanly in mismatched RAM cases
Sanity check RDMA remote data
Sort destination RAMBlocks to be the same as the source
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Gavin Shan [Thu, 2 Jul 2015 06:23:28 +0000 (16:23 +1000)]
sPAPR: Clear stale MSIx table during EEH reset
The PCI device MSIx table is cleaned out in hardware after EEH PE
reset. However, we still hold the stale MSIx entries in QEMU, which
should be cleared accordingly. Otherwise, we will run into another
(recursive) EEH error and the PCI devices contained in the PE have
to be offlined exceptionally.
The patch introduces function spapr_phb_vfio_eeh_pre_reset(), which
is called by sPAPR when asserting hot or fundamental reset, to clear
stale MSIx table for VFIO PCI devices before EEH PE reset so that
MSIx table could be restored properly after EEH PE reset.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Gavin Shan [Thu, 2 Jul 2015 06:23:27 +0000 (16:23 +1000)]
sPAPR: Reenable EEH functionality on reboot
When rebooting the guest, some PEs might be in frozen state. The
contained PCI devices won't work properly if their frozen states
aren't cleared in time. One case running into this situation would
be maximal EEH error times encountered in the guest.
The patch reenables the EEH functinality on PEs on PHB's reset
callback, which will clear their frozen states if needed.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Gavin Shan [Thu, 2 Jul 2015 06:23:26 +0000 (16:23 +1000)]
sPAPR: Don't enable EEH on emulated PCI devices
There might have emulated PCI devices, together with VFIO PCI
devices under one PHB. The EEH capability shouldn't enabled
on emulated PCI devices.
The patch returns error when enabling EEH capability on emulated
PCI devices by RTAS call "ibm,set-eeh-option".
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Thu, 2 Jul 2015 06:23:25 +0000 (16:23 +1000)]
spapr-vty: Use TYPE_ definition instead of hardcoding
There's a call to object_dynamic_cast() in spapr_vty which uses the type
name "spapr-vty" directly, instead of the usual idiom of using the #defined
TYPE_VIO_SPAPR_VTY_DEVICE. Fix it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Thu, 2 Jul 2015 06:23:24 +0000 (16:23 +1000)]
spapr_vty: lookup should only return valid VTY objects
If a guest passes the reg property of a valid VIO object that is not a VTY
to either H_GET_TERM_CHAR or H_PUT_TERM_CHAR, QEMU hits a dynamic cast
assertion and aborts.
PAPR+ says "Hypervisor checks the termno parameter for validity against the
Vterm IOA unit addresses assigned to the partition, else return H_Parameter."
This patch adds a type check to ensure vty_lookup() either returns a pointer
to a valid VTY object or NULL. H_GET_TERM_CHAR and H_PUT_TERM_CHAR will
now return H_PARAMETER to the guest instead of crashing.
The patch has no effect on the reg == 0 hack used to implement the RTAS call
display-character.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Nikunj A Dadhania [Thu, 2 Jul 2015 06:23:23 +0000 (16:23 +1000)]
spapr_pci: drop redundant args in spapr_[populate, create]_pci_child_dt
* phb_index is not being used and if required can be obtained from sphb
* use helper to get drc_index in spapr_populate_pci_child_dt()
* Check if drc_index is zero
Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Nikunj A Dadhania [Thu, 2 Jul 2015 06:23:22 +0000 (16:23 +1000)]
spapr_pci: populate ibm,loc-code
Each hardware instance has a platform unique location code. The OF
device tree that describes a part of a hardware entity must include
the “ibm,loc-code” property with a value that represents the location
code for that hardware entity.
Populate ibm,loc-code.
1) PCI passthru devices need to identify with its own ibm,loc-code
available on the host. In failure cases use:
vfio_<name>:<phb-index>:<bus>:<slot>.<fn>
2) Emulated devices encode as following:
qemu_<name>:<phb-index>:<bus>:<slot>.<fn>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Nikunj A Dadhania [Thu, 2 Jul 2015 06:23:21 +0000 (16:23 +1000)]
spapr_pci: enumerate and add PCI device tree
All the PCI enumeration and device node creation was off-loaded to
SLOF. With PCI hotplug support, code needed to be added to add device
node. This creates multiple copy of the code one in SLOF and other in
hotplug code. To unify this, the patch adds the pci device node
creation in Qemu. For backward compatibility, a flag
"qemu,phb-enumerated" is added to the phb, suggesting to SLOF to not
do device node creation.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[ Squashed Michael's drc_index changes ]
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharata B Rao [Thu, 2 Jul 2015 06:23:20 +0000 (16:23 +1000)]
xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
When supporting CPU hot removal by parking the vCPU fd and reusing
it during hotplug again, there can be cases where we try to reenable
KVM_CAP_IRQ_XICS CAP for the vCPU for which it was already enabled.
Introduce a boolean member in ICPState to track this and don't
reenable the CAP if it was already enabled earlier.
Re-enabling this CAP should ideally work, but currently it results in
kernel trying to create and associate ICP with this vCPU and that
fails since there is already an ICP associated with it. Hence this
patch is needed to work around this problem in the kernel.
This change allows CPU hot removal to work for sPAPR.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharata B Rao [Thu, 2 Jul 2015 06:23:19 +0000 (16:23 +1000)]
ppc: Update cpu_model in MachineState
Keep cpu_model field in MachineState uptodate so that it can be used
from the CPU hotplug path.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharata B Rao [Thu, 2 Jul 2015 06:23:18 +0000 (16:23 +1000)]
spapr: Consolidate cpu init code into a routine
Factor out bits of sPAPR specific CPU initialization code into
a separate routine so that it can be called from CPU hotplug
path too.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharata B Rao [Thu, 2 Jul 2015 06:23:17 +0000 (16:23 +1000)]
spapr: Reorganize CPU dt generation code
Reorganize CPU device tree generation code so that it be reused from
hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
instead of spapr_create_fdt_skel().
Note: This is how the split-up looks like now:
Boot path
---------
spapr_finalize_fdt
spapr_populate_cpus_dt_node
spapr_populate_cpu_dt
spapr_fixup_cpu_numa_dt
spapr_fixup_cpu_smt_dt
ibm,cas path
------------
spapr_h_cas_compose_response
spapr_fixup_cpu_dt
spapr_fixup_cpu_numa_dt
spapr_fixup_cpu_smt_dt
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharata B Rao [Thu, 2 Jul 2015 06:23:16 +0000 (16:23 +1000)]
cpus: Add a macro to walk CPUs in reverse
Add CPU_FOREACH_REVERSE that walks CPUs in reverse.
Needed for PowerPC CPU device tree reorganization.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharata B Rao [Thu, 2 Jul 2015 06:23:15 +0000 (16:23 +1000)]
spapr: Support ibm, lrdr-capacity device tree property
Add support for ibm,lrdr-capacity since this is needed by the guest
kernel to know about the possible hot-pluggable CPUs and Memory. With
this, pseries kernels will start reporting correct maxcpus in
/sys/devices/system/cpu/possible.
Also define the minimum hotpluggable memory size as 256MB.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: Fix compile error on 32bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharata B Rao [Thu, 2 Jul 2015 06:23:14 +0000 (16:23 +1000)]
spapr: Consider max_cpus during xics initialization
Use max_cpus instead of smp_cpus when intializating xics system. Also
report max_cpus in ibm,interrupt-server-ranges device tree property of
interrupt controller node.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Markus Armbruster [Thu, 2 Jul 2015 06:23:13 +0000 (16:23 +1000)]
Revert "hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 (g_hash_table_iter_*)"
Since we now require GLib 2.22+ (commit f40685c), we don't have to
work around lack of g_hash_table_iter_init() & friends anymore.
This reverts commit
f8833a37c0c6b22ddd57b45e48cfb0f97dbd5af4.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Greg Kurz [Thu, 2 Jul 2015 06:23:12 +0000 (16:23 +1000)]
spapr_iommu: translate sPAPRTCEAccess to IOMMUAccessFlags
The fact that these enums have matching values is pure coincidence. We
actually need to translate from the PAPR definition to the QEMU one.
This patch doesn't fix any bug, it is only code cleanup.
Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Greg Kurz [Thu, 2 Jul 2015 06:23:11 +0000 (16:23 +1000)]
spapr_iommu: drop erroneous check in h_put_tce_indirect()
The tce_list variable is not a TCE but the address to a TCE: we shouldn't
clear permission bits as we do now. And this is dead code anyway since we
check tce_list is 4K aligned a few lines above.
This patch doesn't fix any bug, it is only code cleanup.
Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Nikunj A Dadhania [Thu, 2 Jul 2015 06:23:10 +0000 (16:23 +1000)]
spapr_pci: set device node unit address as hex
Device node names should encode the unit address as hex, while the
code was encodind it as integers.
Also, use FDT_NAME_MAX macro for allocating and composing the name.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Nikunj A Dadhania [Thu, 2 Jul 2015 06:23:09 +0000 (16:23 +1000)]
spapr_pci: encode class code including Prog IF register
Current code missed the Prog IF register. All Class Code, Subclass,
and Prog IF registers are needed to identify the accurate device type.
For example: USB controllers use the PROG IF for denoting: USB
FullSpeed, HighSpeed or SuperSpeed.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Nikunj A Dadhania [Thu, 2 Jul 2015 06:23:08 +0000 (16:23 +1000)]
spapr_pci: encode missing 64-bit memory address space
The properties reg/assigned-resources need to encode 64-bit memory
address space as part of phys.hi dword.
00 if configuration space
01 if IO region,
10 if 32-bit MEM region
11 if 64-bit MEM region
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Thu, 2 Jul 2015 06:23:07 +0000 (16:23 +1000)]
spapr: Add sPAPRMachineClass
Currently although we have an sPAPRMachineState descended from MachineState
we don't have an sPAPRMAchineClass descended from MachineClass. So far it
hasn't been needed, but several upcoming features are going to want it,
so this patch creates a stub implementation.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Thu, 2 Jul 2015 06:23:06 +0000 (16:23 +1000)]
spapr: Remove obsolete entry_point field from sPAPRMachineState
The sPAPRMachineState structure includes an entry_point field containing
the initial PC value for starting the machine, even though this always has
the value 0x100.
I think this is a hangover from very early versions which bypassed the
firmware when using -kernel. In any case it has no function now, so remove
it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Thu, 2 Jul 2015 06:23:05 +0000 (16:23 +1000)]
spapr: Remove obsolete ram_limit field from sPAPRMachineState
The ram_limit field was imported from sPAPREnvironment where it predates
the machine's ram size being available generically from machine->ram_size.
Worse, the existing code was inconsistent about where it got the ram size
from. Sometimes it used spapr->ram_limit, sometimes the global 'ram_size'
and sometimes a local 'ram_size' masking the global.
This cleans up the code to consistently use machine->ram_size, eliminating
spapr->ram_limit in the process.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
David Gibson [Thu, 2 Jul 2015 06:23:04 +0000 (16:23 +1000)]
spapr: Merge sPAPREnvironment into sPAPRMachineState
The code for -machine pseries maintains a global sPAPREnvironment structure
which keeps track of general state information about the guest platform.
This predates the existence of the MachineState structure, but performs
basically the same function.
Now that we have the generic MachineState, fold sPAPREnvironment into
sPAPRMachineState, the pseries specific subclass of MachineState.
This is mostly a matter of search and replace, although a few places which
relied on the global spapr variable are changed to find the structure via
qdev_get_machine().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexey Kardashevskiy [Thu, 2 Jul 2015 06:23:03 +0000 (16:23 +1000)]
pseries: Update SLOF firmware image to qemu-slof-
20150429
The changelog is:
> version: update to
20150429
> pci: Use QEMU created PCI device nodes
> usb: support 64-bit pci bars
> pci: Support 64-bit address translation
> pci: program correct bridge limit registers during probe
> scsi: handle report-luns failure
> Fix "key?" Forth word when using USB keyboards
> Remove bulk.fs package
> Include make.rules in the library Makefiles
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Greg Kurz [Thu, 2 Jul 2015 06:23:02 +0000 (16:23 +1000)]
spapr: ensure we have at least one XICS server
XICS needs to know the upper value for cpu_index as it is used to compute
the number of servers:
smp_cpus * kvmppc_smt_threads() / smp_threads
When passing -smp cpus=1,threads=9 on a POWER8 host, we end up with:
1 * 8 / 9 = 0
... which leads to an assertion in both emulated:
Number of servers needs to be greater 0
Aborted (core dumped)
... and in-kernel XICS:
xics_kvm_realize: Assertion `icp->nr_servers' failed.
Aborted (core dumped)
With this patch, we are sure that nr_servers > 0. Passing the same bogus
-smp option then leads to:
qemu-system-ppc64: Cannot support more than 8 threads on PPC with KVM
... which is a lot more explicit than the XICS errors.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Michael Roth [Thu, 2 Jul 2015 20:46:14 +0000 (15:46 -0500)]
target-ppc: fix hugepage support when using memory-backend-file
Current PPC code relies on -mem-path being used in order for
hugepage support to be detected. With the introduction of
MemoryBackendFile we can now handle this via:
-object memory-file-backend,mem-path=...,id=hugemem0 \
-numa node,id=mem0,memdev=hugemem0
Management tools like libvirt treat the 2 approaches as
interchangeable in some cases, which can lead to user-visible
regressions even for previously supported guest configurations.
Fix these by also iterating through any configured memory
backends that may be backed by hugepages.
Since the old code assumed hugepages always backed the entirety
of guest memory, play it safe an pick the minimum across the
max pages sizes for all backends, even ones that aren't backed
by hugepages.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Cormac O'Brien [Wed, 17 Jun 2015 22:04:11 +0000 (17:04 -0500)]
macio: remove nonexistent interrupt on pin 1
The current macio implementation declares an interrupt that doesn't appear to
exist in the hardware or any other emulator implementation. OpenBIOS detects
this interrupt and generates an 'interrupts' property in the macio device tree
entry. Mac OS 9 halts boot when it detects this interrupt, so it has been
removed to permit further progress in the boot process.
Signed-off-by: Cormac O'Brien <i.am.cormac.obrien@gmail.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Alexander Graf <agraf@suse.de>
Laurent Vivier [Tue, 30 Jun 2015 09:49:54 +0000 (11:49 +0200)]
linux-user, ppc: mftbl can be used by user application
In qemu-linux-user, when calling gethostbyname2(),
it was hanging in .__res_nmkquery.
(gdb) bt
0 in .__res_nmkquery () from /lib64/libresolv.so.2
1 in .__libc_res_nquery () from /lib64/libresolv.so.2
2 in .__libc_res_nsearch () from /lib64/libresolv.so.2
3 in ._nss_dns_gethostbyname3_r () from /lib64/libnss_dns.so.2
4 in ._nss_dns_gethostbyname2_r () from /lib64/libnss_dns.so.2
5 in .gethostbyname2_r () from /lib64/libc.so.6
6 in .gethostbyname2 () from /lib64/libc.so.6
.__res_nmkquery() is:
...
do { RANDOM_BITS (randombits); } while ((randombits & 0xffff) == 0);
...
<.__res_nmkquery+112>: mftbl r11
<.__res_nmkquery+116>: clrlwi r10,r11,16
<.__res_nmkquery+120>: cmpwi cr7,r10,0
<.__res_nmkquery+124>: beq cr7,<.__res_nmkquery+112>
but as mftbl (Move From Time Base Lower) is not implemented,
r11 is always 0, so we have an infinite loop.
This patch fills the Time Base register with cpu_get_real_ticks().
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Alexander Graf <agraf@suse.de>
Peter Maydell [Tue, 7 Jul 2015 14:48:49 +0000 (15:48 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Tue Jul 7 13:38:13 2015 BST using RSA key ID
81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
rocker: tests: don't need to specify master/self when setting vlans
rocker: mark copy-to-cpu pkts as forwarding offloaded
rocker: return -1 when dropping packet on ingress
rocker: fix missing break statements
rocker: fix misplaced break statement
rocker: don't queue receive pkts when port is disabled
vmxnet3: Fix incorrect small packet padding
e1000: flush packets when link comes up
rocker: fix memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Paolo Bonzini [Wed, 24 Jun 2015 12:11:27 +0000 (14:11 +0200)]
target-i386: avoid overflow in the tsc-frequency property
The TSC frequency fits comfortably in an int when expressed in kHz,
but it may overflow when converted to Hz. In this case,
tsc-frequency returns a negative value because x86_cpuid_get_tsc_freq
does a 32-bit multiplication before assigning to int64_t.
For simplicity just make tsc_khz a 64-bit value.
Spotted by Coverity.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Jan Kiszka [Sun, 7 Jun 2015 09:15:08 +0000 (11:15 +0200)]
i386: Introduce ARAT CPU feature
ARAT signals that the APIC timer does not stop in power saving states.
As our APICs are emulated, it's fine to expose this feature to guests,
at least when asking for KVM host features or with CPU types that
include the flag. The exact model number that introduced the feature is
not known, but reports can be found that it's at least available since
Sandy Bridge.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Peter Maydell [Tue, 7 Jul 2015 13:44:19 +0000 (14:44 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-
20150707-1' into staging
virtio-gpu property fixes, add testcase
# gpg: Signature made Tue Jul 7 10:24:16 2015 BST using RSA key ID
D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
* remotes/kraxel/tags/pull-vga-
20150707-1:
virtio-gpu: add to display-vga test
virtio-gpu: use virtio_instance_init_common, fixup properties
virtio-gpu: update console device property.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Ting Wang [Fri, 26 Jun 2015 09:37:35 +0000 (17:37 +0800)]
blockjob: add block_job_release function
There is job resource leak in function mirror_start_job,
although bdrv_create_dirty_bitmap is unlikely failed.
Add block_job_release for each release when needed.
Signed-off-by: Ting Wang <kathy.wangting@huawei.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id:
1435311455-56048-1-git-send-email-kathy.wangting@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Richard W.M. Jones [Wed, 1 Jul 2015 14:40:14 +0000 (15:40 +0100)]
block/raw-posix: Don't think /dev/fd/<NN> is a floppy drive.
In libguestfs we use /dev/fd/<NN> to pass pre-opened file descriptors
to qemu-img. Lately I've discovered that although this works, qemu
believes that these are floppy disk images. That in itself isn't much
of a problem, but now qemu prints a warning about host floppy
pass-thru being deprecated.
Extend the existing test so that it ignores /dev/fd/ as well as
/dev/fdset/
A simple test of this, if you are using the bash shell, is:
qemu-img info <( cat /dev/null )
without this patch:
$ qemu-img info <( cat /dev/null )
qemu-img: Host floppy pass-through is deprecated
Support for it will be removed in a future release.
qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek
with this patch:
$ qemu-img info <( cat /dev/null )
qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id:
1435761614-31358-1-git-send-email-rjones@redhat.com
Fixes: https://bugs.launchpad.net/qemu/+bug/1470536
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fam Zheng [Fri, 29 May 2015 10:53:14 +0000 (18:53 +0800)]
block: Use bdrv_drain to replace uncessary bdrv_drain_all
There callers work on a single BlockDriverState subtree, where using
bdrv_drain() is more accurate.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fam Zheng [Mon, 6 Jul 2015 04:24:44 +0000 (12:24 +0800)]
block: Initialize local_err in bdrv_append_temp_snapshot
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1436156684-16526-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Li Zhijian [Thu, 2 Jul 2015 12:18:06 +0000 (20:18 +0800)]
migration: extend migration_bitmap
Prevously, if we hotplug a device(e.g. device_add e1000) during
migration is processing in source side, qemu will add a new ram
block but migration_bitmap is not extended.
In this case, migration_bitmap will overflow and lead qemu abort
unexpectedly.
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Li Zhijian [Thu, 2 Jul 2015 12:18:05 +0000 (20:18 +0800)]
migration: protect migration_bitmap
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 2 Jul 2015 08:22:03 +0000 (09:22 +0100)]
check_section_footers: Check the correct section_id
The section footers check was incorrectly checking the section_id
in the SaveStateEntry not the LoadStateEntry. These can validly be different
if the two QEMU instances have instantiated their devices in a
different order. The test only cares that we're finishing the same
section we started, and hence it's the LoadStateEntry that we care about.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Juan Quintela [Wed, 20 May 2015 15:15:42 +0000 (17:15 +0200)]
migration: Add migration events on target side
We reuse the migration events from the source side, sending them on the
appropiate place.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Tue, 7 Jul 2015 12:44:05 +0000 (14:44 +0200)]
migration: Make events a capability
Make check fails with events. THis is due to the parser/lexer that it
uses. Just in case that they are more broken parsers, just only send
events when there are capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 20 May 2015 10:16:15 +0000 (12:16 +0200)]
migration: create migration event
We have one argument that tells us what event has happened.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Juan Quintela [Tue, 16 Jun 2015 23:38:25 +0000 (01:38 +0200)]
migration: No need to call trace_migrate_set_state()
We now use the helper everywhere, so no need to call this on this two
places. See on previous commit that there were a place where we missed
to mark the trace. Now all tracing is done in migrate_set_state().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Tue, 16 Jun 2015 23:36:40 +0000 (01:36 +0200)]
migration: Use always helper to set state
There were three places that were not using the migrate_set_state()
helper, just fix that.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 1 Jul 2015 07:32:29 +0000 (09:32 +0200)]
migration: ensure we start in NONE state
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 17 Jun 2015 00:06:20 +0000 (02:06 +0200)]
migration: Use cmpxchg correctly
cmpxchg returns the old value
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 13 May 2015 16:17:43 +0000 (18:17 +0200)]
migration: Add configuration section
It needs to be the first one and it is not optional, that is the reason
why it is opencoded. For new machine types, it is required that machine
type name is the same in both sides.
It is just done right now for pc's.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 15 Oct 2014 07:39:14 +0000 (09:39 +0200)]
vmstate: Create optional sections
To make sections optional, we need to do it at the beggining of the code.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 8 Oct 2014 11:58:24 +0000 (13:58 +0200)]
global_state: Make section optional
This section would be sent:
a- for all new machine types
b- for old machine types if section state is different form {running,paused}
that were the only giving us troubles.
So, in new qemus: it is alwasy there. In old qemus: they are only
there if it an error has happened, basically stoping on target.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 8 Oct 2014 08:58:10 +0000 (10:58 +0200)]
migration: create new section to store global state
This includes a new section that for now just stores the current qemu state.
Right now, there are only one way to control what is the state of the
target after migration.
- If you run the target qemu with -S, it would start stopped.
- If you run the target qemu without -S, it would run just after migration finishes.
The problem here is what happens if we start the target without -S and
there happens one error during migration that puts current state as
-EIO. Migration would ends (notice that the error happend doing block
IO, network IO, i.e. nothing related with migration), and when
migration finish, we would just "continue" running on destination,
probably hanging the guest/corruption data, whatever.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 8 Oct 2014 10:47:08 +0000 (12:47 +0200)]
runstate: migration allows more transitions now
Next commit would allow to move from incoming migration to error happening on source.
Should we add more states to this transition? Luiz?
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 8 Oct 2014 09:53:22 +0000 (11:53 +0200)]
runstate: Add runstate store
This allows us to store the current state to send it through migration.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Dr. David Alan Gilbert [Fri, 12 Jun 2015 17:37:52 +0000 (18:37 +0100)]
Fix older machine type compatibility on power with section footers
I forgot to add compatibility for Power when adding section footers.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes:
37fb569c0198cba58e3e
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:28 +0000 (18:17 +0100)]
Fail more cleanly in mismatched RAM cases
If the number of RAMBlocks was different on the source from the
destination, QEMU would hang waiting for a disconnect on the source
and wouldn't release from that hang until the destination was manually
killed.
Mark the stream as being in error, this causes the destination to die
and the source to carry on.
(It still gets a whole bunch of warnings on the destination, and I've
not managed to complete another migration after the 1st one, still
progress).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:27 +0000 (18:17 +0100)]
Sanity check RDMA remote data
Perform some basic (but probably not complete) sanity checking on
requests from the RDMA source.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:26 +0000 (18:17 +0100)]
Sort destination RAMBlocks to be the same as the source
Use the order of incoming RAMBlocks from the source to record
an index number; that then allows us to sort the destination
local RAMBlock list to match the source.
Now that the RAMBlocks are known to be in the same order, this
simplifies the RDMA Registration step which previously tried to
match RAMBlocks based on offset (which isn't guaranteed to match).
Looking at the existing compress code, I think it was erroneously
relying on an assumption of matching ordering, which this fixes.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:25 +0000 (18:17 +0100)]
Rework ram block hash
RDMA uses a hash from block offset->RAM Block; this isn't needed
on the destination, and it becomes harder to maintain after the next
patch in the series that sorts the block list.
Split the hash so that it's only generated on the source.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:24 +0000 (18:17 +0100)]
Allow rdma_delete_block to work without the hash
In the next patch we remove the hash on the destination,
rdma_delete_block does two things with the hash which can be avoided:
a) The caller passes the offset and rdma_delete_block looks it up
in the hash; fixed by getting the caller to pass the block
b) The hash gets recreated after deletion; fixed by making that
conditional on the hash being initialised.
While this function is currently only used during cleanup, Michael
asked that we keep it general for future dynamic block registration
work.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:23 +0000 (18:17 +0100)]
Rework ram_control_load_hook to hook during block load
We need the names of RAMBlocks as they're loaded for RDMA,
reuse a slightly modified ram_control_load_hook:
a) Pass a 'data' parameter to use for the name in the block-reg
case
b) Only some hook types now require the presence of a hook function.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:22 +0000 (18:17 +0100)]
Translate offsets to destination address space
The 'offset' field in RDMACompress and 'current_addr' field
in RDMARegister are commented as being offsets within a particular
RAMBlock, however they appear to actually be offsets within the
ram_addr_t space.
The code currently assumes that the offsets on the source/destination
match, this change removes the need for the assumption for these
structures by translating the addresses into the ram_addr_t space of
the destination host.
Note: An alternative would be to change the fields to actually
take the data they're commented for; this would potentially be
simpler but would break stream compatibility for those cases
that currently work.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:21 +0000 (18:17 +0100)]
Store block name in local blocks structure
In a later patch the block name will be used to match up two views
of the block list. Keep a copy of the block name with the local block
list.
(At some point it could be argued that it would be best just to let
migration see the innards of RAMBlock and avoid the need to use
foreach).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Thu, 11 Jun 2015 17:17:20 +0000 (18:17 +0100)]
rdma typos
A couple of typo fixes.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Tue, 23 Jun 2015 16:34:35 +0000 (17:34 +0100)]
Only try and read a VMDescription if it should be there
The VMDescription section maybe after the EOF mark, the current code
does a 'qemu_get_byte' and either gets the header byte identifying the
description or an error (which it ignores). Doing the 'get' upsets
RDMA which hangs on old machine types without the VMDescription.
Just avoid reading the VMDescription if we wouldn't send it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Gonglei [Tue, 23 Jun 2015 07:56:38 +0000 (15:56 +0800)]
rdma: fix memory leak
Variable "r" going out of scope leaks the storage
it points to in line 3268.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Scott Feldman [Wed, 1 Jul 2015 10:33:12 +0000 (03:33 -0700)]
rocker: tests: don't need to specify master/self when setting vlans
4.1 Linux kernel doesn't require specifying "master" or "self" when setting
vlans on a port, so clean these up from the tests that use vlans.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Message-id:
1435746792-41278-6-git-send-email-sfeldma@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Scott Feldman [Wed, 1 Jul 2015 10:33:11 +0000 (03:33 -0700)]
rocker: mark copy-to-cpu pkts as forwarding offloaded
For pkts copied to the CPU (to be processed by guest driver), mark the Rx
descriptor with flag "OFFLOAD_FWD" to indicate device has already forwarded
pkt. The guest driver will use this indicator to avoid duplicate
forwarding in the guest OS.
Examples include bcast/mcast/unknown ucast pkts flooded to bridged ports.
We want to avoid both the device and the guest bridge driver flooding these
pkts, which would result in duplicates pkts on the wire. Packet sampling,
such as sFlow, can also use this technique to mark pkts for the guest OS to
record but otherwise drop.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Message-id:
1435746792-41278-5-git-send-email-sfeldma@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Scott Feldman [Wed, 1 Jul 2015 10:33:10 +0000 (03:33 -0700)]
rocker: return -1 when dropping packet on ingress
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Message-id:
1435746792-41278-4-git-send-email-sfeldma@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Scott Feldman [Wed, 1 Jul 2015 10:33:09 +0000 (03:33 -0700)]
rocker: fix missing break statements
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id:
1435746792-41278-3-git-send-email-sfeldma@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Scott Feldman [Wed, 1 Jul 2015 10:33:08 +0000 (03:33 -0700)]
rocker: fix misplaced break statement
Premature break in switch case block. This particular case (group L2 rewrite)
will be used for L2 LAG and L3 ECMP support, neither of which are enabled in
the guest driver at this time, but are under development.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id:
1435746792-41278-2-git-send-email-sfeldma@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Scott Feldman [Wed, 1 Jul 2015 02:25:53 +0000 (19:25 -0700)]
rocker: don't queue receive pkts when port is disabled
Commit 6e99c63 ("net/socket: Drop net_socket_can_send") changed the
semantics around .can_receive for sockets to now require the device to
flush queued pkts when transitioning to a .can_receive=true state. Rocker
device was not flushing the queue on .can_receive=true transition, so the
receiver was stuck.
But, turns out we really don't want any queuing at all on the port when the
port is disabled, otherwise when the port transitions to enabled, we'd
receive and forward stale pkts that really should have been dropped. So,
let's remove .can_receive so avoid queuing and drop the pkt in .receive if
the port is disabled.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id:
1435717553-36187-1-git-send-email-sfeldma@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Brian Kress [Tue, 23 Jun 2015 15:49:25 +0000 (11:49 -0400)]
vmxnet3: Fix incorrect small packet padding
When running ESXi under qemu there is an issue with the ESXi guest
discarding packets that are too short. The guest discards any packets
under the normal minimum length for an ethernet packet (60). This
results in odd behaviour where other hosts or VMs on other hosts can
communicate with the ESXi guest just fine (since there's a physical NIC
somewhere doing padding), but VMs on the host and the host itself cannot
because the ARP request packets are too small for the ESXi host to
accept.
Someone in the past thought this was worth fixing, and added code to the
vmxnet3 qemu emulation such that if it is receiving packets smaller than
60 bytes to pad the packet out to 60. Unfortunately this code is wrong
(or at least in the wrong place). It does so BEFORE before taking into
account the vnet_hdr at the front of the packet added by the tap device.
As a result, it might add padding, but it never adds enough.
Specifically it adds 10 less (the length of the vnet_hdr) than it needs
to.
The following (hopefully "obviously correct") patch simply swaps the
order of processing the vnet header and the padding. With this patch an
ESXi guest is able to communicate with the host or other local VMs.
Signed-off-by: Brian Kress <kressb@moose.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Thu, 25 Jun 2015 09:18:05 +0000 (10:18 +0100)]
e1000: flush packets when link comes up
e1000_can_receive() checks the link up status register bit. If the bit
is clear, packets will be queued and the peer may disable receive to
avoid wasting CPU reading packets that cannot be delivered. The queue
must be flushed once the link comes back up again.
This patch fixes broken e1000 receive with Mac OS X Snow Leopard guests
and tap networking. Flushing the queue invokes the async send callback,
which re-enables tap fd read.
Reported-by: Jonathan Liu <net147@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id:
1435223885-12745-1-git-send-email-stefanha@redhat.com
Gonglei [Thu, 25 Jun 2015 06:24:10 +0000 (14:24 +0800)]
rocker: fix memory leak
Meanwhile, using g_new0 instead of g_malloc0,
refer to commit 5839e53.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-id:
1435213450-6700-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Thu, 2 Jul 2015 16:24:41 +0000 (17:24 +0100)]
block: update bdrv_drain_all()/bdrv_drain() comments
The doc comments for bdrv_drain_all() and bdrv_drain() are outdated:
* The bdrv_drain() comment is a poor man's bdrv_lock()/bdrv_unlock()
which Fam Zheng is currently developing. Unfortunately this warning
was never really enough because devices keep submitting I/O and op
blockers don't prevent that.
* The bdrv_drain_all() comment is still partially correct but reflects
the nature of the implementation rather than API documentation.
Do make it clear that bdrv_drain() is only appropriate within an
AioContext. For anything spanning AioContexts you need
bdrv_drain_all().
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id:
1435854281-6078-1-git-send-email-stefanha@redhat.com
Alberto Garcia [Thu, 2 Jul 2015 08:06:11 +0000 (11:06 +0300)]
qcow2: remove unnecessary check
The value of 'i' is guaranteed to be >= 0
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id:
1435824371-2660-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Gerd Hoffmann [Mon, 28 Apr 2014 09:10:12 +0000 (11:10 +0200)]
virtio-gpu: add to display-vga test
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Gerd Hoffmann [Wed, 24 Jun 2015 10:22:09 +0000 (12:22 +0200)]
virtio-gpu: use virtio_instance_init_common, fixup properties
Switch over to virtio_instance_init_common. Drop duplicate properties
in virtio-gpu-pci and virtio-vga as they are properly aliased now. Also
drop the indirection via DEFINE_VIRTIO_GPU_PROPERTIES, we don't need it
any more as the properties are defined in a single place now.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Gerd Hoffmann [Wed, 24 Jun 2015 10:19:42 +0000 (12:19 +0200)]
virtio-gpu: update console device property.
Update the device link of the QemuConsole, so it points to the
virtio-gpu-pci or virtio-vga device instead of virtio-gpu-device.
This is needed because we want to find the device by id, for
example for input routing, and the id specified on the command
line is attached to the pci proxy, not the virtio device.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Peter Maydell [Tue, 7 Jul 2015 08:22:40 +0000 (09:22 +0100)]
Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-
20150706.0' into staging
VFIO updates for 2.4-rc0
- "real" host page size API (Peter Crosthwaite)
- platform device irqfd support (Eric Auger)
- spapr container disconnect fix (Alexey Kardashevskiy)
- quirk for broken Chelsio hardware (Gabriel Laupre)
- coverity fix (Paolo Bonzini)
# gpg: Signature made Mon Jul 6 19:23:49 2015 BST using RSA key ID
3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg: aka "Alex Williamson <alex@shazbot.org>"
# gpg: aka "Alex Williamson <alwillia@redhat.com>"
# gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>"
* remotes/awilliam/tags/vfio-update-
20150706.0:
vfio/pci : Add pba_offset PCI quirk for Chelsio T5 devices
vfio: Unregister IOMMU notifiers when container is destroyed
hw/vfio/platform: add irqfd support
kvm: some fixes to kvm_resamplefds_allowed
sysbus: add irq_routing_notifier
intc: arm_gic_kvm: set the qemu_irq/gsi mapping
kvm-all.c: add qemu_irq/gsi hash table and utility routines
kvm: rename kvm_irqchip_[add,remove]_irqfd_notifier with gsi suffix
vfio: cpu: Use "real" page size API
cpu-all: complete "real" host page size API
vfio: fix return type of pread
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Conflicts:
kvm-all.c
Peter Maydell [Mon, 6 Jul 2015 22:37:53 +0000 (23:37 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-smm' into staging
This series implements KVM support for SMM, and lets you enable/disable
it through the "smm" property of x86 machine types.
# gpg: Signature made Mon Jul 6 17:41:05 2015 BST using RSA key ID
78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream-smm:
pc: add SMM property
ich9: add smm_enabled field and arguments
pc_piix: rename kvm_enabled to smm_enabled
target-i386: register a separate KVM address space including SMRAM regions
kvm-all: kvm_irqchip_create is not expected to fail
kvm-all: add support for multiple address spaces
kvm-all: make KVM's memory listener more generic
kvm-all: move internal types to kvm_int.h
kvm-all: remove useless typedef
kvm-all: put kvm_mem_flags to more work
target-i386: add support for SMBASE MSR and SMIs
piix4/ich9: do not raise SMI on ACPI enable/disable commands
linux-headers: Update to 4.2-rc1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Gabriel Laupre [Mon, 6 Jul 2015 18:15:15 +0000 (12:15 -0600)]
vfio/pci : Add pba_offset PCI quirk for Chelsio T5 devices
Fix pba_offset initialization value for Chelsio T5 Virtual Function
device. The T5 hardware has a bug in it where it reports a Pending Interrupt
Bit Array Offset of 0x8000 for its SR-IOV Virtual Functions instead
of the 0x1000 that the hardware actually uses internally. As the hardware
doesn't return the correct pba_offset value, add a quirk to instead
return a hardcoded value of 0x1000 when a Chelsio T5 VF device is
detected.
This bug has been fixed in the Chelsio's next chip series T6 but there are
no plans to respin the T5 ASIC for this bug. It is just documented in the
T5 Errata and left it at that.
Signed-off-by: Gabriel Laupre <glaupre@chelsio.com>
Reviewed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Alexey Kardashevskiy [Mon, 6 Jul 2015 18:15:15 +0000 (12:15 -0600)]
vfio: Unregister IOMMU notifiers when container is destroyed
On systems with guest visible IOMMU, adding a new memory region onto
PCI bus calls vfio_listener_region_add() for every DMA window. This
installs a notifier for IOMMU memory regions. The notifier is supposed
to be removed vfio_listener_region_del(), however in the case of mixed
PHB (emulated + VFIO devices) when last VFIO device is unplugged and
container gets destroyed, all existing DMA windows stay alive altogether
with the notifiers which are on the linked list which head was in
the destroyed container.
This unregisters IOMMU memory region notifier when a container is
destroyed.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Eric Auger [Mon, 6 Jul 2015 18:15:14 +0000 (12:15 -0600)]
hw/vfio/platform: add irqfd support
This patch aims at optimizing IRQ handling using irqfd framework.
Instead of handling the eventfds on user-side they are handled on
kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.
the virtual IRQ completion is trapped at interrupt controller
This removes the need for fast/slow path swap.
Overall this brings significant performance improvements.
Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Vikram Sethi <vikrams@codeaurora.org>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Eric Auger [Mon, 6 Jul 2015 18:15:14 +0000 (12:15 -0600)]
kvm: some fixes to kvm_resamplefds_allowed
Commit
f41389ae3c54b introduced kvm_resamplefds_enabled() and
associated kvm_resamplefds_allowed boolean. This patch adds
non-KVM version for kvm_resamplefds_enabled and also declares
kvm_resamplefds_allowed in kvm-stub as it is done for fellow
kvm_irqfds_allowed.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>