sdk/emulator/qemu.git
Igor Mammedov [Thu, 9 Jan 2014 16:36:39 +0000 (17:36 +0100)]
pc: ACPI: update acpi-dsdt.hex.generated q35-acpi-dsdt.hex.generated

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:38 +0000 (17:36 +0100)]
pc: ACPI: unify source of CPU hotplug IO base/len

Use the C header defines as the source of the IO base/len for the
respective values in the ASL code.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:37 +0000 (17:36 +0100)]
pc: ACPI: expose PRST IO range via _CRS

... so that OSPM can notice a resource conflict, if there is one.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:36 +0000 (17:36 +0100)]
pc: Q35 DSDT: exclude CPU hotplug IO range from PCI bus resources

... for range defined at hw/acpi/ich9.c:ICH9_PROC_BASE

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:35 +0000 (17:36 +0100)]
pc: PIIX DSDT: exclude CPU/PCI hotplug & GPE0 IO range from PCI bus resources

... so that they are not used by PCI devices.

Note:
Resort to concatenating templates with preprocessor help,
because the ACPI 1.0b spec does not support ConcatenateResTemplate;
as a result, Windows XP fails to execute the PCI0._CRS method if
ConcatenateResTemplate() is used.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:34 +0000 (17:36 +0100)]
pc: set PRST base in DSDT depending on chipset

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:32 +0000 (17:36 +0100)]
acpi: ich9: add CPU hotplug handling to Q35 machine

... use the IO port range 0x0cd8-0x0cf7 for the CPU present bitmap

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:31 +0000 (17:36 +0100)]
acpi: factor out common cpu hotplug code for PIIX4/Q35

.. so it could be used for adding CPU hotplug to Q35 machine

Add an additional header that will be shared between
C and ASL code: include/hw/acpi/cpu_hotplug_defs.h

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Mon, 14 Oct 2013 15:01:29 +0000 (18:01 +0300)]
acpi-build: enable hotplug for PCI bridges

This enables support for device hotplug behind
pci bridges. Bridge devices themselves need
to be pre-configured on qemu command line.

Design:
    - at machine init time, assign "bsel" property to bridges with
      hotplug support
    - dynamically (at ACPI table read) generate ACPI code to handle
      hotplug events for each bridge with "bsel" property

Note: ACPI doesn't support adding or removing bridges by hotplug.
We detect and prevent removal of bridges by hotplug,
unless they were added by hotplug previously
(and so, are not described by ACPI).

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Mon, 14 Oct 2013 15:01:20 +0000 (18:01 +0300)]
piix4: add acpi pci hotplug support

Add support for acpi pci hotplug using the
new infrastructure.
PIIX4 legacy interface is maintained as is for
machine types 1.7 and older.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Mon, 14 Oct 2013 15:01:11 +0000 (18:01 +0300)]
pcihp: generalization of piix4 acpi

Add ACPI based PCI hotplug library with bridge hotplug
support.
Design
   - each bus gets assigned "bsel" property.
   - ACPI code writes this number
     to a new BNUM register, then uses existing
     UP/DOWN registers to probe slot status;
     to eject, write number to BNUM register,
     then slot into existing EJ.

The interface is actually backwards-compatible with
existing PIIX4 ACPI (though not migration compatible).

This is split out from PIIX4 codebase so we can
reuse it for Q35 as well.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
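
The bus-select protocol from the pcihp design notes above can be
modelled with a toy device. This is only a sketch under stated
assumptions: the class and method names are hypothetical, UP/DOWN/EJ are
assumed to be per-bus slot bitmaps, and QEMU's real register offsets,
widths and side effects are not reproduced.

```python
class PciHotplugModel:
    """Toy model of the BNUM/UP/DOWN/EJ pattern; not QEMU's implementation."""

    def __init__(self):
        self.bnum = 0        # currently selected bus (its "bsel" value)
        self.up = {}         # bsel -> bitmap of slots with a pending insert
        self.down = {}       # bsel -> bitmap of slots with a pending removal
        self.ejected = []    # (bsel, slot) pairs, kept for inspection

    # Host-side events (hypothetical helpers for the model).
    def hotplug(self, bsel, slot):
        self.up[bsel] = self.up.get(bsel, 0) | (1 << slot)

    def request_unplug(self, bsel, slot):
        self.down[bsel] = self.down.get(bsel, 0) | (1 << slot)

    # Guest-visible accesses: select the bus first, then use UP/DOWN/EJ.
    def write_bnum(self, bsel):
        self.bnum = bsel

    def read_up(self):
        return self.up.get(self.bnum, 0)

    def read_down(self):
        return self.down.get(self.bnum, 0)

    def write_ej(self, slot_mask):
        for slot in range(32):
            if slot_mask & (1 << slot):
                self.ejected.append((self.bnum, slot))
        self.down[self.bnum] = self.down.get(self.bnum, 0) & ~slot_mask


dev = PciHotplugModel()
dev.hotplug(bsel=1, slot=3)
dev.request_unplug(bsel=2, slot=5)

dev.write_bnum(1)                 # probe bus 1
up_on_bus1 = dev.read_up()

dev.write_bnum(2)                 # eject on bus 2: BNUM first, then EJ
dev.write_ej(dev.read_down())
```

The point of the model is that UP, DOWN and EJ keep their existing
meaning and only the preceding BNUM write decides which bus they refer
to.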
Michael S. Tsirkin [Mon, 14 Oct 2013 15:01:07 +0000 (18:01 +0300)]
pci: add pci_for_each_bus_depth_first

Useful for ACPI hotplug.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Thu, 9 Jan 2014 16:36:33 +0000 (17:36 +0100)]
pc: make: fix dependencies: rebuild when included file is changed

Some *.dsl files include other *.dsl files, but there were no
dependencies between them, so when an included file changed, the target
table wasn't rebuilt. Fix this by using the same automatic dependency
generation as for C files.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 16 Jan 2014 15:50:48 +0000 (17:50 +0200)]
acpi unit-test: do not fail on asl mismatch

The asl comparison will break every time the ACPI
tables are updated. This may break git bisect.
Instead of failing, print a warning on stderr
that names the retained asl files, so they can be
compared offline.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 16 Jan 2014 15:50:47 +0000 (17:50 +0200)]
acpi unit-test: resolved iasl crash

It seems that iasl has an issue when it disassembles
some ACPI tables using the command line:
iasl -e DSDT -e SSDT -d HPET

Modified the iasl command line to "iasl -d HPET"
until the problem is solved. The command line
remained the same for DSDT and SSDT tables.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 16 Jan 2014 15:50:46 +0000 (17:50 +0200)]
acpi unit-test: renamed ssdt_tables to tables

Just a refactoring, ssdt_tables name was confusing as
it included other tables as well.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alexey Kardashevskiy [Mon, 13 Jan 2014 07:33:53 +0000 (18:33 +1100)]
tests: fix acpi to work on bigendian host

A double endianness conversion makes this test fail on a POWERPC machine
running in big-endian mode.

This fixes the test so that it succeeds on a big-endian host.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
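
Why a double conversion only bites on big-endian hosts can be shown
with a small simulation. This is an illustration, not the test's code:
the helper name is hypothetical, and the commit message does not say
where the second conversion happened.

```python
import struct


def le32_to_cpu_on_be_host(raw_word):
    """Byte-swap a 32-bit value, as an LE-to-CPU helper would on a big-endian host."""
    return struct.unpack("<I", struct.pack(">I", raw_word))[0]


# ACPI table fields are stored little-endian in memory.
table_bytes = struct.pack("<I", 0x12345678)

# A big-endian host that loads the four bytes natively sees them swapped.
native_be = struct.unpack(">I", table_bytes)[0]

once = le32_to_cpu_on_be_host(native_be)   # one conversion: correct value
twice = le32_to_cpu_on_be_host(once)       # second conversion: swapped again
```

On a little-endian host the same helper is a no-op, so applying it
twice goes unnoticed there.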
Marcel Apfelbaum [Thu, 26 Dec 2013 14:54:25 +0000 (16:54 +0200)]
acpi unit-test: hook to rebuild expected aml files

When running the test with TEST_ACPI_REBUILD_AML=y environment
variable, the test will rebuild and validate the expected aml
files.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 26 Dec 2013 14:54:24 +0000 (16:54 +0200)]
acpi unit-test: added script to rebuild the expected aml files

The ACPI unit-test will fail every time the ACPI tables change.
This script rebuilds the expected aml files, so the test
will pass. It also validates the modifications.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 26 Dec 2013 14:54:23 +0000 (16:54 +0200)]
acpi unit-test: extract iasl executable from configuration

The test checked if iasl is installed by running "iasl"
and checking the error output.
It is better to use the iasl executable as it appears
in the configuration.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 26 Dec 2013 14:54:22 +0000 (16:54 +0200)]
configure: add CONFIG_IASL to config-host.h

The ACPI unit-tests will take the iasl executable
from the CONFIG_IASL define.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 26 Dec 2013 14:54:21 +0000 (16:54 +0200)]
acpi unit-test: compare DSDT and SSDT tables against expected values

This test will run only if iasl is installed on the host machine.
The test plan:
 1. Dumps the ACPI tables as AML on the disk.
 2. Runs iasl to disassemble the tables into ASL files.
 3. Runs iasl to disassemble the offline AML files into ASL files.
 4. Compares the ASL files.

The test runs for both default machine and q35.
In case the test fails, it can be easily tweaked to
show the differences between the ASL files and
understand the issue.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
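
Step 4 of the test plan above (comparing the ASL files) can be sketched
as a plain text diff. This is a minimal illustration, not the test's
actual code: the function name and the two sample strings are
placeholders, and steps 1 to 3 (dumping the tables and running iasl)
are not reproduced.

```python
import difflib


def compare_asl(actual_asl, expected_asl):
    """Return unified-diff lines between two disassembled tables (empty if equal)."""
    return list(difflib.unified_diff(
        expected_asl.splitlines(),
        actual_asl.splitlines(),
        fromfile="expected.dsl",
        tofile="actual.dsl",
        lineterm="",
    ))


# Placeholder text standing in for disassembled tables.
expected = "Device (A)\nDevice (B)\n"
actual = "Device (A)\nDevice (C)\n"

diff = compare_asl(actual, expected)
for line in diff:
    print(line)
```

Returning the diff lines rather than a boolean makes it easy to show
the differences when the comparison does not match.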
Marcel Apfelbaum [Thu, 26 Dec 2013 14:54:20 +0000 (16:54 +0200)]
configure: added acpi unit-test files

Ensure configure will set up links for the files
if the build is created in another directory.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Thu, 26 Dec 2013 14:54:19 +0000 (16:54 +0200)]
acpi unit-test: add test files

Added unit-test's expected aml files to be compared
with the actual ACPI tables.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Weil [Sun, 22 Dec 2013 14:51:22 +0000 (15:51 +0100)]
virtio: Fix return value for dummy function vhost_net_virtqueue_pending

cgcc complains that -ENOSYS is not a good value for 'bool'.

A dummy virtio will never have pending queue entries, so let us return
false.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Gabriel L. Somlo [Mon, 13 Jan 2014 20:27:13 +0000 (15:27 -0500)]
ACPI: Fix AppleSMC _STA size

Minimize the storage used for AppleSMC's _STA (8bit), relying on ASL
to implicitly convert it to the officially specified 32bit value.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Gabriel L. Somlo [Sun, 22 Dec 2013 15:34:56 +0000 (10:34 -0500)]
Add DSDT node for AppleSMC

AppleSMC (-device isa-applesmc) is required to boot OS X guests.
OS X expects a SMC node to be present in the ACPI DSDT. This patch
adds a SMC node to the DSDT, and dynamically patches the return value
of SMC._STA to either 0x0B if the chip is present, or otherwise to 0x00,
before booting the guest.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
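
The two _STA values mentioned above can be decoded with the status bit
layout from the ACPI specification (bit 0 present, bit 1 enabled, bit 2
shown in UI, bit 3 functioning). A small sketch; the constant and
function names are hypothetical and not taken from QEMU.

```python
# _STA bit positions as defined by the ACPI specification.
STA_PRESENT = 1 << 0
STA_ENABLED = 1 << 1
STA_SHOWN_IN_UI = 1 << 2
STA_FUNCTIONING = 1 << 3


def decode_sta(value):
    """Split an _STA return value into its four low status flags."""
    return {
        "present": bool(value & STA_PRESENT),
        "enabled": bool(value & STA_ENABLED),
        "shown_in_ui": bool(value & STA_SHOWN_IN_UI),
        "functioning": bool(value & STA_FUNCTIONING),
    }


chip_present = decode_sta(0x0B)   # present, enabled, functioning, hidden from UI
chip_absent = decode_sta(0x00)    # nothing set
```

Both values fit into a single byte, which is what makes the 8-bit
storage in the follow-up _STA size fix sufficient.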
Laszlo Ersek [Tue, 17 Dec 2013 00:37:06 +0000 (01:37 +0100)]
Python-lang gdb script to extract x86_64 guest vmcore from qemu coredump

When qemu dies unexpectedly, for example in response to an explicit
abort() call, or (more importantly) when an external signal is delivered
to it that results in a coredump, sometimes it is useful to extract the
guest vmcore from the qemu process' memory image. The guest vmcore might
help understand an emulation problem in qemu, or help debug the guest.

This script reimplements (and cuts many features of) the
qmp_dump_guest_memory() command in gdb/Python,

  https://sourceware.org/gdb/current/onlinedocs/gdb/Python-API.html

working off the saved memory image of the qemu process. The docstring in
the patch (serving as gdb help text) describes the limitations relative to
the QMP command.

Dependencies of qmp_dump_guest_memory() have been reimplemented as needed.
I sought to follow the general structure, sticking to original function
names where possible. However, keeping it simple prevailed in some places.

The patch has been tested with a 4 VCPU, 768 MB, RHEL-6.4
(2.6.32-358.el6.x86_64) guest:

- The script printed

> guest RAM blocks:
> target_start     target_end       host_addr        message count
> ---------------- ---------------- ---------------- ------- -----
> 0000000000000000 00000000000a0000 00007f95d0000000 added       1
> 00000000000a0000 00000000000b0000 00007f960ac00000 added       2
> 00000000000c0000 00000000000ca000 00007f95d00c0000 added       3
> 00000000000ca000 00000000000cd000 00007f95d00ca000 joined      3
> 00000000000cd000 00000000000d0000 00007f95d00cd000 joined      3
> 00000000000d0000 00000000000f0000 00007f95d00d0000 joined      3
> 00000000000f0000 0000000000100000 00007f95d00f0000 joined      3
> 0000000000100000 0000000030000000 00007f95d0100000 joined      3
> 00000000fc000000 00000000fc800000 00007f960ac00000 added       4
> 00000000fffe0000 0000000100000000 00007f9618800000 added       5
> dumping range at 00007f95d0000000 for length 00000000000a0000
> dumping range at 00007f960ac00000 for length 0000000000010000
> dumping range at 00007f95d00c0000 for length 000000002ff40000
> dumping range at 00007f960ac00000 for length 0000000000800000
> dumping range at 00007f9618800000 for length 0000000000020000

- The vmcore was checked with "readelf", comparing the results against a
  vmcore written by qmp_dump_guest_memory():

> --- theirs      2013-09-12 17:38:59.797289404 +0200
> +++ mine        2013-09-12 17:39:03.820289404 +0200
> @@ -27,16 +27,16 @@
>    Type           Offset             VirtAddr           PhysAddr
>                   FileSiz            MemSiz              Flags  Align
>    NOTE           0x0000000000000190 0x0000000000000000 0x0000000000000000
> -                 0x0000000000000ca0 0x0000000000000ca0         0
> -  LOAD           0x0000000000000e30 0x0000000000000000 0x0000000000000000
> +                 0x000000000000001c 0x000000000000001c         0
> +  LOAD           0x00000000000001ac 0x0000000000000000 0x0000000000000000
>                   0x00000000000a0000 0x00000000000a0000         0
> -  LOAD           0x00000000000a0e30 0x0000000000000000 0x00000000000a0000
> +  LOAD           0x00000000000a01ac 0x0000000000000000 0x00000000000a0000
>                   0x0000000000010000 0x0000000000010000         0
> -  LOAD           0x00000000000b0e30 0x0000000000000000 0x00000000000c0000
> +  LOAD           0x00000000000b01ac 0x0000000000000000 0x00000000000c0000
>                   0x000000002ff40000 0x000000002ff40000         0
> -  LOAD           0x000000002fff0e30 0x0000000000000000 0x00000000fc000000
> +  LOAD           0x000000002fff01ac 0x0000000000000000 0x00000000fc000000
>                   0x0000000000800000 0x0000000000800000         0
> -  LOAD           0x00000000307f0e30 0x0000000000000000 0x00000000fffe0000
> +  LOAD           0x00000000307f01ac 0x0000000000000000 0x00000000fffe0000
>                   0x0000000000020000 0x0000000000020000         0
>
>  There is no dynamic section in this file.
> @@ -47,13 +47,6 @@
>
>  No version information found in this file.
>
> -Notes at offset 0x00000190 with length 0x00000ca0:
> +Notes at offset 0x00000190 with length 0x0000001c:
>    Owner                Data size       Description
> -  CORE         0x00000150      NT_PRSTATUS (prstatus structure)
> -  CORE         0x00000150      NT_PRSTATUS (prstatus structure)
> -  CORE         0x00000150      NT_PRSTATUS (prstatus structure)
> -  CORE         0x00000150      NT_PRSTATUS (prstatus structure)
> -  QEMU         0x000001b0      Unknown note type: (0x00000000)
> -  QEMU         0x000001b0      Unknown note type: (0x00000000)
> -  QEMU         0x000001b0      Unknown note type: (0x00000000)
> -  QEMU         0x000001b0      Unknown note type: (0x00000000)
> +  NONE         0x00000005      Unknown note type: (0x00000000)

- The vmcore was checked with "crash" too, again comparing the results
  against a vmcore written by qmp_dump_guest_memory():

> --- guest.vmcore.log2   2013-09-12 17:52:27.074289201 +0200
> +++ example.dump.log2   2013-09-12 17:52:15.904289203 +0200
> @@ -22,11 +22,11 @@
>  This GDB was configured as "x86_64-unknown-linux-gnu"...
>
>       KERNEL: /usr/lib/debug/lib/modules/2.6.32-358.el6.x86_64/vmlinux
> -    DUMPFILE: /home/lacos/tmp/guest.vmcore
> +    DUMPFILE: /home/lacos/tmp/example.dump
>          CPUS: 4
> -        DATE: Thu Sep 12 17:16:11 2013
> -      UPTIME: 00:01:09
> -LOAD AVERAGE: 0.07, 0.03, 0.00
> +        DATE: Thu Sep 12 17:17:41 2013
> +      UPTIME: 00:00:38
> +LOAD AVERAGE: 0.18, 0.05, 0.01
>         TASKS: 130
>      NODENAME: localhost.localdomain
>       RELEASE: 2.6.32-358.el6.x86_64
> @@ -38,12 +38,12 @@
>       COMMAND: "swapper"
>          TASK: ffffffff81a8d020  (1 of 4)  [THREAD_INFO: ffffffff81a00000]
>           CPU: 0
> -       STATE: TASK_RUNNING (PANIC)
> +       STATE: TASK_RUNNING (ACTIVE)
> +     WARNING: panic task not found
>
>  crash> bt
>  PID: 0      TASK: ffffffff81a8d020  CPU: 0   COMMAND: "swapper"
> - #0 [ffffffff81a01ed0] default_idle at ffffffff8101495d
> - #1 [ffffffff81a01ef0] cpu_idle at ffffffff81009fc6
> + #0 [ffffffff81a01ef0] cpu_idle at ffffffff81009fc6
>  crash> task ffffffff81a8d020
>  PID: 0      TASK: ffffffff81a8d020  CPU: 0   COMMAND: "swapper"
>  struct task_struct {
> @@ -75,7 +75,7 @@
>        prev = 0xffffffff81a8d080
>      },
>      on_rq = 0,
> -    exec_start = 8618466836,
> +    exec_start = 7469214014,
>      sum_exec_runtime = 0,
>      vruntime = 0,
>      prev_sum_exec_runtime = 0,
> @@ -149,7 +149,7 @@
>    },
>    tasks = {
>      next = 0xffff88002d621948,
> -    prev = 0xffff880029618f28
> +    prev = 0xffff880023b74488
>    },
>    pushable_tasks = {
>      prio = 140,
> @@ -165,7 +165,7 @@
>      }
>    },
>    mm = 0x0,
> -  active_mm = 0xffff88002929b780,
> +  active_mm = 0xffff8800297eb980,
>    exit_state = 0,
>    exit_code = 0,
>    exit_signal = 0,
> @@ -177,7 +177,7 @@
>    sched_reset_on_fork = 0,
>    pid = 0,
>    tgid = 0,
> -  stack_canary = 2483693585637059287,
> +  stack_canary = 7266362296181431986,
>    real_parent = 0xffffffff81a8d020,
>    parent = 0xffffffff81a8d020,
>    children = {
> @@ -224,14 +224,14 @@
>    set_child_tid = 0x0,
>    clear_child_tid = 0x0,
>    utime = 0,
> -  stime = 3,
> +  stime = 2,
>    utimescaled = 0,
> -  stimescaled = 3,
> +  stimescaled = 2,
>    gtime = 0,
>    prev_utime = 0,
>    prev_stime = 0,
>    nvcsw = 0,
> -  nivcsw = 1000,
> +  nivcsw = 1764,
>    start_time = {
>      tv_sec = 0,
>      tv_nsec = 0

- <name_dropping>I asked for Dave Anderson's help with verifying the
  extracted vmcore, and his comments make me think I should post
  this.</name_dropping>

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
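
The "added"/"joined" column in the RAM block table above suggests that
a block is merged into its predecessor when it is contiguous in both
guest and host address space. The sketch below applies that inferred
rule to the printed table; the function name is hypothetical and the
actual script may decide differently.

```python
def join_ram_blocks(blocks):
    """Merge blocks that are contiguous in guest and host address space.

    blocks: (target_start, target_end, host_addr) tuples in ascending
    target_start order. Returns a list of [start, end, host] ranges.
    """
    ranges = []
    for target_start, target_end, host_addr in blocks:
        if ranges:
            prev = ranges[-1]
            prev_len = prev[1] - prev[0]
            if prev[1] == target_start and prev[2] + prev_len == host_addr:
                prev[1] = target_end          # "joined"
                continue
        ranges.append([target_start, target_end, host_addr])  # "added"
    return ranges


# The ten rows printed by the script in the commit message.
blocks = [
    (0x00000000, 0x000a0000, 0x7f95d0000000),
    (0x000a0000, 0x000b0000, 0x7f960ac00000),
    (0x000c0000, 0x000ca000, 0x7f95d00c0000),
    (0x000ca000, 0x000cd000, 0x7f95d00ca000),
    (0x000cd000, 0x000d0000, 0x7f95d00cd000),
    (0x000d0000, 0x000f0000, 0x7f95d00d0000),
    (0x000f0000, 0x00100000, 0x7f95d00f0000),
    (0x00100000, 0x30000000, 0x7f95d0100000),
    (0xfc000000, 0xfc800000, 0x7f960ac00000),
    (0xfffe0000, 0x100000000, 0x7f9618800000),
]

ranges = join_ram_blocks(blocks)
lengths = [end - start for start, end, _host in ranges]
for (start, end, host), length in zip(ranges, lengths):
    print("dumping range at %016x for length %016x" % (host, length))
```

Under this rule the ten blocks collapse into five ranges whose host
addresses and lengths match the "dumping range" lines in the message.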
Anthony Liguori [Fri, 24 Jan 2014 23:52:44 +0000 (15:52 -0800)]
Merge remote-tracking branch 'qemu-kvm/uq/master' into staging

* qemu-kvm/uq/master:
  kvm: always update the MPX model specific register
  KVM: fix addr type for KVM_IOEVENTFD
  KVM: Retry KVM_CREATE_VM on EINTR
  mempath prefault: fix off-by-one error
  kvm: x86: Separately write feature control MSR on reset
  roms: Flush icache when writing roms to guest memory
  target-i386: clear guest TSC on reset
  target-i386: do not special case TSC writeback
  target-i386: Intel MPX

Conflicts:
exec.c

aliguori: fix trivial merge conflict in exec.c

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Anthony Liguori [Fri, 24 Jan 2014 23:52:16 +0000 (15:52 -0800)]
Merge remote-tracking branch 'otubo/seccomp' into staging

* otubo/seccomp:
  seccomp: add some basic shared memory syscalls to the whitelist
  seccomp: add mkdir() and fchmod() to the whitelist

Message-id: 1390231004-18392-1-git-send-email-otubo@linux.vnet.ibm.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Anthony Liguori [Fri, 24 Jan 2014 23:52:08 +0000 (15:52 -0800)]
Merge remote-tracking branch 'sweil/tags/for_anthony' into staging

Initial patch for QEMU GTK support on Windows

# gpg: Signature made Mon 20 Jan 2014 11:37:58 AM PST using RSA key ID FAD62069
# gpg: Can't check signature: public key not found

* sweil/tags/for_anthony:
  gtk: Support keyboard translation for hosts running Windows

Message-id: 1390246909-18757-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Anthony Liguori [Fri, 24 Jan 2014 23:51:38 +0000 (15:51 -0800)]
Merge remote-tracking branch 'kraxel/tags/pull-audio-2' into staging

hda-codec: disable streams on reset

# gpg: Signature made Tue 21 Jan 2014 02:17:12 AM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found

* kraxel/tags/pull-audio-2:
  hda-codec: disable streams on reset

Message-id: 1390299589-5082-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Anthony Liguori [Fri, 24 Jan 2014 23:51:23 +0000 (15:51 -0800)]
Merge remote-tracking branch 'kraxel/tags/pull-usb-2' into staging

usb core+hid: add support for microsoft os descriptors

# gpg: Signature made Tue 21 Jan 2014 02:21:29 AM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found

* kraxel/tags/pull-usb-2:
  usb-hid: add microsoft os descriptor support
  usb: add support for microsoft os descriptors

Message-id: 1390299772-5368-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Anthony Liguori [Fri, 24 Jan 2014 23:50:14 +0000 (15:50 -0800)]
Merge remote-tracking branch 'bonzini/scsi-next' into staging

* bonzini/scsi-next:
  scsi: Support TEST UNIT READY in the dummy LUN0
  block: add .bdrv_reopen_prepare() stub for iscsi
  virtio-scsi: Prevent assertion on missed events
  virtio-scsi: Cleanup of I/Os that never started
  scsi: Assign cancel_io vector for scsi_disk_emulate_ops

Conflicts:
block/iscsi.c

aliguori: resolve trivial merge conflict in block/iscsi.c

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Anthony Liguori [Fri, 24 Jan 2014 23:43:30 +0000 (15:43 -0800)]
Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging

Block patches

# gpg: Signature made Fri 24 Jan 2014 08:40:53 AM PST using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found

* kwolf/tags/for-anthony: (93 commits)
  block: Switch bdrv_io_limits_intercept() to byte granularity
  qemu-iotests: Test pwritev RMW logic
  qemu-io: New command 'sleep'
  blkdebug: Make required alignment configurable
  iscsi: Set bs->request_alignment
  block: Make bdrv_pwrite() a bdrv_prwv_co() wrapper
  block: Make bdrv_pread() a bdrv_prwv_co() wrapper
  block: Change coroutine wrapper to byte granularity
  block: Assert serialisation assumptions in pwritev
  block: Align requests in bdrv_co_do_pwritev()
  block: Allow wait_serialising_requests() at any point
  block: Make overlap range for serialisation dynamic
  block: Generalise and optimise COR serialisation
  block: Make zero-after-EOF work with larger alignment
  block: Allow waiting for overlapping requests between begin/end
  block: Switch BdrvTrackedRequest to byte granularity
  block: Introduce bdrv_co_do_pwritev()
  block: write: Handle COR dependency after I/O throttling
  block: Introduce bdrv_aligned_pwritev()
  block: Introduce bdrv_co_do_preadv()
  ...

Message-id: 1390584136-24703-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Kevin Wolf [Thu, 16 Jan 2014 12:29:10 +0000 (13:29 +0100)]
block: Switch bdrv_io_limits_intercept() to byte granularity

Request sizes used to be rounded down to the next sector boundary,
which allowed requests to bypass the I/O limit. Now all requests are
accounted for with their exact byte size.

Reported-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
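
The bypass described above can be illustrated with a few lines of
arithmetic. This is a sketch of the accounting idea only: the function
names are hypothetical and QEMU's throttling code is not reproduced.

```python
SECTOR_SIZE = 512


def accounted_sector_granularity(nbytes):
    """Old behaviour as described: size rounded down to whole sectors."""
    return (nbytes // SECTOR_SIZE) * SECTOR_SIZE


def accounted_byte_granularity(nbytes):
    """New behaviour: every request counts with its exact size."""
    return nbytes


# A guest issuing many sub-sector requests.
requests = [511] * 1000

old_total = sum(accounted_sector_granularity(n) for n in requests)
new_total = sum(accounted_byte_granularity(n) for n in requests)
```

With sector rounding, none of the 511-byte requests is charged against
the limit, while byte accounting charges all of them.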
Kevin Wolf [Tue, 14 Jan 2014 14:37:03 +0000 (15:37 +0100)]
qemu-iotests: Test pwritev RMW logic

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Wed, 15 Jan 2014 14:39:10 +0000 (15:39 +0100)]
qemu-io: New command 'sleep'

There is no easy way to check that a request correctly waits for a
different request. With a sleep command we can at least approximate it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Tue, 14 Jan 2014 12:44:35 +0000 (13:44 +0100)]
blkdebug: Make required alignment configurable

The new 'align' option of blkdebug can be used in order to emulate
backends with a required 4k alignment on hosts which only really require
512 byte alignment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Tue, 29 Nov 2011 11:41:35 +0000 (12:41 +0100)]
iscsi: Set bs->request_alignment

The iSCSI backend already gets the block size from the READ CAPACITY
command it sends.  Save it so that the generic block layer gets it
too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Thu, 5 Dec 2013 11:34:02 +0000 (12:34 +0100)]
block: Make bdrv_pwrite() a bdrv_prwv_co() wrapper

Instead of implementing the alignment adjustment here, use the now
existing functionality of bdrv_co_do_pwritev().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Thu, 5 Dec 2013 11:29:59 +0000 (12:29 +0100)]
block: Make bdrv_pread() a bdrv_prwv_co() wrapper

Instead of implementing the alignment adjustment here, use the now
existing functionality of bdrv_co_do_preadv().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Thu, 5 Dec 2013 11:09:38 +0000 (12:09 +0100)]
block: Change coroutine wrapper to byte granularity

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Tue, 14 Jan 2014 10:41:35 +0000 (11:41 +0100)]
block: Assert serialisation assumptions in pwritev

If a request calls wait_serialising_requests() and actually has to wait
in this function (i.e. a coroutine yield), other requests can run and
previously read data (like the head or tail buffer) could become
outdated. In this case, we would have to restart from the beginning to
read in the updated data.

However, we're lucky and don't actually need to do that: A request can
only wait in the first call of wait_serialising_requests() because we
mark it as serialising before that call, so any later requests would
wait. So as we don't wait in practice, we don't have to reload the data.

This is an important assumption that must not be broken, or data
corruption will happen. Document it with some assertions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf [Tue, 3 Dec 2013 15:34:41 +0000 (16:34 +0100)]
block: Align requests in bdrv_co_do_pwritev()

This patch changes bdrv_co_do_pwritev() to actually be what its name
promises. If requests aren't properly aligned, it performs a RMW.

Requests touching the same block are serialised against the RMW request.
Further optimisation of this is possible by differentiating types of
requests (concurrent reads should actually be okay here).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
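
The read-modify-write (RMW) step for unaligned writes described above
can be sketched against an in-memory "device". This is a simplified
illustration with hypothetical names: it reads the whole aligned region
rather than only the head and tail, and it does not model the
serialisation against other requests.

```python
def round_down(value, align):
    return value - value % align


def round_up(value, align):
    return round_down(value + align - 1, align)


def rmw_write(device, offset, data, align):
    """Write data at a byte offset using only align-sized, aligned I/O."""
    start = round_down(offset, align)
    end = round_up(offset + len(data), align)

    buf = bytearray(device[start:end])                        # read
    buf[offset - start:offset - start + len(data)] = data     # modify
    device[start:end] = buf                                   # write
    return start, end


device = bytearray(b"." * 16)
touched = rmw_write(device, offset=3, data=b"ABC", align=4)
```

The returned range is the aligned region that was actually rewritten,
which is also the region other requests would have to be serialised
against.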
Kevin Wolf [Fri, 13 Dec 2013 12:04:35 +0000 (13:04 +0100)]
block: Allow wait_serialising_requests() at any point

We can only have a single wait_serialising_requests() call per request
because otherwise we can run into deadlocks where requests are waiting
for each other. The same is true when wait_serialising_requests() is not
at the very beginning of a request, so that other requests can be issued
between the start of the tracking and wait_serialising_requests().

Fix this by changing wait_serialising_requests() to ignore requests that
are already (directly or indirectly) waiting for the calling request.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Kevin Wolf [Wed, 4 Dec 2013 16:08:50 +0000 (17:08 +0100)]
block: Make overlap range for serialisation dynamic

Copy on Read wants to serialise with all requests touching the same
cluster, so wait_serialising_requests() rounded to cluster boundaries.
Other users like alignment RMW will have different requirements, though
(requests touching the same sector), so make it dynamic.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
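
How the rounding granularity changes which requests count as
overlapping can be sketched as follows. The function names and the
sizes used (512-byte sectors, 64 KiB clusters) are assumptions for the
example, not values taken from the patch.

```python
def align_range(offset, nbytes, align):
    """Expand a byte range to the enclosing align-sized boundaries."""
    start = offset - offset % align
    end = -(-(offset + nbytes) // align) * align   # round up
    return start, end


def requests_overlap(a, b, align):
    """a, b: (offset, nbytes) requests; compare their aligned ranges."""
    a_start, a_end = align_range(a[0], a[1], align)
    b_start, b_end = align_range(b[0], b[1], align)
    return a_start < b_end and b_start < a_end


first = (0, 512)        # first sector
second = (3584, 512)    # a later sector in the same 64 KiB cluster

sector_overlap = requests_overlap(first, second, align=512)
cluster_overlap = requests_overlap(first, second, align=65536)
```

The same pair of requests is independent at sector granularity but must
be serialised at cluster granularity, which is why a fixed rounding
does not suit every user.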
Kevin Wolf [Wed, 4 Dec 2013 15:43:44 +0000 (16:43 +0100)]
block: Generalise and optimise COR serialisation

Change the API so that specific requests can be marked serialising. Only
these requests are checked for overlaps then.

This means that during a Copy on Read operation, not all requests
overlapping other requests are serialised any more, but only those that
actually overlap with the specific COR request.

Also remove COR from function and variable names because this
functionality can be useful in other contexts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Make zero-after-EOF work with larger alignment
Kevin Wolf [Wed, 4 Dec 2013 11:13:10 +0000 (12:13 +0100)]
block: Make zero-after-EOF work with larger alignment

Odd file sizes could make bdrv_aligned_preadv() shorten the request in
non-aligned ways. Fix it by rounding to the required alignment instead
of 512 bytes.
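
The rounding can be sketched as follows (Python, illustrative only; it
assumes offset and length are already multiples of align, and the
function name is made up for this example):

```python
def split_read_at_eof(offset, length, file_size, align):
    """Split a read into (bytes read from the file, bytes zero-filled).

    The readable part is rounded up to align so that the request passed
    to the driver stays aligned even for odd file sizes.
    """
    remaining = max(0, file_size - offset)
    readable = -(-remaining // align) * align  # round up to alignment
    to_read = min(length, readable)
    return to_read, length - to_read
```

For a 5000 byte file and an 8192 byte read at offset 0, a 512 byte
alignment gives 5120 bytes read and 3072 zeroed, whereas a 4096 byte
alignment reads all 8192 bytes.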

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Allow waiting for overlapping requests between begin/end
Kevin Wolf [Tue, 3 Dec 2013 13:55:55 +0000 (14:55 +0100)]
block: Allow waiting for overlapping requests between begin/end

Previously, it was not possible to use wait_for_overlapping_requests()
between tracked_request_begin()/end() because it would wait for itself.

Ignore the current request in the overlap check and run more of the
bdrv_co_do_preadv/pwritev code with a BdrvTrackedRequest present.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Switch BdrvTrackedRequest to byte granularity
Kevin Wolf [Tue, 3 Dec 2013 14:31:25 +0000 (15:31 +0100)]
block: Switch BdrvTrackedRequest to byte granularity

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Introduce bdrv_co_do_pwritev()
Kevin Wolf [Tue, 3 Dec 2013 13:40:18 +0000 (14:40 +0100)]
block: Introduce bdrv_co_do_pwritev()

This is going to become the bdrv_co_do_preadv() equivalent for writes.
In this patch, however, only a function taking byte offsets is created;
it doesn't align anything yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: write: Handle COR dependency after I/O throttling
Kevin Wolf [Tue, 3 Dec 2013 13:30:44 +0000 (14:30 +0100)]
block: write: Handle COR dependency after I/O throttling

First waiting for all COR requests to complete and calling the
throttling function afterwards means that the request could be delayed
and we still need to wait for the COR request even if it was issued only
after the throttled write request.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Introduce bdrv_aligned_pwritev()
Kevin Wolf [Tue, 3 Dec 2013 13:02:23 +0000 (14:02 +0100)]
block: Introduce bdrv_aligned_pwritev()

This separates the part of bdrv_co_do_writev() that needs to happen
before the request is modified to match the backend alignment, and a
part that needs to be executed afterwards and passes the request to the
BlockDriver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Introduce bdrv_co_do_preadv()
Kevin Wolf [Mon, 2 Dec 2013 15:09:46 +0000 (16:09 +0100)]
block: Introduce bdrv_co_do_preadv()

Similar to bdrv_pread(), which aligns byte-aligned requests to 512 byte
sectors, bdrv_co_do_preadv() takes a byte-aligned request and aligns it
to the alignment specified in bs->request_alignment.
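
The arithmetic behind such an alignment step might look like this
(a Python sketch with hypothetical names; it only computes the padded
range and does not model bounce buffers or the actual I/O):

```python
def align_request(offset, length, align):
    """Widen a byte request to the given alignment.

    Returns (aligned_offset, aligned_length, head_pad, tail_pad), where
    head_pad and tail_pad are the extra bytes before and after the
    caller's data.
    """
    head_pad = offset % align
    tail_pad = -(offset + length) % align
    return offset - head_pad, head_pad + length + tail_pad, head_pad, tail_pad
```

A 100 byte read at offset 700 with a 512 byte alignment becomes a 512
byte read at offset 512, of which 188 leading and 224 trailing bytes
are discarded.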

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Introduce bdrv_aligned_preadv()
Kevin Wolf [Mon, 2 Dec 2013 14:07:48 +0000 (15:07 +0100)]
block: Introduce bdrv_aligned_preadv()

This separates the part of bdrv_co_do_readv() that needs to happen
before the request is modified to match the backend alignment, and a
part that needs to be executed afterwards and passes the request to the
BlockDriver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
10 years agoraw: Probe required direct I/O alignment
Paolo Bonzini [Tue, 29 Nov 2011 11:42:20 +0000 (12:42 +0100)]
raw: Probe required direct I/O alignment

Add a bs->request_alignment field that contains the required
offset/length alignment for I/O requests and fill it in the raw block
drivers. Use ioctls if possible, else see what alignment it takes for
O_DIRECT to succeed.

While at it, also expose the memory alignment requirements, which may be
(and in practice are) different from the disk alignment requirements.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
10 years agoblock: rename buffer_alignment to guest_block_size
Paolo Bonzini [Tue, 29 Nov 2011 10:35:47 +0000 (11:35 +0100)]
block: rename buffer_alignment to guest_block_size

The alignment field is now set to the value that is promised to the
guest, rather than required by the host.  The next patches will make
QEMU aware of the host-provided values, so make this clear.

The alignment is also not about memory buffers, but about the sectors on
the disk, so change the documentation of the field.

At this point, the field is set by the device emulation, but completely
ignored by the block layer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Don't use guest sector size for qemu_blockalign()
Kevin Wolf [Thu, 28 Nov 2013 09:23:32 +0000 (10:23 +0100)]
block: Don't use guest sector size for qemu_blockalign()

bs->buffer_alignment is set by the device emulation and contains the
logical block size of the guest device. This isn't something that the
block layer should know, and even less something to use for determining
the right alignment of buffers to be used for the host.

The new BlockLimits field opt_mem_alignment tells the qemu block layer
the optimal alignment to be used so that no bounce buffer must be used
in the driver.

This patch may change the buffer alignment from 4k to 512 for all
callers that used qemu_blockalign() with the top-level image format
BlockDriverState. The value was never propagated to other levels in the
tree, so in particular raw-posix never required anything else than 512.

While on disks with 4k sectors direct I/O requires a 4k alignment,
memory may still be okay when aligned to 512 byte boundaries. This is
what must have happened in practice, because otherwise this would
already have failed earlier. Therefore I don't expect regressions even
with this intermediate state. Later, raw-posix can implement the hook
and expose a different memory alignment requirement.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
10 years agoblock: Detect unaligned length in bdrv_qiov_is_aligned()
Kevin Wolf [Thu, 5 Dec 2013 12:01:46 +0000 (13:01 +0100)]
block: Detect unaligned length in bdrv_qiov_is_aligned()

For an O_DIRECT request to succeed, it's not only necessary that all
base addresses in the qiov are aligned, but also that each length in it
is aligned.
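
Expressed as a Python sketch (the list of (base, length) pairs is a
hypothetical stand-in for the qiov), the check covers both fields of
every element:

```python
def qiov_is_aligned(iov, align):
    """True if every element's base address and length are multiples of align."""
    return all(base % align == 0 and length % align == 0
               for base, length in iov)
```

An element with an aligned base but a length of, say, 100 bytes now
makes the whole vector count as unaligned.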

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
10 years agoqemu_memalign: Allow small alignments
Kevin Wolf [Fri, 29 Nov 2013 20:29:17 +0000 (21:29 +0100)]
qemu_memalign: Allow small alignments

The functions used by qemu_memalign() require an alignment that is at
least sizeof(void*). Adjust it if it is too small.
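
The adjustment amounts to a clamp, sketched here in Python with ctypes
supplying the pointer size (the function name is illustrative):

```python
import ctypes


def effective_alignment(alignment):
    """Raise too-small alignments to the size of a pointer."""
    return max(alignment, ctypes.sizeof(ctypes.c_void_p))
```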

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
10 years agoblock: Update BlockLimits when they might have changed
Kevin Wolf [Wed, 11 Dec 2013 19:14:09 +0000 (20:14 +0100)]
block: Update BlockLimits when they might have changed

When reopening with different flags, or when backing files disappear
from the chain, the limits may change. Make sure they get updated in
these cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
10 years agoblock: Inherit opt_transfer_length
Kevin Wolf [Wed, 11 Dec 2013 18:50:32 +0000 (19:50 +0100)]
block: Inherit opt_transfer_length

When there is a format driver between the guest and the backend, it's not guaranteed
that exposing the opt_transfer_length for the format driver results in
the optimal requests (because of fragmentation etc.), but it can't make
things worse, so let's just do it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
10 years agoblock: Move initialisation of BlockLimits to bdrv_refresh_limits()
Kevin Wolf [Wed, 11 Dec 2013 18:26:16 +0000 (19:26 +0100)]
block: Move initialisation of BlockLimits to bdrv_refresh_limits()

This function separates filling the BlockLimits from bdrv_open(), which
allows calling it from other operations which may change the limits
(e.g. modifications to the backing file chain or bdrv_reopen).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: Fix bdrv_commit return value
Kevin Wolf [Fri, 24 Jan 2014 13:00:43 +0000 (14:00 +0100)]
block: Fix bdrv_commit return value

bdrv_commit() could return 0 or 1 on success, depending on whether or
not the last sector was allocated in the overlay and whether the overlay
format had a .bdrv_make_empty callback.

Most callers ignored it, but qemu-img commit would print an error
message while the operation actually succeeded.

Also clean up the handling of I/O errors to return the real error code
instead of -EIO.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoblock: update block commit documentation regarding image truncation
Jeff Cody [Fri, 24 Jan 2014 14:02:37 +0000 (09:02 -0500)]
block: update block commit documentation regarding image truncation

This updates the documentation for committing snapshot images.
Specifically, this highlights what happens when the base image
is either smaller or larger than the snapshot image being committed.

In the case of the base image being smaller, it is resized to the
larger size of the snapshot image.  In the case of the base image
being larger, it is not resized automatically, but once the commit
has completed it is safe for the user to truncate the base image.
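
The documented behaviour boils down to the following rule, shown as a
small Python sketch (sizes are virtual sizes in bytes; the function is
purely illustrative):

```python
def base_size_after_commit(base_size, snapshot_size):
    """Return (new_base_size, was_resized) for a commit.

    A smaller base grows to the snapshot size; a larger base is left
    alone, and any later truncation is up to the user.
    """
    new_size = max(base_size, snapshot_size)
    return new_size, new_size != base_size
```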

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: resize backing image during active layer commit, if needed
Jeff Cody [Fri, 24 Jan 2014 14:02:36 +0000 (09:02 -0500)]
block: resize backing image during active layer commit, if needed

If the top image to commit is the active layer, and also larger than
the base image, then an I/O error will likely be returned during
block-commit.

For instance, if we have a base image with a virtual size 10G, and an
active layer image of size 20G, then committing the snapshot via
'block-commit' will likely fail.

This will automatically attempt to resize the base image, if the
active layer image to be committed is larger.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: resize backing file image during offline commit, if necessary
Jeff Cody [Fri, 24 Jan 2014 14:02:35 +0000 (09:02 -0500)]
block: resize backing file image during offline commit, if necessary

Currently, if an image file is logically larger than its backing file,
committing it via 'qemu-img commit' will fail.

For instance, if we have a base image with a virtual size 10G, and a
snapshot image of size 20G, then committing the snapshot offline with
'qemu-img commit' will likely fail.

This will automatically attempt to resize the base image, if the
snapshot image to be committed is larger.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock/curl: Implement the libcurl timer callback interface
Peter Maydell [Fri, 24 Jan 2014 13:56:17 +0000 (14:56 +0100)]
block/curl: Implement the libcurl timer callback interface

libcurl versions 7.16.0 and later have a timer callback interface which
must be implemented in order for libcurl to make forward progress (it
will sometimes rely on being called back on the timeout if there are
no file descriptors registered). Implement the callback, and use a
QEMU AIO timer to ensure we prod libcurl again when it asks us to.

Based on Peter's original patch plus my fix to add curl_multi_timeout_do.
Should compile just fine even on older versions of libcurl.

I also tried copy-on-read and streaming:

    $ ./qemu-img create -f qcow2 -o \
         backing_file=http://download.fedoraproject.org/pub/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso \
         foo.qcow2 1G
    $ x86_64-softmmu/qemu-system-x86_64 \
         -drive if=none,file=foo.qcow2,copy-on-read=on,id=cd \
         -device ide-cd,drive=cd --enable-kvm -m 1024

Direct http usage is probably too slow, but with copy-on-read ultimately
the image does boot!

After some time, streaming gets canceled by an EIO, which needs further
investigation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqmp: Allow to take external snapshots on bs graphs node.
Benoît Canet [Thu, 23 Jan 2014 20:31:38 +0000 (21:31 +0100)]
qmp: Allow to take external snapshots on bs graphs node.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqmp: Allow block_resize to manipulate bs graph nodes.
Benoît Canet [Thu, 23 Jan 2014 20:31:37 +0000 (21:31 +0100)]
qmp: Allow block_resize to manipulate bs graph nodes.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: Create authorizations mechanism for external snapshot and resize.
Benoît Canet [Thu, 23 Jan 2014 20:31:36 +0000 (21:31 +0100)]
block: Create authorizations mechanism for external snapshot and resize.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqmp: Allow to change password on named block driver states.
Benoît Canet [Thu, 23 Jan 2014 20:31:35 +0000 (21:31 +0100)]
qmp: Allow to change password on named block driver states.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
There were two candidate ways to implement named node manipulation:

1)
{ 'command': 'block_passwd', 'data': {'*device': 'str',
                                      '*node-name': 'str', 'password': 'str'}
}

2)

{ 'command': 'block_passwd', 'data': {'device': 'str',
                                      '*device-is-node': 'bool',
                                      'password': 'str'} }

Luiz proposed 1 and said that 2 was an abuse of the QMP interface; he also
proposed rewriting the QMP block interface for 2.0.

What Luiz does not like in 1 is that two fields are optional but one of them
must be specified, leading to an abuse of the QMP semantics.

Kevin argued that 2 was a clear abuse of the device field and would not be
practical when quickly reading a log file, because the user would read "device"
and think that a device is being manipulated when it is in fact a node name.
The documentation of 1 makes it pretty clear to the user what to do.

Kevin argued that all bs are nodes, including device ones, so 2 does not make
sense.

Kevin also argued that rewriting the QMP block interface would not make the
current one disappear.

Kevin pushed the argument that making the QAPI generator compatible with the
semantics of the operation would need a rewrite that no one has done yet.

A vote was held on the list to choose the version to use, and 1 won.

For reference the complete thread is:
"[Qemu-devel] [PATCH V4 4/7] qmp: Allow to change password on names block driver
states."
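
Because the schema of variant 1 cannot itself express "one of the two
optional fields is required", the check has to live in the command
handler. A Python sketch of such a check follows; rejecting the case
where both fields are given is an assumption of this example, and the
function name is hypothetical:

```python
def resolve_target(device=None, node_name=None):
    """Pick the lookup key for a command taking '*device' or '*node-name'.

    Returns a (kind, name) pair. Raises ValueError unless exactly one
    of the two optional arguments is present (an assumption made for
    this sketch).
    """
    if (device is None) == (node_name is None):
        raise ValueError("exactly one of 'device' or 'node-name' must be given")
    if device is not None:
        return "device", device
    return "node-name", node_name
```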

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqmp: Add QMP query-named-block-nodes to list the named BlockDriverState nodes.
Benoît Canet [Thu, 23 Jan 2014 20:31:34 +0000 (21:31 +0100)]
qmp: Add QMP query-named-block-nodes to list the named BlockDriverState nodes.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: Allow the user to define "node-name" option both on command line and QMP.
Benoît Canet [Thu, 23 Jan 2014 20:31:33 +0000 (21:31 +0100)]
block: Allow the user to define "node-name" option both on command line and QMP.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: Add bs->node_name to hold the name of a bs node of the bs graph.
Benoît Canet [Thu, 23 Jan 2014 20:31:32 +0000 (21:31 +0100)]
block: Add bs->node_name to hold the name of a bs node of the bs graph.

Add the minimum of code to prepare for the following patches.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqapi: Add "backing" to BlockStats
Fam Zheng [Thu, 23 Jan 2014 02:03:26 +0000 (10:03 +0800)]
qapi: Add "backing" to BlockStats

Currently there is no way to query BlockStats of the backing chain. This
adds "backing" field into BlockStats to make it possible.

The comment of "parent" is reworded.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agovmdk: Fix format specific information (create type) for streamOptimized
Fam Zheng [Thu, 23 Jan 2014 07:10:52 +0000 (15:10 +0800)]
vmdk: Fix format specific information (create type) for streamOptimized

Previously the field was wrong:

    $ ./qemu-img create -f vmdk -o subformat=streamOptimized /tmp/a.vmdk 1G

    $ ./qemu-img info /tmp/a.vmdk
    image: /tmp/a.vmdk
    file format: vmdk
    virtual size: 1.0G (1073741824 bytes)
    disk size: 12K
    Format specific information:
        cid: 1390460459
        parent cid: 4294967295
>>>     create type: monolithicSparse
        <snip>

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agodrive mirror:fix memory leak
Zhang Min [Thu, 23 Jan 2014 07:59:16 +0000 (15:59 +0800)]
drive mirror:fix memory leak

mirror_iteration() calls qemu_iovec_init(), which allocates memory for
op->qiov.iov. When the write request calls back, mirror_iteration_done()
only frees op, not op->qiov.iov, which causes a memory leak.

It should use qemu_iovec_destroy() to free op->qiov.

Signed-off-by: Zhang Min <rudy.zhangmin@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agosheepdog: fix 'qemu-img map'
Liu Yuan [Tue, 21 Jan 2014 17:14:11 +0000 (01:14 +0800)]
sheepdog: fix 'qemu-img map'

It was muted in the previous commit 4bc74be9. Let's revive it since nothing
prevents us from doing it.

With this patch, the following command will work as it does for other formats:

$ qemu-img map sheepdog:image

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoDocumentation: qemu-img: Mention SIGUSR1 progress report
Kevin Wolf [Mon, 20 Jan 2014 14:12:16 +0000 (15:12 +0100)]
Documentation: qemu-img: Mention SIGUSR1 progress report

Document the SIGUSR1 behaviour of qemu-img. Also add compare to the
list of subcommands that support -p.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoqemu-progress: Fix progress printing on SIGUSR1
Kevin Wolf [Mon, 20 Jan 2014 14:06:03 +0000 (15:06 +0100)]
qemu-progress: Fix progress printing on SIGUSR1

Since commit a7aae221 ('Switch SIG_IPI to SIGUSR1'), SIGUSR1 is blocked
during startup, breaking the progress report in tools.

This patch reenables the signal when initialising a progress report.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agoqemu-progress: Drop unused include
Kevin Wolf [Mon, 20 Jan 2014 14:05:25 +0000 (15:05 +0100)]
qemu-progress: Drop unused include

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
10 years agovmdk: Check for overhead when opening
Fam Zheng [Tue, 21 Jan 2014 07:07:43 +0000 (15:07 +0800)]
vmdk: Check for overhead when opening

Report an error if file size is even smaller than metadata.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqcow2: fix wrong value of L1E_OFFSET_MASK, L2E_OFFSET_MASK and REFT_OFFSET_MASK
Hu Tao [Tue, 21 Jan 2014 03:30:02 +0000 (11:30 +0800)]
qcow2: fix wrong value of L1E_OFFSET_MASK, L2E_OFFSET_MASK and REFT_OFFSET_MASK

According to the qcow spec, the offset fields in l1e, l2e and ref table entry
start at bit 9. The offset is cluster offset, and the smallest possible
cluster size is 512 bytes.
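
For illustration, a mask whose lowest set bit is bit 9 can be built as
below (Python sketch). The upper bounds used in the usage note, bit 55
for L1/L2 entries and bit 63 for refcount table entries, are this
example's reading of the qcow2 spec and are not stated in this commit
message:

```python
def bit_range_mask(lo, hi):
    """Return a mask with bits lo..hi (inclusive) set."""
    return ((1 << (hi + 1)) - 1) & ~((1 << lo) - 1)
```

Under those assumptions, bit_range_mask(9, 55) yields
0x00fffffffffffe00 and bit_range_mask(9, 63) yields 0xfffffffffffffe00,
so an offset of 512 survives masking while the low 9 bits are cleared.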

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agodataplane: fix shadowed return value
Stefan Hajnoczi [Mon, 13 Jan 2014 10:47:39 +0000 (18:47 +0800)]
dataplane: fix shadowed return value

Propagate the error return value from get_indirect().  This bug was
introduced in commit 4d684832 ("vring: create a common function to parse
descriptors").

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: fix backing file segfault
Peter Feiner [Wed, 8 Jan 2014 19:43:25 +0000 (19:43 +0000)]
block: fix backing file segfault

When a backing file is opened such that (1) a protocol is directly
used as the block driver and (2) the block driver has bdrv_file_open,
bdrv_open_backing_file segfaults. The problem arises because
bdrv_open_common returns without setting bd->backing_hd->file.

To effect (1), you seem to have to use the -F flag in qemu-img. There
are several block drivers that satisfy (2), such as "file" and "nbd".
Here are some concrete examples:

    #!/bin/bash

    echo Test file format
    ./qemu-img create -f file base.file 1m
    ./qemu-img create -f qcow2 -F file -o backing_file=base.file\
        file-overlay.qcow2
    ./qemu-img convert -O raw file-overlay.qcow2 file-convert.raw

    echo Test nbd format
    SOCK=$PWD/nbd.sock
    ./qemu-img create -f raw base.raw 1m
    ./qemu-nbd -t -k $SOCK base.raw &
    trap "kill $!" EXIT
    while ! test -e $SOCK; do sleep 1; done
    ./qemu-img create -f qcow2 -F nbd -o backing_file=nbd:unix:$SOCK\
        nbd-overlay.qcow2
    ./qemu-img convert -O raw nbd-overlay.qcow2 nbd-convert.raw

Without this patch, the two qemu-img convert commands segfault.

This is a regression that was introduced in v1.7 by
dbecebddfa4932d1c83915bcb9b5ba5984eb91be.

Signed-off-by: Peter Feiner <peter@gridcentric.ca>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoiotests: Test file format nesting
Max Reitz [Fri, 20 Dec 2013 18:28:24 +0000 (19:28 +0100)]
iotests: Test file format nesting

Add a test for nested image formats.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoiotests: Test new blkdebug/blkverify interface
Max Reitz [Fri, 20 Dec 2013 18:28:23 +0000 (19:28 +0100)]
iotests: Test new blkdebug/blkverify interface

Add a test for the new blkdebug/blkverify interface.

This test is not written in Python, although it uses QMP. This is
because it invokes the qemu-io HMP command, which outputs errors to
stderr instead of returning them through QMP. Filtering and testing that
output is easier in a shell script than with the Python infrastructure.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agotests: Add test for qdict_flatten()
Max Reitz [Fri, 20 Dec 2013 18:28:22 +0000 (19:28 +0100)]
tests: Add test for qdict_flatten()

Add a test case for qdict_flatten() in tests/check-qdict.c. This test
case covers the flattening of subordinate QLists as well.
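
The kind of transformation being tested can be sketched in Python as
follows. This is a simplified model using plain dicts and lists rather
than QDict/QList; it silently drops empty containers, which may differ
from what qdict_flatten() does:

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into one dict with dotted keys.

    Dict members become "parent.child"; list elements become
    "parent.<index>".
    """
    out = {}
    items = obj.items() if isinstance(obj, dict) else enumerate(obj)
    for key, value in items:
        name = f"{prefix}{key}"
        if isinstance(value, (dict, list)):
            out.update(flatten(value, name + "."))
        else:
            out[name] = value
    return out
```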

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agotests: Add test for qdict_array_split()
Max Reitz [Fri, 20 Dec 2013 18:28:21 +0000 (19:28 +0100)]
tests: Add test for qdict_array_split()

Add a test case for qdict_array_split() in tests/check-qdict.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqemu-io: Make filename optional
Max Reitz [Fri, 20 Dec 2013 18:28:20 +0000 (19:28 +0100)]
qemu-io: Make filename optional

Giving a filename is actually not essential, since it can be specified
through the options as well - on the contrary: Sometimes a filename must
not be given.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqapi: QMP interface for blkdebug and blkverify
Max Reitz [Fri, 20 Dec 2013 18:28:19 +0000 (19:28 +0100)]
qapi: QMP interface for blkdebug and blkverify

Add structures to support blkdebug and blkverify in blockdev-add.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoqapi: Add "errno" to the list of polluted words
Max Reitz [Fri, 20 Dec 2013 18:28:18 +0000 (19:28 +0100)]
qapi: Add "errno" to the list of polluted words

Using "errno" directly as an identifier results in various syntax
errors; therefore it should be added to the list of polluted words.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblkverify: Don't require protocol filename
Max Reitz [Fri, 20 Dec 2013 18:28:17 +0000 (19:28 +0100)]
blkverify: Don't require protocol filename

If the filename is not prefixed by "blkverify:" in
blkverify_parse_filename(), the blkverify driver was not selected
through that protocol prefix, but by an explicit command line (or QMP)
option (like driver=blkverify).

If blkverify_parse_filename() has been called, a filename has been
given. If it is not prefixed, it is probably really just a plain
filename. This is no problem, since we can use it as the test image
filename and rely on the user to specify the raw image filename through
the new corresponding option.
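
The decision described above can be modelled roughly like this (Python
sketch; the returned dict keys mirror the "raw" and "test" options but
are otherwise illustrative, and the prefixed form is assumed to be
blkverify:<raw>:<test>):

```python
def parse_blkverify_filename(filename):
    """Split a blkverify filename into raw/test image names.

    An unprefixed filename is treated as the test image only; the raw
    image is then expected to come from a separate option.
    """
    prefix = "blkverify:"
    if not filename.startswith(prefix):
        return {"test": filename}
    raw, sep, test = filename[len(prefix):].partition(":")
    if not sep:
        raise ValueError("expected blkverify:<raw>:<test>")
    return {"raw": raw, "test": test}
```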

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblkverify: Allow command-line configuration
Max Reitz [Fri, 20 Dec 2013 18:28:16 +0000 (19:28 +0100)]
blkverify: Allow command-line configuration

Introduce the "test" and "raw" options for specifying images.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblkdebug: Allow command-line file configuration
Max Reitz [Fri, 20 Dec 2013 18:28:15 +0000 (19:28 +0100)]
blkdebug: Allow command-line file configuration

Introduce the "image" option as an alternative to specifying the image
through the filename.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblockdev: Move "file" to legacy_opts
Max Reitz [Fri, 20 Dec 2013 18:28:14 +0000 (19:28 +0100)]
blockdev: Move "file" to legacy_opts

Specifying the image filename through the "file" option is a legacy
option and should not be supported by blockdev-add (in that case, giving
a string for "file" references an existing block device).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: Allow recursive "file"s
Max Reitz [Fri, 20 Dec 2013 18:28:13 +0000 (19:28 +0100)]
block: Allow recursive "file"s

It should be possible to use a format as a driver for a file which in
turn requires another file, i.e., nesting file formats.

Allowing nested file formats results in e.g. qcow2 BlockDriverStates
never being directly passed to bdrv_open_common() from bdrv_file_open(),
but instead being handed through bdrv_open(). This changes the error
message when trying to give a filename to qcow2, i.e. trying to use it
as a driver for the protocol level. Therefore, change the reference
output of I/O test 051 accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10 years agoblock: Use bdrv_open_image() in bdrv_open()
Max Reitz [Fri, 20 Dec 2013 18:28:12 +0000 (19:28 +0100)]
block: Use bdrv_open_image() in bdrv_open()

Using bdrv_open_image() instead of bdrv_file_open() directly in
bdrv_open() is easier.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>