review.tizen.org Git - sdk/emulator/qemu.git/log

docker: Add clang test

The (currently partially commented out) configure options are suggested
by John Snow <jsnow@redhat.com>.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-11-git-send-email-famz@redhat.com

docker: Add full test

This builds all available targets.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-10-git-send-email-famz@redhat.com

docker: Add quick test

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-9-git-send-email-famz@redhat.com

docker: Add common.rc

"requires" checks the "FEATURE" environment for specified prerequisits,
and skip the execution of test if not found.

"build_qemu" is the central routine to compile QEMU for tests to call.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-8-git-send-email-famz@redhat.com

docker: Add test runner

It's better to have a launcher for all tests, to make it easier to
initialize and manage the environment.

If "DEBUG=1" a shell prompt will show up before the test runs.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-7-git-send-email-famz@redhat.com

docker: Add images

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-6-git-send-email-famz@redhat.com

Makefile: Rules for docker testing

This adds a group of make targets to run docker tests, all are available
in source tree without running ./configure.

The usage is shown with "make docker".

Besides the fixed ones, dynamic targets for building each image and
running each test in each image are generated automatically by make,
scanning $(SRC_PATH)/tests/docker/ files with specific patterns.

Alternative to manually list particular targets (docker-TEST@IMAGE)
set, you can control which tests/images to run by filtering variables,
TESTS= and IMAGES=, which are expressed in Makefile pattern syntax,
"foo% %bar ...". For example:

$ make docker-test IMAGES="ubuntu fedora"

Unfortunately, it's impossible to propagate "-j $JOBS" into make in
containers, however since each combination is made a first class target
in the top Makefile, "make -j$N docker-test" still parallels the tests
coarsely.

Still, $J is made a magic variable to let all make invocations in
containers to use -j$J.

Instead of providing a live version of the source tree to the docker
container we snapshot it with git-archive. This ensures the tree is in a
pristine state for whatever operations the container is going to run on
them.

Uncommitted changes known to files known by the git index will be
included in the snapshot if there are any.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-5-git-send-email-famz@redhat.com

Makefile: Always include rules.mak

When config-host.mak is not found it is safe to assume SRC_PATH is ".".
So, it is okay to move inclusion of ruls.mak out of the ifeq condition.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-4-git-send-email-famz@redhat.com

rules.mak: Add "COMMA" constant

Using "," literal in $(call quiet-command, ...) arguments is awkward.
Add this constant to make it at least doable.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-3-git-send-email-famz@redhat.com

tests: Add utilities for docker testing

docker.py is added with a number of useful subcommands to manager docker
images and instances for QEMU docker testing. Subcommands are:

run: A wrapper of "docker run" (or "sudo -n docker run" if necessary),
which takes care of killing and removing the running container at
SIGINT.

clean: Tear down all the containers including inactive ones that are
started by docker_run.

build: Compare an image from given dockerfile and rebuild it if they're
different.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-2-git-send-email-famz@redhat.com

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160531' into staging

ppc patch queue for 2016-05-31

Here's another ppc patch queue.  This batch is all preliminaries
towards two significant features:

1) Full hypervisor-mode support for POWER8
    Patches 1-8 start fixing various bugs with TCG's handling of
    hypervisor mode

2) CPU hotplug support
    Patches 9-12 make some preliminary fixes towards implementing CPU
    hotplug on ppc64 (and other non-x86 platforms).  These patches are
    actually to generic code, not ppc, but are included here with
    Paolo's ACK.

# gpg: Signature made Tue 31 May 2016 01:39:44 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160531:
  cpu: Add a sync version of cpu_remove()
  cpu: Reclaim vCPU objects
  exec: Do vmstate unregistration from cpu_exec_exit()
  exec: Remove cpu from cpus list during cpu_exec_exit()
  ppc: Add PPC_64H instruction flag to POWER7 and POWER8
  ppc: Get out of emulation on SMT "OR" ops
  ppc: Fix sign extension issue in mtmsr(d) emulation
  ppc: Change 'invalid' bit mask of tlbiel and tlbie
  ppc: tlbie, tlbia and tlbisync are HV only
  ppc: Do some batching of TCG tlb flushes
  ppc: Use split I/D mmu modes to avoid flushes on interrupts
  ppc: Remove MMU_MODEn_SUFFIX definitions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* docs/atomics fixes and atomic_rcu_* optimization (Emilio)
* NBD bugfix (Eric)
* Memory fixes and cleanups (Paolo, Paul)
* scsi-block support for SCSI status, including persistent
  reservations (Paolo)
* kvm_stat moves to the Linux repository
* SCSI bug fixes (Peter, Prasad)
* Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)

# gpg: Signature made Sun 29 May 2016 08:11:20 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (30 commits)
  exec: hide mr->ram_addr from qemu_get_ram_ptr users
  memory: split memory_region_from_host from qemu_ram_addr_from_host
  exec: remove ram_addr argument from qemu_ram_block_from_host
  memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
  scsi-generic: Merge block max xfer len in INQUIRY response
  scsi-block: always use SG_IO
  scsi-disk: introduce scsi_disk_req_check_error
  scsi-disk: add need_fua_emulation to SCSIDiskClass
  scsi-disk: introduce dma_readv and dma_writev
  scsi-disk: introduce a common base class
  xen-hvm: ignore background I/O sections
  docs/atomics: update comparison with Linux
  atomics: do not emit consume barrier for atomic_rcu_read
  atomics: emit an smp_read_barrier_depends() barrier only for Alpha and Thread Sanitizer
  docs/atomics: update atomic_read/set comparison with Linux
  bt: rewrite csrhci_write to avoid out-of-bounds writes
  block/iscsi: avoid potential overflow of acb->task->cdb
  scsi: megasas: check 'read_queue_head' index value
  scsi: megasas: initialise local configuration data buffer
  scsi: megasas: use appropriate property buffer size
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

cpu: Add a sync version of cpu_remove()

This sync API will be used by the CPU hotplug code to wait for the CPU to
completely get removed before flagging the failure to the device_add
command.

Sync version of this call is needed to correctly recover from CPU
realization failures when ->plug() handler fails.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

cpu: Reclaim vCPU objects

In order to deal well with the kvm vcpus (which can not be removed without any
protection), we do not close KVM vcpu fd, just record and mark it as stopped
into a list, so that we can reuse it for the appending cpu hot-add request if
possible. It is also the approach that kvm guys suggested:
https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
               [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
                  isn't needed as it is done from cpu_exec_exit()
                - Use iothread mutex instead of global mutex during
                  destroy
                - Don't cleanup vCPU object from vCPU thread context
                  but leave it to the callers (device_add/device_del)]
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

exec: Do vmstate unregistration from cpu_exec_exit()

cpu_exec_init() does vmstate_register for the CPU device. This needs to be
undone from cpu_exec_exit(). This change is needed to support CPU hot
removal.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[dwg: added missing include to fix compile on some archs]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

exec: Remove cpu from cpus list during cpu_exec_exit()

CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from cpu_exec_exit().

cpu_exec_exit() is called from generic CPU::instance_finalize and some
archs like PowerPC call it from CPU unrealizefn. So ensure that we
dequeue the cpu only once.

Now -1 value for cpu->cpu_index indicates that we have already dequeued
the cpu for CONFIG_USER_ONLY case also.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: Add PPC_64H instruction flag to POWER7 and POWER8

This will enable decoding of hrfid

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: Get out of emulation on SMT "OR" ops

Otherwise tight loops at smt_low for example, which OPAL does,
eat so much CPU that we can't boot a kernel anymore. With that,
I can boot 8 CPUs just fine with powernv.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: Fix sign extension issue in mtmsr(d) emulation

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: Change 'invalid' bit mask of tlbiel and tlbie

Otherwise it will trip on the forms used in recent architecture.

Ideally, we should have different handlers for different architecture
levels but our current implementation of TLB flushing is dumb enough
that this will do for now.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: tlbie, tlbia and tlbisync are HV only

Not that anything remotely recent supports tlbia but ...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: Do some batching of TCG tlb flushes

On ppc64 especially, we flush the tlb on any slbie or tlbie instruction.

However, those instructions often come in bursts of 3 or more (context
switch will favor a series of slbie's for example to an slbia if the
SLB has less than a certain number of entries in it, and tlbie's can
happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries
at a time.

Doing a tlb_flush() each time is a waste of time. We end up doing a memset
of the whole TLB, reloading it for the next instruction, memset'ing again,
etc...

Those instructions don't have to take effect immediately. For slbie, they
can wait for the next context synchronizing event. For tlbie, the next
tlbsync.

This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.

This reduces the number of tlb_flush() on a boot to a ubuntu installer
first dialog screen from roughly 360K down to 36K.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: added a 'CPUPPCState *' variable in h_remove() and
h_bulk_remove() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: removed spurious whitespace change, use 0/1 not true/false
consistently, since tlb_need_flush has int type]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: Use split I/D mmu modes to avoid flushes on interrupts

We rework the way the MMU indices are calculated, providing separate
indices for I and D side based on MSR:IR and MSR:DR respectively,
and thus no longer need to flush the TLB on context changes. This also
adds correct support for HV as a separate address space.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc: Remove MMU_MODEn_SUFFIX definitions

We don't use the resulting accessors and this gets in the way of
the split I/D TLB work.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

exec: hide mr->ram_addr from qemu_get_ram_ptr users

Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an
address that is relative to the MemoryRegion. This basically means
what address_space_translate returns.

Because the semantics of the second parameter change, rename the
function to qemu_map_ram_ptr.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

memory: split memory_region_from_host from qemu_ram_addr_from_host

Move the old qemu_ram_addr_from_host to memory_region_from_host and
make it return an offset within the region. For qemu_ram_addr_from_host
return the ram_addr_t directly, similar to what it was before
commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host",
2013-07-04).

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

exec: remove ram_addr argument from qemu_ram_block_from_host

Of the two callers, one does not use it, and the other can compute
it itself based on the other output argument (offset) and the RAMBlock.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr

Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd
now that a MemoryRegion knows its RAMBlock directly.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-generic: Merge block max xfer len in INQUIRY response

The rationale is similar to the above mode sense response interception:
this is practically the only channel to communicate restraints from
elsewhere such as host and block driver.

The scsi bus we attach onto can have a larger max xfer len than what is
accepted by the host file system (guarding between the host scsi LUN and
QEMU), in which case the SG_IO we generate would get -EINVAL.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1464243305-10661-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-block: always use SG_IO

Using pread/pwrite or io_submit has the advantage of eliminating the
bounce buffer, but drops the SCSI status. This keeps the guest from
seeing unit attention codes, as well as statuses such as RESERVATION
CONFLICT. Because we know scsi-block operates on an SBC device we can
still use the DMA helpers with SG_IO; just remember to patch the CDBs
if the transfer is split into multiple segments.

This means that scsi-block will always use the thread-pool unfortunately,
instead of respecting aio=native.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-disk: introduce scsi_disk_req_check_error

Commonize all the checks for canceled requests and errors. The next patch
will add another case to check for, in order to handle passthrough commands.

There is no semantic change here; the only nontrivial modification is in
scsi_write_do_fua, where cancellation has been checked earlier by both
callers. Thus, the check is replaced with an assertion.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-disk: add need_fua_emulation to SCSIDiskClass

scsi-block will be able to do FUA just by passing the request through
to the LUN (which is also more efficient); there is no need to emulate
it like we do for scsi-disk.

Add a new method to distinguish this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-disk: introduce dma_readv and dma_writev

These are replacements for blk_aio_readv and blk_aio_writev that allow
customization of the data path. They reuse the DMA helpers' DMAIOFunc
callback type, so that the same function can be used in either the
QEMUSGList or the bounce-buffered case.

This customization will be needed in the next patch to do zero-copy
SG_IO on scsi-block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-disk: introduce a common base class

This will be the place to add DMAIOFuncs in the next patch. There
are also a couple DeviceClass members that can be moved to the
abstract class's initialization function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

xen-hvm: ignore background I/O sections

Since Xen will correctly handle accesses to unimplemented I/O ports (by
returning all 1's for reads and ignoring writes) there is no need for
QEMU to register backgroud I/O sections.

This patch therefore adds checks to xen_io_add/del so that sections with
memory-region ops pointing at 'unassigned_io_ops' are ignored.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1462811480-16295-1-git-send-email-paul.durrant@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs/atomics: update comparison with Linux

Over time, some differences between QEMU and Linux atomics are getting
smoothed.  In particular, Linux grew atomic_fetch_or (and in general
the differences regarding RMW operations were not described accurately)
and smp_load_acquire/smp_store_release.  Also, set_mb was renamed to
smp_store_mb().  Include these changes in the documentation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

atomics: do not emit consume barrier for atomic_rcu_read

Currently we emit a consume-load in atomic_rcu_read. Because of
limitations in current compilers, this is overkill for non-Alpha hosts
and it is only useful to make Thread Sanitizer work.

This patch leaves the consume-load in atomic_rcu_read when
compiling with Thread Sanitizer enabled, and resorts to a
relaxed load + smp_read_barrier_depends otherwise.

On an RMO host architecture, such as aarch64, the performance
improvement of this change is easily measurable. For instance,
qht-bench performs an atomic_rcu_read on every lookup. Performance
before and after applying this patch:

$ tests/qht-bench -d 5 -n 1
Before: 9.78 MT/s
After: 10.96 MT/s

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-4-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

atomics: emit an smp_read_barrier_depends() barrier only for Alpha and Thread Sanitizer

For correctness, smp_read_barrier_depends() is only required to
emit a barrier on Alpha hosts. However, we are currently emitting
a consume fence unconditionally, and most compilers currently treat
consume and acquire fences as equivalent.

Fix it by keeping the consume fence if we're compiling with Thread
Sanitizer, since this might help prevent false warnings. Otherwise,
only emit the barrier for Alpha hosts. Note that we still guarantee
that smp_read_barrier_depends() is a compiler barrier.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-3-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs/atomics: update atomic_read/set comparison with Linux

Recently Linux did a mass conversion of its atomic_read/set calls
so that they at least are READ/WRITE_ONCE. See Linux's commit
62e8a325 ("atomic, arch: Audit atomic_{read,set}()"). It seems though
that their documentation hasn't been updated to reflect this.

The appended updates our documentation to reflect the change, which
means there is effectively no difference between our atomic_read/set
and the current Linux implementation.

While at it, fix the statement that a barrier is implied by
atomic_read/set, which is incorrect. Volatile/atomic semantics prevent
transformations pertaining the variable they apply to; this, however,
has no effect on surrounding statements like barriers do. For more
details on this, see:
https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-2-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

bt: rewrite csrhci_write to avoid out-of-bounds writes

The usage of INT_MAX in this function confuses Coverity. I think
the defect is bogus, however there is no protection against
getting more than sizeof(s->inpkt) bytes from the character device
backend.

Rewrite the function to only fill in as much data as needed from
buf into s->inpkt. The plen variable is replaced by a simple
state machine and there is no need anymore to shift contents to
the beginning of s->inpkt.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

block/iscsi: avoid potential overflow of acb->task->cdb

at least in the path via virtio-blk the maximum size is not
restricted.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1464080368-29584-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi: megasas: check 'read_queue_head' index value

While doing MegaRAID SAS controller command frame lookup, routine
'megasas_lookup_frame' uses 'read_queue_head' value as an index
into 'frames[MEGASAS_MAX_FRAMES=2048]' array. Limit its value
within array bounds to avoid any OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464179110-18593-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi: megasas: initialise local configuration data buffer

When reading MegaRAID SAS controller configuration via MegaRAID
Firmware Interface(MFI) commands, routine megasas_dcmd_cfg_read
uses an uninitialised local data buffer. Initialise this buffer
to avoid stack information leakage.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464178304-12831-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi: megasas: use appropriate property buffer size

When setting MegaRAID SAS controller properties via MegaRAID
Firmware Interface(MFI) commands, a user supplied size parameter
is used to set property value. Use appropriate size value to avoid
OOB access issues.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464172291-2856-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi: mptsas: infinite loop while fetching requests

The LSI SAS1068 Host Bus Adapter emulator in Qemu, periodically
looks for requests and fetches them. A loop doing that in
mptsas_fetch_requests() could run infinitely if 's->state' was
not operational. Move check to avoid such a loop.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Message-Id: <1464077264-25473-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)

Vmware Paravirtual SCSI emulation uses command descriptors to
process SCSI commands. These descriptors come with their ring
buffers. A guest could set the ring buffer size to an arbitrary
value leading to OOB access issue. Add check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Message-Id: <1464000485-27041-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm_stat: Remove

The source has moved to the Linux kernel tree.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

nbd: Don't trim unrequested bytes

Similar to commit df7b97ff, we are mishandling clients that
give an unaligned NBD_CMD_TRIM request, and potentially
trimming bytes that occur before their request; which in turn
can cause potential unintended data loss (unlikely in
practice, since most clients are sane and issue aligned trim
requests). However, while we fixed read and write by switching
to the byte interfaces of blk_, we don't yet have a byte
interface for discard. On the other hand, trim is advisory, so
rounding the user's request to simply ignore the first and last
unaligned sectors (or the entire request, if it is sub-sector
in length) is just fine.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464173965-9694-1-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/char: QOM'ify milkymist-uart.c

drop the qemu_char_get_next_serial and use chardev prop instead

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-6-git-send-email-zxq_yx_007@163.com>
Tested-by: Michael Walle <michael@walle.cc>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/char: QOM'ify lm32_uart.c

* Drop the old SysBus init function and use instance_init
* Call qemu_chr_add_handlers in the realize callback
* Use qdev chardev prop instead of qemu_char_get_next_serial
* Add lm32_uart_create function to create lm32 uart device

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-5-git-send-email-zxq_yx_007@163.com>
Tested-by: Michael Walle <michael@walle.cc>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/char: QOM'ify lm32_juart.c

* Drop the old SysBus init function
* Call qemu_chr_add_handlers in the realize callback
* Use qdev chardev prop instead of qemu_char_get_next_serial

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-4-git-send-email-zxq_yx_007@163.com>
Tested-by: Michael Walle <michael@walle.cc>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/char: QOM'ify etraxfs_ser.c

* Drop the old SysBus init function and use instance_init
* Call qemu_chr_add_handlers in the realize callback
* Use qdev chardev prop instead of qemu_char_get_next_serial
* Add etraxfs_ser_create function to create etraxfs serial device

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-3-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/char: QOM'ify escc.c

* Drop the old SysBus init function and use instance_init
* Call qemu_chr_add_handlers in the realize callback

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-2-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Revert "memory: Drop FlatRange.romd_mode"

This reverts commit 5b5660adf1fdb61db14ec681b10463b8cba633f1,
as it breaks the UEFI guest firmware (known as ArmVirtPkg or AAVMF)
running in the "virt" machine type of "qemu-system-aarch64":

Contrary to the commit message, (a->mr == b->mr) does *not* imply
that (a->romd_mode == b->romd_mode): the pflash device model calls
memory_region_rom_device_set_romd() -- for switching between the above
modes --, and that function changes mr->romd_mode but the current
AddressSpaceDispatch's FlatRange keeps the old value. Therefore
region_del/region_add are not called on the KVM MemoryListener.

Reported-by: Drew Jones <drjones@redhat.com>
Tested-by: Drew Jones <drjones@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160527' into staging

linux-user pull request v2 for may 2016

# gpg: Signature made Fri 27 May 2016 12:51:10 BST using RSA key ID DE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20160527: (38 commits)
  linux-user,target-ppc: fix use of MSR_LE
  linux-user/signal.c: Use s390 target space address instead of host space
  linux-user/signal.c: Use target address instead of host address for microblaze restorer
  linux-user/signal.c: Generate opcode data for restorer in setup_rt_frame
  linux-user: arm: Remove ARM_cpsr and similar #defines
  linux-user: Use direct syscalls for setuid(), etc
  linux-user: x86_64: Don't use 16-bit UIDs
  linux-user: Use g_try_malloc() in do_msgrcv()
  linux-user: Handle msgrcv error case correctly
  linux-user: Handle negative values in timespec conversion
  linux-user: Use safe_syscall for futex syscall
  linux-user: Use safe_syscall for pselect, select syscalls
  linux-user: Use safe_syscall for execve syscall
  linux-user: Use safe_syscall for wait system calls
  linux-user: Use safe_syscall for open and openat system calls
  linux-user: Use safe_syscall for read and write system calls
  linux-user: Provide safe_syscall for fixing races between signals and syscalls
  linux-user: Add debug code to exercise restarting system calls
  linux-user: Support for restarting system calls for Microblaze targets
  linux-user: Set r14 on exit from microblaze syscall
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

linux-user,target-ppc: fix use of MSR_LE

setup_frame()/setup_rt_frame()/restore_user_regs() are using
MSR_LE as the similar kernel functions do: as a bitmask.

But in QEMU, MSR_LE is a bit position, so change this
accordingly.

The previous code was doing nothing as MSR_LE is 0,
and "env->msr &= ~MSR_LE" doesn't change the value of msr.

And yes, a user process can change its endianness,
see linux kernel commit:

fab5db9 [PATCH] powerpc: Implement support for setting little-endian mode via prctl

and prctl(2): PR_SET_ENDIAN, PR_GET_ENDIAN

Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user/signal.c: Use s390 target space address instead of host space

The return address is in target space, so the restorer address needs to
be target space, too.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>

linux-user/signal.c: Use target address instead of host address for microblaze restorer

The return address is in target space, so the restorer address needs to
be target space, too.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user/signal.c: Generate opcode data for restorer in setup_rt_frame

Original implementation uses do_rt_sigreturn directly in host space,
when a guest program is in unwind procedure in guest space, it will get
an incorrect restore address, then causes unwind failure.

Also cleanup the original incorrect indentation.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: arm: Remove ARM_cpsr and similar #defines

The #defines of ARM_cpsr and friends in linux-user/arm/target-syscall.h
can clash with versions in the system headers if building on an
ARM or AArch64 build (though this seems to be dependent on the version
of the system headers). The QEMU defines are not very useful (it's
not clear that they're intended for use with the target_pt_regs struct
rather than (say) the CPUARMState structure) and we only use them in one
function in elfload.c anyway. So just remove the #defines and directly
access regs->uregs[].

Reported-by: Christopher Covington <cov@codeaurora.org>
Tested-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use direct syscalls for setuid(), etc

On Linux the setuid(), setgid(), etc system calls have different semantics
from the libc functions. The libc functions follow POSIX and update the
credentials for all threads in the process; the system calls update only
the thread which makes the call. (This impedance mismatch is worked around
in libc by signalling all threads to tell them to do a syscall, in a
byzantine and fragile way; see http://ewontfix.com/17/.)

Since in linux-user we are trying to emulate the system call semantics,
we must implement all these syscalls to directly call the underlying
host syscall, rather than calling the host libc function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: x86_64: Don't use 16-bit UIDs

The 64-bit x86 syscall ABI uses 32-bit UIDs; only define
USE_UID16 for 32-bit x86.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use g_try_malloc() in do_msgrcv()

In do_msgrcv() we want to allocate a message buffer, whose size
is passed to us by the guest. That means we could legitimately
fail, so use g_try_malloc() and handle the error case, in the same
way that do_msgsnd() does.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Handle msgrcv error case correctly

The msgrcv ABI is a bit odd -- the msgsz argument is a size_t, which is
unsigned, but it must fail EINVAL if the value is negative when cast
to a long. We were incorrectly passing the value through an
"unsigned int", which meant that if the guest was 32-bit longs and
the host was 64-bit longs an input of 0xffffffff (which should trigger
EINVAL) would simply be passed to the host msgrcv() as 0xffffffff,
where it does not cause the host kernel to reject it.
Follow the same approach as do_msgsnd() in using a ssize_t and
doing the check for negative values by hand, so we correctly fail
in this corner case.

This fixes the msgrcv03 Linux Test Project test case, which otherwise
hangs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Handle negative values in timespec conversion

In a struct timespec, both fields are signed longs. Converting
them from guest to host with code like
host_ts->tv_sec = tswapal(target_ts->tv_sec);
mishandles negative values if the guest has 32-bit longs and
the host has 64-bit longs because tswapal()'s return type is
abi_ulong: the assignment will zero-extend into the host long
type rather than sign-extending it.

Make the conversion routines use __get_user() and __set_user()
instead: this automatically picks up the signedness of the
field type and does the correct kind of sign or zero extension.
It also handles the possibility that the target struct is not
sufficiently aligned for the host's requirements.

In particular, this fixes a hang when running the Linux Test Project
mq_timedsend01 and mq_timedreceive01 tests: one of the test cases
sets the timeout to -1 and expects an EINVAL failure, but we were
setting a very long timeout instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use safe_syscall for futex syscall

Use the safe_syscall wrapper for the futex syscall.

In particular, this fixes hangs when using programs that link
against the Boehm garbage collector, including the Mono runtime.

(We don't change the sys_futex() call in the implementation of
the exit syscall, because as the FIXME comment there notes
that should be handled by disabling signals, since we can't
easily back out if the futex were to return ERESTARTSYS.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use safe_syscall for pselect, select syscalls

Use the safe_syscall wrapper for the pselect and select syscalls.
Since not every architecture has the select syscall, we now
have to implement select in terms of pselect, which means doing
timeval<->timespec conversion.

(Five years on from the initial patch that added pselect support
to QEMU and a decade after pselect6 went into the kernel, it seems
safe to not try to support hosts with header files which don't
define __NR_pselect6.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use safe_syscall for execve syscall

Wrap execve() in the safe-syscall handling. Although execve() is not
an interruptible syscall, it is a special case: if we allow a signal
to happen before we make the host$ syscall then we will 'lose' it,
because at the point of execve the process leaves QEMU's control. So
we use the safe syscall wrapper to ensure that we either take the
signal as a guest signal, or else it does not happen before the
execve completes and makes it the other program's problem.

The practical upshot is that without this SIGTERM could fail to
terminate the process.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-25-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: expanded commit message to explain in more detail why this is
needed, and add comment about it too]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use safe_syscall for wait system calls

Use safe_syscall for waitpid, waitid and wait4 syscalls. Note that this
change allows us to implement support for waitid's fifth (rusage) argument
in future; for the moment we ignore it as we have done up til now.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-18-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Adjust to new safe_syscall convention. Add fifth waitid syscall argument
(which isn't present in the libc interface but is in the syscall ABI)]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use safe_syscall for open and openat system calls

Restart open() and openat() if signals occur before,
or during with SA_RESTART.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-17-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Adjusted to follow new -1-and-set-errno safe_syscall convention]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Use safe_syscall for read and write system calls

Restart read() and write() if signals occur before, or during with SA_RESTART

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-15-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Update to new safe_syscall() convention of setting errno]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Provide safe_syscall for fixing races between signals and syscalls

If a signal is delivered immediately before a blocking system call the
handler will only be called after the system call returns, which may be a
long time later or never.

This is fixed by using a function (safe_syscall) that checks if a guest
signal is pending prior to making a system call, and if so does not call the
system call and returns -TARGET_ERESTARTSYS. If a signal is received between
the check and the system call host_signal_handler() rewinds execution to
before the check. This rewinding has the effect of closing the race window
so that safe_syscall will reliably either (a) go into the host syscall
with no unprocessed guest signals pending or or (b) return
-TARGET_ERESTARTSYS so that the caller can deal with the signals.
Implementing this requires a per-host-architecture assembly language
fragment.

This will also resolve the mishandling of the SA_RESTART flag where
we would restart a host system call and not call the guest signal handler
until the syscall finally completed -- syscall restarting now always
happens at the guest syscall level so the guest signal handler will run.
(The host syscall will never be restarted because if the host kernel
rewinds the PC to point at the syscall insn for a restart then our
host_signal_handler() will see this and arrange the guest PC rewind.)

This commit contains the infrastructure for implementing safe_syscall
and the assembly language fragment for x86-64, but does not change any
syscalls to use it.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-14-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM:
* Avoid having an architecture if-ladder in configure by putting
   linux-user/host/$(ARCH) on the include path and including
   safe-syscall.inc.S from it
* Avoid ifdef ladder in signal.c by creating new hostdep.h to hold
   host-architecture-specific things
* Added copyright/license header to safe-syscall.inc.S
* Rewrote commit message
* Added comments to safe-syscall.inc.S
* Changed calling convention of safe_syscall() to match syscall()
   (returns -1 and host error in errno on failure)
* Added a long comment in qemu.h about how to use safe_syscall()
   to implement guest syscalls.
]
RV: squashed Peters "fixup! linux-user: compile on non-x86-64 hosts"
patch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

linux-user: Add debug code to exercise restarting system calls

If DEBUG_ERESTARTSYS is set restart all system calls once. This
is pure debug code for exercising the syscall restart code paths
in the per-architecture cpu main loops.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-10-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Add comment and a commented-out #define next to the commented-out
generic DEBUG #define; remove the check on TARGET_USE_ERESTARTSYS;
tweak comment message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for Microblaze targets

Update the Microblaze main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code on sigreturn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Note that this in passing fixes a bug where we were corrupting
the guest r[3] on sigreturn with the guest's r[10] because
do_sigreturn() was returning env->regs[10] but the register for
syscall return values is env->regs[3].

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-11-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define;
drop whitespace changes]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Set r14 on exit from microblaze syscall

All syscall exits on microblaze result in r14 being equal to the
PC we return to, because the kernel syscall exit instruction "rtbd"
does this. (This is true even for sigreturn(); note that r14 is
not a userspace-usable register as the kernel may clobber it at
any point.)

Emulate the setting of r14 on exit; this isn't really a guest
visible change for valid guest code because r14 isn't reliably
observable anyway. However having the code and the comment helps
to explain why it's ok for the ERESTARTSYS handling not to undo
the changes to r14 that happen on syscall entry.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for tilegx targets

Update the tilegx main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* return -TARGET_QEMU_ESIGRETURN from sigreturn rather than current R_RE
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Note that this fixes a bug where a sigreturn which happened to have
an errno value in TILEGX_R_RE would incorrectly cause TILEGX_R_ERR
to get set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for CRIS targets

Update the CRIS main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code on sigreturn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-34-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for S390 targets

Update the S390 main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code on sigreturn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-33-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; remove stray double semicolon; drop
TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for M68K targets

Update the M68K main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code on sigreturn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-32-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for OpenRISC targets

Update the OpenRISC main loop code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

(We don't implement sigreturn on this target so there is no
code there to update.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-31-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for UniCore32 targets

Update the UniCore32 main loop code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

(We don't support signals on this target so there is no sigreturn code
to update.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-30-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for Alpha targets

Update the Alpha main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-13-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define;
PC is env->pc, not env->ir[IR_PV]]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for SH4 targets

Update the SH4 main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code on sigreturn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-12-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for SPARC targets

Update the SPARC main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code on sigreturn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-9-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for PPC targets

Update the PPC main loop code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn

(We already handle TARGET_QEMU_ESIGRETURN.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-8-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for MIPS targets

Update the MIPS main loop code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn

(We already handle TARGET_QEMU_ESIGRETURN.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-7-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for ARM targets

Update the 32-bit and 64-bit ARM main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code on sigreturn
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-6-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Support for restarting system calls for x86 targets

Update the x86 main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code rather than passing it
back out as the "return code" from do_sigreturn()
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch EAX

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-5-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Renumber TARGET_QEMU_ESIGRETURN, make it not arch-specific

Currently we define a QEMU-internal errno TARGET_QEMU_ESIGRETURN
only on the MIPS and PPC targets; move this to errno_defs.h
so it is available for all architectures, and renumber it to 513.
We pick 513 because this is safe from future use as a system call return
value: Linux uses it as ERESTART_NOINTR internally and never allows that
errno to escape to userspace.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-4-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: TARGET_ERESTARTSYS split out into preceding patch, add comment]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Define TARGET_ERESTART* errno values

Define TARGET_ERESTARTSYS; like the kernel, we will use this to
indicate that a guest system call should be restarted. We use
the same value the kernel does for this, 512.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
[PMM: split out from the patch which moves and renumbers
TARGET_QEMU_ESIGRETURN, add comment on usage]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Reindent signal handling

Some of the signal handling was a mess with a mixture of tabs and 8 space
indents.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-3-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: just rebased]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>

linux-user: Consistently return host errnos from do_openat()

The function do_openat() is not consistent about whether it is
returning a host errno or a guest errno in case of failure.
Standardise on returning -1 with errno set (ie caller has
to call get_errno()).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>

linux-user: Check array bounds in errno conversion

Check array bounds in host_to_target_errno() and target_to_host_errno().

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-2-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Add a lower-bound check, use braces on if(), tweak commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160527' into staging

ppc patch queue for 2016-05-27 (first pull for qemu-2.7)

I'm back from holidays now, and have re-collated the ppc patch queue.
This is a first pull request against the qemu-2.7 branch, mostly
consisting of patches which were posted before the 2.6 freeze, but
weren't suitable for late inclusion in the 2.6 branch.

* Assorted bugfixes and cleanups
* Some preliminary patches towards dynamic DMA windows and CPU hotplug
* Significant performance impovement for the spapr-llan device
* Added myself to MAINTAINERS for ppc (overdue)

# gpg: Signature made Fri 27 May 2016 04:04:15 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160527:
  MAINTAINERS: Add David Gibson as ppc maintainer
  spapr_iommu: Move table allocation to helpers
  spapr_iommu: Finish renaming vfio_accel to need_vfio
  spapr_pci: Use correct DMA LIOBN when composing the device tree
  spapr: ensure device trees are always associated with DRC
  PPC/KVM: early validation of vcpu id
  Added negative check for get_image_size()
  hw/net/spapr_llan: Provide counter with dropped rx frames to the guest
  hw/net/spapr_llan: Delay flushing of the RX queue while adding new RX buffers
  target-ppc: Cleanups to rldinm, rldnm, rldimi
  target-ppc: Use 32-bit rotate instead of deposit + 64-bit rotate
  target-ppc: Use movcond in isel
  target-ppc: Correct KVM synchronization for ppc_hash64_set_external_hpt()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

MAINTAINERS: Add David Gibson as ppc maintainer

I've been de facto co-maintainer of all ppc target related code for some
time. Alex Graf isworking on other things and doesn't have a whole lot of
time for qemu ppc maintainership. So, update the MAINTAINERS file to
reflect this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

spapr_iommu: Move table allocation to helpers

At the moment presence of vfio-pci devices on a bus affect the way
the guest view table is allocated. If there is no vfio-pci on a PHB
and the host kernel supports KVM acceleration of H_PUT_TCE, a table
is allocated in KVM. However, if there is vfio-pci and we do yet not
KVM acceleration for these, the table has to be allocated by
the userspace. At the moment the table is allocated once at boot time
but next patches will reallocate it.

This moves kvmppc_create_spapr_tce/g_malloc0 and their counterparts
to helpers.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr_iommu: Finish renaming vfio_accel to need_vfio

6a81dd17 "spapr_iommu: Rename vfio_accel parameter" renamed vfio_accel
flag everywhere but one spot was missed.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr_pci: Use correct DMA LIOBN when composing the device tree

The user could have picked LIOBN via the CLI but the device tree
rendering code would still use the value derived from the PHB index
(which is the default fallback if LIOBN is not set in the CLI).

This replaces SPAPR_PCI_LIOBN() with the actual DMA LIOBN value.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: ensure device trees are always associated with DRC

There are possible racing situations involving hotplug events and
guest migration. For cases where a hotplug event is migrated, or
the guest is in the process of fetching device tree at the time of
migration, we need to ensure the device tree is created and
associated with the corresponding DRC for devices that were
hotplugged on the source, but 'coldplugged' on the target.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

PPC/KVM: early validation of vcpu id

The KVM API restricts vcpu ids to be < KVM_CAP_MAX_VCPUS. On PowerPC
targets, depending on the number of threads per core in the host and
in the guest, some topologies do generate higher vcpu ids actually.
When this happens, QEMU bails out with the following error:

kvm_init_vcpu failed: Invalid argument

The KVM_CREATE_VCPU ioctl has several EINVAL return paths, so it is
not possible to fully disambiguate.

This patch adds a check in the code that computes vcpu ids, so that
we can detect the error earlier, and print a friendlier message instead
of calling KVM_CREATE_VCPU with an obviously bogus vcpu id.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>