Benjamin Herrenschmidt [Thu, 29 Aug 2013 07:25:35 +0000 (17:25 +1000)]
powerpc/scom: Use "devspec" rather than "path" in debugfs entries
This is the traditional name for the device-tree path, as used in sysfs;
do the same for the XSCOM debugfs files.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 29 Aug 2013 06:58:12 +0000 (16:58 +1000)]
powerpc/scom: CONFIG_SCOM_DEBUGFS should depend on CONFIG_DEBUG_FS
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 29 Aug 2013 06:57:33 +0000 (16:57 +1000)]
powerpc/powernv: Add scom support under OPALv3
OPAL v3 provides interfaces to access the chip's XSCOM; expose
this via the existing scom infrastructure.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 29 Aug 2013 06:56:59 +0000 (16:56 +1000)]
powerpc/scom: Create debugfs files using ibm,chip-id if available
When creating the debugfs scom files, use "ibm,chip-id" as the scom%d
index rather than a simple made-up number when possible.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 29 Aug 2013 06:56:16 +0000 (16:56 +1000)]
powerpc/scom: Add support for "reg" property
When devices are direct children of a scom controller node, they
should be able to use the normal "reg" property instead of "scom-reg".
In that case, they also use #address-cells rather than #scom-cells
to indicate the size of an entry.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 29 Aug 2013 06:55:45 +0000 (16:55 +1000)]
powerpc/scom: Change scom_read() and scom_write() to return errors
scom_read() now returns the read value via a pointer argument, and
both functions return an int error code.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
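Illustration of the calling convention described in this commit, as a userspace sketch only. The `fake_scom_*` names, the in-memory register array, and the bounds check are assumptions made for the example; they are not the kernel's scom API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backing store standing in for real SCOM registers. */
#define FAKE_SCOM_NREGS 4
static uint64_t fake_scom_regs[FAKE_SCOM_NREGS];

/* Returns 0 on success or a negative errno; the value comes back via *value. */
static int fake_scom_read(unsigned int reg, uint64_t *value)
{
	if (reg >= FAKE_SCOM_NREGS || value == NULL)
		return -EINVAL;
	*value = fake_scom_regs[reg];
	return 0;
}

/* Returns 0 on success or a negative errno, so callers can see failures. */
static int fake_scom_write(unsigned int reg, uint64_t value)
{
	if (reg >= FAKE_SCOM_NREGS)
		return -EINVAL;
	fake_scom_regs[reg] = value;
	return 0;
}
```

Separating the status from the data means a failed read can no longer be mistaken for a legitimate register value.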
Benjamin Herrenschmidt [Thu, 29 Aug 2013 06:55:07 +0000 (16:55 +1000)]
powerpc: Enable /dev/port when isa_io_special is set
isa_io_special is set when the platform provides a "special"
implementation of inX/outX, for example via some FW interface.
Such a platform doesn't need an ISA bridge on PCI, and so /dev/port
should be made available even if one isn't present.
This makes the LPC bus IOs accessible via /dev/port on PowerNV Power8.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Eugene Surovegin [Fri, 20 Sep 2013 18:42:21 +0000 (11:42 -0700)]
powerpc: Make ftrace endian-safe.
Signed-off-by: Eugene Surovegin <surovegin@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Eugene Surovegin [Fri, 20 Sep 2013 18:42:20 +0000 (11:42 -0700)]
powerpc: Make kernel module helper endian-safe.
Signed-off-by: Eugene Surovegin <surovegin@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Laurent Dufour [Tue, 17 Sep 2013 09:52:48 +0000 (11:52 +0200)]
powerpc: prom_init exception when updating core value
Since the CPU generates an exception when accessing an unaligned word, and
this exception is not yet handled while running prom_init, data should be
copied from the architecture vector byte by byte.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
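Illustration of a byte-by-byte copy, as a userspace sketch. The helper name is hypothetical, and the `volatile` qualifiers are an assumption used here to discourage the compiler from merging the byte accesses into a wider one; the actual prom_init change may look different.

```c
#include <stddef.h>

/* Copy 'len' bytes one at a time so that only byte-sized accesses are requested. */
static void sketch_copy_bytewise(void *dst, const void *src, size_t len)
{
	volatile unsigned char *d = dst;
	const volatile unsigned char *s = src;
	size_t i;

	for (i = 0; i < len; i++)
		d[i] = s[i];
}
```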
Kevin Hao [Thu, 26 Sep 2013 08:23:56 +0000 (16:23 +0800)]
powerpc/booke64: Check napping in performance monitor interrupt
The performance monitor interrupt is asynchronous, so the handler for this
interrupt should check whether the current processor is napping.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cedric Le Goater [Mon, 23 Sep 2013 12:17:54 +0000 (14:17 +0200)]
powerpc/kernel: Fix endian issue in rtas_pci
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Fri, 11 Oct 2013 03:07:59 +0000 (14:07 +1100)]
powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM
Add support for the arch_get_random_long() hook based on the H_RANDOM
hypervisor call. We trust the hypervisor to provide us with random data,
i.e. we don't whiten it in any way.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Fri, 11 Oct 2013 03:07:58 +0000 (14:07 +1100)]
hwrng: Add a driver for the hwrng found in power7+ systems
Add a driver for the hwrng found in power7+ systems, based on the
existing code for the arch_get_random_long() hook.
We only register a single instance of the driver, not one per device,
because we use the existing per_cpu array of devices in the arch code.
This means we always read from the "closest" device, avoiding inter-chip
memory traffic.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Fri, 11 Oct 2013 03:07:57 +0000 (14:07 +1100)]
powerpc: Implement arch_get_random_long/int() for powernv
Add the plumbing to implement arch_get_random_long/int(). It didn't seem
worth adding an extra ppc_md hook for int, so we reuse the one for long.
Add an implementation for powernv based on the hwrng found in power7+
systems. We whiten the output of the hwrng, and the result passes all
the dieharder tests.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
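Illustration of reusing one "long" hook for the "int" variant, as a userspace sketch. The function-pointer type, the helper names, and the fixed-value stand-in hook are all hypothetical; the stand-in returns a constant purely so the behaviour is easy to follow and is in no way a random source.

```c
#include <stddef.h>

/* Hypothetical platform hook: returns 1 and fills *v on success, 0 on failure. */
typedef int (*sketch_random_long_fn)(unsigned long *v);

/* Stand-in hook for the example only; a real hook would read the hwrng. */
static int sketch_fixed_hook(unsigned long *v)
{
	*v = 0x12345678UL;
	return 1;
}

/* The int variant calls the long hook and keeps the low bits of the result. */
static int sketch_get_random_int(sketch_random_long_fn hook, unsigned int *v)
{
	unsigned long val;

	if (hook == NULL || !hook(&val))
		return 0;
	*v = (unsigned int)val;
	return 1;
}
```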
Michael Ellerman [Wed, 25 Sep 2013 09:24:17 +0000 (19:24 +1000)]
hwrng: Return errors to upper levels in pseries-rng.c
We don't expect to get errors from the hypervisor when reading the rng,
but if we do we should pass the error up to the hwrng driver. Otherwise
the hwrng driver will continue calling us forever.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Bharat Bhushan [Wed, 9 Oct 2013 05:11:17 +0000 (10:41 +0530)]
powerpc: Added __cmpdi2 for signed 64bit comparision
This was missing on powerpc, and I am getting a compilation error:
drivers/vfio/pci/vfio_pci_rdwr.c:193: undefined reference to `__cmpdi2'
drivers/vfio/pci/vfio_pci_rdwr.c:193: undefined reference to `__cmpdi2'
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
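Illustration of what a `__cmpdi2`-style helper computes, as a userspace sketch. The libgcc contract is to return 0 when a < b, 1 when a == b, and 2 when a > b. The name `sketch_cmpdi2` is hypothetical, and the high/low word split shown here is an assumption about how such a helper is commonly written, not a copy of the kernel's version.

```c
#include <stdint.h>

/* Signed 64-bit compare built from 32-bit halves: 0 if a < b, 1 if equal, 2 if a > b. */
static int sketch_cmpdi2(int64_t a, int64_t b)
{
	/* High words carry the sign, so they are compared as signed values. */
	int32_t a_hi = (int32_t)((uint64_t)a >> 32);
	int32_t b_hi = (int32_t)((uint64_t)b >> 32);
	/* Low words are plain magnitude bits, so they are compared as unsigned. */
	uint32_t a_lo = (uint32_t)a;
	uint32_t b_lo = (uint32_t)b;

	if (a_hi < b_hi)
		return 0;
	if (a_hi > b_hi)
		return 2;
	if (a_lo < b_lo)
		return 0;
	if (a_lo > b_lo)
		return 2;
	return 1;
}
```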
Vladimir Murzin [Sun, 29 Sep 2013 12:41:18 +0000 (14:41 +0200)]
powerpc: Fix section mismatch warning in free_lppacas
While cross-building for PPC64 I got a bunch of warnings like:
WARNING: arch/powerpc/kernel/built-in.o(.text.unlikely+0x2d2): Section
mismatch in reference from the function .free_lppacas() to the variable
.init.data:lppaca_size The function .free_lppacas() references the variable
__initdata lppaca_size. This is often because .free_lppacas lacks a __initdata
annotation or the annotation of lppaca_size is wrong.
Fix it by using the proper annotation for free_lppacas. Additionally, annotate
allocate_lppacas and new_lppaca properly.
Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kevin Hao [Thu, 26 Sep 2013 08:41:34 +0000 (16:41 +0800)]
powerpc/ppc64: Remove the unneeded load of ti_flags in resume_kernel
We have already loaded the values of current_thread_info and ti_flags
into r9 and r4 respectively before jumping to resume_kernel, so
there is no reason to reload them.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Bartlomiej Zolnierkiewicz [Mon, 30 Sep 2013 13:13:55 +0000 (15:13 +0200)]
powerpc/8xx/tqm8xx: Fix incorrect placement of __initdata tag
The __initdata tag should be placed between the variable name and the equals
sign for the variable to be placed in the intended .init.data section.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Bartlomiej Zolnierkiewicz [Mon, 30 Sep 2013 13:11:42 +0000 (15:11 +0200)]
powerpc/legacy_serial: Fix incorrect placement of __initdata tag
The __initdata tag should be placed between the variable name and the equals
sign for the variable to be placed in the intended .init.data section.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
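Illustration of the placement rule stated in the two commits above, as a userspace sketch. `sketch_initdata` is a hypothetical stand-in for the kernel's `__initdata` macro, and the variable is invented for the example; on non-ELF or non-GCC-compatible toolchains the macro expands to nothing.

```c
/* Stand-in for the kernel macro; only ELF targets get a real section attribute. */
#if defined(__GNUC__) && defined(__ELF__)
#define sketch_initdata __attribute__((__section__(".init.data")))
#else
#define sketch_initdata
#endif

/* The tag sits after the declarator and before '=', so it applies to the variable. */
static int sketch_pins[3] sketch_initdata = { 1, 2, 3 };

/* Small accessor so the example data is actually referenced. */
static int sketch_pin(unsigned int idx)
{
	return sketch_pins[idx % 3];
}
```

With an ELF toolchain, `objdump -t` on the resulting object should list `sketch_pins` under `.init.data`.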
Scott Wood [Fri, 27 Sep 2013 00:18:18 +0000 (19:18 -0500)]
powerpc/mpic: Disable preemption when calling mpic_processor_id()
Otherwise, we get a debug traceback due to the use of
smp_processor_id() (or get_paca()) inside hard_smp_processor_id().
mpic_host_map() is just looking for a default CPU, so it doesn't matter
if we migrate after getting the CPU ID.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:05:13 +0000 (12:05 +1000)]
powerpc: Work around little endian gcc bug
Temporarily work around an ICE we are seeing while building
in little endian mode:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57134
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:05:12 +0000 (12:05 +1000)]
powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on little endian builds
POWER7 takes alignment exceptions on some unaligned addresses, so
disable HAVE_EFFICIENT_UNALIGNED_ACCESS. This fixes an early boot
issue in the printk code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Ian Munsie [Mon, 23 Sep 2013 02:05:11 +0000 (12:05 +1000)]
powerpc: Add ability to build little endian kernels
This patch allows the kbuild system to successfully compile a kernel
for the little endian PowerPC64 architecture. A subsequent patch
will add the CONFIG_CPU_LITTLE_ENDIAN kernel config option which
must be set to build such a kernel.
If cross compiling, CROSS_COMPILE must point to a suitable toolchain
(compiled for the powerpc64le-linux and powerpcle-linux targets).
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:05:09 +0000 (12:05 +1000)]
KVM: PPC: Disable KVM on little endian builds
There are a number of KVM issues with little endian builds.
We are working on fixing them, but in the meantime disable
it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:05:08 +0000 (12:05 +1000)]
tty/hvc_opal: powerpc: Make OPAL HVC device tree accesses endian safe
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:05:07 +0000 (12:05 +1000)]
powerpc/hvsi: Fix endian issues in HVSI driver
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:05:06 +0000 (12:05 +1000)]
powerpc/powernv: Fix some PCI sparse errors and one LE bug
pnv_pci_setup_bml_iommu was missing a byteswap of a device
tree property.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:05:05 +0000 (12:05 +1000)]
powerpc/powernv: More little endian issues in OPAL RTC driver
Sparse caught an issue where opal_set_rtc_time was incorrectly
byteswapping. Also fix a number of sparse warnings.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:05:04 +0000 (12:05 +1000)]
powerpc/powernv: Don't register exception handlers in little endian mode
The powernv exception handlers are not ready to take exceptions
in little endian mode, so disable them.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:05:03 +0000 (12:05 +1000)]
powerpc/powernv: Fix OPAL entry and exit in little endian mode
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:05:02 +0000 (12:05 +1000)]
powerpc/powernv: Fix endian issues in OPAL console and udbg backend
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:05:01 +0000 (12:05 +1000)]
powerpc/powernv: Fix endian issues in powernv PCI code
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:05:00 +0000 (12:05 +1000)]
powerpc/powernv: Make OPAL NVRAM device tree accesses endian safe
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:04:59 +0000 (12:04 +1000)]
powerpc/powernv: Fix endian issues in OPAL ICS backend
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:04:58 +0000 (12:04 +1000)]
powerpc/powernv: Fix endian issues in OPAL RTC driver
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Alistair Popple [Mon, 23 Sep 2013 02:04:57 +0000 (12:04 +1000)]
powerpc: Little endian sparse clean up for arch/powerpc/platforms/powernv/pci-ioda.c
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Alistair Popple [Mon, 23 Sep 2013 02:04:56 +0000 (12:04 +1000)]
powerpc: Little endian fix for arch/powerpc/platforms/powernv/pci-p5ioc2.c
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Alistair Popple [Mon, 23 Sep 2013 02:04:55 +0000 (12:04 +1000)]
powerpc: Little endian fix for arch/powerpc/platforms/powernv/pci.c
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Alistair Popple [Mon, 23 Sep 2013 02:04:54 +0000 (12:04 +1000)]
powerpc: Little endian fixes for platforms/powernv/opal.c
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:53 +0000 (12:04 +1000)]
powerpc: uname should return ppc64le/ppcle on little endian builds
We need to distinguish between big endian and little endian
environments, so fix uname to return the right thing.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:52 +0000 (12:04 +1000)]
powerpc: Use generic memcpy code in little endian
We need to fix some endian issues in our memcpy code. For now
just enable the generic memcpy routine for little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:51 +0000 (12:04 +1000)]
powerpc: Use generic checksum code in little endian
We need to fix some endian issues in our checksum code. For now
just enable the generic checksum routines for little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:50 +0000 (12:04 +1000)]
powerpc: Handle VSX alignment faults in little endian mode
Things are complicated by the fact that VSX elements are big
endian ordered even in little endian mode. 8 byte loads and
stores also write to the top 8 bytes of the register.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:49 +0000 (12:04 +1000)]
powerpc: Add little endian support to alignment handler
Handle most unaligned load and store faults in little
endian mode. Strings, multiples and VSX are not supported.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:48 +0000 (12:04 +1000)]
powerpc: Alignment handler shouldn't access VSX registers with TS_FPR
The TS_FPR macro selects the FPR component of a VSX register (the
high doubleword). emulate_vsx is using this macro to get the
address of the associated VSX register. This happens to work on big
endian, but fails on little endian.
Replace it with an explicit array access.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:47 +0000 (12:04 +1000)]
powerpc: Remove hard coded FP offsets in alignment handler
The alignment handler assumes big endian ordering when selecting
the low word of a 64bit floating point value. Use the existing
union which works in both little and big endian.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:46 +0000 (12:04 +1000)]
powerpc: Remove open coded byte swap macro in alignment handler
Use swab64/32/16 instead of open coding it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
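Illustration of what the swab16/32/64 helpers compute, as a userspace sketch. The `sketch_` names are hypothetical and the shift-based bodies are only one portable way to express the byte reversal; the kernel's own helpers may map to dedicated instructions.

```c
#include <stdint.h>

/* Reverse the two bytes of a 16-bit value. */
static uint16_t sketch_swab16(uint16_t x)
{
	return (uint16_t)((x >> 8) | (x << 8));
}

/* Reverse four bytes by swapping the halves and byte-swapping each half. */
static uint32_t sketch_swab32(uint32_t x)
{
	return ((uint32_t)sketch_swab16((uint16_t)x) << 16) |
	       sketch_swab16((uint16_t)(x >> 16));
}

/* Reverse eight bytes using the same halves-and-recurse pattern. */
static uint64_t sketch_swab64(uint64_t x)
{
	return ((uint64_t)sketch_swab32((uint32_t)x) << 32) |
	       sketch_swab32((uint32_t)(x >> 32));
}
```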
Benjamin Herrenschmidt [Mon, 23 Sep 2013 02:04:45 +0000 (12:04 +1000)]
powerpc: Endian safe trampoline
Create a trampoline that works in either endian and flips to
the expected endian. Use it for primary and secondary thread
entry as well as RTAS and OF call return.
Credit for finding the magic instruction goes to Paul Mackerras.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Ian Munsie [Mon, 23 Sep 2013 02:04:44 +0000 (12:04 +1000)]
powerpc: Include the appropriate endianness header
This patch will have powerpc include the appropriate generic endianness
header depending on what the compiler reports.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
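Illustration of selecting behaviour from what the compiler reports, as a userspace sketch. It assumes a GCC-compatible compiler that predefines `__BYTE_ORDER__`; the `SKETCH_` macro and the runtime probe are hypothetical names used only for this example, and the kernel patch selects a header rather than defining a flag.

```c
#include <stdint.h>

/* Compile-time view: trust the byte order the compiler reports. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define SKETCH_LITTLE_ENDIAN 1
#else
#define SKETCH_LITTLE_ENDIAN 0
#endif

/* Run-time cross-check: inspect the first byte in memory of the value 1. */
static int sketch_runtime_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const unsigned char *)&probe == 1;
}
```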
Anton Blanchard [Mon, 23 Sep 2013 02:04:43 +0000 (12:04 +1000)]
powerpc: Reset MSR_LE on signal entry
We always take signals in big endian which is wrong. Signals
should be taken in native endian.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:42 +0000 (12:04 +1000)]
powerpc: Set MSR_LE bit on little endian builds
We need to set MSR_LE in kernel and userspace for little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:41 +0000 (12:04 +1000)]
powerpc: Add little endian support for word-at-a-time functions
The powerpc word-at-a-time functions are big endian specific.
Bring in the x86 version in order to support little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
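Illustration of a little-endian word-at-a-time zero-byte search, as a userspace sketch. This is a common formulation of the technique, with hypothetical `sketch_` names; it is not claimed to match the imported code line for line. It assumes a GCC-compatible compiler for `__builtin_ctzll`, and `sketch_find_zero()` must only be called with a nonzero mask.

```c
#include <stdint.h>

#define SKETCH_ONES  0x0101010101010101ULL
#define SKETCH_HIGHS 0x8080808080808080ULL

/* Nonzero if any byte of 'a' is zero; the lowest set bit marks the first zero byte. */
static uint64_t sketch_has_zero(uint64_t a)
{
	return (a - SKETCH_ONES) & ~a & SKETCH_HIGHS;
}

/* Index (0..7, counted from the least significant byte) of the first zero byte. */
static unsigned int sketch_find_zero(uint64_t mask)
{
	return (unsigned int)__builtin_ctzll(mask) >> 3;
}
```

On a little-endian machine the least significant byte is the first byte in memory, which is why counting trailing zero bits yields the string offset directly; a big-endian word needs the search to start from the other end.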
Ian Munsie [Mon, 23 Sep 2013 02:04:40 +0000 (12:04 +1000)]
powerpc: Support endian agnostic MMIO
This patch maps the MMIO functions for 32bit PowerPC to their
appropriate instructions depending on CPU endianness.
The macros used to create the corresponding inline functions are also
renamed by this patch. Previously they had BE or LE in their names which
was misleading - they had nothing to do with endianness, but actually
created different instruction forms so their new names reflect the
instruction form they are creating (D-Form and X-Form).
Little endian 64bit PowerPC is not supported, so the lack of mappings
(and corresponding breakage) for that case is intentional to bring the
attention of anyone doing a 64bit little endian port. 64bit big endian
is unaffected.
[ Added 64 bit versions - Anton ]
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:39 +0000 (12:04 +1000)]
powerpc: Little endian builds double word swap VSX state during context save/restore
The elements within VSX loads and stores are big endian ordered
regardless of endianness. Our VSX context save/restore code uses
lxvd2x and stxvd2x which is a 2x doubleword operation. This means
the two doublewords will be swapped and we have to perform another
swap to undo it.
We need to do this on save and restore.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
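Illustration of the extra doubleword swap, as a plain data model in userspace C. The struct and function names are hypothetical and no VSX instructions are involved; the sketch only shows that swapping the two 64-bit halves a second time restores the original order.

```c
#include <stdint.h>

/* A 128-bit register image modelled as two 64-bit doublewords. */
struct sketch_vsr {
	uint64_t dw[2];
};

/* Exchange the two doublewords; applying this twice is the identity. */
static struct sketch_vsr sketch_swap_doublewords(struct sketch_vsr v)
{
	struct sketch_vsr out = { { v.dw[1], v.dw[0] } };

	return out;
}
```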
Anton Blanchard [Mon, 23 Sep 2013 02:04:38 +0000 (12:04 +1000)]
powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds
FPRs overlap the high 64bits of the first 32 VSX registers. The
ptrace FP read/write code assumes big endian ordering and grabs
the lowest 64 bits.
Fix this by using the TS_FPR macro which does the right thing.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:37 +0000 (12:04 +1000)]
powerpc: Fix offset of FPRs in VSX registers in little endian builds
The FPRs overlap the high doublewords of the first 32 VSX registers.
Fix TS_FPROFFSET and TS_VSRLOWOFFSET so we access the correct fields
in little endian mode.
If VSX is disabled the FPRs are only one doubleword in length so
TS_FPROFFSET needs adjusting in little endian.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:36 +0000 (12:04 +1000)]
powerpc: Book 3S MMU little endian support
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 23 Sep 2013 02:04:35 +0000 (12:04 +1000)]
powerpc: Fix endian issues in VMX copy loops
Fix the permute loops for little endian.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 7 Oct 2013 21:08:24 +0000 (08:08 +1100)]
powerpc/irq: Don't switch to irq stack from softirq stack
irq_exit() is now called on the irq stack, which can trigger a switch to
the softirq stack from the irq stack. If an interrupt happens at that
point, we will not properly detect the re-entrancy and clobber the
original return context on the irq stack.
This fixes it. The side effect is to prevent all nesting from softirq
stack to irq stack even in the "safe" case but it's simpler that way and
matches what x86_64 does.
Reported-by: Cédric Le Goater <clg@fr.ibm.com>
Tested-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 7 Oct 2013 16:30:36 +0000 (09:30 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- fix for hidraw reference counting regression, by Manoj Chourasia
- fix for minor number allocation for uhid, by David Herrmann
- other small unsorted fixes / device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: wiimote: fix FF deadlock
HID: add Holtek USB ID 04d9:a081 SHARKOON DarkGlider
HID: hidraw: close underlying device at removal of last reader
HID: roccat: Fix "cannot create duplicate filename" problems
HID: uhid: allocate static minor
Linus Torvalds [Mon, 7 Oct 2013 16:30:02 +0000 (09:30 -0700)]
Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull Tile bugfixes from Chris Metcalf:
"This fixes some serious issues with PREEMPT support, and a couple of
smaller corner-case issues fixed in the last couple of weeks"
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch: tile: re-use kbasename() helper
tile: use a more conservative __my_cpu_offset in CONFIG_PREEMPT
tile: ensure interrupts disabled for preempt_schedule_irq()
tile: change lock initalization in hardwall
tile: include: asm: use 'long long' instead of 'u64' for atomic64_t and its related functions
David Herrmann [Wed, 2 Oct 2013 11:47:28 +0000 (13:47 +0200)]
HID: wiimote: fix FF deadlock
The input core has an internal spinlock that is acquired during event
injection via input_event() and friends but also held during FF callbacks.
That means, there is no way to share a lock between event-injection and FF
handling. Unfortunately, this is what is required for wiimote state
tracking and what we do with state.lock and input->lock.
This deadlock can be triggered when using continuous data reporting and FF
on a wiimote device at the same time. It takes me at least 30 minutes of
stress-testing to trigger it, but users reported considerably shorter
times (http://bpaste.net/show/132504/) when using some gaming-console
emulators.
The real problem is that we have two copies of internal state, one in the
wiimote objects and the other in the input device. As the input-lock is
not supposed to be accessed from outside of input-core, we have no
choice other than offloading FF handling into a worker. This actually works
pretty nicely and also allows us to implicitly merge fast rumble changes into a
single request.
Due to the 3-layered workers (rumble+queue+l2cap) this might reduce FF
responsiveness. Initial tests were fine, so let's fix the race first, and if
it turns out to be too slow we can always handle FF out-of-band and skip
the queue-worker.
Cc: <stable@vger.kernel.org> # 3.11+
Reported-by: Thomas Schneider
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Linus Torvalds [Mon, 7 Oct 2013 08:13:26 +0000 (01:13 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of bug fixes; notable are the regression with ptrace vs
restarting system calls and the patch for kdump to be able to copy
from virtual memory"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix system call restart after inferior call
s390: Allow vmalloc target buffers for copy_from_oldmem()
s390/sclp: properly detect line mode console
s390/kprobes: add exrl to list of prohibited opcodes
s390/3270: fix return value check in tty3270_resize_work()
Linus Torvalds [Sun, 6 Oct 2013 21:00:20 +0000 (14:00 -0700)]
Linux 3.12-rc4
Eric W. Biederman [Sat, 5 Oct 2013 20:15:30 +0000 (13:15 -0700)]
net: Update the sysctl permissions handler to test effective uid/gid
Modify the code to use current_euid() and in_egroup_p(), as is done
in fs/proc/proc_sysctl.c:test_perm().
Cc: stable@vger.kernel.org
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
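Illustration of an effective-ID permission test of the kind this commit refers to, as a userspace sketch. The function name, the `SKETCH_MAY_*` constants, and the two boolean parameters (standing in for the results of the euid and egroup checks) are assumptions made for the example; the sketch is modelled on the general owner/group/other mode-bit pattern and is not a copy of the kernel code.

```c
#include <errno.h>

#define SKETCH_MAY_EXEC  1
#define SKETCH_MAY_WRITE 2
#define SKETCH_MAY_READ  4

/*
 * Pick the owner, group, or other permission triplet from 'mode' based on the
 * caller's effective IDs, then check that every requested bit in 'op' is set.
 */
static int sketch_test_perm(int mode, int op, int euid_is_root, int in_root_egroup)
{
	if (euid_is_root)
		mode >>= 6;	/* owner bits */
	else if (in_root_egroup)
		mode >>= 3;	/* group bits */
	/* otherwise the low three "other" bits apply */

	if ((op & ~mode & (SKETCH_MAY_READ | SKETCH_MAY_WRITE | SKETCH_MAY_EXEC)) == 0)
		return 0;
	return -EACCES;
}
```

For a 0644 entry, this grants write access only when the effective uid check succeeds, while read access is granted to everyone.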
Linus Torvalds [Sun, 6 Oct 2013 20:38:31 +0000 (13:38 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
"Here are the outstanding target fixes queued up for v3.12-rc4 code.
The highlights include:
- Make vhost/scsi tag percpu_ida_alloc() use GFP_ATOMIC
- Allow sess_cmd_map allocation failure fallback to use vzalloc
- Fix COMPARE_AND_WRITE se_cmd->data_length bug with FILEIO backends
- Fixes for COMPARE_AND_WRITE callback recursive failure OOPs + non
zero scsi_status bug
- Make iscsi-target do acknowledgement tag release from RX context
- Setup iscsi-target with extra (cmdsn_depth / 2) percpu_ida tags
Also included is an iscsi-target patch CC'ed for v3.10+ that avoids
legacy wait_for_task=true release during fast-path StatSN
acknowledgement, and two other SRP target related patches that address
long-standing issues that are CC'ed for v3.3+.
Extra thanks to Thomas Glanzmann for his testing feedback with
COMPARE_AND_WRITE + EXTENDED_COPY VAAI logic"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target; Allow an extra tag_num / 2 number of percpu_ida tags
iscsi-target: Perform release of acknowledged tags from RX context
iscsi-target: Only perform wait_for_tasks when performing shutdown
target: Fail on non zero scsi_status in compare_and_write_callback
target: Fix recursive COMPARE_AND_WRITE callback failure
target: Reset data_length for COMPARE_AND_WRITE to NoLB * block_size
ib_srpt: always set response for task management
target: Fall back to vzalloc upon ->sess_cmd_map kzalloc failure
vhost/scsi: Use GFP_ATOMIC with percpu_ida_alloc for obtaining tag
ib_srpt: Destroy cm_id before destroying QP.
target: Fix xop->dbl assignment in target_xcopy_parse_segdesc_02
Linus Torvalds [Sun, 6 Oct 2013 20:35:15 +0000 (13:35 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"Here are the slave dmaengine fixes. We have the fix for the deadlock issue
on imx-dma by Michael and Josh's edma config fix, along with an author
change"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: imx-dma: fix callback path in tasklet
dmaengine: imx-dma: fix lockdep issue between irqhandler and tasklet
dmaengine: imx-dma: fix slow path issue in prep_dma_cyclic
dma/Kconfig: Make TI_EDMA select TI_PRIV_EDMA
edma: Update author email address
Linus Torvalds [Sat, 5 Oct 2013 19:17:24 +0000 (12:17 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This is a small collection of fixes, including a regression fix from
Liu Bo that solves rare crashes with compression on.
I've merged my for-linus up to 3.12-rc3 because the top commit is only
meant for 3.12. The rest of the fixes are also available in my master
branch on top of my last 3.11 based pull"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: Fix crash due to not allocating integrity data for a bioset
Btrfs: fix a use-after-free bug in btrfs_dev_replace_finishing
Btrfs: eliminate races in worker stopping code
Btrfs: fix crash of compressed writes
Btrfs: fix transid verify errors when recovering log tree
Linus Torvalds [Sat, 5 Oct 2013 19:11:40 +0000 (12:11 -0700)]
Merge tag 'gpio-v3.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Two patches for the OMAP driver, dealing with setting up IRQs properly
on the device tree boot path"
* tag 'gpio-v3.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio/omap: auto-setup a GPIO when used as an IRQ
gpio/omap: maintain GPIO and IRQ usage separately
Linus Torvalds [Sat, 5 Oct 2013 18:54:10 +0000 (11:54 -0700)]
Merge tag 'usb-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are nine fixes for various USB driver problems. The majority are
gadget/musb fixes, but there are some new device ids in here as well"
* tag 'usb-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: chipidea: add Intel Clovertrail pci id
usb: gadget: s3c-hsotg: fix can_write limit for non-periodic endpoints
usb: gadget: f_fs: fix error handling
usb: musb: dsps: do not bind to "musb-hdrc"
USB: serial: option: Ignore card reader interface on Huawei E1750
usb: musb: gadget: fix otg active status flag
usb: phy: gpio-vbus: fix deferred probe from __init
usb: gadget: pxa25x_udc: fix deferred probe from __init
usb: musb: fix otg default state
Linus Torvalds [Sat, 5 Oct 2013 18:26:19 +0000 (11:26 -0700)]
Merge tag 'tty-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty fixes from Greg KH:
"Here are two tty driver fixes for 3.12-rc4.
One fixes the reported regression in the n_tty code that a number of
people found recently, and the other one fixes an issue with xen
consoles that broke in 3.10"
* tag 'tty-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
xen/hvc: allow xenboot console to be used again
tty: Fix pty master read() after slave closes
Linus Torvalds [Sat, 5 Oct 2013 18:25:38 +0000 (11:25 -0700)]
Merge tag 'staging-3.12-rc4' of git://git./linux/kernel/git/gregkh/staging
Pull staging fixes from Greg KH:
"Here are 4 tiny staging and iio driver fixes for 3.12-rc4. Nothing
major, just some small fixes for reported issues"
* tag 'staging-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdevice
iio:magnetometer: Bugfix magnetometer default output registers
iio: Remove debugfs entries in iio_device_unregister()
iio: amplifiers: ad8366: Remove regulator_put
Darrick J. Wong [Fri, 20 Sep 2013 03:37:07 +0000 (20:37 -0700)]
btrfs: Fix crash due to not allocating integrity data for a bioset
When btrfs creates a bioset, we must also allocate the integrity data pool.
Otherwise btrfs will crash when it tries to submit a bio to a checksumming
disk:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
PGD 2305e4067 PUD 23063d067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: btrfs scsi_debug xfs ext4 jbd2 ext3 jbd mbcache
sch_fq_codel eeprom lpc_ich mfd_core nfsd exportfs auth_rpcgss af_packet
raid6_pq xor zlib_deflate libcrc32c [last unloaded: scsi_debug]
CPU: 1 PID: 4486 Comm: mount Not tainted 3.12.0-rc1-mcsum #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff8802451c9720 ti: ffff880230698000 task.ti: ffff880230698000
RIP: 0010:[<ffffffff8111e28a>] [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
RSP: 0018:ffff880230699688 EFLAGS: 00010286
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000005f8445
RDX: 0000000000000001 RSI: 0000000000000010 RDI: 0000000000000000
RBP: ffff8802306996f8 R08: 0000000000011200 R09: 0000000000000008
R10: 0000000000000020 R11: ffff88009d6e8000 R12: 0000000000011210
R13: 0000000000000030 R14: ffff8802306996b8 R15: ffff8802451c9720
FS: 00007f25b8a16800(0000) GS:ffff88024fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 0000000230576000 CR4: 00000000000007e0
Stack:
ffff8802451c9720 0000000000000002 ffffffff81a97100 0000000000281250
ffffffff81a96480 ffff88024fc99150 ffff880228d18200 0000000000000000
0000000000000000 0000000000000040 ffff880230e8c2e8 ffff8802459dc900
Call Trace:
[<ffffffff811b2208>] bio_integrity_alloc+0x48/0x1b0
[<ffffffff811b26fc>] bio_integrity_prep+0xac/0x360
[<ffffffff8111e298>] ? mempool_alloc+0x58/0x150
[<ffffffffa03e8041>] ? alloc_extent_state+0x31/0x110 [btrfs]
[<ffffffff81241579>] blk_queue_bio+0x1c9/0x460
[<ffffffff8123e58a>] generic_make_request+0xca/0x100
[<ffffffff8123e639>] submit_bio+0x79/0x160
[<ffffffffa03f865e>] btrfs_map_bio+0x48e/0x5b0 [btrfs]
[<ffffffffa03c821a>] btree_submit_bio_hook+0xda/0x110 [btrfs]
[<ffffffffa03e7eba>] submit_one_bio+0x6a/0xa0 [btrfs]
[<ffffffffa03ef450>] read_extent_buffer_pages+0x250/0x310 [btrfs]
[<ffffffff8125eef6>] ? __radix_tree_preload+0x66/0xf0
[<ffffffff8125f1c5>] ? radix_tree_insert+0x95/0x260
[<ffffffffa03c66f6>] btree_read_extent_buffer_pages.constprop.128+0xb6/0x120 [btrfs]
[<ffffffffa03c8c1a>] read_tree_block+0x3a/0x60 [btrfs]
[<ffffffffa03caefd>] open_ctree+0x139d/0x2030 [btrfs]
[<ffffffffa03a282a>] btrfs_mount+0x53a/0x7d0 [btrfs]
[<ffffffff8113ab0b>] ? pcpu_alloc+0x8eb/0x9f0
[<ffffffff81167305>] ? __kmalloc_track_caller+0x35/0x1e0
[<ffffffff81176ba0>] mount_fs+0x20/0xd0
[<ffffffff81191096>] vfs_kern_mount+0x76/0x120
[<ffffffff81193320>] do_mount+0x200/0xa40
[<ffffffff81135cdb>] ? strndup_user+0x5b/0x80
[<ffffffff81193bf0>] SyS_mount+0x90/0xe0
[<ffffffff8156d31d>] system_call_fastpath+0x1a/0x1f
Code: 4c 8d 75 a8 4c 89 6d e8 45 89 e0 4c 8d 6f 30 48 89 5d d8 41 83 e0 af 48
89 fb 49 83 c6 18 4c 89 7d f8 65 4c 8b 3c 25 c0 b8 00 00 <48> 8b 73 18 44 89 c7
44 89 45 98 ff 53 20 48 85 c0 48 89 c2 74
RIP [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
RSP <ffff880230699688>
CR2: 0000000000000018
---[ end trace 7a96042017ed21e2 ]---
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Sat, 5 Oct 2013 14:51:32 +0000 (10:51 -0400)]
Merge branch 'for-linus' into for-linus-3.12
Linus Torvalds [Sat, 5 Oct 2013 03:50:16 +0000 (20:50 -0700)]
Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"Small set of cifs fixes. Most important is Jeff's fix that works
around disconnection problems which can be caused by simultaneous use
of user space tools (starting a long running smbclient backup then
doing a cifs kernel mount) or multiple cifs mounts through a NAT, and
Jim's fix to deal with reexport of cifs share.
I expect to send two more cifs fixes next week (being tested now) -
fixes to address an SMB2 unmount hang when server dies and a fix for
cifs symlink handling of Windows "NFS" symlinks"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] update cifs.ko version
[CIFS] Remove ext2 flags that have been moved to fs.h
[CIFS] Provide sane values for nlink
cifs: stop trying to use virtual circuits
CIFS: FS-Cache: Uncache unread pages in cifs_readpages() before freeing them
Linus Torvalds [Sat, 5 Oct 2013 03:48:20 +0000 (20:48 -0700)]
Merge tag 'pci-v3.12-fixes-1' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"We merged what was intended to be an MMCONFIG cleanup, but in fact,
for systems without _CBA (which is almost everything), it broke
extended config space for domain 0 and it broke all config space for
other domains.
This reverts the change"
* tag 'pci-v3.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
Bjorn Helgaas [Fri, 4 Oct 2013 22:14:30 +0000 (16:14 -0600)]
Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
This reverts commit 07f9b61c3915e8eb156cb4461b3946736356ad02.
07f9b61c was intended to be a cleanup that didn't change anything, but in
fact, for systems without _CBA (which is almost everything), it broke
extended config space for domain 0 and all config space for other domains.
Reference: http://lkml.kernel.org/r/20131004011806.GE20450@dangermouse.emea.sgi.com
Reported-by: Hedi Berriche <hedi@sgi.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Linus Torvalds [Fri, 4 Oct 2013 22:03:42 +0000 (15:03 -0700)]
Merge tag 'pm+acpi-3.12-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- The resume part of user space driven hibernation (s2disk) is now
broken after the change that moved the creation of memory bitmaps to
after the freezing of tasks, because I forgot that the resume utility
loaded the image before freezing tasks and needed the bitmaps for
that. The fix adds special handling for that case.
- One of recent commits changed the export of acpi_bus_get_device() to
EXPORT_SYMBOL_GPL(), which was technically correct but broke existing
binary modules using that function including one in particularly
widespread use. Change it back to EXPORT_SYMBOL().
- The intel_pstate driver sometimes fails to disable turbo if its
no_turbo sysfs attribute is set. Fix from Srinivas Pandruvada.
- One of recent cpufreq fixes forgot to update a check in cpufreq-cpu0
which still (incorrectly) treats non-NULL as non-error. Fix from
Philipp Zabel.
- The SPEAr cpufreq driver uses a wrong variable type in one place
preventing it from catching errors returned by one of the functions
called by it. Fix from Sachin Kamat.
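The cpufreq-cpu0 item above turns on how the kernel encodes errors in pointer values. As a rough model only (Python is used purely for illustration and a 64-bit kernel is assumed; `regulator_usable` and `buggy_check` are hypothetical names, while `ERR_PTR`, `IS_ERR` and `MAX_ERRNO` mirror the helpers in include/linux/err.h):

```python
MAX_ERRNO = 4095
WORD = 1 << 64   # assumption: model a 64-bit kernel address space
ENODEV = 19


def ERR_PTR(err):
    """Encode a negative errno as a pointer-sized unsigned value."""
    return err % WORD


def IS_ERR(ptr):
    """True when ptr falls in the top MAX_ERRNO values of the address space."""
    return ptr >= WORD - MAX_ERRNO


def regulator_usable(ptr):
    # NULL (0) is a valid "no regulator" handle; only error pointers fail.
    return not IS_ERR(ptr)


def buggy_check(ptr):
    # Treating every non-NULL value as success also accepts error pointers.
    return ptr != 0
```

In this model `buggy_check(ERR_PTR(-ENODEV))` is true even though the value encodes an error, whereas `regulator_usable` rejects it and still accepts NULL.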
* tag 'pm+acpi-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: Use EXPORT_SYMBOL() for acpi_bus_get_device()
intel_pstate: fix no_turbo
cpufreq: cpufreq-cpu0: NULL is a valid regulator, part 2
cpufreq: SPEAr: Fix incorrect variable type
PM / hibernate: Fix user space driven resume regression
Linus Torvalds [Fri, 4 Oct 2013 21:47:22 +0000 (14:47 -0700)]
Merge tag 'xfs-for-linus-v3.12-rc4' of git://oss.sgi.com/xfs/xfs
Pull xfs bugfixes from Ben Myers:
"There are lockdep annotations for project quotas, a fix for dirent
dtype support on v4 filesystems, a fix for a memory leak in recovery,
and a fix for the build error that resulted from it. D'oh"
* tag 'xfs-for-linus-v3.12-rc4' of git://oss.sgi.com/xfs/xfs:
xfs: Use kmem_free() instead of free()
xfs: fix memory leak in xlog_recover_add_to_trans
xfs: dirent dtype presence is dependent on directory magic numbers
xfs: lockdep needs to know about 3 dquot-deep nesting
Linus Torvalds [Fri, 4 Oct 2013 21:05:38 +0000 (14:05 -0700)]
selinux: remove 'flags' parameter from avc_audit()
Now avc_audit() has no more users with that parameter. Remove it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 4 Oct 2013 19:57:22 +0000 (12:57 -0700)]
selinux: avc_has_perm_flags has no more users
.. so get rid of it. The only indirect users were all the
avc_has_perm() callers which just expanded to have a zero flags
argument.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ilya Dryomov [Wed, 2 Oct 2013 17:41:01 +0000 (20:41 +0300)]
Btrfs: fix a use-after-free bug in btrfs_dev_replace_finishing
free_device rcu callback, scheduled from btrfs_rm_dev_replace_srcdev,
can be processed before btrfs_scratch_superblock is called, which would
result in a use-after-free on btrfs_device contents. Fix this by
zeroing the superblock before the rcu callback is registered.
Cc: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Ilya Dryomov [Wed, 2 Oct 2013 16:39:50 +0000 (19:39 +0300)]
Btrfs: eliminate races in worker stopping code
The current implementation of worker threads in Btrfs has races in
worker stopping code, which cause all kinds of panics and lockups when
running btrfs/011 xfstest in a loop. The problem is that
btrfs_stop_workers is unsynchronized with respect to check_idle_worker,
check_busy_worker and __btrfs_start_workers.
E.g., check_idle_worker race flow:
btrfs_stop_workers(): check_idle_worker(aworker):
- grabs the lock
- splices the idle list into the
working list
- removes the first worker from the
working list
- releases the lock to wait for
its kthread's completion
- grabs the lock
- if aworker is on the working list,
moves aworker from the working list
to the idle list
- releases the lock
- grabs the lock
- puts the worker
- removes the second worker from the
working list
......
btrfs_stop_workers returns, aworker is on the idle list
FS is umounted, memory is freed
......
aworker is woken up, fireworks ensue
With this applied, I wasn't able to trigger the problem in 48 hours,
whereas previously I could reliably reproduce at least one of these
races within an hour.
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Liu Bo [Tue, 1 Oct 2013 15:49:49 +0000 (23:49 +0800)]
Btrfs: fix crash of compressed writes
The crash[1] is found by xfstests/generic/208 with "-o compress",
it's not reproduced every time, but it does panic.
The bug is quite interesting, it's actually introduced by a recent commit
(573aecafca1cf7a974231b759197a1aebcf39c2a,
Btrfs: actually limit the size of delalloc range).
Btrfs implements delayed allocation, so during writeback, we
(1) get a page A and lock it
(2) search the state tree for delalloc bytes and lock all pages within the range
(3) process the delalloc range, including find disk space and create
ordered extent and so on.
(4) submit the page A.
It runs well in normal cases, but if we're in a racy case, eg.
buffered compressed writes and aio-dio writes,
sometimes we may fail to lock all pages in the 'delalloc' range,
in which case, we need to fall back to search the state tree again with
a smaller range limit(max_bytes = PAGE_CACHE_SIZE - offset).
The mentioned commit has a side effect, that is, in the fallback case,
we can find delalloc bytes before the index of the page we already have locked,
so we're in the case of (delalloc_end <= *start) and return with (found > 0).
This ends with not locking delalloc pages but making ->writepage still
process them, and the crash happens.
Fix this by treating that case as if nothing was found and returning to the
caller, which knows how to deal with it properly.
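The decision described above can be modelled in a few lines. This is a simplified Python sketch, not the Btrfs code: `find_lock_delalloc` and its arguments are hypothetical stand-ins, and ranges are plain inclusive byte offsets.

```python
def find_lock_delalloc(start, delalloc_start, delalloc_end, found):
    """Model the fallback decision for a locked page at offset `start`.

    Returns the number of delalloc bytes the caller should process.
    All names are hypothetical; ranges are inclusive byte offsets.
    """
    # Nothing found, or the range found ends at or before the page we
    # already hold: report "nothing found" so the caller does not
    # process pages that were never locked.
    if not found or delalloc_end <= start:
        return 0
    # Otherwise clamp the range so that it begins at the locked page.
    if delalloc_start < start:
        delalloc_start = start
    return delalloc_end - delalloc_start + 1
```

With a locked page at offset 8192, a delalloc range of 0..4095 yields 0 rather than a positive count, which is the (delalloc_end <= *start) case from the text.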
[1]:
------------[ cut here ]------------
kernel BUG at mm/page-writeback.c:2170!
[...]
CPU: 2 PID: 11755 Comm: btrfs-delalloc- Tainted: G O 3.11.0+ #8
[...]
RIP: 0010:[<ffffffff810f5093>] [<ffffffff810f5093>] clear_page_dirty_for_io+0x1e/0x83
[...]
[ 4934.248731] Stack:
[ 4934.248731] ffff8801477e5dc8 ffffea00049b9f00 ffff8801869f9ce8 ffffffffa02b841a
[ 4934.248731] 0000000000000000 0000000000000000 0000000000000fff 0000000000000620
[ 4934.248731] ffff88018db59c78 ffffea0005da8d40 ffffffffa02ff860 00000001810016c0
[ 4934.248731] Call Trace:
[ 4934.248731] [<ffffffffa02b841a>] extent_range_clear_dirty_for_io+0xcf/0xf5 [btrfs]
[ 4934.248731] [<ffffffffa02a8889>] compress_file_range+0x1dc/0x4cb [btrfs]
[ 4934.248731] [<ffffffff8104f7af>] ? detach_if_pending+0x22/0x4b
[ 4934.248731] [<ffffffffa02a8bad>] async_cow_start+0x35/0x53 [btrfs]
[ 4934.248731] [<ffffffffa02c694b>] worker_loop+0x14b/0x48c [btrfs]
[ 4934.248731] [<ffffffffa02c6800>] ? btrfs_queue_worker+0x25c/0x25c [btrfs]
[ 4934.248731] [<ffffffff810608f5>] kthread+0x8d/0x95
[ 4934.248731] [<ffffffff81060868>] ? kthread_freezable_should_stop+0x43/0x43
[ 4934.248731] [<ffffffff814fe09c>] ret_from_fork+0x7c/0xb0
[ 4934.248731] [<ffffffff81060868>] ? kthread_freezable_should_stop+0x43/0x43
[ 4934.248731] Code: ff 85 c0 0f 94 c0 0f b6 c0 59 5b 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 2c de 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 52 49 8b 84 24 80 00 00 00 f6 40 20 01 75 44
[ 4934.248731] RIP [<ffffffff810f5093>] clear_page_dirty_for_io+0x1e/0x83
[ 4934.248731] RSP <ffff8801869f9c48>
[ 4934.280307] ---[ end trace 36f06d3f8750236a ]---
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Mon, 30 Sep 2013 18:10:43 +0000 (14:10 -0400)]
Btrfs: fix transid verify errors when recovering log tree
If we crash with a log, remount and recover that log, and then crash before we
can commit another transaction we will get transid verify errors on the next
mount. This is because we were not zero'ing out the log when we committed the
transaction after recovery. This is ok as long as we commit another transaction
at some point in the future, but if you abort or something else goes wrong you
can end up in this weird state because the recovery stuff says that the tree log
should have a generation+1 of the super generation, which won't be the case of
the transaction that was started for recovery. Fix this by removing the check
and _always_ zero out the log portion of the super when we commit a transaction.
This fixes the transid verify issues I was seeing with my force errors tests.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Linus Torvalds [Fri, 4 Oct 2013 19:54:11 +0000 (12:54 -0700)]
selinux: remove 'flags' parameter from inode_has_perm
Every single user passes in '0'. I think we had non-zero users back in
some stone age when selinux_inode_permission() was implemented in terms
of inode_has_perm(), but that complicated case got split up into a
totally separate code-path so that we could optimize the much simpler
special cases.
See commit 2e33405785d3 ("SELinux: delay initialization of audit data in
selinux_inode_permission") for example.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thierry Reding [Tue, 1 Oct 2013 14:47:53 +0000 (16:47 +0200)]
xfs: Use kmem_free() instead of free()
This fixes a build failure caused by calling the free() function which
does not exist in the Linux kernel.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit aaaae98022efa4f3c31042f1fdf9e7a0c5f04663)
tinguely@sgi.com [Fri, 27 Sep 2013 14:00:55 +0000 (09:00 -0500)]
xfs: fix memory leak in xlog_recover_add_to_trans
Free the memory in error path of xlog_recover_add_to_trans().
Normally this memory is freed in recovery pass2, but is leaked
in the error path.
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 519ccb81ac1c8e3e4eed294acf93be00b43dcad6)
Dave Chinner [Sun, 29 Sep 2013 23:37:04 +0000 (09:37 +1000)]
xfs: dirent dtype presence is dependent on directory magic numbers
The determination of whether a directory entry contains a dtype
field originally was dependent on the filesystem having CRCs
enabled. This meant that the format for dtype being enabled could be
determined by checking the directory block magic number rather than
doing a feature bit check. This was useful in that it meant that we
didn't need to pass a struct xfs_mount around to functions that
were already supplied with a directory block header.
Unfortunately, the introduction of dtype fields into the v4
structure via a feature bit meant this "use the directory block
magic number" method of discriminating the dirent entry sizes is
broken. Hence we need to convert the places that use magic number
checks to use feature bit checks so that they work correctly and not
by chance.
The current code works on v4 filesystems only because the dirent
size roundup covers the extra byte needed by the dtype field in the
places where this problem occurs.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 367993e7c6428cb7617ab7653d61dca54e2fdede)
Dave Chinner [Sun, 29 Sep 2013 23:37:03 +0000 (09:37 +1000)]
xfs: lockdep needs to know about 3 dquot-deep nesting
Michael Semon reported that xfs/299 generated this lockdep warning:
=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ #2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
but task is already holding lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&xfs_dquot_other_class);
lock(&xfs_dquot_other_class);
*** DEADLOCK ***
May be due to missing lock nesting notation
7 locks held by touch/21072:
#0: (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
#1: (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
#2: (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
#3: (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
#4: (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
#5: (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
#6: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and quota
dquots. Fix the annotations to match the locking hierarchy we now
have.
Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit f112a049712a5c07de25d511c3c6587a2b1a015e)
Linus Torvalds [Fri, 4 Oct 2013 16:06:13 +0000 (09:06 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse bugfixes from Miklos Szeredi:
"This contains two more fixes by Maxim for writeback/truncate races and
fixes for RCU walk in fuse_dentry_revalidate()"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: no RCU mode in fuse_access()
fuse: readdirplus: fix RCU walk
fuse: don't check_submounts_and_drop() in RCU walk
fuse: fix fallocate vs. ftruncate race
fuse: wait for writeback in fuse_file_fallocate()
Linus Torvalds [Fri, 4 Oct 2013 16:05:12 +0000 (09:05 -0700)]
Merge tag 'iommu-fixes-v3.12-rc3' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"A couple of fixes from the IOMMU side:
- some small fixes for the new ARM-SMMU driver
- a register offset correction for VT-d
- add MAINTAINERS entry for drivers/iommu
Overall no really big or intrusive changes"
* tag 'iommu-fixes-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
x86/iommu: correct ICS register offset
MAINTAINERS: add overall IOMMU section
iommu/arm-smmu: don't enable SMMU device until probing has completed
iommu/arm-smmu: fix iommu_present() test in init
iommu/arm-smmu: fix a signedness bug
Linus Torvalds [Fri, 4 Oct 2013 16:04:26 +0000 (09:04 -0700)]
Merge tag 'arm64-stable' of git://git./linux/kernel/git/cmarinas/linux-aarch64
Pull ARM64 fixes/updates from Catalin Marinas:
- Bug-fixes (get_user/put_user, incorrect register width for ASID,
FPSIMD initialisation)
- Kconfig clean-up
- defconfig update
* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: Remove duplicate DEBUG_STACK_USAGE config
arm64: include VIRTIO_{MMIO,BLK} in defconfig
arm64: include EXT4 in defconfig
arm64: fix possible invalid FPSIMD initialization state
arm64: use correct register width when retrieving ASID
arm64: avoid multiple evaluation of ptr in get_user/put_user()
Linus Torvalds [Fri, 4 Oct 2013 16:03:51 +0000 (09:03 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Two small fixes for 3.12 only this week. I have a few more fixes
pending but those are conceptually more complex so will have to wait
for a bit longer"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix forgotten preempt_enable() when CPU has inclusive pcaches
MIPS: Alchemy: MTX-1: fix incorrect placement of __initdata tag
Linus Torvalds [Fri, 4 Oct 2013 16:03:07 +0000 (09:03 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two simplefb fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/simplefb: Mark framebuffer mem-resources as IORESOURCE_BUSY to avoid bootup warning
x86/simplefb: Fix overflow causing bogus fall-back
Linus Torvalds [Fri, 4 Oct 2013 16:02:35 +0000 (09:02 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"Frederic's minimal fix for hardirq/softirq nesting crashes"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq: Force hardirq exit's softirq processing on its own stack
Michael Grzeschik [Tue, 17 Sep 2013 13:56:08 +0000 (15:56 +0200)]
dmaengine: imx-dma: fix callback path in tasklet
We need to free the ld_active list head before jumping into the callback
routine. Otherwise the callback could run into issue_pending and change
the ld_active list head that we are about to free. This would leave the channel
list in a corrupted and undefined state.
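The ordering described above can be sketched with a toy model. This is a Python illustration only, not the driver code: the `Channel` class and its `active`/`free` lists are hypothetical stand-ins for the channel's ld_active and ld_free lists.

```python
from collections import deque


class Channel:
    """Toy DMA channel with an active list and a free list (hypothetical)."""

    def __init__(self):
        self.active = deque()
        self.free = deque()

    def issue_pending(self, desc):
        # A completion callback may queue new work on the same channel.
        self.active.append(desc)

    def complete_head(self, callback):
        # Retire the finished descriptor first ...
        desc = self.active.popleft()
        self.free.append(desc)
        # ... and only then run the callback, which may modify `active`.
        callback(self)
        return desc
```

Because the head is moved to the free list before the callback runs, a callback that calls `issue_pending` only ever sees descriptors that are still outstanding.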
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Michael Grzeschik [Tue, 17 Sep 2013 13:56:07 +0000 (15:56 +0200)]
dmaengine: imx-dma: fix lockdep issue between irqhandler and tasklet
The tasklet and irqhandler are using spin_lock while other routines are
using spin_lock_irqsave/restore. This leads to lockdep issues as
described below. This patch changes the code to use
spin_lock_irqsave/restore in both code paths.
As imxdma_xfer_desc always gets called with spin_lock_irqsave lock held,
this patch also removes the spare call inside the routine to avoid
double locking.
[ 403.358162] =================================
[ 403.362549] [ INFO: inconsistent lock state ]
[ 403.366945] 3.10.0-20130823+ #904 Not tainted
[ 403.371331] ---------------------------------
[ 403.375721] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 403.381769] swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 403.386762] (&(&imxdma->lock)->rlock){?.-...}, at: [<c019d77c>] imxdma_tasklet+0x20/0x134
[ 403.395201] {IN-HARDIRQ-W} state was registered at:
[ 403.400108] [<c004b264>] mark_lock+0x2a0/0x6b4
[ 403.404798] [<c004d7c8>] __lock_acquire+0x650/0x1a64
[ 403.410004] [<c004f15c>] lock_acquire+0x94/0xa8
[ 403.414773] [<c02f74e4>] _raw_spin_lock+0x54/0x8c
[ 403.419720] [<c019d094>] dma_irq_handler+0x78/0x254
[ 403.424845] [<c0061124>] handle_irq_event_percpu+0x38/0x1b4
[ 403.430670] [<c00612e4>] handle_irq_event+0x44/0x64
[ 403.435789] [<c0063a70>] handle_level_irq+0xd8/0xf0
[ 403.440903] [<c0060a20>] generic_handle_irq+0x28/0x38
[ 403.446194] [<c0009cc4>] handle_IRQ+0x68/0x8c
[ 403.450789] [<c0008714>] avic_handle_irq+0x3c/0x48
[ 403.455811] [<c0008f84>] __irq_svc+0x44/0x74
[ 403.460314] [<c0040b04>] cpu_startup_entry+0x88/0xf4
[ 403.465525] [<c02f00d0>] rest_init+0xb8/0xe0
[ 403.470045] [<c03e07dc>] start_kernel+0x28c/0x2d4
[ 403.474986] [<a0008040>] 0xa0008040
[ 403.478709] irq event stamp: 50854
[ 403.482140] hardirqs last enabled at (50854): [<c001c6b8>] tasklet_action+0x38/0xdc
[ 403.489954] hardirqs last disabled at (50853): [<c001c6a0>] tasklet_action+0x20/0xdc
[ 403.497761] softirqs last enabled at (50850): [<c001bc64>] _local_bh_enable+0x14/0x18
[ 403.505741] softirqs last disabled at (50851): [<c001c268>] irq_exit+0x88/0xdc
[ 403.513026]
[ 403.513026] other info that might help us debug this:
[ 403.519593] Possible unsafe locking scenario:
[ 403.519593]
[ 403.525548] CPU0
[ 403.528020] ----
[ 403.530491] lock(&(&imxdma->lock)->rlock);
[ 403.534828] <Interrupt>
[ 403.537474] lock(&(&imxdma->lock)->rlock);
[ 403.541983]
[ 403.541983] *** DEADLOCK ***
[ 403.541983]
[ 403.547951] no locks held by swapper/0.
[ 403.551813]
[ 403.551813] stack backtrace:
[ 403.556222] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-20130823+ #904
[ 403.563039] Backtrace:
[ 403.565581] [<c000b98c>] (dump_backtrace+0x0/0x10c) from [<c000bb28>] (show_stack+0x18/0x1c)
[ 403.574054] r6:00000000 r5:c05c51d8 r4:c040bd58 r3:00200000
[ 403.579872] [<c000bb10>] (show_stack+0x0/0x1c) from [<c02f398c>] (dump_stack+0x20/0x28)
[ 403.587955] [<c02f396c>] (dump_stack+0x0/0x28) from [<c02f29c8>] (print_usage_bug.part.28+0x224/0x28c)
[ 403.597340] [<c02f27a4>] (print_usage_bug.part.28+0x0/0x28c) from [<c004b404>] (mark_lock+0x440/0x6b4)
[ 403.606682] r8:c004a41c r7:00000000 r6:c040bd58 r5:c040c040 r4:00000002
[ 403.613566] [<c004afc4>] (mark_lock+0x0/0x6b4) from [<c004d844>] (__lock_acquire+0x6cc/0x1a64)
[ 403.622244] [<c004d178>] (__lock_acquire+0x0/0x1a64) from [<c004f15c>] (lock_acquire+0x94/0xa8)
[ 403.631010] [<c004f0c8>] (lock_acquire+0x0/0xa8) from [<c02f74e4>] (_raw_spin_lock+0x54/0x8c)
[ 403.639614] [<c02f7490>] (_raw_spin_lock+0x0/0x8c) from [<c019d77c>] (imxdma_tasklet+0x20/0x134)
[ 403.648434] r6:c3847010 r5:c040e890 r4:c38470d4
[ 403.653194] [<c019d75c>] (imxdma_tasklet+0x0/0x134) from [<c001c70c>] (tasklet_action+0x8c/0xdc)
[ 403.662013] r8:c0599160 r7:00000000 r6:00000000 r5:c040e890 r4:c3847114 r3:c019d75c
[ 403.670042] [<c001c680>] (tasklet_action+0x0/0xdc) from [<c001bd4c>] (__do_softirq+0xe4/0x1f0)
[ 403.678687] r7:00000101 r6:c0402000 r5:c059919c r4:00000001
[ 403.684498] [<c001bc68>] (__do_softirq+0x0/0x1f0) from [<c001c268>] (irq_exit+0x88/0xdc)
[ 403.692652] [<c001c1e0>] (irq_exit+0x0/0xdc) from [<c0009cc8>] (handle_IRQ+0x6c/0x8c)
[ 403.700514] r4:00000030 r3:00000110
[ 403.704192] [<c0009c5c>] (handle_IRQ+0x0/0x8c) from [<c0008714>] (avic_handle_irq+0x3c/0x48)
[ 403.712664] r5:c0403f28 r4:c0593ebc
[ 403.716343] [<c00086d8>] (avic_handle_irq+0x0/0x48) from [<c0008f84>] (__irq_svc+0x44/0x74)
[ 403.724733] Exception stack(0xc0403f28 to 0xc0403f70)
[ 403.729841] 3f20: 00000001 00000004 00000000 20000013 c0402000 c04104a8
[ 403.738078] 3f40: 00000002 c0b69620 a0004000 41069264 a03fb5f4 c0403f7c c0403f40 c0403f70
[ 403.746301] 3f60: c004b92c c0009e74 20000013 ffffffff
[ 403.751383] r6:ffffffff r5:20000013 r4:c0009e74 r3:c004b92c
[ 403.757210] [<c0009e30>] (arch_cpu_idle+0x0/0x4c) from [<c0040b04>] (cpu_startup_entry+0x88/0xf4)
[ 403.766161] [<c0040a7c>] (cpu_startup_entry+0x0/0xf4) from [<c02f00d0>] (rest_init+0xb8/0xe0)
[ 403.774753] [<c02f0018>] (rest_init+0x0/0xe0) from [<c03e07dc>] (start_kernel+0x28c/0x2d4)
[ 403.783051] r6:c03fc484 r5:ffffffff r4:c040a0e0
[ 403.787797] [<c03e0550>] (start_kernel+0x0/0x2d4) from [<a0008040>] (0xa0008040)
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>