review.tizen.org Git - platform/kernel/linux-exynos.git/log

USB: don't free bandwidth_mutex too early

[ Upstream commit ab2a4bf83902c170d29ba130a8abb5f9d90559e1 ]

The USB core contains a bug that can show up when a USB-3 host
controller is removed.  If the primary (USB-2) hcd structure is
released before the shared (USB-3) hcd, the core will try to do a
double-free of the common bandwidth_mutex.

The problem was described in graphical form by Chung-Geol Kim, who
first reported it:

=================================================
     At *remove USB(3.0) Storage
     sequence <1> --> <5> ((Problem Case))
=================================================
                                  VOLD
------------------------------------|------------
                                 (uevent)
                            ________|_________
                           |<1>               |
                           |dwc3_otg_sm_work  |
                           |usb_put_hcd       |
                           |peer_hcd(kref=2)|
                           |__________________|
                            ________|_________
                           |<2>               |
                           |New USB BUS #2    |
                           |                  |
                           |peer_hcd(kref=1)  |
                           |                  |
                         --(Link)-bandXX_mutex|
                         | |__________________|
                         |
    ___________________  |
   |<3>                | |
   |dwc3_otg_sm_work   | |
   |usb_put_hcd        | |
   |primary_hcd(kref=1)| |
   |___________________| |
    _________|_________  |
   |<4>                | |
   |New USB BUS #1     | |
   |hcd_release        | |
   |primary_hcd(kref=0)| |
   |                   | |
   |bandXX_mutex(free) |<-
   |___________________|
                               (( VOLD ))
                            ______|___________
                           |<5>               |
                           |      SCSI        |
                           |usb_put_hcd       |
                           |peer_hcd(kref=0)  |
                           |*hcd_release      |
                           |bandXX_mutex(free*)|<- double free
                           |__________________|

=================================================

This happens because hcd_release() frees the bandwidth_mutex whenever
it sees a primary hcd being released (which is not a very good idea
in any case), but in the course of releasing the primary hcd, it
changes the pointers in the shared hcd in such a way that the shared
hcd will appear to be primary when it gets released.

This patch fixes the problem by changing hcd_release() so that it
deallocates the bandwidth_mutex only when the _last_ hcd structure
referencing it is released.  The patch also removes an unnecessary
test, so that when an hcd is released, both the shared_hcd and
primary_hcd pointers in the hcd's peer will be cleared.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Chung-Geol Kim <chunggeol.kim@samsung.com>
Tested-by: Chung-Geol Kim <chunggeol.kim@samsung.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

make nfs_atomic_open() call d_drop() on all ->open_context() errors.

[ Upstream commit d20cb71dbf3487f24549ede1a8e2d67579b4632e ]

In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code"
unconditional d_drop() after the ->open_context() had been removed.  It had
been correct for success cases (there ->open_context() itself had been doing
dcache manipulations), but not for error ones.  Only one of those (ENOENT)
got a compensatory d_drop() added in that commit, but in fact it should've
been done for all errors.  As it is, the case of O_CREAT non-exclusive open
on a hashed negative dentry racing with e.g. symlink creation from another
client ended up with ->open_context() getting an error and proceeding to
call nfs_lookup().  On a hashed dentry, which would've instantly triggered
BUG_ON() in d_materialise_unique() (or, these days, its equivalent in
d_splice_alias()).

Cc: stable@vger.kernel.org # v3.10+
Tested-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

KVM: arm/arm64: Stop leaking vcpu pid references

[ Upstream commit 591d215afcc2f94e8e2c69a63c924c044677eb31 ]

kvm provides kvm_vcpu_uninit(), which amongst other things, releases the
last reference to the struct pid of the task that was last running the vcpu.

On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool,
then killing it with SIGKILL results (after some considerable time) in:
> cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff80007d5ea080 (size 128):
>  comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s)
>  hex dump (first 32 bytes):
>    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>  backtrace:
>    [<ffff8000001b30ec>] create_object+0xfc/0x278
>    [<ffff80000071da34>] kmemleak_alloc+0x34/0x70
>    [<ffff80000019fa2c>] kmem_cache_alloc+0x16c/0x1d8
>    [<ffff8000000d0474>] alloc_pid+0x34/0x4d0
>    [<ffff8000000b5674>] copy_process.isra.6+0x79c/0x1338
>    [<ffff8000000b633c>] _do_fork+0x74/0x320
>    [<ffff8000000b66b0>] SyS_clone+0x18/0x20
>    [<ffff800000085cb0>] el0_svc_naked+0x24/0x28
>    [<ffffffffffffffff>] 0xffffffffffffffff

On x86 kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm(),
on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free().

Signed-off-by: James Morse <james.morse@arm.com>
Fixes: 749cf76c5a36 ("KVM: ARM: Initial skeleton to compile KVM support")
Cc: <stable@vger.kernel.org> # 3.10+
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

powerpc/tm: Always reclaim in start_thread() for exec() class syscalls

[ Upstream commit 8e96a87c5431c256feb65bcfc5aec92d9f7839b6 ]

Userspace can quite legitimately perform an exec() syscall with a
suspended transaction. exec() does not return to the old process, rather
it load a new one and starts that, the expectation therefore is that the
new process starts not in a transaction. Currently exec() is not treated
any differently to any other syscall which creates problems.

Firstly it could allow a new process to start with a suspended
transaction for a binary that no longer exists. This means that the
checkpointed state won't be valid and if the suspended transaction were
ever to be resumed and subsequently aborted (a possibility which is
exceedingly likely as exec()ing will likely doom the transaction) the
new process will jump to invalid state.

Secondly the incorrect attempt to keep the transactional state while
still zeroing state for the new process creates at least two TM Bad
Things. The first triggers on the rfid to return to userspace as
start_thread() has given the new process a 'clean' MSR but the suspend
will still be set in the hardware MSR. The second TM Bad Thing triggers
in __switch_to() as the processor is still transactionally suspended but
__switch_to() wants to zero the TM sprs for the new process.

This is an example of the outcome of calling exec() with a suspended
transaction. Note the first 700 is likely the first TM bad thing
decsribed earlier only the kernel can't report it as we've loaded
userspace registers. c000000000009980 is the rfid in
fast_exception_return()

  Bad kernel stack pointer 3fffcfa1a370 at c000000000009980
  Oops: Bad kernel stack pointer, sig: 6 [#1]
  CPU: 0 PID: 2006 Comm: tm-execed Not tainted
  NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000
  REGS: c00000003ffefd40 TRAP: 0700   Not tainted
  MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]>  CR: 00000000  XER: 00000000
  CFAR: c0000000000098b4 SOFTE: 0
  PACATMSCRATCH: b00000010000d033
  GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000
  GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000
  NIP [c000000000009980] fast_exception_return+0xb0/0xb8
  LR [0000000000000000]           (null)
  Call Trace:
  Instruction dump:
  f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070
  e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b

  Kernel BUG at c000000000043e80 [verbose debug info unavailable]
  Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033)
  Oops: Unrecoverable exception, sig: 6 [#2]
  CPU: 0 PID: 2006 Comm: tm-execed Tainted: G      D
  task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
  NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000
  REGS: c00000003ffef7e0 TRAP: 0700   Tainted: G      D
  MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]>  CR: 28002828  XER: 00000000
  CFAR: c000000000015a20 SOFTE: 0
  PACATMSCRATCH: b00000010000d033
  GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000
  GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000
  GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004
  GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000
  GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000
  GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80
  NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c
  LR [c000000000015a24] __switch_to+0x1f4/0x420
  Call Trace:
  Instruction dump:
  7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
  4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020

This fixes CVE-2016-5828.

Fixes: bc2a9408fa65 ("powerpc: Hook in new transactional memory code")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

fs/nilfs2: fix potential underflow in call to crc32_le

[ Upstream commit 63d2f95d63396059200c391ca87161897b99e74a ]

The value `bytes' comes from the filesystem which is about to be
mounted.  We cannot trust that the value is always in the range we
expect it to be.

Check its value before using it to calculate the length for the crc32_le
call.  It value must be larger (or equal) sumoff + 4.

This fixes a kernel bug when accidentially mounting an image file which
had the nilfs2 magic value 0x3434 at the right offset 0x406 by chance.
The bytes 0x01 0x00 were stored at 0x408 and were interpreted as a
s_bytes value of 1.  This caused an underflow when substracting sumoff +
4 (20) in the call to crc32_le.

  BUG: unable to handle kernel paging request at ffff88021e600000
  IP:  crc32_le+0x36/0x100
  ...
  Call Trace:
    nilfs_valid_sb.part.5+0x52/0x60 [nilfs2]
    nilfs_load_super_block+0x142/0x300 [nilfs2]
    init_nilfs+0x60/0x390 [nilfs2]
    nilfs_mount+0x302/0x520 [nilfs2]
    mount_fs+0x38/0x160
    vfs_kern_mount+0x67/0x110
    do_mount+0x269/0xe00
    SyS_mount+0x9f/0x100
    entry_SYSCALL_64_fastpath+0x16/0x71

Link: http://lkml.kernel.org/r/1466778587-5184-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mm, compaction: abort free scanner if split fails

[ Upstream commit a4f04f2c6955aff5e2c08dcb40aca247ff4d7370 ]

If the memory compaction free scanner cannot successfully split a free
page (only possible due to per-zone low watermark), terminate the free
scanner rather than continuing to scan memory needlessly. If the
watermark is insufficient for a free page of order <= cc->order, then
terminate the scanner since all future splits will also likely fail.

This prevents the compaction freeing scanner from scanning all memory on
very large zones (very noticeable for zones > 128GB, for instance) when
all splits will likely fail while holding zone->lock.

compaction_alloc() iterating a 128GB zone has been benchmarked to take
over 400ms on some systems whereas any free page isolated and ready to
be split ends up failing in split_free_page() because of the low
watermark check and thus the iteration continues.

The next time compaction occurs, the freeing scanner will likely start
at the end of the zone again since no success was made previously and we
get the same lengthy iteration until the zone is brought above the low
watermark. All thp page faults can take >400ms in such a state without
this fix.

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606211820350.97086@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mm, compaction: skip compound pages by order in free scanner

[ Upstream commit 9fcd6d2e052eef525e94a9ae58dbe7ed4df4f5a7 ]

The compaction free scanner is looking for PageBuddy() pages and
skipping all others. For large compound pages such as THP or hugetlbfs,
we can save a lot of iterations if we skip them at once using their
compound_order(). This is generally unsafe and we can read a bogus
value of order due to a race, but if we are careful, the only danger is
skipping too much.

When tested with stress-highalloc from mmtests on 4GB system with 1GB
hugetlbfs pages, the vmstat compact_free_scanned count decreased by at
least 15%.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mm/swap.c: flush lru pvecs on compound page arrival

[ Upstream commit 8f182270dfec432e93fae14f9208a6b9af01009f ]

Currently we can have compound pages held on per cpu pagevecs, which
leads to a lot of memory unavailable for reclaim when needed.  In the
systems with hundreads of processors it can be GBs of memory.

On of the way of reproducing the problem is to not call munmap
explicitly on all mapped regions (i.e.  after receiving SIGTERM).  After
that some pages (with THP enabled also huge pages) may end up on
lru_add_pvec, example below.

  void main() {
  #pragma omp parallel
  {
size_t size = 55 * 1000 * 1000; // smaller than  MEM/CPUS
void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS , -1, 0);
if (p != MAP_FAILED)
memset(p, 0, size);
//munmap(p, size); // uncomment to make the problem go away
  }
  }

When we run it with THP enabled it will leave significant amount of
memory on lru_add_pvec.  This memory will be not reclaimed if we hit
OOM, so when we run above program in a loop:

for i in `seq 100`; do ./a.out; done

many processes (95% in my case) will be killed by OOM.

The primary point of the LRU add cache is to save the zone lru_lock
contention with a hope that more pages will belong to the same zone and
so their addition can be batched.  The huge page is already a form of
batched addition (it will add 512 worth of memory in one go) so skipping
the batching seems like a safer option when compared to a potential
excess in the caching which can be quite large and much harder to fix
because lru_add_drain_all is way to expensive and it is not really clear
what would be a good moment to call it.

Similarly we can reproduce the problem on lru_deactivate_pvec by adding:
madvise(p, size, MADV_FREE); after memset.

This patch flushes lru pvecs on compound page arrival making the problem
less severe - after applying it kill rate of above example drops to 0%,
due to reducing maximum amount of memory held on pvec from 28MB (with
THP) to 56kB per CPU.

Suggested-by: Michal Hocko <mhocko@suse.com>
Link: http://lkml.kernel.org/r/1466180198-18854-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Ming Li <mingli199x@qq.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

tmpfs: don't undo fallocate past its last page

[ Upstream commit b9b4bb26af017dbe930cd4df7f9b2fc3a0497bfe ]

When fallocate is interrupted it will undo a range that extends one byte
past its range of allocated pages. This can corrupt an in-use page by
zeroing out its first byte. Instead, undo using the inclusive byte
range.

Fixes: 1635f6a74152f1d ("tmpfs: undo fallocation on failure")
Link: http://lkml.kernel.org/r/1462713387-16724-1-git-send-email-anthony.romano@coreos.com
Signed-off-by: Anthony Romano <anthony.romano@coreos.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Brandon Philips <brandon@ifup.co>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

USB: EHCI: declare hostpc register as zero-length array

[ Upstream commit 7e8b3dfef16375dbfeb1f36a83eb9f27117c51fd ]

The HOSTPC extension registers found in some EHCI implementations form
a variable-length array, with one element for each port. Therefore
the hostpc field in struct ehci_regs should be declared as a
zero-length array, not a single-element array.

This fixes a problem reported by UBSAN.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

File names with trailing period or space need special case conversion

[ Upstream commit 45e8a2583d97ca758a55c608f78c4cef562644d1 ]

POSIX allows files with trailing spaces or a trailing period but
SMB3 does not, so convert these using the normal Services For Mac
mapping as we do for other reserved characters such as
: < > | ? *
This is similar to what Macs do for the same problem over SMB3.

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <steve.french@primarydata.com>
Acked-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Fix reconnect to not defer smb3 session reconnect long after socket reconnect

[ Upstream commit 4fcd1813e6404dd4420c7d12fb483f9320f0bf93 ]

Azure server blocks clients that open a socket and don't do anything on it.
In our reconnect scenarios, we can reconnect the tcp session and
detect the socket is available but we defer the negprot and SMB3 session
setup and tree connect reconnection until the next i/o is requested, but
this looks suspicous to some servers who expect SMB3 negprog and session
setup soon after a socket is created.

In the echo thread, reconnect SMB3 sessions and tree connections
that are disconnected. A later patch will replay persistent (and
resilient) handle opens.

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <steve.french@primarydata.com>
Acked-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

pnfs_nfs: fix _cancel_empty_pagelist

[ Upstream commit 5e3a98883e7ebdd1440f829a9e9dd5c3d2c5903b ]

pnfs_generic_commit_cancel_empty_pagelist calls nfs_commitdata_release,
but that is wrong: nfs_commitdata_release puts the open context, something
that isn't valid until nfs_init_commit is called, which is never the case
when pnfs_generic_commit_cancel_empty_pagelist is called.

This was introduced in "nfs: avoid race that crashes nfs_init_commit".

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

nfs: avoid race that crashes nfs_init_commit

[ Upstream commit ade8febde0271513360bac44883dbebad44276c3 ]

Since the patch "NFS: Allow multiple commit requests in flight per file"
we can run multiple simultaneous commits on the same inode.  This
introduced a race over collecting pages to commit that made it possible
to call nfs_init_commit() with an empty list - which causes crashes like
the one below.

The fix is to catch this race and avoid calling nfs_init_commit and
initiate_commit when there is no work to do.

Here is the crash:

[600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[600522.078475] IP: [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs]
[600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0
[600522.078972] Oops: 0000 [#1] SMP
[600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3
[600522.081380]  vmw_pvscsi ata_generic pata_acpi
[600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1
[600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
[600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000
[600522.083378] RIP: 0010:[<ffffffffa0479e72>]  [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs]
[600522.083973] RSP: 0018:ffff88042ae87438  EFLAGS: 00010246
[600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588
[600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40
[600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0
[600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0
[600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840
[600522.087484] FS:  00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
[600522.088070] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0
[600522.089327] Stack:
[600522.089926]  0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa
[600522.090549]  0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930
[600522.091169]  ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840
[600522.091789] Call Trace:
[600522.092420]  [<ffffffffa04f09fa>] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4]
[600522.093052]  [<ffffffffa0563d80>] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles]
[600522.093696]  [<ffffffffa0562645>] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles]
[600522.094359]  [<ffffffffa047bc78>] nfs_generic_commit_list+0xe8/0x120 [nfs]
[600522.095032]  [<ffffffffa047bd6a>] nfs_commit_inode+0xba/0x110 [nfs]
[600522.095719]  [<ffffffffa046ac54>] nfs_release_page+0x44/0xd0 [nfs]
[600522.096410]  [<ffffffff811a8122>] try_to_release_page+0x32/0x50
[600522.097109]  [<ffffffff811bd4f1>] shrink_page_list+0x961/0xb30
[600522.097812]  [<ffffffff811bdced>] shrink_inactive_list+0x1cd/0x550
[600522.098530]  [<ffffffff811bea65>] shrink_lruvec+0x635/0x840
[600522.099250]  [<ffffffff811bed60>] shrink_zone+0xf0/0x2f0
[600522.099974]  [<ffffffff811bf312>] do_try_to_free_pages+0x192/0x470
[600522.100709]  [<ffffffff811bf6ca>] try_to_free_pages+0xda/0x170
[600522.101464]  [<ffffffff811b2198>] __alloc_pages_nodemask+0x588/0x970
[600522.102235]  [<ffffffff811fbbd5>] alloc_pages_vma+0xb5/0x230
[600522.103000]  [<ffffffff813a1589>] ? cpumask_any_but+0x39/0x50
[600522.103774]  [<ffffffff811d6115>] wp_page_copy.isra.55+0x95/0x490
[600522.104558]  [<ffffffff810e3438>] ? __wake_up+0x48/0x60
[600522.105357]  [<ffffffff811d7d3b>] do_wp_page+0xab/0x4f0
[600522.106137]  [<ffffffff810a1bbb>] ? release_task+0x36b/0x470
[600522.106902]  [<ffffffff8126dbd7>] ? eventfd_ctx_read+0x67/0x1c0
[600522.107659]  [<ffffffff811da2a8>] handle_mm_fault+0xc78/0x1900
[600522.108431]  [<ffffffff81067ef1>] __do_page_fault+0x181/0x420
[600522.109173]  [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280
[600522.109893]  [<ffffffff810681c0>] do_page_fault+0x30/0x80
[600522.110594]  [<ffffffff81024f36>] ? syscall_trace_leave+0xc6/0x120
[600522.111288]  [<ffffffff81790a58>] page_fault+0x28/0x30
[600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce <48> 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08
[600522.113343] RIP  [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs]
[600522.114003]  RSP <ffff88042ae87438>
[600522.114636] CR2: 0000000000000040

Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file)
CC: stable@vger.kernel.org
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

pNFS: Tighten up locking around DS commit buckets

[ Upstream commit 27571297a7e9a2a845c232813a7ba7e1227f5ec6 ]

I'm not aware of any bugreports around this issue, but the locking
around the pnfs_commit_bucket is inconsistent at best. This patch
tightens it up by ensuring that the 'bucket->committing' list is always
changed atomically w.r.t. the 'bucket->clseg' layout segment tracking.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: dummy: Fix a use-after-free at closing

[ Upstream commit d5dbbe6569481bf12dcbe3e12cff72c5f78d272c ]

syzkaller fuzzer spotted a potential use-after-free case in snd-dummy
driver when hrtimer is used as backend:
> ==================================================================
> BUG: KASAN: use-after-free in rb_erase+0x1b17/0x2010 at addr ffff88005e5b6f68
>  Read of size 8 by task syz-executor/8984
> =============================================================================
> BUG kmalloc-192 (Not tainted): kasan: bad access detected
> -----------------------------------------------------------------------------
>
> Disabling lock debugging due to kernel taint
> INFO: Allocated in 0xbbbbbbbbbbbbbbbb age=18446705582212484632
> ....
> [<      none      >] dummy_hrtimer_create+0x49/0x1a0 sound/drivers/dummy.c:464
> ....
> INFO: Freed in 0xfffd8e09 age=18446705496313138713 cpu=2164287125 pid=-1
> [<      none      >] dummy_hrtimer_free+0x68/0x80 sound/drivers/dummy.c:481
> ....
> Call Trace:
>  [<ffffffff8179e59e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:333
>  [<     inline     >] rb_set_parent include/linux/rbtree_augmented.h:111
>  [<     inline     >] __rb_erase_augmented include/linux/rbtree_augmented.h:218
>  [<ffffffff82ca5787>] rb_erase+0x1b17/0x2010 lib/rbtree.c:427
>  [<ffffffff82cb02e8>] timerqueue_del+0x78/0x170 lib/timerqueue.c:86
>  [<ffffffff814d0c80>] __remove_hrtimer+0x90/0x220 kernel/time/hrtimer.c:903
>  [<     inline     >] remove_hrtimer kernel/time/hrtimer.c:945
>  [<ffffffff814d23da>] hrtimer_try_to_cancel+0x22a/0x570 kernel/time/hrtimer.c:1046
>  [<ffffffff814d2742>] hrtimer_cancel+0x22/0x40 kernel/time/hrtimer.c:1066
>  [<ffffffff85420531>] dummy_hrtimer_stop+0x91/0xb0 sound/drivers/dummy.c:417
>  [<ffffffff854228bf>] dummy_pcm_trigger+0x17f/0x1e0 sound/drivers/dummy.c:507
>  [<ffffffff85392170>] snd_pcm_do_stop+0x160/0x1b0 sound/core/pcm_native.c:1106
>  [<ffffffff85391b26>] snd_pcm_action_single+0x76/0x120 sound/core/pcm_native.c:956
>  [<ffffffff85391e01>] snd_pcm_action+0x231/0x290 sound/core/pcm_native.c:974
>  [<     inline     >] snd_pcm_stop sound/core/pcm_native.c:1139
>  [<ffffffff8539754d>] snd_pcm_drop+0x12d/0x1d0 sound/core/pcm_native.c:1784
>  [<ffffffff8539d3be>] snd_pcm_common_ioctl1+0xfae/0x2150 sound/core/pcm_native.c:2805
>  [<ffffffff8539ee91>] snd_pcm_capture_ioctl1+0x2a1/0x5e0 sound/core/pcm_native.c:2976
>  [<ffffffff8539f2ec>] snd_pcm_kernel_ioctl+0x11c/0x160 sound/core/pcm_native.c:3020
>  [<ffffffff853d9a44>] snd_pcm_oss_sync+0x3a4/0xa30 sound/core/oss/pcm_oss.c:1693
>  [<ffffffff853da27d>] snd_pcm_oss_release+0x1ad/0x280 sound/core/oss/pcm_oss.c:2483
>  .....

A workaround is to call hrtimer_cancel() in dummy_hrtimer_sync() which
is called certainly before other blocking ops.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup

[ Upstream commit 0f087ee3f3b86a4507db4ff1d2d5a3880e4cfd16 ]

See: https://bugzilla.redhat.com/show_bug.cgi?id=1349539
See: https://bugzilla.kernel.org/show_bug.cgi?id=120961

Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: hda - remove one pin from ALC292_STANDARD_PINS

[ Upstream commit 21e9d017b88ea0baa367ef0b6516d794fa23e85e ]

One more Dell laptop with alc293 codec needs
ALC293_FIXUP_DELL1_MIC_NO_PRESENCE, but the pin 0x1e does not match
the corresponding one in the ALC292_STANDARD_PINS. To use this macro
for this machine, we need to remove pin 0x1e from it.

BugLink: https://bugs.launchpad.net/bugs/1476888
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands

[ Upstream commit 93a2001bdfd5376c3dc2158653034c20392d15c5 ]

This patch validates the num_values parameter from userland during the
HIDIOCGUSAGES and HIDIOCSUSAGES commands. Previously, if the report id was set
to HID_REPORT_ID_UNKNOWN, we would fail to validate the num_values parameter
leading to a heap overflow.

Cc: stable@vger.kernel.org
Signed-off-by: Scott Bauer <sbauer@plzdonthack.me>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

cifs: dynamic allocation of ntlmssp blob

[ Upstream commit b8da344b74c822e966c6d19d6b2321efe82c5d97 ]

In sess_auth_rawntlmssp_authenticate(), the ntlmssp blob is allocated
statically and its size is an "empirical" 5*sizeof(struct
_AUTHENTICATE_MESSAGE) (320B on x86_64). I don't know where this value
comes from or if it was ever appropriate, but it is currently
insufficient: the user and domain name in UTF16 could take 1kB by
themselves. Because of that, build_ntlmssp_auth_blob() might corrupt
memory (out-of-bounds write). The size of ntlmssp_blob in
SMB2_sess_setup() is too small too (sizeof(struct _NEGOTIATE_MESSAGE)
+ 500).

This patch allocates the blob dynamically in
build_ntlmssp_auth_blob().

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Input: vmmouse - remove port reservation

[ Upstream commit 60842ef8128e7bf58c024814cd0dc14319232b6c ]

The VMWare EFI BIOS will expose port 0x5658 as an ACPI resource.  This
causes the port to be reserved by the APCI module as the system comes up,
making it unavailable to be reserved again by other drivers, thus
preserving this VMWare port for special use in a VMWare guest.

This port is designed to be shared among multiple VMWare services, such as
the VMMOUSE.  Because of this, VMMOUSE should not try to reserve this port
on its own.

The VMWare non-EFI BIOS does not do this to preserve compatibility with
existing/legacy VMs.  It is known that there is small chance a VM may be
configured such that these ports get reserved by other non-VMWare devices,
and if this ever happens, the result is undefined.

Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: <stable@vger.kernel.org> # 4.1-
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/nouveau: fix for disabled fbdev emulation

[ Upstream commit 52dfcc5ccfbb6697ac3cac7f7ff1e712760e1216 ]

Hello,

after this commit:

commit f045f459d925138fe7d6193a8c86406bda7e49da
Author: Ben Skeggs <bskeggs@redhat.com>
Date: Thu Jun 2 12:23:31 2016 +1000
drm/nouveau/fbcon: fix out-of-bounds memory accesses

kernel started to oops when loading nouveau module when using GTX 780 Ti
video adapter. This patch fixes the problem.

Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=120591

Signed-off-by: Dmitrii Tcvetkov <demfloro@demfloro.ru>
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: f045f459d925 ("nouveau_fbcon_init()")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Input: elantech - add more IC body types to the list

[ Upstream commit 226ba707744a51acb4244724e09caacb1d96aed9 ]

The touchpad in HP Pavilion 14-ab057ca reports it's version as 12 and
according to Elan both 11 and 12 are valid IC types and should be
identified as hw_version 4.

Reported-by: Patrick Lessard <Patrick.Lessard@cogeco.com>
Tested-by: Patrick Lessard <Patrick.Lessard@cogeco.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Input: wacom_w8001 - w8001_MAX_LENGTH should be 13

[ Upstream commit 12afb34400eb2b301f06b2aa3535497d14faee59 ]

Somehow the patch that added two-finger touch support forgot to update
W8001_MAX_LENGTH from 11 to 13.

Signed-off-by: Ping Cheng <pingc@wacom.com>
Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

xen/pciback: Fix conf_space read/write overlap check.

[ Upstream commit 02ef871ecac290919ea0c783d05da7eedeffc10e ]

Current overlap check is evaluating to false a case where a filter
field is fully contained (proper subset) of a r/w request. This
change applies classical overlap check instead to include all the
scenarios.

More specifically, for (Hilscher GmbH CIFX 50E-DP(M/S)) device driver
the logic is such that the entire confspace is read and written in 4
byte chunks. In this case as an example, CACHE_LINE_SIZE,
LATENCY_TIMER and PCI_BIST are arriving together in one call to
xen_pcibk_config_write() with offset == 0xc and size == 4. With the
exsisting overlap check the LATENCY_TIMER field (offset == 0xd, length
== 1) is fully contained in the write request and hence is excluded
from write, which is incorrect.

Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

can: fix oops caused by wrong rtnl dellink usage

[ Upstream commit 25e1ed6e64f52a692ba3191c4fde650aab3ecc07 ]

For 'real' hardware CAN devices the netlink interface is used to set CAN
specific communication parameters. Real CAN hardware can not be created nor
removed with the ip tool ...

This patch adds a private dellink function for the CAN device driver interface
that does just nothing.

It's a follow up to commit 993e6f2fd ("can: fix oops caused by wrong rtnl
newlink usage") but for dellink.

Reported-by: ajneu <ajneu1@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

can: fix handling of unmodifiable configuration options fix

[ Upstream commit bce271f255dae8335dc4d2ee2c4531e09cc67f5a ]

With upstream commit bb208f144cf3f59 (can: fix handling of unmodifiable
configuration options) a new can_validate() function was introduced.

When invoking 'ip link set can0 type can' without any configuration data
can_validate() tries to validate the content without taking into account that
there's totally no content. This patch adds a check for missing content.

Reported-by: ajneu <ajneu1@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

UBIFS: Implement ->migratepage()

[ Upstream commit 4ac1c17b2044a1b4b2fbed74451947e905fc2992 ]

During page migrations UBIFS might get confused
and the following assert triggers:
[  213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
[  213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008
[  213.490000] Hardware name: Allwinner sun4i/sun5i Families
[  213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14)
[  213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0)
[  213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50)
[  213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8)
[  213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290)
[  213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80)
[  213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0)
[  213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4)
[  213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0)
[  213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8)
[  213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274)
[  213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c)
[  213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0)
[  213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
[  213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48)
[  213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444)
[  213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614)
[  213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c)
[  213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)

UBIFS is using PagePrivate() which can have different meanings across
filesystems. Therefore the generic page migration code cannot handle this
case correctly.
We have to implement our own migration function which basically does a
plain copy but also duplicates the page private flag.
UBIFS is not a block device filesystem and cannot use buffer_migrate_page().

Cc: stable@vger.kernel.org
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[rw: Massaged changelog, build fixes, etc...]
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mm: Export migrate_page_move_mapping and migrate_page_copy

[ Upstream commit 1118dce773d84f39ebd51a9fe7261f9169cb056e ]

Export these symbols such that UBIFS can implement
->migratepage.

Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ubi: Make recover_peb power cut aware

[ Upstream commit 972228d87445dc46c0a01f5f3de673ac017626f7 ]

recover_peb() was never power cut aware,
if a power cut happened right after writing the VID header
upon next attach UBI would blindly use the new partial written
PEB and all data from the old PEB is lost.

In order to make recover_peb() power cut aware, write the new
VID with a proper crc and copy_flag set such that the UBI attach
process will detect whether the new PEB is completely written
or not.
We cannot directly use ubi_eba_atomic_leb_change() since we'd
have to unlock the LEB which is facing a write error.

Cc: stable@vger.kernel.org
Reported-by: Jörg Pfähler <pfaehler@isse.de>
Reviewed-by: Jörg Pfähler <pfaehler@isse.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

pinctrl: single: Fix missing flush of posted write for a wakeirq

[ Upstream commit 0ac3c0a4025f41748a083bdd4970cb3ede802b15 ]

With many repeated suspend resume cycles, the pin specific wakeirq
may not always work on omaps. This is because the write to enable the
pin interrupt may not have reached the device over the interconnect
before suspend happens.

Let's fix the issue with a flush of posted write with a readback.

Cc: stable@vger.kernel.org
Reported-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

pinctrl: imx: Do not treat a PIN without MUX register as an error

[ Upstream commit ba562d5e54fd3136bfea0457add3675850247774 ]

Some PINs do not have a MUX register, it is not an error.
It is necessary to allow the continuation of the PINs configuration,
otherwise the whole PIN-group will be configured incorrectly.

Cc: stable@vger.kernel.org
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

arm64: mm: remove page_mapping check in __sync_icache_dcache

[ Upstream commit 20c27a4270c775d7ed661491af8ac03264d60fc6 ]

__sync_icache_dcache unconditionally skips the cache maintenance for
anonymous pages, under the assumption that flushing is only required in
the presence of D-side aliases [see 7249b79f6b4cc ("arm64: Do not flush
the D-cache for anonymous pages")].

Unfortunately, this breaks migration of anonymous pages holding
self-modifying code, where userspace cannot be reasonably expected to
reissue maintenance instructions in response to a migration.

This patch fixes the problem by removing the broken page_mapping(page)
check from the cache syncing code, otherwise we may end up fetching and
executing stale instructions from the PoU.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm: atmel-hlcdc: actually disable scaling when no scaling is required

[ Upstream commit 1b7e38b92b0bbd363369f5160f13f4d26140972d ]

The driver is only enabling scaling, but never disabling it, thus, if you
enable the scaling feature once it stays enabled forever.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Alex Vazquez <avazquez.dev@gmail.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Fixes: 1a396789f65a ("drm: add Atmel HLCDC Display Controller support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

tracing: Handle NULL formats in hold_module_trace_bprintk_format()

[ Upstream commit 70c8217acd4383e069fe1898bbad36ea4fcdbdcc ]

If a task uses a non constant string for the format parameter in
trace_printk(), then the trace_printk_fmt variable is set to NULL. This
variable is then saved in the __trace_printk_fmt section.

The function hold_module_trace_bprintk_format() checks to see if duplicate
formats are used by modules, and reuses them if so (saves them to the list
if it is new). But this function calls lookup_format() that does a strcmp()
to the value (which is now NULL) and can cause a kernel oops.

This wasn't an issue till 3debb0a9ddb ("tracing: Fix trace_printk() to print
when not using bprintk()") which added "__used" to the trace_printk_fmt
variable, and before that, the kernel simply optimized it out (no NULL value
was saved).

The fix is simply to handle the NULL pointer in lookup_format() and have the
caller ignore the value if it was NULL.

Link: http://lkml.kernel.org/r/1464769870-18344-1-git-send-email-zhengjun.xing@intel.com
Reported-by: xingzhen <zhengjun.xing@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 3debb0a9ddb ("tracing: Fix trace_printk() to print when not using bprintk()")
Cc: stable@vger.kernel.org # v3.5+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

can: at91_can: RX queue could get stuck at high bus load

[ Upstream commit 43200a4480cbbe660309621817f54cbb93907108 ]

At high bus load it could happen that "at91_poll()" enters with all RX
message boxes filled up. If then at the end the "quota" is exceeded as
well, "rx_next" will not be reset to the first RX mailbox and hence the
interrupts remain disabled.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Tested-by: Amr Bekhit <amrbekhit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

can: c_can: Update D_CAN TX and RX functions to 32 bit - fix Altera Cyclone access

[ Upstream commit 427460c83cdf55069eee49799a0caef7dde8df69 ]

When testing CAN write floods on Altera's CycloneV, the first 2 bytes
are sometimes 0x00, 0x00 or corrupted instead of the values sent. Also
observed bytes 4 & 5 were corrupted in some cases.

The D_CAN Data registers are 32 bits and changing from 16 bit writes to
32 bit writes fixes the problem.

Testing performed on Altera CycloneV (D_CAN). Requesting tests on other
C_CAN & D_CAN platforms.

Reported-by: Richard Andrysek <richard.andrysek@gomtec.de>
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs

[ Upstream commit 8c5122e45a10a9262f872b53f151a592e870f905 ]

When this code was reworked for IBoE support the order of assignments
for the sl_tclass_flowlabel got flipped around resulting in
TClass & FlowLabel being permanently set to 0 in the packet headers.

This breaks IB routers that rely on these headers, but only affects
kernel users - libmlx4 does this properly for user space.

Cc: stable@vger.kernel.org
Fixes: fa417f7b520e ("IB/mlx4: Add support for IBoE")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

btrfs: account for non-CoW'd blocks in btrfs_abort_transaction

[ Upstream commit 64c12921e11b3a0c10d088606e328c58e29274d8 ]

The test for !trans->blocks_used in btrfs_abort_transaction is
insufficient to determine whether it's safe to drop the transaction
handle on the floor.  btrfs_cow_block, informed by should_cow_block,
can return blocks that have already been CoW'd in the current
transaction.  trans->blocks_used is only incremented for new block
allocations. If an operation overlaps the blocks in the current
transaction entirely and must abort the transaction, we'll happily
let it clean up the trans handle even though it may have modified
the blocks and will commit an incomplete operation.

In the long-term, I'd like to do closer tracking of when the fs
is actually modified so we can still recover as gracefully as possible,
but that approach will need some discussion.  In the short term,
since this is the only code using trans->blocks_used, let's just
switch it to a bool indicating whether any blocks were used and set
it when should_cow_block returns false.

Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: hdac_regmap - fix the register access for runtime PM

[ Upstream commit 8198868f0a283eb23e264951632ce61ec2f82228 ]

Call path:

  1) snd_hdac_power_up_pm()
  2) snd_hdac_power_up()
  3) pm_runtime_get_sync()
  4) __pm_runtime_resume()
  5) rpm_resume()

The rpm_resume() returns 1 when the device is already active.
Because the return value is unmodified, the hdac regmap read/write
functions should allow this value for the retry I/O operation, too.

Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: hda - Fix possible race on regmap bypass flip

[ Upstream commit 3194ed497939c6448005542e3ca4fa2386968fa0 ]

HD-audio driver uses regmap cache bypass feature for reading a raw
value without the cache.  But this is racy since both the cached and
the uncached reads may occur concurrently.  The former is done via the
normal control API access while the latter comes from the proc file
read.

Even though the regmap itself has the protection against the
concurrent accesses, the flag set/reset is done without the
protection, so it may lead to inconsistent state of bypass flag that
doesn't match with the current read and occasionally result in a
kernel WARNING like:
  WARNING: CPU: 3 PID: 2731 at drivers/base/regmap/regcache.c:499 regcache_cache_only+0x78/0x93

One way to work around such a problem is to wrap with a mutex.  But in
this case, the solution is simpler: for the uncached read, we just
skip the regmap and directly calls its accessor.  The verb execution
there is protected by itself, so basically it's safe to call
individually.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116171
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

KEYS: potential uninitialized variable

[ Upstream commit 38327424b40bcebe2de92d07312c89360ac9229a ]

If __key_link_begin() failed then "edit" would be uninitialized.  I've
added a check to fix that.

This allows a random user to crash the kernel, though it's quite
difficult to achieve.  There are three ways it can be done as the user
would have to cause an error to occur in __key_link():

(1) Cause the kernel to run out of memory.  In practice, this is difficult
     to achieve without ENOMEM cropping up elsewhere and aborting the
     attempt.

(2) Revoke the destination keyring between the keyring ID being looked up
     and it being tested for revocation.  In practice, this is difficult to
     time correctly because the KEYCTL_REJECT function can only be used
     from the request-key upcall process.  Further, users can only make use
     of what's in /sbin/request-key.conf, though this does including a
     rejection debugging test - which means that the destination keyring
     has to be the caller's session keyring in practice.

(3) Have just enough key quota available to create a key, a new session
     keyring for the upcall and a link in the session keyring, but not then
     sufficient quota to create a link in the nominated destination keyring
     so that it fails with EDQUOT.

The bug can be triggered using option (3) above using something like the
following:

echo 80 >/proc/sys/kernel/keys/root_maxbytes
keyctl request2 user debug:fred negate @t

The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system.  Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.

Assuming the failure occurs, something like the following will be seen:

kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
------------[ cut here ]------------
kernel BUG at ../mm/slab.c:2821!
...
RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
RSP: 0018:ffff8804014a7de8  EFLAGS: 00010092
RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
...
Call Trace:
  kfree+0xde/0x1bc
  assoc_array_cancel_edit+0x1f/0x36
  __key_link_end+0x55/0x63
  key_reject_and_link+0x124/0x155
  keyctl_reject_key+0xb6/0xe0
  keyctl_negate_key+0x10/0x12
  SyS_keyctl+0x9f/0xe7
  do_syscall_64+0x63/0x13a
  entry_SYSCALL64_slow_path+0x25/0x25

Fixes: f70e2e06196a ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

cgroup: set css->id to -1 during init

[ Upstream commit 8fa3b8d689a54d6d04ff7803c724fb7aca6ce98e ]

If percpu_ref initialization fails during css_create(), the free path
can end up trying to free css->id of zero. As ID 0 is unused, it
doesn't cause a critical breakage but it does trigger a warning
message. Fix it by setting css->id to -1 from init_and_link_css().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Wenwei Tao <ww.tao0320@gmail.com>
Fixes: 01e586598b22 ("cgroup: release css->id after css_free")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing

[ Upstream commit 8f50b8e57442d28e41bb736c173d8a2490549a82 ]

In the omap gpmc driver it can be noticed that GPMC_CONFIG4_OEEXTRADELAY
is overwritten by the WEEXTRADELAY value from the device tree and
GPMC_CONFIG4_WEEXTRADELAY is not updated by the value from the device
tree.

As a consequence, the memory accesses cannot be configured properly when
the extra delay are needed for OE and WE.

Fix the update of GPMC_CONFIG4_WEEXTRADELAY with the value from the
device tree file and prevents GPMC_CONFIG4_OEXTRADELAY
being overwritten by the WEXTRADELAY value from the device tree.

Cc: stable@vger.kernel.org
Signed-off-by: Ocquidant, Sebastien <sebastienocquidant@eaton.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES

[ Upstream commit caf1ff26e1aa178133df68ac3d40815fed2187d9 ]

These days, we experienced one guest crash with 8 cores and 3 disks,
with qemu error logs as bellow:

qemu-system-x86_64: /build/qemu-2.0.0/kvm-all.c:984:
kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

And then we found one patch(bdf026317d) in qemu tree, which said
could fix this bug.

Execute the following script will reproduce the BUG quickly:

irq_affinity.sh
========================================================================

vda_irq_num=25
vdb_irq_num=27
while [ 1 ]
do
    for irq in {1,2,4,8,10,20,40,80}
        do
            echo $irq > /proc/irq/$vda_irq_num/smp_affinity
            echo $irq > /proc/irq/$vdb_irq_num/smp_affinity
            dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct
            dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct
        done
done
========================================================================

The following qemu log is added in the qemu code and is displayed when
this bug reproduced:

kvm_irqchip_commit_routes: max gsi: 1008, nr_allocated_irq_routes: 1024,
irq_routes->nr: 1024, gsi_count: 1024.

That's to say when irq_routes->nr == 1024, there are 1024 routing entries,
but in the kernel code when routes->nr >= 1024, will just return -EINVAL;

The nr is the number of the routing entries which is in of
[1 ~ KVM_MAX_IRQ_ROUTES], not the index in [0 ~ KVM_MAX_IRQ_ROUTES - 1].

This patch fix the BUG above.

Cc: stable@vger.kernel.org
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com>
Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

base: make module_create_drivers_dir race-free

[ Upstream commit 7e1b1fc4dabd6ec8e28baa0708866e13fa93c9b3 ]

Modules which register drivers via standard path (driver_register) in
parallel can cause a warning:
WARNING: CPU: 2 PID: 3492 at ../fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
sysfs: cannot create duplicate filename '/module/saa7146/drivers'
Modules linked in: hexium_gemini(+) mxb(+) ...
...
Call Trace:
...
[<ffffffff812e63a2>] sysfs_warn_dup+0x62/0x80
[<ffffffff812e6487>] sysfs_create_dir_ns+0x77/0x90
[<ffffffff8140f2c4>] kobject_add_internal+0xb4/0x340
[<ffffffff8140f5b8>] kobject_add+0x68/0xb0
[<ffffffff8140f631>] kobject_create_and_add+0x31/0x70
[<ffffffff8157a703>] module_add_driver+0xc3/0xd0
[<ffffffff8155e5d4>] bus_add_driver+0x154/0x280
[<ffffffff815604c0>] driver_register+0x60/0xe0
[<ffffffff8145bed0>] __pci_register_driver+0x60/0x70
[<ffffffffa0273e14>] saa7146_register_extension+0x64/0x90 [saa7146]
[<ffffffffa0033011>] hexium_init_module+0x11/0x1000 [hexium_gemini]
...

As can be (mostly) seen, driver_register causes this call sequence:
  -> bus_add_driver
    -> module_add_driver
      -> module_create_drivers_dir
The last one creates "drivers" directory in /sys/module/<...>. When
this is done in parallel, the directory is attempted to be created
twice at the same time.

This can be easily reproduced by loading mxb and hexium_gemini in
parallel:
while :; do
  modprobe mxb &
  modprobe hexium_gemini
  wait
  rmmod mxb hexium_gemini saa7146_vv saa7146
done

saa7146 calls pci_register_driver for both mxb and hexium_gemini,
which means /sys/module/saa7146/drivers is to be created for both of
them.

Fix this by a new mutex in module_create_drivers_dir which makes the
test-and-create "drivers" dir atomic.

I inverted the condition and removed 'return' to avoid multiple
unlocks or a goto.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: fe480a2675ed (Modules: only add drivers/ direcory if needed)
Cc: v2.6.21+ <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

nfsd4/rpc: move backchannel create logic into rpc code

[ Upstream commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 ]

Also simplify the logic a bit.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Trond Myklebust <trondmy@primarydata.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/dp/mst: Always clear proposed vcpi table for port.

[ Upstream commit fd2d2bac6e79b0be91ab86a6075a0c46ffda658a ]

Not clearing mst manager's proposed vcpis table for destroyed connectors when the manager is stopped leaves it pointing to unrefernced memory, this causes pagefault when the manager is restarted when plugging back a branch.

Fixes: 91a25e463130 ("drm/dp/mst: deallocate payload on port destruction")
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Lyude <cpaul@redhat.com>
Cc: stable@vger.kernel.org
Cc: Mykola Lysenko <Mykola.Lysenko@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/i915/ilk: Don't disable SSC source if it's in use

[ Upstream commit 476490a945e1f0f6bd58e303058d2d8ca93a974c ]

Thanks to Ville Syrjälä for pointing me towards the cause of this issue.

Unfortunately one of the sideaffects of having the refclk for a DPLL set
to SSC is that as long as it's set to SSC, the GPU will prevent us from
powering down any of the pipes or transcoders using it. A couple of
BIOSes enable SSC in both PCH_DREF_CONTROL and in the DPLL
configurations. This causes issues on the first modeset, since we don't
expect SSC to be left on and as a result, can't successfully power down
the pipes or the transcoders using it. Here's an example from this Dell
OptiPlex 990:

[drm:intel_modeset_init] SSC enabled by BIOS, overriding VBT which says disabled
[drm:intel_modeset_init] 2 display pipes available.
[drm:intel_update_cdclk] Current CD clock rate: 400000 kHz
[drm:intel_update_max_cdclk] Max CD clock rate: 400000 kHz
[drm:intel_update_max_cdclk] Max dotclock rate: 360000 kHz
vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[drm:intel_crt_reset] crt adpa set to 0xf40000
[drm:intel_dp_init_connector] Adding DP connector on port C
[drm:intel_dp_aux_init] registering DPDDC-C bus for card0-DP-1
[drm:ironlake_init_pch_refclk] has_panel 0 has_lvds 0 has_ck505 0
[drm:ironlake_init_pch_refclk] Disabling SSC entirely
… later we try committing the first modeset …
[drm:intel_dump_pipe_config] [CRTC:26][modeset] config ffff88041b02e800 for pipe A
[drm:intel_dump_pipe_config] cpu_transcoder: A
…
[drm:intel_dump_pipe_config] dpll_hw_state: dpll: 0xc4016001, dpll_md: 0x0, fp0: 0x20e08, fp1: 0x30d07
[drm:intel_dump_pipe_config] planes on this crtc
[drm:intel_dump_pipe_config] STANDARD PLANE:23 plane: 0.0 idx: 0 enabled
[drm:intel_dump_pipe_config]     FB:42, fb = 800x600 format = 0x34325258
[drm:intel_dump_pipe_config]     scaler:0 src (0, 0) 800x600 dst (0, 0) 800x600
[drm:intel_dump_pipe_config] CURSOR PLANE:25 plane: 0.1 idx: 1 disabled, scaler_id = 0
[drm:intel_dump_pipe_config] STANDARD PLANE:27 plane: 0.1 idx: 2 disabled, scaler_id = 0
[drm:intel_get_shared_dpll] CRTC:26 allocated PCH DPLL A
[drm:intel_get_shared_dpll] using PCH DPLL A for pipe A
[drm:ilk_audio_codec_disable] Disable audio codec on port C, pipe A
[drm:intel_disable_pipe] disabling pipe A
------------[ cut here ]------------
WARNING: CPU: 1 PID: 130 at drivers/gpu/drm/i915/intel_display.c:1146 intel_disable_pipe+0x297/0x2d0 [i915]
pipe_off wait timed out
…
---[ end trace 94fc8aa03ae139e8 ]---
[drm:intel_dp_link_down]
[drm:ironlake_crtc_disable [i915]] *ERROR* failed to disable transcoder A

Later modesets succeed since they reset the DPLL's configuration anyway,
but this is enough to get stuck with a big fat warning in dmesg.

A better solution would be to add refcounts for the SSC source, but for
now leaving the source clock on should suffice.

Changes since v4:
- Fix calculation of final for systems with LVDS panels (fixes BUG() on
   CI test suite)
Changes since v3:
- Move temp variable into loop
- Move checks for using_ssc_source to after we've figured out has_ck505
- Add using_ssc_source to debug output
Changes since v2:
- Fix debug output for when we disable the CPU source
Changes since v1:
- Leave the SSC source clock on instead of just shutting it off on all
   of the DPLL configurations.

Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1465916649-10228-1-git-send-email-cpaul@redhat.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

power_supply: power_supply_read_temp only if use_cnt > 0

[ Upstream commit 5bc28b93a36e3cb3acc2870fb75cb6ffb182fece ]

Change power_supply_read_temp() to use power_supply_get_property()
so that it will check the use_cnt and ensure it is > 0. The use_cnt
will be incremented at the end of __power_supply_register, so this
will block to case where get_property can be called before the supply
is fully registered. This fixes the issue show in the stack below:

[    1.452598] power_supply_read_temp+0x78/0x80
[    1.458680] thermal_zone_get_temp+0x5c/0x11c
[    1.464765] thermal_zone_device_update+0x34/0xb4
[    1.471195] thermal_zone_device_register+0x87c/0x8cc
[    1.477974] __power_supply_register+0x364/0x424
[    1.484317] power_supply_register_no_ws+0x10/0x18
[    1.490833] bq27xxx_battery_setup+0x10c/0x164
[    1.497003] bq27xxx_battery_i2c_probe+0xd0/0x1b0
[    1.503435] i2c_device_probe+0x174/0x240
[    1.509172] driver_probe_device+0x1fc/0x29c
[    1.515167] __driver_attach+0xa4/0xa8
[    1.520643] bus_for_each_dev+0x58/0x98
[    1.526204] driver_attach+0x20/0x28
[    1.531505] bus_add_driver+0x1c8/0x22c
[    1.537067] driver_register+0x68/0x108
[    1.542630] i2c_register_driver+0x38/0x7c
[    1.548457] bq27xxx_battery_i2c_driver_init+0x18/0x20
[    1.555321] do_one_initcall+0x38/0x12c
[    1.560886] kernel_init_freeable+0x148/0x1ec
[    1.566972] kernel_init+0x10/0xfc
[    1.572101] ret_from_fork+0x10/0x40

Also make the same change to ps_get_max_charge_cntl_limit() and
ps_get_cur_chrage_cntl_limit() to be safe. Lastly, change the return
value of power_supply_get_property() to -EAGAIN from -ENODEV if
use_cnt <= 0.

Fixes: 297d716f6260 ("power_supply: Change ownership from driver to core")
Cc: stable@vger.kernel.org
Signed-off-by: Rhyland Klein <rklein@nvidia.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w

[ Upstream commit 57675cb976eff977aefb428e68e4e0236d48a9ff ]

Lengthy output of sysrq-w may take a lot of time on slow serial console.

Currently we reset NMI-watchdog on the current CPU to avoid spurious
lockup messages. Sometimes this doesn't work since softlockup watchdog
might trigger on another CPU which is waiting for an IPI to proceed.
We reset softlockup watchdogs on all CPUs, but we do this only after
listing all tasks, and this may be too late on a busy system.

So, reset watchdogs CPUs earlier, in for_each_process_thread() loop.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1465474805-14641-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

kprobes/x86: Clear TF bit in fault on single-stepping

[ Upstream commit dcfc47248d3f7d28df6f531e6426b933de94370d ]

Fix kprobe_fault_handler() to clear the TF (trap flag) bit of
the flags register in the case of a fault fixup on single-stepping.

If we put a kprobe on the instruction which caused a
page fault (e.g. actual mov instructions in copy_user_*),
that fault happens on the single-stepping buffer. In this
case, kprobes resets running instance so that the CPU can
retry execution on the original ip address.

However, current code forgets to reset the TF bit. Since this
fault happens with TF bit set for enabling single-stepping,
when it retries, it causes a debug exception and kprobes
can not handle it because it already reset itself.

On the most of x86-64 platform, it can be easily reproduced
by using kprobe tracer. E.g.

  # cd /sys/kernel/debug/tracing
  # echo p copy_user_enhanced_fast_string+5 > kprobe_events
  # echo 1 > events/kprobes/enable

And you'll see a kernel panic on do_debug(), since the debug
trap is not handled by kprobes.

To fix this problem, we just need to clear the TF bit when
resetting running kprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: systemtap@sourceware.org
Cc: stable@vger.kernel.org # All the way back to ancient kernels
Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox
[ Updated the comments. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

spi: sunxi: fix transfer timeout

[ Upstream commit 719bd6542044efd9b338a53dba1bef45f40ca169 ]

The trasfer timeout is fixed at 1000 ms. Reading a 4Mbyte flash over
1MHz SPI bus takes way longer than that. Calculate the timeout from the
actual time the transfer is supposed to take and multiply by 2 for good
measure.

Signed-off-by: Michal Suchanek <hramrach@gmail.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

spi: sun4i: fix FIFO limit

[ Upstream commit 6d9fe44bd73d567d04d3a68a2d2fa521ab9532f2 ]

When testing SPI without DMA I noticed that filling the FIFO on the
spi controller causes timeout.

Always leave room for one byte in the FIFO.

Signed-off-by: Michal Suchanek <hramrach@gmail.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

MIPS: KVM: Fix modular KVM under QEMU

[ Upstream commit 797179bc4fe06c89e47a9f36f886f68640b423f8 ]

Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
get a TLB refill exception in it when KVM is built as a module.

This was observed to happen with the host MIPS kernel running under
QEMU, due to a not entirely transparent optimisation in the QEMU TLB
handling where TLB entries replaced with TLBWR are copied to a separate
part of the TLB array. Code in those pages continue to be executable,
but those mappings persist only until the next ASID switch, even if they
are marked global.

An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
switching to the guest exception base. Subsequent TLB mapped kernel
instructions just prior to switching to the guest trigger a TLB refill
exception, which enters the guest exception handlers without updating
EPC. This appears as a guest triggered TLB refill on a host kernel
mapped (host KSeg2) address, which is not handled correctly as user
(guest) mode accesses to kernel (host) segments always generate address
error exceptions.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: common: otg-fsm: add license to usb-otg-fsm

[ Upstream commit ea1d39a31d3b1b6060b6e83e5a29c069a124c68a ]

Fix warning about tainted kernel because usb-otg-fsm has no license.
WARNING: with this patch usb-otg-fsm module can be loaded
but then the kernel will hang. Tested with a udoo quad board.

Cc: <stable@vger.kernel.org> #v4.1+
Signed-off-by: Oscar <oscar@naiandei.net>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/radeon: fix asic initialization for virtualized environments

[ Upstream commit 05082b8bbd1a0ffc74235449c4b8930a8c240f85 ]

When executing in a PCI passthrough based virtuzliation environment, the
hypervisor will usually attempt to send a PCIe bus reset signal to the
ASIC when the VM reboots. In this scenario, the card is not correctly
initialized, but we still consider it to be posted. Therefore, in a
passthrough based environemnt we should always post the card to guarantee
it is in a good state for driver initialization.

Ported from amdgpu commit:
amdgpu: fix asic initialization for virtualized environments

Cc: Andres Rodriguez <andres.rodriguez@amd.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg()

[ Upstream commit ae4ea9a2460c7fee2ae8feeb4dfe96f5f6c3e562 ]

Commit 7ea0ed2b5be8 ("ipmi: Make the message handler easier to use for
SMI interfaces") changed handle_new_recv_msgs() to call handle_one_recv_msg()
for a smi_msg while the smi_msg is still connected to waiting_rcv_msgs list.
That could lead to following list corruption problems:

1) low-level function treats smi_msg as not connected to list

  handle_one_recv_msg() could end up calling smi_send(), which
  assumes the msg is not connected to list.

  For example, the following sequence could corrupt list by
  doing list_add_tail() for the entry still connected to other list.

    handle_new_recv_msgs()
      msg = list_entry(waiting_rcv_msgs)
      handle_one_recv_msg(msg)
        handle_ipmb_get_msg_cmd(msg)
          smi_send(msg)
            spin_lock(xmit_msgs_lock)
            list_add_tail(msg)
            spin_unlock(xmit_msgs_lock)

2) race between multiple handle_new_recv_msgs() instances

  handle_new_recv_msgs() once releases waiting_rcv_msgs_lock before calling
  handle_one_recv_msg() then retakes the lock and list_del() it.

  If others call handle_new_recv_msgs() during the window shown below
  list_del() will be done twice for the same smi_msg.

  handle_new_recv_msgs()
    spin_lock(waiting_rcv_msgs_lock)
    msg = list_entry(waiting_rcv_msgs)
    spin_unlock(waiting_rcv_msgs_lock)
  |
  | handle_one_recv_msg(msg)
  |
    spin_lock(waiting_rcv_msgs_lock)
    list_del(msg)
    spin_unlock(waiting_rcv_msgs_lock)

Fixes: 7ea0ed2b5be8 ("ipmi: Make the message handler easier to use for SMI interfaces")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
[Added a comment to describe why this works.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # 3.19
Tested-by: Ye Feng <yefeng.yl@alibaba-inc.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

crypto: ux500 - memmove the right size

[ Upstream commit 19ced623db2fe91604d69f7d86b03144c5107739 ]

The hash buffer is really HASH_BLOCK_SIZE bytes, someone
must have thought that memmove takes n*u32 words by mistake.
Tests work as good/bad as before after this patch.

Cc: Joakim Bech <joakim.bech@linaro.org>
Cc: stable@vger.kernel.org
Reported-by: David Binderman <linuxdev.baldrick@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

crypto: vmx - Increase priority of aes-cbc cipher

[ Upstream commit 12d3f49e1ffbbf8cbbb60acae5a21103c5c841ac ]

All of the VMX AES ciphers (AES, AES-CBC and AES-CTR) are set at
priority 1000. Unfortunately this means we never use AES-CBC and
AES-CTR, because the base AES-CBC cipher that is implemented on
top of AES inherits its priority.

To fix this, AES-CBC and AES-CTR have to be a higher priority. Set
them to 2000.

Testing on a POWER8 with:

cryptsetup benchmark --cipher aes --key-size 256

Shows decryption speed increase from 402.4 MB/s to 3069.2 MB/s,
over 7x faster. Thanks to Mike Strosaker for helping me debug
this issue.

Fixes: 8c755ace357c ("crypto: vmx - Adding CBC routines for VMX module")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ARM: 8579/1: mm: Fix definition of pmd_mknotpresent

[ Upstream commit 56530f5d2ddc9b9fade7ef8db9cb886e9dc689b5 ]

Currently pmd_mknotpresent will use a zero entry to respresent an
invalidated pmd.

Unfortunately this definition clashes with pmd_none, thus it is
possible for a race condition to occur if zap_pmd_range sees pmd_none
whilst __split_huge_pmd_locked is running too with pmdp_invalidate
just called.

This patch fixes the race condition by modifying pmd_mknotpresent to
create non-zero faulting entries (as is done in other architectures),
removing the ambiguity with pmd_none.

[catalin.marinas@arm.com: using L_PMD_SECT_VALID instead of PMD_TYPE_SECT]

Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.")
Cc: <stable@vger.kernel.org> # 3.11+
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ARM: 8578/1: mm: ensure pmd_present only checks the valid bit

[ Upstream commit 624531886987f0f1b5d01fb598034d039198e090 ]

In a subsequent patch, pmd_mknotpresent will clear the valid bit of the
pmd entry, resulting in a not-present entry from the hardware's
perspective. Unfortunately, pmd_present simply checks for a non-zero pmd
value and will therefore continue to return true even after a
pmd_mknotpresent operation. Since pmd_mknotpresent is only used for
managing huge entries, this is only an issue for the 3-level case.

This patch fixes the 3-level pmd_present implementation to take into
account the valid bit. For bisectability, the change is made before the
fix to pmd_mknotpresent.

[catalin.marinas@arm.com: comment update regarding pmd_mknotpresent patch]

Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.")
Cc: <stable@vger.kernel.org> # 3.11+
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steve Capper <Steve.Capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

scsi: fix race between simultaneous decrements of ->host_failed

[ Upstream commit 72d8c36ec364c82bf1bf0c64dfa1041cfaf139f7 ]

sas_ata_strategy_handler() adds the works of the ata error handler to
system_unbound_wq. This workqueue asynchronously runs work items, so the
ata error handler will be performed concurrently on different CPUs. In
this case, ->host_failed will be decreased simultaneously in
scsi_eh_finish_cmd() on different CPUs, and become abnormal.

It will lead to permanently inequality between ->host_failed and
->host_busy, and scsi error handler thread won't start running. IO
errors after that won't be handled.

Since all scmds must have been handled in the strategy handler, just
remove the decrement in scsi_eh_finish_cmd() and zero ->host_busy after
the strategy handler to fix this race.

Fixes: 50824d6c5657 ("[SCSI] libsas: async ata-eh")
Cc: stable@vger.kernel.org
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Reviewed-by: James Bottomley <jejb@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: host: ehci-tegra: Grab the correct UTMI pads reset

[ Upstream commit f8a15a9650694feaa0dabf197b0c94d37cd3fb42 ]

There are three EHCI controllers on Tegra SoCs, each with its own reset
line. However, the first controller contains a set of UTMI configuration
registers that are shared with its siblings. These registers will only
be reset as part of the first controller's reset. For proper operation
it must be ensured that the UTMI configuration registers are reset
before any of the EHCI controllers are enabled, irrespective of the
probe order.

Commit a47cc24cd1e5 ("USB: EHCI: tegra: Fix probe order issue leading to
broken USB") introduced code that ensures the first controller is always
reset before setting up any of the controllers, and is never again reset
afterwards.

This code, however, grabs the wrong reset. Each EHCI controller has two
reset controls attached: 1) the USB controller reset and 2) the UTMI
pads reset (really the first controller's reset). In order to reset the
UTMI pads registers the code must grab the second reset, but instead it
grabbing the first.

Fixes: a47cc24cd1e5 ("USB: EHCI: tegra: Fix probe order issue leading to broken USB")
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: musb: Stop bulk endpoint while queue is rotated

[ Upstream commit 7b2c17f829545df27a910e8d82e133c21c9a8c9c ]

Ensure that the endpoint is stopped by clearing REQPKT before
clearing DATAERR_NAKTIMEOUT before rotating the queue on the
dedicated bulk endpoint.
This addresses an issue where a race could result in the endpoint
receiving data before it was reprogrammed resulting in a warning
about such data from musb_rx_reinit before it was thrown away.
The data thrown away was a valid packet that had been correctly
ACKed which meant the host and device got out of sync.

Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Cc: stable@vger.kernel.org
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: musb: Ensure rx reinit occurs for shared_fifo endpoints

[ Upstream commit f3eec0cf784e0d6c47822ca6b66df3d5812af7e6 ]

shared_fifo endpoints would only get a previous tx state cleared
out, the rx state was only cleared for non shared_fifo endpoints
Change this so that the rx state is cleared for all endpoints.
This addresses an issue that resulted in rx packets being dropped
silently.

Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Cc: stable@vger.kernel.org
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

USB: xhci: Add broken streams quirk for Frescologic device id 1009

[ Upstream commit d95815ba6a0f287213118c136e64d8c56daeaeab ]

I got one of these cards for testing uas with, it seems that with streams
it dma-s all over the place, corrupting memory. On my first tests it
managed to dma over the BIOS of the motherboard somehow and completely
bricked it.

Tests on another motherboard show that it does work with streams disabled.

Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: quirks: Add no-lpm quirk for Acer C120 LED Projector

[ Upstream commit 32cb0b37098f4beeff5ad9e325f11b42a6ede56c ]

The Acer C120 LED Projector is a USB-3 connected pico projector which
takes both its power and video data from USB-3.

In combination with some hubs this device does not play well with
lpm, so disable lpm for it.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: quirks: Fix sorting

[ Upstream commit 81099f97bd31e25ff2719a435b1860fc3876122f ]

Properly sort all the entries by vendor id.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: xhci-plat: properly handle probe deferral for devm_clk_get()

[ Upstream commit de95c40d5beaa47f6dc8fe9ac4159b4672b51523 ]

On some platforms, the clocks might be registered by a platform
driver. When this is the case, the clock platform driver may very well
be probed after xhci-plat, in which case the first probe() invocation
of xhci-plat will receive -EPROBE_DEFER as the return value of
devm_clk_get().

The current code handles that as a normal error, and simply assumes
that this means that the system doesn't have a clock for the XHCI
controller, and continues probing without calling
clk_prepare_enable(). Unfortunately, this doesn't work on systems
where the XHCI controller does have a clock, but that clock is
provided by another platform driver. In order to fix this situation,
we handle the -EPROBE_DEFER error condition specially, and abort the
XHCI controller probe(). It will be retried later automatically, the
clock will be available, devm_clk_get() will succeed, and the probe()
will continue with the clock prepared and enabled as expected.

In practice, such issue is seen on the ARM64 Marvell 7K/8K platform,
where the clocks are registered by a platform driver.

Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

xhci: Fix handling timeouted commands on hosts in weird states.

[ Upstream commit 3425aa03f484d45dc21e0e791c2f6c74ea656421 ]

If commands timeout we mark them for abortion, then stop the command
ring, and turn the commands to no-ops and finally restart the command
ring.

If the host is working properly the no-op commands will finish and
pending completions are called.
If we notice the host is failing, driver clears the command ring and
completes, deletes and frees all pending commands.

There are two separate cases reported where host is believed to work
properly but is not. In the first case we successfully stop the ring
but no abort or stop command ring event is ever sent and host locks up.

The second case is if a host is removed, command times out and driver
believes the ring is stopped, and assumes it will be restarted, but
actually ends up timing out on the same command forever.
If one of the pending commands has the xhci->mutex held it will block
xhci_stop() in the remove codepath which otherwise would cleanup pending
commands.

Add a check that clears all pending commands in case host is removed,
or we are stuck timing out on the same command. Also restart the
command timeout timer when stopping the command ring to ensure we
recive an ring stop/abort event.

Cc: stable <stable@vger.kernel.org>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

HID: elo: kill not flush the work

[ Upstream commit ed596a4a88bd161f868ccba078557ee7ede8a6ef ]

Flushing a work that reschedules itself is not a sensible operation. It needs
to be killed. Failure to do so leads to a kernel panic in the timer code.

CC: stable@vger.kernel.org
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: gadget: fix spinlock dead lock in gadgetfs

[ Upstream commit d246dcb2331c5783743720e6510892eb1d2801d9 ]

[   40.467381] =============================================
[   40.473013] [ INFO: possible recursive locking detected ]
[   40.478651] 4.6.0-08691-g7f3db9a #37 Not tainted
[   40.483466] ---------------------------------------------
[   40.489098] usb/733 is trying to acquire lock:
[   40.493734]  (&(&dev->lock)->rlock){-.....}, at: [<bf129288>] ep0_complete+0x18/0xdc [gadgetfs]
[   40.502882]
[   40.502882] but task is already holding lock:
[   40.508967]  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.517811]
[   40.517811] other info that might help us debug this:
[   40.524623]  Possible unsafe locking scenario:
[   40.524623]
[   40.530798]        CPU0
[   40.533346]        ----
[   40.535894]   lock(&(&dev->lock)->rlock);
[   40.540088]   lock(&(&dev->lock)->rlock);
[   40.544284]
[   40.544284]  *** DEADLOCK ***
[   40.544284]
[   40.550461]  May be due to missing lock nesting notation
[   40.550461]
[   40.557544] 2 locks held by usb/733:
[   40.561271]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<c02a6114>] __fdget_pos+0x40/0x48
[   40.569219]  #1:  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.578523]
[   40.578523] stack backtrace:
[   40.583075] CPU: 0 PID: 733 Comm: usb Not tainted 4.6.0-08691-g7f3db9a #37
[   40.590246] Hardware name: Generic AM33XX (Flattened Device Tree)
[   40.596625] [<c010ffbc>] (unwind_backtrace) from [<c010c1bc>] (show_stack+0x10/0x14)
[   40.604718] [<c010c1bc>] (show_stack) from [<c04207fc>] (dump_stack+0xb0/0xe4)
[   40.612267] [<c04207fc>] (dump_stack) from [<c01886ec>] (__lock_acquire+0xf68/0x1994)
[   40.620440] [<c01886ec>] (__lock_acquire) from [<c0189528>] (lock_acquire+0xd8/0x238)
[   40.628621] [<c0189528>] (lock_acquire) from [<c06ad6b4>] (_raw_spin_lock_irqsave+0x38/0x4c)
[   40.637440] [<c06ad6b4>] (_raw_spin_lock_irqsave) from [<bf129288>] (ep0_complete+0x18/0xdc [gadgetfs])
[   40.647339] [<bf129288>] (ep0_complete [gadgetfs]) from [<bf10a728>] (musb_g_giveback+0x118/0x1b0 [musb_hdrc])
[   40.657842] [<bf10a728>] (musb_g_giveback [musb_hdrc]) from [<bf108768>] (musb_g_ep0_queue+0x16c/0x188 [musb_hdrc])
[   40.668772] [<bf108768>] (musb_g_ep0_queue [musb_hdrc]) from [<bf12a944>] (ep0_read+0x544/0x5e0 [gadgetfs])
[   40.678963] [<bf12a944>] (ep0_read [gadgetfs]) from [<c0284470>] (__vfs_read+0x20/0x110)
[   40.687414] [<c0284470>] (__vfs_read) from [<c0285324>] (vfs_read+0x88/0x114)
[   40.694864] [<c0285324>] (vfs_read) from [<c0286150>] (SyS_read+0x44/0x9c)
[   40.702051] [<c0286150>] (SyS_read) from [<c0107820>] (ret_fast_syscall+0x0/0x1c)

This is caused by the spinlock bug in ep0_read().
Fix the two other deadlock sources in gadgetfs_setup() too.

Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: dwc3: exynos: Fix deferred probing storm.

[ Upstream commit 4879efb34f7d49235fac334d76d9c6a77a021413 ]

dwc3-exynos has two problems during init if the regulators are slow
to come up (for instance if the I2C bus driver is not on the initramfs)
and return probe deferral. First, every time this happens, the driver
leaks the USB phys created; they need to be deallocated on error.

Second, since the phy devices are created before the regulators fail,
this means that there's a new device to re-trigger deferred probing,
which causes it to essentially go into a busy loop of re-probing the
device until the regulators come up.

Move the phy creation to after the regulators have succeeded, and also
fix cleanup on failure. On my ODROID XU4 system (with Debian's initramfs
which doesn't contain the I2C driver), this reduces the number of probe
attempts (for each of the two controllers) from more than 2000 to eight.

Signed-off-by: Steinar H. Gunderson <sesse@google.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Vivek Gautam <gautam.vivek@samsung.com>
Fixes: d720f057fda4 ("usb: dwc3: exynos: add nop transceiver support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

clk: rockchip: initialize flags of clk_init_data in mmc-phase clock

[ Upstream commit 595144c1141c951a3c6bb9004ae6a2bc29aad66f ]

The flags element of clk_init_data was never initialized for mmc-
phase-clocks resulting in the element containing a random value
and thus possibly enabling unwanted clock flags.

Fixes: 89bf26cbc1a0 ("clk: rockchip: Add support for the mmc clock phases using the framework")
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dmaengine: at_xdmac: double FIFO flush needed to compute residue

[ Upstream commit 9295c41d77ca93aac79cfca6fa09fa1ca5cab66f ]

Due to the way CUBC register is updated, a double flush is needed to
compute an accurate residue. First flush aim is to get data from the DMA
FIFO and second one ensures that we won't report data which are not in
memory.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel
eXtended DMA Controller driver")
Cc: stable@vger.kernel.org #v4.1 and later
Reviewed-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dmaengine: at_xdmac: fix residue corruption

[ Upstream commit 53398f488821c2b5b15291e3debec6ad33f75d3d ]

An unexpected value of CUBC can lead to a corrupted residue. A more
complex sequence is needed to detect an inaccurate value for NCA or CUBC.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel
eXtended DMA Controller driver")
Cc: stable@vger.kernel.org #v4.1 and later
Reviewed-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dmaengine: at_xdmac: align descriptors on 64 bits

[ Upstream commit 4a9723e8df68cfce4048517ee32e37f78854b6fb ]

Having descriptors aligned on 64 bits allows update CNDA and CUBC in an
atomic way.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel
eXtended DMA Controller driver")
Cc: stable@vger.kernel.org #v4.1 and later
Reviewed-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

cgroup: remove redundant cleanup in css_create

[ Upstream commit b00c52dae6d9ee8d0f2407118ef6544ae5524781 ]

When create css failed, before call css_free_rcu_fn, we remove the css
id and exit the percpu_ref, but we will do these again in
css_free_work_fn, so they are redundant.  Especially the css id, that
would cause problem if we remove it twice, since it may be assigned to
another css after the first remove.

tj: This was broken by two commits updating the free path without
    synchronizing the creation failure path.  This can be easily
    triggered by trying to create more than 64k memory cgroups.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Fixes: 9a1049da9bd2 ("percpu-refcount: require percpu_ref to be exited explicitly")
Fixes: 01e586598b22 ("cgroup: release css->id after css_free")
Cc: stable@vger.kernel.org # v3.17+
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

percpu: fix synchronization between synchronous map extension and chunk destruction

[ Upstream commit 6710e594f71ccaad8101bc64321152af7cd9ea28 ]

For non-atomic allocations, pcpu_alloc() can try to extend the area
map synchronously after dropping pcpu_lock; however, the extension
wasn't synchronized against chunk destruction and the chunk might get
freed while extension is in progress.

This patch fixes the bug by putting most of non-atomic allocations
under pcpu_alloc_mutex to synchronize against pcpu_balance_work which
is responsible for async chunk management including destruction.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # v3.18+
Fixes: 1a4d76076cda ("percpu: implement asynchronous chunk population")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

percpu: fix synchronization between chunk->map_extend_work and chunk destruction

[ Upstream commit 4f996e234dad488e5d9ba0858bc1bae12eff82c3 ]

Atomic allocations can trigger async map extensions which is serviced
by chunk->map_extend_work. pcpu_balance_work which is responsible for
destroying idle chunks wasn't synchronizing properly against
chunk->map_extend_work and may end up freeing the chunk while the work
item is still in flight.

This patch fixes the bug by rolling async map extension operations
into pcpu_balance_work.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # v3.18+
Fixes: 9c824b6a172c ("percpu: make sure chunk->map array has available space")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

af_unix: Fix splice-bind deadlock

[ Upstream commit c845acb324aa85a39650a14e7696982ceea75dc1 ]

On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice
system call and AF_UNIX sockets,

http://lists.openwall.net/netdev/2015/11/06/24

The situation was analyzed as

(a while ago) A: socketpair()
B: splice() from a pipe to /mnt/regular_file
does sb_start_write() on /mnt
C: try to freeze /mnt
wait for B to finish with /mnt
A: bind() try to bind our socket to /mnt/new_socket_name
lock our socket, see it not bound yet
decide that it needs to create something in /mnt
try to do sb_start_write() on /mnt, block (it's
waiting for C).
D: splice() from the same pipe to our socket
lock the pipe, see that socket is connected
try to lock the socket, block waiting for A
B: get around to actually feeding a chunk from
pipe to file, try to lock the pipe. Deadlock.

on 2015/11/10 by Al Viro,

http://lists.openwall.net/netdev/2015/11/10/4

The patch fixes this by removing the kern_path_create related code from
unix_mknod and executing it as part of unix_bind prior acquiring the
readlock of the socket in question. This means that A (as used above)
will sb_start_write on /mnt before it acquires the readlock, hence, it
won't indirectly block B which first did a sb_start_write and then
waited for a thread trying to acquire the readlock. Consequently, A
being blocked by C waiting for B won't cause a deadlock anymore
(effectively, both A and B acquire two locks in opposite order in the
situation described above).

Dmitry Vyukov(<dvyukov@google.com>) tested the original patch.

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Linux 4.1.27

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

uvc: Forward compat ioctls to their handlers directly

[ Upstream commit a44323e2a8f342848bb77e8e04fcd85fcb91b3b4 ]

The current code goes through a lot of indirection just to call a
known handler. Simplify it: just call the handlers directly.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ecryptfs: forbid opening files without mmap handler

[ Upstream commit 2f36db71009304b3f0b95afacd8eba1f9f046b87 ]

This prevents users from triggering a stack overflow through a recursive
invocation of pagefault handling that involves mapping procfs files into
virtual memory.

Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

proc: prevent stacking filesystems on top

[ Upstream commit e54ad7f1ee263ffa5a2de9c609d58dfa27b21cd9 ]

This prevents stacking filesystems (ecryptfs and overlayfs) from using
procfs as lower filesystem. There is too much magic going on inside
procfs, and there is no good reason to stack stuff on top of procfs.

(For example, procfs does access checks in VFS open handlers, and
ecryptfs by design calls open handlers from a kernel thread that doesn't
drop privileges or so.)

Signed-off-by: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel

[ Upstream commit 3d5fdff46c4b2b9534fa2f9fc78e90a48e0ff724 ]

iwpriv app uses iw_point structure to send data to Kernel. The iw_point
structure holds a pointer. For compatibility Kernel converts the pointer
as required for WEXT IOCTLs (SIOCIWFIRST to SIOCIWLAST). Some drivers
may use iw_handler_def.private_args to populate iwpriv commands instead
of iw_handler_def.private. For those case, the IOCTLs from
SIOCIWFIRSTPRIV to SIOCIWLASTPRIV will follow the path ndo_do_ioctl().
Accordingly when the filled up iw_point structure comes from 32 bit
iwpriv to 64 bit Kernel, Kernel will not convert the pointer and sends
it to driver. So, the driver may get the invalid data.

The pointer conversion for the IOCTLs (SIOCIWFIRSTPRIV to
SIOCIWLASTPRIV), which follow the path ndo_do_ioctl(), is mandatory.
This patch adds pointer conversion from 32 bit to 64 bit and vice versa,
if the ioctl comes from 32 bit iwpriv to 64 bit Kernel.

Cc: stable@vger.kernel.org
Signed-off-by: Prasun Maiti <prasunmaiti87@gmail.com>
Signed-off-by: Ujjal Roy <royujjal@gmail.com>
Tested-by: Dibyajyoti Ghosh <dibyajyotig@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

gpio: bcm-kona: fix bcm_kona_gpio_reset() warnings

[ Upstream commit b66b2a0adf0e48973b582e055758b9907a7eee7c ]

The bcm_kona_gpio_reset() calls bcm_kona_gpio_write_lock_regs()
with what looks like the wrong parameter. The write_lock_regs
function takes a pointer to the registers, not the bcm_kona_gpio
structure.

Fix the warning, and probably bug by changing the function to
pass reg_base instead of kona_gpio, fixing the following warning:

drivers/gpio/gpio-bcm-kona.c:550:47: warning: incorrect type in argument 1
  (different address spaces)
  expected void [noderef] <asn:2>*reg_base
  got struct bcm_kona_gpio *kona_gpio
  warning: incorrect type in argument 1 (different address spaces)
  expected void [noderef] <asn:2>*reg_base
  got struct bcm_kona_gpio *kona_gpio

Cc: stable@vger.kernel.org
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Acked-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

gpiolib: Fix NULL pointer deference

[ Upstream commit 11f33a6d15bfa397867ac0d7f3481b6dd683286f ]

Under some circumstances, a gpiochip might be half cleaned from the
gpio_device list.

This patch makes sure that the chip pointer is still valid, before
calling the match function.

[  104.088296] BUG: unable to handle kernel NULL pointer dereference at
0000000000000090
[  104.089772] IP: [<ffffffff813d2045>] of_gpiochip_find_and_xlate+0x15/0x80
[  104.128273] Call Trace:
[  104.129802]  [<ffffffff813d2030>] ? of_parse_own_gpio+0x1f0/0x1f0
[  104.131353]  [<ffffffff813cd910>] gpiochip_find+0x60/0x90
[  104.132868]  [<ffffffff813d21ba>] of_get_named_gpiod_flags+0x9a/0x120
...
[  104.141586]  [<ffffffff8163d12b>] gpio_led_probe+0x11b/0x360

Cc: stable@vger.kernel.org
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

fix d_walk()/non-delayed __d_free() race

[ Upstream commit 3d56c25e3bb0726a5c5e16fc2d9e38f8ed763085 ]

Ascend-to-parent logics in d_walk() depends on all encountered child
dentries not getting freed without an RCU delay. Unfortunately, in
quite a few cases it is not true, with hard-to-hit oopsable race as
the result.

Fortunately, the fix is simiple; right now the rule is "if it ever
been hashed, freeing must be delayed" and changing it to "if it
ever had a parent, freeing must be delayed" closes that hole and
covers all cases the old rule used to cover. Moreover, pipes and
sockets remain _not_ covered, so we do not introduce RCU delay in
the cases which are the reason for having that delay conditional
in the first place.

Cc: stable@vger.kernel.org # v3.2+ (and watch out for __d_materialise_dentry())
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

cpufreq: intel_pstate: Fix ->set_policy() interface for no_turbo

[ Upstream commit 983e600e88835f0321d1a0ea06f52d48b7b5a544 ]

When turbo is disabled, the ->set_policy() interface is broken.

For example, when turbo is disabled and cpuinfo.max = 2900000 (full
max turbo frequency), setting the limits results in frequency less
than the requested one:
Set 1000000 KHz results in 0700000 KHz
Set 1500000 KHz results in 1100000 KHz
Set 2000000 KHz results in 1500000 KHz

This is because the limits->max_perf fraction is calculated using
the max turbo frequency as the reference, but when the max P-State is
capped in intel_pstate_get_min_max(), the reference is not the max
turbo P-State. This results in reducing max P-State.

One option is to always use max turbo as reference for calculating
limits. But this will not be correct. By definition the intel_pstate
sysfs limits, shows percentage of available performance. So when
BIOS has disabled turbo, the available performance is max non turbo.
So the max_perf_pct should still show 100%.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw : Subject & changelog, rewrite in fewer lines of code ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

of: fix autoloading due to broken modalias with no 'compatible'

[ Upstream commit b3c0a4dab7e35a9b6d69c0415641d2280fdefb2b ]

Because of an improper dereference, a stray 'C' character was output to
the modalias when no 'compatible' was specified. This is the case for
some old PowerMac drivers which only set the 'name' property. Fix it to
let them match again.

Reported-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Tested-by: Mathieu Malaterre <malat@debian.org>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Fixes: 6543becf26fff6 ("mod/file2alias: make modalias generation safe for cross compiling")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added

[ Upstream commit 2c2a63e301fd19ccae673e79de59b30a232ff7f9 ]

The recent commit 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support
to ibm,client-architecture-support call") added a new PVR mask & value
to the start of the ibm_architecture_vec[] array.

However it missed the fact that further down in the array, we hard code
the offset of one of the fields, and then at boot use that value to
patch the value in the array. This means every update to the array must
also update the #define, ugh.

This means that on pseries machines we will misreport to firmware the
number of cores we support, by a factor of threads_per_core.

Fix it for now by updating the #define.

Fixes: 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86, build: copy ldlinux.c32 to image.iso

[ Upstream commit 9c77679cadb118c0aa99e6f88533d91765a131ba ]

For newer versions of Syslinux, we need ldlinux.c32 in addition to
isolinux.bin to reside on the boot disk, so if the latter is found,
copy it, too, to the isoimage tree.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Linux Stable Tree <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: hda/realtek: Add T560 docking unit fixup

[ Upstream commit dab38e43b298501a4e8807b56117c029e2e98383 ]

Tested with Lenovo Ultradock. Fixes the non-working headphone jack on
the docking unit.

Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mnt: fs_fully_visible test the proper mount for MNT_LOCKED

[ Upstream commit d71ed6c930ac7d8f88f3cef6624a7e826392d61f ]

MNT_LOCKED implies on a child mount implies the child is locked to the
parent. So while looping through the children the children should be
tested (not their parent).

Typically an unshare of a mount namespace locks all mounts together
making both the parent and the slave as locked but there are a few
corner cases where other things work.

Cc: stable@vger.kernel.org
Fixes: ceeb0e5d39fc ("vfs: Ignore unlocked mounts in fs_fully_visible")
Reported-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mnt: If fs_fully_visible fails call put_filesystem.

[ Upstream commit 97c1df3e54e811aed484a036a798b4b25d002ecf ]

Add this trivial missing error handling.

Cc: stable@vger.kernel.org
Fixes: 1b852bceb0d1 ("mnt: Refactor the logic for mounting sysfs and proc in a user namespace")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

parisc: Fix pagefault crash in unaligned __get_user() call

[ Upstream commit 8b78f260887df532da529f225c49195d18fef36b ]

One of the debian buildd servers had this crash in the syslog without
any other information:

Unaligned handler failed, ret = -2
clock_adjtime (pid 22578): Unaligned data reference (code 28)
CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G  E  4.5.0-2-parisc64-smp #1 Debian 4.5.4-1
task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000

      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111100000001111 Tainted: G            E
r00-03  000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0
r04-07  00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff
r08-11  0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4
r12-15  000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b
r16-19  0000000000028800 0000000000000001 0000000000000070 00000001bde7c218
r20-23  0000000000000000 00000001bde7c210 0000000000000002 0000000000000000
r24-27  0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0
r28-31  0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218
sr00-03  0000000001200000 0000000001200000 0000000000000000 0000000001200000
sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000

IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88
  IIR: 0ca0d089    ISR: 0000000001200000  IOR: 00000000fa6f7fff
  CPU:        1   CR30: 00000001bde7c000 CR31: ffffffffffffffff
  ORIG_R28: 00000002369fe628
  IAOQ[0]: compat_get_timex+0x2dc/0x3c0
  IAOQ[1]: compat_get_timex+0x2e0/0x3c0
  RP(r2): compat_get_timex+0x40/0x3c0
Backtrace:
  [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0
  [<0000000040205024>] syscall_exit+0x0/0x14

This means the userspace program clock_adjtime called the clock_adjtime()
syscall and then crashed inside the compat_get_timex() function.
Syscalls should never crash programs, but instead return EFAULT.

The IIR register contains the executed instruction, which disassebles
into "ldw 0(sr3,r5),r9".
This load-word instruction is part of __get_user() which tried to read the word
at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in.  The
unaligned handler is able to emulate all ldw instructions, but it fails if it
fails to read the source e.g. because of page fault.

The following program reproduces the problem:

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/mman.h>

int main(void) {
        /* allocate 8k */
        char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        /* free second half (upper 4k) and make it invalid. */
        munmap(ptr+4096, 4096);
        /* syscall where first int is unaligned and clobbers into invalid memory region */
        /* syscall should return EFAULT */
        return syscall(__NR_clock_adjtime, 0, ptr+4095);
}

To fix this issue we simply need to check if the faulting instruction address
is in the exception fixup table when the unaligned handler failed. If it
is, call the fixup routine instead of crashing.

While looking at the unaligned handler I found another issue as well: The
target register should not be modified if the handler was unsuccessful.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

of: irq: fix of_irq_get[_byname]() kernel-doc

[ Upstream commit 3993546646baf1dab5f5c4f7d9bb58f2046fd1c1 ]

The kernel-doc for the of_irq_get[_byname]() is clearly inadequate in
describing the return values -- of_irq_get_byname() is documented better
than of_irq_get() but it still doesn't mention that 0 is returned iff
irq_create_of_mapping() fails (it doesn't return an error code in this
case). Document all possible return value variants, making the writing
of the word "IRQ" consistent, while at it...

Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq")
Fixes: ad69674e73a1 ("of/irq: do irq resolution in platform_get_irq_byname()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
CC: stable@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

EDAC, sb_edac: Fix rank lookup on Broadwell

[ Upstream commit c7103f650a11328f28b9fa1c95027db331b7774b ]

Broadwell made a small change to the rank target register moving the
target rank ID field up from bits 16:19 to bits 20:23.

Also found that the offset field grew by one bit in the IVY_BRIDGE to
HASWELL transition, so fix the RIR_OFFSET() macro too.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org # v3.19+
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/2943fb819b1f7e396681165db9c12bb3df0e0b16.1464735623.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>