review.tizen.org Git - platform/kernel/linux-amlogic.git/log

mm: fix build of split ptlock code

Commit 597d795a2a78 ('mm: do not allocate page->ptl dynamically, if
spinlock_t fits to long') restructures some allocators that are compiled
even if USE_SPLIT_PTLOCKS arn't used.  It results in compilation
failure:

  mm/memory.c:4282:6: error: 'struct page' has no member named 'ptl'
  mm/memory.c:4288:12: error: 'struct page' has no member named 'ptl'

Add in the missing ifdef.

Fixes: 597d795a2a78 ('mm: do not allocate page->ptl dynamically, if spinlock_t fits to long')
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'arc-fixes-for-3.13-rc5' of git://git./linux/kernel/git/vgupta/arc

Pull ARC fix from Vineet Gupta:
"Fix busted syscall table due to unistd header inclusion issue"

* tag 'arc-fixes-for-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h

Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 ptrace fix from Catalin Marinas.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events

pstore: Don't allow high traffic options on fragile devices

Some pstore backing devices use on board flash as persistent
storage. These have limited numbers of write cycles so it
is a poor idea to use them from high frequency operations.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'dmaengine-fixes-3.13-rc4' of git://git./linux/kernel/git/djbw/dmaengine

Pull dmaengine fixes from Dan Williams:

- deprecation of net_dma to be removed in 3.14

- crash regression fix in pl330 from the dmaengine_unmap rework

- crash regression fix for any channel running raid ops without
   CONFIG_ASYNC_TX_DMA from dmaengine_unmap

- memory leak regression in mv_xor from dmaengine_unmap

- build warning regressions in mv_xor, fsldma, ppc4xx, txx9, and
   at_hdmac from dmaengine_unmap

- sleep in atomic regression in dma_async_memcpy_pg_to_pg

- new fix in mv_xor for handling channel initialization failures

* tag 'dmaengine-fixes-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
  net_dma: mark broken
  dma: pl330: ensure DMA descriptors are zero-initialised
  dmaengine: fix sleep in atomic
  dmaengine: mv_xor: fix oops when channels fail to initialise
  dma: mv_xor: Use dmaengine_unmap_data for the self-tests
  dmaengine: fix enable for high order unmap pools
  dma: fix build warnings in txx9
  dmatest: fix build warning on mips
  dma: fix fsldma build warnings
  dma: fix build warnings in ppc4xx
  dmaengine: at_hdmac: remove unused function
  dma: mv_xor: remove mv_desc_get_dest_addr()

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"The PPC folks had a large amount of changes queued for 3.13, and now
  they are fixing the bugs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Don't drop low-order page address bits
  powerpc: book3s: kvm: Don't abuse host r2 in exit path
  powerpc/kvm/booke: Fix build break due to stack frame size warning
  KVM: PPC: Book3S: PR: Enable interrupts earlier
  KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
  KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
  KVM: PPC: Book3S: PR: Don't clobber our exit handler id
  powerpc: kvm: fix rare but potential deadlock scene
  KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call
  KVM: PPC: Book3S HV: Make tbacct_lock irq-safe
  KVM: PPC: Book3S HV: Refine barriers in guest entry/exit
  KVM: PPC: Book3S HV: Fix physical address calculations

mm: do not allocate page->ptl dynamically, if spinlock_t fits to long

In struct page we have enough space to fit long-size page->ptl there,
but we use dynamically-allocated page->ptl if size(spinlock_t) is larger
than sizeof(int).

It hurts 64-bit architectures with CONFIG_GENERIC_LOCKBREAK, where
sizeof(spinlock_t) == 8, but it easily fits into struct page.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: page_alloc: revert NUMA aspect of fair allocation policy

Commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy") meant
to bring aging fairness among zones in system, but it was overzealous
and badly regressed basic workloads on NUMA systems.

Due to the way kswapd and page allocator interacts, we still want to
make sure that all zones in any given node are used equally for all
allocations to maximize memory utilization and prevent thrashing on the
highest zone in the node.

While the same principle applies to NUMA nodes - memory utilization is
obviously improved by spreading allocations throughout all nodes -
remote references can be costly and so many workloads prefer locality
over memory utilization. The original change assumed that
zone_reclaim_mode would be a good enough predictor for that, but it
turned out to be as indicative as a coin flip.

Revert the NUMA aspect of the fairness until we can find a proper way to
make it configurable and agree on a sane default.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@kernel.org> # 3.12
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "mm: page_alloc: exclude unreclaimable allocations from zone fairness policy"

This reverts commit 73f038b863df. The NUMA behaviour of this patch is
less than ideal. An alternative approch is to interleave allocations
only within local zones which is implemented in the next patch.

Cc: stable@vger.kernel.org
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: Fix NULL pointer dereference in madvise(MADV_WILLNEED) support

Sasha Levin found a NULL pointer dereference that is due to a missing
page table lock, which in turn is due to the pmd entry in question being
a transparent huge-table entry.

The code - introduced in commit 1998cc048901 ("mm: make
madvise(MADV_WILLNEED) support swap file prefetch") - correctly checks
for this situation using pmd_none_or_trans_huge_or_clear_bad(), but it
turns out that that function doesn't work correctly.

pmd_none_or_trans_huge_or_clear_bad() expected that pmd_bad() would
trigger if the transparent hugepage bit was set, but it doesn't do that
if pmd_numa() is also set. Note that the NUMA bit only gets set on real
NUMA machines, so people trying to reproduce this on most normal
development systems would never actually trigger this.

Fix it by removing the very subtle (and subtly incorrect) expectation,
and instead just checking pmd_trans_huge() explicitly.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
[ Additionally remove the now stale test for pmd_trans_huge() inside the
pmd_bad() case - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'signed-for-3.13' of git://github.com/agraf/linux-2.6 into kvm-master

Patch queue for 3.13 - 2013-12-18

This fixes some grave issues we've only found after 3.13-rc1:

  - Make the modularized HV/PR book3s kvm work well as modules
  - Fix some race conditions
  - Fix compilation with certain compilers (booke)
  - Fix THP for book3s_hv
  - Fix preemption for book3s_pr

Alexander Graf (4):
      KVM: PPC: Book3S: PR: Don't clobber our exit handler id
      KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
      KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
      KVM: PPC: Book3S: PR: Enable interrupts earlier

Aneesh Kumar K.V (1):
      powerpc: book3s: kvm: Don't abuse host r2 in exit path

Paul Mackerras (5):
      KVM: PPC: Book3S HV: Fix physical address calculations
      KVM: PPC: Book3S HV: Refine barriers in guest entry/exit
      KVM: PPC: Book3S HV: Make tbacct_lock irq-safe
      KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call
      KVM: PPC: Book3S HV: Don't drop low-order page address bits

Scott Wood (1):
      powerpc/kvm/booke: Fix build break due to stack frame size warning

pingfan liu (1):
      powerpc: kvm: fix rare but potential deadlock scene

Merge tag 'stable/for-linus-3.13-rc4-tag' of git://git./linux/kernel/git/xen/tip

Pull Xen bugfixes from Konrad Rzeszutek Wilk:
- Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use
   scratch pages.
- Fix block API header for ARM32 and ARM64 to have proper layout
- On ARM when mapping guests, stick on PTE_SPECIAL
- When using SWIOTLB under ARM, don't call swiotlb functions twice
- When unmapping guests memory and if we fail, don't return pages which
   failed to be unmapped.
- Grant driver was using the wrong address on ARM.

* tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/balloon: Seperate the auto-translate logic properly (v2)
  xen/block: Correctly define structures in public headers on ARM32 and ARM64
  arm: xen: foreign mapping PTEs are special.
  xen/arm64: do not call the swiotlb functions twice
  xen: privcmd: do not return pages which we have failed to unmap
  XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn

Merge tag 'trace-fixes-v3.13-rc2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull ftrace fix from Steven Rostedt:
"This fixes a long standing bug in the ftrace profiler.  The problem is
  that the profiler only initializes the online CPUs, and not possible
  CPUs.  This causes issues if the user takes CPUs online or offline
  while the profiler is running.

  If we online a CPU after starting the profiler, we lose all the trace
  information on the CPU going online.

  If we offline a CPU after running a test and start a new test, it will
  not clear the old data from that CPU.

  This bug causes incorrect data to be reported to the user if they
  online or offline CPUs during the profiling"

* tag 'trace-fixes-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Initialize the ftrace profiler for each possible cpu

arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events

Commit 8f34a1da35ae ("arm64: ptrace: use HW_BREAKPOINT_EMPTY type for
disabled breakpoints") fixed an issue with GDB trying to zero breakpoint
control registers. The problem there is that the arch hw_breakpoint code
will attempt to create a (disabled), execute breakpoint of length 0.

This will fail validation and report unexpected failure to GDB. To avoid
this, we treated disabled breakpoints as HW_BREAKPOINT_EMPTY, but that
seems to have broken with recent kernels, causing watchpoints to be
treated as TYPE_INST in the core code and returning ENOSPC for any
further breakpoints.

This patch fixes the problem by prioritising the `enable' field of the
breakpoint: if it is cleared, we simply update the perf_event_attr to
indicate that the thing is disabled and don't bother changing either the
type or the length. This reinforces the behaviour that the breakpoint
control register is essentially read-only apart from the enable bit
when disabling a breakpoint.

Cc: <stable@vger.kernel.org>
Reported-by: Aaron Liu <liucy214@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"An RT group-scheduling fix and the sched-domains topology setup fix
  from Mel"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
  sched: Assign correct scheduling domain to 'sd_llc'

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"An ABI documentation fix, and a mixed-PMU perf-info-corruption fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Document the new transaction sample type
perf: Disable all pmus on unthrottling and rescheduling

Merge tag 'sound-3.13-rc5' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"We have a bit more changes than usual in ASoC here, as it was slipped
  from the previous update.  There are one minr ASoC PCM code fix and
  ASoC dmaengine fix, in addition of a collection of small ASoC driver
  fixes.  The rest are a couple of HD-audio stable fixups, and a
  long-standing fix for the paused stream handling.

  So, all commits look not scary (and hopefully won't give you
  disastrous holiday season)"

* tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Add Dell headset detection quirk for one more laptop model
  ASoC: wm8904: fix DSP mode B configuration
  ASoC: wm_adsp: Add small delay while polling DSP RAM start
  ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function
  ASoC: kirkwood: Fix the CPU DAI rates
  ASoC: wm5110: Correct HPOUT3 DAPM route typo
  ALSA: hda - Add Dell headset detection quirk for three laptop models
  ALSA: hda - Add enable_msi=0 workaround for four HP machines
  ASoC: don't leak on error in snd_dmaengine_pcm_register
  ASoC: fsl: imx-wm8962: Don't update bias_level in machine driver
  ASoC: tegra: fix uninitialized variables in set_fmt
  ASoC: wm8962: Enable SYSCLK provisonally before fetching generated DSPCLK_DIV
  ASoC: sam9x5_wm8731: change to work in DSP A mode
  ASoC: atmel_ssc_dai: add dai trigger ops
  ASoC: soc-pcm: Use valid condition for snd_soc_dai_digital_mute() in hw_free()

ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h

Commit 97bc386fc12d "ARC: Add guard macro to uapi/asm/unistd.h"
inhibited multiple inclusion of ARCH unistd.h. This however hosed the system
since Generic syscall table generator relies on it being included twice,
and in lack-of an empty table was emitted by C preprocessor.

Fix that by allowing one exception to rule for the special case (just
like Xtensa)

Suggested-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

Merge tag 'asoc-v3.13-rc4' of git://git./linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.13

The fixes here are all driver specific ones, none of which particularly
stand out but all of which are useful to users of those drivers.

Merge remote-tracking branches 'asoc/fix/adsp', 'asoc/fix/arizona', 'asoc/fix/atmel', 'asoc/fix/fsl', 'asoc/fix/kirkwood', 'asoc/fix/tegra', 'asoc/fix/wm8904' and 'asoc/fix/wm8962' into asoc-linus

Merge remote-tracking branch 'asoc/fix/dma' into asoc-linus

Merge remote-tracking branch 'asoc/fix/core' into asoc-linus

Merge branch 'akpm' (incoming from Andrew)

Merge patches from Andrew Morton:
"23 fixes and a MAINTAINERS update"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
  mm/hugetlb: check for pte NULL pointer in __page_check_address()
  fix build with make 3.80
  mm/mempolicy: fix !vma in new_vma_page()
  MAINTAINERS: add Davidlohr as GPT maintainer
  mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
  mm/compaction: respect ignore_skip_hint in update_pageblock_skip
  mm/mempolicy: correct putback method for isolate pages if failed
  mm: add missing dependency in Kconfig
  sh: always link in helper functions extracted from libgcc
  mm: page_alloc: exclude unreclaimable allocations from zone fairness policy
  mm: numa: defer TLB flush for THP migration as long as possible
  mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates
  mm: fix TLB flush race between migration, and change_protection_range
  mm: numa: avoid unnecessary disruption of NUMA hinting during migration
  mm: numa: clear numa hinting information on mprotect
  sched: numa: skip inaccessible VMAs
  mm: numa: avoid unnecessary work on the failure path
  mm: numa: ensure anon_vma is locked to prevent parallel THP splits
  mm: numa: do not clear PTE for pte_numa update
  mm: numa: do not clear PMD during PTE update scan
  ...

mm/hugetlb: check for pte NULL pointer in __page_check_address()

In __page_check_address(), if address's pud is not present,
huge_pte_offset() will return NULL, we should check the return value.

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: qiuxishi <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix build with make 3.80

According to Documentation/Changes, make 3.80 is still being supported
for building the kernel, hence make files must not make (unconditional)
use of features introduced only in newer versions.

Commit 1bf49dd4be0b ("./Makefile: export initial ramdisk compression
config option") however introduced "else ifeq" constructs which make
3.80 doesn't understand. Replace the logic there with more conventional
(in the kernel build infrastructure) list constructs (except that the
list here is intentionally limited to exactly one element).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: P J P <ppandit@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mempolicy: fix !vma in new_vma_page()

BUG_ON(!vma) assumption is introduced by commit 0bf598d863e3 ("mbind:
add BUG_ON(!vma) in new_vma_page()"), however, even if

    address = __vma_address(page, vma);

and

    vma->start < address < vma->end

page_address_in_vma() may still return -EFAULT because of many other
conditions in it.  As a result the while loop in new_vma_page() may end
with vma=NULL.

This patch revert the commit and also fix the potential dereference NULL
pointer reported by Dan.

   http://marc.info/?l=linux-mm&m=137689530323257&w=2

  kernel BUG at mm/mempolicy.c:1204!
  invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  CPU: 3 PID: 7056 Comm: trinity-child3 Not tainted 3.13.0-rc3+ #2
  task: ffff8801ca5295d0 ti: ffff88005ab20000 task.ti: ffff88005ab20000
  RIP: new_vma_page+0x70/0x90
  RSP: 0000:ffff88005ab21db0  EFLAGS: 00010246
  RAX: fffffffffffffff2 RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000008040075 RSI: ffff8801c3d74600 RDI: ffffea00079a8b80
  RBP: ffff88005ab21dc8 R08: 0000000000000004 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: fffffffffffffff2
  R13: ffffea00079a8b80 R14: 0000000000400000 R15: 0000000000400000

  FS:  00007ff49c6f4740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007ff49c68f994 CR3: 000000005a205000 CR4: 00000000001407e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Stack:
   ffffea00079a8b80 ffffea00079a8bc0 ffffea00079a8ba0 ffff88005ab21e50
   ffffffff811adc7a 0000000000000000 ffff8801ca5295d0 0000000464e224f8
   0000000000000000 0000000000000002 0000000000000000 ffff88020ce75c00
  Call Trace:
    migrate_pages+0x12a/0x850
    SYSC_mbind+0x513/0x6a0
    SyS_mbind+0xe/0x10
    ia32_do_call+0x13/0x13
  Code: 85 c0 75 2f 4c 89 e1 48 89 da 31 f6 bf da 00 02 00 65 44 8b 04 25 08 f7 1c 00 e8 ec fd ff ff 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 4c 89 e6 48 89 df ba 01 00 00 00 e8 48
  RIP  [<ffffffff8119f200>] new_vma_page+0x70/0x90
   RSP <ffff88005ab21db0>

Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: add Davidlohr as GPT maintainer

Add a new entry for the GPT standard. Any future changes will now be
routed through linux-efi.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully

After a successful hugetlb page migration by soft offline, the source
page will either be freed into hugepage_freelists or buddy(over-commit
page).  If page is in buddy, page_hstate(page) will be NULL.  It will
hit a NULL pointer dereference in dequeue_hwpoisoned_huge_page().

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
  IP: [<ffffffff81163761>] dequeue_hwpoisoned_huge_page+0x131/0x1d0
  PGD c23762067 PUD c24be2067 PMD 0
  Oops: 0000 [#1] SMP

So check PageHuge(page) after call migrate_pages() successfully.

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/compaction: respect ignore_skip_hint in update_pageblock_skip

update_pageblock_skip() only fits to compaction which tries to isolate
by pageblock unit. If isolate_migratepages_range() is called by CMA, it
try to isolate regardless of pageblock unit and it don't reference
get_pageblock_skip() by ignore_skip_hint. We should also respect it on
update_pageblock_skip() to prevent from setting the wrong information.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mempolicy: correct putback method for isolate pages if failed

queue_pages_range() isolates hugetlbfs pages and putback_lru_pages()
can't handle these. We should change it to putback_movable_pages().

Naoya said that it is worth going into stable, because it can break
in-use hugepage list.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.12.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add missing dependency in Kconfig

Eliminate the following (rand)config warning by adding missing PROC_FS
dependency:

warning: (HWPOISON_INJECT && MEM_SOFT_DIRTY) selects PROC_PAGE_MONITOR which has unmet direct dependencies (PROC_FS && MMU)

Signed-off-by: Sima Baymani <sima.baymani@gmail.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sh: always link in helper functions extracted from libgcc

E.g. landisk_defconfig, which has CONFIG_NTFS_FS=m:

  ERROR: "__ashrdi3" [fs/ntfs/ntfs.ko] undefined!

For "lib-y", if no symbols in a compilation unit are referenced by other
units, the compilation unit will not be included in vmlinux.  This
breaks modules that do reference those symbols.

Use "obj-y" instead to fix this.

http://kisskb.ellerman.id.au/kisskb/buildresult/8838077/

This doesn't fix all cases. There are others, e.g. udivsi3.
This is also not limited to sh, many architectures handle this in the
same way.

A simple solution is to unconditionally include all helper functions.
A more complex solution is to make the choice of "lib-y" or "obj-y" depend
on CONFIG_MODULES:

  obj-$(CONFIG_MODULES) += ...
  lib-y($CONFIG_MODULES) += ...

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Tested-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Reviewed-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: page_alloc: exclude unreclaimable allocations from zone fairness policy

Dave Hansen noted a regression in a microbenchmark that loops around
open() and close() on an 8-node NUMA machine and bisected it down to
commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy").
That change forces the slab allocations of the file descriptor to spread
out to all 8 nodes, causing remote references in the page allocator and
slab.

The round-robin policy is only there to provide fairness among memory
allocations that are reclaimed involuntarily based on pressure in each
zone. It does not make sense to apply it to unreclaimable kernel
allocations that are freed manually, in this case instantly after the
allocation, and incur the remote reference costs twice for no reason.

Only round-robin allocations that are usually freed through page reclaim
or slab shrinking.

Bisected by Dave Hansen.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: defer TLB flush for THP migration as long as possible

THP migration can fail for a variety of reasons. Avoid flushing the TLB
to deal with THP migration races until the copy is ready to start.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates

According to documentation on barriers, stores issued before a LOCK can
complete after the lock implying that it's possible tlb_flush_pending
can be visible after a page table update. As per revised documentation,
this patch adds a smp_mb__before_spinlock to guarantee the correct
ordering.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: fix TLB flush race between migration, and change_protection_range

There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.

The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.

During that time, a CPU may continue writing to the page.

This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.

When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.

This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).

The basic race looks like this:

CPU A CPU B CPU C

load TLB entry
make entry PTE/PMD_NUMA
fault on entry
read/write old page
start migrating page
change PTE/PMD to new page
read/write old page [*]
flush TLB
reload TLB from new entry
read/write new page
lose data

[*] the old page may belong to a new user at this point!

The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.

This should fix both NUMA migration and compaction.

[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: avoid unnecessary disruption of NUMA hinting during migration

do_huge_pmd_numa_page() handles the case where there is parallel THP
migration. However, by the time it is checked the NUMA hinting
information has already been disrupted. This patch adds an earlier
check with some helpers.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: clear numa hinting information on mprotect

On a protection change it is no longer clear if the page should be still
accessible. This patch clears the NUMA hinting fault bits on a
protection change.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched: numa: skip inaccessible VMAs

Inaccessible VMA should not be trapping NUMA hint faults. Skip them.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: avoid unnecessary work on the failure path

If a PMD changes during a THP migration then migration aborts but the
failure path is doing more work than is necessary.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: ensure anon_vma is locked to prevent parallel THP splits

The anon_vma lock prevents parallel THP splits and any associated
complexity that arises when handling splits during THP migration. This
patch checks if the lock was successfully acquired and bails from THP
migration if it failed for any reason.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: do not clear PTE for pte_numa update

The TLB must be flushed if the PTE is updated but change_pte_range is
clearing the PTE while marking PTEs pte_numa without necessarily
flushing the TLB if it reinserts the same entry. Without the flush,
it's conceivable that two processors have different TLBs for the same
virtual address and at the very least it would generate spurious faults.

This patch only unmaps the pages in change_pte_range for a full
protection change.

[riel@redhat.com: write pte_numa pte back to the page tables]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Chegu Vinod <chegu_vinod@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: do not clear PMD during PTE update scan

If the PMD is flushed then a parallel fault in handle_mm_fault() will
enter the pmd_none and do_huge_pmd_anonymous_page() path where it'll
attempt to insert a huge zero page. This is wasteful so the patch
avoids clearing the PMD when setting pmd_numa.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: clear pmd_numa before invalidating

On x86, PMD entries are similar to _PAGE_PROTNONE protection and are
handled as NUMA hinting faults.  The following two page table protection
bits are what defines them

_PAGE_NUMA:set _PAGE_PRESENT:clear

A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE,
_PAGE_PSE or _PAGE_NUMA bits are set.  If pmdp_invalidate encounters a
pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be
considered not present by the CPU but present by pmd_present.  The
existing caller of pmdp_invalidate should handle it but it's an
inconsistent state for a PMD.  This patch keeps the state consistent
when calling pmdp_invalidate.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: call MMU notifiers on THP migration

MMU notifiers must be called on THP page migration or secondary MMUs
will get very confused.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: serialise parallel get_user_page against THP migration

Base pages are unmapped and flushed from cache and TLB during normal
page migration and replaced with a migration entry that causes any
parallel NUMA hinting fault or gup to block until migration completes.

THP does not unmap pages due to a lack of support for migration entries
at a PMD level. This allows races with get_user_pages and
get_user_pages_fast which commit 3f926ab945b6 ("mm: Close races between
THP migration and PMD numa clearing") made worse by introducing a
pmd_clear_flush().

This patch forces get_user_page (fast and normal) on a pmd_numa page to
go through the slow get_user_page path where it will serialise against
THP migration and properly account for the NUMA hinting fault. On the
migration side the page table lock is taken for each PTE update.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kexec: migrate to reboot cpu

Commit 1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic
kernel") moved reboot= handling to generic code.  In the process it also
removed the code in native_machine_shutdown() which are moving reboot
process to reboot_cpu/cpu0.

I guess that thought must have been that all reboot paths are calling
migrate_to_reboot_cpu(), so we don't need this special handling.  But
kexec reboot path (kernel_kexec()) is not calling
migrate_to_reboot_cpu() so above change broke kexec.  Now reboot can
happen on non-boot cpu and when INIT is sent in second kerneo to bring
up BP, it brings down the machine.

So start calling migrate_to_reboot_cpu() in kexec reboot path to avoid
this problem.

Bisected by WANG Chao.

Reported-by: Matthew Whitehead <mwhitehe@redhat.com>
Reported-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Tested-by: Baoquan He <bhe@redhat.com>
Tested-by: WANG Chao <chaowang@redhat.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'usb-3.13-rc5' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are a few USB fixes for things that have people have reported
  issues with recently"

* tag 'usb-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: ohci-at91: fix irq and iomem resource retrieval
  usb: phy: fix driver dependencies
  phy: kconfig: add depends on "USB_PHY" to OMAP_USB2 and TWL4030_USB
  drivers: phy: tweaks to phy_create()
  drivers: phy: Fix memory leak
  xhci: Limit the spurious wakeup fix only to HP machines
  usb: chipidea: fix nobody cared IRQ when booting with host role
  usb: chipidea: host: Only disable the vbus regulator if it is not NULL
  usb: serial: zte_ev: move support for ZTE AC2726 from zte_ev back to option
  usb: cdc-wdm: manage_power should always set needs_remote_wakeup
  usb: phy-tegra-usb.c: wrong pointer check for remap UTMI
  usb: phy: twl6030-usb: signedness bug in twl6030_readb()
  usb: dwc3: power off usb phy in error path
  usb: dwc3: invoke phy_resume after phy_init

Merge tag 'tty-3.13-rc5' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
"Here are a few fixes for 3.13-rc5 that resolve a number of reported
  tty and serial driver issues"

* tag 'tty-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: xuartps: Properly guard sysrq specific code
  n_tty: Fix apparent order of echoed output
  serial: 8250_dw: add new ACPI IDs
  serial: 8250_dw: Fix LCR workaround regression
  tty: Fix hang at ldsem_down_read()

Merge tag 'staging-3.13-rc5' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
"Here are a number of staging, and iio, fixes for 3.13-rc5 that resolve
  some reported issues"

* tag 'staging-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  imx-drm: imx-drm-core: improve safety of imx_drm_add_crtc()
  imx-drm: imx-drm-core: make imx_drm_crtc_register() safer
  imx-drm: imx-drm-core: use defined constant for number of CRTCs.
  imx-drm: imx-tve: don't call sleeping functions beneath enable_lock spinlock
  imx-drm: ipu-v3: fix potential CRTC device registration race
  imx-drm: imx-drm-core: fix DRM cleanup paths
  imx-drm: imx-drm-core: fix error cleanup path for imx_drm_add_crtc()
  staging: comedi: drivers: fix return value of comedi_load_firmware()
  staging: comedi: 8255_pci: fix for newer PCI-DIO48H
  iio:adc:ad7887 Fix channel reported endianness from cpu to big endian
  iio:imu:adis16400 fix pressure channel scan type
  staging:iio:mag:hmc5843 fix incorrect endianness of channel as a result of missuse of the IIO_ST macro.
  iio: cm36651: Changed return value of read function

Merge tag 'driver-core-3.13-rc5' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg KH:
"Here's a single sysfs fix for 3.13-rc5 that resolves a lockdep issue
in sysfs that has been reported"

* tag 'driver-core-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: give different locking key to regular and bin files

Merge branch 'keys-devel' of git://git./linux/kernel/git/dhowells/linux-fs

Pull crypto key patches from David Howells:
"There are four items:

   - A patch to fix X.509 certificate gathering.  The problem was that I
     was coming up with a different path for signing_key.x509 in the
     build directory if it didn't exist to if it did exist.  This meant
     that the X.509 cert container object file would be rebuilt on the
     second rebuild in a build directory and the kernel would get
     relinked.

   - Unconditionally remove files generated by SYSTEM_TRUSTED_KEYRING=y
     when doing make mrproper.

   - Actually initialise the persistent-keyring semaphore for
     init_user_ns.  I have no idea why this works at all for users in
     the base user namespace unless it's something to do with systemd
     containerising the system.

   - Documentation for module signing"

* 'keys-devel' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  Add Documentation/module-signing.txt file
  KEYS: fix uninitialized persistent_keyring_register_sem
  KEYS: Remove files generated when SYSTEM_TRUSTED_KEYRING=y
  X.509: Fix certificate gathering

net_dma: mark broken

net_dma can cause data to be copied to a stale mapping if a
copy-on-write fault occurs during dma.  The application sees missing
data.

The following trace is triggered by modifying the kernel to WARN if it
ever triggers copy-on-write on a page that is undergoing dma:

WARNING: CPU: 24 PID: 2529 at lib/dma-debug.c:485 debug_dma_assert_idle+0xd2/0x120()
ioatdma 0000:00:04.0: DMA-API: cpu touching an active dma mapped page [pfn=0x16bcd9]
Modules linked in: iTCO_wdt iTCO_vendor_support ioatdma lpc_ich pcspkr dca
CPU: 24 PID: 2529 Comm: linbug Tainted: G        W    3.13.0-rc1+ #353
  00000000000001e5 ffff88016f45f688 ffffffff81751041 ffff88017ab0ef70
  ffff88016f45f6d8 ffff88016f45f6c8 ffffffff8104ed9c ffffffff810f3646
  ffff8801768f4840 0000000000000282 ffff88016f6cca10 00007fa2bb699349
Call Trace:
  [<ffffffff81751041>] dump_stack+0x46/0x58
  [<ffffffff8104ed9c>] warn_slowpath_common+0x8c/0xc0
  [<ffffffff810f3646>] ? ftrace_pid_func+0x26/0x30
  [<ffffffff8104ee86>] warn_slowpath_fmt+0x46/0x50
  [<ffffffff8139c062>] debug_dma_assert_idle+0xd2/0x120
  [<ffffffff81154a40>] do_wp_page+0xd0/0x790
  [<ffffffff811582ac>] handle_mm_fault+0x51c/0xde0
  [<ffffffff813830b9>] ? copy_user_enhanced_fast_string+0x9/0x20
  [<ffffffff8175fc2c>] __do_page_fault+0x19c/0x530
  [<ffffffff8175c196>] ? _raw_spin_lock_bh+0x16/0x40
  [<ffffffff810f3539>] ? trace_clock_local+0x9/0x10
  [<ffffffff810fa1f4>] ? rb_reserve_next_event+0x64/0x310
  [<ffffffffa0014c00>] ? ioat2_dma_prep_memcpy_lock+0x60/0x130 [ioatdma]
  [<ffffffff8175ffce>] do_page_fault+0xe/0x10
  [<ffffffff8175c862>] page_fault+0x22/0x30
  [<ffffffff81643991>] ? __kfree_skb+0x51/0xd0
  [<ffffffff813830b9>] ? copy_user_enhanced_fast_string+0x9/0x20
  [<ffffffff81388ea2>] ? memcpy_toiovec+0x52/0xa0
  [<ffffffff8164770f>] skb_copy_datagram_iovec+0x5f/0x2a0
  [<ffffffff8169d0f4>] tcp_rcv_established+0x674/0x7f0
  [<ffffffff816a68c5>] tcp_v4_do_rcv+0x2e5/0x4a0
  [..]
---[ end trace e30e3b01191b7617 ]---
Mapped at:
  [<ffffffff8139c169>] debug_dma_map_page+0xb9/0x160
  [<ffffffff8142bf47>] dma_async_memcpy_pg_to_pg+0x127/0x210
  [<ffffffff8142cce9>] dma_memcpy_pg_to_iovec+0x119/0x1f0
  [<ffffffff81669d3c>] dma_skb_copy_datagram_iovec+0x11c/0x2b0
  [<ffffffff8169d1ca>] tcp_rcv_established+0x74a/0x7f0:

...the problem is that the receive path falls back to cpu-copy in
several locations and this trace is just one of the areas.  A few
options were considered to fix this:

1/ sync all dma whenever a cpu copy branch is taken

2/ modify the page fault handler to hold off while dma is in-flight

Option 1 adds yet more cpu overhead to an "offload" that struggles to compete
with cpu-copy.  Option 2 adds checks for behavior that is already documented as
broken when using get_user_pages().  At a minimum a debug mode is warranted to
catch and flag these violations of the dma-api vs get_user_pages().

Thanks to David for his reproducer.

Cc: <stable@vger.kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Reported-by: David Whipple <whipple@securedatainnovations.ch>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

dma: pl330: ensure DMA descriptors are zero-initialised

I see the following splat with 3.13-rc1 when attempting to perform DMA:

[  253.004516] Alignment trap: not handling instruction e1902f9f at [<c0204b40>]
[  253.004583] Unhandled fault: alignment exception (0x221) at 0xdfdfdfd7
[  253.004646] Internal error: : 221 [#1] PREEMPT SMP ARM
[  253.004691] Modules linked in: dmatest(+) [last unloaded: dmatest]
[  253.004798] CPU: 0 PID: 671 Comm: kthreadd Not tainted 3.13.0-rc1+ #2
[  253.004864] task: df9b0900 ti: df03e000 task.ti: df03e000
[  253.004937] PC is at dmaengine_unmap_put+0x14/0x34
[  253.005010] LR is at pl330_tasklet+0x3c8/0x550
[  253.005087] pc : [<c0204b44>]    lr : [<c0207478>]    psr: a00e0193
[  253.005087] sp : df03fe48  ip : 00000000  fp : df03bf18
[  253.005178] r10: bf00e108  r9 : 00000001  r8 : 00000000
[  253.005245] r7 : df837040  r6 : dfb41800  r5 : df837048  r4 : df837000
[  253.005316] r3 : dfdfdfcf  r2 : dfb41f80  r1 : df837048  r0 : dfdfdfd7
[  253.005384] Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[  253.005459] Control: 30c5387d  Table: 9fb9ba80  DAC: fffffffd
[  253.005520] Process kthreadd (pid: 671, stack limit = 0xdf03e248)

This is due to desc->txd.unmap containing garbage (uninitialised memory).

Rather than add another dummy initialisation to _init_desc, instead
ensure that the descriptors are zero-initialised during allocation and
remove the dummy, per-field initialisation.

Cc: Andriy Shevchenko <andriy.shevchenko@intel.com>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

ALSA: hda - Add Dell headset detection quirk for one more laptop model

On the Dell machines with codec whose Subsystem Id is 0x10280640,
no external microphone can be detected when plugging a 3-ring headset.
Using ALC255_FIXUP_DELL1_MIC_NO_PRESENCE can fix this problem.

The codec (Vendor ID: 0x10ec0255) on the machine belongs to alc_269
family.

BugLink: https://bugs.launchpad.net/bugs/1260303
Cc: David Henningsson <david.henningsson@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ASoC: wm8904: fix DSP mode B configuration

When wm8904 work in DSP mode B, we still need to configure it to
work in DSP mode. Or else, it will work in Right Justified mode.

Signed-off-by: Bo Shen <voice.shen@atmel.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org

ASoC: wm_adsp: Add small delay while polling DSP RAM start

Some devices are getting very close to the limit whilst polling the RAM
start, this patch adds a small delay to this loop to give a longer
startup timeout.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org

KVM: PPC: Book3S HV: Don't drop low-order page address bits

Commit caaa4c804fae ("KVM: PPC: Book3S HV: Fix physical address
calculations") unfortunately resulted in some low-order address bits
getting dropped in the case where the guest is creating a 4k HPTE
and the host page size is 64k. By getting the low-order bits from
hva rather than gpa we miss out on bits 12 - 15 in this case, since
hva is at page granularity. This puts the missing bits back in.

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

powerpc: book3s: kvm: Don't abuse host r2 in exit path

We don't use PACATOC for PR. Avoid updating HOST_R2 with PR
KVM mode when both HV and PR are enabled in the kernel. Without this we
get the below crash

(qemu)
Unable to handle kernel paging request for data at address 0xffffffffffff8310
Faulting instruction address: 0xc00000000001d5a4
cpu 0x2: Vector: 300 (Data Access) at [c0000001dc53aef0]
    pc: c00000000001d5a4: .vtime_delta.isra.1+0x34/0x1d0
    lr: c00000000001d760: .vtime_account_system+0x20/0x60
    sp: c0000001dc53b170
   msr: 8000000000009032
   dar: ffffffffffff8310
dsisr: 40000000
  current = 0xc0000001d76c62d0
  paca    = 0xc00000000fef1100   softe: 0        irq_happened: 0x01
    pid   = 4472, comm = qemu-system-ppc
enter ? for help
[c0000001dc53b200] c00000000001d760 .vtime_account_system+0x20/0x60
[c0000001dc53b290] c00000000008d050 .kvmppc_handle_exit_pr+0x60/0xa50
[c0000001dc53b340] c00000000008f51c kvm_start_lightweight+0xb4/0xc4
[c0000001dc53b510] c00000000008cdf0 .kvmppc_vcpu_run_pr+0x150/0x2e0
[c0000001dc53b9e0] c00000000008341c .kvmppc_vcpu_run+0x2c/0x40
[c0000001dc53ba50] c000000000080af4 .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
[c0000001dc53bae0] c00000000007b4c8 .kvm_vcpu_ioctl+0x478/0x730
[c0000001dc53bca0] c0000000002140cc .do_vfs_ioctl+0x4ac/0x770
[c0000001dc53bd80] c0000000002143e8 .SyS_ioctl+0x58/0xb0
[c0000001dc53be30] c000000000009e58 syscall_exit+0x0/0x98

Signed-off-by: Alexander Graf <agraf@suse.de>

imx-drm: imx-drm-core: improve safety of imx_drm_add_crtc()

We must not add more CRTCs than we have declared to the vblank
helpers, otherwise we overflow their arrays. Force failure if we
exceed the number of CRTCs.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

imx-drm: imx-drm-core: make imx_drm_crtc_register() safer

imx_drm_crtc_register() doesn't clean up the CRTC upon failure, which
leaves the CRTC attached to the DRM device. Also, it does setup after
attaching the CRTC to the DRM device.

Fix this by reordering the function such that we do the setup before
drm_crtc_init(): this fixes both issues.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

imx-drm: imx-drm-core: use defined constant for number of CRTCs.

We have this definition, there's no reason not to use it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

imx-drm: imx-tve: don't call sleeping functions beneath enable_lock spinlock

Enable lock claims that it is serializing tve_enable/disable calls.
However, DRM already serialises mode sets with a mutex, which prevents
encoder/connector functions being called concurrently. Secondly,
holding a spinlock while calling clk_prepare_enable() is wrong; it
will cause a might_sleep() warning should that debugging be enabled.
So, let's just get rid of the enable_lock.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

imx-drm: ipu-v3: fix potential CRTC device registration race

Clean up the IPUv3 CRTC device registration; we don't need a separate
function just to call platform_device_register_data(), and we don't
need the return value converted at all.

Update the IPU client id under a mutex, so that parallel probing
doesn't race.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

imx-drm: imx-drm-core: fix DRM cleanup paths

We must call drm_vblank_cleanup() on the error cleanup and unload paths
after we've had a successful call to drm_vblank_init(). Ensure that
the calls are in the reverse order to the initialisation order.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

imx-drm: imx-drm-core: fix error cleanup path for imx_drm_add_crtc()

imx_drm_add_crtc() was kfree'ing the imx_drm_crtc structure while
leaving it on the list of CRTCs. Delete it from the list first.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Definitely seems quieter this week,

  Radeon, intel, intel broadwell, vmwgfx, ttm, armada, and a couple of
  core fixes, one revert in radeon

  Most of these are either going to stable or fixes for things
  introduced in the merge window"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (30 commits)
  drm/edid: add quirk for BPC in Samsung NP700G7A-S01PL notebook
  drm/ttm: Fix accesses through vmas with only partial coverage
  drm/nouveau: only runtime suspend by default in optimus configuration
  drm: don't double-free on driver load error
  Revert "drm/radeon: Implement radeon_pci_shutdown"
  drm/radeon: add missing display tiling setup for oland
  drm/radeon: fix typo in cik_copy_dma
  drm/radeon/cik: plug in missing blit callback
  drm/radeon/dpm: Fix hwmon crash
  drm/radeon: Fix sideport problems on certain RS690 boards
  drm/i915: don't update the dri1 breadcrumb with modesetting
  DRM: Armada: prime refcounting bug fix
  DRM: Armada: fix printing of phys_addr_t/dma_addr_t
  DRM: Armada: destroy framebuffer after helper
  DRM: Armada: implement lastclose() for fbhelper
  drm/i915: Repeat eviction search after idling the GPU
  drm/vmwgfx: Add max surface memory param
  drm/i915: Fix use-after-free in do_switch
  drm/i915: fix pm init ordering
  drm/i915: Hold mutex across i915_gem_release
  ...

tty: xuartps: Properly guard sysrq specific code

Commit 'tty: xuartps: Implement BREAK detection, add SYSRQ support'
(0c0c47bc40a2e358d593b2d7fb93b50027fbfc0c) introduced sysrq support
without properly guarding sysrq specific code which results in build
errors when sysrq is disabled:
DNAME=KBUILD_STR(xilinx_uartps)" -c -o
drivers/tty/serial/.tmp_xilinx_uartps.o
drivers/tty/serial/xilinx_uartps.c
drivers/tty/serial/xilinx_uartps.c: In function 'xuartps_isr':
drivers/tty/serial/xilinx_uartps.c:247:5: error: 'struct uart_port'
has no member named 'sysrq'
drivers/tty/serial/xilinx_uartps.c:247:5: error: 'struct uart_port'
has no member named 'sysrq'
drivers/tty/serial/xilinx_uartps.c:247:5: error: 'struct uart_port'
has no member named 'sysrq'
make[3]: *** [drivers/tty/serial/xilinx_uartps.o] Error 1

Reported-by: Masanari Iida <standby24x7@gmail.com>
Cc: Vlad Lungu <vlad.lungu@windriver.com>
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"A quick batch of fixes, including the annoying bad lock stack problem
  introduced by udp_sk_rx_dst_set() locking change:

   1) Use xchg() instead of sk_dst_lock() in udp_sk_rx_dst_set(), from
      Eric Dumazet.

   2) qlcnic bug fixes from Himanshu Madhani and Manish Chopra.

   3) Update IPSEC MAINTAINERS entry, from Steffen Klassert.

   4) Administrative neigh entry changes should generate netlink
      notifications the same as event generated ones.  From Bob
      Gilligan.

   5) Netfilter SYNPROXY fixes from Patrick McHardy.

   6) Netfilter nft_reject endianness fixes from Eric Leblond"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  qlcnic: Dump mailbox registers when mailbox command times out.
  qlcnic: Fix mailbox processing during diagnostic test
  qlcnic: Allow firmware dump collection when auto firmware recovery is disabled
  qlcnic: Fix memory allocation
  qlcnic: Fix TSS/RSS validation for 83xx/84xx series adapter.
  qlcnic: Fix TSS/RSS ring validation logic.
  qlcnic: Fix diagnostic test for all adapters.
  qlcnic: Fix usage of netif_tx_{wake, stop} api during link change.
  xen-netback: fix fragments error handling in checksum_setup_ip()
  neigh: Netlink notification for administrative NUD state change
  ipv4: improve documentation of ip_no_pmtu_disc
  net: unix: allow bind to fail on mutex lock
  MAINTAINERS: Update the IPsec maintainer entry
  udp: ipv4: do not use sk_dst_lock from softirq context
  netvsc: don't flush peers notifying work during setting mtu
  can: peak_usb: fix mem leak in pcan_usb_pro_init()
  can: ems_usb: fix urb leaks on failure paths
  sctp: loading sctp when load sctp_probe
  netfilter: nft_reject: fix endianness in dump function
  netfilter: SYNPROXY target: restrict to INPUT/FORWARD

Merge branch 'fixes-for-3.13' of git://gitorious.org/linux-can/linux-can

Marc Kleine-Budde says:

====================
this is a pull request with two fixes for net/master, the current release
cycle.

It consists of a patch by Alexey Khoroshilov from the Linux Driver Verification
project, which fixes a memory leak in ems_usb's failure patch. And a patch by
me which fixes a memory leak in the peak usb driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qlcnic'

Himanshu Madhani says:

====================
qlcnic: Bug fixes.

This series contains bug fixes for mailbox handling and multi Tx queue support
for all supported adapters.

changes from v1 -> v2
o updated patch to fix usage of netif_tx_{wake,stop} api during link change
as per David Miller's suggestion.
o Dropped patch to use spinklock per tx queue for more work.
o Added reworked patch for memory allocation failures.
o Added patch to allow capturing of dump, when auto recovery is disabled in firmware.
o Added patches for mailbox interrupt handling and debugging data for mailbox failure.

Please apply to net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Dump mailbox registers when mailbox command times out.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix mailbox processing during diagnostic test

o Do not enable mailbox polling in case of legacy interrupt.
Process mailbox AEN/response from the interrupt.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Allow firmware dump collection when auto firmware recovery is disabled

o Allow driver to collect firmware dump, during a forced firmware dump
operation, when auto firmware recovery is disabled. Also, during this
operation, driver should not allow reset recovery to be performed.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix memory allocation

o Use vzalloc() instead of kzalloc() for allocation of
bootloader size memory. kzalloc() may fail to allocate
the size of bootloader

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix TSS/RSS validation for 83xx/84xx series adapter.

o Current code was not allowing the user to configure more
  than one Tx ring using ethtool for 83xx/84xx adapter.
  This regression was introduced by commit id
  18afc102fdcb95d6c7d57f2967a06f2f8fe3ba4c ("qlcnic: Enable
  multiple Tx queue support for 83xx/84xx Series adapter.")

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix TSS/RSS ring validation logic.

o TSS/RSS ring validation does not take into account that either
  of these ring values can be 0. This patch fixes this validation
  and would fail set_channel operation if any of these ring value
  is 0. This regression was added as part of commit id
  34e8c406fda5b5a9d2e126a92bab84cd28e3b5fa ("qlcnic: refactor Tx/SDS
  ring calculation and validation in driver.")

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix diagnostic test for all adapters.

o Driver should re-allocate all Tx queues after completing
  diagnostic tests. This regression was added by commit id
  c2c5e3a0681bb1945c0cb211a5f4baa22cb2cbb3 ("qlcnic: Enable
  diagnostic test for multiple Tx queues.")

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix usage of netif_tx_{wake, stop} api during link change.

o Driver was using netif_tx_{stop,wake}_all_queues() api
  during link change event. Remove these api calls to
  manage queue start/stop event, as core networking stack
  will manage this based on netif_carrier_{on,off} call.
  These API's were modified as part of commit id
  012ec81223aa45d2b80aeafb77392fd1a19c7b10 ("qlcnic: Multi Tx
  queue support for 82xx Series adapter.")

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

usb: ohci-at91: fix irq and iomem resource retrieval

When using dt resources retrieval (interrupts and reg properties) there is
no predefined order for these resources in the platform dev resources
table.

Retrieve resources using platform_get_resource and platform_get_irq
functions instead of direct resource table entries to avoid resource type
mismatch.

Signed-off-by: Boris BREZILLON <b.brezillon@overkiz.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xen-netback: fix fragments error handling in checksum_setup_ip()

Fix to return -EPROTO error if fragments detected in checksum_setup_ip().

Fixes: 1431fb31ecba ('xen-netback: fix fragment detection in checksum setup')
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

neigh: Netlink notification for administrative NUD state change

The neighbour code sends up an RTM_NEWNEIGH netlink notification if
the NUD state of a neighbour cache entry is changed by a timer (e.g.
from REACHABLE to STALE), even if the lladdr of the entry has not
changed.

But an administrative change to the the NUD state of a neighbour cache
entry that does not change the lladdr (e.g. via "ip -4 neigh change
... nud ...") does not trigger a netlink notification. This means
that netlink listeners will not hear about administrative NUD state
changes such as from a resolved state to PERMANENT.

This patch changes the neighbor code to generate an RTM_NEWNEIGH
message when the NUD state of an entry is changed administratively.

Signed-off-by: Bob Gilligan <gilligan@aristanetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: comedi: drivers: fix return value of comedi_load_firmware()

Some of the callback functions that upload the firmware in the comedi
drivers return a positive value indicating the number of bytes sent
to the device. Detect this condition and just return '0' to indicate
a successful upload.

Reported-by: Bernd Porr <mail@berndporr.me.uk>
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Ian Abbott <abbotti@mev.co.uk>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: comedi: 8255_pci: fix for newer PCI-DIO48H

At some point, Measurement Computing / ComputerBoards redesigned the
PCI-DIO48H to use a PLX PCI interface chip instead of an AMCC chip.
This meant they had to put their hardware registers in the PCI BAR 2
region instead of PCI BAR 1.  Unfortunately, they kept the same PCI
device ID for the new design.  This means the driver recognizes the
newer cards, but doesn't work (and is likely to screw up the local
configuration registers of the PLX chip) because it's using the wrong
region.

Since  the PCI subvendor and subdevice IDs were both zero on the old
design, but are the same as the vendor and device on the new design, we
can tell the old design and new design apart easily enough.  Split the
existing entry for the PCI-DIO48H in `pci_8255_boards[]` into two new
entries, referenced by different entries in the PCI device ID table
`pci_8255_pci_table[]`.  Use the same board name for both entries.

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Cc: stablle <stable@vger.kernel.org> # 3.10.y # 3.11.y # 3.12.y # 3.13.y
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'iio-fixes-for-3.13c' of git://git./linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

Third set of fixes for IIO in the 3.13 cycle.

* Fix for a bug in the new cm36651 driver where it told the IIO driver it
  was providing a decimal part, but then didn't.  Now it correctly tells the
  IIO core that it is only providing an integer value.  This prevents random
  incorrect values being output on a sysfs read.

* 3 fixes where drivers were miss specifying the endianness of their channels
  as output through the buffer interface.  These were discovered whilst
  removing the terrible IIO_ST macro once and for all.  The result is that
  userspace may be informed that the buffer elements are being output as
  little endian (on little endian platforms) when infact they are big endian.
  Thus userspace will handle them incorrectly.  This incorrect buffer
  element specification is provided as sysfs attributes under
  iio:deviceN/scan_elements.

Merge tag 's2mps11-build' of git://git./linux/kernel/git/broonie/regulator

Pull regulator/clk fix from Mark Brown:
"Fix s2mps11 build

  This patch fixes a build failure that appeared in v3.13-rc4 due to an
  RTC/MFD update merged via -mm"

* tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  mfd: s2mps11: Fix build after regmap field rename in sec-core.c

iio:adc:ad7887 Fix channel reported endianness from cpu to big endian

Note this also sets the endianness to big endian whereas it would
previously have defaulted to the cpu endian. Hence technically
this is a bug fix on LE platforms.

Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: stable@vger.kernel.org

Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Ingo Molnar:
"Five self-contained fixlets:

   - fix clocksource driver build bug
   - fix two sched_clock() bugs triggering on specific hardware
   - fix devicetree enumeration bug affecting specific hardware
   - fix irq handler registration race resulting in boot crash
   - fix device node refcount bug"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: dw_apb_timer_of: Fix support for dts binding "snps,dw-apb-timer"
  clocksource: dw_apb_timer_of: Fix read_sched_clock
  clocksource: sunxi: Stop timer from ticking before enabling interrupts
  clocksource: clksrc-of: Do not drop unheld reference on device node
  clocksource: armada-370-xp: Register sched_clock after the counter reset
  clocksource: time-efm32: Select CLKSRC_MMIO

Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Three fixes for scheduler crashes, each triggers in relatively rare,
  hardware environment dependent situations"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Rework sched_fair time accounting
  math64: Add mul_u64_u32_shr()
  sched: Remove PREEMPT_NEED_RESCHED from generic code
  sched: Initialize power_orig for overlapping groups

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fix from Ingo Molnar:
"An x86/intel event constraint fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix constraint table end marker bug

iio:imu:adis16400 fix pressure channel scan type

A single channel in this driver was using the IIO_ST macro.
This does not provide a parameter for setting the endianness of
the channel. Thus this channel will have been reported as whatever
is the native endianness of the cpu rather than big endian. This
means it would be incorrect on little endian platforms.

Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: stable@vger.kernel.org

staging:iio:mag:hmc5843 fix incorrect endianness of channel as a result of missuse of the IIO_ST macro.

This driver sets the shift value equal to IIO_BE (or 1) rather than setting
that to 0 and specificying the endianness. This means the channel type is
missreported as
[be|le]:u16/16>>1 where the be|le is dependent on the cpu native endianness,
rather than
be:u16/16>>0 resulting in any userspace code using this information, miss
converting the channel and generating thoroughly trashed data.

Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: stable@vger.kernel.org

ipv4: improve documentation of ip_no_pmtu_disc

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git./linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
The following patchset contains two Netfilter fixes for your net
tree, they are:

* Fix endianness in nft_reject, the NFTA_REJECT_TYPE netlink attributes
  was not converted to network byte order as needed by all nfnetlink
  subsystems, from Eric Leblond.

* Restrict SYNPROXY target to INPUT and FORWARD chains, this avoid a
  possible crash due to misconfigurations, from Patrick McHardy.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: unix: allow bind to fail on mutex lock

This is similar to the set_peek_off patch where calling bind while the
socket is stuck in unix_dgram_recvmsg() will block and cause a hung task
spew after a while.

This is also the last place that did a straightforward mutex_lock(), so
there shouldn't be any more of these patches.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: Update the IPsec maintainer entry

Add the IPsec git trees and some pure IPsec modules
to the IPsec section in the MAINTAINERS file.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

udp: ipv4: do not use sk_dst_lock from softirq context

Using sk_dst_lock from softirq context is not supported right now.

Instead of adding BH protection everywhere,
udp_sk_rx_dst_set() can instead use xchg(), as suggested
by David.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Fixes: 975022310233 ("udp: ipv4: must add synchronization in udp_sk_rx_dst_set()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'gpio-v3.13-4' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"All but one are long-standing bug fixes that are also tagged for
  stable

   - Driver bug fixes for SH PFC, TWL4030, MSM and RCAR.

   - Update the MAINTAINERS"

* tag 'gpio-v3.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: rcar: Fix level interrupt handling
  gpio: msm: Fix irq mask/unmask by writing bits instead of numbers
  gpio: twl4030: Fix regression for twl gpio LED output
  sh-pfc: Fix PINMUX_GPIO macro
  MAINTAINERS: update GPIO maintainers entry

Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull two Ceph fixes from Sage Weil:
"One of these is fixing a regression from the d_flags file type patch
  that went into -rc1 that broke instantiation of inodes and dentries
  (we were doing dentries first).  The other is just an off-by-one
  corner case"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: Avoid data inconsistency due to d-cache aliasing in readpage()
  ceph: initialize inode before instantiating dentry

netvsc: don't flush peers notifying work during setting mtu

There's a possible deadlock if we flush the peers notifying work during setting
mtu:

[   22.991149] ======================================================
[   22.991173] [ INFO: possible circular locking dependency detected ]
[   22.991198] 3.10.0-54.0.1.el7.x86_64.debug #1 Not tainted
[   22.991219] -------------------------------------------------------
[   22.991243] ip/974 is trying to acquire lock:
[   22.991261]  ((&(&net_device_ctx->dwork)->work)){+.+.+.}, at: [<ffffffff8108af95>] flush_work+0x5/0x2e0
[   22.991307]
but task is already holding lock:
[   22.991330]  (rtnl_mutex){+.+.+.}, at: [<ffffffff81539deb>] rtnetlink_rcv+0x1b/0x40
[   22.991367]
which lock already depends on the new lock.

[   22.991398]
the existing dependency chain (in reverse order) is:
[   22.991426]
-> #1 (rtnl_mutex){+.+.+.}:
[   22.991449]        [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[   22.991477]        [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[   22.991501]        [<ffffffff81673659>] mutex_lock_nested+0x89/0x4f0
[   22.991529]        [<ffffffff815392b7>] rtnl_lock+0x17/0x20
[   22.991552]        [<ffffffff815230b2>] netdev_notify_peers+0x12/0x30
[   22.991579]        [<ffffffffa0340212>] netvsc_send_garp+0x22/0x30 [hv_netvsc]
[   22.991610]        [<ffffffff8108d251>] process_one_work+0x211/0x6e0
[   22.991637]        [<ffffffff8108d83b>] worker_thread+0x11b/0x3a0
[   22.991663]        [<ffffffff81095e5d>] kthread+0xed/0x100
[   22.991686]        [<ffffffff81681c6c>] ret_from_fork+0x7c/0xb0
[   22.991715]
-> #0 ((&(&net_device_ctx->dwork)->work)){+.+.+.}:
[   22.991715]        [<ffffffff810de817>] check_prevs_add+0x967/0x970
[   22.991715]        [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[   22.991715]        [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[   22.991715]        [<ffffffff8108afde>] flush_work+0x4e/0x2e0
[   22.991715]        [<ffffffff8108e1b5>] __cancel_work_timer+0x95/0x130
[   22.991715]        [<ffffffff8108e303>] cancel_delayed_work_sync+0x13/0x20
[   22.991715]        [<ffffffffa03404e4>] netvsc_change_mtu+0x84/0x200 [hv_netvsc]
[   22.991715]        [<ffffffff815233d4>] dev_set_mtu+0x34/0x80
[   22.991715]        [<ffffffff8153bc2a>] do_setlink+0x23a/0xa00
[   22.991715]        [<ffffffff8153d054>] rtnl_newlink+0x394/0x5e0
[   22.991715]        [<ffffffff81539eac>] rtnetlink_rcv_msg+0x9c/0x260
[   22.991715]        [<ffffffff8155cdd9>] netlink_rcv_skb+0xa9/0xc0
[   22.991715]        [<ffffffff81539dfa>] rtnetlink_rcv+0x2a/0x40
[   22.991715]        [<ffffffff8155c41d>] netlink_unicast+0xdd/0x190
[   22.991715]        [<ffffffff8155c807>] netlink_sendmsg+0x337/0x750
[   22.991715]        [<ffffffff8150d219>] sock_sendmsg+0x99/0xd0
[   22.991715]        [<ffffffff8150d63e>] ___sys_sendmsg+0x39e/0x3b0
[   22.991715]        [<ffffffff8150eba2>] __sys_sendmsg+0x42/0x80
[   22.991715]        [<ffffffff8150ebf2>] SyS_sendmsg+0x12/0x20
[   22.991715]        [<ffffffff81681d19>] system_call_fastpath+0x16/0x1b

This is because we hold the rtnl_lock() before ndo_change_mtu() and try to flush
the work in netvsc_change_mtu(), in the mean time, netdev_notify_peers() may be
called from worker and also trying to hold the rtnl_lock. This will lead the
flush won't succeed forever. Solve this by not canceling and flushing the work,
this is safe because the transmission done by NETDEV_NOTIFY_PEERS was
synchronized with the netif_tx_disable() called by netvsc_change_mtu().

Reported-by: Yaju Cao <yacao@redhat.com>
Tested-by: Yaju Cao <yacao@redhat.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>