platform/kernel/linux-rpi3.git
8 years agoMerge branch 'perf/urgent' into perf/core, to pick up fixes
Ingo Molnar [Thu, 4 Feb 2016 07:57:44 +0000 (08:57 +0100)]
Merge branch 'perf/urgent' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-urgent-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel...
Ingo Molnar [Thu, 4 Feb 2016 07:56:54 +0000 (08:56 +0100)]
Merge tag 'perf-urgent-for-mingo-2' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fix from Arnaldo Carvalho de Melo:

  - Fix 'perf stat' interval output values (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Thu, 4 Feb 2016 07:55:00 +0000 (08:55 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 - tracepoint_error() can receive e=NULL, robustify it, fixes a problem noticed
   with a very specific combination: Machine with Intel PT (e.g. Broadwell),
   kernel with no perf_event_attr.context_switch feature (e.g. 4.2) and unreadable
   tracefs (for instance !root users), making the fallback from
   perf_event_attr.context_switch to the sched:sched_switch tracepoint to fail
   reading its info from tracefs, fix it. (Adrian Hunter)

 - Fix segfault in intel PT, by making it follow the 'struct thread' lifetime cycle
   checking expectations, noticed for instance, when processing perf.data files with
   Intel PT data using 'perf script' and when exiting 'perf report' (Adrian Hunter)

 - Fix CFI usage from .eh_frame and .debug_frame, which sometimes requires that we
   fallback from .eh_frame to .debug_frame in architectures such as PowerPC (Hemant Kumar)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf stat: Fix interval output values
Jiri Olsa [Wed, 3 Feb 2016 07:43:56 +0000 (08:43 +0100)]
perf stat: Fix interval output values

We broke interval data displays with commit:

  3f416f22d1e2 ("perf stat: Do not clean event's private stats")

This commit removed stats cleaning, which is important for '-r' option
to carry counters data over the whole run. But it's necessary to clean
it for interval mode, otherwise the displayed value is avg of all
previous values.

Before:
  $ perf stat -e cycles -a -I 1000 record
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791        107,823,524      cycles

  $ perf stat report
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791         91,519,906      cycles

Now:
  $ perf stat report
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791        107,823,524      cycles

Notice the second value being bigger (91,.. < 107,..).

This could be easily verified by using perf script which displays raw
stat data:

  $ perf script
  CPU  THREAD       VAL         ENA         RUN        TIME EVENT
    0      -1  23855779  1000209530  1000209530  1000240796 cycles
    1      -1  33340397  1000224964  1000224964  1000240796 cycles
    2      -1  15835415  1000226695  1000226695  1000240796 cycles
    3      -1   2184696  1000228245  1000228245  1000240796 cycles
    0      -1  97014312  2000514533  2000514533  2000512791 cycles
    1      -1  46121497  2000543795  2000543795  2000512791 cycles
    2      -1  32269530  2000543566  2000543566  2000512791 cycles
    3      -1   7634472  2000544108  2000544108  2000512791 cycles

The sum of the first 4 values is the first interval aggregated value:

  23855779 + 33340397 + 15835415 + 2184696 = 75,216,287

The sum of the second 4 values minus first value is the second interval
aggregated value:

  97014312 + 46121497 + 32269530 + 7634472 - 75216287 = 107,823,524

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454485436-20639-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Wed, 3 Feb 2016 18:10:02 +0000 (10:10 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
 "18 fixes"

[ The 18 fixes turned into 17 commits, because one of the fixes was a
  fix for another patch in the series that I just folded in by editing
  the patch manually - hopefully correctly     - Linus ]

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix memory leak in copy_huge_pmd()
  drivers/hwspinlock: fix race between radix tree insertion and lookup
  radix-tree: fix race in gang lookup
  mm/vmpressure.c: fix subtree pressure detection
  mm: polish virtual memory accounting
  mm: warn about VmData over RLIMIT_DATA
  Documentation: cgroup-v2: add memory.stat::sock description
  mm: memcontrol: drop superfluous entry in the per-memcg stats array
  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
  proc: revert /proc/<pid>/maps [stack:TID] annotation
  numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390
  MAINTAINERS: update Seth email
  ocfs2/cluster: fix memory leak in o2hb_region_release
  lib/test-string_helpers.c: fix and improve string_get_size() tests
  thp: limit number of object to scan on deferred_split_scan()
  thp: change deferred_split_count() to return number of THP in queue
  thp: make split_queue per-node

8 years agoMerge tag 'for-linus-4.5-2' of git://git.code.sf.net/p/openipmi/linux-ipmi
Linus Torvalds [Wed, 3 Feb 2016 18:04:58 +0000 (10:04 -0800)]
Merge tag 'for-linus-4.5-2' of git://git.code.sf.net/p/openipmi/linux-ipmi

Pull IPMI fix from Corey Minyard:
 "Fix a compile error on IPMI when ACPI is disabled"

* tag 'for-linus-4.5-2' of git://git.code.sf.net/p/openipmi/linux-ipmi:
  ipmi: put acpi.h with the other headers

8 years agoMerge tag 'devicetree-fixes-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 3 Feb 2016 17:55:50 +0000 (09:55 -0800)]
Merge tag 'devicetree-fixes-for-4.5' of git://git./linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:

 - Fix build error with *_OF_DECLARE() when used in modules

 - Add missing platform maintainers for dts files in MAINTAINERS

* tag 'devicetree-fixes-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: drop symbols declared by _OF_DECLARE() from modules
  MAINTAINERS: Add missing platform maintainers for dts files

8 years agoMerge tag 'nfs-for-4.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Wed, 3 Feb 2016 17:36:41 +0000 (09:36 -0800)]
Merge tag 'nfs-for-4.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfix and cleanup from Trond Myklebust:
 "Bugfix:
   - pNFS: Fix for missing layoutreturn calls

  Cleanup:
   - pNFS: rename NFS_LAYOUT_RETURN_BEFORE_CLOSE for code clarity"

* tag 'nfs-for-4.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSE
  pNFS: Fix missing layoutreturn calls

8 years agoMerge tag 'trace-v4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Wed, 3 Feb 2016 17:31:34 +0000 (09:31 -0800)]
Merge tag 'trace-v4.5-rc1-2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "A cleanup to the stack tracer broke stack tracing on s390.  Here's a
  simple fix to correct that issue"

* tag 'trace-v4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/stacktrace: Show entire trace if passed in function not found

8 years agomm: retire GUP WARN_ON_ONCE that outlived its usefulness
Hugh Dickins [Sun, 31 Jan 2016 02:03:16 +0000 (18:03 -0800)]
mm: retire GUP WARN_ON_ONCE that outlived its usefulness

Trinity is now hitting the WARN_ON_ONCE we added in v3.15 commit
cda540ace6a1 ("mm: get_user_pages(write,force) refuse to COW in shared
areas").  The warning has served its purpose, nobody was harmed by that
change, so just remove the warning to generate less noise from Trinity.

Which reminds me of the comment I wrongly left behind with that commit
(but was spotted at the time by Kirill), which has since moved into a
separate function, and become even more obscure: delete it.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipmi: put acpi.h with the other headers
Tony Camuso [Tue, 2 Feb 2016 18:57:24 +0000 (13:57 -0500)]
ipmi: put acpi.h with the other headers

Enclosing '#include <linux/acpi.h>' within '#ifdef CONFIG_ACPI' is
unnecessary, since it has its own conditional compile for CONFIG_ACPI.

Commit 0fbcf4af7c83 ("ipmi: Convert the IPMI SI ACPI handling to a
platform device") exposed this as a problem for platforms that do not
support ACPI when it introduced a call to ACPI_PTR() macro outside of
the CONFIG_ACPI conditional compile. This would have been perfectly
acceptable if acpi.h were not conditionally excluded for the non-acpi
platform, because the conditional compile within acpi.h defines
ACPI_PTR() to return NULL when compiled for non acpi platforms.

Signed-off-by: Tony Camuso <tcamuso@redhat.com>
Fixed commit reference in header to conform to standard.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
8 years agomm: fix memory leak in copy_huge_pmd()
Matthew Wilcox [Wed, 3 Feb 2016 00:57:57 +0000 (16:57 -0800)]
mm: fix memory leak in copy_huge_pmd()

We allocate a pgtable but do not attach it to anything if the PMD is in
a DAX VMA, causing it to leak.

We certainly try to not free pgtables associated with the huge zero page
if the zero page is in a DAX VMA, so I think this is the right solution.
This needs to be properly audited.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrivers/hwspinlock: fix race between radix tree insertion and lookup
Matthew Wilcox [Wed, 3 Feb 2016 00:57:55 +0000 (16:57 -0800)]
drivers/hwspinlock: fix race between radix tree insertion and lookup

of_hwspin_lock_get_id() is protected by the RCU lock, which means that
insertions can occur simultaneously with the lookup.  If the radix tree
transitions from a height of 0, we can see a slot with the indirect_ptr
bit set, which will cause us to at least read random memory, and could
cause other havoc.

Fix this by using the newly introduced radix_tree_iter_retry().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoradix-tree: fix race in gang lookup
Matthew Wilcox [Wed, 3 Feb 2016 00:57:52 +0000 (16:57 -0800)]
radix-tree: fix race in gang lookup

If the indirect_ptr bit is set on a slot, that indicates we need to redo
the lookup.  Introduce a new function radix_tree_iter_retry() which
forces the loop to retry the lookup by setting 'slot' to NULL and
turning the iterator back to point at the problematic entry.

This is a pretty rare problem to hit at the moment; the lookup has to
race with a grow of the radix tree from a height of 0.  The consequences
of hitting this race are that gang lookup could return a pointer to a
radix_tree_node instead of a pointer to whatever the user had inserted
in the tree.

Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/vmpressure.c: fix subtree pressure detection
Vladimir Davydov [Wed, 3 Feb 2016 00:57:49 +0000 (16:57 -0800)]
mm/vmpressure.c: fix subtree pressure detection

When vmpressure is called for the entire subtree under pressure we
mistakenly use vmpressure->scanned instead of vmpressure->tree_scanned
when checking if vmpressure work is to be scheduled.  This results in
suppressing all vmpressure events in the legacy cgroup hierarchy.  Fix it.

Fixes: 8e8ae645249b ("mm: memcontrol: hook up vmpressure to socket pressure")
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: polish virtual memory accounting
Konstantin Khlebnikov [Wed, 3 Feb 2016 00:57:46 +0000 (16:57 -0800)]
mm: polish virtual memory accounting

* add VM_STACK as alias for VM_GROWSUP/DOWN depending on architecture
* always account VMAs with flag VM_STACK as stack (as it was before)
* cleanup classifying helpers
* update comments and documentation

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: warn about VmData over RLIMIT_DATA
Konstantin Khlebnikov [Wed, 3 Feb 2016 00:57:43 +0000 (16:57 -0800)]
mm: warn about VmData over RLIMIT_DATA

This patch provides a way of working around a slight regression
introduced by commit 84638335900f ("mm: rework virtual memory
accounting").

Before that commit RLIMIT_DATA have control only over size of the brk
region.  But that change have caused problems with all existing versions
of valgrind, because it set RLIMIT_DATA to zero.

This patch fixes rlimit check (limit actually in bytes, not pages) and
by default turns it into warning which prints at first VmData misuse:

  "mmap: top (795): VmData 516096 exceed data ulimit 512000.  Will be forbidden soon."

Behavior is controlled by boot param ignore_rlimit_data=y/n and by sysfs
/sys/module/kernel/parameters/ignore_rlimit_data.  For now it set to "y".

[akpm@linux-foundation.org: tweak kernel-parameters.txt text[
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Link: http://lkml.kernel.org/r/20151228211015.GL2194@uranus
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kees Cook <keescook@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoDocumentation: cgroup-v2: add memory.stat::sock description
Johannes Weiner [Wed, 3 Feb 2016 00:57:41 +0000 (16:57 -0800)]
Documentation: cgroup-v2: add memory.stat::sock description

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: memcontrol: drop superfluous entry in the per-memcg stats array
Johannes Weiner [Wed, 3 Feb 2016 00:57:38 +0000 (16:57 -0800)]
mm: memcontrol: drop superfluous entry in the per-memcg stats array

MEM_CGROUP_STAT_NSTATS is just a delimiter for cgroup1 statistics, not
an actual array entry.  Reuse it for the first cgroup2 stat entry, like
in the event array.

Fixes: b2807f07f4f8 ("mm: memcontrol: add "sock" to cgroup2 memory.stat")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:35 +0000 (16:57 -0800)]
drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration

Reduced testcase:

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <numaif.h>

    #define SIZE 0x2000

    int main()
    {
        int fd;
        void *p;

        fd = open("/dev/sg0", O_RDWR);
        p = mmap(NULL, SIZE, PROT_EXEC, MAP_PRIVATE | MAP_LOCKED, fd, 0);
        mbind(p, SIZE, 0, NULL, 0, MPOL_MF_MOVE);
        return 0;
    }

We shouldn't try to migrate pages in sg VMA as we don't have a way to
update Sg_scatter_hold::pages accordingly from mm core.

Let's mark the VMA as VM_IO to indicate to mm core that the VMA is not
migratable.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Shiraz Hashim <shashim@codeaurora.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoproc: revert /proc/<pid>/maps [stack:TID] annotation
Johannes Weiner [Wed, 3 Feb 2016 00:57:29 +0000 (16:57 -0800)]
proc: revert /proc/<pid>/maps [stack:TID] annotation

Commit b76437579d13 ("procfs: mark thread stack correctly in
proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.

Finding the task of a stack VMA requires walking the entire thread list,
turning this into quadratic behavior: a thousand threads means a
thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
million combinations.

The cost is not in proportion to the usefulness as described in the
patch.

Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
/proc/<pid>/numa_maps) usable again for higher thread counts.

The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
identifying the stack VMA there is an O(1) operation.

Siddesh said:
 "The end users needed a way to identify thread stacks programmatically and
  there wasn't a way to do that.  I'm afraid I no longer remember (or have
  access to the resources that would aid my memory since I changed
  employers) the details of their requirement.  However, I did do this on my
  own time because I thought it was an interesting project for me and nobody
  really gave any feedback then as to its utility, so as far as I am
  concerned you could roll back the main thread maps information since the
  information is available in the thread-specific files"

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agonuma: fix /proc/<pid>/numa_maps for hugetlbfs on s390
Michael Holzheu [Wed, 3 Feb 2016 00:57:26 +0000 (16:57 -0800)]
numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390

When working with hugetlbfs ptes (which are actually pmds) is not valid to
directly use pte functions like pte_present() because the hardware bit
layout of pmds and ptes can be different.  This is the case on s390.
Therefore we have to convert the hugetlbfs ptes first into a valid pte
encoding with huge_ptep_get().

Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
huge_ptep_get().  On s390 this leads to the following two problems:

1) The pte_present() function returns false (instead of true) for
   PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
   completely in the "numa_maps" output.

2) The pte_dirty() function always returns false for all hugetlb ptes.
   Therefore these pages are reported as "mapped=xxx" instead of
   "dirty=xxx".

Therefore use huge_ptep_get() to correctly convert the hugetlb ptes.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: <stable@vger.kernel.org> [4.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: update Seth email
Seth Jennings [Wed, 3 Feb 2016 00:57:23 +0000 (16:57 -0800)]
MAINTAINERS: update Seth email

Update/unify my contact info.  The old email address will no longer work
soon.

Signed-off-by: Seth Jennings <sjenning@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2/cluster: fix memory leak in o2hb_region_release
Joseph Qi [Wed, 3 Feb 2016 00:57:21 +0000 (16:57 -0800)]
ocfs2/cluster: fix memory leak in o2hb_region_release

o2hb_region_release currently doesn't free o2hb_debug_buf
hr_db_elapsed_time and hr_db_pinned malloced in o2hb_debug_create.  Also
we should call debugfs_remove before freeing its data, to prevent the risk
accessing debugfs rightly after its data has been freed.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolib/test-string_helpers.c: fix and improve string_get_size() tests
Vitaly Kuznetsov [Wed, 3 Feb 2016 00:57:18 +0000 (16:57 -0800)]
lib/test-string_helpers.c: fix and improve string_get_size() tests

Recently added commit 564b026fbd0d ("string_helpers: fix precision loss
for some inputs") fixed precision issues for string_get_size() and broke
tests.

Fix and improve them: test both STRING_UNITS_2 and STRING_UNITS_10 at a
time, better failure reporting, test small an huge values.

Fixes: 564b026fbd0d28e9 ("string_helpers: fix precision loss for some inputs")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agothp: limit number of object to scan on deferred_split_scan()
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:15 +0000 (16:57 -0800)]
thp: limit number of object to scan on deferred_split_scan()

If we have a lot of pages in queue to be split, deferred_split_scan()
can spend unreasonable amount of time under spinlock with disabled
interrupts.

Let's cap number of pages to split on scan by sc->nr_to_scan.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agothp: change deferred_split_count() to return number of THP in queue
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:12 +0000 (16:57 -0800)]
thp: change deferred_split_count() to return number of THP in queue

I've got meaning of shrinker::count_objects() wrong: it should return
number of potentially freeable objects, which is not necessary correlate
with freeable memory.

Returning 256 per THP in queue is not reasonable:
shrinker::scan_objects() never called with nr_to_scan > 128 in my setup.

Let's return 1 per THP and correct scan_object accordingly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agothp: make split_queue per-node
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:08 +0000 (16:57 -0800)]
thp: make split_queue per-node

Andrea Arcangeli suggested to make split queue per-node to improve
scalability.  Let's do it.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge tag 'perf-core-for-mingo-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 3 Feb 2016 10:02:37 +0000 (11:02 +0100)]
Merge tag 'perf-core-for-mingo-3' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core callchain fixes and improvements from Arnaldo Carvalho de Melo <acme@redhat.com:

User visible changes:

  - Make --percent-limit apply to callchains also and fix some bugs
    related to --percent-limit (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-core-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 3 Feb 2016 10:00:29 +0000 (11:00 +0100)]
Merge tag 'perf-core-for-mingo-2' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf tooling changes from Arnaldo Carvalho de Melo:

New features:

 - Port 'perf kvm stat' to PowerPC (Hemant Kumar)

Infrastructure changes:

 - Use the 'feature-dump' target to do the feature checks just once and then
   add code to reuse that in the tests/make makefile, speeding up the
   'make -C tools/perf build-test' target (Wang Nan)

 - Reduce the number of tests the 'build-test' target do to those that don't
   pollute the source tree (Arnaldo Carvalho de Melo)

 - Improve the output of the build tests a bit by aligning the name of the
   tests, more can be done to filter out uninteresting info in the output
   (Arnaldo Carvalho de Melo)

 - Add perf_evlist pointer to *info_priv_size(), more prep work for
   supporting the coresight architecture (Mathieu Poirier)

 - Improve the 'perf test bp_signal' test (Wang Nan)

 - Check environment before starting the BPF 'perf test', so that we can just
   'Skip' older kernels instead of 'FAIL'ing them (Wang Nan)

 - Fix cpumode of synthesized buildid event (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 3 Feb 2016 09:58:46 +0000 (10:58 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

 - Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows",
   as it controls just the that UI element in the annotate browser (Taeung Song)

 - Avoid trying to read ELF symtabs from device files, noticed while doing
   memory profiling work (Jiri Olsa)

 - Improve context detection when offering options in the hists browser,
   i.e.  some options don't make sense when the browser is not working with
   a perf.data file ('perf top' mode), only in 'perf report' mode, like
   scripting (Namhyung Kim)

Infrastructure changes:

 - Elliminate duplication in the hists browser filter functions, getting the
   common part into a function that receives callbacks for filtering by
   DSO, thread, etc. (Namhyung Kim)

 - Fix misleadingly indented assignment, found using
   gcc6 -Wmisleading-indentation (Markus Trippelsdorf)

 - Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that
   detects such problems and then fixing the problem, so that the test now
   passes (Wang Nan)

 - More improvements to the build infrastructure to allow reusing the
   feature detection facilities (Wang Nan)

 - Auto initialize the globals needed by cpu__max_{cpu,node}() routines
   (Arnaldo Carvalho de Melo)

Documentation changes:

 - Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings)

 - Document a bunch more ~/.perfconfig knobs (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf probe: Search both .eh_frame and .debug_frame sections for probe location
Hemant Kumar [Tue, 2 Feb 2016 15:26:46 +0000 (20:56 +0530)]
perf probe: Search both .eh_frame and .debug_frame sections for probe location

'perf probe' through debuginfo__find_probes() in util/probe-finder.c
checks for the functions' frame descriptions in either .eh_frame section
of an ELF or the .debug_frame.

The check is based on whether either one of these sections is present.
Depending on distro, toolchain defaults, architetcutre, build flags,
etc., CFI might be found in either .eh_frame and/or .debug_frame.
Sometimes, it may happen that, .eh_frame, even if present, may not be
complete and may miss some descriptions.

Therefore, to be sure, to find the CFI covering an address we will
always have to investigate both if available.

For e.g., in powerpc, this may happen:
  $ gcc -g bin.c -o bin

  $ objdump --dwarf ./bin
  <1><145>: Abbrev Number: 7 (DW_TAG_subprogram)
     <146> DW_AT_external   : 1
     <146> DW_AT_name       : (indirect string, offset: 0x9e): main
     <14a> DW_AT_decl_file  : 1
     <14b> DW_AT_decl_line  : 39
     <14c> DW_AT_prototyped : 1
     <14c> DW_AT_type       : <0x57>
     <150> DW_AT_low_pc     : 0x100007b8

If the .eh_frame and .debug_frame are checked for the same binary, we
will find that, .eh_frame (although present) doesn't contain a
description for "main" function.

But, .debug_frame has a description:

  000000d8 00000024 00000000 FDE cie=00000000 pc=100007b8..10000838
    DW_CFA_advance_loc: 16 to 100007c8
    DW_CFA_def_cfa_offset: 144
    DW_CFA_offset_extended_sf: r65 at cfa+16
  ...

Due to this (since, perf checks whether .eh_frame is present and goes on
searching for that address inside that frame), perf is unable to process
the probes:

  # perf probe -x ./bin main
    Failed to get call frame on 0x100007b8
    Error: Failed to add events.

To avoid this issue, we need to check both the sections (.eh_frame and
.debug_frame), which is done in this patch.

Note that, we can always force everything into both .eh_frame and
.debug_frame by:

  $ gcc bin.c -fasynchronous-unwind-tables  -fno-dwarf2-cfi-asm -g -o bin

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1454426806-13974-1-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Fix thread lifetime related segfaut in intel_pt
Adrian Hunter [Mon, 1 Feb 2016 03:21:04 +0000 (03:21 +0000)]
perf tools: Fix thread lifetime related segfaut in intel_pt

intel_pt_process_auxtrace_info() creates a pt->unknown_thread thread
that eventually needs to be freed by the last thread__put() on it, when
its refcount hits zero, which may happen in
intel_pt_process_auxtrace_info() error handling path and triggers the
following segfault, which would happen as well at intel_pt_free, when
tools using this intel_pt codebase frees up resources:

  # perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  0  a  anaconda-ks.cfg  bin   perf.data perf.data.old  perf-f23-bringup.todo
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.217 MB perf.data ]
  #
  # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
  Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
  intel_pt_synth_events: failed to synthesize 'instructions' event type
  Segmentation fault (core dumped)
  #

The problem is: there's a union in 'struct thread' combines a list_head
and a rb_node. The standard life cycle of a thread is: init rb_node in
the constructor, insert it into machine->threads rbtree using rb_node,
move it to machine->dead_threads using list_head, clean in the last
thread__put: list_del_init(&thread->node).

In the above command, it clean a thread before adding it into list,
causes the above segfault.

Since pt->unknown_thread will never live in an rbtree, initialize its
list node so that when list_del_init() is done on it we don't segfault.

After this patch:

  # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
  Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
  intel_pt_synth_events: failed to synthesize 'instructions' event type
  0x248 [0x88]: failed to process type: 70
  #

Reported-by: Tong Zhang <ztong@vt.edu>
Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: http://lkml.kernel.org/r/1454296865-19749-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 1 Feb 2016 23:56:08 +0000 (15:56 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "This looks like a lot but it's a mixture of regression fixes as well
  as fixes for longer standing issues.

   1) Fix on-channel cancellation in mac80211, from Johannes Berg.

   2) Handle CHECKSUM_COMPLETE properly in xt_TCPMSS netfilter xtables
      module, from Eric Dumazet.

   3) Avoid infinite loop in UDP SO_REUSEPORT logic, also from Eric
      Dumazet.

   4) Avoid a NULL deref if we try to set SO_REUSEPORT after a socket is
      bound, from Craig Gallek.

   5) GRO key comparisons don't take lightweight tunnels into account,
      from Jesse Gross.

   6) Fix struct pid leak via SCM credentials in AF_UNIX, from Eric
      Dumazet.

   7) We need to set the rtnl_link_ops of ipv6 SIT tunnels before we
      register them, otherwise the NEWLINK netlink message is missing
      the proper attributes.  From Thadeu Lima de Souza Cascardo.

   8) Several Spectrum chip bug fixes for mlxsw switch driver, from Ido
      Schimmel

   9) Handle fragments properly in ipv4 easly socket demux, from Eric
      Dumazet.

  10) Don't ignore the ifindex key specifier on ipv6 output route
      lookups, from Paolo Abeni"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (128 commits)
  tcp: avoid cwnd undo after receiving ECN
  irda: fix a potential use-after-free in ircomm_param_request
  net: tg3: avoid uninitialized variable warning
  net: nb8800: avoid uninitialized variable warning
  net: vxge: avoid unused function warnings
  net: bgmac: clarify CONFIG_BCMA dependency
  net: hp100: remove unnecessary #ifdefs
  net: davinci_cpdma: use dma_addr_t for DMA address
  ipv6/udp: use sticky pktinfo egress ifindex on connect()
  ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()
  netlink: not trim skb for mmaped socket when dump
  vxlan: fix a out of bounds access in __vxlan_find_mac
  net: dsa: mv88e6xxx: fix port VLAN maps
  fib_trie: Fix shift by 32 in fib_table_lookup
  net: moxart: use correct accessors for DMA memory
  ipv4: ipconfig: avoid unused ic_proto_used symbol
  bnxt_en: Fix crash in bnxt_free_tx_skbs() during tx timeout.
  bnxt_en: Exclude rx_drop_pkts hw counter from the stack's rx_dropped counter.
  bnxt_en: Ring free response from close path should use completion ring
  net_sched: drr: check for NULL pointer in drr_dequeue
  ...

8 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Mon, 1 Feb 2016 23:49:18 +0000 (15:49 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

  API:
   - algif_hash needs to wait for init operations to complete.
   - The has_key setting for shash was always true.

  Algorithms:
   - Add missing selections of CRYPTO_HASH.
   - Fix pkcs7 authentication.

  Drivers:
   - Fix stack alignment bug in chacha20-ssse3.
   - Fix performance regression in caam due to incorrect setting.
   - Fix potential compile-only build failure of stm32"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: atmel-aes - remove calls of clk_prepare() from atomic contexts
  crypto: algif_hash - wait for crypto_ahash_init() to complete
  crypto: shash - Fix has_key setting
  hwrng: stm32 - Fix dependencies for !HAS_IOMEM archs
  crypto: ghash,poly1305 - select CRYPTO_HASH where needed
  crypto: chacha20-ssse3 - Align stack pointer to 64 bytes
  PKCS#7: Don't require SpcSpOpusInfo in Authenticode pkcs7 signatures
  crypto: caam - make write transactions bufferable on PPC platforms

8 years agoMerge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdim...
Linus Torvalds [Mon, 1 Feb 2016 23:21:20 +0000 (15:21 -0800)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "1/ Fixes to the libnvdimm 'pfn' device that establishes a reserved
     area for storing a struct page array.

  2/ Fixes for dax operations on a raw block device to prevent pagecache
     collisions with dax mappings.

  3/ A fix for pfn_t usage in vm_insert_mixed that lead to a null
     pointer de-reference.

  These have received build success notification from the kbuild robot
  across 153 configs and pass the latest ndctl tests"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  phys_to_pfn_t: use phys_addr_t
  mm: fix pfn_t to page conversion in vm_insert_mixed
  block: use DAX for partition table reads
  block: revert runtime dax control of the raw block device
  fs, block: force direct-I/O for dax-enabled block devices
  devm_memremap_pages: fix vmem_altmap lifetime + alignment handling
  libnvdimm, pfn: fix restoring memmap location
  libnvdimm: fix mode determination for e820 devices

8 years agoperf report: Don't show blank lines if entry has no callchain
Namhyung Kim [Thu, 28 Jan 2016 12:24:54 +0000 (21:24 +0900)]
perf report: Don't show blank lines if entry has no callchain

When all callchains of a hist entry is percent-limited, do not add a
blank line at the end.  It makes the entry look like it doesn't have
callchains.

Reported-and-Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160128122454.GA27446@danjae.kornet
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists browser: Fix percent display in callchains
Namhyung Kim [Wed, 27 Jan 2016 15:40:56 +0000 (00:40 +0900)]
perf hists browser: Fix percent display in callchains

When there's only a single callchain, perf doesn't print its percentage
in front of the symbols.  This is because it assumes that the percentage
is same as parents.  But if a percent limit is applied, it's possible
that there are actually a couple of child nodes but only one of them is
shown.  In this case it should display the percent to prevent
misunderstanding of its percentage is same as the parent's.

For example, let's see the following callchain.

  $ perf report --no-children --percent-limit 0.01 --tui
  ...
  -    0.06%  sleep    [kernel.vmlinux]    [k] kmem_cache_alloc_trace
       kmem_cache_alloc_trace
     - perf_event_mmap
        - 0.04% mmap_region
             do_mmap_pgoff
           - vm_mmap_pgoff
              + 0.02% sys_mmap_pgoff
              + 0.02% vm_mmap
           + 0.02% mprotect_fixup

Current code omits the percent if 'mmap_region' becomes the only node
when percent limit is set to 0.03%, its percent is not 0.06% but users
will assume it incorrectly.

Before:

  $ perf report --no-children --percent-limit 0.03 --tui
  ...
     0.06%  sleep    [kernel.vmlinux]    [k] kmem_cache_alloc_trace
       kmem_cache_alloc_trace
     - perf_event_mmap
        - mmap_region
          do_mmap_pgoff
          vm_mmap_pgoff

After:

  $ perf report --no-children --percent-limit 0.03 --tui
  ...
     0.06%  sleep    [kernel.vmlinux]    [k] kmem_cache_alloc_trace
       kmem_cache_alloc_trace
     - perf_event_mmap
        - 0.04% mmap_region
             do_mmap_pgoff
             vm_mmap_pgoff

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists browser: Pass parent_total to callchain print functions
Namhyung Kim [Wed, 27 Jan 2016 15:40:55 +0000 (00:40 +0900)]
perf hists browser: Pass parent_total to callchain print functions

Pass parent node's total period to callchain print functions.  This info
is needed by later patch to determine whether it can omit percent or not
correctly.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists browser: Fix dump to show correct callchain style
Namhyung Kim [Wed, 27 Jan 2016 15:40:54 +0000 (00:40 +0900)]
perf hists browser: Fix dump to show correct callchain style

The commit 8c430a348699 ("perf hists browser: Support folded
callchains") missed to update hist_browser__dump() so it always shows
graph-style callchains regardless of current setting.

To fix that, factor out callchain printing code and rename the existing
function which prints graph-style callchain.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8c430a348699 ("perf hists browser: Support folded callchains")
Link: http://lkml.kernel.org/r/1453909257-26015-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf report: Fix percent display in callchains on --stdio
Namhyung Kim [Wed, 27 Jan 2016 15:40:53 +0000 (00:40 +0900)]
perf report: Fix percent display in callchains on --stdio

When there's only a single callchain, perf doesn't print its percentage
in front of the symbols.  This is because it assumes that the percentage
is same as parents.  But if a percent limit is applied, it's possible
that there are actually a couple of child nodes but only one of them is
shown.  In this case it should display the percent to prevent
misunderstanding of its percentage is same as the parent's.

For example, let's see the following callchain.

  $ perf report -s comm --percent-limit 0.01 --stdio
  ...
     9.95%  swapper
            |
            |--7.57%--intel_idle
            |          cpuidle_enter_state
            |          cpuidle_enter
            |          call_cpuidle
            |          cpu_startup_entry
            |          |
            |          |--4.89%--start_secondary
            |          |
            |           --2.68%--rest_init
            |                     start_kernel
            |                     x86_64_start_reservations
            |                     x86_64_start_kernel
    |
    |--0.15%--__schedule
    |          |
    |          |--0.13%--schedule
    |          |          schedule_preempt_disable
    |          |          cpu_startup_entry
            |          |          |
            |          |          |--0.09%--start_secondary
            |          |          |
            |          |           --0.04%--rest_init
            |          |                     start_kernel
            |          |                     x86_64_start_reservations
            |          |                     x86_64_start_kernel
            |          |
            |           --0.01%--schedule_preempt_disabled
            |                     cpu_startup_entry
  ...

Current code omits the percent if 'intel_idle' becomes the only node
when percent limit is set to 0.5%, its percent is not 9.95% but users
will assume it incorrectly.

Before:

  $ perf report --percent-limit 0.5 --stdio
  ...
     9.95%  swapper
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--4.89%--start_secondary
               |
                --2.68%--rest_init
                          start_kernel
                          x86_64_start_reservations
                          x86_64_start_kernel

After:

  $ perf report --percent-limit 0.5 --stdio
  ...
     9.95%  swapper
            |
             --7.57%--intel_idle
                       cpuidle_enter_state
                       cpuidle_enter
                       call_cpuidle
                       cpu_startup_entry
                       |
                       |--4.89%--start_secondary
                       |
                        --2.68%--rest_init
                                  start_kernel
                                  x86_64_start_reservations
                                  x86_64_start_kernel

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf callchain: Pass parent_samples to __callchain__fprintf_graph()
Namhyung Kim [Wed, 27 Jan 2016 15:40:52 +0000 (00:40 +0900)]
perf callchain: Pass parent_samples to __callchain__fprintf_graph()

Pass hist entry's period to graph callchain print function.  This info
is needed by later patch to determine whether it can omit percentage of
top-level node or not.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf report: Get rid of hist_entry__callchain_fprintf()
Namhyung Kim [Wed, 27 Jan 2016 15:40:51 +0000 (00:40 +0900)]
perf report: Get rid of hist_entry__callchain_fprintf()

It's just a wrapper function to align the start position ofcallchains to
'comm' of each thread if it's a first sort key.  But it doesn't not work
with tracepoint events and also with upcoming hierarchy view.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf report: Apply --percent-limit to callchains also
Namhyung Kim [Wed, 27 Jan 2016 15:40:50 +0000 (00:40 +0900)]
perf report: Apply --percent-limit to callchains also

Currently --percent-limit option only works for hist entries.  However
it'd be better to have same effect to callchains as well

Requested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Update hists' total period when adding entries
Namhyung Kim [Wed, 27 Jan 2016 15:40:49 +0000 (00:40 +0900)]
perf hists: Update hists' total period when adding entries

Currently the hist entry addition path doesn't update total_period of
hists and it's calculated during 'resort' path.  But the resort path
needs to know the total period before doing its job because it's used
for calculating percent limit of callchains in hist entries.

So this patch update the total period during the addition path.  It
makes the percent limit of callchains working (again).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Fix min callchain hits calculation
Namhyung Kim [Wed, 27 Jan 2016 15:40:48 +0000 (00:40 +0900)]
perf hists: Fix min callchain hits calculation

The total period should be get using hists__total_period() since it
takes filtered entries into account.  In addition, if callchain mode is
'fractal', the total period should be the entry's period.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: tracepoint_error() can receive e=NULL, robustify it
Adrian Hunter [Tue, 26 Jan 2016 12:05:20 +0000 (14:05 +0200)]
perf tools: tracepoint_error() can receive e=NULL, robustify it

Fixes segmentation fault using, for instance:

  (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib64/libthread_db.so.1".

 Program received signal SIGSEGV, Segmentation fault.
  0 x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  (gdb) bt
  #0  0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  #1  0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:433
  #2  0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:498
  #3  0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:936
  #4  0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391
  #5  0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361
  #6  0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401
  #7  0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253
  #8  0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364
  #9  0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664
  #10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539
  #11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264
  #12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390
  #13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451
  #14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495
  #15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618
(gdb)

Intel PT attempts to find the sched:sched_switch tracepoint but that seg
faults if tracefs is not readable, because the error reporting structure
is null, as errors are not reported when automatically adding
tracepoints.  Fix by checking before using.

Committer note:

This doesn't take place in a kernel that supports
perf_event_attr.context_switch, that is the default way that will be
used for tracking context switches, only in older kernels, like 4.2, in
a machine with Intel PT (e.g. Broadwell) for non-priviledged users.

Further info from a similar patch by Wang:

The error is in tracepoint_error: it assumes the 'e' parameter is valid.

However, there are many situation a parse_event() can be called without
parse_events_error. See result of

  $ grep 'parse_events(.*NULL)' ./tools/perf/ -r'

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tong Zhang <ztong@vt.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 196581717d85 ("perf tools: Enhance parsing events tracepoint error output")
Link: http://lkml.kernel.org/r/1453809921-24596-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoLinux 4.5-rc2
Linus Torvalds [Mon, 1 Feb 2016 02:12:16 +0000 (18:12 -0800)]
Linux 4.5-rc2

8 years agoMerge tag 'usb-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Mon, 1 Feb 2016 01:36:45 +0000 (17:36 -0800)]
Merge tag 'usb-4.5-rc2' of git://git./linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are some small USB fixes and new device ids for 4.5-rc2.  Nothing
  major here, full details are in the shortlog, and all of these have
  been in linux-next successfully"

* tag 'usb-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: option: fix Cinterion AHxx enumeration
  USB: mxu11x0: fix memory leak on usb_serial private data
  USB: serial: ftdi_sio: add support for Yaesu SCU-18 cable
  USB: serial: option: Adding support for Telit LE922
  USB: serial: visor: fix crash on detecting device without write_urbs
  USB: visor: fix null-deref at probe
  USB: cp210x: add ID for IAI USB to RS485 adaptor
  usb: hub: do not clear BOS field during reset device
  cdc-acm:exclude Samsung phone 04e8:685d
  usb: cdc-acm: send zero packet for intel 7260 modem
  usb: cdc-acm: handle unlinked urb in acm read callback

8 years agoMerge tag 'tty-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Mon, 1 Feb 2016 01:09:39 +0000 (17:09 -0800)]
Merge tag 'tty-4.5-rc2' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are some small tty/serial driver fixes for 4.5-rc2.

  They resolve a number of reported problems (the ioctl one specifically
  has been pointed out by numerous people) and one patch adds some new
  device ids for the 8250_pci driver.  All have been in linux-next
  successfully"

* tag 'tty-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250_pci: Add Intel Broadwell ports
  staging/speakup: Use tty_ldisc_ref() for paste kworker
  n_tty: Fix unsafe reference to "other" ldisc
  tty: Fix unsafe ldisc reference via ioctl(TIOCGETD)
  tty: Retry failed reopen if tty teardown in-progress
  tty: Wait interruptibly for tty lock on reopen

8 years agoMerge tag 'staging-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Mon, 1 Feb 2016 01:00:27 +0000 (17:00 -0800)]
Merge tag 'staging-4.5-rc2' of git://git./linux/kernel/git/gregkh/staging

Pull staging fixes from Greg KH:
 "Here are some small staging driver fixes for 4.5-rc2.

  One of them predated 4.4-final, but I missed that merge window due to
  the holliday.  The others fix reported issues that have come up
  recently.  The tty change is needed for the speakup driver fix and has
  the ack of the tty driver maintainer as well, i.e.  myself :)

  All have been in linux-next with no reported issues"

* tag 'staging-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  Staging: speakup: fix read scrolled-back VT
  Staging: speakup: Fix getting port information
  Revert "Staging: panel: usleep_range is preferred over udelay"
  iio: adis_buffer: Fix out-of-bounds memory access

8 years agoMerge tag 'driver-core-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 1 Feb 2016 00:55:04 +0000 (16:55 -0800)]
Merge tag 'driver-core-4.5-rc2' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg KH:
 "Here's a single driver core fix that resolves an issue a lot of users
  have been hitting for a while now.  It's been tested a lot and has
  been in linux-next successfully for a while"

* tag 'driver-core-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  base/platform: Fix platform drivers with no probe callback

8 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Mon, 1 Feb 2016 00:50:31 +0000 (16:50 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fix from Ralf Baechle:
 "Just a single revert for a patch which I had upstreamed out of
  sequence"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  Revert "MIPS: bcm63xx: nvram: Remove unused bcm63xx_nvram_get_psi_size() function"

8 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 1 Feb 2016 00:17:19 +0000 (16:17 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A bit on the largish side due to a series of fixes for a regression in
  the x86 vector management which was introduced in 4.3.  This work was
  started in December already, but it took some time to fix all corner
  cases and a couple of older bugs in that area which were detected
  while at it

  Aside of that a few platform updates for intel-mid, quark and UV and
  two fixes for in the mm code:
   - Use proper types for pgprot values to avoid truncation
   - Prevent a size truncation in the pageattr code when setting page
     attributes for large mappings"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/mm/pat: Avoid truncation when converting cpa->numpages to address
  x86/mm: Fix types used in pgprot cacheability flags translations
  x86/platform/quark: Print boundaries correctly
  x86/platform/UV: Remove EFI memmap quirk for UV2+
  x86/platform/intel-mid: Join string and fix SoC name
  x86/platform/intel-mid: Enable 64-bit build
  x86/irq: Plug vector cleanup race
  x86/irq: Call irq_force_move_complete with irq descriptor
  x86/irq: Remove outgoing CPU from vector cleanup mask
  x86/irq: Remove the cpumask allocation from send_cleanup_vector()
  x86/irq: Clear move_in_progress before sending cleanup IPI
  x86/irq: Remove offline cpus from vector cleanup
  x86/irq: Get rid of code duplication
  x86/irq: Copy vectormask instead of an AND operation
  x86/irq: Check vector allocation early
  x86/irq: Reorganize the search in assign_irq_vector
  x86/irq: Reorganize the return path in assign_irq_vector
  x86/irq: Do not use apic_chip_data.old_domain as temporary buffer
  x86/irq: Validate that irq descriptor is still active
  x86/irq: Fix a race in x86_vector_free_irqs()
  ...

8 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 31 Jan 2016 23:49:06 +0000 (15:49 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "The timer departement delivers:

   - a regression fix for the NTP code along with a proper selftest
   - prevent a spurious timer interrupt in the NOHZ lowres code
   - a fix for user space interfaces returning the remaining time on
     architectures with CONFIG_TIME_LOW_RES=y
   - a few patches to fix COMPILE_TEST fallout"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick/nohz: Set the correct expiry when switching to nohz/lowres mode
  clocksource: Fix dependencies for archs w/o HAS_IOMEM
  clocksource: Select CLKSRC_MMIO where needed
  tick/sched: Hide unused oneshot timer code
  kselftests: timers: Add adjtimex SETOFFSET validity tests
  ntp: Fix ADJ_SETOFFSET being used w/ ADJ_NANO
  itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper
  posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper
  timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper
  hrtimer: Handle remaining time proper for TIME_LOW_RES
  clockevents/tcb_clksrc: Prevent disabling an already disabled clock

8 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 31 Jan 2016 23:44:04 +0000 (15:44 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:
 "Three small fixes in the scheduler/core:

   - use after free in the numa code
   - crash in the numa init code
   - a simple spelling fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  pid: Fix spelling in comments
  sched/numa: Fix use-after-free bug in the task_numa_compare
  sched: Fix crash in sched_init_numa()

8 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 31 Jan 2016 23:38:27 +0000 (15:38 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "This is much bigger than typical fixes, but Peter found a category of
  races that spurred more fixes and more debugging enhancements.  Work
  started before the merge window, but got finished only now.

  Aside of that this contains the usual small fixes to perf and tools.
  Nothing particular exciting"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  perf: Remove/simplify lockdep annotation
  perf: Synchronously clean up child events
  perf: Untangle 'owner' confusion
  perf: Add flags argument to perf_remove_from_context()
  perf: Clean up sync_child_event()
  perf: Robustify event->owner usage and SMP ordering
  perf: Fix STATE_EXIT usage
  perf: Update locking order
  perf: Remove __free_event()
  perf/bpf: Convert perf_event_array to use struct file
  perf: Fix NULL deref
  perf/x86: De-obfuscate code
  perf/x86: Fix uninitialized value usage
  perf: Fix race in perf_event_exit_task_context()
  perf: Fix orphan hole
  perf stat: Do not clean event's private stats
  perf hists: Fix HISTC_MEM_DCACHELINE width setting
  perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed
  perf tests: Remove wrong semicolon in while loop in CQM test
  perf: Synchronously free aux pages in case of allocation failure
  ...

8 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 31 Jan 2016 23:29:37 +0000 (15:29 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Thomas Gleixner:
 "A single commit, which makes the rtmutex.wait_lock an irq safe lock.

  This prevents a potential deadlock which can be triggered by the rcu
  boosting code from rcu_read_unlock()"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rtmutex: Make wait_lock irq safe

8 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 31 Jan 2016 22:48:58 +0000 (14:48 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull IRQ fixes from Ingo Molnar:
 "Mostly irqchip driver fixes, but also an irq core crash fix and a
  build fix"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mxs: Add missing set_handle_irq()
  irqchip/atmel-aic: Fix wrong bit operation for IRQ priority
  irqchip/gic-v3-its: Recompute the number of pages on page size change
  base: Export platform_msi_domain_[alloc,free]_irqs
  of: MSI: Simplify irqdomain lookup
  irqdomain: Allow domain lookup with DOMAIN_BUS_WIRED token
  irqchip: Fix dependencies for archs w/o HAS_IOMEM
  irqchip/s3c24xx: Mark init_eint as __maybe_unused
  genirq: Validate action before dereferencing it in handle_irq_event_percpu()

8 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 31 Jan 2016 22:43:09 +0000 (14:43 -0800)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull debugobjects fix from Ingo Molnar:
 "Bump up debugobjects pool limit that bigger s390 systems kept running
  into"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Allow bigger number of early boot objects

8 years agoMerge tag 'vfio-v4.5-rc2' of git://github.com/awilliam/linux-vfio
Linus Torvalds [Sun, 31 Jan 2016 22:38:37 +0000 (14:38 -0800)]
Merge tag 'vfio-v4.5-rc2' of git://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:
 "Use alternate group tracking for no-iommu"

* tag 'vfio-v4.5-rc2' of git://github.com/awilliam/linux-vfio:
  vfio/noiommu: Don't use iommu_present() to track fake groups

8 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 31 Jan 2016 22:29:52 +0000 (14:29 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Here are two I2C driver regression fixes.  piix4 gets a larger
  overhaul fixing the latest refactoring and also an older known issue
  as well.  designware-pci gets a fix for a bad merge conflict
  resolution"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: piix4: don't regress on bus names
  i2c: designware-pci: use IRQF_COND_SUSPEND flag
  i2c: piix4: Fully initialize SB800 before it is registered
  i2c: piix4: Fix SB800 locking

8 years agophys_to_pfn_t: use phys_addr_t
Dan Williams [Fri, 22 Jan 2016 17:43:28 +0000 (09:43 -0800)]
phys_to_pfn_t: use phys_addr_t

A dma_addr_t is potentially smaller than a phys_addr_t on some archs.
Don't truncate the address when doing the pfn conversion.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Matthew Wilcox <willy@linux.intel.com>
[willy: fix pfn_t_to_phys as well]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
8 years agomm: fix pfn_t to page conversion in vm_insert_mixed
Dan Williams [Tue, 26 Jan 2016 17:48:05 +0000 (09:48 -0800)]
mm: fix pfn_t to page conversion in vm_insert_mixed

pfn_t_to_page() honors the flags in the pfn_t value to determine if a
pfn is backed by a page.  However, vm_insert_mixed() was originally
written to use pfn_valid() to make this determination.  To restore the
old/correct behavior, ignore the pfn_t flags in the !pfn_t_devmap() case
and fallback to trusting pfn_valid().

Fixes: 01c8f1c44b83 ("mm, dax, gpu: convert vm_insert_mixed to pfn_t")
Cc: Dave Hansen <dave@sr71.net>
Cc: David Airlie <airlied@linux.ie>
Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Tested-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
8 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Sat, 30 Jan 2016 23:32:42 +0000 (15:32 -0800)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth

Johan Hedberg says:

====================
pull request: bluetooth 2016-01-30

Here's a set of important Bluetooth fixes for the 4.5 kernel:

 - Two fixes to 6LoWPAN code (one fixing a potential crash)
 - Fix LE pairing with devices using both public and random addresses
 - Fix allocation of dynamic LE PSM values
 - Fix missing COMPATIBLE_IOCTL for UART line discipline

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoblock: use DAX for partition table reads
Dan Williams [Fri, 29 Jan 2016 04:25:31 +0000 (20:25 -0800)]
block: use DAX for partition table reads

Avoid populating pagecache when the block device is in DAX mode.
Otherwise these page cache entries collide with the fsync/msync
implementation and break data durability guarantees.

Cc: Jan Kara <jack@suse.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
8 years agoblock: revert runtime dax control of the raw block device
Dan Williams [Fri, 29 Jan 2016 04:13:39 +0000 (20:13 -0800)]
block: revert runtime dax control of the raw block device

Dynamically enabling DAX requires that the page cache first be flushed
and invalidated.  This must occur atomically with the change of DAX mode
otherwise we confuse the fsync/msync tracking and violate data
durability guarantees.  Eliminate the possibilty of DAX-disabled to
DAX-enabled transitions for now and revisit this for the next cycle.

Cc: Jan Kara <jack@suse.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
8 years agofs, block: force direct-I/O for dax-enabled block devices
Dan Williams [Tue, 26 Jan 2016 01:23:18 +0000 (17:23 -0800)]
fs, block: force direct-I/O for dax-enabled block devices

Similar to the file I/O path, re-direct all I/O to the DAX path for I/O
to a block-device special file.  Both regular files and device special
files can use the common filp->f_mapping->host lookup to determing is
DAX is enabled.

Otherwise, we confuse the DAX code that does not expect to find live
data in the page cache:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 7676 at mm/filemap.c:217
    __delete_from_page_cache+0x9f6/0xb60()
    Modules linked in:
    CPU: 0 PID: 7676 Comm: a.out Not tainted 4.4.0+ #276
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
     00000000ffffffff ffff88006d3f7738 ffffffff82999e2d 0000000000000000
     ffff8800620a0000 ffffffff86473d20 ffff88006d3f7778 ffffffff81352089
     ffffffff81658d36 ffffffff86473d20 00000000000000d9 ffffea0000009d60
    Call Trace:
     [<     inline     >] __dump_stack lib/dump_stack.c:15
     [<ffffffff82999e2d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
     [<ffffffff81352089>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
     [<ffffffff813522b9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:515
     [<ffffffff81658d36>] __delete_from_page_cache+0x9f6/0xb60 mm/filemap.c:217
     [<ffffffff81658fb2>] delete_from_page_cache+0x112/0x200 mm/filemap.c:244
     [<ffffffff818af369>] __dax_fault+0x859/0x1800 fs/dax.c:487
     [<ffffffff8186f4f6>] blkdev_dax_fault+0x26/0x30 fs/block_dev.c:1730
     [<     inline     >] wp_pfn_shared mm/memory.c:2208
     [<ffffffff816e9145>] do_wp_page+0xc85/0x14f0 mm/memory.c:2307
     [<     inline     >] handle_pte_fault mm/memory.c:3323
     [<     inline     >] __handle_mm_fault mm/memory.c:3417
     [<ffffffff816ecec3>] handle_mm_fault+0x2483/0x4640 mm/memory.c:3446
     [<ffffffff8127eff6>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238
     [<ffffffff8127f738>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331
     [<ffffffff812705c4>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264
     [<ffffffff86338f78>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986
     [<ffffffff86336c36>] entry_SYSCALL_64_fastpath+0x16/0x7a
    arch/x86/entry/entry_64.S:185
    ---[ end trace dae21e0f85f1f98c ]---

Fixes: 5a023cdba50c ("block: enable dax for raw block devices")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
8 years agocrypto: atmel-aes - remove calls of clk_prepare() from atomic contexts
Cyrille Pitchen [Fri, 29 Jan 2016 16:53:33 +0000 (17:53 +0100)]
crypto: atmel-aes - remove calls of clk_prepare() from atomic contexts

clk_prepare()/clk_unprepare() must not be called within atomic context.

This patch calls clk_prepare() once for all from atmel_aes_probe() and
clk_unprepare() from atmel_aes_remove().

Then calls of clk_prepare_enable()/clk_disable_unprepare() were replaced
by calls of clk_enable()/clk_disable().

Cc: stable@vger.kernel.org
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reported-by: Matthias Mayr <matthias.mayr@student.kit.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
8 years agocrypto: algif_hash - wait for crypto_ahash_init() to complete
Wang, Rui Y [Wed, 27 Jan 2016 09:08:37 +0000 (17:08 +0800)]
crypto: algif_hash - wait for crypto_ahash_init() to complete

hash_sendmsg/sendpage() need to wait for the completion
of crypto_ahash_init() otherwise it can cause panic.

Cc: stable@vger.kernel.org
Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
8 years agopid: Fix spelling in comments
Zhen Lei [Sat, 30 Jan 2016 02:04:17 +0000 (10:04 +0800)]
pid: Fix spelling in comments

Accidentally discovered this typo when I studied this module.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tianhong Ding <dingtianhong@huawei.com>
Cc: Xinwei Hu <huxinwei@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1454119457-11272-1-git-send-email-thunder.leizhen@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Sat, 30 Jan 2016 08:15:49 +0000 (09:15 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 - Fix 'perf stat' stddev reporting due to mistakenly cleaning event
   private stats (Jiri Olsa)

 - Fix 'perf test CQM' endless loop detected by 'gcc6 -Wmisleading-indentation'
   (Markus Trippelsdorf)

 - Fix behaviour of Shift-Tab when nothing is focussed in the annotate TUI browser,
   detected with gcc6 -Wmisleading-indentation (Markus Trippelsdorf)

 - Fix mem data cacheline hists browser width setting for unresolved
   addresses (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agotcp: avoid cwnd undo after receiving ECN
Yuchung Cheng [Fri, 29 Jan 2016 23:11:50 +0000 (15:11 -0800)]
tcp: avoid cwnd undo after receiving ECN

RFC 4015 section 3.4 says the TCP sender MUST refrain from
reversing the congestion control state when the ACK signals
congestion through the ECN-Echo flag. Currently we may not
always do that when prior_ssthresh is reset upon receiving
ACKs with ECE marks. This patch fixes that.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoirda: fix a potential use-after-free in ircomm_param_request
WANG Cong [Fri, 29 Jan 2016 19:58:03 +0000 (11:58 -0800)]
irda: fix a potential use-after-free in ircomm_param_request

self->ctrl_skb is protected by self->spinlock, we should not
access it out of the lock. Move the debugging printk inside.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodevm_memremap_pages: fix vmem_altmap lifetime + alignment handling
Dan Williams [Sat, 30 Jan 2016 05:48:34 +0000 (21:48 -0800)]
devm_memremap_pages: fix vmem_altmap lifetime + alignment handling

to_vmem_altmap() needs to return valid results until
arch_remove_memory() completes.  It also needs to be valid for any pfn
in a section regardless of whether that pfn maps to data.  This escape
was a result of a bug in the unit test.

The signature of this bug is that free_pagetable() fails to retrieve a
vmem_altmap and goes off into the weeds:

 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffff811d2629>] get_pfnblock_flags_mask+0x49/0x60
 [..]
 Call Trace:
  [<ffffffff811d3477>] free_hot_cold_page+0x97/0x1d0
  [<ffffffff811d367a>] __free_pages+0x2a/0x40
  [<ffffffff8191e669>] free_pagetable+0x8c/0xd4
  [<ffffffff8191ef4e>] remove_pagetable+0x37a/0x808
  [<ffffffff8191b210>] vmemmap_free+0x10/0x20

Fixes: 4b94ffdc4163 ("x86, mm: introduce vmem_altmap to augment vmemmap_populate()")
Cc: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
8 years agoMerge branch 'arnd-net-driver-fixes'
David S. Miller [Sat, 30 Jan 2016 04:33:57 +0000 (20:33 -0800)]
Merge branch 'arnd-net-driver-fixes'

Arnd Bergmann says:

====================
network driver fixes

This is an updated series of fixes for the network device drivers
that showed warnings in ARM randconfig.

Changes since v1 are:

dropped "net: macb: avoid uninitialized variables", already fixed in net-next

dropped "net: fddi/defxx: avoid warning about uninitialized variable
use", already fixed in net-next

added missing barriers in "net: moxart: use correct accessors for
DMA memory"

clarified "net: bgmac: clarify CONFIG_BCMA dependency" changelog
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: tg3: avoid uninitialized variable warning
Arnd Bergmann [Fri, 29 Jan 2016 11:39:15 +0000 (12:39 +0100)]
net: tg3: avoid uninitialized variable warning

The tg3_set_eeprom() function correctly initializes the 'start' variable,
but gcc generates a false warning:

drivers/net/ethernet/broadcom/tg3.c: In function 'tg3_set_eeprom':
drivers/net/ethernet/broadcom/tg3.c:12057:4: warning: 'start' may be used uninitialized in this function [-Wmaybe-uninitialized]

I have not come up with a way to restructure the code in a way that
avoids the warning without making it less readable, so this adds an
initialization for the declaration to shut up that warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: nb8800: avoid uninitialized variable warning
Arnd Bergmann [Fri, 29 Jan 2016 11:39:14 +0000 (12:39 +0100)]
net: nb8800: avoid uninitialized variable warning

The nb8800_poll() function initializes the 'next' variable in the
loop looking for new input data. We know this will be called at
least once because 'budget' is a guaranteed to be a positive number
when we enter the function, but the compiler doesn't know that
and warns when the variable is used later:

drivers/net/ethernet/aurora/nb8800.c: In function 'nb8800_poll':
drivers/net/ethernet/aurora/nb8800.c:350:21: warning: 'next' may be used uninitialized in this function [-Wmaybe-uninitialized]

Changing the 'while() {}' loop to 'do {} while()' makes it obvious
to the compiler what is going on so it no longer warns.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: vxge: avoid unused function warnings
Arnd Bergmann [Fri, 29 Jan 2016 11:39:13 +0000 (12:39 +0100)]
net: vxge: avoid unused function warnings

When CONFIG_PCI_MSI is disabled, we get warnings about unused functions
in the vxge driver:

drivers/net/ethernet/neterion/vxge/vxge-main.c:2121:13: warning: 'adaptive_coalesce_tx_interrupts' defined but not used [-Wunused-function]
drivers/net/ethernet/neterion/vxge/vxge-main.c:2149:13: warning: 'adaptive_coalesce_rx_interrupts' defined but not used [-Wunused-function]

We could add another #ifdef here, but it's nicer to avoid those warnings
for good by converting the existing #ifdef to if(IS_ENABLED()), which has
the same effect but provides better compile-time coverage in general,
and lets the compiler understand better when the function is intentionally
unused.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: bgmac: clarify CONFIG_BCMA dependency
Arnd Bergmann [Fri, 29 Jan 2016 11:39:12 +0000 (12:39 +0100)]
net: bgmac: clarify CONFIG_BCMA dependency

The bgmac driver depends on BCMA_HOST_SOC, which is only used
when CONFIG_BCMA is enabled. However, it is a bool option and can
be set when CONFIG_BCMA=m, and then bgmac can be built-in, leading
to an obvious link error:

drivers/built-in.o: In function `bgmac_init':
:(.init.text+0x7f2c): undefined reference to `__bcma_driver_register'
drivers/built-in.o: In function `bgmac_exit':
:(.exit.text+0x110a): undefined reference to `bcma_driver_unregister'

To avoid this case, we need to depend on both BCMA and BCMA_SOC,
as this patch does. I'm also trying to make the dependency more
readable by splitting it into three lines, and adding a COMPILE_TEST
alternative so we can test-build it in all configurations that
support BCMA.

The added dependency on FIXED_PHY addresses a related issue where
we cannot call fixed_phy_register() when CONFIG_FIXED_PHY=m and
CONFIG_BGMAC=y.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: hp100: remove unnecessary #ifdefs
Arnd Bergmann [Fri, 29 Jan 2016 11:39:11 +0000 (12:39 +0100)]
net: hp100: remove unnecessary #ifdefs

Building the hp100 ethernet driver causes warnings when both the PCI
and EISA drivers are disabled:

ethernet/hp/hp100.c: In function 'hp100_module_init':
ethernet/hp/hp100.c:3047:2: warning: label 'out3' defined but not used [-Wunused-label]
ethernet/hp/hp100.c: At top level:
ethernet/hp/hp100.c:2828:13: warning: 'cleanup_dev' defined but not used [-Wunused-function]

We can easily avoid the warnings and make the driver look slightly
nicer by removing the #ifdefs that check for the CONFIG_PCI and
CONFIG_EISA, as all the registration functions are designed to
have no effect when the buses are disabled.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: davinci_cpdma: use dma_addr_t for DMA address
Arnd Bergmann [Fri, 29 Jan 2016 11:39:10 +0000 (12:39 +0100)]
net: davinci_cpdma: use dma_addr_t for DMA address

The davinci_cpdma mixes up physical addresses as seen from the CPU
and DMA addresses as seen from a DMA master, since it can operate
on both normal memory or an on-chip buffer. If dma_addr_t is
different from phys_addr_t, this means we get a compile-time warning
about the type mismatch:

ethernet/ti/davinci_cpdma.c: In function 'cpdma_desc_pool_create':
ethernet/ti/davinci_cpdma.c:182:48: error: passing argument 3 of 'dma_alloc_coherent' from incompatible pointer type [-Werror=incompatible-pointer-types]
   pool->cpumap = dma_alloc_coherent(dev, size, &pool->phys,
In file included from ethernet/ti/davinci_cpdma.c:21:0:
dma-mapping.h:398:21: note: expected 'dma_addr_t * {aka long long unsigned int *}' but argument is of type 'phys_addr_t * {aka unsigned int *}'
 static inline void *dma_alloc_coherent(struct device *dev, size_t size,

This slightly restructures the code so the address we use for
mapping RAM into a DMA address is always a dma_addr_t, avoiding
the warning. The code is correct even if both types are 32-bit
because the DMA master in this device only supports 32-bit addressing
anyway, independent of the types that are used.

We still assign this value to pool->phys, and that is wrong if
the driver is ever used with an IOMMU, but that value appears to
be never used, so there is no problem really. I've added a couple
of comments about where we do things that are slightly violating
the API.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'ipv6-sticky-pktinfo'
David S. Miller [Sat, 30 Jan 2016 04:31:27 +0000 (20:31 -0800)]
Merge branch 'ipv6-sticky-pktinfo'

Paolo Abeni says:

====================
ipv6: fix sticky pktinfo behaviour

Currently:

ip addr add dev eth0 2001:0010::1/64
ip addr add dev eth1 2001:0020::1/64
ping6 -I eth0 2001:0020::2

do not lead to the expected results, i.e. eth1 is used as the
egress interface.

This is due to two related issues in handling sticky pktinfo,
used by ping6 to enforce the device binding:

- ip6_dst_lookup_flow()/ip6_dst_lookup_tail() do not really enforce
flowi6_oif match
- ipv6 udp connect() just ignore flowi6_oif

These patches address each issue individually.

The kernel has never enforced the egress interface specified
via the sticky pktinfo, except briefly between the commits
741a11d9e410 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
and
d46a9d678e4c ("net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set"),
but the ping6 tools was unaffected up to iputils-20100214,
since before it used SO_BINDTODEVICE to enforce the egress
interface.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoipv6/udp: use sticky pktinfo egress ifindex on connect()
Paolo Abeni [Fri, 29 Jan 2016 11:30:20 +0000 (12:30 +0100)]
ipv6/udp: use sticky pktinfo egress ifindex on connect()

Currently, the egress interface index specified via IPV6_PKTINFO
is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7
can be subverted when the user space application calls connect()
before sendmsg().
Fix it by initializing properly flowi6_oif in connect() before
performing the route lookup.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()
Paolo Abeni [Fri, 29 Jan 2016 11:30:19 +0000 (12:30 +0100)]
ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()

The current implementation of ip6_dst_lookup_tail basically
ignore the egress ifindex match: if the saddr is set,
ip6_route_output() purposefully ignores flowi6_oif, due
to the commit d46a9d678e4c ("net: ipv6: Dont add RT6_LOOKUP_F_IFACE
flag if saddr set"), if the saddr is 'any' the first route lookup
in ip6_dst_lookup_tail fails, but upon failure a second lookup will
be performed with saddr set, thus ignoring the ifindex constraint.

This commit adds an output route lookup function variant, which
allows the caller to specify lookup flags, and modify
ip6_dst_lookup_tail() to enforce the ifindex match on the second
lookup via said helper.

ip6_route_output() becames now a static inline function build on
top of ip6_route_output_flags(); as a side effect, out-of-tree
modules need now a GPL license to access the output route lookup
functionality.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge tag 'wireless-drivers-for-davem-2016-01-29' of git://git.kernel.org/pub/scm...
David S. Miller [Sat, 30 Jan 2016 04:26:08 +0000 (20:26 -0800)]
Merge tag 'wireless-drivers-for-davem-2016-01-29' of git://git./linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
iwlwifi

* Fix support for 3168 device:
  * NVM version
  * firmware file name
  * device IDs
* Fix a compilation warning in dvm calibration code
* Fix the TPC (reduced Tx Power) code. This fixes performance issues
* Add device IDs for 8265

rtx2x00

* fix monitor mode regression dating back to 4.1

brcmfmac

* fix sdio initialisation related crash

rtlwifi

* rtl8821ae: Fix 5G failure when EEPROM is incorrectly encoded

ath9k

* ignore eeprom magic mismatch on flash based devices
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonetlink: not trim skb for mmaped socket when dump
Ken-ichirou MATSUZAWA [Fri, 29 Jan 2016 01:45:50 +0000 (10:45 +0900)]
netlink: not trim skb for mmaped socket when dump

We should not trim skb for mmaped socket since its buf size is fixed
and userspace will read as frame which data equals head. mmaped
socket will not call recvmsg, means max_recvmsg_len is 0,
skb_reserve was not called before commit: db65a3aaf29e.

Fixes: db65a3aaf29e (netlink: Trim skb to alloc size to avoid MSG_TRUNC)
Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agovxlan: fix a out of bounds access in __vxlan_find_mac
Li RongQing [Fri, 29 Jan 2016 01:43:47 +0000 (09:43 +0800)]
vxlan: fix a out of bounds access in __vxlan_find_mac

The size of all_zeros_mac is 6 byte, but eth_hash() will access the
8 byte, and KASan reported the below bug:

[ 8596.479031] BUG: KASan: out of bounds access in __vxlan_find_mac+0x24/0x100 at addr ffffffff841514c0
[ 8596.487647] Read of size 8 by task ip/52820
[ 8596.490818] Address belongs to variable all_zeros_mac+0x0/0x40
[ 8596.496051] CPU: 0 PID: 52820 Comm: ip Tainted: G WC 4.1.15 #1
[ 8596.503520] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
[ 8596.509365] ffffffff841514c0 ffff88007450f0b8 ffffffff822fa5e1 0000000000000032
[ 8596.516112] ffff88007450f150 ffff88007450f138 ffffffff812dd58c ffff88007450f1d8
[ 8596.522856] ffffffff81113b80 0000000000000282 0000000000000001 ffffffff8101ee4d
[ 8596.529599] Call Trace:
[ 8596.530858] [<ffffffff822fa5e1>] dump_stack+0x4f/0x7b
[ 8596.535080] [<ffffffff812dd58c>] kasan_report_error+0x3bc/0x3f0
[ 8596.540258] [<ffffffff81113b80>] ? __lock_acquire+0x90/0x2140
[ 8596.545245] [<ffffffff8101ee4d>] ? save_stack_trace+0x2d/0x80
[ 8596.550234] [<ffffffff812dda70>] kasan_report+0x40/0x50
[ 8596.554647] [<ffffffff81b211e4>] ? __vxlan_find_mac+0x24/0x100
[ 8596.559729] [<ffffffff812dc399>] __asan_load8+0x69/0xa0
[ 8596.564141] [<ffffffff81b211e4>] __vxlan_find_mac+0x24/0x100
[ 8596.569033] [<ffffffff81b2683d>] vxlan_fdb_create+0x9d/0x570

it can be fixed by enlarging the all_zeros_mac to 8 byte, although it is
harmless; eth_hash() will be called in other place with the memory which
is larger and equal to 8 byte.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: mv88e6xxx: fix port VLAN maps
Vivien Didelot [Thu, 28 Jan 2016 21:54:37 +0000 (16:54 -0500)]
net: dsa: mv88e6xxx: fix port VLAN maps

Currently the port based VLAN maps should be configured to allow every
port to egress frames on all other ports, except themselves.

The debugfs interface shows that they are misconfigured. For instance, a
7-port switch has the following content in the related register 0x06:

       GLOBAL GLOBAL2 SERDES   0    1    2    3    4    5    6
    ...
    6:  1fa4    1f0f       4   7f   7e   7d   7c   7b   7a   79
    ...

This means that port 3 is allowed to talk to port 2-6, but cannot talk
to ports 0 and 1. With this fix, port 3 can correctly talk to all ports
except 3 itself:

       GLOBAL GLOBAL2 SERDES   0    1    2    3    4    5    6
    ...
    6:  1fa4    1f0f       4   7e   7d   7b   77   6f   5f   3f
    ...

Fixes: ede8098d0fef ("net: dsa: mv88e6xxx: bridges do not need an FID")
Reported-by: Kevin Smith <kevin.smith@elecsyscorp.com>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Kevin Smith <kevin.smith@elecsyscorp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agofib_trie: Fix shift by 32 in fib_table_lookup
Alexander Duyck [Thu, 28 Jan 2016 21:42:24 +0000 (13:42 -0800)]
fib_trie: Fix shift by 32 in fib_table_lookup

The fib_table_lookup function had a shift by 32 that triggered a UBSAN
warning.  This was due to the fact that I had placed the shift first and
then followed it with the check for the suffix length to ignore the
undefined behavior.  If we reorder this so that we verify the suffix is
less than 32 before shifting the value we can avoid the issue.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: moxart: use correct accessors for DMA memory
Arnd Bergmann [Thu, 28 Jan 2016 16:54:33 +0000 (17:54 +0100)]
net: moxart: use correct accessors for DMA memory

The moxart ethernet driver confuses coherent DMA buffers with
MMIO registers.

moxart_ether.c: In function 'moxart_mac_setup_desc_ring':
moxart_ether.c:146:428: error: passing argument 1 of '__fswab32' makes integer from pointer without a cast [-Werror=int-conversion]
moxart_ether.c:74:39: warning: incorrect type in argument 3 (different address spaces)
moxart_ether.c:74:39:    expected void *cpu_addr
moxart_ether.c:74:39:    got void [noderef] <asn:2>*tx_desc_base

This leaves the basic logic alone and uses normal pointers for
the virtual address of the descriptor. As we cannot use readl/writel
to access them, we also introduce our own moxart_desc_read
moxart_desc_write helpers that perform the same endianess swap
as the original code, but without the address space conversion.

The barriers are made explicit here where needed: Even in the worst-case
scenario, we just have to use a rmb() after checking ownership so
we don't read any input data before we are sure it is value, and we
use wmb() before transferring ownership back to the device.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoipv4: ipconfig: avoid unused ic_proto_used symbol
Arnd Bergmann [Thu, 28 Jan 2016 16:39:24 +0000 (17:39 +0100)]
ipv4: ipconfig: avoid unused ic_proto_used symbol

When CONFIG_PROC_FS, CONFIG_IP_PNP_BOOTP, CONFIG_IP_PNP_DHCP and
CONFIG_IP_PNP_RARP are all disabled, we get a warning about the
ic_proto_used variable being unused:

net/ipv4/ipconfig.c:146:12: error: 'ic_proto_used' defined but not used [-Werror=unused-variable]

This avoids the warning, by making the definition conditional on
whether a dynamic IP configuration protocol is configured. If not,
we know that the value is always zero, so we can optimize away the
variable and all code that depends on it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agolibnvdimm, pfn: fix restoring memmap location
Dan Williams [Sat, 30 Jan 2016 01:42:51 +0000 (17:42 -0800)]
libnvdimm, pfn: fix restoring memmap location

This path was missed when turning on the memmap in pmem support.  Permit
'pmem' as a valid location for the map.

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
8 years agoMerge branch 'bnxt_en-fixes'
David S. Miller [Sat, 30 Jan 2016 01:28:40 +0000 (17:28 -0800)]
Merge branch 'bnxt_en-fixes'

Michael Chan says:

====================
bnxt_en: Bug fixes.

3 small bug fix patches for net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobnxt_en: Fix crash in bnxt_free_tx_skbs() during tx timeout.
Michael Chan [Thu, 28 Jan 2016 08:11:22 +0000 (03:11 -0500)]
bnxt_en: Fix crash in bnxt_free_tx_skbs() during tx timeout.

The ring index j is not wrapped properly at the end of the ring, causing
it to reference pointers past the end of the ring.  For proper loop
termination and to access the ring properly, we need to increment j and
mask it before referencing the ring entry.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobnxt_en: Exclude rx_drop_pkts hw counter from the stack's rx_dropped counter.
Michael Chan [Thu, 28 Jan 2016 08:11:21 +0000 (03:11 -0500)]
bnxt_en: Exclude rx_drop_pkts hw counter from the stack's rx_dropped counter.

This hardware counter is misleading as it counts dropped packets that
don't match the hardware filters for unicast/broadcast/multicast.  We
will still report this counter in ethtool -S for diagnostics purposes.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobnxt_en: Ring free response from close path should use completion ring
Prashant Sreedharan [Thu, 28 Jan 2016 08:11:20 +0000 (03:11 -0500)]
bnxt_en: Ring free response from close path should use completion ring

Use completion ring for ring free response from firmware.  The response
will be the last entry in the ring and we can free the ring after getting
the response.  This will guarantee no spurious DMA to freed memory.

Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet_sched: drr: check for NULL pointer in drr_dequeue
Bernie Harris [Thu, 28 Jan 2016 03:30:51 +0000 (16:30 +1300)]
net_sched: drr: check for NULL pointer in drr_dequeue

There are cases where qdisc_dequeue_peeked can return NULL, and the result
is dereferenced later on in the function.

Similarly to the other qdisc dequeue functions, check whether the skb
pointer is NULL and if it is, goto out.

Signed-off-by: Bernie Harris <bernie.harris@alliedtelesis.co.nz>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Linus Torvalds [Sat, 30 Jan 2016 00:16:12 +0000 (16:16 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
 "Just one fix for a -fstack-protector-strong problem from Kees Cook,
  and adding the new copy_file_range syscall"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: wire up copy_file_range() syscall
  ARM: 8500/1: fix atags_to_fdt with stack-protector-strong

8 years agoMerge tag 'powerpc-4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 30 Jan 2016 00:10:16 +0000 (16:10 -0800)]
Merge tag 'powerpc-4.5-2' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Wire up copy_file_range() syscall from Chandan Rajendra
 - Simplify module TOC handling from Alan Modra
 - Remove newly added extra definition of pmd_dirty from Stephen Rothwell
 - Allow user space to map rtas_rmo_buf from Vasant Hegde
 - Fix PE location code from Gavin Shan
 - Remove PPMU_HAS_SSLOT flag for Power8 from Madhavan Srinivasan
 - Fixup _HPAGE_CHG_MASK from Aneesh Kumar K.V

* tag 'powerpc-4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Fixup _HPAGE_CHG_MASK
  powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8
  powerpc/eeh: Fix PE location code
  powerpc/mm: Allow user space to map rtas_rmo_buf
  powerpc: Remove newly added extra definition of pmd_dirty
  powerpc: Simplify module TOC handling
  powerpc: Wire up copy_file_range() syscall