review.tizen.org Git - platform/kernel/linux-starfive.git/log

fs/proc/consoles.c: use seq_putc() in show_console_dev()

A single character (line break) should be put into a sequence. Thus use
the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/04fb69fe-d820-9141-820f-07e9a48f4635@users.sourceforge.net
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: rearrange args

Rearrange args for smaller code.

lookup revolves around memcmp() which gets len 3rd arg, so propagate
length as 3rd arg.

readdir and lookup add additional arg to VFS ->readdir and ->lookup, so
better add it to the end.

Space savings on x86_64:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-18 (-18)
Function                                     old     new   delta
proc_readdir                                  22      13      -9
proc_lookup                                   18       9      -9

proc_match() is smaller if not inlined, I promise!

Link: http://lkml.kernel.org/r/20180104175958.GB5204@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: spread likely/unlikely a bit

use_pde() is used at every open/read/write/...  of every random /proc
file.  Negative refcount happens only if PDE is being deleted by module
(read: never).  So it gets "likely".

unuse_pde() gets "unlikely" for the same reason.

close_pdeo() gets unlikely as the completion is filled only if there is a
race between PDE removal and close() (read: never ever).

It even saves code on x86_64 defconfig:

add/remove: 0/0 grow/shrink: 1/2 up/down: 2/-20 (-18)
Function                                     old     new   delta
close_pdeo                                   183     185      +2
proc_reg_get_unmapped_area                   119     111      -8
proc_reg_poll                                 85      73     -12

Link: http://lkml.kernel.org/r/20180104175657.GA5204@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/proc: use __ro_after_init

/proc/self inode numbers, value of proc_inode_cache and st_nlink of
/proc/$TGID are fixed constants.

Link: http://lkml.kernel.org/r/20180103184707.GA31849@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/proc/internal.h: fix up comment

Document what ->pde_unload_lock actually does.

Link: http://lkml.kernel.org/r/20180103185120.GB31849@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/proc/internal.h: rearrange struct proc_dir_entry

struct proc_dir_entry became bit messy over years:

* move 16-bit ->mode_t before namelen to get rid of padding
* make ->in_use first field: it seems to be most used resulting in
  smaller code on x86_64 (defconfig):

add/remove: 0/0 grow/shrink: 7/13 up/down: 24/-67 (-43)
Function                                     old     new   delta
proc_readdir_de                              451     455      +4
proc_get_inode                               282     286      +4
pde_put                                       65      69      +4
remove_proc_subtree                          294     297      +3
remove_proc_entry                            297     300      +3
proc_register                                295     298      +3
proc_notify_change                            94      97      +3
unuse_pde                                     27      26      -1
proc_reg_write                                89      85      -4
proc_reg_unlocked_ioctl                       85      81      -4
proc_reg_read                                 89      85      -4
proc_reg_llseek                               87      83      -4
proc_reg_get_unmapped_area                   123     119      -4
proc_entry_rundown                           139     135      -4
proc_reg_poll                                 91      85      -6
proc_reg_mmap                                 79      73      -6
proc_get_link                                 55      49      -6
proc_reg_release                             108     101      -7
proc_reg_open                                298     291      -7
close_pdeo                                   228     218     -10

* move writeable fields together to a first cacheline (on x86_64),
  those include
* ->in_use: reference count, taken every open/read/write/close etc
* ->count: reference count, taken at readdir on every entry
* ->pde_openers: tracks (nearly) every open, dirtied
* ->pde_unload_lock: spinlock protecting ->pde_openers
* ->proc_iops, ->proc_fops, ->data: writeonce fields,
  used right together with previous group.

* other rarely written fields go into 1st/2nd and 2nd/3rd cacheline on
  32-bit and 64-bit respectively.

Additionally on 32-bit, ->subdir, ->subdir_node, ->namelen, ->name go
fully into 2nd cacheline, separated from writeable fields.  They are all
used during lookup.

Link: http://lkml.kernel.org/r/20171220215914.GA7877@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/proc/kcore.c: use probe_kernel_read() instead of memcpy()

Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext
data") added a bounce buffer to avoid hardened usercopy checks. Copying
to the bounce buffer was implemented with a simple memcpy() assuming
that it is always valid to read from kernel memory iff the
kern_addr_valid() check passed.

A simple, but pointless, test case like "dd if=/proc/kcore of=/dev/null"
now can easily crash the kernel, since the former execption handling on
invalid kernel addresses now doesn't work anymore.

Also adding a kern_addr_valid() implementation wouldn't help here. Most
architectures simply return 1 here, while a couple implemented a page
table walk to figure out if something is mapped at the address in
question.

With DEBUG_PAGEALLOC active mappings are established and removed all the
time, so that relying on the result of kern_addr_valid() before
executing the memcpy() also doesn't work.

Therefore simply use probe_kernel_read() to copy to the bounce buffer.
This also allows to simplify read_kcore().

At least on s390 this fixes the observed crashes and doesn't introduce
warnings that were removed with df04abfd181a ("fs/proc/kcore.c: Add
bounce buffer for ktext data"), even though the generic
probe_kernel_read() implementation uses uaccess functions.

While looking into this I'm also wondering if kern_addr_valid() could be
completely removed...(?)

Link: http://lkml.kernel.org/r/20171202132739.99971-1-heiko.carstens@de.ibm.com
Fixes: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/proc/array.c: delete children_seq_release()

It is 1:1 wrapper around seq_release().

Link: http://lkml.kernel.org/r/20171122171510.GA12161@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: less memory for /proc/*/map_files readdir

dentry name can be evaluated later, right before calling into VFS.

Also, spend less time under ->mmap_sem.

Link: http://lkml.kernel.org/r/20171110163034.GA2534@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/proc/vmcore.c: simpler /proc/vmcore cleanup

Iterators aren't necessary as you can just grab the first entry and delete
it until no entries left.

Link: http://lkml.kernel.org/r/20171121191121.GA20757@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: fix /proc/*/map_files lookup

Current code does:

if (sscanf(dentry->d_name.name, "%lx-%lx", start, end) != 2)

However sscanf() is broken garbage.

It silently accepts whitespace between format specifiers
(did you know that?).

It silently accepts valid strings which result in integer overflow.

Do not use sscanf() for any even remotely reliable parsing code.

OK
# readlink '/proc/1/map_files/55a23af39000-55a23b05b000'
/lib/systemd/systemd

broken
# readlink '/proc/1/map_files/               55a23af39000-55a23b05b000'
/lib/systemd/systemd

broken
# readlink '/proc/1/map_files/55a23af39000-55a23b05b000    '
/lib/systemd/systemd

very broken
# readlink '/proc/1/map_files/1000000000000000055a23af39000-55a23b05b000'
/lib/systemd/systemd

Andrei said:

: This patch breaks criu.  It was a bug in criu.  And this bug is on a minor
: path, which works when memfd_create() isn't available.  It is a reason why
: I ask to not backport this patch to stable kernels.
:
: In CRIU this bug can be triggered, only if this patch will be backported
: to a kernel which version is lower than v3.16.

Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: don't use READ_ONCE/WRITE_ONCE for /proc/*/fail-nth

READ_ONCE and WRITE_ONCE are useless when there is only one read/write
is being made.

Link: http://lkml.kernel.org/r/20171120204033.GA9446@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: use %u for pid printing and slightly less stack

PROC_NUMBUF is 13 which is enough for "negative int + \n + \0".

However PIDs and TGIDs are never negative and newline is not a concern,
so use just 10 per integer.

Link: http://lkml.kernel.org/r/20171120203005.GA27743@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alexander Viro <viro@ftp.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: remove redundant initialization of variable 'real_size'

Variable real_size is initialized with a value that is never read, it is
re-assigned a new value later on, hence the initialization is redundant
and can be removed.

Cleans up clang warning:

lib/test_kasan.c:422:21: warning: Value stored to 'real_size' during its initialization is never read

Link: http://lkml.kernel.org/r/20180206144950.32457-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage

Right now the fact that KASAN uses a single shadow byte for 8 bytes of
memory is scattered all over the code.

This change defines KASAN_SHADOW_SCALE_SHIFT early in asm include files
and makes use of this constant where necessary.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/34937ca3b90736eaad91b568edf5684091f662e3.1515775666.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: fix prototype author email address

Use the new one.

Link: http://lkml.kernel.org/r/de3b7ffc30a55178913a7d3865216aa7accf6c40.1515775666.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: detect invalid frees

Detect frees of pointers into middle of heap objects.

Link: http://lkml.kernel.org/r/cb569193190356beb018a03bb8d6fbae67e7adbc.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>a
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: unify code between kasan_slab_free() and kasan_poison_kfree()

Both of these functions deal with freeing of slab objects.
However, kasan_poison_kfree() mishandles SLAB_TYPESAFE_BY_RCU
(must also not poison such objects) and does not detect double-frees.

Unify code between these functions.

This solves both of the problems and allows to add more common code
(e.g. detection of invalid frees).

Link: http://lkml.kernel.org/r/385493d863acf60408be219a021c3c8e27daa96f.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>a
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: detect invalid frees for large mempool objects

Detect frees of pointers into middle of mempool objects.

I did a one-off test, but it turned out to be very tricky, so I reverted
it.  First, mempool does not call kasan_poison_kfree() unless allocation
function fails.  I stubbed an allocation function to fail on second and
subsequent allocations.  But then mempool stopped to call
kasan_poison_kfree() at all, because it does it only when allocation
function is mempool_kmalloc().  We could support this special failing
test allocation function in mempool, but it also can't live with kasan
tests, because these are in a module.

Link: http://lkml.kernel.org/r/bf7a7d035d7a5ed62d2dd0e3d2e8a4fcdf456aa7.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>a
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: don't use __builtin_return_address(1)

__builtin_return_address(1) is unreliable without frame pointers.
With defconfig on kmalloc_pagealloc_invalid_free test I am getting:

BUG: KASAN: double-free or invalid-free in (null)

Pass caller PC from callers explicitly.

Link: http://lkml.kernel.org/r/9b01bc2d237a4df74ff8472a3bf6b7635908de01.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>a
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: detect invalid frees for large objects

Patch series "kasan: detect invalid frees".

KASAN detects double-frees, but does not detect invalid-frees (when a
pointer into a middle of heap object is passed to free). We recently had
a very unpleasant case in crypto code which freed an inner object inside
of a heap allocation. This left unnoticed during free, but totally
corrupted heap and later lead to a bunch of random crashes all over kernel
code.

Detect invalid frees.

This patch (of 5):

Detect frees of pointers into middle of large heap objects.

I dropped const from kasan_kfree_large() because it starts propagating
through a bunch of functions in kasan_report.c, slab/slub nearest_obj(),
all of their local variables, fixup_red_left(), etc.

Link: http://lkml.kernel.org/r/1b45b4fe1d20fc0de1329aab674c1dd973fee723.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>a
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: add functions for unpoisoning stack variables

As a code-size optimization, LLVM builds since r279383 may bulk-manipulate
the shadow region when (un)poisoning large memory blocks. This requires
new callbacks that simply do an uninstrumented memset().

This fixes linking the Clang-built kernel when using KASAN.

[arnd@arndb.de: add declarations for internal functions]
Link: http://lkml.kernel.org/r/20180105094112.2690475-1-arnd@arndb.de
[fengguang.wu@intel.com: __asan_set_shadow_00 can be static]
Link: http://lkml.kernel.org/r/20171223125943.GA74341@lkp-ib03
[ghackmann@google.com: fix memset() parameters, and tweak commit message to describe new callbacks]
Link: http://lkml.kernel.org/r/20171204191735.132544-6-paullawrence@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: add tests for alloca poisoning

Link: http://lkml.kernel.org/r/20171204191735.132544-5-paullawrence@google.com
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: support alloca() poisoning

clang's AddressSanitizer implementation adds redzones on either side of
alloca()ed buffers. These redzones are 32-byte aligned and at least 32
bytes long.

__asan_alloca_poison() is passed the size and address of the allocated
buffer, *excluding* the redzones on either side. The left redzone will
always be to the immediate left of this buffer; but AddressSanitizer may
need to add padding between the end of the buffer and the right redzone.
If there are any 8-byte chunks inside this padding, we should poison
those too.

__asan_allocas_unpoison() is just passed the top and bottom of the dynamic
stack area, so unpoisoning is simpler.

Link: http://lkml.kernel.org/r/20171204191735.132544-4-paullawrence@google.com
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan/Makefile: support LLVM style asan parameters

LLVM doesn't understand GCC-style paramters ("--param asan-foo=bar"), thus
we currently we don't use inline/globals/stack instrumentation when
building the kernel with clang.

Add support for LLVM-style parameters ("-mllvm -asan-foo=bar") to enable
all KASAN features.

Link: http://lkml.kernel.org/r/20171204191735.132544-3-paullawrence@google.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: add compiler support for clang

Patch series "kasan: support alloca, LLVM", v4.

This patch (of 5):

For now we can hard-code ASAN ABI level 5, since historical clang builds
can't build the kernel anyway. We also need to emulate gcc's
__SANITIZE_ADDRESS__ flag, or memset() calls won't be instrumented.

Link: http://lkml.kernel.org/r/20171204191735.132544-2-paullawrence@google.com
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: don't emit builtin calls when sanitization is off

With KASAN enabled the kernel has two different memset() functions, one
with KASAN checks (memset) and one without (__memset).  KASAN uses some
macro tricks to use the proper version where required.  For example
memset() calls in mm/slub.c are without KASAN checks, since they operate
on poisoned slab object metadata.

The issue is that clang emits memset() calls even when there is no
memset() in the source code.  They get linked with improper memset()
implementation and the kernel fails to boot due to a huge amount of KASAN
reports during early boot stages.

The solution is to add -fno-builtin flag for files with KASAN_SANITIZE :=
n marker.

Link: http://lkml.kernel.org/r/8ffecfffe04088c52c42b92739c2bd8a0bcb3f5e.1516384594.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'xfs-4.16-merge-5' of git://git./fs/xfs/xfs-linux

Pull more xfs updates from Darrick Wong:
"As promised, here's a (much smaller) second pull request for the
  second week of the merge cycle. This time around we have a couple
  patches shutting off unsupported fs configurations, and a couple of
  cleanups.

  Last, we turn off EXPERIMENTAL for the reverse mapping btree, since
  the primary downstream user of that information (online fsck) is now
  upstream and I haven't seen any major failures in a few kernel
  releases.

  Summary:

   - Print scrub build status in the xfs build info.

   - Explicitly call out the remaining two scenarios where we don't
     support reflink and never have.

   - Remove EXPERIMENTAL tag from reverse mapping btree!"

* tag 'xfs-4.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: remove experimental tag for reverse mapping
  xfs: don't allow reflink + realtime filesystems
  xfs: don't allow DAX on reflink filesystems
  xfs: add scrub to XFS_BUILD_OPTIONS
  xfs: fix u32 type usage in sb validation function

Merge branch 'overlayfs-linus' of git://git./linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:
"This work from Amir adds NFS export capability to overlayfs. NFS
  exporting an overlay filesystem is a challange because we want to keep
  track of any copy-up of a file or directory between encoding the file
  handle and decoding it.

  This is achieved by indexing copied up objects by lower layer file
  handle. The index is already used for hard links, this patchset
  extends the use to NFS file handle decoding"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (51 commits)
  ovl: check ERR_PTR() return value from ovl_encode_fh()
  ovl: fix regression in fsnotify of overlay merge dir
  ovl: wire up NFS export operations
  ovl: lookup indexed ancestor of lower dir
  ovl: lookup connected ancestor of dir in inode cache
  ovl: hash non-indexed dir by upper inode for NFS export
  ovl: decode pure lower dir file handles
  ovl: decode indexed dir file handles
  ovl: decode lower file handles of unlinked but open files
  ovl: decode indexed non-dir file handles
  ovl: decode lower non-dir file handles
  ovl: encode lower file handles
  ovl: copy up before encoding non-connectable dir file handle
  ovl: encode non-indexed upper file handles
  ovl: decode connected upper dir file handles
  ovl: decode pure upper file handles
  ovl: encode pure upper file handles
  ovl: document NFS export
  vfs: factor out helpers d_instantiate_anon() and d_alloc_anon()
  ovl: store 'has_upper' and 'opaque' as bit flags
  ...

Merge tag 'rproc-v4.16' of git://github.com/andersson/remoteproc

Pull remoteproc updates from Bjorn Andersson:
"This contains a few bug fixes and a cleanup up of the resource-table
  handling in the framework, which removes the need for drivers with no
  resource table to provide a fake one"

* tag 'rproc-v4.16' of git://github.com/andersson/remoteproc:
  remoteproc: Reset table_ptr on stop
  remoteproc: Drop dangling find_rsc_table dummies
  remoteproc: Move resource table load logic to find
  remoteproc: Don't handle empty resource table
  remoteproc: Merge rproc_ops and rproc_fw_ops
  remoteproc: Clone rproc_ops in rproc_alloc()
  remoteproc: Cache resource table size
  remoteproc: Remove depricated crash completion
  virtio_remoteproc: correct put_device virtio_device.dev

Merge tag 'rpmsg-v4.16' of git://github.com/andersson/remoteproc

Pull rpmsg updates from Bjorn Andersson:
"This fixes a few issues found in the SMD and GLINK drivers and
  corrects the handling of SMD channels that are found in an
  (previously) unexpected state"

* tag 'rpmsg-v4.16' of git://github.com/andersson/remoteproc:
  rpmsg: smd: Fix double unlock in __qcom_smd_send()
  rpmsg: glink: Fix missing mutex_init() in qcom_glink_alloc_channel()
  rpmsg: smd: Don't hold the tx lock during wait
  rpmsg: smd: Fail send on a closed channel
  rpmsg: smd: Wake up all waiters
  rpmsg: smd: Create device for all channels
  rpmsg: smd: Perform handshake during open
  rpmsg: glink: smem: Ensure ordering during tx
  drivers: rpmsg: remove duplicate includes
  remoteproc: qcom: Use PTR_ERR_OR_ZERO() in glink prob

Merge tag 'mmc-v4.16-2' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

- renesas_sdhi: Fix build error in case NO_DMA=y

- sdhci: Implement a bounce buffer to address throughput regressions

* tag 'mmc-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: MMC_SDHI_{SYS,INTERNAL}_DMAC should depend on HAS_DMA
mmc: sdhci: Implement an SDHCI-specific bounce buffer

Merge tag 'pwm/for-4.16-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
"The Meson PWM controller driver gains support for the AXG series and a
  minor bug is fixed for the STMPE driver.

  To round things off, the class is now set for PWM channels exported
  via sysfs which allows non-root access, provided that the system has
  been configured accordingly"

* tag 'pwm/for-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: meson: Add clock source configuration for Meson-AXG
  dt-bindings: pwm: Update bindings for the Meson-AXG
  pwm: stmpe: Fix wrong register offset for hwpwm=2 case
  pwm: Set class for exported channels in sysfs

net: mediatek: Explicitly include pinctrl headers

The Mediatek ethernet driver fails to build after commit 23c35f48f5fb
("pinctrl: remove include file from <linux/device.h>") because it relies
on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the
device.h header implicitly.

Include these headers explicitly to avoid the build failure.

Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: meson-gx-mmc: Explicitly include pinctr/consumer.h

The Meson GX MMC driver fails to build after commit 23c35f48f5fb
("pinctrl: remove include file from <linux/device.h>") because it relies
on the pinctrl/consumer.h being pulled in by the device.h header
implicitly.

Include the header explicitly to avoid the build failure.

Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/rockchip: lvds: Explicitly include pinctrl headers

The Rockchip LVDS driver fails to build after commit 23c35f48f5fb
("pinctrl: remove include file from <linux/device.h>") because it relies
on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the
device.h header implicitly.

Include these headers explicitly to avoid the build failure.

Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pinctrl: files should directly include apis they use

Fixes: 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ovl: check ERR_PTR() return value from ovl_encode_fh()

Another fix for an issue reported by 0-day robot.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 8ed5eec9d6c4 ("ovl: encode pure upper file handles")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: fix regression in fsnotify of overlay merge dir

A re-factoring patch in NFS export series has passed the wrong argument
to ovl_get_inode() causing a regression in the very recent fix to
fsnotify of overlay merge dir.

The regression has caused merge directory inodes to be hashed by upper
instead of lower real inode, when NFS export and directory indexing is
disabled. That caused an inotify watch to become obsolete after directory
copy up and drop caches.

LTP test inotify07 was improved to catch this regression.
The regression also caused multiple redirect dirs to same origin not to
be detected on lookup with NFS export disabled. An xfstest was added to
cover this case.

Fixes: 0aceb53e73be ("ovl: do not pass overlay dentry to ovl_get_inode()")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip

Pull spectre/meltdown updates from Thomas Gleixner:
"The next round of updates related to melted spectrum:

   - The initial set of spectre V1 mitigations:

       - Array index speculation blocker and its usage for syscall,
         fdtable and the n180211 driver.

       - Speculation barrier and its usage in user access functions

   - Make indirect calls in KVM speculation safe

   - Blacklisting of known to be broken microcodes so IPBP/IBSR are not
     touched.

   - The initial IBPB support and its usage in context switch

   - The exposure of the new speculation MSRs to KVM guests.

   - A fix for a regression in x86/32 related to the cpu entry area

   - Proper whitelisting for known to be safe CPUs from the mitigations.

   - objtool fixes to deal proper with retpolines and alternatives

   - Exclude __init functions from retpolines which speeds up the boot
     process.

   - Removal of the syscall64 fast path and related cleanups and
     simplifications

   - Removal of the unpatched paravirt mode which is yet another source
     of indirect unproteced calls.

   - A new and undisputed version of the module mismatch warning

   - A couple of cleanup and correctness fixes all over the place

  Yet another step towards full mitigation. There are a few things still
  missing like the RBS underflow mitigation for Skylake and other small
  details, but that's being worked on.

  That said, I'm taking a belated christmas vacation for a week and hope
  that everything is magically solved when I'm back on Feb 12th"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
  KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
  KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
  KVM/x86: Add IBPB support
  KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX
  x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
  x86/pti: Mark constant arrays as __initconst
  x86/spectre: Simplify spectre_v2 command line parsing
  x86/retpoline: Avoid retpolines for built-in __init functions
  x86/kvm: Update spectre-v1 mitigation
  KVM: VMX: make MSR bitmaps per-VCPU
  x86/paravirt: Remove 'noreplace-paravirt' cmdline option
  x86/speculation: Use Indirect Branch Prediction Barrier in context switch
  x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
  x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"
  x86/spectre: Report get_user mitigation for spectre_v1
  nl80211: Sanitize array index in parse_txq_params
  vfs, fdtable: Prevent bounds-check bypass via speculative execution
  x86/syscall: Sanitize syscall table de-references under speculation
  x86/get_user: Use pointer masking to limit speculation
  ...

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"A small set of changes:

   - a fixup for kexec related to 5-level paging mode. That covers most
     of the cases except kexec from a 5-level kernel to a 4-level
     kernel. The latter needs more work and is going to come in 4.17

   - two trivial fixes for build warnings triggered by LTO and gcc-8"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/power: Fix swsusp_arch_resume prototype
  x86/dumpstack: Avoid uninitlized variable
  x86/kexec: Make kexec (mostly) work in 5-level paging mode

Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"Two small changes:

   - a fix for a interrupt regression caused by the vector management
     changes in 4.15 affecting museum pieces which rely on interrupt
     probing for legacy (e.g. parallel port) devices.

     One of the startup calls in the autoprobe code was not changed to
     the new activate_and_startup() function resulting in a warning and
     as a consequence failing to discover the device interrupt.

   - a trivial update to the copyright/license header of the STM32 irq
     chip driver"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Make legacy autoprobing work again
  irqchip/stm32: Fix copyright

Merge tag 'for-linus-20180204' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
"Most of this is fixes and not new code/features:

   - skd fix from Arnd, fixing a build error dependent on sla allocator
     type.

   - blk-mq scheduler discard merging fixes, one from me and one from
     Keith. This fixes a segment miscalculation for blk-mq-sched, where
     we mistakenly think two segments are physically contigious even
     though the request isn't carrying real data. Also fixes a bio-to-rq
     merge case.

   - Don't re-set a bit on the buffer_head flags, if it's already set.
     This can cause scalability concerns on bigger machines and
     workloads. From Kemi Wang.

   - Add BLK_STS_DEV_RESOURCE return value to blk-mq, allowing us to
     distuingish between a local (device related) resource starvation
     and a global one. The latter might happen without IO being in
     flight, so it has to be handled a bit differently. From Ming"

* tag 'for-linus-20180204' of git://git.kernel.dk/linux-block:
  block: skd: fix incorrect linux/slab_def.h inclusion
  buffer: Avoid setting buffer bits that are already set
  blk-mq-sched: Enable merging discard bio into request
  blk-mq: fix discard merge with scheduler attached
  blk-mq: introduce BLK_STS_DEV_RESOURCE

Merge tag 'ntb-4.16' of git://github.com/jonmason/ntb

Pull NTB updates from Jon Mason:
"Bug fixes galore, removal of the ntb atom driver, and updates to the
  ntb tools and tests to support the multi-port interface"

* tag 'ntb-4.16' of git://github.com/jonmason/ntb: (37 commits)
  NTB: ntb_perf: fix cast to restricted __le32
  ntb_perf: Fix an error code in perf_copy_chunk()
  ntb_hw_switchtec: Make function switchtec_ntb_remove() static
  NTB: ntb_tool: fix memory leak on 'buf' on error exit path
  NTB: ntb_perf: fix printing of resource_size_t
  NTB: ntb_hw_idt: Set NTB_TOPO_SWITCH topology
  NTB: ntb_test: Update ntb_perf tests
  NTB: ntb_test: Update ntb_tool MW tests
  NTB: ntb_test: Add ntb_tool Message tests
  NTB: ntb_test: Update ntb_tool Scratchpad tests
  NTB: ntb_test: Update ntb_tool DB tests
  NTB: ntb_test: Update ntb_tool link tests
  NTB: ntb_test: Add ntb_tool port tests
  NTB: ntb_test: Safely use paths with whitespace
  NTB: ntb_perf: Add full multi-port NTB API support
  NTB: ntb_tool: Add full multi-port NTB API support
  NTB: ntb_pp: Add full multi-port NTB API support
  NTB: Fix UB/bug in ntb_mw_get_align()
  NTB: Set dma mask and dma coherent mask to NTB devices
  NTB: Rename NTB messaging API methods
  ...

Merge tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration

Pull mailbox updates from Jassi Brar:
"Misc driver changes only:

   - TI-MsgMgr: Fix print format for a printk

   - TI-MSgMgr: SPDX license switch for the driver

   - QCOM-IPC: Convert driver to use regmap

   - QCOM-IPC: Spawn sibling clock device from mailbox driver"

* tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  dt-bindings: mailbox: qcom: Document the APCS clock binding
  mailbox: qcom: Create APCS child device for clock controller
  mailbox: qcom: Convert APCS IPC driver to use regmap
  mailbox: ti-msgmgr: Use %zu for size_t print format
  mailbox: ti-msgmgr: Switch to SPDX Licensing

Merge branch 'i2c/for-4.16' of git://git./linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
"I2C has the following changes for you:

   - new flag to mark DMA safe buffers in i2c_msg. Also, some
     infrastructure around it. And docs.

   - huge refactoring of the at24 driver led by the new maintainer
     Bartosz

   - update I2C bus recovery to send STOP after recovery

   - conversion from gpio to gpiod for I2C bus recovery

   - adding a fault-injector to the i2c-gpio driver

   - lots of small driver improvements, and bigger ones to
     i2c-sh_mobile"

* 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (99 commits)
  i2c: mv64xxx: Add myself as maintainer for this driver
  i2c: mv64xxx: Fix clock resource by adding an optional bus clock
  i2c: mv64xxx: Remove useless test before clk_disable_unprepare
  i2c: mxs: use true and false for boolean values
  i2c: meson: update doc description to fix build warnings
  i2c: meson: add configurable divider factors
  dt-bindings: i2c: update documentation for the Meson-AXG
  i2c: imx-lpi2c: add runtime pm support
  i2c: rcar: fix some trivial typos in comments
  i2c: davinci: fix the cpufreq transition
  i2c: rk3x: add proper kerneldoc header
  i2c: rk3x: account for const type of of_device_id.data
  i2c: acorn: remove outdated path from file header
  i2c: acorn: add MODULE_LICENSE tag
  i2c: rcar: implement bus recovery
  i2c: send STOP after successful bus recovery
  i2c: ensure SDA is released in recovery if SDA is controllable
  i2c: add 'set_sda' to bus_recovery_info
  i2c: add identifier in declarations for i2c_bus_recovery
  i2c: make kerneldoc about bus recovery more precise
  ...

Merge tag 'fscrypt_for_linus' of git://git./linux/kernel/git/tytso/fscrypt

Pull fscrypt updates from Ted Ts'o:
"Refactor support for encrypted symlinks to move common code to fscrypt"

Ted also points out about the merge:
"This makes the f2fs symlink code use the fscrypt_encrypt_symlink()
  from the fscrypt tree. This will end up dropping the kzalloc() ->
  f2fs_kzalloc() change, which means the fscrypt-specific allocation
  won't get tested by f2fs's kmalloc error injection system; which is
  fine"

* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: (26 commits)
  fscrypt: fix build with pre-4.6 gcc versions
  fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info()
  fscrypt: document symlink length restriction
  fscrypt: fix up fscrypt_fname_encrypted_size() for internal use
  fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names
  fscrypt: calculate NUL-padding length in one place only
  fscrypt: move fscrypt_symlink_data to fscrypt_private.h
  fscrypt: remove fscrypt_fname_usr_to_disk()
  ubifs: switch to fscrypt_get_symlink()
  ubifs: switch to fscrypt ->symlink() helper functions
  ubifs: free the encrypted symlink target
  f2fs: switch to fscrypt_get_symlink()
  f2fs: switch to fscrypt ->symlink() helper functions
  ext4: switch to fscrypt_get_symlink()
  ext4: switch to fscrypt ->symlink() helper functions
  fscrypt: new helper function - fscrypt_get_symlink()
  fscrypt: new helper functions for ->symlink()
  fscrypt: trim down fscrypt.h includes
  fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c
  fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h
  ...

dt-bindings: mailbox: qcom: Document the APCS clock binding

Update the binding documentation for APCS to mention that the APCS
hardware block also expose a clock controller functionality.

The APCS clock controller is a mux and half-integer divider. It has the
main CPU PLL as an input and provides the clock for the application CPU.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

mailbox: qcom: Create APCS child device for clock controller

There is a clock controller functionality provided by the APCS hardware
block of msm8916 devices. The device-tree would represent an APCS node
with both mailbox and clock provider properties.
Create a platform child device for the clock controller functionality so
the driver can probe and use APCS as parent.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

mailbox: qcom: Convert APCS IPC driver to use regmap

This hardware block provides more functionalities that just IPC. Convert
it to regmap to allow other child platform devices to use the same regmap.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

Merge tag 'usercopy-v4.16-rc1' of git://git./linux/kernel/git/kees/linux

Pull hardened usercopy whitelisting from Kees Cook:
"Currently, hardened usercopy performs dynamic bounds checking on slab
  cache objects. This is good, but still leaves a lot of kernel memory
  available to be copied to/from userspace in the face of bugs.

  To further restrict what memory is available for copying, this creates
  a way to whitelist specific areas of a given slab cache object for
  copying to/from userspace, allowing much finer granularity of access
  control.

  Slab caches that are never exposed to userspace can declare no
  whitelist for their objects, thereby keeping them unavailable to
  userspace via dynamic copy operations. (Note, an implicit form of
  whitelisting is the use of constant sizes in usercopy operations and
  get_user()/put_user(); these bypass all hardened usercopy checks since
  these sizes cannot change at runtime.)

  This new check is WARN-by-default, so any mistakes can be found over
  the next several releases without breaking anyone's system.

  The series has roughly the following sections:
   - remove %p and improve reporting with offset
   - prepare infrastructure and whitelist kmalloc
   - update VFS subsystem with whitelists
   - update SCSI subsystem with whitelists
   - update network subsystem with whitelists
   - update process memory with whitelists
   - update per-architecture thread_struct with whitelists
   - update KVM with whitelists and fix ioctl bug
   - mark all other allocations as not whitelisted
   - update lkdtm for more sensible test overage"

* tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
  lkdtm: Update usercopy tests for whitelisting
  usercopy: Restrict non-usercopy caches to size 0
  kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
  kvm: whitelist struct kvm_vcpu_arch
  arm: Implement thread_struct whitelist for hardened usercopy
  arm64: Implement thread_struct whitelist for hardened usercopy
  x86: Implement thread_struct whitelist for hardened usercopy
  fork: Provide usercopy whitelisting for task_struct
  fork: Define usercopy region in thread_stack slab caches
  fork: Define usercopy region in mm_struct slab caches
  net: Restrict unwhitelisted proto caches to size 0
  sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
  sctp: Define usercopy region in SCTP proto slab cache
  caif: Define usercopy region in caif proto slab cache
  ip: Define usercopy region in IP proto slab cache
  net: Define usercopy region in struct proto slab cache
  scsi: Define usercopy region in scsi_sense_cache slab cache
  cifs: Define usercopy region in cifs_request slab cache
  vxfs: Define usercopy region in vxfs_inode slab cache
  ufs: Define usercopy region in ufs_inode_cache slab cache
  ...

KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL

[ Based on a patch from Paolo Bonzini <pbonzini@redhat.com> ]

... basically doing exactly what we do for VMX:

- Passthrough SPEC_CTRL to guests (if enabled in guest CPUID)
- Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest
actually used it.

Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: kvm@vger.kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de

KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL

[ Based on a patch from Ashok Raj <ashok.raj@intel.com> ]

Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
be using a retpoline+IBPB based approach.

To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for
guests that do not actually use the MSR, only start saving and restoring
when a non-zero is written to it.

No attempt is made to handle STIBP here, intentionally. Filtering STIBP
may be added in a future patch, which may require trapping all writes
if we don't want to pass it through directly to the guest.

[dwmw2: Clean up CPUID bits, save/restore manually, handle reset]

Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: kvm@vger.kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de

KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES

Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO
(bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the
contents will come directly from the hardware, but user-space can still
override it.

[dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional]

Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: kvm@vger.kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de

KVM/x86: Add IBPB support

The Indirect Branch Predictor Barrier (IBPB) is an indirect branch
control mechanism. It keeps earlier branches from influencing
later ones.

Unlike IBRS and STIBP, IBPB does not define a new mode of operation.
It's a command that ensures predicted branch targets aren't used after
the barrier. Although IBRS and IBPB are enumerated by the same CPUID
enumeration, IBPB is very different.

IBPB helps mitigate against three potential attacks:

* Mitigate guests from being attacked by other guests.
  - This is addressed by issing IBPB when we do a guest switch.

* Mitigate attacks from guest/ring3->host/ring3.
  These would require a IBPB during context switch in host, or after
  VMEXIT. The host process has two ways to mitigate
  - Either it can be compiled with retpoline
  - If its going through context switch, and has set !dumpable then
    there is a IBPB in that path.
    (Tim's patch: https://patchwork.kernel.org/patch/10192871)
  - The case where after a VMEXIT you return back to Qemu might make
    Qemu attackable from guest when Qemu isn't compiled with retpoline.
  There are issues reported when doing IBPB on every VMEXIT that resulted
  in some tsc calibration woes in guest.

* Mitigate guest/ring0->host/ring0 attacks.
  When host kernel is using retpoline it is safe against these attacks.
  If host kernel isn't using retpoline we might need to do a IBPB flush on
  every VMEXIT.

Even when using retpoline for indirect calls, in certain conditions 'ret'
can use the BTB on Skylake-era CPUs. There are other mitigations
available like RSB stuffing/clearing.

* IBPB is issued only for SVM during svm_free_vcpu().
  VMX has a vmclear and SVM doesn't.  Follow discussion here:
  https://lkml.org/lkml/2018/1/15/146

Please refer to the following spec for more details on the enumeration
and control.

Refer here to get documentation about mitigations.

https://software.intel.com/en-us/side-channel-security-support

[peterz: rebase and changelog rewrite]
[karahmed: - rebase
           - vmx: expose PRED_CMD if guest has it in CPUID
           - svm: only pass through IBPB if guest has it in CPUID
           - vmx: support !cpu_has_vmx_msr_bitmap()]
           - vmx: support nested]
[dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
        PRED_CMD is a write-only MSR]

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: kvm@vger.kernel.org
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com
Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de

KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX

[dwmw2: Stop using KF() for bits in it, too]
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Link: https://lkml.kernel.org/r/1517522386-18410-2-git-send-email-karahmed@amazon.de

Merge tag 'pstore-v4.16-rc1' of git://git./linux/kernel/git/kees/linux

Pull pstore update from Kees Cook:
"Only a header cleanup this release; nice and quiet. :)

- clean up hardirq header usage (Yang Shi)"

* tag 'pstore-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
fs: pstore: remove unused hardirq.h

Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
"Only miscellaneous cleanups and bug fixes for ext4 this cycle"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: create ext4_kset dynamically
  ext4: create ext4_feat kobject dynamically
  ext4: release kobject/kset even when init/register fail
  ext4: fix incorrect indentation of if statement
  ext4: correct documentation for grpid mount option
  ext4: use 'sbi' instead of 'EXT4_SB(sb)'
  ext4: save error to disk in __ext4_grp_locked_error()
  jbd2: fix sphinx kernel-doc build warnings
  ext4: fix a race in the ext4 shutdown path
  mbcache: make sure c_entry_count is not decremented past zero
  ext4: no need flush workqueue before destroying it
  ext4: fixed alignment and minor code cleanup in ext4.h
  ext4: fix ENOSPC handling in DAX page fault handler
  dax: pass detailed error code from dax_iomap_fault()
  mbcache: revert "fs/mbcache.c: make count_objects() more robust"
  mbcache: initialize entry->e_referenced in mb_cache_entry_create()
  ext4: fix up remaining files with SPDX cleanups

Merge branch 'dmi-for-linus' of git://git./linux/kernel/git/jdelvare/staging

Pull dmi subsystem updates/fixes from Jean Delvare.

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi: handle missing DMI data gracefully
  firmware: dmi_scan: Fix handling of empty DMI strings
  firmware: dmi_scan: Drop dmi_initialized
  firmware: dmi: Optimize dmi_matches

Merge branch 'fixes-v4.16-rc1' of git://git./linux/kernel/git/jmorris/linux-security

Pull integrity fixes from James Morris:

-  add James Bottommley as a Trusted Keys maintainer.

- IMA: re-initialize iint->atomic_flags on iint_free(), from Mimi.

* 'fixes-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  ima: re-initialize iint->atomic_flags
  maintainers: update trusted keys

Merge branch 'msr-bitmaps' of git://git./virt/kvm/kvm into x86/pti

Pull the KVM prerequisites so the IBPB patches apply.

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) The bnx2x can hang if you give it a GSO packet with a segment size
    which is too big for the hardware, detect and drop in this case.
    From Daniel Axtens.

2) Fix some overflows and pointer leaks in xtables, from Dmitry Vyukov.

3) Missing RCU locking in igmp, from Eric Dumazet.

4) Fix RX checksum handling on r8152, it can only checksum UDP and TCP
    packets. From Hayes Wang.

5) Minor pacing tweak to TCP BBR congestion control, from Neal
    Cardwell.

6) Missing RCU annotations in cls_u32, from Paolo Abeni.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
  Revert "defer call to mem_cgroup_sk_alloc()"
  soreuseport: fix mem leak in reuseport_add_sock()
  net: qlge: use memmove instead of skb_copy_to_linear_data
  net: qed: use correct strncpy() size
  net: cxgb4: avoid memcpy beyond end of source buffer
  cls_u32: add missing RCU annotation.
  r8152: set rx mode early when linking on
  r8152: fix wrong checksum status for received IPv4 packets
  nfp: fix TLV offset calculation
  net: pxa168_eth: add netconsole support
  net: igmp: add a missing rcu locking section
  ibmvnic: fix firmware version when no firmware level has been provided by the VIOS server
  vmxnet3: remove redundant initialization of pointer 'rq'
  lan78xx: remove redundant initialization of pointer 'phydev'
  net: jme: remove unused initialization of 'rxdesc'
  rtnetlink: remove check for IFLA_IF_NETNSID
  rocker: fix possible null pointer dereference in rocker_router_fib_event_work
  inet: Avoid unitialized variable warning in inet_unhash()
  net: bridge: Fix uninitialized error in br_fdb_sync_static()
  openvswitch: Remove padding from packet before L3+ conntrack processing
  ...

Merge tag 'gfs2-4.16.fixes2' of git://git./linux/kernel/git/gfs2/linux-gfs2

Pull GFS2 fixes from Bob Peterson:
"Andreas Gruenbacher wrote two additional patches that we would like
  merged in this time. Both are regressions:

   - fix another kernel build dependency problem

   - fix a performance regression in glock dumps"

* tag 'gfs2-4.16.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Glock dump performance regression fix
  gfs2: Fix the crc32c dependency

Merge tag 'scsi-postmerge' of git://git./linux/kernel/git/jejb/scsi

Pull second set of SCSI updates from James Bottomley:
"This is a set of three patches that depended on mq and zone changes in
  the block tree (now upstream)"

* tag 'scsi-postmerge' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Remove zone write locking
  scsi: sd_zbc: Initialize device request queue zoned data
  scsi: scsi-mq-debugfs: Show more information

Merge tag 'linux-kselftest-4.16-rc1' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates from Shuah Khan:
"This update to Kselftest consists of fixes, cleanups, and SPDX license
  additions"

* tag 'linux-kselftest-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: vm: update .gitignore with missing generated file
  selftests/x86: Add <test_name>{,_32,_64} targets
  selftests: Fix loss of test output in run_kselftests.sh
  selftest: ftrace: Fix to add 256 kprobe events correctly
  selftest: ftrace: Fix to pick text symbols for kprobes
  selftests: media_tests: Add SPDX license identifier
  selftests: kselftest.h: Add SPDX license identifier
  selftests: kselftest_install.sh: Add SPDX license identifier
  selftests: gen_kselftest_tar.h: Add SPDX license identifier
  selftests: media_tests: Fix Makefile 'clean' target warning
  tools/testing: Fix trailing semicolon
  kselftest: fix OOM in memory compaction test
  selftests: seccomp: fix compile error seccomp_bpf

pinctrl: remove include file from <linux/device.h>

When pulling the recent pinctrl merge, I was surprised by how a
pinctrl-only pull request ended up rebuilding basically the whole
kernel.

The reason for that ended up being that <linux/device.h> included
<linux/pinctrl/devinfo.h>, so any change to that file ended up causing
pretty much every driver out there to be rebuilt.

The reason for that was because 'struct device' has this in it:

    #ifdef CONFIG_PINCTRL
        struct dev_pin_info     *pins;
    #endif

but we already avoid header includes for these kinds of things in that
header file, preferring to just use a forward-declaration of the
structure instead.  Exactly to avoid this kind of header dependency.

Since some drivers seem to expect that <linux/pinctrl/devinfo.h> header
to come in automatically, move the include to <linux/pinctrl/pinctrl.h>
instead.  It might be better to just make the includes more targeted,
but I'm not going to review every driver.

It would definitely be good to have a tool for finding and minimizing
header dependencies automatically - or at least help with them.  Right
now we almost certainly end up having way too many of these things, and
it's hard to test every single configuration.

FWIW, you can get a sense of the "hotness" of a header file with something
like this after doing a full build:

    find . -name '.*.o.cmd' -print0 |
        xargs -0 tail --lines=+2 |
        grep -v 'wildcard ' |
        tr ' \\' '\n' |
        sort | uniq -c | sort -n | less -S

which isn't exact (there are other things in those '*.o.cmd' than just
the dependencies, and the "--lines=+2" only removes the header), but
might a useful approximation.

With this patch, <linux/pinctrl/devinfo.h> drops to "only" having 833
users in the current x86-64 allmodconfig.  In contrast, <linux/device.h>
has 14857 build files including it directly or indirectly.

Of course, the headers that absolutely _everybody_ includes (things like
<linux/types.h> etc) get a score of 23000+.

Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

firmware: dmi: handle missing DMI data gracefully

Currently, when booting a kernel with DMI support on a platform that has
no DMI tables, the following output is emitted into the kernel log:

  [    0.128818] DMI not present or invalid.
  ...
  [    1.306659] dmi: Firmware registration failed.
  ...
  [    2.908681] dmi-sysfs: dmi entry is absent.

The first one is a pr_info(), but the subsequent ones are pr_err()s that
complain about a condition that is not really an error to begin with.

So let's clean this up, and give up silently if dma_available is not set.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Martin Hundebøll <mnhu@prevas.dk>
Signed-off-by: Jean Delvare <jdelvare@suse.de>

firmware: dmi_scan: Fix handling of empty DMI strings

The handling of empty DMI strings looks quite broken to me:
* Strings from 1 to 7 spaces are not considered empty.
* True empty DMI strings (string index set to 0) are not considered
  empty, and result in allocating a 0-char string.
* Strings with invalid index also result in allocating a 0-char
  string.
* Strings starting with 8 spaces are all considered empty, even if
  non-space characters follow (sounds like a weird thing to do, but
  I have actually seen occurrences of this in DMI tables before.)
* Strings which are considered empty are reported as 8 spaces,
  instead of being actually empty.

Some of these issues are the result of an off-by-one error in memcmp,
the rest is incorrect by design.

So let's get it square: missing strings and strings made of only
spaces, regardless of their length, should be treated as empty and
no memory should be allocated for them. All other strings are
non-empty and should be allocated.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 79da4721117f ("x86: fix DMI out of memory problems")
Cc: Parag Warudkar <parag.warudkar@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>

firmware: dmi_scan: Drop dmi_initialized

I don't think it makes sense to check for a possible bad
initialization order at run time on every system when it is all
decided at build time.

A more efficient way to make sure developers do not introduce new
calls to dmi_check_system() too early in the initialization sequence
is to simply document the expected call order. That way, developers
have a chance to get it right immediately, without having to
test-boot their kernel, wonder why it does not work, and parse the
kernel logs for a warning message. And we get rid of the run-time
performance penalty as a nice side effect.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Ingo Molnar <mingo@kernel.org>

firmware: dmi: Optimize dmi_matches

Function dmi_matches can me made a bit faster:

* The documented purpose of dmi_initialized is to catch too early
  calls to dmi_check_system(). I'm not fully convinced it justifies
  slowing down the initialization of all systems out there, but at
  least the check should not have been moved from dmi_check_system()
  to dmi_matches(). dmi_matches() is being called for every entry of
  the table passed to dmi_check_system(), causing the same redundant
  check to be performed again and again. So move it back to
  dmi_check_system(), reverting this specific portion of commit
  d7b1956fed33 ("DMI: Introduce dmi_first_match to make the interface
  more flexible").

* Don't check for the exact_match flag again when we already know its
  value.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: d7b1956fed33 ("DMI: Introduce dmi_first_match to make the interface more flexible")
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Jeff Garzik <jgarzik@redhat.com>

Revert "defer call to mem_cgroup_sk_alloc()"

This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol:
defer call to mem_cgroup_sk_alloc()").

Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
memcg socket memory accounting, as packets received before memcg
pointer initialization are not accounted and are causing refcounting
underflow on socket release.

Actually the free-after-use problem was fixed by
commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in
sk_clone_lock()") for the cgroup pointer.

So, let's revert it and call mem_cgroup_sk_alloc() just before
cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
we're cloning, and it holds a reference to the memcg.

Also, let's drop BUG_ON(mem_cgroup_is_root()) check from
mem_cgroup_sk_alloc(). I see no reasons why bumping the root
memcg counter is a good reason to panic, and there are no realistic
ways to hit it.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

soreuseport: fix mem leak in reuseport_add_sock()

reuseport_add_sock() needs to deal with attaching a socket having
its own sk_reuseport_cb, after a prior
setsockopt(SO_ATTACH_REUSEPORT_?BPF)

Without this fix, not only a WARN_ONCE() was issued, but we were also
leaking memory.

Thanks to sysbot and Eric Biggers for providing us nice C repros.

------------[ cut here ]------------
socket already in reuseport group
WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119
reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  panic+0x1e4/0x41c kernel/panic.c:183
  __warn+0x1dc/0x200 kernel/panic.c:547
  report_bug+0x211/0x2d0 lib/bug.c:184
  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
  fixup_bug arch/x86/kernel/traps.c:247 [inline]
  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
  invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079

Fixes: ef456144da8e ("soreuseport: define reuseport groups")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+c0ea2226f77a42936bf7@syzkaller.appspotmail.com
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qlge: use memmove instead of skb_copy_to_linear_data

gcc-8 points out that the skb_copy_to_linear_data() argument points to
the skb itself, which makes it run into a problem with overlapping
memcpy arguments:

In file included from include/linux/ip.h:20,
from drivers/net/ethernet/qlogic/qlge/qlge_main.c:26:
drivers/net/ethernet/qlogic/qlge/qlge_main.c: In function 'ql_realign_skb':
include/linux/skbuff.h:3378:2: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
memcpy(skb->data, from, len);

It's unclear to me what the best solution is, maybe it ought to use a
different helper that adjusts the skb data in a safe way. Simply using
memmove() here seems like the easiest workaround.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qed: use correct strncpy() size

passing the strlen() of the source string as the destination
length is pointless, and gcc-8 now warns about it:

drivers/net/ethernet/qlogic/qed/qed_debug.c: In function 'qed_grc_dump':
include/linux/string.h:253: error: 'strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]

This changes qed_grc_dump_big_ram() to instead uses the length of
the destination buffer, and use strscpy() to guarantee nul-termination.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cxgb4: avoid memcpy beyond end of source buffer

Building with link-time-optimizations revealed that the cxgb4 driver does
a fixed-size memcpy() from a variable-length constant string into the
network interface name:

In function 'memcpy',
    inlined from 'cfg_queues_uld.constprop' at drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:335:2,
    inlined from 'cxgb4_register_uld.constprop' at drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:719:9:
include/linux/string.h:350:3: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter
   __read_overflow2();
   ^

I can see two equally workable solutions: either we use a strncpy() instead
of the memcpy() to stop at the end of the input, or we make the source buffer
fixed length as well. This implements the latter.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

cls_u32: add missing RCU annotation.

In a couple of points of the control path, n->ht_down is currently
accessed without the required RCU annotation. The accesses are
safe, but sparse complaints. Since we already held the
rtnl lock, let use rtnl_dereference().

Fixes: a1b7c5fd7fe9 ("net: sched: add cls_u32 offload hooks for netdevs")
Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'r8152-fix-rx-issues'

Hayes Wang says:

====================
r8152: fix rx issues

The two patched are used to fix rx issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: set rx mode early when linking on

Set rx mode before calling netif_wake_queue() when linking on to avoid
the device missing the receiving packets.

The transmission may start after calling netif_wake_queue(), and the
packets of resopnse may reach before calling rtl8152_set_rx_mode()
which let the device could receive packets. Then, the packets of
response would be missed.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: fix wrong checksum status for received IPv4 packets

The device could only check the checksum of TCP and UDP packets. Therefore,
for the IPv4 packets excluding TCP and UDP, the check of checksum is necessary,
even though the IP checksum is correct.

Take ICMP for example, The IP checksum may be correct, but the ICMP checksum
may be wrong.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: fix TLV offset calculation

The data pointer in the config space TLV parser already includes
NFP_NET_CFG_TLV_BASE, it should not be added again. Incorrect
offset values were only used in printed user output, rendering
the bug merely cosmetic.

Fixes: 73a0329b057e ("nfp: add TLV capabilities to the BAR")
Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'firewire-updates' of git://git./linux/kernel/git/ieee1394/linux1394

Pull firewire updates from Stefan Richter

  - make JMicron JMB38x controllers work with IOMMU-equipped systems

  - IP-over-1394: allow user-configured MTU of up to 4096 bytes

* tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire-ohci: work around oversized DMA reads on JMicron controllers
  firewire: net: max MTU off by one

x86/power: Fix swsusp_arch_resume prototype

The declaration for swsusp_arch_resume marks it as 'asmlinkage', but the
definition in x86-32 does not, and it fails to include the header with the
declaration. This leads to a warning when building with
link-time-optimizations:

kernel/power/power.h:108:23: error: type of 'swsusp_arch_resume' does not match original declaration [-Werror=lto-type-mismatch]
extern asmlinkage int swsusp_arch_resume(void);
^
arch/x86/power/hibernate_32.c:148:0: note: 'swsusp_arch_resume' was previously declared here
int swsusp_arch_resume(void)

This moves the declaration into a globally visible header file and fixes up
both x86 definitions to match it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Len Brown <len.brown@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: linux-pm@vger.kernel.org
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Link: https://lkml.kernel.org/r/20180202145634.200291-2-arnd@arndb.de

x86/dumpstack: Avoid uninitlized variable

In some configurations, 'partial' does not get initialized, as shown by
this gcc-8 warning:

arch/x86/kernel/dumpstack.c: In function 'show_trace_log_lvl':
arch/x86/kernel/dumpstack.c:156:4: error: 'partial' may be used uninitialized in this function [-Werror=maybe-uninitialized]
show_regs_if_on_stack(&stack_info, regs, partial);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This initializes it to false, to get the previous behavior in this case.

Fixes: a9cdbe72c4e8 ("x86/dumpstack: Fix partial register dumps")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: https://lkml.kernel.org/r/20180202145634.200291-1-arnd@arndb.de

Merge tag 'pinctrl-v4.16-1' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
"This is the bulk of pin control changes for the v4.16 kernel cycle.
  Like with GPIO it is actually a bit calm this time.

  Core changes:

   - After lengthy discussions and partly due to my ignorance, we have
     merged a patch making pinctrl_force_default() and
     pinctrl_force_sleep() reprogram the states into the hardware of any
     hogged pins, even if they are already in the desired state.

     This only apply to hogged pins since groups of pins owned by
     drivers need to be managed by each driver, lest they could not do
     things like runtime PM and put pins to sleeping state even if the
     system as a whole is not in sleep.

  New drivers:

   - New driver for the Microsemi Ocelot SoC. This is used in ethernet
     switches.

   - The X-Powers AXP209 GPIO driver was extended to also deal with pin
     control and moved over from the GPIO subsystem. This circuit is a
     mixed-mode integrated circuit which is part of AllWinner designs.

   - New subdriver for the Qualcomm MSM8998 SoC, core of a high end
     mobile devices (phones) chipset.

   - New subdriver for the ST Microelectronics STM32MP157 MPU and
     STM32F769 MCU from the STM32 family.

   - New subdriver for the MediaTek MT7622 SoC. This is used for
     routers, repeater, gateways and such network infrastructure.

   - New subdriver for the NXP (former Freescale) i.MX 6ULL. This SoC
     has multimedia features and target "smart devices", I guess in-car
     entertainment, in-flight entertainment, industrial control panels
     etc.

  General improvements:

   - Incremental improvements on the SH-PFC subdrivers for things like
     the CAN bus.

   - Enable the glitch filter on Baytrail GPIOs used for interrupts.

   - Proper handling of pins to GPIO ranges on the Semtec SX150X

   - An IRQ setup ordering fix on MCP23S08.

   - A good set of janitorial coding style fixes"

* tag 'pinctrl-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (102 commits)
  pinctrl: mcp23s08: fix irq setup order
  pinctrl: Forward declare struct device
  pinctrl: sunxi: Use of_clk_get_parent_count() instead of open coding
  pinctrl: stm32: add STM32F769 MCU support
  pinctrl: sx150x: Add a static gpio/pinctrl pin range mapping
  pinctrl: sx150x: Register pinctrl before adding the gpiochip
  pinctrl: sx150x: Unregister the pinctrl on release
  pinctrl: ingenic: Remove redundant dev_err call in ingenic_pinctrl_probe()
  pinctrl: sprd: Use seq_putc() in sprd_pinconf_group_dbg_show()
  pinctrl: pinmux: Use seq_putc() in pinmux_pins_show()
  pinctrl: abx500: Use seq_putc() in abx500_gpio_dbg_show()
  pinctrl: mediatek: mt7622: align error handling of mtk_hw_get_value call
  pinctrl: mediatek: mt7622: fix potential uninitialized value being returned
  pinctrl: uniphier: refactor drive strength get/set functions
  pinctrl: imx7ulp: constify struct imx_cfg_params_decode
  pinctrl: imx: constify struct imx_pinctrl_soc_info
  pinctrl: imx7d: simplify imx7d_pinctrl_probe
  pinctrl: imx: use struct imx_pinctrl_soc_info as a const
  pinctrl: sunxi-pinctrl: fix pin funtion can not be match correctly.
  pinctrl: qcom: Add msm8998 pinctrl driver
  ...

Merge tag 'rtc-4.16' of git://git./linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
"Not much this cycle. I've pushed the at32ap700x removal late but it is
  unlikely to cause any issues.

  Summary:

  Subsystem:
   - Move ABI documentation to Documentation/ABI

  New driver:
   - NXP i.MX53 SRTC
   - Chrome OS EC RTC

  Drivers:
   - Remove at32ap700x
   - Many fixes in various error paths"

* tag 'rtc-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: remove rtc-at32ap700x
  Documentation: rtc: move iotcl interface documentation to ABI
  Documentation: rtc: add sysfs file permissions
  Documentation: rtc: move sysfs documentation to ABI
  rtc: mxc_v2: remove __exit annotation
  rtc: mxc_v2: Remove unnecessary platform_get_resource() error check
  rtc: add mxc driver for i.MX53 SRTC
  dt-bindings: rtc: add bindings for i.MX53 SRTC
  rtc: r7301: Fix a possible sleep-in-atomic bug in rtc7301_set_time
  rtc: r7301: Fix a possible sleep-in-atomic bug in rtc7301_read_time
  rtc: omap: fix unbalanced clk_prepare_enable/clk_disable_unprepare
  rtc: ac100: Fix multiple race conditions
  rtc: sun6i: ensure rtc is kfree'd on error
  rtc: cros-ec: add cros-ec-rtc driver.
  mfd: cros_ec: Introduce RTC commands and events definitions.
  rtc: stm32: Fix copyright
  rtc: Remove unused RTC_DEVICE_NAME_SIZE
  rtc: r9701: Remove r9701_remove function
  rtc: brcmstb-waketimer: fix error handling in brcmstb_waketmr_probe()

x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL

Fixes: 117cc7a908c83 ("x86/retpoline: Fill return stack buffer on vmexit")
Signed-off-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/20180202191220.blvgkgutojecxr3b@starbug-vm.ie.oracle.com

x86/pti: Mark constant arrays as __initconst

I'm seeing build failures from the two newly introduced arrays that
are marked 'const' and '__initdata', which are mutually exclusive:

arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init'
arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init'

The correct annotation is __initconst.

Fixes: fec9434a12f3 ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de

Merge branch 'for-linus' of git://git./linux/kernel/git/mattst88/alpha

Pull alpha updates from Matt Turner:
"A few small fixes and clean ups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
  alpha: fix crash if pthread_create races with signal delivery
  alpha: fix formating of stack content
  alpha: fix reboot on Avanti platform
  alpha: deprecate pci_get_bus_and_slot()
  alpha: Fix mixed up args in EXC macro in futex operations
  alpha: osf_sys.c: use timespec64 where appropriate
  alpha: osf_sys.c: fix put_tv32 regression
  alpha: make thread_saved_pc static
  alpha: make XTABS equivalent to TAB3

Merge tag 'powerpc-4.16-1' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
"Highlights:

   - Enable support for memory protection keys aka "pkeys" on Power7/8/9
     when using the hash table MMU.

   - Extend our interrupt soft masking to support masking PMU interrupts
     as well as "normal" interrupts, and then use that to implement
     local_t for a ~4x speedup vs the current atomics-based
     implementation.

   - A new driver "ocxl" for "Open Coherent Accelerator Processor
     Interface (OpenCAPI)" devices.

   - Support for new device tree properties on PowerVM to describe
     hotpluggable memory and devices.

   - Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 64-bit
     VDSO.

   - Freescale updates from Scott: fixes for CPM GPIO and an FSL PCI
     erratum workaround, plus a minor cleanup patch.

  As well as quite a lot of other changes all over the place, and small
  fixes and cleanups as always.

  Thanks to: Alan Modra, Alastair D'Silva, Alexey Kardashevskiy,
  Alistair Popple, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V,
  Anju T Sudhakar, Anshuman Khandual, Anton Blanchard, Arnd Bergmann,
  Balbir Singh, Benjamin Herrenschmidt, Bhaktipriya Shridhar, Bryant G.
  Ly, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Cyril Bur,
  David Gibson, Desnes A. Nunes do Rosario, Dmitry Torokhov, Frederic
  Barrat, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo A. R. Silva,
  Gustavo Romero, Ivan Mikhaylov, Joakim Tjernlund, Joe Perches, Josh
  Poimboeuf, Juan J. Alvarez, Julia Cartwright, Kamalesh Babulal,
  Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre, Michael
  Bringmann, Michael Hanselmann, Michael Neuling, Nathan Fontenot,
  Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Philippe Bergheaud,
  Ram Pai, Russell Currey, Santosh Sivaraj, Scott Wood, Seth Forshee,
  Simon Guo, Stewart Smith, Sukadev Bhattiprolu, Thiago Jung Bauermann,
  Vaibhav Jain, Vasyl Gomonovych"

* tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (199 commits)
  powerpc/mm/radix: Fix build error when RADIX_MMU=n
  macintosh/ams-input: Use true and false for boolean values
  macintosh: change some data types from int to bool
  powerpc/watchdog: Print the NIP in soft_nmi_interrupt()
  powerpc/watchdog: regs can't be null in soft_nmi_interrupt()
  powerpc/watchdog: Tweak watchdog printks
  powerpc/cell: Remove axonram driver
  rtc-opal: Fix handling of firmware error codes, prevent busy loops
  powerpc/mpc52xx_gpt: make use of raw_spinlock variants
  macintosh/adb: Properly mark continued kernel messages
  powerpc/pseries: Fix cpu hotplug crash with memoryless nodes
  powerpc/numa: Ensure nodes initialized for hotplug
  powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
  powerpc/kernel: Block interrupts when updating TIDR
  powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn
  powerpc/mm/nohash: do not flush the entire mm when range is a single page
  powerpc/pseries: Add Initialization of VF Bars
  powerpc/pseries/pci: Associate PEs to VFs in configure SR-IOV
  powerpc/eeh: Add EEH notify resume sysfs
  powerpc/eeh: Add EEH operations to notify resume
  ...

Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM updates from Russell King:

- StrongARM SA1111 updates to modernise and remove cruft

- Add StrongARM gpio drivers for board GPIOs

- Verify size of zImage is what we expect to avoid issues with
   appended DTB

- nommu updates from Vladimir Murzin

- page table read-write-execute checking from Jinbum Park

- Broadcom Brahma-B15 cache updates from Florian Fainelli

- Avoid failure with kprobes test caused by inappropriately
   placed kprobes

- Remove __memzero optimisation (which was incorrectly being
   used directly by some drivers)

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (32 commits)
  ARM: 8745/1: get rid of __memzero()
  ARM: 8744/1: don't discard memblock for kexec
  ARM: 8743/1: bL_switcher: add MODULE_LICENSE tag
  ARM: 8742/1: Always use REFCOUNT_FULL
  ARM: 8741/1: B15: fix unused label warnings
  ARM: 8740/1: NOMMU: Make sure we do not hold stale data in mem[] array
  ARM: 8739/1: NOMMU: Setup VBAR/Hivecs for secondaries cores
  ARM: 8738/1: Disable CONFIG_DEBUG_VIRTUAL for NOMMU
  ARM: 8737/1: mm: dump: add checking for writable and executable
  ARM: 8736/1: mm: dump: make the page table dumping seq_file
  ARM: 8735/1: mm: dump: make page table dumping reusable
  ARM: sa1100/neponset: add GPIO drivers for control and modem registers
  ARM: sa1100/assabet: add BCR/BSR GPIO driver
  ARM: 8734/1: mm: idmap: Mark variables as ro_after_init
  ARM: 8733/1: hw_breakpoint: Mark variables as __ro_after_init
  ARM: 8732/1: NOMMU: Allow userspace to access background MPU region
  ARM: 8727/1: MAINTAINERS: Update brcmstb entries to cover B15 code
  ARM: 8728/1: B15: Register reboot notifier for KEXEC
  ARM: 8730/1: B15: Add suspend/resume hooks
  ARM: 8726/1: B15: Add CPU hotplug awareness
  ...

Merge tag 'microblaze-4.16-rc1' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze updates from Michal Simek:

- Fix endian handling and Kconfig dependency

- Fix iounmap prototype

* tag 'microblaze-4.16-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Setup proper dependency for optimized lib functions
  microblaze: fix iounmap prototype
  microblaze: fix endian handling

block: skd: fix incorrect linux/slab_def.h inclusion

skd includes slab_def.h to get access to the slab cache object size.
However, including this header breaks when we use SLUB or SLOB instead of
the SLAB allocator, since the structure layout is completely different,
as shown by this warning when we build this driver in one of the invalid
configurations with link-time optimizations enabled:

include/linux/slab.h:715:0: error: type of 'kmem_cache_size' does not match original declaration [-Werror=lto-type-mismatch]
unsigned int kmem_cache_size(struct kmem_cache *s);

mm/slab_common.c:77:14: note: 'kmem_cache_size' was previously declared here
unsigned int kmem_cache_size(struct kmem_cache *s)
^
mm/slab_common.c:77:14: note: code may be misoptimized unless -fno-strict-aliasing is used
include/linux/slab.h:147:0: error: type of 'kmem_cache_destroy' does not match original declaration [-Werror=lto-type-mismatch]
void kmem_cache_destroy(struct kmem_cache *);

mm/slab_common.c:858:6: note: 'kmem_cache_destroy' was previously declared here
void kmem_cache_destroy(struct kmem_cache *s)
^
mm/slab_common.c:858:6: note: code may be misoptimized unless -fno-strict-aliasing is used
include/linux/slab.h:140:0: error: type of 'kmem_cache_create' does not match original declaration [-Werror=lto-type-mismatch]
struct kmem_cache *kmem_cache_create(const char *name, size_t size,

mm/slab_common.c:534:1: note: 'kmem_cache_create' was previously declared here
kmem_cache_create(const char *name, size_t size, size_t align,
^

This removes the header inclusion and instead uses the kmem_cache_size()
interface to get the size in a reliable way.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

buffer: Avoid setting buffer bits that are already set

It's expensive to set buffer flags that are already set, because that
causes a costly cache line transition.

A common case is setting the "verified" flag during ext4 writes.
This patch checks for the flag being set first.

With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4
file system, we see 3.3%(15431->15936) improvement of aim7.jobs-per-min on
a 2-sockets broadwell platform.

What the benchmark does is: it forks 3000 processes, and each process do
the following:
a) open a new file
b) close the file
c) delete the file
until loop=100*1000 times.

The original patch is contributed by Andi Kleen.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Kemi Wang <kemi.wang@intel.com>
Signed-off-by: Kemi Wang <kemi.wang@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

x86/spectre: Simplify spectre_v2 command line parsing

[dwmw2: Use ARRAY_SIZE]

Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: bp@alien8.de
Link: https://lkml.kernel.org/r/1517484441-1420-3-git-send-email-dwmw@amazon.co.uk

x86/retpoline: Avoid retpolines for built-in __init functions

There's no point in building init code with retpolines, since it runs before
any potentially hostile userspace does. And before the retpoline is actually
ALTERNATIVEd into place, for much of it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: karahmed@amazon.de
Cc: peterz@infradead.org
Cc: bp@alien8.de
Link: https://lkml.kernel.org/r/1517484441-1420-2-git-send-email-dwmw@amazon.co.uk

ima: re-initialize iint->atomic_flags

Intermittently security.ima is not being written for new files. This
patch re-initializes the new slab iint->atomic_flags field before
freeing it.

Fixes: commit 0d73a55208e9 ("ima: re-introduce own integrity cache lock")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>

maintainers: update trusted keys

Adding James Bottomley as the new maintainer for trusted keys.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>

xfs: remove experimental tag for reverse mapping

Reverse mapping has had a while to soak, so remove the experimental tag.
Now that we've landed space metadata cross-referencing in scrub, the
feature actually has a purpose.

Reject rmap filesystems with an rt device until the code to support it
is actually implemented.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>

xfs: don't allow reflink + realtime filesystems

We don't support realtime filesystems with reflink either, so fail
those mounts.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>

xfs: don't allow DAX on reflink filesystems

Now that reflink is no longer experimental, reject attempts to mount
with DAX until that whole mess gets sorted out.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>