review.tizen.org Git - platform/kernel/linux-starfive.git/log

staging: lustre: replace l_wait_event_exclusive_head() with wait_event_idle_exclusive

This l_wait_event_exclusive_head() will wait indefinitely
if the timeout is zero. If it does wait with a timeout
and times out, the timeout for next time is set to zero.

The can be mapped to a call to either
wait_event_idle_exclusive()
or
wait_event_idle_exclusive_timeout()
depending in the timeout setting.

The current code arranges for LIFO queuing of waiters,
but include/event.h doesn't support that yet.
Until it does, fall back on FIFO with
wait_event_idle_exclusive{,_timeout}().

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: remove l_wait_event from ptlrpc_set_wait

This is the last remaining use of l_wait_event().
It is the only use of LWI_TIMEOUT_INTR_ALL() which
has a meaning that timeouts can be interrupted.
Only interrupts by "fatal" signals are allowed, so
introduce l_wait_event_abortable_timeout() to
support this.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: use explicit poll loop in ptlrpc_unregister_reply

replace l_wait_event() with wait_event_idle_timeout() and explicit
loop. This approach is easier to understand.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: use explicit poll loop in ptlrpc_service_unlink_rqbd

Rather an using l_wait_event(), use wait_event_idle_timeout()
with an explicit loop so it is easier to see what is happening.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: improve waiting in sptlrpc_req_refresh_ctx

Replace l_wait_event with wait_event_idle_timeout() and call the
handler function explicitly. This makes it more clear
what is happening.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: use wait_event_idle_timeout in ptlrpcd()

We can replace l_wait_event() with
wait_event_idle_timeout() here providing we call the
timeout function when wait_event_idle_timeout() returns zero.

As ptlrpc_expired_set() returns 1, the l_wait_event() aborts of the
first timeout.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: make polling loop in ptlrpc_unregister_bulk more obvious

This use of l_wait_event() is a polling loop that re-checks
every second. Make this more obvious with a while loop
and wait_event_idle_timeout().

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: remove back_to_sleep()

When 'back_to_sleep()' is passed as the 'timeout' function,
the effect is to wait indefinitely for the event, polling
once after the timeout.
If LWI_ON_SIGNAL_NOOP is given, then after the timeout
we allow fatal signals to interrupt the wait.

Make this more obvious in both places "back_to_sleep()" is
used but using two explicit sleeps.

The code in ptlrpcd_add_req() looks odd - why not just have one
wait_event_idle()? However I believe this is a faithful
transformation of the existing code.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: simplify waiting in ptlrpc_invalidate_import()

This waiter currently wakes up every second to re-test if
imp_flight is zero. If we ensure wakeup is called whenever
imp_flight is decremented to zero, we can just have a simple
wait_event_idle_timeout().

So add a wake_up_all to the one place it is missing, and simplify
the wait_event.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: open code polling loop instead of using l_wait_event()

Two places that LWI_TIMEOUT_INTERVAL() is used, the outcome is a
simple polling loop that polls every second for some event (with a
limit).

So write a simple loop to make this more apparent.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: simplify waiting in ldlm_completion_ast()

If a signal-callback (lwi_on_signal) is set without lwi_allow_intr, as
is the case in ldlm_completion_ast(), the behavior depends on the
timeout set.

If a timeout is set, then signals are ignored.  If the timeout is
reached, the timeout handler is called.  If the timeout handler
return 0, which ldlm_expired_completion_wait() always does, the
l_wait_event() switches to exactly the behavior if no timeout was set.

If no timeout is set, then "fatal" signals are not ignored.  If one
arrives the callback is run, but as the callback is empty in this
case, that is not relevant.

This can be simplified to:
if a timeout is wanted
     wait_event_idle_timeout()
     if that timed out, call the timeout handler
l_wait_event_abortable()

i.e. the code always waits indefinitely.  Sometimes it performs a
non-abortable wait first.  Sometimes it doesn't.  But it only
aborts before the condition is true if it is signaled.
This doesn't quite agree with the comments and debug messages.

Now that we call the timeout handler (ldlm_expired_completion_wait())
wait directly, we can pass the two args directly rather then
using a special-purpose struct.

Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: simplify l_wait_event when intr handler but no timeout.

If l_wait_event() is given a function to be called on a signal,
but no timeout or timeout handler, then the intr function is simply
called at the end if the wait was aborted by a signal.
So a simpler way to write the code (in the one place this case is
used) it to open-code the body of the function after the
wait_event, if -ERESTARTSYS was returned.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: introduce and use l_wait_event_abortable()

lustre sometimes wants to wait for an event, but abort if
one of a specific list of signals arrives. This is a little
bit like wait_event_killable(), except that the signals are
identified a different way.

So introduce l_wait_event_abortable() which provides this
functionality.
Having separate functions for separate needs is more in line
with the pattern set by include/linux/wait.h, than having a
single function which tries to include all possible needs.

Also introduce l_wait_event_abortable_exclusive().

Note that l_wait_event() return -EINTR on a signal, while
Linux wait_event functions return -ERESTARTSYS.
l_wait_event_{abortable_,}exclusive follow the Linux pattern.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: use wait_event_idle_timeout() where appropriate.

When the lwi arg has a timeout, but no timeout
callback function, l_wait_event() acts much the same as
wait_event_idle_timeout() - the wait is not interruptible and
simply waits for the event or the timeouts.

The most noticable difference is that the return value is
-ETIMEDOUT or 0, rather than 0 or non-zero.

Another difference is that if the timeout is zero, l_wait_event()
will not time out at all.  In the one case where that is possible
we need to conditionally use wait_event_idle().

So replace all such calls with wait_event_idle_timeout(), being
careful of the return value.

In one case, there is no event expected, only the timeout
is needed.  So use schedule_timeout_uninterruptible().

Note that the presence or absence of LWI_ON_SIGNAL_NOOP
has no effect in these cases.  It only has effect if the timeout
callback is non-NULL, or the timeout is zero, or
LWI_TIMEOUT_INTR_ALL() is used.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: discard cfs_time_seconds()

cfs_time_seconds() converts a number of seconds to the
matching number of jiffies.
The standard way to do this in Linux is "* HZ".
So discard cfs_time_seconds() and use "* HZ" instead.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: replace simple cases of l_wait_event() with wait_event().

When the lwi arg is full of zeros, l_wait_event() behaves almost
identically to the standard wait_event_idle() interface, so use that
instead.

l_wait_event() uses TASK_INTERRUPTIBLE, but blocks all signals.
wait_event_idle() uses the new TASK_IDLE and so avoids adding
to the load average without needing to block signals.

In one case, wait_event_idle_exclusive() is needed.

Also remove all l_wait_condition*() macros which were short-cuts
for setting lwi to {0}.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: discard SVC_SIGNAL and related functions

This flag is never set, so remove checks and remove
the flag.

Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sched/wait: add wait_event_idle() functions.

The new TASK_IDLE state (TASK_UNINTERRUPTIBLE | __TASK_NOLOAD)
is not much used.  One way to make it easier to use is to
add wait_event*() family functions that make use of it.
This patch adds:
  wait_event_idle()
  wait_event_idle_timeout()
  wait_event_idle_exclusive()
  wait_event_idle_exclusive_timeout()

This set was chosen because lustre needs them before
it can discard its own l_wait_event() macro.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: llite: refine ll_find_alias based on d_exact_alias

The task of ll_find_alias() is now very similar to d_exact_alias().
We cannot use that function directly, but we can copy much of
the structure so that the similarities and differences are more
obvious.
Examining d_exact_alias() shows that the d_lock spinlock does not
need to be held in ll_find_alias as much as it currently is.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: llite: simplify ll_find_alias()

Now that ll_find_alias is only searching for one type
of dentry, we can return as soon as we find it.
This allows substantial simplification, and brings the
bonus that we don't need to take the d_lock again just
to increment the ref-count. We can increment it immediately
that the dentry is found.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: llite: remove directory-specific code from ll_find_alias()

Now that ll_find_alias() is never called for directories,
we can remove code that only applies to directories.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: llite: use d_splice_alias for directories.

In the Linux dcache a directory only ever has one dentry,
so d_splice_alias() can be used by ll_splice_alias() for directories.
It will find the one dentry whether it is DCACHE_DISCONNECTED or
IS_ROOT() or d_lustre_invalid().
Separating out the directories from non-directories will allow us
to simplify the non-directory code.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: llite: handle DCACHE_PAR_LOOKUP in ll_dcompare

ll_dcompare is used in two slightly different contexts.
It is called (from __d_lookup, __d_lookup_rcu, and d_exact_alias)
to compare a name against a dentry that is already in the dcache.
It is also called (from d_alloc_parallel) to compare a name against
a dentry that is not in the dcache yet, but is part of an active
"lookup" or "atomic_open" call.

In the first case we need to avoid matching against "invalid" dentries
as a match implies something about ldlm locks which is not accurate.
In the second case we need to allow matching against "invalid" dentries
as the dentry will always be invalid (set by ll_d_init()) but we still
want to guard against multiple concurrent lookups of the same name.
d_alloc_parallel() will repeat the call to ll_dcompare() after
the lookup has finished, and if the dentry is still invalid, the whole
d_alloc_parallel() process is repeated. This assures us that it is safe
to report success whenever d_in_lookup().

With this patch, there will never be two threads concurrently in
ll_lookup_nd(), looking up the same name in the same directory.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: honor error code from ll_iget().

Commit 020ecc6f3229 ("staging: lustre: llite: Remove IS_ERR tests")
changed ll_prep_inode to assume any error from ll_iget() meant
-ENOMEM because at that time it only returned NULL for errors.
Commit c3397e7e677b ("staging: lustre: llite: add error handler in
inode prepare phase") changed ll_iget() to once again return
meaningful codes, but nobody told ll_prep_inode().

So change ll_prep_inode() back to using PTR_ERR(*inode).

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: fix inverted test on strcmp

This code tests various fields to see if they are different, except
for one where there test is if they are the same.
This is clearly wrong for a function that is tesding for equality.

So change "!strcmp()" which I always find hard to read, to
"strcmp() != 0" which obviously means that the strings are not equal.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: update the TODO list

As more people become involved with the progression of the lustre
client it needs to more clear what needs to be done to leave
staging. Update the TODO list with the various bugs and changes
to accomplish this. Some are simple bugs and others are far more
complex task that will change many lines of code. Some even cover
updating the user land utilities to meet the kernel requirements.
Several bugs have already been addressed and just need to be
pushed to the staging tree.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: lustre: lnet: return of an error code should be negative

Return value of error codes should typically be negative.
Issue reported by checkpatch.pl

Signed-off-by: Sumit Pundir <pundirsumit11@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: rtl8188eu: Move a blank line

Move a blank line from in the middle of a declaration list to
after the declaration list, to improve readability.

Issue found by checkpatch.

Signed-off-by: Dafna Hirschfeld <dafna3@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 4.16-rc1

unify {de,}mangle_poll(), get rid of kernel-side POLL...

except, again, POLLFREE and POLL_BUSY_LOOP.

With this, we finally get to the promised end result:

- POLL{IN,OUT,...} are plain integers and *not* in __poll_t, so any
   stray instances of ->poll() still using those will be caught by
   sparse.

- eventpoll.c and select.c warning-free wrt __poll_t

- no more kernel-side definitions of POLL... - userland ones are
   visible through the entire kernel (and used pretty much only for
   mangle/demangle)

- same behavior as after the first series (i.e. sparc et.al. epoll(2)
   working correctly).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vfs: do bulk POLL* -> EPOLL* replacement

This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^$[^\"]*$$\<POLL$V\>$/\\1E\\2/" $f; done
    done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do.  But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we
should be all done.

Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'work.poll2' of git://git./linux/kernel/git/viro/vfs

Pull more poll annotation updates from Al Viro:
"This is preparation to solving the problems you've mentioned in the
  original poll series.

  After this series, the kernel is ready for running

      for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
            L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
            for f in $L; do sed -i "-es/^$[^\"]*$$\<POLL$V\>$/\\1E\\2/" $f; done
      done

  as a for bulk search-and-replace.

  After that, the kernel is ready to apply the patch to unify
  {de,}mangle_poll(), and then get rid of kernel-side POLL... uses
  entirely, and we should be all done with that stuff.

  Basically, that's what you suggested wrt KPOLL..., except that we can
  use EPOLL... instead - they already are arch-independent (and equal to
  what is currently kernel-side POLL...).

  After the preparations (in this series) switch to returning EPOLL...
  from ->poll() instances is completely mechanical and kernel-side
  POLL... can go away. The last step (killing kernel-side POLL... and
  unifying {de,}mangle_poll() has to be done after the
  search-and-replace job, since we need userland-side POLL... for
  unified {de,}mangle_poll(), thus the cherry-pick at the last step.

  After that we will have:

   - POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of
     ->poll() still using those will be caught by sparse.

   - eventpoll.c and select.c warning-free wrt __poll_t

   - no more kernel-side definitions of POLL... - userland ones are
     visible through the entire kernel (and used pretty much only for
     mangle/demangle)

   - same behavior as after the first series (i.e. sparc et.al. epoll(2)
     working correctly)"

* 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  annotate ep_scan_ready_list()
  ep_send_events_proc(): return result via esed->res
  preparation to switching ->poll() to returning EPOLL...
  add EPOLLNVAL, annotate EPOLL... and event_poll->event
  use linux/poll.h instead of asm/poll.h
  xen: fix poll misannotation
  smc: missing poll annotations

Merge tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa

Pull xtense fix from Max Filippov:
"Build fix for xtensa architecture with KASAN enabled"

* tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: fix build with KASAN

Merge tag 'nios2-v4.16-rc1' of git://git./linux/kernel/git/lftan/nios2

Pull nios2 update from Ley Foon Tan:

- clean up old Kconfig options from defconfig

- remove leading 0x and 0s from bindings notation in dts files

* tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: defconfig: Cleanup from old Kconfig options
nios2: dts: Remove leading 0x and 0s from bindings notation

xtensa: fix build with KASAN

The commit 917538e212a2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT
usage") removed KASAN_SHADOW_SCALE_SHIFT definition from
include/linux/kasan.h and added it to architecture-specific headers,
except for xtensa. This broke the xtensa build with KASAN enabled.
Define KASAN_SHADOW_SCALE_SHIFT in arch/xtensa/include/asm/kasan.h

Reported by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 917538e212a2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage")
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>

nios2: defconfig: Cleanup from old Kconfig options

Remove old, dead Kconfig option INET_LRO. It is gone since
commit 7bbf3cae65b6 ("ipv4: Remove inet_lro library").

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>

nios2: dts: Remove leading 0x and 0s from bindings notation

Improve the DTS files by removing all the leading "0x" and zeros to fix the
following dtc warnings:

Warning (unit_address_format): Node /XXX unit name should not have leading "0x"

and

Warning (unit_address_format): Node /XXX unit name should not have leading 0s

Converted using the following command:

find . -type f $ -iname *.dts -o -iname *.dtsi $ -exec sed -E -i -e "s/@0x([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" -e "s/@0+([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" {} +

For simplicity, two sed expressions were used to solve each warnings separately.

To make the regex expression more robust a few other issues were resolved,
namely setting unit-address to lower case, and adding a whitespace before the
the opening curly brace:

https://elinux.org/Device_Tree_Linux#Linux_conventions

This is a follow up to commit 4c9847b7375a ("dt-bindings: Remove leading 0x from bindings notation")

Reported-by: David Daney <ddaney@caviumnetworks.com>
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>

Merge tag 'pci-v4.16-fixes-1' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fix from Bjorn Helgaas:
"Fix a POWER9/powernv INTx regression from the merge window (Alexey
Kardashevskiy)"

* tag 'pci-v4.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
powerpc/pci: Fix broken INTx configuration via OF

Merge tag 'for-linus-20180210' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A few fixes to round off the merge window on the block side:

   - a set of bcache fixes by way of Michael Lyle, from the usual bcache
     suspects.

   - add a simple-to-hook-into function for bpf EIO error injection.

   - fix blk-wbt that mischarectized flushes as reads. Improve the logic
     so that flushes and writes are accounted as writes, and only reads
     as reads. From me.

   - fix requeue crash in BFQ, from Paolo"

* tag 'for-linus-20180210' of git://git.kernel.dk/linux-block:
  block, bfq: add requeue-request hook
  bcache: fix for data collapse after re-attaching an attached device
  bcache: return attach error when no cache set exist
  bcache: set writeback_rate_update_seconds in range [1, 60] seconds
  bcache: fix for allocator and register thread race
  bcache: set error_limit correctly
  bcache: properly set task state in bch_writeback_thread()
  bcache: fix high CPU occupancy during journal
  bcache: add journal statistic
  block: Add should_fail_bio() for bpf error injection
  blk-wbt: account flush requests correctly

Merge tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86

Pull x86 platform driver updates from Darren Hart:
"Mellanox fixes and new system type support.

  Mostly data for new system types with a correction and an
  uninitialized variable fix"

[ Pulling from github because git.infradead.org currently seems to be
  down for some reason, but Darren had a backup location    - Linus ]

* tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86:
  platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems
  platform/x86: mlx-platform: Add support for new msn201x system type
  platform/x86: mlx-platform: Add support for new msn274x system type
  platform/x86: mlx-platform: Fix power cable setting for msn21xx family
  platform/x86: mlx-platform: Add define for the negative bus
  platform/x86: mlx-platform: Use defines for bus assignment
  platform/mellanox: mlxreg-hotplug: Fix uninitialized variable

Merge tag 'chrome-platform-for-linus-4.16' of git://git./linux/kernel/git/bleung/chrome-platform

Pull chrome platform updates from Benson Leung:

- move cros_ec_dev to drivers/mfd

- other small maintenance fixes

[ The cros_ec_dev movement came in earlier through the MFD tree  - Linus ]

* tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
  platform/chrome: Use proper protocol transfer function
  platform/chrome: cros_ec_lpc: Add support for Google Glimmer
  platform/chrome: cros_ec_lpc: Register the driver if ACPI entry is missing.
  platform/chrome: cros_ec_lpc: remove redundant pointer request
  cros_ec: fix nul-termination for firmware build info
  platform/chrome: chromeos_laptop: make chromeos_laptop const

Merge tag 'kvm-4.16-1' of git://git./virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
"ARM:

   - icache invalidation optimizations, improving VM startup time

   - support for forwarded level-triggered interrupts, improving
     performance for timers and passthrough platform devices

   - a small fix for power-management notifiers, and some cosmetic
     changes

  PPC:

   - add MMIO emulation for vector loads and stores

   - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
     requiring the complex thread synchronization of older CPU versions

   - improve the handling of escalation interrupts with the XIVE
     interrupt controller

   - support decrement register migration

   - various cleanups and bugfixes.

  s390:

   - Cornelia Huck passed maintainership to Janosch Frank

   - exitless interrupts for emulated devices

   - cleanup of cpuflag handling

   - kvm_stat counter improvements

   - VSIE improvements

   - mm cleanup

  x86:

   - hypervisor part of SEV

   - UMIP, RDPID, and MSR_SMI_COUNT emulation

   - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit

   - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
     AVX512 features

   - show vcpu id in its anonymous inode name

   - many fixes and cleanups

   - per-VCPU MSR bitmaps (already merged through x86/pti branch)

   - stable KVM clock when nesting on Hyper-V (merged through
     x86/hyperv)"

* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
  KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
  KVM: PPC: Book3S HV: Branch inside feature section
  KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
  KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
  KVM: PPC: Book3S PR: Fix broken select due to misspelling
  KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
  KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
  KVM: PPC: Book3S HV: Drop locks before reading guest memory
  kvm: x86: remove efer_reload entry in kvm_vcpu_stat
  KVM: x86: AMD Processor Topology Information
  x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
  kvm: embed vcpu id to dentry of vcpu anon inode
  kvm: Map PFN-type memory regions as writable (if possible)
  x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
  KVM: arm/arm64: Fixup userspace irqchip static key optimization
  KVM: arm/arm64: Fix userspace_irqchip_in_use counting
  KVM: arm/arm64: Fix incorrect timer_is_pending logic
  MAINTAINERS: update KVM/s390 maintainers
  MAINTAINERS: add Halil as additional vfio-ccw maintainer
  MAINTAINERS: add David as a reviewer for KVM/s390
  ...

powerpc/pci: Fix broken INTx configuration via OF

59f47eff03a0 ("powerpc/pci: Use of_irq_parse_and_map_pci() helper")
replaced of_irq_parse_pci() + irq_create_of_mapping() with
of_irq_parse_and_map_pci(), but neglected to capture the virq
returned by irq_create_of_mapping(), so virq remained zero, which
caused INTx configuration to fail.

Save the virq value returned by of_irq_parse_and_map_pci() and correct
the virq declaration to match the of_irq_parse_and_map_pci() signature.

Fixes: 59f47eff03a0 "powerpc/pci: Use of_irq_parse_and_map_pci() helper"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Merge tag 'kbuild-v4.16-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:
"Makefile changes:
   - enable unused-variable warning that was wrongly disabled for clang

  Kconfig changes:
   - warn about blank 'help' and fix existing instances
   - fix 'choice' behavior to not write out invisible symbols
   - fix misc weirdness

  Coccinell changes:
   - fix false positive of free after managed memory alloc detection
   - improve performance of NULL dereference detection"

* tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
  kconfig: remove const qualifier from sym_expand_string_value()
  kconfig: add xrealloc() helper
  kconfig: send error messages to stderr
  kconfig: echo stdin to stdout if either is redirected
  kconfig: remove check_stdin()
  kconfig: remove 'config*' pattern from .gitignnore
  kconfig: show '?' prompt even if no help text is available
  kconfig: do not write choice values when their dependency becomes n
  coccinelle: deref_null: avoid useless computation
  coccinelle: devm_free: reduce false positives
  kbuild: clang: disable unused variable warnings only when constant
  kconfig: Warn if help text is blank
  nios2: kconfig: Remove blank help text
  arm: vt8500: kconfig: Remove blank help text
  MIPS: kconfig: Remove blank help text
  MIPS: BCM63XX: kconfig: Remove blank help text
  lib/Kconfig.debug: Remove blank help text
  Staging: rtl8192e: kconfig: Remove blank help text
  Staging: rtl8192u: kconfig: Remove blank help text
  mmc: kconfig: Remove blank help text
  ...

mconsole_proc(): don't mess with file->f_pos

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull misc vfs fixes from Al Viro.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
seq_file: fix incomplete reset on read from zero offset
kernfs: fix regression in kernfs_fop_write caused by wrong type

kconfig: remove const qualifier from sym_expand_string_value()

This function returns realloc'ed memory, so the returned pointer
must be passed to free() when done. So, 'const' qualifier is odd.
It is allowed to modify the expanded string.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

kconfig: add xrealloc() helper

We already have xmalloc(), xcalloc(). Add xrealloc() as well
to save tedious error handling.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems

It adds support for new Mellanox system types of basic classes qmb7, sn34,
sn37, containing systems QMB700 (40x200GbE InfiniBand switch), SN3700
(32x200GbE and 16x400GbE Ethernet switch) and SN3410 (6x400GbE plus
48x50GbE Ethernet switch). These are the Top of the Rack systems, equipped
with Mellanox COM-Express carrier board and switch board with Mellanox
Quantum device, which supports InfiniBand switching with 40X200G ports and
line rate of up to HDR speed or with Mellanox Spectrum-2 device, which
supports Ethernet switching with 32X200G ports line rate of up to HDR
speed.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

platform/x86: mlx-platform: Add support for new msn201x system type

It adds support for new Mellanox system types of basic half unit size
class msn201x, containing system MSN2010 (18x10GbE plus 4x4x25GbE) half
and its derivatives. This is the Top of the Rack system, equipped with
Mellanox Small Form Factor carrier board and switch board with Mellanox
Spectrum device, which supports Ethernet switching with 32X100G ports line
rate of up to EDR speed.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

platform/x86: mlx-platform: Add support for new msn274x system type

It adds support for new Mellanox system types of basic class msn274x,
containing system MSN2740 (32x100GbE Ethernet switch with cost reduction)
and its derivatives. These are the Top of the Rack system, equipped with
Mellanox Small Form Factor carrier board and switch board with Mellanox
Spectrum device, which supports Ethernet switching with 32X100G ports line
rate of up to EDR speed.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Make allocations less aggressive in x_tables, from Minchal Hocko.

2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.

3) Fix connection loss problems in rtlwifi, from Larry Finger.

4) Correct DRAM dump length for some chips in ath10k driver, from Yu
    Wang.

5) Fix ABORT handling in rxrpc, from David Howells.

6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.

7) Some ipv6 onlink handling fixes, from David Ahern.

8) Netem packet scheduler interval calcualtion fix from Md. Islam.

9) Don't put crypto buffers on-stack in rxrpc, from David Howells.

10) Fix handling of error non-delivery status in netlink multicast
    delivery over multiple namespaces, from Nicolas Dichtel.

11) Missing xdp flush in tuntap driver, from Jason Wang.

12) Synchonize RDS protocol netns/module teardown with rds object
    management, from Sowini Varadhan.

13) Add nospec annotations to mpls, from Dan Williams.

14) Fix SKB truesize handling in TIPC, from Hoang Le.

15) Interrupt masking fixes in stammc from Niklas Cassel.

16) Don't allow ptr_ring objects to be sized outside of kmalloc's
    limits, from Jason Wang.

17) Don't allow SCTP chunks to be built which will have a length
    exceeding the chunk header's 16-bit length field, from Alexey
    Kodanev.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
  ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
  bpf: fix rlimit in reuseport net selftest
  sctp: verify size of a new chunk in _sctp_make_chunk()
  s390/qeth: fix SETIP command handling
  s390/qeth: fix underestimated count of buffer elements
  ptr_ring: try vmalloc() when kmalloc() fails
  ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
  net: stmmac: remove redundant enable of PMT irq
  net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
  net: stmmac: discard disabled flags in interrupt status register
  ibmvnic: Reset long term map ID counter
  tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
  selftests/bpf: add selftest that use test_libbpf_open
  selftests/bpf: add test program for loading BPF ELF files
  tools/libbpf: improve the pr_debug statements to contain section numbers
  bpf: Sync kernel ABI header with tooling header for bpf_common.h
  net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
  net: thunder: change q_len's type to handle max ring size
  tipc: fix skb truesize/datasize ratio control
  net/sched: cls_u32: fix cls_u32 on filter replace
  ...

Merge tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull more NFS client updates from Trond Myklebust:
"A few bugfixes and some small sunrpc latency/performance improvements
  before the merge window closes:

  Stable fixes:

   - fix an incorrect calculation of the RDMA send scatter gather
     element limit

   - fix an Oops when attempting to free resources after RDMA device
     removal

  Bugfixes:

   - SUNRPC: Ensure we always release the TCP socket in a timely fashion
     when the connection is shut down.

   - SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context

  Latency/Performance:

   - SUNRPC: Queue latency sensitive socket tasks to the less contended
     xprtiod queue

   - SUNRPC: Make the xprtiod workqueue unbounded.

   - SUNRPC: Make the rpciod workqueue unbounded"

* tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context
  fix parallelism for rpc tasks
  Make the xprtiod workqueue unbounded.
  SUNRPC: Queue latency-sensitive socket tasks to xprtiod
  SUNRPC: Ensure we always close the socket after a connection shuts down
  xprtrdma: Fix BUG after a device removal
  xprtrdma: Fix calculation of ri_max_send_sges

Merge branch 'for-next' of git://git./linux/kernel/git/nab/target-pending

Pull SCSI target updates from Nicholas Bellinger:
"The highlights include:

   - numerous target-core-user improvements related to queue full and
     timeout handling. (MNC)

   - prevent target-core-user corruption when invalid data page is
     requested. (MNC)

   - add target-core device action configfs attributes to allow
     user-space to trigger events separate from existing attributes
     exposed to end-users. (MNC)

   - fix iscsi-target NULL pointer dereference 4.6+ regression in CHAP
     error path. (David Disseldorp)

   - avoid target-core backend UNMAP callbacks if range is zero. (Andrei
     Vagin)

   - fix a iscsi-target 4.14+ regression related multiple PDU logins,
     that was exposed due to removal of TCP prequeue support. (Florian
     Westphal + MNC)

  Also, there is a iser-target bug still being worked on for post -rc1
  code to address a long standing issue resulting in persistent
  ib_post_send() failures, for RNICs with small max_send_sge"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
  iscsi-target: make sure to wake up sleeping login worker
  tcmu: Fix trailing semicolon
  tcmu: fix cmd user after free
  target: fix destroy device in target_configure_device
  tcmu: allow userspace to reset ring
  target core: add device action configfs files
  tcmu: fix error return code in tcmu_configure_device()
  target_core_user: add cmd id to broken ring message
  target: add SAM_STAT_BUSY sense reason
  tcmu: prevent corruption when invalid data page requested
  target: don't call an unmap callback if a range length is zero
  target/iscsi: avoid NULL dereference in CHAP auth error path
  cxgbit: call neigh_event_send() to update MAC address
  target: tcm_loop: Use seq_puts() in tcm_loop_show_info()
  target: tcm_loop: Delete an unnecessary return statement in tcm_loop_submission_work()
  target: tcm_loop: Delete two unnecessary variable initialisations in tcm_loop_issue_tmr()
  target: tcm_loop: Combine substrings for 26 messages
  target: tcm_loop: Improve a size determination in two functions
  target: tcm_loop: Delete an error message for a failed memory allocation in four functions
  sbp-target: Delete an error message for a failed memory allocation in three functions
  ...

Merge tag 'trace-v4.16-rc1' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"Al Viro discovered some breakage with the parsing of the
  set_ftrace_filter as well as the removing of function probes.

  This fixes the code with Al's suggestions. I also added a few
  selftests to test the broken cases such that they wont happen
  again"

* tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  selftests/ftrace: Add more tests for removing of function probes
  selftests/ftrace: Add some missing glob checks
  selftests/ftrace: Have reset_ftrace_filter handle multiple instances
  selftests/ftrace: Have reset_ftrace_filter handle modules
  tracing: Fix parsing of globs with a wildcard at the beginning
  ftrace: Remove incorrect setting of glob search field

Merge tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
"There are a couple additional security fixes that are still being
  tested that are not in this set."

* tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  Add missing structs and defines from recent SMB3.1.1 documentation
  address lock imbalance warnings in smbdirect.c
  cifs: silence compiler warnings showing up with gcc-8.0.0
  Add some missing debug fields in server and tcon structs

Merge tag 'fbdev-v4.16-fix' of git://github.com/bzolnier/linux

Pull fbdev fix from Bartlomiej Zolnierkiewicz:
"Fix building of the omapfb driver (Tomi Valkeinen)"

* tag 'fbdev-v4.16-fix' of git://github.com/bzolnier/linux:
video: omapfb: fix missing #includes

Merge tag 'kvm-ppc-next-4.16-2' of git://git./linux/kernel/git/paulus/powerpc

Second PPC KVM update for 4.16

Seven fixes that are either trivial or that address bugs that people
are actually hitting.  The main ones are:

- Drop spinlocks before reading guest memory

- Fix a bug causing corruption of VCPU state in PR KVM with preemption
  enabled

- Make HPT resizing work on POWER9

- Add MMIO emulation for vector loads and stores, because guests now
  use these instructions in memcpy and similar routines.

Merge branch 'msr-bitmaps' of git://git./virt/kvm/kvm

This topic branch allocates separate MSR bitmaps for each VCPU.
This is required for the IBRS enablement to choose, on a per-VM
basis, whether to intercept the SPEC_CTRL and PRED_CMD MSRs;
the IBRS enablement comes in through the tip tree.

ibmvnic: Remove skb->protocol checks in ibmvnic_xmit

Having these checks in ibmvnic_xmit causes problems with VLAN
tagging and balance-alb/tlb bonding modes. The restriction they
imposed can be removed.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: fix rlimit in reuseport net selftest

Fix two issues in the reuseport_bpf selftests that were
reported by Linaro CI:

  [...]
  + ./reuseport_bpf
  ---- IPv4 UDP ----
  Testing EBPF mod 10...
  Reprograming, testing mod 5...
  ./reuseport_bpf: ebpf error. log:
  0: (bf) r6 = r1
  1: (20) r0 = *(u32 *)skb[0]
  2: (97) r0 %= 10
  3: (95) exit
  processed 4 insns
  : Operation not permitted
  + echo FAIL
  [...]
  ---- IPv4 TCP ----
  Testing EBPF mod 10...
  ./reuseport_bpf: failed to bind send socket: Address already in use
  + echo FAIL
  [...]

For the former adjust rlimit since this was the cause of
failure for loading the BPF prog, and for the latter add
SO_REUSEADDR.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://bugs.linaro.org/show_bug.cgi?id=3502
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: verify size of a new chunk in _sctp_make_chunk()

When SCTP makes INIT or INIT_ACK packet the total chunk length
can exceed SCTP_MAX_CHUNK_LEN which leads to kernel panic when
transmitting these packets, e.g. the crash on sending INIT_ACK:

[  597.804948] skbuff: skb_over_panic: text:00000000ffae06e4 len:120168
               put:120156 head:000000007aa47635 data:00000000d991c2de
               tail:0x1d640 end:0xfec0 dev:<NULL>
...
[  597.976970] ------------[ cut here ]------------
[  598.033408] kernel BUG at net/core/skbuff.c:104!
[  600.314841] Call Trace:
[  600.345829]  <IRQ>
[  600.371639]  ? sctp_packet_transmit+0x2095/0x26d0 [sctp]
[  600.436934]  skb_put+0x16c/0x200
[  600.477295]  sctp_packet_transmit+0x2095/0x26d0 [sctp]
[  600.540630]  ? sctp_packet_config+0x890/0x890 [sctp]
[  600.601781]  ? __sctp_packet_append_chunk+0x3b4/0xd00 [sctp]
[  600.671356]  ? sctp_cmp_addr_exact+0x3f/0x90 [sctp]
[  600.731482]  sctp_outq_flush+0x663/0x30d0 [sctp]
[  600.788565]  ? sctp_make_init+0xbf0/0xbf0 [sctp]
[  600.845555]  ? sctp_check_transmitted+0x18f0/0x18f0 [sctp]
[  600.912945]  ? sctp_outq_tail+0x631/0x9d0 [sctp]
[  600.969936]  sctp_cmd_interpreter.isra.22+0x3be1/0x5cb0 [sctp]
[  601.041593]  ? sctp_sf_do_5_1B_init+0x85f/0xc30 [sctp]
[  601.104837]  ? sctp_generate_t1_cookie_event+0x20/0x20 [sctp]
[  601.175436]  ? sctp_eat_data+0x1710/0x1710 [sctp]
[  601.233575]  sctp_do_sm+0x182/0x560 [sctp]
[  601.284328]  ? sctp_has_association+0x70/0x70 [sctp]
[  601.345586]  ? sctp_rcv+0xef4/0x32f0 [sctp]
[  601.397478]  ? sctp6_rcv+0xa/0x20 [sctp]
...

Here the chunk size for INIT_ACK packet becomes too big, mostly
because of the state cookie (INIT packet has large size with
many address parameters), plus additional server parameters.

Later this chunk causes the panic in skb_put_data():

  skb_packet_transmit()
      sctp_packet_pack()
          skb_put_data(nskb, chunk->skb->data, chunk->skb->len);

'nskb' (head skb) was previously allocated with packet->size
from u16 'chunk->chunk_hdr->length'.

As suggested by Marcelo we should check the chunk's length in
_sctp_make_chunk() before trying to allocate skb for it and
discard a chunk if its size bigger than SCTP_MAX_CHUNK_LEN.

Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leinter@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 's390-qeth-fixes'

Julian Wiedmann says:

====================
s390/qeth: fixes 2018-02-09

please apply the following two qeth patches for 4.16 and stable.

One restricts a command quirk to the intended commandd type,
while the other fixes an off-by-one during data transmission
that can cause qeth to build malformed buffer descriptors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: fix SETIP command handling

send_control_data() applies some special handling to SETIP v4 IPA
commands. But current code parses *all* command types for the SETIP
command code. Limit the command code check to IPA commands.

Fixes: 5b54e16f1a54 ("qeth: do not spin for SETIP ip assist command")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: fix underestimated count of buffer elements

For a memory range/skb where the last byte falls onto a page boundary
(ie. 'end' is of the form xxx...xxx001), the PFN_UP() part of the
calculation currently doesn't round up to the next PFN due to an
off-by-one error.
Thus qeth believes that the skb occupies one page less than it
actually does, and may select a IO buffer that doesn't have enough spare
buffer elements to fit all of the skb's data.
HW detects this as a malformed buffer descriptor, and raises an
exception which then triggers device recovery.

Fixes: 2863c61334aa ("qeth: refactor calculation of SBALE count")
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptr_ring: try vmalloc() when kmalloc() fails

This patch switch to use kvmalloc_array() for using a vmalloc()
fallback to help in case kmalloc() fails.

Reported-by: syzbot+e4d4f9ddd4295539735d@syzkaller.appspotmail.com
Fixes: 2e0ab8ca83c12 ("ptr_ring: array based FIFO for pointers")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE

To avoid slab to warn about exceeded size, fail early if queue
occupies more than KMALLOC_MAX_SIZE.

Reported-by: syzbot+e4d4f9ddd4295539735d@syzkaller.appspotmail.com
Fixes: 2e0ab8ca83c12 ("ptr_ring: array based FIFO for pointers")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'stmmac-irq-fixes-cleanups'

Niklas Cassel says:

====================
stmmac irq fixes/cleanups

A couple of small stmmac irq fixes/cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: remove redundant enable of PMT irq

For dwmac4, GMAC_INT_DEFAULT_ENABLE already includes
GMAC_INT_PMT_EN, so it is redundant to check if hw->pmt
is set, and if so, setting the bit again.

For dwmac1000, GMAC_INT_DEFAULT_MASK does not include
GMAC_INT_DISABLE_PMT, so it is redundant to check if
hw->pmt is set, and if so, clearing an already cleared bit.

Improve code readability by removing this redundant code.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4

GMAC_INT_DEFAULT_MASK is written to the interrupt enable register.
In previous versions of the IP (e.g. dwmac1000), this register was
instead an interrupt mask register.
To improve clarity and reflect reality, rename GMAC_INT_DEFAULT_MASK
to GMAC_INT_DEFAULT_ENABLE.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: discard disabled flags in interrupt status register

The interrupt status register in both dwmac1000 and dwmac4 ignores
interrupt enable (for dwmac4) / interrupt mask (for dwmac1000).
Therefore, if we want to check only the bits that can actually trigger
an irq, we have to filter the interrupt status register manually.

Commit 0a764db10337 ("stmmac: Discard masked flags in interrupt status
register") fixed this for dwmac1000. Fix the same issue for dwmac4.

Just like commit 0a764db10337 ("stmmac: Discard masked flags in
interrupt status register"), this makes sure that we do not get
spurious link up/link down prints.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Reset long term map ID counter

When allocating RX or TX buffer pools, the driver needs to provide a
unique mapping ID to firmware for each pool. This value is assigned
using a counter which is incremented after a new pool is created. The
ID can be an integer ranging from 1-255. When migrating to a device
that requests a different number of queues, this value was not being
reset properly. As a result, after enough migrations, the counter
exceeded the upper bound and pool creation failed. This is fixed by
resetting the counter to one in this case.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2018-02-09

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Two fixes for BPF sockmap in order to break up circular map references
   from programs attached to sockmap, and detaching related sockets in
   case of socket close() event. For the latter we get rid of the
   smap_state_change() and plug into ULP infrastructure, which will later
   also be used for additional features anyway such as TX hooks. For the
   second issue, dependency chain is broken up via map release callback
   to free parse/verdict programs, all from John.

2) Fix a libbpf relocation issue that was found while implementing XDP
   support for Suricata project. Issue was that when clang was invoked
   with default target instead of bpf target, then various other e.g.
   debugging relevant sections are added to the ELF file that contained
   relocation entries pointing to non-BPF related sections which libbpf
   trips over instead of skipping them. Test cases for libbpf are added
   as well, from Jesper.

3) Various misc fixes for bpftool and one for libbpf: a small addition
   to libbpf to make sure it recognizes all standard section prefixes.
   Then, the Makefile in bpftool/Documentation is improved to explicitly
   check for rst2man being installed on the system as we otherwise risk
   installing empty man pages; the man page for bpftool-map is corrected
   and a set of missing bash completions added in order to avoid shipping
   bpftool where the completions are only partially working, from Quentin.

4) Fix applying the relocation to immediate load instructions in the
   nfp JIT which were missing a shift, from Jakub.

5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y
   gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL
   in this case; and explicitly delete the veth devices in the two tests
   test_xdp_{meta,redirect}.sh before dismantling the netnses as when
   selftests are run in batch mode, then workqueue to handle destruction
   might not have finished yet and thus veth creation in next test under
   same dev name would fail, from Yonghong.

6) Fix test_kmod.sh to check the test_bpf.ko module path before performing
   an insmod, and fallback to modprobe. Especially the latter is useful
   when having a device under test that has the modules installed instead,
   from Naresh.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'for-linus-4.16-rc1-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
"Only five small fixes for issues when running under Xen"

* tag 'for-linus-4.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Fix {set,clear}_foreign_p2m_mapping on autotranslating guests
  pvcalls-back: do not return error on inet_accept EAGAIN
  xen-netfront: Fix race between device setup and open
  xen/grant-table: Use put_page instead of free_page
  x86/xen: init %gs very early to avoid page faults with stack protector

Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:
"The main thing in this merge is the defense for the Spectre
  vulnerabilities. But there are other updates as well, the changes in
  more detail:

   - An s390 specific implementation of the array_index_mask_nospec
     function to the defense against spectre v1

   - Two patches to utilize the new PPA-12/PPA-13 instructions to run
     the kernel and/or user space with reduced branch predicton.

   - The s390 variant of the 'retpoline' spectre v2 defense called
     'expoline'. There is no return instruction for s390, instead an
     indirect branch is used for function return

     The s390 defense mechanism for indirect branches works by using an
     execute-type instruction with the indirect branch as the target of
     the execute. In effect that turns off the prediction for the
     indirect branch.

   - Scrub registers in entry.S that contain user controlled values to
     prevent the speculative use of these values.

   - Re-add the second parameter for the s390 specific runtime
     instrumentation system call and move the header file to uapi. The
     second parameter will continue to do nothing but older kernel
     versions only accepted valid real-time signal numbers. The details
     will be documented in the man-page for the system call.

   - Corrections and improvements for the s390 specific documentation

   - Add a line to /proc/sysinfo to display the CPU model dependent
     license-internal-code identifier

   - A header file include fix for eadm.

   - An error message fix in the kprobes code.

   - The removal of an outdated ARCH_xxx select statement"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/kconfig: Remove ARCH_WANTS_PROT_NUMA_PROT_NONE select
  s390: introduce execute-trampolines for branches
  s390: run user space and KVM guests with modified branch prediction
  s390: add options to change branch prediction behaviour for the kernel
  s390/alternative: use a copy of the facility bit mask
  s390: add optimized array_index_mask_nospec
  s390: scrub registers on kernel entry and KVM exit
  s390/cio: fix kernel-doc usage
  s390/runtime_instrumentation: re-add signum system call parameter
  s390/cpum_cf: correct counter number of LAST_HOST_TRANSLATIONS
  s390/kprobes: Fix %p uses in error messages
  s390/runtime instrumentation: provide uapi header file
  s390/sysinfo: add and display licensed internal code identifier
  s390/docs: reword airq section
  s390/docs: mention subchannel types
  s390/cmf: fix kerneldoc
  s390/eadm: fix CONFIG_BLOCK include dependency

Merge tag 'acpi-part2-4.16-rc1' of git://git./linux/kernel/git/rafael/linux-pm

Pull more ACPI updates from Rafael Wysocki:
"These are mostly fixes and cleanups, a few new quirks, a couple of
  updates related to the handling of ACPI tables and ACPICA copyrights
  refreshment.

  Specifics:

   - Update the ACPICA kernel code to upstream revision 20180105
     including:
       * Assorted fixes (Jung-uk Kim)
       * Support for X32 ABI compilation (Anuj Mittal)
       * Update of ACPICA copyrights to 2018 (Bob Moore)

   - Prepare for future modifications to avoid executing the _STA
     control method too early (Hans de Goede)

   - Make the processor performance control library code ignore _PPC
     notifications if they cannot be handled and fix up the C1 idle
     state definition when it is used as a fallback state (Chen Yu,
     Yazen Ghannam)

   - Make it possible to use the SPCR table on x86 and to replace the
     original IORT table with a new one from initrd (Prarit Bhargava,
     Shunyong Yang)

   - Add battery-related quirks for Asus UX360UA and UX410UAK and add
     quirks for table parsing on Dell XPS 9570 and Precision M5530 (Kai
     Heng Feng)

   - Address static checker warnings in the CPPC code (Gustavo Silva)

   - Avoid printing a raw pointer to the kernel log in the smart battery
     driver (Greg Kroah-Hartman)"

* tag 'acpi-part2-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: sbshc: remove raw pointer from printk() message
  ACPI: SPCR: Make SPCR available to x86
  ACPI / CPPC: Use 64-bit arithmetic instead of 32-bit
  ACPI / tables: Add IORT to injectable table list
  ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530
  ACPICA: Update version to 20180105
  ACPICA: All acpica: Update copyrights to 2018
  ACPI / processor: Set default C1 idle state description
  ACPI / battery: Add quirk for Asus UX360UA and UX410UAK
  ACPI: processor_perflib: Do not send _PPC change notification if not ready
  ACPI / scan: Use acpi_bus_get_status() to initialize ACPI_TYPE_DEVICE devs
  ACPI / bus: Do not call _STA on battery devices with unmet dependencies
  PCI: acpiphp_ibm: prepare for acpi_get_object_info() no longer returning status
  ACPI: export acpi_bus_get_status_handle()
  ACPICA: Add a missing pair of parentheses
  ACPICA: Prefer ACPI_TO_POINTER() over ACPI_ADD_PTR()
  ACPICA: Avoid NULL pointer arithmetic
  ACPICA: Linux: add support for X32 ABI compilation
  ACPI / video: Use true for boolean value

Merge tag 'pm-part2-4.16-rc1' of git://git./linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
"These are mostly fixes and cleanups and removal of the no longer
  needed at32ap-cpufreq driver.

  Specifics:

   - Drop the at32ap-cpufreq driver which is useless after the removal
     of the corresponding arch (Corentin LABBE).

   - Fix a regression from the 4.14 cycle in the APM idle driver by
     making it initialize the polling state properly (Rafael Wysocki).

   - Fix a crash on failing system suspend due to a missing check in the
     cpufreq core (Bo Yan).

   - Make the intel_pstate driver initialize the hardware-managed
     P-state control (HWP) feature on CPU0 upon resume from system
     suspend if HWP had been enabled before the system was suspended
     (Chen Yu).

   - Fix up the SCPI cpufreq driver after recent changes (Sudeep Holla,
     Wei Yongjun).

   - Avoid pointer subtractions during frequency table walks in cpufreq
     (Dominik Brodowski).

   - Avoid the check for ProcFeedback in ST/CZ in the cpufreq driver for
     AMD processors and add a MODULE_ALIAS for cpufreq on ARM IMX (Akshu
     Agrawal, Nicolas Chauvet).

   - Fix the prototype of swsusp_arch_resume() on x86 (Arnd Bergmann).

   - Fix up the parsing of power domains DT data (Ulf Hansson)"

* tag 'pm-part2-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  arm: imx: Add MODULE_ALIAS for cpufreq
  cpufreq: Add and use cpufreq_for_each_{valid_,}entry_idx()
  cpufreq: intel_pstate: Enable HWP during system resume on CPU0
  cpufreq: scpi: fix error return code in scpi_cpufreq_init()
  x86: hibernate: fix swsusp_arch_resume() prototype
  PM / domains: Fix up domain-idle-states OF parsing
  cpufreq: scpi: fix static checker warning cdev isn't an ERR_PTR
  cpufreq: remove at32ap-cpufreq
  cpufreq: AMD: Ignore the check for ProcFeedback in ST/CZ
  x86: PM: Make APM idle driver initialize polling state
  cpufreq: Skip cpufreq resume if it's not suspended

SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context

Calling __UDPX_INC_STATS() from a preemptible context leads to a
warning of the form:

BUG: using __this_cpu_add() in preemptible [00000000] code: kworker/u5:0/31
caller is xs_udp_data_receive_workfn+0x194/0x270
CPU: 1 PID: 31 Comm: kworker/u5:0 Not tainted 4.15.0-rc8-00076-g90ea9f1 #2
Workqueue: xprtiod xs_udp_data_receive_workfn
Call Trace:
  dump_stack+0x85/0xc1
  check_preemption_disabled+0xce/0xe0
  xs_udp_data_receive_workfn+0x194/0x270
  process_one_work+0x318/0x620
  worker_thread+0x20a/0x390
  ? process_one_work+0x620/0x620
  kthread+0x120/0x130
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x24/0x30

Since we're taking a spinlock in those functions anyway, let's fix the
issue by moving the call so that it occurs under the spinlock.

Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

video: omapfb: fix missing #includes

The omapfb driver fails to build after commit 23c35f48f5fb
("pinctrl: remove include file from <linux/device.h>") because it
relies on the <linux/pinctrl/consumer.h> and <linux/seq_file.h>
being pulled in by the <linux/device.h> header implicitly.

Include these headers explicitly to avoid the build failures.

Fixes: 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>")
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
[b.zolnierkie: fix include order and patch description]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>

KVM: PPC: Book3S: Add MMIO emulation for VMX instructions

This patch provides the MMIO load/store vector indexed
X-Form emulation.

Instructions implemented:
lvx: the quadword in storage addressed by the result of EA &
0xffff_ffff_ffff_fff0 is loaded into VRT.

stvx: the contents of VRS are stored into the quadword in storage
addressed by the result of EA & 0xffff_ffff_ffff_fff0.

Reported-by: Gopesh Kumar Chaudhary <gopchaud@in.ibm.com>
Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: PPC: Book3S HV: Branch inside feature section

We ended up with code that did a conditional branch inside a feature
section to code outside of the feature section. Depending on how the
object file gets organized, that might mean we exceed the 14bit
relocation limit for conditional branches:

arch/powerpc/kvm/built-in.o:arch/powerpc/kvm/book3s_hv_rmhandlers.S:416:(__ftr_alt_97+0x8): relocation truncated to fit: R_PPC64_REL14 against `.text'+1ca4

So instead of doing a conditional branch outside of the feature section,
let's just jump at the end of the same, making the branch very short.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: PPC: Book3S HV: Make HPT resizing work on POWER9

This adds code to enable the HPT resizing code to work on POWER9,
which uses a slightly modified HPT entry format compared to POWER8.
On POWER9, we convert HPTEs read from the HPT from the new format to
the old format so that the rest of the HPT resizing code can work as
before. HPTEs written to the new HPT are converted to the new format
as the last step before writing them into the new HPT.

This takes out the checks added by commit bcd3bb63dbc8 ("KVM: PPC:
Book3S HV: Disable HPT resizing on POWER9 for now", 2017-02-18),
now that HPT resizing works on POWER9.

On POWER9, when we pivot to the new HPT, we now call
kvmppc_setup_partition_table() to update the partition table in order
to make the hardware use the new HPT.

[paulus@ozlabs.org - added kvmppc_setup_partition_table() call,
wrote commit message.]

Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code

This fixes the computation of the HPTE index to use when the HPT
resizing code encounters a bolted HPTE which is stored in its
secondary HPTE group.  The code inverts the HPTE group number, which
is correct, but doesn't then mask it with new_hash_mask.  As a result,
new_pteg will be effectively negative, resulting in new_hptep
pointing before the new HPT, which will corrupt memory.

In addition, this removes two BUG_ON statements.  The condition that
the BUG_ONs were testing -- that we have computed the hash value
incorrectly -- has never been observed in testing, and if it did
occur, would only affect the guest, not the host.  Given that
BUG_ON should only be used in conditions where the kernel (i.e.
the host kernel, in this case) can't possibly continue execution,
it is not appropriate here.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

platform/x86: mlx-platform: Fix power cable setting for msn21xx family

Add dedicated structure with power cable setting for Mellanox msn21xx
family. These systems do not have a physical device for the power unit
controller. When the power cable is inserted or removed, the relevant
interrupt signal is handled, the status is updated, but no device is
associated with the signal.

Add definition for interrupt low aggregation signal. On system from
msn21xx family, low aggregation mask should be removed in order to allow
signal to hit CPU.

Fixes: 6613d18e9038 ("platform/x86: mlx-platform: Move module from arch/x86")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

platform/x86: mlx-platform: Add define for the negative bus

Add define for the negative bus ID in order to use it in case no hotplug
device is associated with the hotplug interrupt signal. In this case,
the signal will be handled by the mlxreg-hotplug driver, but no device
will be associated with the signal.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

platform/x86: mlx-platform: Use defines for bus assignment

Add defines for the bus IDs, used for hotplug device topology to improve
code readability. Defines added for FAN and power units.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

platform/mellanox: mlxreg-hotplug: Fix uninitialized variable

With gcc-4.1.2:

drivers/platform/mellanox/mlxreg-hotplug.c: In function ‘mlxreg_hotplug_health_work_helper’:
drivers/platform/mellanox/mlxreg-hotplug.c:347: warning: ‘ret’ is used uninitialized in this function

Indeed, if mlxreg_core_item.count is zero, ret is used uninitialized.

While this is unlikely to happen (it is set to ARRAY_SIZE(...) in x86
board files), this is done in another source file, so fix this by
preinitializing ret to zero.

Fixes: c6acad68eb2dbffd ("platform/mellanox: mlxreg-hotplug: Modify to use a regmap interface")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

Merge branch 'bpf-libbpf-relo-fix-and-tests'

Jesper Dangaard Brouer says:

====================
While playing with using libbpf for the Suricata project, we had
issues LLVM >= 4.0.1 generating ELF files that could not be loaded
with libbpf (tools/lib/bpf/).

During the troubleshooting phase, I wrote a test program and improved
the debugging output in libbpf.  I turned this into a selftests
program, and it also serves as a code example for libbpf in itself.

I discovered that there are at least three ELF load issues with
libbpf.  I left them as TODO comments in (tools/testing/selftests/bpf)
test_libbpf.sh. I've only fixed the load issue with eh_frames, and
other types of relo-section that does not have exec flags.  We can
work on the other issues later.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

tools/libbpf: handle issues with bpf ELF objects containing .eh_frames

V3: More generic skipping of relo-section (suggested by Daniel)

If clang >= 4.0.1 is missing the option '-target bpf', it will cause
llc/llvm to create two ELF sections for "Exception Frames", with
section names '.eh_frame' and '.rel.eh_frame'.

The BPF ELF loader library libbpf fails when loading files with these
sections.  The other in-kernel BPF ELF loader in samples/bpf/bpf_load.c,
handle this gracefully. And iproute2 loader also seems to work with these
"eh" sections.

The issue in libbpf is caused by bpf_object__elf_collect() skipping
some sections, and later when performing relocation it will be
pointing to a skipped section, as these sections cannot be found by
bpf_object__find_prog_by_idx() in bpf_object__collect_reloc().

This is a general issue that also occurs for other sections, like
debug sections which are also skipped and can have relo section.

As suggested by Daniel.  To avoid keeping state about all skipped
sections, instead perform a direct qlookup in the ELF object.  Lookup
the section that the relo-section points to and check if it contains
executable machine instructions (denoted by the sh_flags
SHF_EXECINSTR).  Use this check to also skip irrelevant relo-sections.

Note, for samples/bpf/ the '-target bpf' parameter to clang cannot be used
due to incompatibility with asm embedded headers, that some of the samples
include. This is explained in more details by Yonghong Song in bpf_devel_QA.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

selftests/bpf: add selftest that use test_libbpf_open

This script test_libbpf.sh will be part of the 'make run_tests'
invocation, but can also be invoked manually in this directory,
and a verbose mode can be enabled via setting the environment
variable $VERBOSE like:

$ VERBOSE=yes ./test_libbpf.sh

The script contains some tests that are commented out, as they
currently fail. They are reminders about what we need to improve
for the libbpf loader library.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

selftests/bpf: add test program for loading BPF ELF files

V2: Moved program into selftests/bpf from tools/libbpf

This program can be used on its own for testing/debugging if a
BPF ELF-object file can be loaded with libbpf (from tools/lib/bpf).

If something is wrong with the ELF object, the program have
a --debug mode that will display the ELF sections and especially
the skipped sections. This allows for quickly identifying the
problematic ELF section number, which can be corrolated with the
readelf tool.

The program signal error via return codes, and also have
a --quiet mode, which is practical for use in scripts like
selftests/bpf.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

tools/libbpf: improve the pr_debug statements to contain section numbers

While debugging a bpf ELF loading issue, I needed to correlate the
ELF section number with the failed relocation section reference.
Thus, add section numbers/index to the pr_debug.

In debug mode, also print section that were skipped. This helped
me identify that a section (.eh_frame) was skipped, and this was
the reason the relocation section (.rel.eh_frame) could not find
that section number.

The section numbers corresponds to the readelf tools Section Headers [Nr].

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

bpf: Sync kernel ABI header with tooling header for bpf_common.h

I recently fixed up a lot of commits that forgot to keep the tooling
headers in sync. And then I forgot to do the same thing in commit
cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes"). Let correct
that before people notice ;-).

Lawrence did partly fix/sync this for bpf.h in commit d6d4f60c3a09
("bpf: add selftest for tcpbpf").

Fixes: cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Merge tag 'nfsd-4.16' of git://linux-nfs.org/~bfields/linux

Pull nfsd update from Bruce Fields:
"A fairly small update this time around. Some cleanup, RDMA fixes,
  overlayfs fixes, and a fix for an NFSv4 state bug.

  The bigger deal for nfsd this time around was Jeff Layton's
  already-merged i_version patches"

* tag 'nfsd-4.16' of git://linux-nfs.org/~bfields/linux:
  svcrdma: Fix Read chunk round-up
  NFSD: hide unused svcxdr_dupstr()
  nfsd: store stat times in fill_pre_wcc() instead of inode times
  nfsd: encode stat->mtime for getattr instead of inode->i_mtime
  nfsd: return RESOURCE not GARBAGE_ARGS on too many ops
  nfsd4: don't set lock stateid's sc_type to CLOSED
  nfsd: Detect unhashed stids in nfsd4_verify_open_stid()
  sunrpc: remove dead code in svc_sock_setbufsize
  svcrdma: Post Receives in the Receive completion handler
  nfsd4: permit layoutget of executable-only files
  lockd: convert nlm_rqst.a_count from atomic_t to refcount_t
  lockd: convert nlm_lockowner.count from atomic_t to refcount_t
  lockd: convert nsm_handle.sm_count from atomic_t to refcount_t

Merge branch 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax

Pull idr updates from Matthew Wilcox:

- test-suite improvements

- replace the extended API by improving the normal API

- performance improvement for IDRs which are 1-based rather than
   0-based

- add documentation

* 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax:
  idr: Add documentation
  idr: Make 1-based IDRs more efficient
  idr: Warn if old iterators see large IDs
  idr: Rename idr_for_each_entry_ext
  idr: Remove idr_alloc_ext
  cls_u32: Convert to idr_alloc_u32
  cls_u32: Reinstate cyclic allocation
  cls_flower: Convert to idr_alloc_u32
  cls_bpf: Convert to use idr_alloc_u32
  cls_basic: Convert to use idr_alloc_u32
  cls_api: Convert to idr_alloc_u32
  net sched actions: Convert to use idr_alloc_u32
  idr: Add idr_alloc_u32 helper
  idr: Delete idr_find_ext function
  idr: Delete idr_replace_ext function
  idr: Delete idr_remove_ext function
  IDR test suite: Check handling negative end correctly
  idr test suite: Fix ida_test_random()
  radix tree test suite: Remove ARRAY_SIZE

Merge tag 'gcc-plugins-v4.16-rc1' of git://git./linux/kernel/git/kees/linux

Pull gcc plugins updates from Kees Cook:

- update includes for gcc 8 (Valdis Kletnieks)

- update initializers for gcc 8

* tag 'gcc-plugins-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: Use dynamic initializers
gcc-plugins: Add include required by GCC release 8

fix parallelism for rpc tasks

Hi folks,

On a multi-core machine, is it expected that we can have parallel RPCs
handled by each of the per-core workqueue?

In testing a read workload, observing via "top" command that a single
"kworker" thread is running servicing the requests (no parallelism).
It's more prominent while doing these operations over krb5p mount.

What has been suggested by Bruce is to try this and in my testing I
see then the read workload spread among all the kworker threads.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT

This condition wasn't adjusted when PHY_IGNORE_INTERRUPT (-2) was added
long ago. In case of PHY_IGNORE_INTERRUPT the MAC interrupt indicates
also PHY state changes and we should do what the symbol says.

Fixes: 84a527a41f38 ("net: phylib: fix interrupts re-enablement in phy_start")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunder: change q_len's type to handle max ring size

The Cavium thunder nicvf driver supports rx/tx rings of up to 65536 entries per.
The number of entires are stored in the q_len member of struct q_desc_mem. The
problem is that q_len being a u16, results in 65536 becoming 0.

In getting pointers to descriptors in the rings, the driver uses q_len minus 1
as a mask after incrementing the pointer, in order to go back to the beginning
and not go past the end of the ring.

With the q_len set to 0 the mask is no longer correct and the driver does go
beyond the end of the ring, causing various ills. Usually the first thing that
shows up is a "NETDEV WATCHDOG: enP2p1s0f1 (nicvf): transmit queue 7 timed out"
warning.

This patch remedies the problem by changing q_len to a u32.

Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'wireless-drivers-next-for-davem-2018-02-08' of git://git./linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for 4.16

The most important here is the ssb fix, it has been reported by the
users frequently and the fix just missed the final v4.15. Also
numerous other fixes, mt76 had multiple problems with aggregation and
a long standing unaligned access bug in rtlwifi is finally fixed.

Major changes:

ath10k

* correct firmware RAM dump length for QCA6174/QCA9377

* add new QCA988X device id

* fix a kernel panic during pci probe

* revert a recent commit which broke ath10k firmware metadata parsing

ath9k

* fix a noise floor regression introduced during the merge window

* add new device id

rtlwifi

* fix unaligned access seen on ARM architecture

mt76

* various aggregation fixes which fix connection stalls

ssb

* fix b43 and b44 on non-MIPS which broke in v4.15-rc9
====================

Signed-off-by: David S. Miller <davem@davemloft.net>