Andy Lutomirski [Fri, 12 Jun 2020 03:26:38 +0000 (20:26 -0700)]
x86/entry: Treat BUG/WARN as NMI-like entries

BUG/WARN are cleverly optimized using UD2 to handle the BUG/WARN out of
line in an exception fixup.

But if BUG or WARN is issued in a funny RCU context, then the
idtentry_enter...() path might helpfully WARN that the RCU context is
invalid, which results in infinite recursion.

Split the BUG/WARN handling into an nmi_enter()/nmi_exit() path in
exc_invalid_op() to increase the chance to survive the experience.

[ tglx: Make the declaration match the implementation ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/f8fe40e0088749734b4435b554f73eee53dcf7a8.1591932307.git.luto@kernel.org
Alexey Kardashevskiy [Thu, 11 Jun 2020 03:05:59 +0000 (13:05 +1000)]
KVM: PPC: Fix nested guest RC bits update

Before commit 6cdf30375f82 ("powerpc/kvm/book3s: Use kvm helpers
to walk shadow or secondary table"), we called __find_linux_pte() with
a page table pointer from a kvm_nested_guest struct.  Now we rely on
kvmhv_find_nested(), which takes an L1 LPID and returns a
kvm_nested_guest pointer; however, we pass an L0 LPID there and
the L2 guest hangs.

This fixes the LPID passed to kvmppc_hv_handle_set_rc().

Fixes: 6cdf30375f82 ("powerpc/kvm/book3s: Use kvm helpers to walk shadow or secondary table")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200611030559.75257-1-aik@ozlabs.ru
Ley Foon Tan [Fri, 12 Jun 2020 06:04:49 +0000 (14:04 +0800)]
nios2: signal: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning through the use of the new
pseudo-keyword fallthrough:

arch/nios2/kernel/signal.c:254:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
  254 |    restart = -2;
      |    ~~~~~~~~^~~~
arch/nios2/kernel/signal.c:255:3: note: here
  255 |   case ERESTARTNOHAND:
      |   ^~~~
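
As a minimal sketch of the annotation (not the nios2 signal code itself), the
block below marks an intentional fall-through.  The macro is a simplified
stand-in for the kernel's fallthrough pseudo-keyword, and demo_restart() with
its DEMO_* codes is hypothetical, invented only for this illustration.

```c
/* Simplified stand-in for the kernel's fallthrough pseudo-keyword. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)	/* fallthrough */
#endif

/* Hypothetical return codes, used only for this illustration. */
enum { DEMO_RESTART_BLOCK = 1, DEMO_RESTART_NOHAND = 2, DEMO_OTHER = 3 };

/* Computes a restart value; the first case deliberately falls through. */
int demo_restart(int retval)
{
	int restart = 0;

	switch (retval) {
	case DEMO_RESTART_BLOCK:
		restart = -2;
		fallthrough;	/* tells the compiler this is intentional */
	case DEMO_RESTART_NOHAND:
		restart++;
		break;
	}
	return restart;
}
```

With the annotation in place, -Wimplicit-fallthrough should no longer flag the
first case, while an unannotated fall-through elsewhere would still be reported.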

Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Linus Torvalds [Fri, 12 Jun 2020 01:55:43 +0000 (18:55 -0700)]
Merge tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull the Kernel Concurrency Sanitizer from Thomas Gleixner:
 "The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector,
  which relies on compile-time instrumentation, and uses a
  watchpoint-based sampling approach to detect races.

  The feature was under development for quite some time and has already
  found legitimate bugs.

  Unfortunately it comes with a limitation, which was only understood
  late in the development cycle:

     It requires an up to date CLANG-11 compiler

  CLANG-11 is not yet released (scheduled for June), but it's the only
  compiler today which handles the kernel requirements and especially
  the annotations of functions to exclude them from KCSAN
  instrumentation correctly.

  These annotations really need to work so that low level entry code and
  especially int3 text poke handling can be completely isolated.

  A detailed discussion of the requirements and compiler issues can be
  found here:

    https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/

  We came to the conclusion that trying to work around compiler
  limitations and bugs again would end up in a major trainwreck, so
  requiring a working compiler seemed to be the best choice.

  For Continuous Integration purposes the compiler restriction is
  manageable and that's where most xxSAN reports come from.

  For a change this limitation might make GCC people actually look at
  their bugs. Some issues with CSAN in GCC are 7 years old and one was
  'fixed' 3 years ago with a half-baked solution which 'solved' the
  reported issue but not the underlying problem.

  The KCSAN developers are also considering a GCC plugin to become
  independent, but that's not something which will show up in a few
  days.

  Blocking KCSAN until widespread compiler support is available is not
  a really good alternative because the continuous growth of lockless
  optimizations in the kernel demands proper tooling support"

* tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (76 commits)
  compiler_types.h, kasan: Use __SANITIZE_ADDRESS__ instead of CONFIG_KASAN to decide inlining
  compiler.h: Move function attributes to compiler_types.h
  compiler.h: Avoid nested statement expression in data_race()
  compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()
  kcsan: Update Documentation to change supported compilers
  kcsan: Remove 'noinline' from __no_kcsan_or_inline
  kcsan: Pass option tsan-instrument-read-before-write to Clang
  kcsan: Support distinguishing volatile accesses
  kcsan: Restrict supported compilers
  kcsan: Avoid inserting __tsan_func_entry/exit if possible
  ubsan, kcsan: Don't combine sanitizer with kcov on clang
  objtool, kcsan: Add kcsan_disable_current() and kcsan_enable_current_nowarn()
  kcsan: Add __kcsan_{enable,disable}_current() variants
  checkpatch: Warn about data_race() without comment
  kcsan: Use GFP_ATOMIC under spin lock
  Improve KCSAN documentation a bit
  kcsan: Make reporting aware of KCSAN tests
  kcsan: Fix function matching in report
  kcsan: Change data_race() to no longer require marking racing accesses
  kcsan: Move kcsan_{disable,enable}_current() to kcsan-checks.h
  ...

David S. Miller [Fri, 12 Jun 2020 01:39:08 +0000 (18:39 -0700)]
Merge branch 'net-ipa-endpoint-configuration-fixes'

Alex Elder says:

====================
net: ipa: endpoint configuration fixes

This series fixes four bugs in the configuration of IPA endpoints.
See the description of each for more information.

In this version I have dropped the last patch from the series, and
restored a "static" keyword that had inadvertently gotten removed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:33 +0000 (14:48 -0500)]
net: ipa: header pad field only valid for AP->modem endpoint

Only QMAP endpoints should be configured to find a pad size field
within packet headers.  That field is found in the first byte of the
QMAP header (and the hardware fills only the 6 bits in that byte that
constitute the pad_len field).

The RMNet driver assumes the pad_len field is valid for received
packets, so we want to ensure the pad_len field is filled in that
case.  That driver also assumes the length in the QMAP header
includes the pad bytes.

The RMNet driver does *not* pad the packets it sends, so the pad_len
field can be ignored.

Fix ipa_endpoint_init_hdr_ext() so it only marks the pad field
offset valid for QMAP RX endpoints, and in that case indicates
that the length field in the header includes the pad bytes.
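
A rough sketch of what this means for a receiver follows.  It assumes that
pad_len occupies the low six bits of the first header byte and that the length
is a big-endian 16-bit value in bytes 2 and 3; neither layout detail is stated
above, and demo_qmap_payload_len() is a hypothetical helper, not a function
from the IPA or RMNet drivers.

```c
#include <stdint.h>

/* Assumption: pad_len is the low 6 bits of the first QMAP header byte. */
#define DEMO_QMAP_PAD_LEN_MASK	0x3fu

/*
 * Returns the payload length without padding.  The length field in the
 * header includes the pad bytes, so pad_len is subtracted from it.
 */
unsigned int demo_qmap_payload_len(const uint8_t hdr[4])
{
	unsigned int pad_len = hdr[0] & DEMO_QMAP_PAD_LEN_MASK;
	/* Assumption: length is big-endian in header bytes 2 and 3. */
	unsigned int pkt_len = ((unsigned int)hdr[2] << 8) | hdr[3];

	/* Treat an inconsistent header as carrying no payload. */
	return pad_len > pkt_len ? 0 : pkt_len - pad_len;
}
```

Masking with six bits mirrors the point made above: only those bits of the
first byte are meaningful, so any other bits in that byte are ignored.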

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:32 +0000 (14:48 -0500)]
net: ipa: program upper nibbles of sequencer type

The upper two nibbles of the sequencer type were not used for
SDM845, and were assumed to be 0.  But for SC7180 they are used, and
so they must be programmed by ipa_endpoint_init_seq().  Fix this bug.

IPA_SEQ_PKT_PROCESS_NO_DEC_NO_UCP_DMAP doesn't have a descriptive
comment, so add one.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:31 +0000 (14:48 -0500)]
net: ipa: fix modem LAN RX endpoint id

The endpoint id assigned to the modem LAN RX endpoint for the SC7180 SoC
is incorrect.  The erroneous value might have been copied from SDM845 and
never updated.  The correct endpoint id to use for this SoC is 11.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:30 +0000 (14:48 -0500)]
net: ipa: program metadata mask differently

The way the mask value is programmed for QMAP RX endpoints was based
on some wrong assumptions about the way metadata containing the QMAP
mux_id value is formatted.  The metadata value supplied by the
modem is *not* in QMAP format, and in fact contains the mux_id we
want in its (big endian) low-order byte.  That byte must be written
by the IPA into offset 1 of the QMAP header it inserts before the
received packet.

QMAP TX endpoints *do* use a QMAP header as the metadata sent with
each packet.  The modem assumes this, and based on that assumes the
mux_id is in the second byte.  To match those assumptions we must
program the modem TX (QMAP) endpoint HDR register to indicate the
metadata will be found at offset 0 in the message header.

The previous configuration managed to work, but it was not working
correctly.  This patch fixes a bug whose symptom was receipt of
messages containing the wrong QMAP mux_id.

In fixing this, get rid of ipa_rmnet_mux_id_metadata_mask(), which
was more or less defined so there was a separate place to explain
what was happening as we generated the mask value.  Instead, put a
longer description of how this works above ipa_endpoint_init_hdr(),
and define the metadata mask to use as a simple constant.
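
The RX case described above can be sketched in plain C as follows.  This is an
illustration only: the mask name, the helper names and the 4-byte header
buffer are hypothetical, and in the real hardware the IPA performs this step
itself based on the programmed mask.

```c
#include <stdint.h>

/* Hypothetical mask: selects the low-order byte of the metadata value. */
#define DEMO_QMAP_METADATA_MASK	0x000000ffu

/* Interprets four bytes as a big-endian 32-bit value. */
static uint32_t demo_be32_to_cpu(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Extracts the mux_id from modem-supplied (big-endian) metadata. */
uint8_t demo_mux_id_from_metadata(const uint8_t meta[4])
{
	return (uint8_t)(demo_be32_to_cpu(meta) & DEMO_QMAP_METADATA_MASK);
}

/* Writes the mux_id into offset 1 of the inserted QMAP header. */
void demo_fill_qmap_mux_id(uint8_t hdr[4], const uint8_t meta[4])
{
	hdr[1] = demo_mux_id_from_metadata(meta);
}
```

Because the metadata is big endian, its low-order byte is the last byte on the
wire, which is why a simple constant mask is enough to select the mux_id.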

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 12 Jun 2020 01:27:19 +0000 (18:27 -0700)]
Merge tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull atomics rework from Thomas Gleixner:
 "Peter Zijlstra's rework of atomics and fallbacks. This solves two
  problems:

   1) Compilers uninline small atomic_* static inline functions which
      can expose them to instrumentation.

   2) The instrumentation of atomic primitives was done at the
      architecture level while composites or fallbacks were provided at
      the generic level. As a result there are no uninstrumented
      variants of the fallbacks.

  Both issues were in the way of fully isolating fragile entry code
  paths and especially the text poke int3 handler which is prone to an
  endless recursion problem when anything in that code path is about to
  be instrumented. This was always a problem, but got elevated due to
  the new batch mode updates of tracing.

  The solution is to mark the functions __always_inline and to flip the
  fallback and instrumentation so the non-instrumented variants are at
  the architecture level and the instrumentation is done in generic
  code.

  The latter introduces another fallback variant which will go away once
  all architectures have been moved over to arch_atomic_*"
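
The layering described in this pull message can be sketched in user-space C
roughly as follows.  All demo_* names are hypothetical stand-ins, the counter
merely records that the instrumentation hook ran, and the GCC/Clang
__atomic_add_fetch() builtin stands in for a real architecture primitive.

```c
/* Stand-in for the kernel's __always_inline attribute. */
#define demo_always_inline inline __attribute__((__always_inline__))

typedef struct { int counter; } demo_atomic_t;

/* Counts how often the (pretend) instrumentation hook was called. */
static int demo_instrumented_calls;

static demo_always_inline void demo_instrument_atomic_write(const volatile void *p,
							     unsigned long size)
{
	(void)p;
	(void)size;
	demo_instrumented_calls++;
}

/* Architecture level: non-instrumented and always inlined. */
static demo_always_inline int demo_arch_atomic_add_return(int i, demo_atomic_t *v)
{
	return __atomic_add_fetch(&v->counter, i, __ATOMIC_SEQ_CST);
}

/* Generic level: instrumentation is added around the arch primitive. */
static demo_always_inline int demo_atomic_add_return(int i, demo_atomic_t *v)
{
	demo_instrument_atomic_write(v, sizeof(*v));
	return demo_arch_atomic_add_return(i, v);
}
```

Fragile code such as entry paths would call the arch-level variant directly
and so never reach the hook, while ordinary code uses the generic wrapper.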

* tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomics: Flip fallbacks and instrumentation
  asm-generic/atomic: Use __always_inline for fallback wrappers

Shannon Nelson [Fri, 12 Jun 2020 00:18:15 +0000 (17:18 -0700)]
ionic: add pcie_print_link_status

Print the PCIe link information for our device.

Fixes: 77f972a7077d ("ionic: remove support for mgmt device")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 12 Jun 2020 01:25:20 +0000 (18:25 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2020-06-11

This series contains fixes to the iavf driver.

Brett fixes the supported link speeds in the iavf driver, which was only
able to report speeds that the i40e driver supported and was missing the
speeds supported by the ice driver.  In addition, fix how 2.5 and 5.0
GbE speeds are reported.

Alek fixes an enum comparison that was comparing two different enums that
may have different values, so update the comparison to use matching
enums.

Paul increases the time to complete a reset to allow for 128 VFs to
complete a reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 12 Jun 2020 01:20:20 +0000 (18:20 -0700)]
Merge tag 'mlx5-fixes-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2020-06-11

This series introduces some fixes to mlx5 driver.
For more information please see tag log below.

Please pull and let me know if there is any problem.

For -stable v5.2
  ('net/mlx5: drain health workqueue in case of driver load error')

For -stable v5.3
  ('net/mlx5e: Fix repeated XSK usage on one channel')
  ('net/mlx5: Fix fatal error handling during device load')

For -stable v5.5
  ('net/mlx5: Disable reload while removing the device')

For -stable v5.7
  ('net/mlx5e: CT: Fix ipv6 nat header rewrite actions')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 12 Jun 2020 01:18:50 +0000 (18:18 -0700)]
Merge branch 'akpm' (patches from Andrew)

Pull updates from Andrew Morton:
 "A few fixes and stragglers.

  Subsystems affected by this patch series: mm/memory-failure, ocfs2,
  lib/lzo, misc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  amdgpu: a NULL ->mm does not mean a thread is a kthread
  lib/lzo: fix ambiguous encoding bug in lzo-rle
  ocfs2: fix build failure when TCP/IP is disabled
  mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread
  mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill

David Howells [Thu, 11 Jun 2020 20:57:00 +0000 (21:57 +0100)]
rxrpc: Fix race between incoming ACK parser and retransmitter

There's a race between the retransmission code and the received ACK parser.
The problem is that the retransmission loop has to drop the lock under
which it is iterating through the transmission buffer in order to transmit
a packet, but whilst the lock is dropped, the ACK parser can crank the Tx
window round and discard the packets from the buffer.

The retransmission code then updated the annotations for the wrong packet
and a later retransmission thought it had to retransmit a packet that
wasn't there, leading to a NULL pointer dereference.

Fix this by:

 (1) Moving the annotation change to before we drop the lock prior to
     transmission.  This means we can't vary the annotation depending on
     the outcome of the transmission, but that's fine - we'll retransmit
     again later if it failed now.

 (2) Skipping the packet if the skb pointer is NULL.
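
The two steps above can be sketched with a single-threaded simulation.  The
demo_* types and helpers are hypothetical and much simpler than the rxrpc
code; the lock is only a flag, and a NULL slot represents a packet that the
ACK parser has already discarded.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WINDOW 4

enum { DEMO_ANNO_UNACKED = 0, DEMO_ANNO_RETRANS = 1 };

struct demo_call {
	bool locked;			/* stand-in for the call lock */
	const char *skbs[DEMO_WINDOW];	/* NULL: packet already discarded */
	int annotations[DEMO_WINDOW];
	int transmitted;
};

static void demo_lock(struct demo_call *c)   { c->locked = true; }
static void demo_unlock(struct demo_call *c) { c->locked = false; }

/* Stand-in for transmission, which runs with the lock dropped. */
static void demo_transmit(struct demo_call *c, const char *skb)
{
	(void)skb;
	c->transmitted++;
}

void demo_resend(struct demo_call *c)
{
	demo_lock(c);
	for (int ix = 0; ix < DEMO_WINDOW; ix++) {
		const char *skb = c->skbs[ix];

		if (!skb)
			continue;	/* (2) skip packets that are gone */

		/* (1) update the annotation while still holding the lock */
		c->annotations[ix] = DEMO_ANNO_RETRANS;

		demo_unlock(c);
		demo_transmit(c, skb);
		demo_lock(c);
	}
	demo_unlock(c);
}
```

Since the annotation is written before the lock is dropped, it cannot depend
on whether the transmission succeeded, which matches the trade-off accepted in
step (1).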

The following oops was seen:

BUG: kernel NULL pointer dereference, address: 000000000000002d
Workqueue: krxrpcd rxrpc_process_call
RIP: 0010:rxrpc_get_skb+0x14/0x8a
...
Call Trace:
 rxrpc_resend+0x331/0x41e
 ? get_vtime_delta+0x13/0x20
 rxrpc_process_call+0x3c0/0x4ac
 process_one_work+0x18f/0x27f
 worker_thread+0x1a3/0x247
 ? create_worker+0x17d/0x17d
 kthread+0xe6/0xeb
 ? kthread_delayed_work_timer_fn+0x83/0x83
 ret_from_fork+0x1f/0x30

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Hellwig [Fri, 12 Jun 2020 00:34:58 +0000 (17:34 -0700)]
amdgpu: a NULL ->mm does not mean a thread is a kthread

Use the proper API instead.

Fixes: 70539bd795002 ("drm/amd: Update MEC HQD loading code for KFD")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200404094101.672954-2-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Rodgman [Fri, 12 Jun 2020 00:34:54 +0000 (17:34 -0700)]
lib/lzo: fix ambiguous encoding bug in lzo-rle

In some rare cases, for input data over 32 KB, lzo-rle could encode two
different inputs to the same compressed representation, so that
decompression is then ambiguous (i.e.  data may be corrupted - although
zram is not affected because it operates over 4 KB pages).

This modifies the compressor without changing the decompressor or the
bitstream format, such that:

 - there is no change to how data produced by the old compressor is
   decompressed

 - an old decompressor will correctly decode data from the updated
   compressor

 - performance and compression ratio are not affected

 - we avoid introducing a new bitstream format

In testing over 12.8M real-world files totalling 903 GB, three files
were affected by this bug.  I also constructed 37M semi-random 64 KB
files totalling 2.27 TB, and saw no affected files.  Finally I tested
over files constructed to contain each of the ~1024 possible bad input
sequences; for all of these cases, updated lzo-rle worked correctly.

There is no significant impact to performance or compression ratio.

Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dave Rodgman <dave.rodgman@arm.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tom Seewald [Fri, 12 Jun 2020 00:34:51 +0000 (17:34 -0700)]
ocfs2: fix build failure when TCP/IP is disabled

After commit 12abc5ee7873 ("tcp: add tcp_sock_set_nodelay") and commit
c488aeadcbd0 ("tcp: add tcp_sock_set_user_timeout"), building the kernel
with OCFS2_FS=y but without INET=y causes it to fail with:

  ld: fs/ocfs2/cluster/tcp.o: in function `o2net_accept_many':
  tcp.c:(.text+0x21b1): undefined reference to `tcp_sock_set_nodelay'
  ld: tcp.c:(.text+0x21c1): undefined reference to `tcp_sock_set_user_timeout'
  ld: fs/ocfs2/cluster/tcp.o: in function `o2net_start_connect':
  tcp.c:(.text+0x2633): undefined reference to `tcp_sock_set_nodelay'
  ld: tcp.c:(.text+0x2643): undefined reference to `tcp_sock_set_user_timeout'

This is due to tcp_sock_set_nodelay() and tcp_sock_set_user_timeout()
being declared in linux/tcp.h and defined in net/ipv4/tcp.c, which
depend on TCP/IP being enabled.

To fix this, make OCFS2_FS depend on INET=y which already requires
NET=y.

Fixes: 12abc5ee7873 ("tcp: add tcp_sock_set_nodelay")
Fixes: c488aeadcbd0 ("tcp: add tcp_sock_set_user_timeout")
Signed-off-by: Tom Seewald <tseewald@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200606190827.23954-1-tseewald@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Fri, 12 Jun 2020 00:34:48 +0000 (17:34 -0700)]
mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread

An Action Required memory error should happen only when a processor is
about to access corrupted memory, so it is synchronous and affects only
the current process/thread.

Recently, commit 872e9a205c84 ("mm, memory_failure: don't send
BUS_MCEERR_AO for action required error") fixed the issue that an Action
Required memory error could unnecessarily send SIGBUS to the processes
which share the error memory.  But there is still another issue: SIGBUS
could be sent to the wrong thread.

This is because collect_procs() and task_early_kill() fail to add the
current process to the "to-kill" list.  This patch fixes that.  With this
fix, SIGBUS(BUS_MCEERR_AR) is never sent to a non-current
process/thread.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: http://lkml.kernel.org/r/1591321039-22141-3-git-send-email-naoya.horiguchi@nec.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Fri, 12 Jun 2020 00:34:45 +0000 (17:34 -0700)]
mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill

Patch series "hwpoison: fixes signaling on memory error"

This is a small patchset that fixes the memory error handler so that it
sends SIGBUS to the proper process/thread, as the configuration
dictates.  Please see descriptions in individual patches for more details.

This patch (of 2):

Early-kill policy is controlled by two types of settings: the
per-process setting prctl(PR_MCE_KILL) and the system-wide setting
vm.memory_failure_early_kill.  Users expect the per-process setting to
override the system-wide setting, as with many other settings, but the
early-kill setting doesn't work that way.

For example, if a system sets vm.memory_failure_early_kill to 1
(enabled), a process receives SIGBUS even if it has explicitly disabled
early kill via prctl(PR_MCE_KILL).  That's not desirable for
applications with their own policies.

This patch changes the priority of these two types of settings by
checking sysctl_memory_failure_early_kill only when a given process has
the default kill policy.
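
The resulting priority can be summarized by a small decision function.  This
is a sketch of the rule only; the enum and the function name are hypothetical
and do not reproduce the code in mm/memory-failure.c.

```c
#include <stdbool.h>

/* Hypothetical model of the per-process policy set via prctl(PR_MCE_KILL). */
enum demo_kill_policy {
	DEMO_POLICY_DEFAULT,	/* process never set a policy */
	DEMO_POLICY_EARLY,	/* process asked for early kill */
	DEMO_POLICY_LATE,	/* process asked for late kill */
};

/*
 * Returns true if the process should get SIGBUS early.  The system-wide
 * sysctl value is consulted only when the process has the default policy.
 */
bool demo_wants_early_kill(enum demo_kill_policy policy, bool sysctl_early_kill)
{
	switch (policy) {
	case DEMO_POLICY_EARLY:
		return true;
	case DEMO_POLICY_LATE:
		return false;
	default:
		return sysctl_early_kill;
	}
}
```

Under this rule a process that explicitly chose late kill is left alone even
when vm.memory_failure_early_kill is 1.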

Note that this patch is solving a thread choice issue too.

Originally, collect_procs() always chooses the main thread when
vm.memory_failure_early_kill is 1, even if the process has a dedicated
thread for memory error handling.  SIGBUS should be sent to the
dedicated thread if early-kill is enabled via
vm.memory_failure_early_kill as we are doing for PR_MCE_KILL_EARLY
processes.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: http://lkml.kernel.org/r/1591321039-22141-1-git-send-email-naoya.horiguchi@nec.com
Link: http://lkml.kernel.org/r/1591321039-22141-2-git-send-email-naoya.horiguchi@nec.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 11 Jun 2020 23:10:08 +0000 (16:10 -0700)]
Merge tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "A few late stragglers in here. In particular:

   - Validate full range for provided buffers (Bijan)

   - Fix bad use of kfree() in buffer registration failure (Denis)

   - Don't allow close of ring itself, it's not fully safe. Making it
     fully safe would require making the system call more expensive,
     which isn't worth it.

   - Buffer selection fix

   - Regression fix for O_NONBLOCK retry

   - Make IORING_OP_ACCEPT honor O_NONBLOCK (Jiufei)

   - Restrict opcode handling for SQ/IOPOLL (Pavel)

   - io-wq work handling cleanups and improvements (Pavel, Xiaoguang)

   - IOPOLL race fix (Xiaoguang)"

* tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
  io_uring: fix io_kiocb.flags modification race in IOPOLL mode
  io_uring: check file O_NONBLOCK state for accept
  io_uring: avoid unnecessary io_wq_work copy for fast poll feature
  io_uring: avoid whole io_wq_work copy for requests completed inline
  io_uring: allow O_NONBLOCK async retry
  io_wq: add per-wq work handler instead of per work
  io_uring: don't arm a timeout through work.func
  io_uring: remove custom ->func handlers
  io_uring: don't derive close state from ->func
  io_uring: use kvfree() in io_sqe_buffer_register()
  io_uring: validate the full range of provided buffers for access
  io_uring: re-set iov base/len for buffer select retry
  io_uring: move send/recv IOPOLL check into prep
  io_uring: deduplicate io_openat{,2}_prep()
  io_uring: do build_open_how() only once
  io_uring: fix {SQ,IO}POLL with unsupported opcodes
  io_uring: disallow close of ring itself

Linus Torvalds [Thu, 11 Jun 2020 23:07:33 +0000 (16:07 -0700)]
Merge tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Some followup fixes for this merge window. In particular:

   - Seqcount write missing preemption disable for stats (Ahmed)

   - blktrace fixes (Chaitanya)

   - Redundant initializations (Colin)

   - Various small NVMe fixes (Chaitanya, Christoph, Daniel, Max,
     Niklas, Rikard)

   - loop flag bug regression fix (Martijn)

   - blk-mq tagging fixes (Christoph, Ming)"

* tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
  umem: remove redundant initialization of variable ret
  pktcdvd: remove redundant initialization of variable ret
  nvmet: fail outstanding host posted AEN req
  nvme-pci: use simple suspend when a HMB is enabled
  nvme-fc: don't call nvme_cleanup_cmd() for AENs
  nvmet-tcp: constify nvmet_tcp_ops
  nvme-tcp: constify nvme_tcp_mq_ops and nvme_tcp_admin_mq_ops
  nvme: do not call del_gendisk() on a disk that was never added
  blk-mq: fix blk_mq_all_tag_iter
  blk-mq: split out a __blk_mq_get_driver_tag helper
  blktrace: fix endianness for blk_log_remap()
  blktrace: fix endianness in get_pdu_int()
  blktrace: use errno instead of bi_status
  block: nr_sects_write(): Disable preemption on seqcount write
  block: remove the error argument to the block_bio_complete tracepoint
  loop: Fix wrong masking of status flags
  block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed

David Howells [Thu, 11 Jun 2020 20:50:24 +0000 (21:50 +0100)]
afs: Fix afs_store_data() to set mtime in new operation descriptor

Fix afs_store_data() so that it sets the mtime in the new operation
descriptor; otherwise the mtime on the server gets set to 0 when a write
is stored to the server.

Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Reported-by: Dave Botsch <botsch@cnf.cornell.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 11 Jun 2020 22:54:31 +0000 (15:54 -0700)]
Merge tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more x86 updates from Thomas Gleixner:
 "A set of fixes and updates for x86:

   - Unbreak paravirt VDSO clocks.

     When the VDSO code was moved into lib for sharing, a subtle check
     for the validity of paravirt clocks got replaced. While the
     replacement works perfectly fine for bare metal, where the update of
     the VDSO clock mode is synchronous, it fails for paravirt clocks
     because the hypervisor can invalidate them asynchronously.

     Bring it back as an optional function so it does not inflict this
     on architectures which are free of PV damage.

   - Fix the jiffies to jiffies64 mapping on 64bit so it does not
     trigger an ODR violation on newer compilers

   - Three fixes for the SSBD and *IB* speculation mitigation maze to
     ensure consistency, to avoid wrongly disabling some *IB* variants,
     and to prevent a rogue cross-process shutdown of SSBD. All marked
     for stable.

   - Add yet more CPU models to the splitlock detection capable list
     !@#%$!

   - Bring back the pr_info() which reports that the TSC deadline timer
     is enabled.

   - Reboot quirk for MacBook6,1"

* tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso: Unbreak paravirt VDSO clocks
  lib/vdso: Provide sanity check for cycles (again)
  clocksource: Remove obsolete ifdef
  x86_64: Fix jiffies ODR violation
  x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
  x86/speculation: Prevent rogue cross-process SSBD shutdown
  x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
  x86/cpu: Add Sapphire Rapids CPU model number
  x86/split_lock: Add Icelake microserver and Tigerlake CPU models
  x86/apic: Make TSC deadline timer detection message visible
  x86/reboot/quirks: Add MacBook6,1 reboot quirk

Dan Carpenter [Wed, 3 Jun 2020 17:54:36 +0000 (20:54 +0300)]
net/mlx5: E-Switch, Fix some error pointer dereferences

We can't leave "counter" set to an error pointer.  Otherwise it leads
to an error pointer dereference, either later in the function or when
we call mlx5_fc_destroy().
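
The pattern can be sketched in user-space C as follows.  The demo_err_ptr(),
demo_is_err() and demo_ptr_err() helpers are simplified stand-ins for the
kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), and the counter and ACL types and
functions are hypothetical, not the mlx5 code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Simplified stand-ins for the kernel's error pointer helpers. */
static inline void *demo_err_ptr(long err) { return (void *)err; }
static inline long demo_ptr_err(const void *p) { return (long)p; }
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_acl { void *counter; };

static int demo_destroy_calls;

/* Hypothetical destructor: safe for NULL, but not for error pointers. */
static void demo_counter_destroy(void *counter)
{
	if (!counter)
		return;
	demo_destroy_calls++;
}

/* Hypothetical constructor that can be told to fail. */
static void *demo_counter_create(int fail)
{
	static int dummy;

	return fail ? demo_err_ptr(-ENOMEM) : (void *)&dummy;
}

int demo_acl_setup(struct demo_acl *acl, int fail)
{
	void *counter = demo_counter_create(fail);

	if (demo_is_err(counter)) {
		acl->counter = NULL;	/* never keep the error pointer */
		return (int)demo_ptr_err(counter);
	}
	acl->counter = counter;
	return 0;
}

void demo_acl_cleanup(struct demo_acl *acl)
{
	demo_counter_destroy(acl->counter);
	acl->counter = NULL;
}
```

Resetting the field to NULL on failure means later cleanup code only has to
handle two states, a valid pointer or NULL.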

Fixes: 07bab9502641d ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Leon Romanovsky [Tue, 2 Jun 2020 12:28:37 +0000 (15:28 +0300)]
net/mlx5: Don't fail driver on failure to create debugfs

Clang warns:

drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:6: warning: variable
'err' is used uninitialized whenever 'if' condition is true
[-Wsometimes-uninitialized]
        if (!priv->dbg_root) {
            ^~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1303:9: note:
uninitialized use occurs here
        return err;
               ^~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:2: note: remove the
'if' if its condition is always false
        if (!priv->dbg_root) {
        ^~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1259:9: note: initialize
the variable 'err' to silence this warning
        int err;
               ^
                = 0
1 warning generated.

The check of the value returned by debugfs_create_dir() is wrong for two
reasons. By design, debugfs failures should never fail the driver, and
the check itself was incorrect too: a kernel compiled without
CONFIG_DEBUG_FS returns ERR_PTR(-ENODEV), not NULL as the code expected.
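
The following userspace sketch illustrates both points, assuming simplified stand-ins for ERR_PTR()/IS_ERR(). fake_debugfs_create_dir() and init_one_sketch() are hypothetical and only model a kernel built without CONFIG_DEBUG_FS:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel helpers from <linux/err.h>. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct dentry;	/* opaque, as in the kernel */

/* Hypothetical stub: behaves like debugfs when it is compiled out. */
static struct dentry *fake_debugfs_create_dir(const char *name)
{
	(void)name;
	return ERR_PTR(-ENODEV);
}

struct priv { struct dentry *dbg_root; };

/*
 * debugfs is best-effort: keep whatever comes back and carry on.
 * Note that a "!priv->dbg_root" test would not even notice the
 * failure here, because the error pointer is not NULL.
 */
static int init_one_sketch(struct priv *priv)
{
	priv->dbg_root = fake_debugfs_create_dir("demo");
	return 0;
}
```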

Fixes: 11f3b84d7068 ("net/mlx5: Split mdev init and pci init")
Link: https://github.com/ClangBuiltLinux/linux/issues/1042
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: CT: Fix ipv6 nat header rewrite actions
Oz Shlomo [Sun, 7 Jun 2020 15:40:40 +0000 (15:40 +0000)]
net/mlx5e: CT: Fix ipv6 nat header rewrite actions

Set the ipv6 word fields according to the hardware definitions.

Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Fix devlink objects and devlink device unregister sequence
Parav Pandit [Fri, 15 May 2020 07:44:06 +0000 (02:44 -0500)]
net/mlx5: Fix devlink objects and devlink device unregister sequence

Currently, the following problems exist.

1. devlink device is registered by mlx5_load_one(). But it is
not unregistered by mlx5_unload_one(). This is incorrect.

2. As a result of the above issue, when the mlx5 PCI device is removed,
the devlink device is currently unregistered before the devlink ports
are unregistered, as shown in the ladder diagram below.

remove_one()
  mlx5_devlink_unregister()
    [..]
    devlink_unregister() <- ports are still registered!
  mlx5_unload_one()
    mlx5_unregister_device()
      mlx5_remove_device()
        mlx5e_remove()
          mlx5e_devlink_port_unregister()
            devlink_port_unregister()

3. The conditions checked when registering and unregistering the device
are not symmetric in these routines either.

Hence, fix the sequence by making the load and unload routines symmetric
and correctly ordered, i.e.
(a) register the devlink device, followed by registering the devlink ports
(b) unregister the devlink ports, followed by the devlink device

Do this based on the boot and cleanup flags instead of different
conditions.
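
The intended ordering can be sketched in plain C as follows. This is only an illustration under assumptions: load_one_sketch(), unload_one_sketch() and the event log are hypothetical and merely record the order of the steps (a) and (b) above, they are not the mlx5 routines:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event log used to observe the call order. */
#define MAX_EVENTS 8
static const char *events[MAX_EVENTS];
static size_t n_events;

static void record(const char *what)
{
	if (n_events < MAX_EVENTS)
		events[n_events++] = what;
}

/* (a) devlink device first, then the ports. */
static void load_one_sketch(bool boot)
{
	if (boot)
		record("devlink_register");
	record("port_register");
}

/* (b) exact mirror image: ports first, then the devlink device. */
static void unload_one_sketch(bool cleanup)
{
	record("port_unregister");
	if (cleanup)
		record("devlink_unregister");
}
```

Because both routines key off a single flag each, the teardown path is the reverse of the setup path by construction.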

Fixes: c6acd629eec7 ("net/mlx5e: Add support for devlink-port in non-representors mode")
Fixes: f60f315d339e ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Disable reload while removing the device
Parav Pandit [Thu, 14 May 2020 10:12:56 +0000 (05:12 -0500)]
net/mlx5: Disable reload while removing the device

While unregistration is in progress, the user might be reloading the
interface.
This can race with unregistration in the flow below, which uses
resources that are being disabled by the reload flow.

Hence, disable devlink reloading first when removing the device.

     CPU0                                   CPU1
     ----                                   ----
local_pci_remove()                  devlink_mutex
  remove_one()                       devlink_nl_cmd_reload()
    mlx5_unregister_device()           devlink_reload()
                                       ops->reload_down()
                                         mlx5_unload_one()

Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix ethtool hfunc configuration change
Aya Levin [Sun, 17 May 2020 09:45:52 +0000 (12:45 +0300)]
net/mlx5e: Fix ethtool hfunc configuration change

Changing the RX hash function requires rearranging the RQT internal
indexes. The user isn't exposed to such changes, and they do not affect
the user-configured indirection table. Rebuild the RQ table on hfunc
change.

Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix repeated XSK usage on one channel
Maxim Mikityanskiy [Mon, 1 Jun 2020 13:03:44 +0000 (16:03 +0300)]
net/mlx5e: Fix repeated XSK usage on one channel

After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened a second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.

This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
use cases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.
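
A minimal sketch of the idea, assuming hypothetical struct and function names (xsk_state, channel, xsk_open(), xsk_use(), xsk_close()) rather than the driver's real ones. Closing zeroes the per-channel XSK state so that a second open on the same channel starts clean:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-channel XSK bookkeeping. */
struct xsk_state {
	unsigned int producer;
	unsigned int consumer;
	bool active;
};

struct channel {
	int ix;
	struct xsk_state xsk;
};

static void xsk_open(struct channel *c)
{
	c->xsk.active = true;
}

static void xsk_use(struct channel *c, unsigned int n)
{
	c->xsk.producer += n;
	c->xsk.consumer += n;
}

/* Explicitly zero the XSK-related state; the channel itself survives. */
static void xsk_close(struct channel *c)
{
	memset(&c->xsk, 0, sizeof(c->xsk));
}
```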

Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: DR, Fix freeing in dr_create_rc_qp()
Denis Efremov [Mon, 1 Jun 2020 16:45:26 +0000 (19:45 +0300)]
net/mlx5: DR, Fix freeing in dr_create_rc_qp()

Variable "in" in dr_create_rc_qp() is allocated with kvzalloc() and
should be freed with kvfree().

Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Fix fatal error handling during device load
Shay Drory [Thu, 7 May 2020 06:32:53 +0000 (09:32 +0300)]
net/mlx5: Fix fatal error handling during device load

Currently, in case of a fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() has finished. This can take
several minutes, because commands can't be processed due to the fatal
error and each of them has to time out.
Fix it by setting dev->state to MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.

Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: drain health workqueue in case of driver load error
Shay Drory [Wed, 6 May 2020 12:59:48 +0000 (15:59 +0300)]
net/mlx5: drain health workqueue in case of driver load error

If there is work in the health WQ when we tear down the driver in the
driver load error flow, the health work will try to read dev->iseg,
which was already unmapped in mlx5_pci_close().
Fix it by draining the health workqueue as the first thing in
mlx5_pci_close().

Trace of the error:
BUG: unable to handle page fault for address: ffffb5b141c18014
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 1fe95d067 P4D 1fe95d067 PUD 1fe95e067 PMD 1b7823067 PTE 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 6755 Comm: kworker/u128:2 Not tainted 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? mlx5_health_try_recover+0x4d/0x270 [mlx5_core]
 mlx5_fw_fatal_reporter_recover+0x16/0x20 [mlx5_core]
 devlink_health_reporter_recover+0x1c/0x50
 devlink_health_report+0xfb/0x240
 mlx5_fw_fatal_reporter_err_work+0x65/0xd0 [mlx5_core]
 process_one_work+0x1fb/0x4e0
 ? process_one_work+0x16b/0x4e0
 worker_thread+0x4f/0x3d0
 kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0
 ? kthread_cancel_delayed_work_sync+0x20/0x20
 ret_from_fork+0x1f/0x30
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache 8021q garp mrp stp llc ipmi_devintf ipmi_msghandler rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core mlx5_core sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 mlxfw crypto_simd cryptd glue_helper input_leds hyperv_fb intel_rapl_perf joydev serio_raw pci_hyperv pci_hyperv_mini mac_hid hv_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 hv_utils hid_generic hv_storvsc ptp hid_hyperv hid hv_netvsc hyperv_keyboard pps_core scsi_transport_fc psmouse hv_vmbus i2c_piix4 floppy pata_acpi
CR2: ffffb5b141c18014
---[ end trace b12c5503157cad24 ]---
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38
in_atomic(): 0, irqs_disabled(): 1, pid: 6755, name: kworker/u128:2
INFO: lockdep is turned off.
CPU: 3 PID: 6755 Comm: kworker/u128:2 Tainted: G      D           5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
 dump_stack+0x63/0x88
 ___might_sleep+0x10a/0x130
 __might_sleep+0x4a/0x80
 exit_signals+0x33/0x230
 ? blocking_notifier_call_chain+0x16/0x20
 do_exit+0xb1/0xc30
 ? kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0

Fixes: 52c368dc3da7 ("net/mlx5: Move health and page alloc init to mdev_init")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoMerge tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 11 Jun 2020 22:36:28 +0000 (15:36 -0700)]
Merge tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "A small fix for the VDSO code to force inline
  __cvdso_clock_gettime_common() so the compiler
  can't generate horrible code"

* tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/vdso: Force inlining of __cvdso_clock_gettime_common()

4 years agoiavf: increase reset complete wait time
Paul Greenwalt [Fri, 5 Jun 2020 17:09:46 +0000 (10:09 -0700)]
iavf: increase reset complete wait time

With an increased number of VFs, it's possible to encounter the following
issue during reset.

    iavf b8d4:00:02.0: Hardware reset detected
    iavf b8d4:00:02.0: Reset never finished (0)
    iavf b8d4:00:02.0: Reset task did not complete, VF disabled

Increase the reset complete wait count to allow for 128 VFs to complete
reset.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoiavf: Fix reporting 2.5 Gb and 5Gb speeds
Brett Creeley [Fri, 5 Jun 2020 17:09:45 +0000 (10:09 -0700)]
iavf: Fix reporting 2.5 Gb and 5Gb speeds

Commit 4ae4916b5643 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb
speeds") added the ability for the PF to report 2.5 and 5Gb speeds,
however, the iavf driver does not recognize those speeds as the values were
not added there. Add the proper enums and values so that iavf can properly
deal with those speeds.

Fixes: 4ae4916b5643 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb speeds")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoiavf: use appropriate enum for comparison
Aleksandr Loktionov [Fri, 5 Jun 2020 17:09:44 +0000 (10:09 -0700)]
iavf: use appropriate enum for comparison

adapter->link_speed has type enum virtchnl_link_speed but our comparisons
are against enum iavf_aq_link_speed. Though they are, currently, the same
values, change the comparison to the matching enum virtchnl_link_speed
since that may not always be the case.

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoiavf: fix speed reporting over virtchnl
Brett Creeley [Fri, 5 Jun 2020 17:09:43 +0000 (10:09 -0700)]
iavf: fix speed reporting over virtchnl

Link speeds are communicated over virtchnl using an enum
virtchnl_link_speed. Currently, the highest link speed is 40Gbps which
leaves us unable to reflect some speeds that an ice VF is capable of.
This causes link speed to be misreported on the iavf driver.

Allow for communicating link speeds using Mbps so that the proper speed can
be reported for an ice VF. Moving away from the enum allows us to
communicate future speed changes without requiring a new enum to be added.

In order to support communicating link speeds over virtchnl in Mbps the
following functionality was added:
    - Added u32 link_speed_mbps in the iavf_adapter structure.
    - Added the macro ADV_LINK_SUPPORT(_a) to determine if the VF
      driver supports communicating link speeds in Mbps.
    - Added the function iavf_get_vpe_link_status() to fill the
      correct link_status in the event_data union based on the
      ADV_LINK_SUPPORT(_a) macro.
    - Added the function iavf_set_adapter_link_speed_from_vpe()
      to determine whether or not to fill the u32 link_speed_mbps or
      enum virtchnl_link_speed link_speed field in the iavf_adapter
      structure based on the ADV_LINK_SUPPORT(_a) macro.
    - Do not free vf_res in iavf_init_get_resources() as vf_res will be
      accessed in iavf_get_link_ksettings(); memset to 0 instead. This
      memory is subsequently freed in iavf_remove().
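
The selection logic in the list above can be sketched as follows. ADV_LINK_SUPPORT() and link_speed_mbps are named in the text; everything else (the capability bit, the enum values, the struct layouts and set_adapter_link_speed_sketch()) is a hypothetical simplification, not the virtchnl definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical legacy enum; the real one is enum virtchnl_link_speed. */
enum legacy_link_speed {
	LEGACY_SPEED_UNKNOWN = 0,
	LEGACY_SPEED_10GB,
	LEGACY_SPEED_40GB,
};

/* Hypothetical capability bit negotiated between VF and PF. */
#define CAP_ADV_LINK_SPEED (1u << 0)

struct adapter {
	uint32_t caps;
	enum legacy_link_speed link_speed;	/* used without the capability */
	uint32_t link_speed_mbps;		/* used with the capability */
};

#define ADV_LINK_SUPPORT(_a) (((_a)->caps & CAP_ADV_LINK_SPEED) != 0)

/* Event payload: the same bytes are read as an enum or as Mbps. */
struct link_event {
	union {
		enum legacy_link_speed speed;
		uint32_t speed_mbps;
	} u;
};

static void set_adapter_link_speed_sketch(struct adapter *a,
					  const struct link_event *ev)
{
	if (ADV_LINK_SUPPORT(a))
		a->link_speed_mbps = ev->u.speed_mbps;
	else
		a->link_speed = ev->u.speed;
}
```

Reporting a plain Mbps value means a new speed needs no new enum entry on either side.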

Fixes: 7c710869d64e ("ice: Add handlers for VF netdevice operations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agotools, bpftool: Exit on error in function codegen
Tobias Klauser [Thu, 11 Jun 2020 10:33:41 +0000 (12:33 +0200)]
tools, bpftool: Exit on error in function codegen

Currently, the codegen function might fail and return an error. But its
callers continue without checking its return value. Since codegen can
fail only in the unlikely case of the system running out of memory or
the static template being malformed, just exit(-1) directly from codegen
and make it void-returning.
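
A minimal sketch of the "exit instead of returning an error" shape, under assumptions: codegen_sketch() is a hypothetical helper that only duplicates its template, it is not bpftool's codegen(). Since failure terminates the process, callers have no return value to check:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Duplicate the template into *out. Running out of memory is treated
 * as unrecoverable, so the function exits rather than returning an
 * error that every caller would have to check.
 */
static void codegen_sketch(const char *template, char **out)
{
	size_t len = strlen(template);
	char *buf = malloc(len + 1);

	if (!buf) {
		fprintf(stderr, "codegen: out of memory\n");
		exit(-1);
	}
	memcpy(buf, template, len + 1);
	*out = buf;
}
```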

Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200611103341.21532-1-tklauser@distanz.ch
4 years agoxdp: Fix xsk_generic_xmit errno
Li RongQing [Thu, 11 Jun 2020 05:11:06 +0000 (13:11 +0800)]
xdp: Fix xsk_generic_xmit errno

Propagate the sock_alloc_send_skb() error code instead of setting it to
EAGAIN unconditionally when skb allocation fails, which might cause
user space to loop unnecessarily.
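
The change in behaviour can be illustrated with a small sketch. fake_alloc_send_skb() and generic_xmit_sketch() are hypothetical; the only property borrowed from the real allocator is that it reports its own error code through an output parameter:

```c
#include <errno.h>
#include <stddef.h>

struct skb { int len; };

/* Hypothetical allocator: fails with the requested code, else succeeds. */
static struct skb *fake_alloc_send_skb(int fail_with, int *err)
{
	static struct skb skb;

	if (fail_with) {
		*err = fail_with;
		return NULL;
	}
	*err = 0;
	return &skb;
}

static int generic_xmit_sketch(int fail_with)
{
	int err = 0;
	struct skb *skb = fake_alloc_send_skb(fail_with, &err);

	if (!skb)
		return err;	/* propagate; do not overwrite with -EAGAIN */
	return 0;
}
```

With the real error visible, a caller can retry on -EAGAIN but stop on a permanent failure such as -ENOMEM.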

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1591852266-24017-1-git-send-email-lirongqing@baidu.com
4 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Thu, 11 Jun 2020 20:25:53 +0000 (13:25 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge some more updates from Andrew Morton:

 - various hotfixes and minor things

 - hch's use_mm/unuse_mm cleanups

Subsystems affected by this patch series: mm/hugetlb, scripts, kcov,
lib, nilfs, checkpatch, lib, mm/debug, ocfs2, lib, misc.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  kernel: set USER_DS in kthread_use_mm
  kernel: better document the use_mm/unuse_mm API contract
  kernel: move use_mm/unuse_mm to kthread.c
  kernel: move use_mm/unuse_mm to kthread.c
  stacktrace: cleanup inconsistent variable type
  lib: test get_count_order/long in test_bitops.c
  mm: add comments on pglist_data zones
  ocfs2: fix spelling mistake and grammar
  mm/debug_vm_pgtable: fix kernel crash by checking for THP support
  lib: fix bitmap_parse() on 64-bit big endian archs
  checkpatch: correct check for kernel parameters doc
  nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
  lib/lz4/lz4_decompress.c: document deliberate use of `&'
  kcov: check kcov_softirq in kcov_remote_stop()
  scripts/spelling: add a few more typos
  khugepaged: selftests: fix timeout condition in wait_for_scan()

4 years agodt-bindings: Fix more incorrect 'reg' property sizes in examples
Rob Herring [Thu, 11 Jun 2020 14:58:04 +0000 (08:58 -0600)]
dt-bindings: Fix more incorrect 'reg' property sizes in examples

The examples template is a 'simple-bus' with a size of 1 cell for
#address-cells and #size-cells. The schema was only checking that the
entries had between 2 and 4 cells, which really only errors on I2C or
SPI type devices with a single cell.

The easiest fix in most cases is to change the 'reg' property to 1 cell
for address and size.

Cc: "Heiko Stübner" <heiko@sntech.de>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: linux-rockchip@lists.infradead.org
Cc: linux-media@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Cc: netdev@vger.kernel.org
Cc: alsa-devel@alsa-project.org
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
4 years agoalpha: Fix build around srm_sysrq_reboot_op
Joerg Roedel [Thu, 11 Jun 2020 09:11:39 +0000 (11:11 +0200)]
alpha: Fix build around srm_sysrq_reboot_op

The patch introducing the struct was probably never compile tested,
because it sets a handler with a wrong function signature. Wrap the
handler in a function with the correct signature to fix the build.
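
A sketch of the wrapping technique, with hypothetical names throughout: key_op is a simplified stand-in for the sysrq operation struct, and fake_machine_halt() stands in for the function with the mismatched signature:

```c
/* Simplified stand-in: the handler slot expects "void (*)(int)". */
struct key_op {
	void (*handler)(int key);
	const char *help_msg;
};

static int halted;

/* Existing function with a different signature (takes no argument). */
static void fake_machine_halt(void)
{
	halted = 1;
}

/* Wrapper with the signature the struct requires; the key is unused. */
static void reboot_op_handler(int key)
{
	(void)key;
	fake_machine_halt();
}

static const struct key_op reboot_op = {
	.handler  = reboot_op_handler,
	.help_msg = "reboot(b)",
};
```

Assigning fake_machine_halt directly to .handler would be an incompatible pointer type; the wrapper keeps the call through the function pointer well defined.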

Fixes: 0f1c9688a194 ("tty/sysrq: alpha: export and use __sysrq_get_key_op()")
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 11 Jun 2020 19:55:20 +0000 (12:55 -0700)]
Merge tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Kconfig select statements are now sorted alphanumerically

 - first-level interrupts are now handled via a full irqchip driver

 - CPU hotplug is fixed

 - vDSO calls now use the common vDSO infrastructure

* tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: set the permission of vdso_data to read-only
  riscv: use vDSO common flow to reduce the latency of the time-related functions
  riscv: fix build warning of missing prototypes
  RISC-V: Don't mark init section as non-executable
  RISC-V: Force select RISCV_INTC for CONFIG_RISCV
  RISC-V: Remove do_IRQ() function
  clocksource/drivers/timer-riscv: Use per-CPU timer interrupt
  irqchip: RISC-V per-HART local interrupt controller driver
  RISC-V: Rename and move plic_find_hart_id() to arch directory
  RISC-V: self-contained IPI handling routine
  RISC-V: Sort select statements alphanumerically

4 years agoMerge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64...
Linus Torvalds [Thu, 11 Jun 2020 19:53:23 +0000 (12:53 -0700)]
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "arm64 fixes that came in during the merge window.

  There will probably be more to come, but it doesn't seem like it's
  worth me sitting on these in the meantime.

   - Fix SCS debug check to report max stack usage in bytes as advertised

   - Fix typo: CONFIG_FTRACE_WITH_REGS => CONFIG_DYNAMIC_FTRACE_WITH_REGS

   - Fix incorrect mask in HiSilicon L3C perf PMU driver

   - Fix compat vDSO compilation under some toolchain configurations

   - Fix false UBSAN warning from ACPI IORT parsing code

   - Fix booting under bootloaders that ignore TEXT_OFFSET

   - Annotate debug initcall function with '__init'"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: warn on incorrect placement of the kernel by the bootloader
  arm64: acpi: fix UBSAN warning
  arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
  drivers/perf: hisi: Fix wrong value for all counters enable
  arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGS
  arm64: debug: mark a function as __init to save some memory
  scs: Report SCS usage in bytes rather than number of entries

4 years agoMerge tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg...
Linus Torvalds [Thu, 11 Jun 2020 19:50:54 +0000 (12:50 -0700)]
Merge tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

Pull m68knommu updates from Greg Ungerer:

 - casting clean up in the user access macros

 - memory leak on error case fix for PCI probing

 - update of a defconfig

* tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k,nommu: fix implicit cast from __user in __{get,put}_user_asm()
  m68k,nommu: add missing __user in uaccess' __ptr() macro
  m68k: Drop CONFIG_MTD_M25P80 in stmark2_defconfig
  m68k/PCI: Fix a memory leak in an error handling path

4 years agodt-bindings: phy: qcom: Fix missing 'ranges' and example addresses
Rob Herring [Thu, 11 Jun 2020 14:52:38 +0000 (08:52 -0600)]
dt-bindings: phy: qcom: Fix missing 'ranges' and example addresses

The QCom QMP PHY bindings have child nodes with translatable (MMIO)
addresses, so a 'ranges' property is required in the parent node.
Additionally, the examples default to 1 address and size cell, so let's
fix that, too.

Fixes: ccf51c1cedfd ("dt-bindings: phy: qcom,qmp: Convert QMP PHY bindings to yaml")
Cc: Andy Gross <agross@kernel.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Manu Gautam <mgautam@codeaurora.org>
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
4 years agodt-bindings: Remove more cases of 'allOf' containing a '$ref'
Rob Herring [Wed, 10 Jun 2020 21:49:12 +0000 (15:49 -0600)]
dt-bindings: Remove more cases of 'allOf' containing a '$ref'

Another round of 'allOf' removals that came in this cycle.

json-schema versions draft7 and earlier have a weird behavior in that
any keywords combined with a '$ref' are ignored (silently). The correct
form was to put a '$ref' under an 'allOf'. This behavior is now changed
in the 2019-09 json-schema spec and '$ref' can be mixed with other
keywords. The json-schema library doesn't yet support this, but the
tooling now does a fixup for this and either way works.

This has been a constant source of review comments, so let's change this
treewide so everyone copies the simpler syntax.

Signed-off-by: Rob Herring <robh@kernel.org>
4 years agotipc: fix NULL pointer dereference in tipc_disc_rcv()
Tuong Lien [Thu, 11 Jun 2020 10:08:08 +0000 (17:08 +0700)]
tipc: fix NULL pointer dereference in tipc_disc_rcv()

When a bearer is enabled, we create a 'tipc_discoverer' object to store
the bearer-related data along with a timer and a preformatted discovery
message buffer for later probing. However, this is only carried out
after the bearer has been set 'up', which leaves a race condition
resulting in a kernel panic.

It occurs when a discovery message from a peer node is received and
processed in the bottom half (since the bearer is already 'up') just
before the discoverer object is created. The object is then accessed in
order to update the preformatted buffer (with a new trial address, ...),
which leads to the NULL pointer dereference.

We solve the problem by simply moving the bearer 'up' setting to later,
making sure everything is ready before any message is received.
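
The ordering can be sketched in single-threaded C as below. All names (bearer, discoverer, bearer_enable_sketch(), disc_rcv_sketch()) are hypothetical, and the sketch deliberately ignores the locking and memory-ordering concerns that real concurrent code has to handle:

```c
#include <stdbool.h>
#include <stdlib.h>

struct discoverer { int msgs_seen; };

struct bearer {
	bool up;
	struct discoverer *disc;
};

/* Create everything the receive path needs, then mark the bearer up. */
static int bearer_enable_sketch(struct bearer *b)
{
	b->disc = calloc(1, sizeof(*b->disc));
	if (!b->disc)
		return -1;
	b->up = true;	/* last step: receivers may run from here on */
	return 0;
}

/* Receive path: only touches the discoverer once the bearer is up. */
static bool disc_rcv_sketch(struct bearer *b)
{
	if (!b->up)
		return false;
	b->disc->msgs_seen++;
	return true;
}
```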

Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotipc: fix kernel WARNING in tipc_msg_append()
Tuong Lien [Thu, 11 Jun 2020 10:07:35 +0000 (17:07 +0700)]
tipc: fix kernel WARNING in tipc_msg_append()

syzbot found the following issue:

WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242
Kernel panic - not syncing: panic_on_warn set ...

This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer
dereference in streaming"), which tried to build at least one buffer even
when the message data length is zero. However, it now exposes another
bug: 'mss' can be zero, which makes 'cpy' negative and triggers the
kernel WARNING above.
A zero 'mss' is never expected because it means Nagle is not enabled for
the socket (the socket type was actually 'SOCK_SEQPACKET'), so
'tipc_msg_append()' must not be called at all. It was called in this
particular case because the message data length was zero and the
'send <= maxnagle' check became true.

We resolve the issue by explicitly checking whether Nagle is enabled for
the socket, i.e. 'maxnagle != 0', before calling 'tipc_msg_append()'. We
also harden the function against such negative values, if any.
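
Both halves of the fix can be sketched as small predicates. The names 'send', 'maxnagle' and 'mss' follow the text above; should_try_append() and copy_len() are hypothetical helpers and do not reproduce the actual TIPC code:

```c
#include <stdbool.h>

/* Caller side: only take the append path when Nagle is enabled. */
static bool should_try_append(unsigned int send, unsigned int maxnagle)
{
	return maxnagle != 0 && send <= maxnagle;
}

/*
 * Callee side: never produce a negative copy length, even if 'mss'
 * is smaller than what is already used in the buffer.
 */
static int copy_len(int remaining, int mss, int used)
{
	int room = mss - used;

	if (room <= 0 || remaining <= 0)
		return 0;
	return remaining < room ? remaining : room;
}
```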

Reported-by: syzbot+75139a7d2605236b0b7f@syzkaller.appspotmail.com
Fixes: c0bceb97db9e ("tipc: add smart nagle feature")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoionic: remove support for mgmt device
Shannon Nelson [Thu, 11 Jun 2020 04:07:39 +0000 (21:07 -0700)]
ionic: remove support for mgmt device

We no longer support the mgmt device in the ionic driver,
so remove the device id and related code.

Fixes: b3f064e9746d ("ionic: add support for device id 0x1004")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integr...
Linus Torvalds [Thu, 11 Jun 2020 19:42:14 +0000 (12:42 -0700)]
Merge tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration

Pull mailbox updates from Jassi Brar:
 "qcom:
   - new controller driver for IPCC
   - reorg the of_device data
   - add support for ipq6018 platform

  spreadtrum:
   - new sprd controller driver

  imx:
   - implement suspend/resume PM support

  misc:
   - make pcc driver struct static
   - fix return value in imx_mu_scu
   - disable clock before bailout in imx probe
   - remove duplicate error message in zynqmp probe
   - fix header size in imx.scu
   - check for null instead of is-err in zynqmp"

* tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: qcom: Add ipq6018 apcs compatible
  mailbox: qcom: Add clock driver name in apcs mailbox driver data
  dt-bindings: mailbox: Add YAML schemas for QCOM APCS global block
  mailbox: imx: ONLY IPC MU needs IRQF_NO_SUSPEND flag
  mailbox: imx: Add runtime PM callback to handle MU clocks
  mailbox: imx: Add context save/restore for suspend/resume
  MAINTAINERS: Add entry for Qualcomm IPCC driver
  mailbox: Add support for Qualcomm IPCC
  dt-bindings: mailbox: Add devicetree binding for Qcom IPCC
  mailbox: zynqmp-ipi: Fix NULL vs IS_ERR() check in zynqmp_ipi_mbox_probe()
  mailbox: imx-mailbox: fix scu msg header size check
  mailbox: sprd: Add Spreadtrum mailbox driver
  dt-bindings: mailbox: Add the Spreadtrum mailbox documentation
  mailbox: ZynqMP IPI: Delete an error message in zynqmp_ipi_probe()
  mailbox: imx: Disable the clock on devm_mbox_controller_register() failure
  mailbox: imx: Fix return in imx_mu_scu_xlate()
  mailbox: imx: Support runtime PM
  mailbox: pcc: make pcc_mbox_driver static

4 years agodrivers: dpaa2: Use devm_kcalloc() in setup_dpni()
Xu Wang [Thu, 11 Jun 2020 02:45:20 +0000 (02:45 +0000)]
drivers: dpaa2: Use devm_kcalloc() in setup_dpni()

The size of this memory allocation is computed with a multiplication,
which indicates that an array is being allocated.
Thus use the corresponding function "devm_kcalloc".
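
As a userspace analogy (not the devm_kcalloc() implementation), the benefit of an array-style allocator is that the count and element size are passed separately, so the multiplication can be checked for overflow. alloc_array_zeroed() is a hypothetical helper:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a zeroed array of n elements of the given size, refusing
 * requests whose total size would overflow size_t. A bare
 * malloc(n * size) would silently wrap around instead.
 */
static void *alloc_array_zeroed(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return calloc(n, size);
}
```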

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 11 Jun 2020 19:38:11 +0000 (12:38 -0700)]
Merge tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Here are last-minute fixes gathered before the merge window closes; a
  few fixes are for the core, while the majority of the rest are driver
  fixes.

   - PCM locking annotation fixes and the possible self-lock fix

   - ASoC DPCM regression fixes with multi-CPU DAI

   - A fix for inconsistent resume from system-PM on USB-audio

   - Improved runtime-PM handling with multiple USB interfaces

   - Quirks for HD-audio and USB-audio

   - Hardened firmware handling in max98390 codec

   - A couple of fixes for meson"

* tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ASoC: rt5645: Add platform-data for Asus T101HA
  ASoC: Intel: bytcr_rt5640: Add quirk for Toshiba Encore WT10-A tablet
  ASoC: SOF: nocodec: conditionally set dpcm_capture/dpcm_playback flags
  ASoC: Intel: boards: replace capture_only by dpcm_capture
  ASoC: core: only convert non DPCM link to DPCM link
  ASoC: soc-pcm: dpcm: fix playback/capture checks
  ASoC: meson: add missing free_irq() in error path
  ALSA: pcm: disallow linking stream to itself
  ALSA: usb-audio: Manage auto-pm of all bundled interfaces
  ALSA: hda/realtek - add a pintbl quirk for several Lenovo machines
  ALSA: pcm: fix snd_pcm_link() lockdep splat
  ALSA: usb-audio: Use the new macro for HP Dock rename quirks
  ALSA: usb-audio: Add vendor, product and profile name for HP Thunderbolt Dock
  ALSA: emu10k1: delete an unnecessary condition
  dt-bindings: ASoc: Fix tdm-slot documentation spelling error
  ASoC: meson: fix memory leak of links if allocation of ldata fails
  ALSA: usb-audio: Fix inconsistent card PM state after resume
  ASoC: max98390: Fix potential crash during param fw loading
  ASoC: max98390: Fix incorrect printf qualifier
  ASoC: fsl-asoc-card: Defer probe when fail to find codec device
  ...

4 years agoMerge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Thu, 11 Jun 2020 19:27:06 +0000 (12:27 -0700)]
Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "One sun4i fix and a connector hotplug race. The ast fix is for a
  regression in 5.6, and one of the i915 ones fixes an oops reported by
  dhowells.

  core:
   - fix race in connectors sending hotplug

  i915:
   - Avoid use after free in cmdparser
   - Avoid NULL dereference when probing all display encoders
   - Fixup to module parameter type

  sun4i:
   - clock divider fix

  ast:
   - 24/32 bpp mode setting fix"

* tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm:
  drm/ast: fix missing break in switch statement for format->cpp[0] case 4
  drm/sun4i: hdmi ddc clk: Fix size of m divider
  drm/i915/display: Only query DP state of a DDI encoder
  drm/i915/params: fix i915.reset module param type
  drm/i915/gem: Mark the buffer pool as active for the cmdparser
  drm/connector: notify userspace on hotplug after register complete

4 years agoMerge tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Thu, 11 Jun 2020 19:22:41 +0000 (12:22 -0700)]
Merge tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "New features and improvements:
   - Sunrpc receive buffer sizes only change when establishing GSS credentials
   - Add more sunrpc tracepoints
   - Improve on tracepoints to capture internal NFS I/O errors

  Other bugfixes and cleanups:
   - Move a dprintk() to after a call to nfs_alloc_fattr()
   - Fix off-by-one issues in rpc_ntop6
   - Fix a few coccicheck warnings
   - Use the correct SPDX license identifiers
   - Fix rpc_call_done assignment for BIND_CONN_TO_SESSION
   - Replace zero-length array with flexible array
   - Remove duplicate headers
   - Set invalid blocks after NFSv4 writes to update space_used attribute
   - Fix direct WRITE throughput regression"

* tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
  NFS: Fix direct WRITE throughput regression
  SUNRPC: rpc_xprt lifetime events should record xprt->state
  xprtrdma: Make xprt_rdma_slot_table_entries static
  nfs: set invalid blocks after NFSv4 writes
  NFS: remove redundant initialization of variable result
  sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
  NFS: Add a tracepoint in nfs_set_pgio_error()
  NFS: Trace short NFS READs
  NFS: nfs_xdr_status should record the procedure name
  SUNRPC: Set SOFTCONN when destroying GSS contexts
  SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
  SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
  SUNRPC: trace RPC client lifetime events
  SUNRPC: Trace transport lifetime events
  SUNRPC: Split the xdr_buf event class
  SUNRPC: Add tracepoint to rpc_call_rpcerror()
  SUNRPC: Update the RPC_SHOW_SOCKET() macro
  SUNRPC: Update the rpc_show_task_flags() macro
  SUNRPC: Trace GSS context lifetimes
  SUNRPC: receive buffer size estimation values almost never change
  ...

4 years agocompiler_types.h, kasan: Use __SANITIZE_ADDRESS__ instead of CONFIG_KASAN to decide...
Marco Elver [Thu, 21 May 2020 14:20:47 +0000 (16:20 +0200)]
compiler_types.h, kasan: Use __SANITIZE_ADDRESS__ instead of CONFIG_KASAN to decide inlining

Use __always_inline in compilation units that have instrumentation
disabled (KASAN_SANITIZE_foo.o := n) for KASAN, like it is done for
KCSAN.

Also, add common documentation for KASAN and KCSAN explaining the
attribute.

 [ bp: Massage commit message. ]

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200521142047.169334-12-elver@google.com
4 years agocompiler.h: Move function attributes to compiler_types.h
Marco Elver [Thu, 21 May 2020 14:20:46 +0000 (16:20 +0200)]
compiler.h: Move function attributes to compiler_types.h

Cleanup and move the KASAN and KCSAN related function attributes to
compiler_types.h, where the rest of the same kind live.

No functional change intended.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200521142047.169334-11-elver@google.com
4 years agocompiler.h: Avoid nested statement expression in data_race()
Marco Elver [Thu, 21 May 2020 14:20:45 +0000 (16:20 +0200)]
compiler.h: Avoid nested statement expression in data_race()

It appears that compilers have trouble with nested statement
expressions. Therefore, remove one level of statement expression nesting
from the data_race() macro. This will help avoid potential problems
in the future as its usage increases.

Reported-by: Borislav Petkov <bp@suse.de>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20200520221712.GA21166@zn.tnic
Link: https://lkml.kernel.org/r/20200521142047.169334-10-elver@google.com
4 years agocompiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()
Marco Elver [Thu, 21 May 2020 14:20:44 +0000 (16:20 +0200)]
compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()

The volatile accesses no longer need to be wrapped in data_race()
because compilers that emit instrumentation distinguishing volatile
accesses are required for KCSAN.

Consequently, the explicit kcsan_check_atomic*() are no longer required
either since the compiler emits instrumentation distinguishing the
volatile accesses.

Finally, simplify __READ_ONCE_SCALAR() and remove __WRITE_ONCE_SCALAR().

 [ bp: Convert commit message to passive voice. ]

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200521142047.169334-9-elver@google.com
4 years agokcsan: Update Documentation to change supported compilers
Marco Elver [Thu, 21 May 2020 14:20:43 +0000 (16:20 +0200)]
kcsan: Update Documentation to change supported compilers

Document change in required compiler version for KCSAN, and remove the
now redundant note about __no_kcsan and inlining problems with older
compilers.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200521142047.169334-8-elver@google.com
4 years agokcsan: Remove 'noinline' from __no_kcsan_or_inline
Marco Elver [Thu, 21 May 2020 14:20:41 +0000 (16:20 +0200)]
kcsan: Remove 'noinline' from __no_kcsan_or_inline

Some compilers incorrectly inline small __no_kcsan functions, which then
results in instrumenting the accesses. For this reason, the 'noinline'
attribute was added to __no_kcsan_or_inline. All known versions of GCC
are affected by this. Supported versions of Clang are unaffected, and
never inline a no_sanitize function.

However, the attribute 'noinline' in __no_kcsan_or_inline causes
unexpected code generation in functions that are __no_kcsan and call a
__no_kcsan_or_inline function.

In certain situations it is expected that the __no_kcsan_or_inline
function is actually inlined by the __no_kcsan function, and *no* calls
are emitted. By removing the 'noinline' attribute, give the compiler
the ability to inline and generate the expected code in __no_kcsan
functions.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/CANpmjNNOpJk0tprXKB_deiNAv_UmmORf1-2uajLhnLWQQ1hvoA@mail.gmail.com
Link: https://lkml.kernel.org/r/20200521142047.169334-6-elver@google.com
4 years agokcsan: Pass option tsan-instrument-read-before-write to Clang
Marco Elver [Thu, 21 May 2020 14:20:40 +0000 (16:20 +0200)]
kcsan: Pass option tsan-instrument-read-before-write to Clang

Clang (unlike GCC) removes reads before writes with matching addresses
in the same basic block. This is an optimization for TSAN, since writes
will always cause conflict if the preceding read would have.

However, for KCSAN we cannot rely on this option, because we apply
several special rules to writes, in particular when the
KCSAN_ASSUME_PLAIN_WRITES_ATOMIC option is selected. To avoid missing
potential data races, pass the -tsan-instrument-read-before-write option
to Clang if it is available [1].

[1] https://github.com/llvm/llvm-project/commit/151ed6aa38a3ec6c01973b35f684586b6e1c0f7e

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200521142047.169334-5-elver@google.com
4 years agokcsan: Support distinguishing volatile accesses
Marco Elver [Thu, 21 May 2020 14:20:39 +0000 (16:20 +0200)]
kcsan: Support distinguishing volatile accesses

In the kernel, the "volatile" keyword is used in various concurrent
contexts, whether in low-level synchronization primitives or for
legacy reasons. If supported by the compiler, it will be assumed
that aligned volatile accesses up to sizeof(long long) (matching
compiletime_assert_rwonce_type()) are atomic.
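
The rule in the paragraph above can be sketched as a small predicate.
This is an illustration only: the function name is hypothetical, and
reading "aligned" as natural alignment (address is a multiple of the
access size) is an assumption, not something this commit message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical predicate mirroring the rule described above: a volatile
 * access is assumed atomic when it is no larger than sizeof(long long)
 * and (by assumption) naturally aligned. Not the kernel's runtime check.
 */
static bool volatile_access_assumed_atomic(uintptr_t addr, size_t size)
{
	if (size == 0 || size > sizeof(long long))
		return false;
	return addr % size == 0;
}
```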

Recent versions of Clang [1] (GCC tentative [2]) can instrument
volatile accesses differently. Add the option (required) to enable the
instrumentation, and provide the necessary runtime functions. None of
the updated compilers are widely available yet (Clang 11 will be the
first release to support the feature).

[1] https://github.com/llvm/llvm-project/commit/5a2c31116f412c3b6888be361137efd705e05814
[2] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html

This change allows removing any explicit checks in primitives such as
READ_ONCE() and WRITE_ONCE().

 [ bp: Massage commit message a bit. ]

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200521142047.169334-4-elver@google.com
4 years agokcsan: Restrict supported compilers
Marco Elver [Thu, 21 May 2020 14:20:42 +0000 (16:20 +0200)]
kcsan: Restrict supported compilers

The first version of Clang that supports -tsan-distinguish-volatile will
be able to support KCSAN. The first Clang release to do so will be
Clang 11. This is due to satisfying all the following requirements:

1. Never emit calls to __tsan_func_{entry,exit}.

2. __no_kcsan functions should not call anything, not even
   kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE => Requires
   leaving them plain!

3. Support atomic_{read,set}*() with KCSAN, which rely on
   arch_atomic_{read,set}*() using __{READ,WRITE}_ONCE() => Because of
   #2, rely on Clang 11's -tsan-distinguish-volatile support. We will
   double-instrument atomic_{read,set}*(), but that's reasonable given
   it's still lower cost than the data_race() variant due to avoiding 2
   extra calls (kcsan_{en,dis}able_current() calls).

4. __always_inline functions inlined into __no_kcsan functions are never
   instrumented.

5. __always_inline functions inlined into instrumented functions are
   instrumented.

6. __no_kcsan_or_inline functions may be inlined into __no_kcsan functions =>
   Implies leaving 'noinline' off of __no_kcsan_or_inline.

7. Because of #6, __no_kcsan and __no_kcsan_or_inline functions should never be
   spuriously inlined into instrumented functions, causing the accesses of the
   __no_kcsan function to be instrumented.

Older versions of Clang do not satisfy #3. The latest GCC currently
doesn't support at least #1, #3, and #7.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com
Link: https://lkml.kernel.org/r/20200521142047.169334-7-elver@google.com
4 years agokcsan: Avoid inserting __tsan_func_entry/exit if possible
Marco Elver [Thu, 21 May 2020 14:20:38 +0000 (16:20 +0200)]
kcsan: Avoid inserting __tsan_func_entry/exit if possible

To avoid inserting __tsan_func_{entry,exit}, add option if supported by
compiler. Currently only Clang can be told to not emit calls to these
functions. It is safe to not emit these, since KCSAN does not rely on
them.

Note that, if we disable __tsan_func_{entry,exit}(), we need to disable
tail-call optimization in sanitized compilation units, as otherwise we
may skip frames in the stack trace; in particular when the tail called
function is one of the KCSAN's runtime functions, and a report is
generated, we might miss the function where the actual access occurred.

Since __tsan_func_{entry,exit}() insertion effectively disabled
tail-call optimization, there should be no observable change.

This was caught and confirmed with kcsan-test & UNWINDER_ORC.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200521142047.169334-3-elver@google.com
4 years agoubsan, kcsan: Don't combine sanitizer with kcov on clang
Arnd Bergmann [Thu, 21 May 2020 14:20:37 +0000 (16:20 +0200)]
ubsan, kcsan: Don't combine sanitizer with kcov on clang

Clang does not allow -fsanitize-coverage=trace-{pc,cmp} together
with -fsanitize=bounds or with ubsan:

  clang: error: argument unused during compilation: '-fsanitize-coverage=trace-pc' [-Werror,-Wunused-command-line-argument]
  clang: error: argument unused during compilation: '-fsanitize-coverage=trace-cmp' [-Werror,-Wunused-command-line-argument]

To avoid the warning, check whether clang can handle this correctly or
disallow ubsan and kcsan when kcov is enabled.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marco Elver <elver@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://bugs.llvm.org/show_bug.cgi?id=45831
Link: https://lore.kernel.org/lkml/20200505142341.1096942-1-arnd@arndb.de
Link: https://lkml.kernel.org/r/20200521142047.169334-2-elver@google.com
4 years agoRebase locking/kcsan to locking/urgent
Thomas Gleixner [Thu, 11 Jun 2020 18:02:46 +0000 (20:02 +0200)]
Rebase locking/kcsan to locking/urgent

Merge the state of the locking kcsan branch before the read/write_once()
and the atomics modifications got merged.

Squash the fallout of the rebase on top of the read/write once and atomic
fallback work into the merge. The history of the original branch is
preserved in tag locking-kcsan-2020-06-02.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
4 years agoMerge tag 'kvmarm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmar...
Paolo Bonzini [Thu, 11 Jun 2020 18:02:32 +0000 (14:02 -0400)]
Merge tag 'kvmarm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for Linux 5.8, take #1

* 32bit VM fixes:
  - Fix embarrassing mapping issue between AArch32 CSSELR and AArch64
    ACTLR
  - Add ACTLR2 support for AArch32
  - Get rid of the useless ACTLR_EL1 save/restore
  - Fix CP14/15 accesses for AArch32 guests on BE hosts
  - Ensure that we don't lose any state when injecting a 32bit
    exception when running on a VHE host

* 64bit VM fixes:
  - Fix PtrAuth host saving happening in preemptible contexts
  - Optimize PtrAuth lazy enable
  - Drop vcpu to cpu context pointer
  - Fix sparse warnings for HYP per-CPU accesses

4 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Thu, 11 Jun 2020 18:02:13 +0000 (11:02 -0700)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "A number of fixes to the omap and nitrox drivers"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: cavium/nitrox - Fix 'nitrox_get_first_device()' when ndevlist is fully iterated
  crypto: omap-sham - add proper load balancing support for multicore
  crypto: omap-aes - prevent unregistering algorithms twice
  crypto: omap-sham - fix very small data size handling
  crypto: omap-sham - huge buffer access fixes
  crypto: omap-crypto - fix userspace copied buffer access
  crypto: omap-sham - force kernel driver usage for sha algos
  crypto: omap-aes - avoid spamming console with self tests

4 years agoKVM: x86: do not pass poisoned hva to __kvm_set_memory_region
Paolo Bonzini [Thu, 11 Jun 2020 18:01:51 +0000 (14:01 -0400)]
KVM: x86: do not pass poisoned hva to __kvm_set_memory_region

__kvm_set_memory_region does not use the hva at all, so trying to
catch use-after-delete is pointless and, worse, it fails access_ok
now that we apply it to all memslots including private kernel ones.
This fixes an AVIC regression.

Fixes: 09d952c971a5 ("KVM: check userspace_addr for all memslots")
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoMerge tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Thu, 11 Jun 2020 17:48:12 +0000 (10:48 -0700)]
Merge tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull DAX updates part three from Darrick Wong:
 "Now that the xfs changes have landed, this third piece changes the
  FS_XFLAG_DAX ioctl code in xfs to request that the inode be reloaded
  after the last program closes the file, if doing so would make a S_DAX
  change happen. The goal here is to make dax access mode switching
  quicker when possible.

  Summary:

   - Teach XFS to ask the VFS to drop an inode if the administrator
     changes the FS_XFLAG_DAX inode flag such that the S_DAX state would
     change. This can result in files changing access modes without
     requiring an unmount cycle"

* tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  fs/xfs: Update xfs_ioctl_setattr_dax_invalidate()
  fs/xfs: Combine xfs_diflags_to_linux() and xfs_diflags_to_iflags()
  fs/xfs: Create function xfs_inode_should_enable_dax()
  fs/xfs: Make DAX mount option a tri-state
  fs/xfs: Change XFS_MOUNT_DAX to XFS_MOUNT_DAX_ALWAYS
  fs/xfs: Remove unnecessary initialization of i_rwsem

4 years agoNFS: Fix direct WRITE throughput regression
Chuck Lever [Fri, 29 May 2020 18:14:40 +0000 (14:14 -0400)]
NFS: Fix direct WRITE throughput regression

I measured a 50% throughput regression for large direct writes.

The observed on-the-wire behavior is that the client sends every
NFS WRITE twice: once as an UNSTABLE WRITE plus a COMMIT, and once
as a FILE_SYNC WRITE.

This is because the nfs_write_match_verf() check in
nfs_direct_commit_complete() fails for every WRITE.

Buffered writes use nfs_write_completion(), which sets req->wb_verf
correctly. Direct writes use nfs_direct_write_completion(), which
does not set req->wb_verf at all. This leaves req->wb_verf set to
all zeroes for every direct WRITE, and thus
nfs_direct_commit_complete() always sets NFS_ODIRECT_RESCHED_WRITES.

This fix appears to restore nearly all of the lost performance.
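
The failure mode described above can be modelled in a few lines: a
request whose verifier was never recorded stays all zeroes, so it cannot
match the verifier returned with the COMMIT, and the write is sent
again. The structure and function names below are hypothetical and only
sketch the comparison, not the actual NFS client code.

```c
#include <stdbool.h>
#include <string.h>

#define VERF_SIZE 8	/* NFS write verifiers are 8 opaque bytes */

struct write_verf {
	unsigned char data[VERF_SIZE];
};

/* Hypothetical per-request state; only the verifier matters here. */
struct write_req {
	struct write_verf verf;
};

/* True when the verifier recorded at WRITE time matches the COMMIT reply. */
static bool verf_matches(const struct write_req *req,
			 const struct write_verf *commit_verf)
{
	return memcmp(req->verf.data, commit_verf->data, VERF_SIZE) == 0;
}

/* Completion step that records the WRITE reply's verifier on the request. */
static void write_done_record_verf(struct write_req *req,
				   const struct write_verf *reply_verf)
{
	req->verf = *reply_verf;
}
```

In this model, skipping write_done_record_verf() on the direct write
path leaves verf_matches() false for every request, which corresponds
to the doubled WRITE traffic observed on the wire.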

Fixes: 1f28476dcb98 ("NFS: Fix O_DIRECT commit verifier handling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: rpc_xprt lifetime events should record xprt->state
Chuck Lever [Mon, 18 May 2020 14:13:02 +0000 (10:13 -0400)]
SUNRPC: rpc_xprt lifetime events should record xprt->state

Help troubleshoot the logic that uses these flags.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoxprtrdma: Make xprt_rdma_slot_table_entries static
Zou Wei [Thu, 23 Apr 2020 07:10:02 +0000 (15:10 +0800)]
xprtrdma: Make xprt_rdma_slot_table_entries static

Fix the following sparse warning:

net/sunrpc/xprtrdma/transport.c:71:14: warning: symbol 'xprt_rdma_slot_table_entries'
was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agonfs: set invalid blocks after NFSv4 writes
Zheng Bin [Thu, 21 May 2020 09:17:21 +0000 (17:17 +0800)]
nfs: set invalid blocks after NFSv4 writes

Use the following commands to test NFSv4 (file1M is a 1MB file):
mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1:/dir1 /mnt
cp file1M /mnt
du -h /mnt/file1M  -->0 within 60s, then 1M

When the write is done (cp file1M /mnt), this call chain runs:
nfs_writeback_done
  nfs4_write_done
    nfs4_write_done_cb
      nfs_writeback_update_inode
        nfs_post_op_update_inode_force_wcc_locked(change, ctime, mtime
nfs_post_op_update_inode_force_wcc_locked
   nfs_set_cache_invalid
   nfs_refresh_inode_locked
     nfs_update_inode

The nfsd write response contains change, ctime, and mtime, so the flag is
cleared after nfs_update_inode. However, the write response does not
contain space_used, and the previous open response contained space_used
with a value of 0, so inode->i_blocks is still 0.

nfs_getattr  -->called by "du -h"
  do_update |= force_sync || nfs_attribute_cache_expired -->false in 60s
  cache_validity = READ_ONCE(NFS_I(inode)->cache_validity)
  do_update |= cache_validity & (NFS_INO_INVALID_ATTR    -->false
  if (do_update) {
        __nfs_revalidate_inode
  }

Within 60s, no getattr request is sent to nfsd, so "du -h /mnt/file1M"
reports 0.

Add an NFS_INO_INVALID_BLOCKS flag and set it when an NFSv4 write is done.
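
The getattr decision traced above can be reduced to a bitmask test. The
sketch below is a simplified model under stated assumptions: the flag
names are shortened, the bit values are illustrative and not taken from
the kernel headers, and getattr_needs_revalidate() is a hypothetical
helper.

```c
#include <stdbool.h>

/* Illustrative bit values only; not taken from the kernel headers. */
#define INO_INVALID_ATTR	(1UL << 0)
#define INO_INVALID_BLOCKS	(1UL << 1)

/*
 * Simplified model of the getattr decision: revalidate when the
 * attribute cache has expired or when a relevant invalid bit is set.
 */
static bool getattr_needs_revalidate(unsigned long cache_validity,
				     bool cache_expired,
				     unsigned long relevant_bits)
{
	return cache_expired || (cache_validity & relevant_bits) != 0;
}
```

With no bit left set after the write and an unexpired cache, the model
returns false, which matches the stale "du -h" result. Setting a blocks
bit at write completion makes the same call return true.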

Fixes: 16e143751727 ("NFS: More fine grained attribute tracking")
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoNFS: remove redundant initialization of variable result
Colin Ian King [Wed, 27 May 2020 12:56:11 +0000 (13:56 +0100)]
NFS: remove redundant initialization of variable result

The variable result is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agosunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
Xiongfeng Wang [Fri, 8 May 2020 01:33:00 +0000 (09:33 +0800)]
sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs

When I cat parameter
'/sys/module/sunrpc/parameters/auth_hashtable_size', it displays as
follows. It is better to add a newline for easy reading.

[root@hulk-202 ~]# cat /sys/module/sunrpc/parameters/auth_hashtable_size
16[root@hulk-202 ~]#
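
The transcript above shows the prompt glued to the value because the
output lacks a trailing newline. A minimal sketch of the corrected
formatting follows. show_hashtable_size() is a hypothetical userspace
helper, not the actual sunrpc parameter getter, and treating the stored
value as a power-of-two exponent is an assumption made for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical "show" helper in the style of a module parameter getter:
 * formats the table size into buf and returns the number of characters
 * written. The trailing newline keeps the shell prompt on its own line.
 */
static int show_hashtable_size(char *buf, size_t len, unsigned int nbits)
{
	return snprintf(buf, len, "%u\n", 1U << nbits);
}
```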

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoNFS: Add a tracepoint in nfs_set_pgio_error()
Chuck Lever [Tue, 12 May 2020 21:14:11 +0000 (17:14 -0400)]
NFS: Add a tracepoint in nfs_set_pgio_error()

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoNFS: Trace short NFS READs
Chuck Lever [Tue, 12 May 2020 21:14:05 +0000 (17:14 -0400)]
NFS: Trace short NFS READs

A short read can generate an -EIO error without there being an error
on the wire. This tracepoint acts as an eyecatcher when there is no
obvious I/O error.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoNFS: nfs_xdr_status should record the procedure name
Chuck Lever [Tue, 12 May 2020 21:14:00 +0000 (17:14 -0400)]
NFS: nfs_xdr_status should record the procedure name

When sunrpc trace points are not enabled, the recorded task ID
information alone is not helpful.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: Set SOFTCONN when destroying GSS contexts
Chuck Lever [Tue, 12 May 2020 21:13:55 +0000 (17:13 -0400)]
SUNRPC: Set SOFTCONN when destroying GSS contexts

Move the RPC_TASK_SOFTCONN flag into rpc_call_null_helper(). The
only minor behavior change is that it is now also set when
destroying GSS contexts.

This gives a better guarantee that gss_send_destroy_context() will
not hang for long if a connection cannot be established.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
Chuck Lever [Tue, 12 May 2020 21:13:50 +0000 (17:13 -0400)]
SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT

Clean up.

All of rpc_call_null_helper() call sites assert RPC_TASK_SOFT, so
move that setting into rpc_call_null_helper() itself.
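
The clean-up pattern here is to OR the flags that every caller passes
into the helper itself. The sketch below illustrates only that pattern:
the flag names are shortened, the bit values are illustrative rather
than the kernel's, and null_call_flags() is a hypothetical function.

```c
/* Illustrative flag bits; the values are not taken from the kernel. */
#define TASK_ASYNC	(1U << 0)
#define TASK_SOFT	(1U << 1)
#define TASK_NULLCREDS	(1U << 2)

/*
 * Hypothetical helper modelled on the clean-up: flags that every call
 * site used to pass are ORed in once, so callers supply only the flags
 * that are specific to them.
 */
static unsigned int null_call_flags(unsigned int caller_flags)
{
	return caller_flags | TASK_SOFT | TASK_NULLCREDS;
}
```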

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
Chuck Lever [Tue, 12 May 2020 21:13:44 +0000 (17:13 -0400)]
SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS

Clean up.

Commit a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with
'struct cred'.") made rpc_call_null_helper() set RPC_TASK_NULLCREDS
unconditionally. Therefore there's no need for
rpc_call_null_helper()'s call sites to set RPC_TASK_NULLCREDS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: trace RPC client lifetime events
Chuck Lever [Tue, 12 May 2020 21:13:39 +0000 (17:13 -0400)]
SUNRPC: trace RPC client lifetime events

The "create" tracepoint records parts of the rpc_create arguments,
and the shutdown tracepoint records when the rpc_clnt is about to
signal pending tasks and destroy auths.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: Trace transport lifetime events
Chuck Lever [Tue, 12 May 2020 21:13:34 +0000 (17:13 -0400)]
SUNRPC: Trace transport lifetime events

Refactor: Hoist create/destroy/disconnect tracepoints out of
xprtrdma and into the generic RPC client. Some benefits include:

- Enable tracing of xprt lifetime events for the socket transport
  types

- Expose the different types of disconnect to help run down
  issues with lingering connections

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: Split the xdr_buf event class
Chuck Lever [Tue, 12 May 2020 21:13:28 +0000 (17:13 -0400)]
SUNRPC: Split the xdr_buf event class

To help tie the recorded xdr_buf to a particular RPC transaction,
the client side version of this class should display task ID
information and the server side one should show the request's XID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: Add tracepoint to rpc_call_rpcerror()
Chuck Lever [Tue, 12 May 2020 21:13:23 +0000 (17:13 -0400)]
SUNRPC: Add tracepoint to rpc_call_rpcerror()

Add a tracepoint in another common exit point for failing RPCs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: Update the RPC_SHOW_SOCKET() macro
Chuck Lever [Tue, 12 May 2020 21:13:18 +0000 (17:13 -0400)]
SUNRPC: Update the RPC_SHOW_SOCKET() macro

Clean up: remove unnecessary commas, and fix a white-space nit.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: Update the rpc_show_task_flags() macro
Chuck Lever [Tue, 12 May 2020 21:13:12 +0000 (17:13 -0400)]
SUNRPC: Update the rpc_show_task_flags() macro

Recent additions to the RPC_TASK flags neglected to update
the tracepoint ENUM definitions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: Trace GSS context lifetimes
Chuck Lever [Tue, 12 May 2020 21:13:07 +0000 (17:13 -0400)]
SUNRPC: Trace GSS context lifetimes

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoSUNRPC: receive buffer size estimation values almost never change
Chuck Lever [Tue, 12 May 2020 21:13:01 +0000 (17:13 -0400)]
SUNRPC: receive buffer size estimation values almost never change

Avoid unnecessary cache sloshing by placing the buffer size
estimation update logic behind an atomic bit flag.

The size of GSS information included in each wrapped Reply does
not change during the lifetime of a GSS context. Therefore, the
au_rslack and au_ralign fields need to be updated only once after
establishing a fresh GSS credential.

Thus a slack size update must occur after a cred is created,
duplicated, renewed, or expires. I'm not sure I have this exactly
right. A trace point is introduced to track updates to these
variables to enable troubleshooting the problem if I missed a spot.
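
The idea of gating a rarely needed update behind an atomic bit flag can
be sketched in userspace with C11 atomics. Everything below is an
assumption made for illustration: the structure, the flag name, and both
functions are hypothetical, and the kernel uses its own bit operations
rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define AUTH_UPDATE_SLACK	(1UL << 0)	/* illustrative bit */

struct auth_sketch {
	atomic_ulong flags;	/* written only when a cred changes */
	unsigned int rslack;	/* reply slack estimate */
	unsigned int updates;	/* how often the estimate was rewritten */
};

/* Called when a credential is created or renewed: request one update. */
static void auth_request_slack_update(struct auth_sketch *auth)
{
	atomic_fetch_or(&auth->flags, AUTH_UPDATE_SLACK);
}

/*
 * Called on every reply. The common case is a read-only test of the
 * flag; the estimate is rewritten only by the caller that clears the bit.
 */
static bool auth_maybe_update_slack(struct auth_sketch *auth,
				    unsigned int new_rslack)
{
	if (!(atomic_load(&auth->flags) & AUTH_UPDATE_SLACK))
		return false;
	if (!(atomic_fetch_and(&auth->flags, ~AUTH_UPDATE_SLACK) &
	      AUTH_UPDATE_SLACK))
		return false;	/* another thread cleared the bit first */
	auth->rslack = new_rslack;
	auth->updates++;
	return true;
}
```

Because the per-reply path normally only reads the flag, the cache line
holding the estimate is not dirtied on every reply in this model.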

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years agoMerge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Thu, 11 Jun 2020 17:33:13 +0000 (10:33 -0700)]
Merge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Highlights:

   - Keep nfsd clients from unnecessarily breaking their own
     delegations.

     Note this requires a small kthreadd addition. The result is Tejun
     Heo's suggestion (see link), and he was OK with this going through
     my tree.

   - Patch nfsd/clients/ to display filenames, and to fix byte-order
     when displaying stateids.

   - fix a module loading/unloading bug, from Neil Brown.

   - A big series from Chuck Lever with RPC/RDMA and tracing
     improvements, which also lays some groundwork for RPC-over-TLS"

Link: https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com
* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
  sunrpc: use kmemdup_nul() in gssp_stringify()
  nfsd: safer handling of corrupted c_type
  nfsd4: make drc_slab global, not per-net
  SUNRPC: Remove unreachable error condition in rpcb_getport_async()
  nfsd: Fix svc_xprt refcnt leak when setup callback client failed
  sunrpc: clean up properly in gss_mech_unregister()
  sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
  sunrpc: check that domain table is empty at module unload.
  NFSD: Fix improperly-formatted Doxygen comments
  NFSD: Squash an annoying compiler warning
  SUNRPC: Clean up request deferral tracepoints
  NFSD: Add tracepoints for monitoring NFSD callbacks
  NFSD: Add tracepoints to the NFSD state management code
  NFSD: Add tracepoints to NFSD's duplicate reply cache
  SUNRPC: svc_show_status() macro should have enum definitions
  SUNRPC: Restructure svc_udp_recvfrom()
  SUNRPC: Refactor svc_recvfrom()
  SUNRPC: Clean up svc_release_skb() functions
  SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
  SUNRPC: Replace dprintk() call sites in TCP receive path
  ...

4 years agomedia: rkvdec: Fix H264 scaling list order
Jonas Karlman [Fri, 22 May 2020 20:21:33 +0000 (22:21 +0200)]
media: rkvdec: Fix H264 scaling list order

The Rockchip Video Decoder driver is expecting that the values in a
scaling list are in zig-zag order and applies the inverse scanning process
to get the values in matrix order.

Commit 0b0393d59eb4 ("media: uapi: h264: clarify expected
scaling_list_4x4/8x8 order") clarified that the values in the scaling list
should already be in matrix order.

Fix this by removing the reordering and using two memcpy() calls instead.
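
A minimal sketch of the "two memcpy" approach, assuming the control data
and the hardware table use identical array shapes. The struct names, the
function name and the array dimensions below are illustrative
assumptions, not the driver's definitions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts; names and dimensions are assumed for illustration. */
struct ctrl_scaling_sketch {
	uint8_t scaling_list_4x4[6][16];
	uint8_t scaling_list_8x8[6][64];
};

struct hw_scaling_sketch {
	uint8_t scaling_list_4x4[6][16];
	uint8_t scaling_list_8x8[6][64];
};

/*
 * The values already arrive in matrix order, so no inverse zig-zag scan
 * is applied: each table is copied verbatim.
 */
static void copy_scaling_lists(struct hw_scaling_sketch *hw,
			       const struct ctrl_scaling_sketch *ctrl)
{
	_Static_assert(sizeof(hw->scaling_list_4x4) ==
		       sizeof(ctrl->scaling_list_4x4), "4x4 size mismatch");
	_Static_assert(sizeof(hw->scaling_list_8x8) ==
		       sizeof(ctrl->scaling_list_8x8), "8x8 size mismatch");

	memcpy(hw->scaling_list_4x4, ctrl->scaling_list_4x4,
	       sizeof(hw->scaling_list_4x4));
	memcpy(hw->scaling_list_8x8, ctrl->scaling_list_8x8,
	       sizeof(hw->scaling_list_8x8));
}
```

The compile-time size checks are a design choice of this sketch: they
make a layout mismatch between the two sides a build failure rather than
a silent overrun.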

Fixes: cd33c830448b ("media: rkvdec: Add the rkvdec driver")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Tested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com>
[hverkuil-cisco@xs4all.nl: rkvdec_scaling_matrix -> rkvdec_h264_scaling_list]
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
4 years agomedia: v4l2-ctrls: Unset correct HEVC loop filter flag
Jonas Karlman [Tue, 26 May 2020 22:25:15 +0000 (00:25 +0200)]
media: v4l2-ctrls: Unset correct HEVC loop filter flag

The wrong loop filter flag is unset when the tiles enabled flag is not set,
which causes HEVC decoding issues with the Rockchip Video Decoder.

Fix this by unsetting the loop filter across tiles enabled flag instead of
the pps loop filter across slices enabled flag when tiles are disabled.
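
The corrected logic can be sketched as below. The macro names, their bit
positions and the function name are hypothetical stand-ins chosen for
this example; the real flag definitions live in the kernel's HEVC
control header.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration only; not the real V4L2 values. */
#define PPS_FLAG_TILES_ENABLED				(1ULL << 0)
#define PPS_FLAG_LOOP_FILTER_ACROSS_SLICES_ENABLED	(1ULL << 1)
#define PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED	(1ULL << 2)

/*
 * When tiles are disabled, the "loop filter across tiles" flag has no
 * meaning and is cleared. The "across slices" flag is left untouched,
 * which is the behaviour the fix restores.
 */
static uint64_t pps_fixup_flags(uint64_t flags)
{
	if (!(flags & PPS_FLAG_TILES_ENABLED))
		flags &= ~PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED;

	return flags;
}
```

Under the bug described above, the mask in the if-body would have named
the slices flag instead, silently dropping a setting that is still valid
without tiles.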

Fixes: 256fa3920874 ("media: v4l: Add definitions for HEVC stateless decoding")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
4 years agomedia: videobuf2-dma-contig: fix bad kfree in vb2_dma_contig_clear_max_seg_size
Tomi Valkeinen [Wed, 27 May 2020 08:23:34 +0000 (10:23 +0200)]
media: videobuf2-dma-contig: fix bad kfree in vb2_dma_contig_clear_max_seg_size

Commit 9495b7e92f716ab2bd6814fab5e97ab4a39adfdd ("driver core: platform:
Initialize dma_parms for platform devices") in v5.7-rc5 causes
vb2_dma_contig_clear_max_seg_size() to kfree memory that was not
allocated by vb2_dma_contig_set_max_seg_size().

The assumption in vb2_dma_contig_set_max_seg_size() seems to be that
dev->dma_parms is always NULL when the driver is probed, and the case
where dev->dma_parms has been initialized by someone other than the driver
(by calling vb2_dma_contig_set_max_seg_size) will cause a failure.

All the current users of these functions are platform devices, which now
always have dma_parms set by the driver core. To fix the issue for v5.7,
make vb2_dma_contig_set_max_seg_size() return an error if dma_parms is
NULL to be on the safe side, and remove the kfree code from
vb2_dma_contig_clear_max_seg_size().

For v5.8 we should remove the two functions and move the
dma_set_max_seg_size() calls into the drivers.
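
The ownership rule behind the fix can be sketched in user-space C: the
helper only uses a dma_parms it was given, and the clear helper frees
nothing because it never allocated anything. The struct and function
names are hypothetical, and the -ENODEV return value is an assumption of
this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and its DMA parameters. */
struct dma_parms_sketch {
	unsigned int max_segment_size;
};

struct device_sketch {
	struct dma_parms_sketch *dma_parms;	/* owned by whoever set it up */
};

/* Refuse to proceed rather than allocate: dma_parms belongs to the core. */
static int sketch_set_max_seg_size(struct device_sketch *dev,
				   unsigned int size)
{
	if (!dev->dma_parms)
		return -ENODEV;	/* assumed error code */

	dev->dma_parms->max_segment_size = size;
	return 0;
}

/* Nothing was allocated by the set helper, so nothing is freed here. */
static void sketch_clear_max_seg_size(struct device_sketch *dev)
{
	(void)dev;
}
```

Keeping allocation and release with the same owner is what removes the
bad kfree: the clear path can no longer free memory it did not allocate.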

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Fixes: 9495b7e92f71 ("driver core: platform: Initialize dma_parms for platform devices")
Cc: stable@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
4 years agomedia: v4l2-subdev.rst: correct information about v4l2 events
Michael Rodin [Wed, 27 May 2020 16:21:32 +0000 (18:21 +0200)]
media: v4l2-subdev.rst: correct information about v4l2 events

Remove description of non-existing v4l2_subdev.nevents and replace the
undefined flag V4L2_SUBDEV_USES_EVENTS by the correct flag
V4L2_SUBDEV_FL_HAS_EVENTS, which is already documented in v4l2_subdev.flags.

Fixes: commit 02adb1cc765b ("[media] v4l: subdev: Events support")
Signed-off-by: Michael Rodin <mrodin@de.adit-jv.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
4 years agomedia: s5p-mfc: Properly handle dma_parms for the allocated devices
Marek Szyprowski [Thu, 28 May 2020 14:03:26 +0000 (16:03 +0200)]
media: s5p-mfc: Properly handle dma_parms for the allocated devices

Commit 9495b7e92f71 ("driver core: platform: Initialize dma_parms for
platform devices") in v5.7-rc5 added allocation of dma_parms structure to
all platform devices. vb2_dma_contig_set_max_seg_size() was then changed
to stop allocating the dma_parms structure and to rely on the one allocated
by the device core. Also allocate the needed structure for the
devices created for the two MFC device memory ports.

Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Fixes: 9495b7e92f71 ("driver core: platform: Initialize dma_parms for platform devices")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
4 years agomedia: medium: cec: Make MEDIA_CEC_SUPPORT default to n if !MEDIA_SUPPORT
Geert Uytterhoeven [Thu, 4 Jun 2020 09:39:53 +0000 (11:39 +0200)]
media: medium: cec: Make MEDIA_CEC_SUPPORT default to n if !MEDIA_SUPPORT

Recently, MEDIA_CEC_SUPPORT became independent of MEDIA_SUPPORT.
However, if MEDIA_SUPPORT is not enabled, MEDIA_SUPPORT_FILTER is not
defined, and MEDIA_CEC_SUPPORT is thus enabled by default, which is not
desirable.

Fix this by adding a dependency on MEDIA_SUPPORT to the default
configuration.

Fixes: 46d2a3b964ddbe63 ("media: place CEC menu before MEDIA_SUPPORT")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>