platform/kernel/linux-rpi.git
3 years agoperf session: Correct buffer copying when peeking events
Leo Yan [Sat, 5 Jun 2021 05:29:57 +0000 (13:29 +0800)]
perf session: Correct buffer copying when peeking events

[ Upstream commit 197eecb6ecae0b04bd694432f640ff75597fed9c ]

When peeking an event, it has a short path and a long path.  The short
path uses the session pointer "one_mmap_addr" to directly fetch the
event; and the long path needs to read out the event header and the
following event data from file and fill into the buffer pointer passed
through the argument "buf".

The issue is in the long path that it copies the event header and event
data into the same destination address which pointer "buf", this means
the event header is overwritten.  We are just lucky to run into the
short path in most cases, so we don't hit the issue in the long path.

This patch adds the offset "hdr_sz" to the pointer "buf" when copying
the event data, so that it can reserve the event header which can be
used properly by its caller.

Fixes: 5a52f33adf02 ("perf session: Add perf_session__peek_event()")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoNFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()
Trond Myklebust [Tue, 1 Jun 2021 15:10:05 +0000 (11:10 -0400)]
NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()

[ Upstream commit dfe1fe75e00e4c724ede7b9e593f6f680e446c5f ]

If the inode is being evicted, but has to return a delegation first,
then it can cause a deadlock in the corner case where the server reboots
before the delegreturn completes, but while the call to iget5_locked() in
nfs4_opendata_get_inode() is waiting for the inode free to complete.
Since the open call still holds a session slot, the reboot recovery
cannot proceed.

In order to break the logjam, we can turn the delegation return into a
privileged operation for the case where we're evicting the inode. We
know that in that case, there can be no other state recovery operation
that conflicts.

Reported-by: zhangxiaoxu (A) <zhangxiaoxu5@huawei.com>
Fixes: 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoNFS: Fix a potential NULL dereference in nfs_get_client()
Dan Carpenter [Thu, 3 Jun 2021 12:37:53 +0000 (15:37 +0300)]
NFS: Fix a potential NULL dereference in nfs_get_client()

[ Upstream commit 09226e8303beeec10f2ff844d2e46d1371dc58e0 ]

None of the callers are expecting NULL returns from nfs_get_client() so
this code will lead to an Oops.  It's better to return an error
pointer.  I expect that this is dead code so hopefully no one is
affected.

Fixes: 31434f496abb ("nfs: check hostname in nfs_get_client")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoIB/mlx5: Fix initializing CQ fragments buffer
Alaa Hleihel [Thu, 10 Jun 2021 07:34:27 +0000 (10:34 +0300)]
IB/mlx5: Fix initializing CQ fragments buffer

commit 2ba0aa2feebda680ecfc3c552e867cf4d1b05a3a upstream.

The function init_cq_frag_buf() can be called to initialize the current CQ
fragments buffer cq->buf, or the temporary cq->resize_buf that is filled
during CQ resize operation.

However, the offending commit started to use function get_cqe() for
getting the CQEs, the issue with this change is that get_cqe() always
returns CQEs from cq->buf, which leads us to initialize the wrong buffer,
and in case of enlarging the CQ we try to access elements beyond the size
of the current cq->buf and eventually hit a kernel panic.

 [exception RIP: init_cq_frag_buf+103]
  [ffff9f799ddcbcd8] mlx5_ib_resize_cq at ffffffffc0835d60 [mlx5_ib]
  [ffff9f799ddcbdb0] ib_resize_cq at ffffffffc05270df [ib_core]
  [ffff9f799ddcbdc0] llt_rdma_setup_qp at ffffffffc0a6a712 [llt]
  [ffff9f799ddcbe10] llt_rdma_cc_event_action at ffffffffc0a6b411 [llt]
  [ffff9f799ddcbe98] llt_rdma_client_conn_thread at ffffffffc0a6bb75 [llt]
  [ffff9f799ddcbec8] kthread at ffffffffa66c5da1
  [ffff9f799ddcbf50] ret_from_fork_nospec_begin at ffffffffa6d95ddd

Fix it by getting the needed CQE by calling mlx5_frag_buf_get_wqe() that
takes the correct source buffer as a parameter.

Fixes: 388ca8be0037 ("IB/mlx5: Implement fragmented completion queue (CQ)")
Link: https://lore.kernel.org/r/90a0e8c924093cfa50a482880ad7e7edb73dc19a.1623309971.git.leonro@nvidia.com
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoKVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message
Sean Christopherson [Mon, 7 Jun 2021 17:57:48 +0000 (10:57 -0700)]
KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

commit f31500b0d437a2464ca5972d8f5439e156b74960 upstream.

Use the __string() machinery provided by the tracing subystem to make a
copy of the string literals consumed by the "nested VM-Enter failed"
tracepoint.  A complete copy is necessary to ensure that the tracepoint
can't outlive the data/memory it consumes and deference stale memory.

Because the tracepoint itself is defined by kvm, if kvm-intel and/or
kvm-amd are built as modules, the memory holding the string literals
defined by the vendor modules will be freed when the module is unloaded,
whereas the tracepoint and its data in the ring buffer will live until
kvm is unloaded (or "indefinitely" if kvm is built-in).

This bug has existed since the tracepoint was added, but was recently
exposed by a new check in tracing to detect exactly this type of bug.

  fmt: '%s%s
  ' current_buffer: ' vmx_dirty_log_t-140127  [003] ....  kvm_nested_vmenter_failed: '
  WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
  CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
  Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
  RIP: 0010:trace_check_vprintf+0x3be/0x3e0
  Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
  RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
  RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
  RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
  R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
  R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
  FS:  00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
  Call Trace:
   trace_event_printf+0x5e/0x80
   trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
   print_trace_line+0x1dd/0x4e0
   s_show+0x45/0x150
   seq_read_iter+0x2d5/0x4c0
   seq_read+0x106/0x150
   vfs_read+0x98/0x180
   ksys_read+0x5f/0xe0
   do_syscall_64+0x40/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: 380e0055bc7e ("KVM: nVMX: trace nested VM-Enter failures detected by H/W")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Message-Id: <20210607175748.674002-1-seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agox86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs
CodyYao-oc [Mon, 7 Jun 2021 02:53:35 +0000 (10:53 +0800)]
x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs

commit a8383dfb2138742a1bb77b481ada047aededa2ba upstream.

The following commit:

   3a4ac121c2ca ("x86/perf: Add hardware performance events support for Zhaoxin CPU.")

Got the old-style NMI watchdog logic wrong and broke it for basically every
Intel CPU where it was active. Which is only truly old CPUs, so few people noticed.

On CPUs with perf events support we turn off the old-style NMI watchdog, so it
was pretty pointless to add the logic for X86_VENDOR_ZHAOXIN to begin with ... :-/

Anyway, the fix is to restore the old logic and add a 'break'.

[ mingo: Wrote a new changelog. ]

Fixes: 3a4ac121c2ca ("x86/perf: Add hardware performance events support for Zhaoxin CPU.")
Signed-off-by: CodyYao-oc <CodyYao-oc@zhaoxin.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210607025335.9643-1-CodyYao-oc@zhaoxin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agosched/fair: Fix util_est UTIL_AVG_UNCHANGED handling
Dietmar Eggemann [Wed, 2 Jun 2021 14:58:08 +0000 (16:58 +0200)]
sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling

commit 68d7a190682aa4eb02db477328088ebad15acc83 upstream.

The util_est internal UTIL_AVG_UNCHANGED flag which is used to prevent
unnecessary util_est updates uses the LSB of util_est.enqueued. It is
exposed via _task_util_est() (and task_util_est()).

Commit 92a801e5d5b7 ("sched/fair: Mask UTIL_AVG_UNCHANGED usages")
mentions that the LSB is lost for util_est resolution but
find_energy_efficient_cpu() checks if task_util_est() returns 0 to
return prev_cpu early.

_task_util_est() returns the max value of util_est.ewma and
util_est.enqueued or'ed w/ UTIL_AVG_UNCHANGED.
So task_util_est() returning the max of task_util() and
_task_util_est() will never return 0 under the default
SCHED_FEAT(UTIL_EST, true).

To fix this use the MSB of util_est.enqueued instead and keep the flag
util_est internal, i.e. don't export it via _task_util_est().

The maximal possible util_avg value for a task is 1024 so the MSB of
'unsigned int util_est.enqueued' isn't used to store a util value.

As a caveat the code behind the util_est_se trace point has to filter
UTIL_AVG_UNCHANGED to see the real util_est.enqueued value which should
be easy to do.

This also fixes an issue report by Xuewen Yan that util_est_update()
only used UTIL_AVG_UNCHANGED for the subtrahend of the equation:

  last_enqueued_diff = ue.enqueued - (task_util() | UTIL_AVG_UNCHANGED)

Fixes: b89997aa88f0b sched/pelt: Fix task util_est update filtering
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-by: Vincent Donnefort <vincent.donnefort@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20210602145808.1562603-1-dietmar.eggemann@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agosched/fair: Make sure to update tg contrib for blocked load
Vincent Guittot [Thu, 27 May 2021 12:29:16 +0000 (14:29 +0200)]
sched/fair: Make sure to update tg contrib for blocked load

commit 02da26ad5ed6ea8680e5d01f20661439611ed776 upstream.

During the update of fair blocked load (__update_blocked_fair()), we
update the contribution of the cfs in tg->load_avg if cfs_rq's pelt
has decayed.  Nevertheless, the pelt values of a cfs_rq could have
been recently updated while propagating the change of a child. In this
case, cfs_rq's pelt will not decayed because it has already been
updated and we don't update tg->load_avg.

__update_blocked_fair
  ...
  for_each_leaf_cfs_rq_safe: child cfs_rq
    update cfs_rq_load_avg() for child cfs_rq
    ...
    update_load_avg(cfs_rq_of(se), se, 0)
      ...
      update cfs_rq_load_avg() for parent cfs_rq
-propagation of child's load makes parent cfs_rq->load_sum
 becoming null
        -UPDATE_TG is not set so it doesn't update parent
 cfs_rq->tg_load_avg_contrib
  ..
  for_each_leaf_cfs_rq_safe: parent cfs_rq
    update cfs_rq_load_avg() for parent cfs_rq
      - nothing to do because parent cfs_rq has already been updated
recently so cfs_rq->tg_load_avg_contrib is not updated
    ...
    parent cfs_rq is decayed
      list_del_leaf_cfs_rq parent cfs_rq
  - but it still contibutes to tg->load_avg

we must set UPDATE_TG flags when propagting pending load to the parent

Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Odin Ugedal <odin@uged.al>
Link: https://lkml.kernel.org/r/20210527122916.27683-3-vincent.guittot@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agosched/fair: Keep load_avg and load_sum synced
Vincent Guittot [Thu, 27 May 2021 12:29:15 +0000 (14:29 +0200)]
sched/fair: Keep load_avg and load_sum synced

commit 7c7ad626d9a0ff0a36c1e2a3cfbbc6a13828d5eb upstream.

when removing a cfs_rq from the list we only check _sum value so we must
ensure that _avg and _sum stay synced so load_sum can't be null whereas
load_avg is not after propagating load in the cgroup hierarchy.

Use load_avg to compute load_sum similarly to what is done for util_sum
and runnable_sum.

Fixes: 0e2d2aaaae52 ("sched/fair: Rewrite PELT migration propagation")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Odin Ugedal <odin@uged.al>
Link: https://lkml.kernel.org/r/20210527122916.27683-2-vincent.guittot@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoperf: Fix data race between pin_count increment/decrement
Marco Elver [Thu, 27 May 2021 10:47:11 +0000 (12:47 +0200)]
perf: Fix data race between pin_count increment/decrement

commit 6c605f8371159432ec61cbb1488dcf7ad24ad19a upstream.

KCSAN reports a data race between increment and decrement of pin_count:

  write to 0xffff888237c2d4e0 of 4 bytes by task 15740 on cpu 1:
   find_get_context kernel/events/core.c:4617
   __do_sys_perf_event_open kernel/events/core.c:12097 [inline]
   __se_sys_perf_event_open kernel/events/core.c:11933
   ...
  read to 0xffff888237c2d4e0 of 4 bytes by task 15743 on cpu 0:
   perf_unpin_context kernel/events/core.c:1525 [inline]
   __do_sys_perf_event_open kernel/events/core.c:12328 [inline]
   __se_sys_perf_event_open kernel/events/core.c:11933
   ...

Because neither read-modify-write here is atomic, this can lead to one
of the operations being lost, resulting in an inconsistent pin_count.
Fix it by adding the missing locking in the CPU-event case.

Fixes: fe4b04fa31a6 ("perf: Cure task_oncpu_function_call() races")
Reported-by: syzbot+142c9018f5962db69c7e@syzkaller.appspotmail.com
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210527104711.2671610-1-elver@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agogpio: wcd934x: Fix shift-out-of-bounds error
Srinivas Kandagatla [Tue, 25 May 2021 16:55:39 +0000 (17:55 +0100)]
gpio: wcd934x: Fix shift-out-of-bounds error

commit dbec64b11c65d74f31427e2b9d5746fbf17bf840 upstream.

bit-mask for pins 0 to 4 is BIT(0) to BIT(4) however we ended up with BIT(n - 1)
which is not right, and this was caught by below usban check

UBSAN: shift-out-of-bounds in drivers/gpio/gpio-wcd934x.c:34:14

Fixes: 59c324683400 ("gpio: wcd934x: Add support to wcd934x gpio controller")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agophy: ti: Fix an error code in wiz_probe()
Yang Li [Tue, 25 May 2021 10:50:32 +0000 (18:50 +0800)]
phy: ti: Fix an error code in wiz_probe()

commit b8203ec7f58ae925e10fadd3d136073ae7503a6e upstream.

When the code execute this if statement, the value of ret is 0.
However, we can see from the dev_err() log that the value of
ret should be -EINVAL.

Clean up smatch warning:

drivers/phy/ti/phy-j721e-wiz.c:1216 wiz_probe() warn: missing error code 'ret'

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Fixes: c9f9eba06629 ("phy: ti: j721e-wiz: Manage typec-gpio-dir")
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/1621939832-65535-1-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoASoC: meson: gx-card: fix sound-dai dt schema
Jerome Brunet [Mon, 24 May 2021 09:34:48 +0000 (11:34 +0200)]
ASoC: meson: gx-card: fix sound-dai dt schema

commit d031d99b02eaf7363c33f5b27b38086cc8104082 upstream.

There is a fair amount of warnings when running 'make dtbs_check' with
amlogic,gx-sound-card.yaml.

Ex:
arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:1: missing phandle tag in 0
arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:2: missing phandle tag in 0
arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0: [66, 0, 0] is too long

The reason is that the sound-dai phandle provided has cells, and in such
case the schema should use 'phandle-array' instead of 'phandle'.

Fixes: fd00366b8e41 ("ASoC: meson: gx: add sound card dt-binding documentation")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20210524093448.357140-1-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoASoC: core: Fix Null-point-dereference in fmt_single_name()
Kefeng Wang [Mon, 24 May 2021 02:49:41 +0000 (10:49 +0800)]
ASoC: core: Fix Null-point-dereference in fmt_single_name()

commit 41daf6ba594d55f201c50280ebcd430590441da1 upstream.

Check the return value of devm_kstrdup() in case of
Null-point-dereference.

Fixes: 45dd9943fce0 ("ASoC: core: remove artificial component and DAI name constraint")
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20210524024941.159952-1-wangkefeng.wang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agophy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()
Wang Wensheng [Mon, 17 May 2021 01:57:49 +0000 (01:57 +0000)]
phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()

commit 6411e386db0a477217607015e7d2910d02f75426 upstream.

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: a43f72ae136a ("phy: cadence: Sierra: Change MAX_LANES of Sierra to 16")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Link: https://lore.kernel.org/r/20210517015749.127799-1-wangwensheng4@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agotools/bootconfig: Fix error return code in apply_xbc()
Zhen Lei [Sat, 8 May 2021 03:42:16 +0000 (11:42 +0800)]
tools/bootconfig: Fix error return code in apply_xbc()

commit e8ba0b2b64126381643bb50df3556b139a60545a upstream.

Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.

Link: https://lkml.kernel.org/r/20210508034216.2277-1-thunder.leizhen@huawei.com
Fixes: a995e6bc0524 ("tools/bootconfig: Fix to check the write failure correctly")
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agovmlinux.lds.h: Avoid orphan section with !SMP
Nathan Chancellor [Thu, 6 May 2021 00:14:11 +0000 (17:14 -0700)]
vmlinux.lds.h: Avoid orphan section with !SMP

commit d4c6399900364facd84c9e35ce1540b6046c345f upstream.

With x86_64_defconfig and the following configs, there is an orphan
section warning:

CONFIG_SMP=n
CONFIG_AMD_MEM_ENCRYPT=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_KVM=y
CONFIG_PARAVIRT=y

ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/cpu/vmware.o' being placed in section `.data..decrypted'
ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/kvm.o' being placed in section `.data..decrypted'

These sections are created with DEFINE_PER_CPU_DECRYPTED, which
ultimately turns into __PCPU_ATTRS, which in turn has a section
attribute with a value of PER_CPU_BASE_SECTION + the section name. When
CONFIG_SMP is not set, the base section is .data and that is not
currently handled in any linker script.

Add .data..decrypted to PERCPU_DECRYPTED_SECTION, which is included in
PERCPU_INPUT -> PERCPU_SECTION, which is include in the x86 linker
script when either CONFIG_X86_64 or CONFIG_SMP is unset, taking care of
the warning.

Fixes: ac26963a1175 ("percpu: Introduce DEFINE_PER_CPU_DECRYPTED")
Link: https://github.com/ClangBuiltLinux/linux/issues/1360
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com> # build
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210506001410.1026691-1-nathan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoARM: cpuidle: Avoid orphan section warning
Arnd Bergmann [Wed, 30 Dec 2020 15:54:56 +0000 (16:54 +0100)]
ARM: cpuidle: Avoid orphan section warning

commit d94b93a9101573eb75b819dee94b1417acff631b upstream.

Since commit 83109d5d5fba ("x86/build: Warn on orphan section placement"),
we get a warning for objects in orphan sections. The cpuidle implementation
for OMAP causes this when CONFIG_CPU_IDLE is disabled:

arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table'
arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table'
arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table'

Change the definition of CPUIDLE_METHOD_OF_DECLARE() to silently
drop the table and all code referenced from it when CONFIG_CPU_IDLE
is disabled.

Fixes: 06ee7a950b6a ("ARM: OMAP2+: pm33xx-core: Add cpuidle_ops for am335x/am437x")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201230155506.1085689-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoRDMA/mlx4: Do not map the core_clock page to user space unless enabled
Shay Drory [Thu, 3 Jun 2021 13:19:39 +0000 (16:19 +0300)]
RDMA/mlx4: Do not map the core_clock page to user space unless enabled

commit 404e5a12691fe797486475fe28cc0b80cb8bef2c upstream.

Currently when mlx4 maps the hca_core_clock page to the user space there
are read-modifiable registers, one of which is semaphore, on this page as
well as the clock counter. If user reads the wrong offset, it can modify
the semaphore and hang the device.

Do not map the hca_core_clock page to the user space unless the device has
been put in a backwards compatibility mode to support this feature.

After this patch, mlx4 core_clock won't be mapped to user space on the
majority of existing devices and the uverbs device time feature in
ibv_query_rt_values_ex() will be disabled.

Fixes: 52033cfb5aab ("IB/mlx4: Add mmap call to map the hardware clock")
Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoRDMA/ipoib: Fix warning caused by destroying non-initial netns
Kamal Heib [Tue, 25 May 2021 15:01:34 +0000 (18:01 +0300)]
RDMA/ipoib: Fix warning caused by destroying non-initial netns

commit a3e74fb9247cd530dca246699d5eb5a691884d32 upstream.

After the commit 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for ipoib
interfaces"), if the IPoIB device is moved to non-initial netns,
destroying that netns lets the device vanish instead of moving it back to
the initial netns, This is happening because default_device_exit() skips
the interfaces due to having rtnl_link_ops set.

Steps to reporoduce:
  ip netns add foo
  ip link set mlx5_ib0 netns foo
  ip netns delete foo

WARNING: CPU: 1 PID: 704 at net/core/dev.c:11435 netdev_exit+0x3f/0x50
Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun d
 fuse
CPU: 1 PID: 704 Comm: kworker/u64:3 Tainted: G S      W  5.13.0-rc1+ #1
Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
Workqueue: netns cleanup_net
RIP: 0010:netdev_exit+0x3f/0x50
Code: 48 8b bb 30 01 00 00 e8 ef 81 b1 ff 48 81 fb c0 3a 54 a1 74 13 48
8b 83 90 00 00 00 48 81 c3 90 00 00 00 48 39 d8 75 02 5b c3 <0f> 0b 5b
c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00
RSP: 0018:ffffb297079d7e08 EFLAGS: 00010206
RAX: ffff8eb542c00040 RBX: ffff8eb541333150 RCX: 000000008010000d
RDX: 000000008010000e RSI: 000000008010000d RDI: ffff8eb440042c00
RBP: ffffb297079d7e48 R08: 0000000000000001 R09: ffffffff9fdeac00
R10: ffff8eb5003be000 R11: 0000000000000001 R12: ffffffffa1545620
R13: ffffffffa1545628 R14: 0000000000000000 R15: ffffffffa1543b20
FS:  0000000000000000(0000) GS:ffff8ed37fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005601b5f4c2e8 CR3: 0000001fc8c10002 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ops_exit_list.isra.9+0x36/0x70
 cleanup_net+0x234/0x390
 process_one_work+0x1cb/0x360
 ? process_one_work+0x360/0x360
 worker_thread+0x30/0x370
 ? process_one_work+0x360/0x360
 kthread+0x116/0x130
 ? kthread_park+0x80/0x80
 ret_from_fork+0x22/0x30

To avoid the above warning and later on the kernel panic that could happen
on shutdown due to a NULL pointer dereference, make sure to set the
netns_refund flag that was introduced by commit 3a5ca857079e ("can: dev:
Move device back to init netns on owning netns delete") to properly
restore the IPoIB interfaces to the initial netns.

Fixes: 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces")
Link: https://lore.kernel.org/r/20210525150134.139342-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agodrm/msm/a6xx: avoid shadow NULL reference in failure path
Jonathan Marek [Thu, 13 May 2021 17:14:00 +0000 (13:14 -0400)]
drm/msm/a6xx: avoid shadow NULL reference in failure path

commit ce86c239e4d218ae6040bec18e6d19a58edb8b7c upstream.

If a6xx_hw_init() fails before creating the shadow_bo, the a6xx_pm_suspend
code referencing it will crash. Change the condition to one that avoids
this problem (note: creation of shadow_bo is behind this same condition)

Fixes: e8b0b994c3a5 ("drm/msm/a6xx: Clear shadow on suspend")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210513171431.18632-6-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agodrm/msm/a6xx: update/fix CP_PROTECT initialization
Jonathan Marek [Thu, 13 May 2021 17:13:59 +0000 (13:13 -0400)]
drm/msm/a6xx: update/fix CP_PROTECT initialization

commit 408434036958699a7f50ddec984f7ba33e11a8f5 upstream.

Update CP_PROTECT register programming based on downstream.

A6XX_PROTECT_RW is renamed to A6XX_PROTECT_NORDWR to make things aligned
and also be more clear about what it does.

Note that this required switching to use the CP_ALWAYS_ON_COUNTER as the
GMU counter is not accessible from the cmdstream.  Which also means
using the CPU counter for the msm_gpu_submit_flush() tracepoint (as
catapult depends on being able to compare this to the start/end values
captured in cmdstream).  This may need to be revisited when IFPC is
enabled.

Also, compared to downstream, this opens up CP_PERFCTR_CP_SEL as the
userspace performance tooling (fdperf and pps-producer) expect to be
able to configure the CP counters.

Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210513171431.18632-5-jonathan@marek.ca
[switch to CP_ALWAYS_ON_COUNTER, open up CP_PERFCNTR_CP_SEL, and spiff
 up commit msg]
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agodrm/msm/a6xx: fix incorrectly set uavflagprd_inv field for A650
Jonathan Marek [Thu, 13 May 2021 17:13:58 +0000 (13:13 -0400)]
drm/msm/a6xx: fix incorrectly set uavflagprd_inv field for A650

commit b4387eaf3821a4c4241ac3a556e13244eb1fdaa5 upstream.

Value was shifted in the wrong direction, resulting in the field always
being zero, which is incorrect for A650.

Fixes: d0bac4e9cd66 ("drm/msm/a6xx: set ubwc config for A640 and A650")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210513171431.18632-4-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agodrm/mcde: Fix off by 10^3 in calculation
Linus Walleij [Tue, 8 Jun 2021 21:33:18 +0000 (23:33 +0200)]
drm/mcde: Fix off by 10^3 in calculation

commit c8a570443943304cac2e4186dbce6989b6c2b8b5 upstream.

The calclulation of how many bytes we stuff into the
DSI pipeline for video mode panels is off by three
orders of magnitude because we did not account for the
fact that the DRM mode clock is in kilohertz rather
than hertz.

This used to be:
drm_mode_vrefresh(mode) * mode->htotal * mode->vtotal
which would become for example for s6e63m0:
60 x 514 x 831 = 25628040 Hz, but mode->clock is
25628 as it is in kHz.

This affects only the Samsung GT-I8190 "Golden" phone
right now since it is the only MCDE device with a video
mode display.

Curiously some specimen work with this code and wild
settings in the EOL and empty packets at the end of the
display, but I have noticed an eeire flicker until now.
Others were not so lucky and got black screens.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reported-by: Stephan Gerhold <stephan@gerhold.net>
Fixes: 920dd1b1425b ("drm/mcde: Use mode->clock instead of reverse calculating it from the vrefresh")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Stephan Gerhold <stephan@gerhold.net>
Reviewed-by: Stephan Gerhold <stephan@gerhold.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608213318.3897858-1-linus.walleij@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: typec: mux: Fix copy-paste mistake in typec_mux_match
Bjorn Andersson [Thu, 10 Jun 2021 00:21:32 +0000 (17:21 -0700)]
usb: typec: mux: Fix copy-paste mistake in typec_mux_match

commit 142d0b24c1b17139f1aaaacae7542a38aa85640f upstream.

Fix the copy-paste mistake in the return path of typec_mux_match(),
where dev is considered a member of struct typec_switch rather than
struct typec_mux.

The two structs are identical in regards to having the struct device as
the first entry, so this provides no functional change.

Fixes: 3370db35193b ("usb: typec: Registering real device entries for the muxes")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610002132.3088083-1-bjorn.andersson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: dwc3: gadget: Disable gadget IRQ during pullup disable
Wesley Cheng [Fri, 21 May 2021 04:23:57 +0000 (21:23 -0700)]
usb: dwc3: gadget: Disable gadget IRQ during pullup disable

commit 8212937305f84ef73ea81036dafb80c557583d4b upstream.

Current sequence utilizes dwc3_gadget_disable_irq() alongside
synchronize_irq() to ensure that no further DWC3 events are generated.
However, the dwc3_gadget_disable_irq() API only disables device
specific events.  Endpoint events can still be generated.  Briefly
disable the interrupt line, so that the cleanup code can run to
prevent device and endpoint events. (i.e. __dwc3_gadget_stop() and
dwc3_stop_active_transfers() respectively)

Without doing so, it can lead to both the interrupt handler and the
pullup disable routine both writing to the GEVNTCOUNT register, which
will cause an incorrect count being read from future interrupts.

Fixes: ae7e86108b12 ("usb: dwc3: Stop active transfers before halting the controller")
Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
Link: https://lore.kernel.org/r/1621571037-1424-1-git-send-email-wcheng@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agophy: usb: Fix misuse of IS_ENABLED
Chen Li [Tue, 27 Apr 2021 07:17:45 +0000 (15:17 +0800)]
phy: usb: Fix misuse of IS_ENABLED

commit 7c2fc79250cafa1a29befeb60163028ec4720814 upstream.

While IS_ENABLED() is perfectly fine for CONFIG_* symbols, it is not
for other symbols such as __BIG_ENDIAN that is provided directly by
the compiler.

Switch to use CONFIG_CPU_BIG_ENDIAN instead of __BIG_ENDIAN.

Signed-off-by: Chen Li <chenli@uniontech.com>
Reviewed-by: Al Cooper <alcooperx@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: 94583a41047e ("phy: usb: Restructure in preparation for adding 7216 USB support")
Link: https://lore.kernel.org/r/87czuggpra.wl-chenli@uniontech.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoregulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks
Axel Lin [Sun, 30 May 2021 12:41:00 +0000 (20:41 +0800)]
regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks

commit 86ab21cc39e6b99b7065ab9008c90bec5dec535a upstream.

Current code does not set .curr_table and .n_linear_ranges settings,
so it cannot use the regulator_get/set_current_limit_regmap helpers.
If we setup the curr_table, it will has 200 entries.
Implement customized .set_current_limit/.get_current_limit callbacks
instead.

Fixes: b8c054a5eaf0 ("regulator: rtmv20: Adds support for Richtek RTMV20 load switch regulator")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://lore.kernel.org/r/20210530124101.477727-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoregulator: bd71828: Fix .n_voltages settings
Axel Lin [Sun, 23 May 2021 07:10:45 +0000 (15:10 +0800)]
regulator: bd71828: Fix .n_voltages settings

commit 4c668630bf8ea90a041fc69c9984486e0f56682d upstream.

Current .n_voltages settings do not cover the latest 2 valid selectors,
so it fails to set voltage for the hightest voltage support.
The latest linear range has step_uV = 0, so it does not matter if we
count the .n_voltages to maximum selector + 1 or the first selector of
latest linear range + 1.
To simplify calculating the n_voltages, let's just set the
.n_voltages to maximum selector + 1.

Fixes: 522498f8cb8c ("regulator: bd71828: Basic support for ROHM bd71828 PMIC regulators")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Link: https://lore.kernel.org/r/20210523071045.2168904-2-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoregulator: fan53880: Fix missing n_voltages setting
Axel Lin [Mon, 17 May 2021 10:53:24 +0000 (18:53 +0800)]
regulator: fan53880: Fix missing n_voltages setting

commit 34991ee96fd8477479dd15adadceb6b28b30d9b0 upstream.

Fixes: e6dea51e2d41 ("regulator: fan53880: Add initial support")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Christoph Fritz <chf.fritz@googlemail.com>
Link: https://lore.kernel.org/r/20210517105325.1227393-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoregulator: bd718x7: Fix the BUCK7 voltage setting on BD71837
Matti Vaittinen [Wed, 2 Jun 2021 05:45:58 +0000 (08:45 +0300)]
regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837

commit bc537e65b09a05923f98a31920d1ab170e648dba upstream.

Changing the BD71837 voltages for other regulators except the first 4 BUCKs
should be forbidden when the regulator is enabled. There may be out-of-spec
voltage spikes if the voltage of these "non DVS" bucks is changed when
enabled. This restriction was accidentally removed when the LDO voltage
change was allowed for BD71847. (It was not noticed that the BD71837
BUCK7 used same voltage setting function as LDOs).

Additionally this bug causes incorrect voltage monitoring register access.
The voltage change function accidentally used for bd71837 BUCK7 is
intended to only handle LDO voltage changes. A BD71847 LDO specific
voltage monitoring disabling code gets executed on BD71837 and register
offsets are wrongly calculated as regulator is assumed to be an LDO.

Prevent the BD71837 BUCK7 voltage change when BUCK7 is enabled by using
the correct voltage setting operation.

Fixes: 9bcbabafa19b ("regulator: bd718x7: remove voltage change restriction from BD71847 LDOs")
Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Link: https://lore.kernel.org/r/bd8c00931421fafa57e3fdf46557a83075b7cc17.1622610103.git.matti.vaittinen@fi.rohmeurope.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoregulator: max77620: Use device_set_of_node_from_dev()
Dmitry Osipenko [Sun, 23 May 2021 22:42:42 +0000 (01:42 +0300)]
regulator: max77620: Use device_set_of_node_from_dev()

commit 6f55c5dd1118b3076d11d9cb17f5c5f4bc3a1162 upstream.

The MAX77620 driver fails to re-probe on deferred probe because driver
core tries to claim resources that are already claimed by the PINCTRL
device. Use device_set_of_node_from_dev() helper which marks OF node as
reused, skipping erroneous execution of pinctrl_bind_pins() for the PMIC
device on the re-probe.

Fixes: aea6cb99703e ("regulator: resolve supply after creating regulator")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210523224243.13219-2-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoregulator: core: resolve supply for boot-on/always-on regulators
Dmitry Baryshkov [Wed, 19 May 2021 22:12:23 +0000 (01:12 +0300)]
regulator: core: resolve supply for boot-on/always-on regulators

commit 98e48cd9283dbac0e1445ee780889f10b3d1db6a upstream.

For the boot-on/always-on regulators the set_machine_constrainst() is
called before resolving rdev->supply. Thus the code would try to enable
rdev before enabling supplying regulator. Enforce resolving supply
regulator before enabling rdev.

Fixes: aea6cb99703e ("regulator: resolve supply after creating regulator")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210519221224.2868496-1-dmitry.baryshkov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: typec: tcpm: cancel frs hrtimer when unregister tcpm port
Li Jun [Wed, 2 Jun 2021 09:57:08 +0000 (17:57 +0800)]
usb: typec: tcpm: cancel frs hrtimer when unregister tcpm port

commit 7ade4805e296c8d1e40c842395bbe478c7210555 upstream.

Like the state_machine_timer, we should also cancel possible pending
frs hrtimer when unregister tcpm port.

Fixes: 8dc4bd073663 ("usb: typec: tcpm: Add support for Sink Fast Role SWAP(FRS)")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1622627829-11070-2-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: typec: tcpm: cancel vdm and state machine hrtimer when unregister tcpm port
Li Jun [Wed, 2 Jun 2021 09:57:07 +0000 (17:57 +0800)]
usb: typec: tcpm: cancel vdm and state machine hrtimer when unregister tcpm port

commit 3a13ff7ef4349d70d1d18378d661117dd5af8efe upstream.

A pending hrtimer may expire after the kthread_worker of tcpm port
is destroyed, see below kernel dump when do module unload, fix it
by cancel the 2 hrtimers.

[  111.517018] Unable to handle kernel paging request at virtual address ffff8000118cb880
[  111.518786] blk_update_request: I/O error, dev sda, sector 60061185 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  111.526594] Mem abort info:
[  111.526597]   ESR = 0x96000047
[  111.526600]   EC = 0x25: DABT (current EL), IL = 32 bits
[  111.526604]   SET = 0, FnV = 0
[  111.526607]   EA = 0, S1PTW = 0
[  111.526610] Data abort info:
[  111.526612]   ISV = 0, ISS = 0x00000047
[  111.526615]   CM = 0, WnR = 1
[  111.526619] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041d75000
[  111.526623] [ffff8000118cb880] pgd=10000001bffff003, p4d=10000001bffff003, pud=10000001bfffe003, pmd=10000001bfffa003, pte=0000000000000000
[  111.526642] Internal error: Oops: 96000047 [#1] PREEMPT SMP
[  111.526647] Modules linked in: dwc3_imx8mp dwc3 phy_fsl_imx8mq_usb [last unloaded: tcpci]
[  111.526663] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-rc4-00927-gebbe9dbd802c-dirty #36
[  111.526670] Hardware name: NXP i.MX8MPlus EVK board (DT)
[  111.526674] pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO BTYPE=--)
[  111.526681] pc : queued_spin_lock_slowpath+0x1a0/0x390
[  111.526695] lr : _raw_spin_lock_irqsave+0x88/0xb4
[  111.526703] sp : ffff800010003e20
[  111.526706] x29: ffff800010003e20 x28: ffff00017f380180
[  111.537156] buffer_io_error: 6 callbacks suppressed
[  111.537162] Buffer I/O error on dev sda1, logical block 60040704, async page read
[  111.539932]  x27: ffff00017f3801c0
[  111.539938] x26: ffff800010ba2490 x25: 0000000000000000 x24: 0000000000000001
[  111.543025] blk_update_request: I/O error, dev sda, sector 60061186 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0
[  111.548304]
[  111.548306] x23: 00000000000000c0 x22: ffff0000c2a9f184 x21: ffff00017f380180
[  111.551374] Buffer I/O error on dev sda1, logical block 60040705, async page read
[  111.554499]
[  111.554503] x20: ffff0000c5f14210 x19: 00000000000000c0 x18: 0000000000000000
[  111.557391] Buffer I/O error on dev sda1, logical block 60040706, async page read
[  111.561218]
[  111.561222] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[  111.564205] Buffer I/O error on dev sda1, logical block 60040707, async page read
[  111.570887] x14: 00000000000000f5 x13: 0000000000000001 x12: 0000000000000040
[  111.570902] x11: ffff0000c05ac6d8
[  111.583420] Buffer I/O error on dev sda1, logical block 60040708, async page read
[  111.588978]  x10: 0000000000000000 x9 : 0000000000040000
[  111.588988] x8 : 0000000000000000
[  111.597173] Buffer I/O error on dev sda1, logical block 60040709, async page read
[  111.605766]  x7 : ffff00017f384880 x6 : ffff8000118cb880
[  111.605777] x5 : ffff00017f384880
[  111.611094] Buffer I/O error on dev sda1, logical block 60040710, async page read
[  111.617086]  x4 : 0000000000000000 x3 : ffff0000c2a9f184
[  111.617096] x2 : ffff8000118cb880
[  111.622242] Buffer I/O error on dev sda1, logical block 60040711, async page read
[  111.626927]  x1 : ffff8000118cb880 x0 : ffff00017f384888
[  111.626938] Call trace:
[  111.626942]  queued_spin_lock_slowpath+0x1a0/0x390
[  111.795809]  kthread_queue_work+0x30/0xc0
[  111.799828]  state_machine_timer_handler+0x20/0x30
[  111.804624]  __hrtimer_run_queues+0x140/0x1e0
[  111.808990]  hrtimer_interrupt+0xec/0x2c0
[  111.813004]  arch_timer_handler_phys+0x38/0x50
[  111.817456]  handle_percpu_devid_irq+0x88/0x150
[  111.821991]  __handle_domain_irq+0x80/0xe0
[  111.826093]  gic_handle_irq+0xc0/0x140
[  111.829848]  el1_irq+0xbc/0x154
[  111.832991]  arch_cpu_idle+0x1c/0x2c
[  111.836572]  default_idle_call+0x24/0x6c
[  111.840497]  do_idle+0x238/0x2ac
[  111.843729]  cpu_startup_entry+0x2c/0x70
[  111.847657]  rest_init+0xdc/0xec
[  111.850890]  arch_call_rest_init+0x14/0x20
[  111.854988]  start_kernel+0x508/0x540
[  111.858659] Code: 910020e0 8b0200c2 f861d884 aa0203e1 (f8246827)
[  111.864760] ---[ end trace 308b9a4a3dcb73ac ]---
[  111.869381] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[  111.876258] SMP: stopping secondary CPUs
[  111.880185] Kernel Offset: disabled
[  111.883673] CPU features: 0x00001001,20000846
[  111.888031] Memory Limit: none
[  111.891090] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

Fixes: 3ed8e1c2ac99 ("usb: typec: tcpm: Migrate workqueue to RT priority for processing events")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1622627829-11070-1-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: fix various gadget panics on 10gbps cabling
Maciej Żenczykowski [Wed, 9 Jun 2021 02:44:59 +0000 (19:44 -0700)]
usb: fix various gadget panics on 10gbps cabling

commit 032e288097a553db5653af552dd8035cd2a0ba96 upstream.

usb_assign_descriptors() is called with 5 parameters,
the last 4 of which are the usb_descriptor_header for:
  full-speed (USB1.1 - 12Mbps [including USB1.0 low-speed @ 1.5Mbps),
  high-speed (USB2.0 - 480Mbps),
  super-speed (USB3.0 - 5Gbps),
  super-speed-plus (USB3.1 - 10Gbps).

The differences between full/high/super-speed descriptors are usually
substantial (due to changes in the maximum usb block size from 64 to 512
to 1024 bytes and other differences in the specs), while the difference
between 5 and 10Gbps descriptors may be as little as nothing
(in many cases the same tuning is simply good enough).

However if a gadget driver calls usb_assign_descriptors() with
a NULL descriptor for super-speed-plus and is then used on a max 10gbps
configuration, the kernel will crash with a null pointer dereference,
when a 10gbps capable device port + cable + host port combination shows up.
(This wouldn't happen if the gadget max-speed was set to 5gbps, but
it of course defaults to the maximum, and there's no real reason to
artificially limit it)

The fix is to simply use the 5gbps descriptor as the 10gbps descriptor,
if a 10gbps descriptor wasn't provided.

Obviously this won't fix the problem if the 5gbps descriptor is also
NULL, but such cases can't be so trivially solved (and any such gadgets
are unlikely to be used with USB3 ports any way).

Cc: Felipe Balbi <balbi@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210609024459.1126080-1-zenczykowski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: fix various gadgets null ptr deref on 10gbps cabling.
Maciej Żenczykowski [Tue, 8 Jun 2021 04:41:41 +0000 (21:41 -0700)]
usb: fix various gadgets null ptr deref on 10gbps cabling.

commit 90c4d05780d47e14a50e11a7f17373104cd47d25 upstream.

This avoids a null pointer dereference in
f_{ecm,eem,hid,loopback,printer,rndis,serial,sourcesink,subset,tcm}
by simply reusing the 5gbps config for 10gbps.

Fixes: eaef50c76057 ("usb: gadget: Update usb_assign_descriptors for SuperSpeedPlus")
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Michael R Sweet <msweet@msweet.org>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Pawel Laszczak <pawell@cadence.com>
Cc: Peter Chen <peter.chen@nxp.com>
Cc: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Cc: Wei Ming Chen <jj251510319013@gmail.com>
Cc: Will McVicker <willmcvicker@google.com>
Cc: Zqiang <qiang.zhang@windriver.com>
Reviewed-By: Lorenzo Colitti <lorenzo@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20210608044141.3898496-1-zenczykowski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: gadget: eem: fix wrong eem header operation
Linyu Yuan [Tue, 8 Jun 2021 23:35:47 +0000 (07:35 +0800)]
usb: gadget: eem: fix wrong eem header operation

commit 305f670846a31a261462577dd0b967c4fa796871 upstream.

when skb_clone() or skb_copy_expand() fail,
it should pull skb with lengh indicated by header,
or not it will read network data and check it as header.

Cc: <stable@vger.kernel.org>
Signed-off-by: Linyu Yuan <linyyuan@codeaurora.com>
Link: https://lore.kernel.org/r/20210608233547.3767-1-linyyuan@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: serial: cp210x: fix alternate function for CP2102N QFN20
Stefan Agner [Fri, 28 May 2021 20:39:31 +0000 (22:39 +0200)]
USB: serial: cp210x: fix alternate function for CP2102N QFN20

commit 6f7ec77cc8b64ff5037c1945e4650c65c458037d upstream.

The QFN20 part has a different GPIO/port function assignment. The
configuration struct bit field ordered as TX/RX/RS485/WAKEUP/CLK
which exactly matches GPIO0-3 for QFN24/28. However, QFN20 has a
different GPIO to primary function assignment.

Special case QFN20 to follow to properly detect which GPIOs are
available.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Link: https://lore.kernel.org/r/51830b2b24118eb0f77c5c9ac64ffb2f519dbb1d.1622218300.git.stefan@agner.ch
Fixes: c8acfe0aadbe ("USB: serial: cp210x: implement GPIO support for CP2102N")
Cc: stable@vger.kernel.org # 4.19
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: serial: quatech2: fix control-request directions
Johan Hovold [Mon, 24 May 2021 09:17:05 +0000 (11:17 +0200)]
USB: serial: quatech2: fix control-request directions

commit eb8dbe80326c3d44c1e38ee4f40e0d8d3e06f2d0 upstream.

The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.

Fix the three requests which erroneously used usb_rcvctrlpipe().

Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver")
Cc: stable@vger.kernel.org # 3.5
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: serial: omninet: add device id for Zyxel Omni 56K Plus
Alexandre GRIVEAUX [Sun, 23 May 2021 16:35:21 +0000 (18:35 +0200)]
USB: serial: omninet: add device id for Zyxel Omni 56K Plus

commit fc0b3dc9a11771c3919eaaaf9d649138b095aa0f upstream.

Add device id for Zyxel Omni 56K Plus modem, this modem include:

USB chip:
NetChip
NET2888

Main chip:
901041A
F721501APGF

Another modem using the same chips is the Zyxel Omni 56K DUO/NEO,
could be added with the right USB ID.

Signed-off-by: Alexandre GRIVEAUX <agriveaux@deutnet.info>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: serial: ftdi_sio: add NovaTech OrionMX product ID
George McCollister [Fri, 4 Jun 2021 00:32:08 +0000 (19:32 -0500)]
USB: serial: ftdi_sio: add NovaTech OrionMX product ID

commit bc96c72df33ee81b24d87eab953c73f7bcc04f29 upstream.

Add PID for the NovaTech OrionMX so it can be automatically detected.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: gadget: f_fs: Ensure io_completion_wq is idle during unbind
Wesley Cheng [Sat, 22 May 2021 00:44:21 +0000 (17:44 -0700)]
usb: gadget: f_fs: Ensure io_completion_wq is idle during unbind

commit 6fc1db5e6211e30fbb1cee8d7925d79d4ed2ae14 upstream.

During unbind, ffs_func_eps_disable() will be executed, resulting in
completion callbacks for any pending USB requests.  When using AIO,
irrespective of the completion status, io_data work is queued to
io_completion_wq to evaluate and handle the completed requests.  Since
work runs asynchronously to the unbind() routine, there can be a
scenario where the work runs after the USB gadget has been fully
removed, resulting in accessing of a resource which has been already
freed. (i.e. usb_ep_free_request() accessing the USB ep structure)

Explicitly drain the io_completion_wq, instead of relying on the
destroy_workqueue() (in ffs_data_put()) to make sure no pending
completion work items are running.

Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1621644261-1236-1-git-send-email-wcheng@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()
Andy Shevchenko [Mon, 7 Jun 2021 20:50:06 +0000 (23:50 +0300)]
usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()

commit 843fabdd7623271330af07f1b7fbd7fabe33c8de upstream.

devm_ioremap_resource() can return an error, add missed check for it.

Fixes: 43d596e32276 ("usb: typec: intel_pmc_mux: Check the port status before connect")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210607205007.71458-2-andy.shevchenko@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()
Andy Shevchenko [Mon, 7 Jun 2021 20:50:05 +0000 (23:50 +0300)]
usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()

commit 1a85b350a7741776a406005b943e3dec02c424ed upstream.

device_get_next_child_node() bumps a reference counting of a returned variable.
We have to balance it whenever we return to the caller.

Fixes: 6701adfa9693 ("usb: typec: driver for Intel PMC mux control")
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210607205007.71458-1-andy.shevchenko@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: typec: ucsi: Clear PPM capability data in ucsi_init() error path
Mayank Rana [Wed, 9 Jun 2021 07:35:35 +0000 (00:35 -0700)]
usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path

commit f247f0a82a4f8c3bfed178d8fd9e069d1424ee4e upstream.

If ucsi_init() fails for some reason (e.g. ucsi_register_port()
fails or general communication failure to the PPM), particularly at
any point after the GET_CAPABILITY command had been issued, this
results in unwinding the initialization and returning an error.
However the ucsi structure's ucsi_capability member retains its
current value, including likely a non-zero num_connectors.
And because ucsi_init() itself is done in a workqueue a UCSI
interface driver will be unaware that it failed and may think the
ucsi_register() call was completely successful.  Later, if
ucsi_unregister() is called, due to this stale ucsi->cap value it
would try to access the items in the ucsi->connector array which
might not be in a proper state or not even allocated at all and
results in NULL or invalid pointer dereference.

Fix this by clearing the ucsi->cap value to 0 during the error
path of ucsi_init() in order to prevent a later ucsi_unregister()
from entering the connector cleanup loop.

Fixes: c1b0bc2dabfa ("usb: typec: Add support for UCSI interface")
Cc: stable@vger.kernel.org
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Mayank Rana <mrana@codeaurora.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210609073535.5094-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: typec: wcove: Use LE to CPU conversion when accessing msg->header
Andy Shevchenko [Wed, 9 Jun 2021 17:22:02 +0000 (20:22 +0300)]
usb: typec: wcove: Use LE to CPU conversion when accessing msg->header

commit d5ab95da2a41567440097c277c5771ad13928dad upstream.

As LKP noticed the Sparse is not happy about strict type handling:
   .../typec/tcpm/wcove.c:380:50: sparse:     expected unsigned short [usertype] header
   .../typec/tcpm/wcove.c:380:50: sparse:     got restricted __le16 const [usertype] header

Fix this by switching to use pd_header_cnt_le() instead of pd_header_cnt()
in the affected code.

Fixes: ae8a2ca8a221 ("usb: typec: Group all TCPCI/TCPM code together")
Fixes: 3c4fb9f16921 ("usb: typec: wcove: start using tcpm for USB PD support")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210609172202.83377-1-andriy.shevchenko@linux.intel.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: musb: fix MUSB_QUIRK_B_DISCONNECT_99 handling
Thomas Petazzoni [Fri, 28 May 2021 14:04:46 +0000 (16:04 +0200)]
usb: musb: fix MUSB_QUIRK_B_DISCONNECT_99 handling

commit b65ba0c362be665192381cc59e3ac3ef6f0dd1e1 upstream.

In commit 92af4fc6ec33 ("usb: musb: Fix suspend with devices
connected for a64"), the logic to support the
MUSB_QUIRK_B_DISCONNECT_99 quirk was modified to only conditionally
schedule the musb->irq_work delayed work.

This commit badly breaks ECM Gadget on AM335X. Indeed, with this
commit, one can observe massive packet loss:

$ ping 192.168.0.100
...
15 packets transmitted, 3 received, 80% packet loss, time 14316ms

Reverting this commit brings back a properly functioning ECM
Gadget. An analysis of the commit seems to indicate that a mistake was
made: the previous code was not falling through into the
MUSB_QUIRK_B_INVALID_VBUS_91, but now it is, unless the condition is
taken.

Changing the logic to be as it was before the problematic commit *and*
only conditionally scheduling musb->irq_work resolves the regression:

$ ping 192.168.0.100
...
64 packets transmitted, 64 received, 0% packet loss, time 64475ms

Fixes: 92af4fc6ec33 ("usb: musb: Fix suspend with devices connected for a64")
Cc: stable@vger.kernel.org
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Drew Fustini <drew@beagleboard.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Link: https://lore.kernel.org/r/20210528140446.278076-1-thomas.petazzoni@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: dwc3: ep0: fix NULL pointer exception
Marian-Cristian Rotariu [Tue, 8 Jun 2021 16:26:50 +0000 (19:26 +0300)]
usb: dwc3: ep0: fix NULL pointer exception

commit d00889080ab60051627dab1d85831cd9db750e2a upstream.

There is no validation of the index from dwc3_wIndex_to_dep() and we might
be referring a non-existing ep and trigger a NULL pointer exception. In
certain configurations we might use fewer eps and the index might wrongly
indicate a larger ep index than existing.

By adding this validation from the patch we can actually report a wrong
index back to the caller.

In our usecase we are using a composite device on an older kernel, but
upstream might use this fix also. Unfortunately, I cannot describe the
hardware for others to reproduce the issue as it is a proprietary
implementation.

[   82.958261] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a4
[   82.966891] Mem abort info:
[   82.969663]   ESR = 0x96000006
[   82.972703]   Exception class = DABT (current EL), IL = 32 bits
[   82.978603]   SET = 0, FnV = 0
[   82.981642]   EA = 0, S1PTW = 0
[   82.984765] Data abort info:
[   82.987631]   ISV = 0, ISS = 0x00000006
[   82.991449]   CM = 0, WnR = 0
[   82.994409] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c6210ccc
[   83.000999] [00000000000000a4] pgd=0000000053aa5003, pud=0000000053aa5003, pmd=0000000000000000
[   83.009685] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[   83.026433] Process irq/62-dwc3 (pid: 303, stack limit = 0x000000003985154c)
[   83.033470] CPU: 0 PID: 303 Comm: irq/62-dwc3 Not tainted 4.19.124 #1
[   83.044836] pstate: 60000085 (nZCv daIf -PAN -UAO)
[   83.049628] pc : dwc3_ep0_handle_feature+0x414/0x43c
[   83.054558] lr : dwc3_ep0_interrupt+0x3b4/0xc94

...

[   83.141788] Call trace:
[   83.144227]  dwc3_ep0_handle_feature+0x414/0x43c
[   83.148823]  dwc3_ep0_interrupt+0x3b4/0xc94
[   83.181546] ---[ end trace aac6b5267d84c32f ]---

Signed-off-by: Marian-Cristian Rotariu <marian.c.rotariu@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608162650.58426-1-marian.c.rotariu@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: dwc3: gadget: Bail from dwc3_gadget_exit() if dwc->gadget is NULL
Jack Pham [Fri, 28 May 2021 16:04:05 +0000 (09:04 -0700)]
usb: dwc3: gadget: Bail from dwc3_gadget_exit() if dwc->gadget is NULL

commit 03715ea2e3dbbc56947137ce3b4ac18a726b2f87 upstream.

There exists a possible scenario in which dwc3_gadget_init() can fail:
during during host -> peripheral mode switch in dwc3_set_mode(), and
a pending gadget driver fails to bind.  Then, if the DRD undergoes
another mode switch from peripheral->host the resulting
dwc3_gadget_exit() will attempt to reference an invalid and dangling
dwc->gadget pointer as well as call dma_free_coherent() on unmapped
DMA pointers.

The exact scenario can be reproduced as follows:
 - Start DWC3 in peripheral mode
 - Configure ConfigFS gadget with FunctionFS instance (or use g_ffs)
 - Run FunctionFS userspace application (open EPs, write descriptors, etc)
 - Bind gadget driver to DWC3's UDC
 - Switch DWC3 to host mode
   => dwc3_gadget_exit() is called. usb_del_gadget() will put the
ConfigFS driver instance on the gadget_driver_pending_list
 - Stop FunctionFS application (closes the ep files)
 - Switch DWC3 to peripheral mode
   => dwc3_gadget_init() fails as usb_add_gadget() calls
check_pending_gadget_drivers() and attempts to rebind the UDC
to the ConfigFS gadget but fails with -19 (-ENODEV) because the
FFS instance is not in FFS_ACTIVE state (userspace has not
re-opened and written the descriptors yet, i.e. desc_ready!=0).
 - Switch DWC3 back to host mode
   => dwc3_gadget_exit() is called again, but this time dwc->gadget
is invalid.

Although it can be argued that userspace should take responsibility
for ensuring that the FunctionFS application be ready prior to
allowing the composite driver bind to the UDC, failure to do so
should not result in a panic from the kernel driver.

Fix this by setting dwc->gadget to NULL in the failure path of
dwc3_gadget_init() and add a check to dwc3_gadget_exit() to bail out
unless the gadget pointer is valid.

Fixes: e81a7018d93a ("usb: dwc3: allocate gadget structure dynamically")
Cc: <stable@vger.kernel.org>
Reviewed-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210528160405.17550-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: dwc3: meson-g12a: Disable the regulator in the error handling path of the probe
Christophe JAILLET [Fri, 21 May 2021 16:55:50 +0000 (18:55 +0200)]
usb: dwc3: meson-g12a: Disable the regulator in the error handling path of the probe

commit 1d0d3d818eafe1963ec1eaf302175cd14938188e upstream.

If an error occurs after a successful 'regulator_enable()' call,
'regulator_disable()' must be called.

Fix the error handling path of the probe accordingly.

The remove function doesn't need to be fixed, because the
'regulator_disable()' call is already hidden in 'dwc3_meson_g12a_suspend()'
which is called via 'pm_runtime_set_suspended()' in the remove function.

Fixes: c99993376f72 ("usb: dwc3: Add Amlogic G12A DWC3 glue")
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/79df054046224bbb0716a8c5c2082650290eec86.1621616013.git.christophe.jaillet@wanadoo.fr
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: dwc3-meson-g12a: fix usb2 PHY glue init when phy0 is disabled
Neil Armstrong [Tue, 1 Jun 2021 08:48:30 +0000 (10:48 +0200)]
usb: dwc3-meson-g12a: fix usb2 PHY glue init when phy0 is disabled

commit 4d2aa178d2ad2fb156711113790dde13e9aa2376 upstream.

When only PHY1 is used (for example on Odroid-HC4), the regmap init code
uses the usb2 ports when doesn't initialize the PHY1 regmap entry.

This fixes:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
pc : regmap_update_bits_base+0x40/0xa0
lr : dwc3_meson_g12a_usb2_init_phy+0x4c/0xf8
...
Call trace:
regmap_update_bits_base+0x40/0xa0
dwc3_meson_g12a_usb2_init_phy+0x4c/0xf8
dwc3_meson_g12a_usb2_init+0x7c/0xc8
dwc3_meson_g12a_usb_init+0x28/0x48
dwc3_meson_g12a_probe+0x298/0x540
platform_probe+0x70/0xe0
really_probe+0xf0/0x4d8
driver_probe_device+0xfc/0x168
...

Fixes: 013af227f58a97 ("usb: dwc3: meson-g12a: handle the phy and glue registers separately")
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210601084830.260196-1-narmstrong@baylibre.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: pd: Set PD_T_SINK_WAIT_CAP to 310ms
Kyle Tso [Fri, 28 May 2021 08:16:13 +0000 (16:16 +0800)]
usb: pd: Set PD_T_SINK_WAIT_CAP to 310ms

commit 6490fa565534fa83593278267785a694fd378a2b upstream.

Current timer PD_T_SINK_WAIT_CAP is set to 240ms which will violate the
SinkWaitCapTimer (tTypeCSinkWaitCap 310 - 620 ms) defined in the PD
Spec if the port is faster enough when running the state machine. Set it
to the lower bound 310ms to ensure the timeout is in Spec.

Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210528081613.730661-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: f_ncm: only first packet of aggregate needs to start timer
Maciej Żenczykowski [Tue, 8 Jun 2021 08:54:38 +0000 (01:54 -0700)]
usb: f_ncm: only first packet of aggregate needs to start timer

commit 1958ff5ad2d4908b44a72bcf564dfe67c981e7fe upstream.

The reasoning for this change is that if we already had
a packet pending, then we also already had a pending timer,
and as such there is no need to reschedule it.

This also prevents packets getting delayed 60 ms worst case
under a tiny packet every 290us transmit load, by keeping the
timeout always relative to the first queued up packet.
(300us delay * 16KB max aggregation / 80 byte packet =~ 60 ms)

As such the first packet is now at most delayed by 300us.

Under low transmit load, this will simply result in us sending
a shorter aggregate, as originally intended.

This patch has the benefit of greatly reducing (by ~10 factor
with 1500 byte frames aggregated into 16 kiB) the number of
(potentially pretty costly) updates to the hrtimer.

Cc: Brooke Basile <brookebasile@gmail.com>
Cc: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20210608085438.813960-1-zenczykowski@gmail.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: f_ncm: ncm_bitrate (speed) is unsigned
Maciej Żenczykowski [Tue, 8 Jun 2021 00:53:44 +0000 (17:53 -0700)]
USB: f_ncm: ncm_bitrate (speed) is unsigned

commit 3370139745853f7826895293e8ac3aec1430508e upstream.

[  190.544755] configfs-gadget gadget: notify speed -44967296

This is because 4250000000 - 2**32 is -44967296.

Fixes: 9f6ce4240a2b ("usb: gadget: f_ncm.c added")
Cc: Brooke Basile <brookebasile@gmail.com>
Cc: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Yauheni Kaliuta <yauheni.kaliuta@nokia.com>
Cc: Linux USB Mailing List <linux-usb@vger.kernel.org>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608005344.3762668-1-zenczykowski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agommc: renesas_sdhi: Fix HS400 on R-Car M3-W+
Geert Uytterhoeven [Fri, 4 Jun 2021 12:59:43 +0000 (14:59 +0200)]
mmc: renesas_sdhi: Fix HS400 on R-Car M3-W+

commit 6687cd72aa9112a454a4646986e0402dd1b07d0e upstream.

R-Car M3-W ES3.0 is marketed as R-Car M3-W+ (R8A77961), and has its own
compatible value "renesas,r8a77961".

Hence using soc_device_match() with soc_id = "r8a7796" and revision =
"ES3.*" does not actually match running on an R-Car M3-W+ SoC.

Fix this by matching with soc_id = "r8a77961" instead.

Fixes: a38c078fea0b1393 ("mmc: renesas_sdhi: Avoid bad TAP in HS400")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/ee8af5d631f5331139ffea714539030d97352e93.1622811525.git.geert+renesas@glider.be
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agommc: renesas_sdhi: abort tuning when timeout detected
Wolfram Sang [Wed, 2 Jun 2021 07:34:35 +0000 (09:34 +0200)]
mmc: renesas_sdhi: abort tuning when timeout detected

commit 2c9017d0b5d3fbf17e69577a42d9e610ca122810 upstream.

We have to bring the eMMC from sending-data state back to transfer state
once we detected a CRC error (timeout) during tuning. So, send a stop
command via mmc_abort_tuning().

Fixes: 4f11997773b6 ("mmc: tmio: Add tuning support")
Reported-by Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20210602073435.5955-1-wsa+renesas@sang-engineering.com
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoftrace: Do not blindly read the ip address in ftrace_bug()
Steven Rostedt (VMware) [Tue, 8 Jun 2021 01:39:08 +0000 (21:39 -0400)]
ftrace: Do not blindly read the ip address in ftrace_bug()

commit 6c14133d2d3f768e0a35128faac8aa6ed4815051 upstream.

It was reported that a bug on arm64 caused a bad ip address to be used for
updating into a nop in ftrace_init(), but the error path (rightfully)
returned -EINVAL and not -EFAULT, as the bug caused more than one error to
occur. But because -EINVAL was returned, the ftrace_bug() tried to report
what was at the location of the ip address, and read it directly. This
caused the machine to panic, as the ip was not pointing to a valid memory
address.

Instead, read the ip address with copy_from_kernel_nofault() to safely
access the memory, and if it faults, report that the address faulted,
otherwise report what was in that location.

Link: https://lore.kernel.org/lkml/20210607032329.28671-1-mark-pk.tsai@mediatek.com/
Cc: stable@vger.kernel.org
Fixes: 05736a427f7e1 ("ftrace: warn on failure to disable mcount callers")
Reported-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agocgroup1: don't allow '\n' in renaming
Alexander Kuznetsov [Wed, 9 Jun 2021 07:17:19 +0000 (10:17 +0300)]
cgroup1: don't allow '\n' in renaming

commit b7e24eb1caa5f8da20d405d262dba67943aedc42 upstream.

cgroup_mkdir() have restriction on newline usage in names:
$ mkdir $'/sys/fs/cgroup/cpu/test\ntest2'
mkdir: cannot create directory
'/sys/fs/cgroup/cpu/test\ntest2': Invalid argument

But in cgroup1_rename() such check is missed.
This allows us to make /proc/<pid>/cgroup unparsable:
$ mkdir /sys/fs/cgroup/cpu/test
$ mv /sys/fs/cgroup/cpu/test $'/sys/fs/cgroup/cpu/test\ntest2'
$ echo $$ > $'/sys/fs/cgroup/cpu/test\ntest2'
$ cat /proc/self/cgroup
11:pids:/
10:freezer:/
9:hugetlb:/
8:cpuset:/
7:blkio:/user.slice
6:memory:/user.slice
5:net_cls,net_prio:/
4:perf_event:/
3:devices:/user.slice
2:cpu,cpuacct:/test
test2
1:name=systemd:/
0::/

Signed-off-by: Alexander Kuznetsov <wwfq@yandex-team.ru>
Reported-by: Andrey Krasichkov <buglloc@yandex-team.ru>
Acked-by: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agobtrfs: promote debugging asserts to full-fledged checks in validate_super
Nikolay Borisov [Mon, 31 May 2021 09:26:01 +0000 (12:26 +0300)]
btrfs: promote debugging asserts to full-fledged checks in validate_super

commit aefd7f7065567a4666f42c0fc8cdb379d2e036bf upstream.

Syzbot managed to trigger this assert while performing its fuzzing.
Turns out it's better to have those asserts turned into full-fledged
checks so that in case buggy btrfs images are mounted the users gets
an error and mounting is stopped. Alternatively with CONFIG_BTRFS_ASSERT
disabled such image would have been erroneously allowed to be mounted.

Reported-by: syzbot+a6bf271c02e4fe66b4e4@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add uuids to the messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agobtrfs: return value from btrfs_mark_extent_written() in case of error
Ritesh Harjani [Sun, 30 May 2021 14:54:05 +0000 (20:24 +0530)]
btrfs: return value from btrfs_mark_extent_written() in case of error

commit e7b2ec3d3d4ebeb4cff7ae45cf430182fa6a49fb upstream.

We always return 0 even in case of an error in btrfs_mark_extent_written().
Fix it to return proper error value in case of a failure. All callers
handle it.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoasync_xor: check src_offs is not NULL before updating it
Xiao Ni [Fri, 28 May 2021 06:16:38 +0000 (14:16 +0800)]
async_xor: check src_offs is not NULL before updating it

commit 9be148e408df7d361ec5afd6299b7736ff3928b0 upstream.

When PAGE_SIZE is greater than 4kB, multiple stripes may share the same
page. Thus, src_offs is added to async_xor_offs() with array of offsets.
However, async_xor() passes NULL src_offs to async_xor_offs(). In such
case, src_offs should not be updated. Add a check before the update.

Fixes: ceaf2966ab08(async_xor: increase src_offs when dropping destination page)
Cc: stable@vger.kernel.org # v5.10+
Reported-by: Oleksandr Shchirskyi <oleksandr.shchirskyi@linux.intel.com>
Tested-by: Oleksandr Shchirskyi <oleksandr.shchirskyi@intel.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agostaging: rtl8723bs: Fix uninitialized variables
Wenli Looi [Tue, 8 Jun 2021 06:46:20 +0000 (23:46 -0700)]
staging: rtl8723bs: Fix uninitialized variables

commit 43c85d770db80cb135f576f8fde6ff1a08e707a4 upstream.

The sinfo.pertid and sinfo.generation variables are not initialized and
it causes a crash when we use this as a wireless access point.

[  456.873025] ------------[ cut here ]------------
[  456.878198] kernel BUG at mm/slub.c:3968!
[  456.882680] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM

  [ snip ]

[  457.271004] Backtrace:
[  457.273733] [<c02b7ee4>] (kfree) from [<c0e2a470>] (nl80211_send_station+0x954/0xfc4)
[  457.282481]  r9:eccca0c0 r8:e8edfec0 r7:00000000 r6:00000011 r5:e80a9480 r4:e8edfe00
[  457.291132] [<c0e29b1c>] (nl80211_send_station) from [<c0e2b18c>] (cfg80211_new_sta+0x90/0x1cc)
[  457.300850]  r10:e80a9480 r9:e8edfe00 r8:ea678cca r7:00000a20 r6:00000000 r5:ec46d000
[  457.309586]  r4:ec46d9e0
[  457.312433] [<c0e2b0fc>] (cfg80211_new_sta) from [<bf086684>] (rtw_cfg80211_indicate_sta_assoc+0x80/0x9c [r8723bs])
[  457.324095]  r10:00009930 r9:e85b9d80 r8:bf091050 r7:00000000 r6:00000000 r5:0000001c
[  457.332831]  r4:c1606788
[  457.335692] [<bf086604>] (rtw_cfg80211_indicate_sta_assoc [r8723bs]) from [<bf03df38>] (rtw_stassoc_event_callback+0x1c8/0x1d4 [r8723bs])
[  457.349489]  r7:ea678cc0 r6:000000a1 r5:f1225f84 r4:f086b000
[  457.355845] [<bf03dd70>] (rtw_stassoc_event_callback [r8723bs]) from [<bf048e4c>] (mlme_evt_hdl+0x8c/0xb4 [r8723bs])
[  457.367601]  r7:c1604900 r6:f086c4b8 r5:00000000 r4:f086c000
[  457.373959] [<bf048dc0>] (mlme_evt_hdl [r8723bs]) from [<bf03693c>] (rtw_cmd_thread+0x198/0x3d8 [r8723bs])
[  457.384744]  r5:f086e000 r4:f086c000
[  457.388754] [<bf0367a4>] (rtw_cmd_thread [r8723bs]) from [<c014a214>] (kthread+0x170/0x174)
[  457.398083]  r10:ed7a57e8 r9:bf0367a4 r8:f086b000 r7:e8ede000 r6:00000000 r5:e9975200
[  457.406828]  r4:e8369900
[  457.409653] [<c014a0a4>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  457.417718] Exception stack(0xe8edffb0 to 0xe8edfff8)
[  457.423356] ffa0:                                     00000000 00000000 00000000 00000000
[  457.432492] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  457.441618] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  457.449006]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c014a0a4
[  457.457750]  r4:e9975200
[  457.460574] Code: 1a000003 e5953004 e3130001 1a000000 (e7f001f2)
[  457.467381] ---[ end trace 4acbc8c15e9e6aa7 ]---

Link: https://forum.armbian.com/topic/14727-wifi-ap-kernel-bug-in-kernel-5444/
Fixes: 8689c051a201 ("cfg80211: dynamically allocate per-tid stats for station info")
Fixes: f5ea9120be2e ("nl80211: add generation number to all dumps")
Signed-off-by: Wenli Looi <wlooi@ucalgary.ca>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608064620.74059-1-wlooi@ucalgary.ca
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agokvm: avoid speculation-based attacks from out-of-range memslot accesses
Paolo Bonzini [Tue, 8 Jun 2021 19:31:42 +0000 (15:31 -0400)]
kvm: avoid speculation-based attacks from out-of-range memslot accesses

commit da27a83fd6cc7780fea190e1f5c19e87019da65c upstream.

KVM's mechanism for accessing guest memory translates a guest physical
address (gpa) to a host virtual address using the right-shifted gpa
(also known as gfn) and a struct kvm_memory_slot.  The translation is
performed in __gfn_to_hva_memslot using the following formula:

      hva = slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE

It is expected that gfn falls within the boundaries of the guest's
physical memory.  However, a guest can access invalid physical addresses
in such a way that the gfn is invalid.

__gfn_to_hva_memslot is called from kvm_vcpu_gfn_to_hva_prot, which first
retrieves a memslot through __gfn_to_memslot.  While __gfn_to_memslot
does check that the gfn falls within the boundaries of the guest's
physical memory or not, a CPU can speculate the result of the check and
continue execution speculatively using an illegal gfn. The speculation
can result in calculating an out-of-bounds hva.  If the resulting host
virtual address is used to load another guest physical address, this
is effectively a Spectre gadget consisting of two consecutive reads,
the second of which is data dependent on the first.

Right now it's not clear if there are any cases in which this is
exploitable.  One interesting case was reported by the original author
of this patch, and involves visiting guest page tables on x86.  Right
now these are not vulnerable because the hva read goes through get_user(),
which contains an LFENCE speculation barrier.  However, there are
patches in progress for x86 uaccess.h to mask kernel addresses instead of
using LFENCE; once these land, a guest could use speculation to read
from the VMM's ring 3 address space.  Other architectures such as ARM
already use the address masking method, and would be susceptible to
this same kind of data-dependent access gadgets.  Therefore, this patch
proactively protects from these attacks by masking out-of-bounds gfns
in __gfn_to_hva_memslot, which blocks speculation of invalid hvas.

Sean Christopherson noted that this patch does not cover
kvm_read_guest_offset_cached.  This however is limited to a few bytes
past the end of the cache, and therefore it is unlikely to be useful in
the context of building a chain of data dependent accesses.

Reported-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Co-developed-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoKVM: X86: MMU: Use the correct inherited permissions to get shadow page
Lai Jiangshan [Thu, 3 Jun 2021 05:24:55 +0000 (13:24 +0800)]
KVM: X86: MMU: Use the correct inherited permissions to get shadow page

commit b1bd5cba3306691c771d558e94baa73e8b0b96b7 upstream.

When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions.  Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different.  KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries.  Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.

For example, here is a shared pagetable:

   pgd[]   pud[]        pmd[]            virtual address pointers
                     /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
        /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
   pgd-|           (shared pmd[] as above)
        \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
                     \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)

  pud1 and pud2 point to the same pmd table, so:
  - ptr1 and ptr3 points to the same page.
  - ptr2 and ptr4 points to the same page.

(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)

- First, the guest reads from ptr1 first and KVM prepares a shadow
  page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
  "u--" comes from the effective permissions of pgd, pud1 and
  pmd1, which are stored in pt->access.  "u--" is used also to get
  the pagetable for pud1, instead of "uw-".

- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
  The hypervisor set up a shadow page for ptr2 with pt->access is "uw-"
  even though the pud1 pmd (because of the incorrect argument to
  kvm_mmu_get_page in the previous step) has role.access="u--".

- Then the guest reads from ptr3.  The hypervisor reuses pud1's
  shadow pmd for pud2, because both use "u--" for their permissions.
  Thus, the shadow pmd already includes entries for both pmd1 and pmd2.

- At last, the guest writes to ptr4.  This causes no vmexit or pagefault,
  because pud1's shadow page structures included an "uw-" page even though
  its role.access was "u--".

Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.

In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.

The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.

The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit.  So there is no fixes tag here.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com>
Cc: stable@vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoperf/x86/intel/uncore: Fix M2M event umask for Ice Lake server
Kan Liang [Tue, 1 Jun 2021 13:09:03 +0000 (06:09 -0700)]
perf/x86/intel/uncore: Fix M2M event umask for Ice Lake server

commit 848ff3768684701a4ce73a2ec0e5d438d4e2b0da upstream.

Perf tool errors out with the latest event list for the Ice Lake server.

event syntax error: 'unc_m2m_imc_reads.to_pmm'
                           \___ value too big for format, maximum is 255

The same as the Snow Ridge server, the M2M uncore unit in the Ice Lake
server has the unit mask extension field as well.

Fixes: 2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1622552943-119174-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agodrm: Lock pointer access in drm_master_release()
Desmond Cheong Zhi Xi [Wed, 9 Jun 2021 09:21:19 +0000 (17:21 +0800)]
drm: Lock pointer access in drm_master_release()

commit c336a5ee984708db4826ef9e47d184e638e29717 upstream.

This patch eliminates the following smatch warning:
drivers/gpu/drm/drm_auth.c:320 drm_master_release() warn: unlocked access 'master' (line 318) expected lock '&dev->master_mutex'

The 'file_priv->master' field should be protected by the mutex lock to
'&dev->master_mutex'. This is because other processes can concurrently
modify this field and free the current 'file_priv->master'
pointer. This could result in a use-after-free error when 'master' is
dereferenced in subsequent function calls to
'drm_legacy_lock_master_cleanup()' or to 'drm_lease_revoke()'.

An example of a scenario that would produce this error can be seen
from a similar bug in 'drm_getunique()' that was reported by Syzbot:
https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803

In the Syzbot report, another process concurrently acquired the
device's master mutex in 'drm_setmaster_ioctl()', then overwrote
'fpriv->master' in 'drm_new_set_master()'. The old value of
'fpriv->master' was subsequently freed before the mutex was unlocked.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210609092119.173590-1-desmondcheongzx@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agodrm: Fix use-after-free read in drm_getunique()
Desmond Cheong Zhi Xi [Tue, 8 Jun 2021 11:04:36 +0000 (19:04 +0800)]
drm: Fix use-after-free read in drm_getunique()

commit b436acd1cf7fac0ba987abd22955d98025c80c2b upstream.

There is a time-of-check-to-time-of-use error in drm_getunique() due
to retrieving file_priv->master prior to locking the device's master
mutex.

An example can be seen in the crash report of the use-after-free error
found by Syzbot:
https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803

In the report, the master pointer was used after being freed. This is
because another process had acquired the device's master mutex in
drm_setmaster_ioctl(), then overwrote fpriv->master in
drm_new_set_master(). The old value of fpriv->master was subsequently
freed before the mutex was unlocked.

To fix this, we lock the device's master mutex before retrieving the
pointer from from fpriv->master. This patch passes the Syzbot
reproducer test.

Reported-by: syzbot+c3a706cec1ea99e1c693@syzkaller.appspotmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608110436.239583-1-desmondcheongzx@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoRevert "ACPI: sleep: Put the FACS table after using it"
Zhang Rui [Thu, 3 Jun 2021 02:34:14 +0000 (10:34 +0800)]
Revert "ACPI: sleep: Put the FACS table after using it"

commit f1ffa9d4cccc8fdf6c03fb1b3429154d22037988 upstream.

Commit 95722237cb2a ("ACPI: sleep: Put the FACS table after using it")
puts the FACS table during initialization.

But the hardware signature bits in the FACS table need to be accessed,
after every hibernation, to compare with the original hardware
signature.

So there is no reason to release the FACS table mapping after
initialization.

This reverts commit 95722237cb2ae4f7b73471058cdb19e8f4057c93.

An alternative solution is to use acpi_gbl_FACS variable instead, which
is mapped by the ACPICA core and never released.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=212277
Reported-by: Stephan Hohe <sth.dev@tejp.de>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.8+ <stable@vger.kernel.org> # 5.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agospi: bcm2835: Fix out-of-bounds access with more than 4 slaves
Lukas Wunner [Sat, 22 May 2021 17:49:50 +0000 (19:49 +0200)]
spi: bcm2835: Fix out-of-bounds access with more than 4 slaves

commit 13817d466eb8713a1ffd254f537402f091d48444 upstream.

Commit 571e31fa60b3 ("spi: bcm2835: Cache CS register value for
->prepare_message()") limited the number of slaves to 3 at compile-time.
The limitation was necessitated by a statically-sized array prepare_cs[]
in the driver private data which contains a per-slave register value.

The commit sought to enforce the limitation at run-time by setting the
controller's num_chipselect to 3:  Slaves with a higher chipselect are
rejected by spi_add_device().

However the commit neglected that num_chipselect only limits the number
of *native* chipselects.  If GPIO chipselects are specified in the
device tree for more than 3 slaves, num_chipselect is silently raised by
of_spi_get_gpio_numbers() and the result are out-of-bounds accesses to
the statically-sized array prepare_cs[].

As a bandaid fix which is backportable to stable, raise the number of
allowed slaves to 24 (which "ought to be enough for anybody"), enforce
the limitation on slave ->setup and revert num_chipselect to 3 (which is
the number of native chipselects supported by the controller).
An upcoming for-next commit will allow an arbitrary number of slaves.

Fixes: 571e31fa60b3 ("spi: bcm2835: Cache CS register value for ->prepare_message()")
Reported-by: Joe Burmeister <joe.burmeister@devtank.co.uk>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v5.4+
Cc: Phil Elwell <phil@raspberrypi.com>
Link: https://lore.kernel.org/r/75854affc1923309fde05e47494263bde73e5592.1621703210.git.lukas@wunner.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: hda/realtek: fix mute/micmute LEDs for HP ZBook Power G8
Jeremy Szu [Tue, 8 Jun 2021 11:47:48 +0000 (19:47 +0800)]
ALSA: hda/realtek: fix mute/micmute LEDs for HP ZBook Power G8

commit 600dd2a7e8b62170d177381cc1303861f48f9780 upstream.

The HP ZBook Power G8 using ALC236 codec which using 0x02 to
control mute LED and 0x01 to control micmute LED.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608114750.32009-1-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8
Jeremy Szu [Sat, 5 Jun 2021 08:25:38 +0000 (16:25 +0800)]
ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8

commit dfb06401b4cdfc71e2fc3e19b877ab845cc9f7f7 upstream.

The HP EliteBook 840 Aero G8 using ALC285 codec which using 0x04 to
control mute LED and 0x01 to control micmute LED.
In the other hand, there is no output from right channel of speaker.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210605082539.41797-3-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 1040 G8
Jeremy Szu [Sat, 5 Jun 2021 08:25:37 +0000 (16:25 +0800)]
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 1040 G8

commit 61d3e87468fad82dc8e8cb6de7db563ada64b532 upstream.

The HP EliteBook x360 1040 G8 using ALC285 codec which using 0x04 to control
mute LED and 0x01 to control micmute LED.
In the other hand, there is no output from right channel of speaker.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210605082539.41797-2-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2
Jeremy Szu [Sat, 5 Jun 2021 08:25:36 +0000 (16:25 +0800)]
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2

commit 15d295b560e6dd45f839a53ae69e4f63b54eb32f upstream.

The HP Elite Dragonfly G2 using ALC285 codec which using 0x04 to control
mute LED and 0x01 to control micmute LED.
In the other hand, there is no output from right channel of speaker.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210605082539.41797-1-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: hda/realtek: headphone and mic don't work on an Acer laptop
Hui Wang [Tue, 8 Jun 2021 02:46:00 +0000 (10:46 +0800)]
ALSA: hda/realtek: headphone and mic don't work on an Acer laptop

commit 57c9e21a49b1c196cda28f54de9a5d556ac93f20 upstream.

There are 2 issues on this machine, the 1st one is mic's plug/unplug
can't be detected, that is because the mic is set to manual detecting
mode, need to apply ALC255_FIXUP_XIAOMI_HEADSET_MIC to set it to auto
detecting mode. The other one is headphone's plug/unplug can't be
detected by pulseaudio, that is because the pulseaudio will use
ucm2/sof-hda-dsp on this machine, and the ucm2 only handle
'Headphone Jack', but on this machine the headphone's pincfg sets the
location to Front, then the alsa mixer name is "Front Headphone Jack"
instead of "Headphone Jack", so override the pincfg to change location
to Left.

BugLink: http://bugs.launchpad.net/bugs/1930188
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20210608024600.6198-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun()
Takashi Sakamoto [Sat, 5 Jun 2021 09:10:54 +0000 (18:10 +0900)]
ALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun()

commit 9981b20a5e3694f4625ab5a1ddc98ce7503f6d12 upstream.

In the workqueue to queue wake-up event, isochronous context is not
processed, thus it's useless to check context for the workqueue to switch
status of runtime for PCM substream to XRUN. On the other hand, in
software IRQ context of 1394 OHCI, it's needed.

This commit fixes the bug introduced when tasklet was replaced with
workqueue.

Cc: <stable@vger.kernel.org>
Fixes: 2b3d2987d800 ("ALSA: firewire: Replace tasklet with work")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20210605091054.68866-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: seq: Fix race of snd_seq_timer_open()
Takashi Iwai [Thu, 10 Jun 2021 15:20:59 +0000 (17:20 +0200)]
ALSA: seq: Fix race of snd_seq_timer_open()

commit 83e197a8414c0ba545e7e3916ce05f836f349273 upstream.

The timer instance per queue is exclusive, and snd_seq_timer_open()
should have managed the concurrent accesses.  It looks as if it's
checking the already existing timer instance at the beginning, but
it's not right, because there is no protection, hence any later
concurrent call of snd_seq_timer_open() may override the timer
instance easily.  This may result in UAF, as the leftover timer
instance can keep running while the queue itself gets closed, as
spotted by syzkaller recently.

For avoiding the race, add a proper check at the assignment of
tmr->timeri again, and return -EBUSY if it's been already registered.

Reported-by: syzbot+ddc1260a83ed1cbf6fb5@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/000000000000dce34f05c42f110c@google.com
Link: https://lore.kernel.org/r/20210610152059.24633-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoi2c: mpc: implement erratum A-004447 workaround
Chris Packham [Tue, 11 May 2021 21:20:52 +0000 (09:20 +1200)]
i2c: mpc: implement erratum A-004447 workaround

[ Upstream commit 8f0cdec8b5fd94135d643662506ee94ae9e98785 ]

The P2040/P2041 has an erratum where the normal i2c recovery mechanism
does not work. Implement the alternative recovery mechanism documented
in the P2040 Chip Errata Rev Q.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoi2c: mpc: Make use of i2c_recover_bus()
Chris Packham [Mon, 29 Mar 2021 01:52:03 +0000 (14:52 +1300)]
i2c: mpc: Make use of i2c_recover_bus()

[ Upstream commit 65171b2df15eb7545431d75c2729b5062da89b43 ]

Move the existing calls of mpc_i2c_fixup() to a recovery function
registered via bus_recovery_info. This makes it more obvious that
recovery is supported and allows for a future where recovery is
triggered by the i2c core.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agospi: Cleanup on failure of initial setup
Lukas Wunner [Thu, 27 May 2021 21:10:56 +0000 (23:10 +0200)]
spi: Cleanup on failure of initial setup

[ Upstream commit 2ec6f20b33eb4f62ab90bdcd620436c883ec3af6 ]

Commit c7299fea6769 ("spi: Fix spi device unregister flow") changed the
SPI core's behavior if the ->setup() hook returns an error upon adding
an spi_device:  Before, the ->cleanup() hook was invoked to free any
allocations that were made by ->setup().  With the commit, that's no
longer the case, so the ->setup() hook is expected to free the
allocations itself.

I've identified 5 drivers which depend on the old behavior and am fixing
them up hereinafter: spi-bitbang.c spi-fsl-spi.c spi-omap-uwire.c
spi-omap2-mcspi.c spi-pxa2xx.c

Importantly, ->setup() is not only invoked on spi_device *addition*:
It may subsequently be called to *change* SPI parameters.  If changing
these SPI parameters fails, freeing memory allocations would be wrong.
That should only be done if the spi_device is finally destroyed.
I am therefore using a bool "initial_setup" in 4 of the affected drivers
to differentiate between the invocation on *adding* the spi_device and
any subsequent invocations: spi-bitbang.c spi-fsl-spi.c spi-omap-uwire.c
spi-omap2-mcspi.c

In spi-pxa2xx.c, it seems the ->setup() hook can only fail on spi_device
addition, not any subsequent calls.  It therefore doesn't need the bool.

It's worth noting that 5 other drivers already perform a cleanup if the
->setup() hook fails.  Before c7299fea6769, they caused a double-free
if ->setup() failed on spi_device addition.  Since the commit, they're
fine.  These drivers are: spi-mpc512x-psc.c spi-pl022.c spi-s3c64xx.c
spi-st-ssc4.c spi-tegra114.c

(spi-pxa2xx.c also already performs a cleanup, but only in one of
several error paths.)

Fixes: c7299fea6769 ("spi: Fix spi device unregister flow")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Saravana Kannan <saravanak@google.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> # pxa2xx
Link: https://lore.kernel.org/r/f76a0599469f265b69c371538794101fa37b5536.1622149321.git.lukas@wunner.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agospi: Don't have controller clean up spi device before driver unbind
Saravana Kannan [Wed, 5 May 2021 16:47:34 +0000 (09:47 -0700)]
spi: Don't have controller clean up spi device before driver unbind

[ Upstream commit 27e7db56cf3dffd302bd7ddfacb1d405cf671a2a ]

When a spi device is unregistered and triggers a driver unbind, the
driver might need to access the spi device. So, don't have the
controller clean up the spi device before the driver is unbound. Clean
up the spi device after the driver is unbound.

Fixes: c7299fea6769 ("spi: Fix spi device unregister flow")
Reported-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20210505164734.175546-1-saravanak@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agopowerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c controllers
Chris Packham [Tue, 11 May 2021 21:20:51 +0000 (09:20 +1200)]
powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c controllers

[ Upstream commit 19ae697a1e4edf1d755b413e3aa38da65e2db23b ]

The i2c controllers on the P1010 have an erratum where the documented
scheme for i2c bus recovery will not work (A-004447). A different
mechanism is needed which is documented in the P1010 Chip Errata Rev L.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agopowerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c controllers
Chris Packham [Tue, 11 May 2021 21:20:50 +0000 (09:20 +1200)]
powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c controllers

[ Upstream commit 7adc7b225cddcfd0f346d10144fd7a3d3d9f9ea7 ]

The i2c controllers on the P2040/P2041 have an erratum where the
documented scheme for i2c bus recovery will not work (A-004447). A
different mechanism is needed which is documented in the P2040 Chip
Errata Rev Q (latest available at the time of writing).

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonvmet: fix false keep-alive timeout when a controller is torn down
Sagi Grimberg [Tue, 25 May 2021 15:49:05 +0000 (08:49 -0700)]
nvmet: fix false keep-alive timeout when a controller is torn down

[ Upstream commit aaeadd7075dc9e184bc7876e9dd7b3bada771df2 ]

Controller teardown flow may take some time in case it has many I/O
queues, and the host may not send us keep-alive during this period.
Hence reset the traffic based keep-alive timer so we don't trigger
a controller teardown as a result of a keep-alive expiration.

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME
Sagi Grimberg [Fri, 21 May 2021 21:51:15 +0000 (14:51 -0700)]
nvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME

[ Upstream commit 042a3eaad6daeabcfaf163aa44da8ea3cf8b5496 ]

We need to select NVME_CORE.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agobnx2x: Fix missing error code in bnx2x_iov_init_one()
Jiapeng Chong [Tue, 25 May 2021 11:00:12 +0000 (19:00 +0800)]
bnx2x: Fix missing error code in bnx2x_iov_init_one()

[ Upstream commit 65161c35554f7135e6656b3df1ce2c500ca0bdcf ]

Eliminate the follow smatch warning:

drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:1227
bnx2x_iov_init_one() warn: missing error code 'err'.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agodm verity: fix require_signatures module_param permissions
John Keeping [Wed, 12 May 2021 11:14:21 +0000 (12:14 +0100)]
dm verity: fix require_signatures module_param permissions

[ Upstream commit 0c1f3193b1cdd21e7182f97dc9bca7d284d18a15 ]

The third parameter of module_param() is permissions for the sysfs node
but it looks like it is being used as the initial value of the parameter
here.  In fact, false here equates to omitting the file from sysfs and
does not affect the value of require_signatures.

Making the parameter writable is not simple because going from
false->true is fine but it should not be possible to remove the
requirement to verify a signature.  But it can be useful to inspect the
value of this parameter from userspace, so change the permissions to
make a read-only file in sysfs.

Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoMIPS: Fix kernel hang under FUNCTION_GRAPH_TRACER and PREEMPT_TRACER
Tiezhu Yang [Sat, 15 May 2021 11:02:01 +0000 (19:02 +0800)]
MIPS: Fix kernel hang under FUNCTION_GRAPH_TRACER and PREEMPT_TRACER

[ Upstream commit 78cf0eb926cb1abeff2106bae67752e032fe5f3e ]

When update the latest mainline kernel with the following three configs,
the kernel hangs during startup:

(1) CONFIG_FUNCTION_GRAPH_TRACER=y
(2) CONFIG_PREEMPT_TRACER=y
(3) CONFIG_FTRACE_STARTUP_TEST=y

When update the latest mainline kernel with the above two configs (1)
and (2), the kernel starts normally, but it still hangs when execute
the following command:

echo "function_graph" > /sys/kernel/debug/tracing/current_tracer

Without CONFIG_PREEMPT_TRACER=y, the above two kinds of kernel hangs
disappeared, so it seems that CONFIG_PREEMPT_TRACER has some influences
with function_graph tracer at the first glance.

I use ejtag to find out the epc address is related with preempt_enable()
in the file arch/mips/lib/mips-atomic.c, because function tracing can
trace the preempt_{enable,disable} calls that are traced, replace them
with preempt_{enable,disable}_notrace to prevent function tracing from
going into an infinite loop, and then it can fix the kernel hang issue.

By the way, it seems that this commit is a complement and improvement of
commit f93a1a00f2bd ("MIPS: Fix crash that occurs when function tracing
is enabled").

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonvme-fabrics: decode host pathing error for connect
Hannes Reinecke [Fri, 21 May 2021 08:23:46 +0000 (10:23 +0200)]
nvme-fabrics: decode host pathing error for connect

[ Upstream commit 4d9442bf263ac45d495bb7ecf75009e59c0622b2 ]

Add an additional decoding for 'host pathing error' during connect.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet: dsa: microchip: enable phy errata workaround on 9567
George McCollister [Mon, 24 May 2021 20:29:53 +0000 (15:29 -0500)]
net: dsa: microchip: enable phy errata workaround on 9567

[ Upstream commit 8c42a49738f16af0061f9ae5c2f5a955f268d9e3 ]

Also enable phy errata workaround on 9567 since has the same errata as
the 9477 according to the manufacture's documentation.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet: appletalk: cops: Fix data race in cops_probe1
Saubhik Mukherjee [Mon, 24 May 2021 13:37:12 +0000 (19:07 +0530)]
net: appletalk: cops: Fix data race in cops_probe1

[ Upstream commit a4dd4fc6105e54393d637450a11d4cddb5fabc4f ]

In cops_probe1(), there is a write to dev->base_addr after requesting an
interrupt line and registering the interrupt handler cops_interrupt().
The handler might be called in parallel to handle an interrupt.
cops_interrupt() tries to read dev->base_addr leading to a potential
data race. So write to dev->base_addr before calling request_irq().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Saubhik Mukherjee <saubhik.mukherjee@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet: macb: ensure the device is available before accessing GEMGXL control registers
Zong Li [Sat, 22 May 2021 09:16:11 +0000 (17:16 +0800)]
net: macb: ensure the device is available before accessing GEMGXL control registers

[ Upstream commit 5eff1461a6dec84f04fafa9128548bad51d96147 ]

If runtime power menagement is enabled, the gigabit ethernet PLL would
be disabled after macb_probe(). During this period of time, the system
would hang up if we try to access GEMGXL control registers.

We can't put runtime_pm_get/runtime_pm_put/ there due to the issue of
sleep inside atomic section (7fa2955ff70ce453 ("sh_eth: Fix sleeping
function called from invalid context"). Add netif_running checking to
ensure the device is available before accessing GEMGXL device.

Changed in v2:
 - Use netif_running instead of its own flag

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoscsi: target: qla2xxx: Wait for stop_phase1 at WWN removal
Dmitry Bogdanov [Thu, 15 Apr 2021 20:35:54 +0000 (23:35 +0300)]
scsi: target: qla2xxx: Wait for stop_phase1 at WWN removal

[ Upstream commit 2ef7665dfd88830f15415ba007c7c9a46be7acd8 ]

Target de-configuration panics at high CPU load because TPGT and WWPN can
be removed on separate threads.

TPGT removal requests a reset HBA on a separate thread and waits for reset
complete (phase1). Due to high CPU load that HBA reset can be delayed for
some time.

WWPN removal does qlt_stop_phase2(). There it is believed that phase1 has
already completed and thus tgt.tgt_ops is subsequently cleared. However,
tgt.tgt_ops is needed to process incoming traffic and therefore this will
cause one of the following panics:

NIP qlt_reset+0x7c/0x220 [qla2xxx]
LR  qlt_reset+0x68/0x220 [qla2xxx]
Call Trace:
0xc000003ffff63a78 (unreliable)
qlt_handle_imm_notify+0x800/0x10c0 [qla2xxx]
qlt_24xx_atio_pkt+0x208/0x590 [qla2xxx]
qlt_24xx_process_atio_queue+0x33c/0x7a0 [qla2xxx]
qla83xx_msix_atio_q+0x54/0x90 [qla2xxx]

or

NIP qlt_24xx_handle_abts+0xd0/0x2a0 [qla2xxx]
LR  qlt_24xx_handle_abts+0xb4/0x2a0 [qla2xxx]
Call Trace:
qlt_24xx_handle_abts+0x90/0x2a0 [qla2xxx] (unreliable)
qlt_24xx_process_atio_queue+0x500/0x7a0 [qla2xxx]
qla83xx_msix_atio_q+0x54/0x90 [qla2xxx]

or

NIP qlt_create_sess+0x90/0x4e0 [qla2xxx]
LR  qla24xx_do_nack_work+0xa8/0x180 [qla2xxx]
Call Trace:
0xc0000000348fba30 (unreliable)
qla24xx_do_nack_work+0xa8/0x180 [qla2xxx]
qla2x00_do_work+0x674/0xbf0 [qla2xxx]
qla2x00_iocb_work_fn

The patch fixes the issue by serializing qlt_stop_phase1() and
qlt_stop_phase2() functions to make WWPN removal wait for phase1
completion.

Link: https://lore.kernel.org/r/20210415203554.27890-1-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoscsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irq
Yang Yingliang [Wed, 19 May 2021 13:05:19 +0000 (21:05 +0800)]
scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irq

[ Upstream commit 7907a021e4bbfa29cccacd2ba2dade894d9a7d4c ]

irqs allocated with devm_request_irq() should not be freed using
free_irq(). Doing so causes a dangling pointer and a subsequent double
free.

Link: https://lore.kernel.org/r/20210519130519.2661938-1-yangyingliang@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoscsi: vmw_pvscsi: Set correct residual data length
Matt Wang [Wed, 19 May 2021 09:49:32 +0000 (09:49 +0000)]
scsi: vmw_pvscsi: Set correct residual data length

[ Upstream commit e662502b3a782d479e67736a5a1c169a703d853a ]

Some commands (such as INQUIRY) may return less data than the initiator
requested. To avoid conducting useless information, set the right residual
count to make upper layer aware of this.

Before (INQUIRY PAGE 0xB0 with 128B buffer):

$ sg_raw -r 128 /dev/sda 12 01 B0 00 80 00
SCSI Status: Good

Received 128 bytes of data:
 00 00 b0 00 3c 01 00 00 00 00 00 00 00 00 00 00 00 ...<............
 10 00 00 00 00 00 01 00 00 00 00 00 40 00 00 08 00 ...........@....
 20 80 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 .......... .....
 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

After:

$ sg_raw -r 128 /dev/sda 12 01 B0 00 80 00
SCSI Status: Good

Received 64 bytes of data:
00 00 b0 00 3c 01 00 00 00 00 00 00 00 00 00 00 00 ...<............
10 00 00 00 00 00 01 00 00 00 00 00 40 00 00 08 00 ...........@....
20 80 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 .......... .....
30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

[mkp: clarified description]

Link: https://lore.kernel.org/r/03C41093-B62E-43A2-913E-CFC92F1C70C3@vmware.com
Signed-off-by: Matt Wang <wwentao@vmware.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoscsi: bnx2fc: Return failure if io_req is already in ABTS processing
Javed Hasan [Wed, 19 May 2021 06:14:16 +0000 (23:14 -0700)]
scsi: bnx2fc: Return failure if io_req is already in ABTS processing

[ Upstream commit 122c81c563b0c1c6b15ff76a9159af5ee1f21563 ]

Return failure from bnx2fc_eh_abort() if io_req is already in ABTS
processing.

Link: https://lore.kernel.org/r/20210519061416.19321-1-jhasan@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet:sfc: fix non-freed irq in legacy irq mode
Íñigo Huguet [Fri, 21 May 2021 14:38:35 +0000 (16:38 +0200)]
net:sfc: fix non-freed irq in legacy irq mode

[ Upstream commit 8f03eeb6e0a0a0b8d617ee0a4bce729e47130036 ]

SFC driver can be configured via modparam to work using MSI-X, MSI or
legacy IRQ interrupts. In the last one, the interrupt was not properly
released on module remove.

It was not freed because the flag irqs_hooked was not set during
initialization in the case of using legacy IRQ.

Example of (trimmed) trace during module remove without this fix:

remove_proc_entry: removing non-empty directory 'irq/125', leaking at least '0000:3b:00.1'
WARNING: CPU: 39 PID: 3658 at fs/proc/generic.c:715 remove_proc_entry+0x15c/0x170
...trimmed...
Call Trace:
 unregister_irq_proc+0xe3/0x100
 free_desc+0x29/0x70
 irq_free_descs+0x47/0x70
 mp_unmap_irq+0x58/0x60
 acpi_unregister_gsi_ioapic+0x2a/0x40
 acpi_pci_irq_disable+0x78/0xb0
 pci_disable_device+0xd1/0x100
 efx_pci_remove+0xa1/0x1e0 [sfc]
 pci_device_remove+0x38/0xa0
 __device_release_driver+0x177/0x230
 driver_detach+0xcb/0x110
 bus_remove_driver+0x58/0xd0
 pci_unregister_driver+0x2a/0xb0
 efx_exit_module+0x24/0xf40 [sfc]
 __do_sys_delete_module.constprop.0+0x171/0x280
 ? exit_to_user_mode_prepare+0x83/0x1d0
 do_syscall_64+0x3d/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f9f9385800b
...trimmed...

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoRDS tcp loopback connection can hang
Rao Shoaib [Fri, 21 May 2021 18:08:06 +0000 (11:08 -0700)]
RDS tcp loopback connection can hang

[ Upstream commit aced3ce57cd37b5ca332bcacd370d01f5a8c5371 ]

When TCP is used as transport and a program on the
system connects to RDS port 16385, connection is
accepted but denied per the rules of RDS. However,
RDS connections object is left in the list. Next
loopback connection will select that connection
object as it is at the head of list. The connection
attempt will hang as the connection object is set
to connect over TCP which is not allowed

The issue can be reproduced easily, use rds-ping
to ping a local IP address. After that use any
program like ncat to connect to the same IP
address and port 16385. This will hang so ctrl-c out.
Now try rds-ping, it will hang.

To fix the issue this patch adds checks to disallow
the connection object creation and destroys the
connection object.

Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet/qla3xxx: fix schedule while atomic in ql_sem_spinlock
Zheyu Ma [Thu, 20 May 2021 12:32:36 +0000 (12:32 +0000)]
net/qla3xxx: fix schedule while atomic in ql_sem_spinlock

[ Upstream commit 13a6f3153922391e90036ba2267d34eed63196fc ]

When calling the 'ql_sem_spinlock', the driver has already acquired the
spin lock, so the driver should not call 'ssleep' in atomic context.

This bug can be fixed by using 'mdelay' instead of 'ssleep'.

The KASAN's log reveals it:

[    3.238124 ] BUG: scheduling while atomic: swapper/0/1/0x00000002
[    3.238748 ] 2 locks held by swapper/0/1:
[    3.239151 ]  #0: ffff88810177b240 (&dev->mutex){....}-{3:3}, at:
__device_driver_lock+0x41/0x60
[    3.240026 ]  #1: ffff888107c60e28 (&qdev->hw_lock){....}-{2:2}, at:
ql3xxx_probe+0x2aa/0xea0
[    3.240873 ] Modules linked in:
[    3.241187 ] irq event stamp: 460854
[    3.241541 ] hardirqs last  enabled at (460853): [<ffffffff843051bf>]
_raw_spin_unlock_irqrestore+0x4f/0x70
[    3.242245 ] hardirqs last disabled at (460854): [<ffffffff843058ca>]
_raw_spin_lock_irqsave+0x2a/0x70
[    3.242245 ] softirqs last  enabled at (446076): [<ffffffff846002e4>]
__do_softirq+0x2e4/0x4b1
[    3.242245 ] softirqs last disabled at (446069): [<ffffffff811ba5e0>]
irq_exit_rcu+0x100/0x110
[    3.242245 ] Preemption disabled at:
[    3.242245 ] [<ffffffff828ca5ba>] ql3xxx_probe+0x2aa/0xea0
[    3.242245 ] Kernel panic - not syncing: scheduling while atomic
[    3.242245 ] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
5.13.0-rc1-00145
-gee7dc339169-dirty #16
[    3.242245 ] Call Trace:
[    3.242245 ]  dump_stack+0xba/0xf5
[    3.242245 ]  ? ql3xxx_probe+0x1f0/0xea0
[    3.242245 ]  panic+0x15a/0x3f2
[    3.242245 ]  ? vprintk+0x76/0x150
[    3.242245 ]  ? ql3xxx_probe+0x2aa/0xea0
[    3.242245 ]  __schedule_bug+0xae/0xe0
[    3.242245 ]  __schedule+0x72e/0xa00
[    3.242245 ]  schedule+0x43/0xf0
[    3.242245 ]  schedule_timeout+0x28b/0x500
[    3.242245 ]  ? del_timer_sync+0xf0/0xf0
[    3.242245 ]  ? msleep+0x2f/0x70
[    3.242245 ]  msleep+0x59/0x70
[    3.242245 ]  ql3xxx_probe+0x307/0xea0
[    3.242245 ]  ? _raw_spin_unlock_irqrestore+0x3a/0x70
[    3.242245 ]  ? pci_device_remove+0x110/0x110
[    3.242245 ]  local_pci_probe+0x45/0xa0
[    3.242245 ]  pci_device_probe+0x12b/0x1d0
[    3.242245 ]  really_probe+0x2a9/0x610
[    3.242245 ]  driver_probe_device+0x90/0x1d0
[    3.242245 ]  ? mutex_lock_nested+0x1b/0x20
[    3.242245 ]  device_driver_attach+0x68/0x70
[    3.242245 ]  __driver_attach+0x124/0x1b0
[    3.242245 ]  ? device_driver_attach+0x70/0x70
[    3.242245 ]  bus_for_each_dev+0xbb/0x110
[    3.242245 ]  ? rdinit_setup+0x45/0x45
[    3.242245 ]  driver_attach+0x27/0x30
[    3.242245 ]  bus_add_driver+0x1eb/0x2a0
[    3.242245 ]  driver_register+0xa9/0x180
[    3.242245 ]  __pci_register_driver+0x82/0x90
[    3.242245 ]  ? yellowfin_init+0x25/0x25
[    3.242245 ]  ql3xxx_driver_init+0x23/0x25
[    3.242245 ]  do_one_initcall+0x7f/0x3d0
[    3.242245 ]  ? rdinit_setup+0x45/0x45
[    3.242245 ]  ? rcu_read_lock_sched_held+0x4f/0x80
[    3.242245 ]  kernel_init_freeable+0x2aa/0x301
[    3.242245 ]  ? rest_init+0x2c0/0x2c0
[    3.242245 ]  kernel_init+0x18/0x190
[    3.242245 ]  ? rest_init+0x2c0/0x2c0
[    3.242245 ]  ? rest_init+0x2c0/0x2c0
[    3.242245 ]  ret_from_fork+0x1f/0x30
[    3.242245 ] Dumping ftrace buffer:
[    3.242245 ]    (ftrace buffer empty)
[    3.242245 ] Kernel Offset: disabled
[    3.242245 ] Rebooting in 1 seconds.

Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agowq: handle VM suspension in stall detection
Sergey Senozhatsky [Thu, 20 May 2021 10:14:22 +0000 (19:14 +0900)]
wq: handle VM suspension in stall detection

[ Upstream commit 940d71c6462e8151c78f28e4919aa8882ff2054e ]

If VCPU is suspended (VM suspend) in wq_watchdog_timer_fn() then
once this VCPU resumes it will see the new jiffies value, while it
may take a while before IRQ detects PVCLOCK_GUEST_STOPPED on this
VCPU and updates all the watchdogs via pvclock_touch_watchdogs().
There is a small chance of misreported WQ stalls in the meantime,
because new jiffies is time_after() old 'ts + thresh'.

wq_watchdog_timer_fn()
{
for_each_pool(pool, pi) {
if (time_after(jiffies, ts + thresh)) {
pr_emerg("BUG: workqueue lockup - pool");
}
}
}

Save jiffies at the beginning of this function and use that value
for stall detection. If VM gets suspended then we continue using
"old" jiffies value and old WQ touch timestamps. If IRQ at some
point restarts the stall detection cycle (pvclock_touch_watchdogs())
then old jiffies will always be before new 'ts + thresh'.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>