platform/kernel/linux-rpi.git
3 years ago[regression fix] really dumb fuckup in sparc64 __csum_partial_copy() changes
Al Viro [Tue, 8 Dec 2020 21:37:47 +0000 (16:37 -0500)]
[regression fix] really dumb fuckup in sparc64 __csum_partial_copy() changes

~0U is -1, not 1

Reported-by: Anatoly Pugachev <matorola@gmail.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com>
Fixes: fdf8bee96f9a "sparc64: propagate the calling convention changes down to __csum_partial_copy_...()"
X-brown-paperbag: yes
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agonetfilter: nftables: comment indirect serialization of commit_mutex with rtnl_mutex
Pablo Neira Ayuso [Tue, 8 Dec 2020 17:57:02 +0000 (18:57 +0100)]
netfilter: nftables: comment indirect serialization of commit_mutex with rtnl_mutex

Add an explicit comment in the code to describe the indirect
serialization of the holders of the commit_mutex with the rtnl_mutex.
Commit 90d2723c6d4c ("netfilter: nf_tables: do not hold reference on
netdevice from preparation phase") already describes this, but a comment
in this case is better for reference.

Reported-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Tue, 8 Dec 2020 20:20:34 +0000 (12:20 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs

Pull seq_file fix from Al Viro:
 "This fixes a regression introduced in this cycle wrt iov_iter based
  variant for reading a seq_file"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix return values of seq_read_iter()

3 years agonetfilter: nft_dynset: fix timeouts later than 23 days
Pablo Neira Ayuso [Tue, 8 Dec 2020 17:25:53 +0000 (18:25 +0100)]
netfilter: nft_dynset: fix timeouts later than 23 days

Use nf_msecs_to_jiffies64 and nf_jiffies64_to_msecs as provided by
8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23
days"), otherwise ruleset listing breaks.

Fixes: a8b1e36d0d1d ("netfilter: nft_dynset: fix element timeout for HZ != 1000")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agobonding: fix feature flag setting at init time
Jarod Wilson [Sat, 5 Dec 2020 17:22:29 +0000 (12:22 -0500)]
bonding: fix feature flag setting at init time

Don't try to adjust XFRM support flags if the bond device isn't yet
registered. Bad things can currently happen when netdev_change_features()
is called without having wanted_features fully filled in yet. This code
runs both on post-module-load mode changes, as well as at module init
time, and when run at module init time, it is before register_netdevice()
has been called and filled in wanted_features. The empty wanted_features
led to features also getting emptied out, which was definitely not the
intended behavior, so prevent that from happening.

Originally, I'd hoped to stop adjusting wanted_features at all in the
bonding driver, as it's documented as being something only the network
core should touch, but we actually do need to do this to properly update
both the features and wanted_features fields when changing the bond type,
or we get to a situation where ethtool sees:

    esp-hw-offload: off [requested on]

I do think we should be using netdev_update_features instead of
netdev_change_features here though, so we only send notifiers when the
features actually changed.

Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
Reported-by: Ivan Vecera <ivecera@redhat.com>
Suggested-by: Ivan Vecera <ivecera@redhat.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Link: https://lore.kernel.org/r/20201205172229.576587-1-jarod@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotools/bpftool: Fix PID fetching with a lot of results
Andrii Nakryiko [Fri, 4 Dec 2020 23:20:01 +0000 (15:20 -0800)]
tools/bpftool: Fix PID fetching with a lot of results

In case of having so many PID results that they don't fit into a singe page
(4096) bytes, bpftool will erroneously conclude that it got corrupted data due
to 4096 not being a multiple of struct pid_iter_entry, so the last entry will
be partially truncated. Fix this by sizing the buffer to fit exactly N entries
with no truncation in the middle of record.

Fixes: d53dee3fe013 ("tools/bpftool: Show info for processes holding BPF map/prog/link/btf FDs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201204232002.3589803-1-andrii@kernel.org
3 years agodrm/i915/gt: Declare gen9 has 64 mocs entries!
Chris Wilson [Fri, 27 Nov 2020 10:25:40 +0000 (10:25 +0000)]
drm/i915/gt: Declare gen9 has 64 mocs entries!

We checked the table size against a hardcoded number of entries, and
that number was excluding the special mocs registers at the end.

Fixes: 777a7717d60c ("drm/i915/gt: Program mocs:63 for cache eviction on gen9")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127102540.13117-1-chris@chris-wilson.co.uk
(cherry picked from commit 444fbf5d7058099447c5366ba8bb60d610aeb44b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[backported and updated the Fixes sha]

3 years agodrm/i915/display/dp: Compute the correct slice count for VDSC on DP
Manasi Navare [Fri, 4 Dec 2020 20:58:04 +0000 (12:58 -0800)]
drm/i915/display/dp: Compute the correct slice count for VDSC on DP

This patch fixes the slice count computation algorithm
for calculating the slice count based on Peak pixel rate
and the max slice width allowed on the DSC engines.
We need to ensure slice count > min slice count req
as per DP spec based on peak pixel rate and that it is
greater than min slice count based on the max slice width
advertised by DPCD. So use max of these two.
In the prev patch we were using min of these 2 causing it
to violate the max slice width limitation causing a blank
screen on 8K@60.

Fixes: d9218c8f6cf4 ("drm/i915/dp: Add helpers for Compressed BPP and Slice Count for DSC")
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: <stable@vger.kernel.org> # v5.0+
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204205804.25225-1-manasi.d.navare@intel.com
(cherry picked from commit d371d6ea92ad2a47f42bbcaa786ee5f6069c9c14)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agodrm/i915: fix size_t greater or equal to zero comparison
Colin Ian King [Fri, 2 Oct 2020 17:03:54 +0000 (18:03 +0100)]
drm/i915: fix size_t greater or equal to zero comparison

Currently the check that the unsigned size_t variable i is >= 0
is always true because the unsigned variable will never be negative,
causing the loop to run forever.  Fix this by changing the
pre-decrement check to a zero check on i followed by a decrement of i.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: bfed6708d6c9 ("drm/i915: use vmap in shmem_pin_map")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002170354.94627-1-colin.king@canonical.com
(cherry picked from commit e70956a2498dc81d8f2522cba074f55ae910e13c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agodrm/i915/gt: Cancel the preemption timeout on responding to it
Chris Wilson [Fri, 4 Dec 2020 15:12:32 +0000 (15:12 +0000)]
drm/i915/gt: Cancel the preemption timeout on responding to it

We currently presume that the engine reset is successful, cancelling the
expired preemption timer in the process. However, engine resets can
fail, leaving the timeout still pending and we will then respond to the
timeout again next time the tasklet fires. What we want is for the
failed engine reset to be promoted to a full device reset, which is
kicked by the heartbeat once the engine stops processing events.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-2-chris@chris-wilson.co.uk
(cherry picked from commit d997e240ceecb4f732611985d3a939ad1bfc1893)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agodrm/i915/gt: Ignore repeated attempts to suspend request flow across reset
Chris Wilson [Fri, 4 Dec 2020 15:12:31 +0000 (15:12 +0000)]
drm/i915/gt: Ignore repeated attempts to suspend request flow across reset

Before reseting the engine, we suspend the execution of the guilty
request, so that we can continue execution with a new context while we
slowly compress the captured error state for the guilty context. However,
if the reset fails, we will promptly attempt to reset the same request
again, and discover the ongoing capture. Ignore the second attempt to
suspend and capture the same request.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-1-chris@chris-wilson.co.uk
(cherry picked from commit b969540500bce60cf1cdfff5464388af32b9a553)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agodrm/i915/gem: Propagate error from cancelled submit due to context closure
Chris Wilson [Thu, 3 Dec 2020 10:34:32 +0000 (10:34 +0000)]
drm/i915/gem: Propagate error from cancelled submit due to context closure

In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6bd056 ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.

Fixes: 61231f6bd056 ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
(cherry picked from commit ba38b79eaeaeed29d2383f122d5c711ebf5ed3d1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agodrm/i915/gem: Check the correct variable in selftest
Dan Carpenter [Thu, 3 Dec 2020 08:45:17 +0000 (11:45 +0300)]
drm/i915/gem: Check the correct variable in selftest

There is a copy and paste bug in this code.  It's supposed to check
"obj2" instead of checking "obj" a second time.

Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/8ilneOcJAjwqU4t@mwand
(cherry picked from commit 14f2d7604f7ce4cb3d303aea17292d119dfafa75)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agonetfilter: x_tables: Switch synchronization to RCU
Subash Abhinov Kasiviswanathan [Wed, 25 Nov 2020 18:27:22 +0000 (11:27 -0700)]
netfilter: x_tables: Switch synchronization to RCU

When running concurrent iptables rules replacement with data, the per CPU
sequence count is checked after the assignment of the new information.
The sequence count is used to synchronize with the packet path without the
use of any explicit locking. If there are any packets in the packet path using
the table information, the sequence count is incremented to an odd value and
is incremented to an even after the packet process completion.

The new table value assignment is followed by a write memory barrier so every
CPU should see the latest value. If the packet path has started with the old
table information, the sequence counter will be odd and the iptables
replacement will wait till the sequence count is even prior to freeing the
old table info.

However, this assumes that the new table information assignment and the memory
barrier is actually executed prior to the counter check in the replacement
thread. If CPU decides to execute the assignment later as there is no user of
the table information prior to the sequence check, the packet path in another
CPU may use the old table information. The replacement thread would then free
the table information under it leading to a use after free in the packet
processing context-

Unable to handle kernel NULL pointer dereference at virtual
address 000000000000008e
pc : ip6t_do_table+0x5d0/0x89c
lr : ip6t_do_table+0x5b8/0x89c
ip6t_do_table+0x5d0/0x89c
ip6table_filter_hook+0x24/0x30
nf_hook_slow+0x84/0x120
ip6_input+0x74/0xe0
ip6_rcv_finish+0x7c/0x128
ipv6_rcv+0xac/0xe4
__netif_receive_skb+0x84/0x17c
process_backlog+0x15c/0x1b8
napi_poll+0x88/0x284
net_rx_action+0xbc/0x23c
__do_softirq+0x20c/0x48c

This could be fixed by forcing instruction order after the new table
information assignment or by switching to RCU for the synchronization.

Fixes: 80055dab5de0 ("netfilter: x_tables: make xt_replace_table wait until old rules are not used anymore")
Reported-by: Sean Tranchetti <stranche@codeaurora.org>
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3 years agopinctrl: aspeed: Fix GPIO requests on pass-through banks
Andrew Jeffery [Thu, 26 Nov 2020 06:33:37 +0000 (17:03 +1030)]
pinctrl: aspeed: Fix GPIO requests on pass-through banks

Commit 6726fbff19bf ("pinctrl: aspeed: Fix GPI only function problem.")
fixes access to GPIO banks T and U on the AST2600. Both banks contain
input-only pins and the GPIO pin function is named GPITx and GPIUx
respectively. Unfortunately the fix had a negative impact on GPIO banks
D and E for the AST2400 and AST2500 where the GPIO pass-through
functions take similar "GPI"-style names. The net effect on the older
SoCs was that when the GPIO subsystem requested a pin in banks D or E be
muxed for GPIO, they were instead muxed for pass-through mode.
Mistakenly muxing pass-through mode e.g. breaks booting the host on
IBM's Witherspoon (AC922) platform where GPIOE0 is used for FSI.

Further exploit the names in the provided expression structure to
differentiate pass-through from pin-specific GPIO modes.

This follow-up fix gives the expected behaviour for the following tests:

Witherspoon BMC (AST2500):

1. Power-on the Witherspoon host
2. Request GPIOD1 be muxed via /sys/class/gpio/export
3. Request GPIOE1 be muxed via /sys/class/gpio/export
4. Request the balls for GPIOs E2 and E3 be muxed as GPIO pass-through
   ("GPIE2" mode) via a pinctrl hog in the devicetree

Rainier BMC (AST2600):

5. Request GPIT0 be muxed via /sys/class/gpio/export
6. Request GPIU0 be muxed via /sys/class/gpio/export

Together the tests demonstrate that all three pieces of functionality
(general GPIOs via 1, 2 and 3, input-only GPIOs via 5 and 6, pass-through
mode via 4) operate as desired across old and new SoCs.

Fixes: 9b92f5c51e9a ("pinctrl: aspeed: Fix GPI only function problem.")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Tested-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Cc: Billy Tsai <billy_tsai@aspeedtech.com>
Cc: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20201126063337.489927-1-andrew@aj.id.au
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
3 years agomedia: vidtv: fix some warnings
Mauro Carvalho Chehab [Tue, 8 Dec 2020 06:57:13 +0000 (07:57 +0100)]
media: vidtv: fix some warnings

As reported by sparse:

drivers/media/test-drivers/vidtv/vidtv_ts.h:47:47: warning: array of flexible structures
drivers/media/test-drivers/vidtv/vidtv_channel.c:458:54: warning: incorrect type in argument 3 (different base types)
drivers/media/test-drivers/vidtv/vidtv_channel.c:458:54:    expected unsigned short [usertype] service_id
drivers/media/test-drivers/vidtv/vidtv_channel.c:458:54:    got restricted __be16 [usertype] service_id
drivers/media/test-drivers/vidtv/vidtv_s302m.c:471 vidtv_s302m_encoder_init() warn: possible memory leak of 'e'

Address such warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
3 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Jakub Kicinski [Tue, 8 Dec 2020 02:29:53 +0000 (18:29 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2020-12-07

1) Sysbot reported fixes for the new 64/32 bit compat layer.
   From Dmitry Safonov.

2) Fix a memory leak in xfrm_user_policy that was introduced
   by adding the 64/32 bit compat layer. From Yu Kuai.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  net: xfrm: fix memory leak in xfrm_user_policy()
  xfrm/compat: Don't allocate memory with __GFP_ZERO
  xfrm/compat: memset(0) 64-bit padding at right place
  xfrm/compat: Translate by copying XFRMA_UNSPEC attribute
====================

Link: https://lore.kernel.org/r/20201207093937.2874932-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: stmmac: dwmac-meson8b: fix mask definition of the m250_sel mux
Martin Blumenstingl [Sat, 5 Dec 2020 21:32:07 +0000 (22:32 +0100)]
net: stmmac: dwmac-meson8b: fix mask definition of the m250_sel mux

The m250_sel mux clock uses bit 4 in the PRG_ETH0 register. Fix this by
shifting the PRG_ETH0_CLK_M250_SEL_MASK accordingly as the "mask" in
struct clk_mux expects the mask relative to the "shift" field in the
same struct.

While here, get rid of the PRG_ETH0_CLK_M250_SEL_SHIFT macro and use
__ffs() to determine it from the existing PRG_ETH0_CLK_M250_SEL_MASK
macro.

Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20201205213207.519341-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-mac: Add a missing of_node_put after of_device_is_available
Christophe JAILLET [Sun, 6 Dec 2020 15:13:39 +0000 (16:13 +0100)]
dpaa2-mac: Add a missing of_node_put after of_device_is_available

Add an 'of_node_put()' call when a tested device node is not available.

Fixes: 94ae899b2096 ("dpaa2-mac: add PCS support through the Lynx module")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20201206151339.44306-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomptcp: print new line in mptcp_seq_show() if mptcp isn't in use
Jianguo Wu [Sat, 5 Dec 2020 07:56:33 +0000 (15:56 +0800)]
mptcp: print new line in mptcp_seq_show() if mptcp isn't in use

When do cat /proc/net/netstat, the output isn't append with a new line, it looks like this:
[root@localhost ~]# cat /proc/net/netstat
...
MPTcpExt: 0 0 0 0 0 0 0 0 0 0 0 0 0[root@localhost ~]#

This is because in mptcp_seq_show(), if mptcp isn't in use, net->mib.mptcp_statistics is NULL,
so it just puts all 0 after "MPTcpExt:", and return, forgot the '\n'.

After this patch:

[root@localhost ~]# cat /proc/net/netstat
...
MPTcpExt: 0 0 0 0 0 0 0 0 0 0 0 0 0
[root@localhost ~]#

Fixes: fc518953bc9c8d7d ("mptcp: add and use MIB counter infrastructure")
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Acked-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/142e2fd9-58d9-bb13-fb75-951cccc2331e@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobridge: Fix a deadlock when enabling multicast snooping
Joseph Huang [Fri, 4 Dec 2020 23:56:28 +0000 (18:56 -0500)]
bridge: Fix a deadlock when enabling multicast snooping

When enabling multicast snooping, bridge module deadlocks on multicast_lock
if 1) IPv6 is enabled, and 2) there is an existing querier on the same L2
network.

The deadlock was caused by the following sequence: While holding the lock,
br_multicast_open calls br_multicast_join_snoopers, which eventually causes
IP stack to (attempt to) send out a Listener Report (in igmp6_join_group).
Since the destination Ethernet address is a multicast address, br_dev_xmit
feeds the packet back to the bridge via br_multicast_rcv, which in turn
calls br_multicast_add_group, which then deadlocks on multicast_lock.

The fix is to move the call br_multicast_join_snoopers outside of the
critical section. This works since br_multicast_join_snoopers only deals
with IP and does not modify any multicast data structures of the bridge,
so there's no need to hold the lock.

Steps to reproduce:
1. sysctl net.ipv6.conf.all.force_mld_version=1
2. have another querier
3. ip link set dev bridge type bridge mcast_snooping 0 && \
   ip link set dev bridge type bridge mcast_snooping 1 < deadlock >

A typical call trace looks like the following:

[  936.251495]  _raw_spin_lock+0x5c/0x68
[  936.255221]  br_multicast_add_group+0x40/0x170 [bridge]
[  936.260491]  br_multicast_rcv+0x7ac/0xe30 [bridge]
[  936.265322]  br_dev_xmit+0x140/0x368 [bridge]
[  936.269689]  dev_hard_start_xmit+0x94/0x158
[  936.273876]  __dev_queue_xmit+0x5ac/0x7f8
[  936.277890]  dev_queue_xmit+0x10/0x18
[  936.281563]  neigh_resolve_output+0xec/0x198
[  936.285845]  ip6_finish_output2+0x240/0x710
[  936.290039]  __ip6_finish_output+0x130/0x170
[  936.294318]  ip6_output+0x6c/0x1c8
[  936.297731]  NF_HOOK.constprop.0+0xd8/0xe8
[  936.301834]  igmp6_send+0x358/0x558
[  936.305326]  igmp6_join_group.part.0+0x30/0xf0
[  936.309774]  igmp6_group_added+0xfc/0x110
[  936.313787]  __ipv6_dev_mc_inc+0x1a4/0x290
[  936.317885]  ipv6_dev_mc_inc+0x10/0x18
[  936.321677]  br_multicast_open+0xbc/0x110 [bridge]
[  936.326506]  br_multicast_toggle+0xec/0x140 [bridge]

Fixes: 4effd28c1245 ("bridge: join all-snoopers multicast address")
Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20201204235628.50653-1-Joseph.Huang@garmin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoenetc: Fix reporting of h/w packet counters
Claudiu Manoil [Fri, 4 Dec 2020 17:15:05 +0000 (19:15 +0200)]
enetc: Fix reporting of h/w packet counters

Noticed some inconsistencies in packet statistics reporting.
This patch adds the missing Tx packet counter registers to
ethtool reporting and fixes the information strings for a
few of them.

Fixes: 16eb4c85c964 ("enetc: Add ethtool statistics")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20201204171505.21389-1-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agopowerpc/mm: Fix KUAP warning by providing copy_from_kernel_nofault_allowed()
Christophe Leroy [Mon, 7 Dec 2020 16:58:01 +0000 (16:58 +0000)]
powerpc/mm: Fix KUAP warning by providing copy_from_kernel_nofault_allowed()

Since commit c33165253492 ("powerpc: use non-set_fs based maccess
routines"), userspace access is not granted anymore when using
copy_from_kernel_nofault()

However, kthread_probe_data() uses copy_from_kernel_nofault()
to check validity of pointers. When the pointer is NULL,
it points to userspace, leading to a KUAP fault and triggering
the following big hammer warning many times when you request
a sysrq "show task":

[ 1117.202054] ------------[ cut here ]------------
[ 1117.202102] Bug: fault blocked by AP register !
[ 1117.202261] WARNING: CPU: 0 PID: 377 at arch/powerpc/include/asm/nohash/32/kup-8xx.h:66 do_page_fault+0x4a8/0x5ec
[ 1117.202310] Modules linked in:
[ 1117.202428] CPU: 0 PID: 377 Comm: sh Tainted: G        W         5.10.0-rc5-01340-g83f53be2de31-dirty #4175
[ 1117.202499] NIP:  c0012048 LR: c0012048 CTR: 00000000
[ 1117.202573] REGS: cacdbb88 TRAP: 0700   Tainted: G        W          (5.10.0-rc5-01340-g83f53be2de31-dirty)
[ 1117.202625] MSR:  00021032 <ME,IR,DR,RI>  CR: 24082222  XER: 20000000
[ 1117.202899]
[ 1117.202899] GPR00: c0012048 cacdbc40 c2929290 00000023 c092e554 00000001 c09865e8 c092e640
[ 1117.202899] GPR08: 00001032 00000000 00000000 00014efc 28082224 100d166a 100a0920 00000000
[ 1117.202899] GPR16: 100cac0c 100b0000 1080c3fc 1080d685 100d0000 100d0000 00000000 100a0900
[ 1117.202899] GPR24: 100d0000 c07892ec 00000000 c0921510 c21f4440 0000005c c0000000 cacdbc80
[ 1117.204362] NIP [c0012048] do_page_fault+0x4a8/0x5ec
[ 1117.204461] LR [c0012048] do_page_fault+0x4a8/0x5ec
[ 1117.204509] Call Trace:
[ 1117.204609] [cacdbc40] [c0012048] do_page_fault+0x4a8/0x5ec (unreliable)
[ 1117.204771] [cacdbc70] [c00112f0] handle_page_fault+0x8/0x34
[ 1117.204911] --- interrupt: 301 at copy_from_kernel_nofault+0x70/0x1c0
[ 1117.204979] NIP:  c010dbec LR: c010dbac CTR: 00000001
[ 1117.205053] REGS: cacdbc80 TRAP: 0301   Tainted: G        W          (5.10.0-rc5-01340-g83f53be2de31-dirty)
[ 1117.205104] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 28082224  XER: 00000000
[ 1117.205416] DAR: 0000005c DSISR: c0000000
[ 1117.205416] GPR00: c0045948 cacdbd38 c2929290 00000001 00000017 00000017 00000027 0000000f
[ 1117.205416] GPR08: c09926ec 00000000 00000000 3ffff000 24082224
[ 1117.206106] NIP [c010dbec] copy_from_kernel_nofault+0x70/0x1c0
[ 1117.206202] LR [c010dbac] copy_from_kernel_nofault+0x30/0x1c0
[ 1117.206258] --- interrupt: 301
[ 1117.206372] [cacdbd38] [c004bbb0] kthread_probe_data+0x44/0x70 (unreliable)
[ 1117.206561] [cacdbd58] [c0045948] print_worker_info+0xe0/0x194
[ 1117.206717] [cacdbdb8] [c00548ac] sched_show_task+0x134/0x168
[ 1117.206851] [cacdbdd8] [c005a268] show_state_filter+0x70/0x100
[ 1117.206989] [cacdbe08] [c039baa0] sysrq_handle_showstate+0x14/0x24
[ 1117.207122] [cacdbe18] [c039bf18] __handle_sysrq+0xac/0x1d0
[ 1117.207257] [cacdbe48] [c039c0c0] write_sysrq_trigger+0x4c/0x74
[ 1117.207407] [cacdbe68] [c01fba48] proc_reg_write+0xb4/0x114
[ 1117.207550] [cacdbe88] [c0179968] vfs_write+0x12c/0x478
[ 1117.207686] [cacdbf08] [c0179e60] ksys_write+0x78/0x128
[ 1117.207826] [cacdbf38] [c00110d0] ret_from_syscall+0x0/0x34
[ 1117.207938] --- interrupt: c01 at 0xfd4e784
[ 1117.208008] NIP:  0fd4e784 LR: 0fe0f244 CTR: 10048d38
[ 1117.208083] REGS: cacdbf48 TRAP: 0c01   Tainted: G        W          (5.10.0-rc5-01340-g83f53be2de31-dirty)
[ 1117.208134] MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 44002222  XER: 00000000
[ 1117.208470]
[ 1117.208470] GPR00: 00000004 7fc34090 77bfb4e0 00000001 1080fa40 00000002 7400000f fefefeff
[ 1117.208470] GPR08: 7f7f7f7f 10048d38 1080c414 7fc343c0 00000000
[ 1117.209104] NIP [0fd4e784] 0xfd4e784
[ 1117.209180] LR [0fe0f244] 0xfe0f244
[ 1117.209236] --- interrupt: c01
[ 1117.209274] Instruction dump:
[ 1117.209353] 714a4000 418200f0 73ca0001 40820084 73ca0032 408200f8 73c90040 4082ff60
[ 1117.209727] 0fe00000 3c60c082 386399f4 48013b65 <0fe0000080010034 3860000b 7c0803a6
[ 1117.210102] ---[ end trace 1927c0323393af3e ]---

To avoid that, copy_from_kernel_nofault_allowed() is used to check
whether the address is a valid kernel address. But the default
version of it returns true for any address.

Provide a powerpc version of copy_from_kernel_nofault_allowed()
that returns false when the address is below TASK_USER_MAX,
so that copy_from_kernel_nofault() will return -ERANGE.

Fixes: c33165253492 ("powerpc: use non-set_fs based maccess routines")
Reported-by: Qian Cai <qcai@redhat.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/18bcb456d32a3e74f5ae241fd6f1580c092d07f5.1607360230.git.christophe.leroy@csgroup.eu
3 years agoclk: renesas: r9a06g032: Drop __packed for portability
Geert Uytterhoeven [Mon, 30 Nov 2020 08:57:43 +0000 (09:57 +0100)]
clk: renesas: r9a06g032: Drop __packed for portability

The R9A06G032 clock driver uses an array of packed structures to reduce
kernel size.  However, this array contains pointers, which are no longer
aligned naturally, and cannot be relocated on PPC64.  Hence when
compile-testing this driver on PPC64 with CONFIG_RELOCATABLE=y (e.g.
PowerPC allyesconfig), the following warnings are produced:

    WARNING: 136 bad relocations
    c000000000616be3 R_PPC64_UADDR64   .rodata+0x00000000000cf338
    c000000000616bfe R_PPC64_UADDR64   .rodata+0x00000000000cf370
    ...

Fix this by dropping the __packed attribute from the r9a06g032_clkdesc
definition, trading a small size increase for portability.

This increases the 156-entry clock table by 1 byte per entry, but due to
the compiler generating more efficient code for unpacked accesses, the
net size increase is only 76 bytes (gcc 9.3.0 on arm32).

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 4c3d88526eba2143 ("clk: renesas: Renesas R9A06G032 clock driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20201130085743.1656317-1-geert+renesas@glider.be
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # PowerPC allyesconfig build
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
3 years agoclk: imx: scu: fix MXC_CLK_SCU module build break
Dong Aisheng [Mon, 30 Nov 2020 08:46:24 +0000 (16:46 +0800)]
clk: imx: scu: fix MXC_CLK_SCU module build break

This issue can be reproduced by having a kernel config with
CONFIG_IMX_MBOX=m and CONFIG_MXC_CLK_SCU=m.  It's caused by the Makefile
wanting to build clk-scu.o and clk-imx8qxp.o as different targets but
that doesn't work (e.g. MXC_CLK_SCU = y while CLK_IMX8QXP = n)

"obj-$(CONFIG_MXC_CLK_SCU) += clk-imx-scu.o clk-imx-lpcg-scu.o
clk-imx-scu-$(CONFIG_CLK_IMX8QXP) += clk-scu.o clk-imx8qxp.o"

Having MXC_CLK_SCU=y/m while CLK_IMX8QXP=n will cause a linker problem
like below:

  LD [M]  drivers/clk/imx/clk-imx-scu.o
  arm-poky-linux-gnueabi-ld: no input files

Make MXC_CLK_SCU be un-selectable by users so it can only be selected by
the CLK_IMX8QXP option, ensuring the two symbols are built together.
Drop COMPILE_TEST too because this option isn't selectable anymore. We
can remove it from MXC_CLK_SCU because CLK_IMX8QXP selects MXC_CLK_SCU
which already has COMPILE_TEST.

Fixes: e0d0d4d86c766 ("clk: imx8qxp: Support building i.MX8QXP clock driver as module")
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/20201130084624.21113-1-aisheng.dong@nxp.com
[sboyd@kernel.org: Rework commit text]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
3 years agoRDMA/core: Fix empty gid table for non IB/RoCE devices
Gal Pressman [Sun, 6 Dec 2020 15:32:38 +0000 (17:32 +0200)]
RDMA/core: Fix empty gid table for non IB/RoCE devices

The query_gid_table ioctl skips non IB/RoCE ports, which as a result
returns an empty gid table for devices such as EFA which have a GID table,
but are not IB/RoCE.

Fixes: c4b4d548fabc ("RDMA/core: Introduce new GID table query API")
Link: https://lore.kernel.org/r/20201206153238.34878-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
3 years agolwt_bpf: Replace preempt_disable() with migrate_disable()
Cong Wang [Sat, 5 Dec 2020 07:59:46 +0000 (23:59 -0800)]
lwt_bpf: Replace preempt_disable() with migrate_disable()

migrate_disable() is just a wrapper for preempt_disable() in
non-RT kernel. It is safe to replace it, and RT kernel will
benefit.

Note that it is introduced since Feb 2020.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201205075946.497763-2-xiyou.wangcong@gmail.com
3 years agolwt: Disable BH too in run_lwt_bpf()
Dongdong Wang [Sat, 5 Dec 2020 07:59:45 +0000 (23:59 -0800)]
lwt: Disable BH too in run_lwt_bpf()

The per-cpu bpf_redirect_info is shared among all skb_do_redirect()
and BPF redirect helpers. Callers on RX path are all in BH context,
disabling preemption is not sufficient to prevent BH interruption.

In production, we observed strange packet drops because of the race
condition between LWT xmit and TC ingress, and we verified this issue
is fixed after we disable BH.

Although this bug was technically introduced from the beginning, that
is commit 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure"),
at that time call_rcu() had to be call_rcu_bh() to match the RCU context.
So this patch may not work well before RCU flavor consolidation has been
completed around v5.0.

Update the comments above the code too, as call_rcu() is now BH friendly.

Signed-off-by: Dongdong Wang <wangdongdong.6@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/bpf/20201205075946.497763-1-xiyou.wangcong@gmail.com
3 years agoMerge tag 'trace-v5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Mon, 7 Dec 2020 19:20:26 +0000 (11:20 -0800)]
Merge tag 'trace-v5.10-rc7' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix userstacktrace option for instances

  While writing an application that requires user stack trace option to
  work in instances, I found that the instance option has a bug that
  makes it a nop. The check for performing the user stack trace in an
  instance, checks the top level options (not the instance options) to
  determine if a user stack trace should be performed or not.

  This is not only incorrect, but also confusing for users. It confused
  me for a bit!"

* tag 'trace-v5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix userstacktrace option for instances

3 years agoMAINTAINERS: add a limited ARM and ARM64 SoC entry
Krzysztof Kozlowski [Tue, 1 Dec 2020 21:15:16 +0000 (23:15 +0200)]
MAINTAINERS: add a limited ARM and ARM64 SoC entry

It is expected for ARM and ARM64 SoC related code to go through
sub-architecture maintainers.  Their addresses were therefore not
documented to push patch traffic through sub-architecture maintainers.

However when patches touch generic code, e.g. multi_v7_defconfig, the
patch might not be picked up by them and instead should go to the SoC
maintainers - Arnd and Olof.

Add a minimal maintainer's entry for SoC covering only Makefile, so it
will not appear on most of submissions (except new devicetree boards).
It will though serve as a documentation and reference for cases when
submitter does not know where to send his SoC-related patches.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201201211516.24921-2-krzk@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agoMAINTAINERS: correct SoC Git address (formerly: arm-soc)
Krzysztof Kozlowski [Tue, 1 Dec 2020 21:15:15 +0000 (23:15 +0200)]
MAINTAINERS: correct SoC Git address (formerly: arm-soc)

The SoC Git was moved from arm/arm-soc.git to soc/soc.git.  Correct the
ARM Sub-architectures entry.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201201211516.24921-1-krzk@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agoARM: keystone: remove SECTION_SIZE_BITS/MAX_PHYSMEM_BITS
Arnd Bergmann [Thu, 3 Dec 2020 23:18:40 +0000 (00:18 +0100)]
ARM: keystone: remove SECTION_SIZE_BITS/MAX_PHYSMEM_BITS

These definitions are evidently left over from the days when
sparsemem settings were platform specific. This was no longer
the case when the platform got merged.

There was no warning in the past, but now the asm/sparsemem.h
header ends up being included indirectly, causing this warning:

In file included from /git/arm-soc/arch/arm/mach-keystone/keystone.c:24:
arch/arm/mach-keystone/memory.h:10:9: warning: 'SECTION_SIZE_BITS' macro redefined [-Wmacro-redefined]
 #define SECTION_SIZE_BITS       34
        ^
arch/arm/include/asm/sparsemem.h:23:9: note: previous definition is here
 #define SECTION_SIZE_BITS       28
        ^

Clearly the definitions never had any effect here, so remove them.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20201203231847.1484900-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agoMerge tag 'imx-fixes-5.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/shawngu...
Arnd Bergmann [Mon, 7 Dec 2020 14:28:59 +0000 (15:28 +0100)]
Merge tag 'imx-fixes-5.10-5' of git://git./linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 5.10, round 5:

- Fix a regression on SoC revision detection with ANATOP, that is
  introduced by commit  4a4fb66119eb ("ARM: imx: Add missing
  of_node_put()").
- Drop PAD_GPIO_6 from imx6qdl-wandboard ENET pin group, as the pin is
  used by camera sensor now.
- Fix I2C3_SCL pinmux on imx6qdl-kontron-samx6i board, so that SoM EEPROM
  can be accessed.

* tag 'imx-fixes-5.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6qdl-kontron-samx6i: fix I2C_PM scl pin
  ARM: dts: imx6qdl-wandboard-revd1: Remove PAD_GPIO_6 from enetgrp
  ARM: imx: Use correct SRC base address

Link: https://lore.kernel.org/r/20201201091820.GW4072@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agoMerge tag 'sunxi-fixes-for-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Arnd Bergmann [Mon, 7 Dec 2020 14:28:27 +0000 (15:28 +0100)]
Merge tag 'sunxi-fixes-for-5.10-2' of git://git./linux/kernel/git/sunxi/linux into arm/fixes

A few more RGMII-ID fixes, and a bunch of other more random fixes

* tag 'sunxi-fixes-for-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  ARM: dts: sun7i: pcduino3-nano: enable RGMII RX/TX delay on PHY
  ARM: dts: sun8i: v3s: fix GIC node memory range
  ARM: dts: sun8i: v40: bananapi-m2-berry: Fix ethernet node
  ARM: dts: sun8i: r40: bananapi-m2-berry: Fix dcdc1 regulator
  ARM: dts: sun7i: bananapi: Enable RGMII RX/TX delay on Ethernet PHY
  ARM: dts: s3: pinecube: align compatible property to other S3 boards
  ARM: sunxi: Add machine match for the Allwinner V3 SoC
  arm64: dts: allwinner: h6: orangepi-one-plus: Fix ethernet

Link: https://lore.kernel.org/r/1280f1de-1b6d-4cc2-8448-e5a9096a41e8.lettre@localhost
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agoiommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
Suravee Suthikulpanit [Mon, 7 Dec 2020 09:19:20 +0000 (03:19 -0600)]
iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs

According to the AMD IOMMU spec, the commit 73db2fc595f3
("iommu/amd: Increase interrupt remapping table limit to 512 entries")
also requires the interrupt table length (IntTabLen) to be set to 9
(power of 2) in the device table mapping entry (DTE).

Fixes: 73db2fc595f3 ("iommu/amd: Increase interrupt remapping table limit to 512 entries")
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20201207091920.3052-1-suravee.suthikulpanit@amd.com
Signed-off-by: Will Deacon <will@kernel.org>
3 years agoudp: fix the proto value passed to ip_protocol_deliver_rcu for the segments
Xin Long [Mon, 7 Dec 2020 07:55:40 +0000 (15:55 +0800)]
udp: fix the proto value passed to ip_protocol_deliver_rcu for the segments

Guillaume noticed that: for segments udp_queue_rcv_one_skb() returns the
proto, and it should pass "ret" unmodified to ip_protocol_deliver_rcu().
Otherwize, with a negtive value passed, it will underflow inet_protos.

This can be reproduced with IPIP FOU:

  # ip fou add port 5555 ipproto 4
  # ethtool -K eth1 rx-gro-list on

Fixes: cf329aa42b66 ("udp: cope with UDP GRO packet misdirection")
Reported-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: remove a misused pragma packed
Huazhong Tan [Mon, 7 Dec 2020 07:20:25 +0000 (15:20 +0800)]
net: hns3: remove a misused pragma packed

hclge_dbg_reg_info[] is defined as an array of packed structure
accidentally. However, this array contains pointers, which are
no longer aligned naturally, and cannot be relocated on PPC64.
Hence, when compile-testing this driver on PPC64 with
CONFIG_RELOCATABLE=y (e.g. PowerPC allyesconfig), there will be
some warnings.

Since each field in structure hclge_qos_pri_map_cmd and
hclge_dbg_bitmap_cmd is type u8, the pragma packed is unnecessary
for these two structures as well, so remove the pragma packed in
hclge_debugfs.h to fix this issue, and this increases
hclge_dbg_reg_info[] by 4 bytes per entry.

Fixes: a582b78dfc33 ("net: hns3: code optimization for debugfs related to "dump reg"")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoLinux 5.10-rc7
Linus Torvalds [Sun, 6 Dec 2020 22:25:12 +0000 (14:25 -0800)]
Linux 5.10-rc7

3 years agoMerge tag 'char-misc-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Sun, 6 Dec 2020 19:48:17 +0000 (11:48 -0800)]
Merge tag 'char-misc-5.10-rc7' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes, and one "large" revert, for
  5.10-rc7.

  They include:

   - revert mei patch from 5.10-rc1 that was using a reserved userspace
     value. It will be resubmitted once the proper id has been assigned
     by the virtio people.

   - habanalabs fixes found by the fall-through audit from Gustavo

   - speakup driver fixes for reported issues

   - fpga config build fix for reported issue.

  All of these except the revert have been in linux-next with no
  reported issues. The revert is "clean" and just removes a
  previously-added driver, so no real issue there"

* tag 'char-misc-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Revert "mei: virtio: virtualization frontend driver"
  fpga: Specify HAS_IOMEM dependency for FPGA_DFL
  habanalabs: put devices before driver removal
  habanalabs: free host huge va_range if not used
  speakup: Reject setting the speakup line discipline outside of speakup

3 years agoMerge tag 'tty-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 6 Dec 2020 19:43:50 +0000 (11:43 -0800)]
Merge tag 'tty-5.10-rc7' of git://git./linux/kernel/git/gregkh/tty

Pull tty fixes from Greg KH:
 "Here are two tty core fixes for 5.10-rc7.

  They resolve some reported locking issues in the tty core. While they
  have not been in a released linux-next yet, they have passed all of
  the 0-day bot testing as well as the submitter's testing"

* tag 'tty-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: Fix ->session locking
  tty: Fix ->pgrp locking in tiocspgrp()

3 years agoMerge tag 'usb-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 6 Dec 2020 19:38:36 +0000 (11:38 -0800)]
Merge tag 'usb-5.10-rc7' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 5.10-rc7 that resolve a number of
  reported issues, and add some new device ids.

  Nothing major here, but these solve some problems that people were
  having with the 5.10-rc tree:

   - reverts for USB storage dma settings that broke working devices

   - thunderbolt use-after-free fix

   - cdns3 driver fixes

   - gadget driver userspace copy fix

   - new device ids

  All of these except for the reverts have been in linux-next with no
  reported issues. The reverts are "clean" and were tested by Hans, as
  well as passing the 0-day tests"

* tag 'usb-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: f_fs: Use local copy of descriptors for userspace copy
  usb: ohci-omap: Fix descriptor conversion
  Revert "usb-storage: fix sdev->host->dma_dev"
  Revert "uas: fix sdev->host->dma_dev"
  Revert "uas: bump hw_max_sectors to 2048 blocks for SS or faster drives"
  USB: serial: kl5kusb105: fix memleak on open
  USB: serial: ch341: sort device-id entries
  USB: serial: ch341: add new Product ID for CH341A
  USB: serial: option: fix Quectel BG96 matching
  usb: cdns3: core: fix goto label for error path
  usb: cdns3: gadget: clear trb->length as zero after preparing every trb
  usb: cdns3: Fix hardware based role switch
  USB: serial: option: add support for Thales Cinterion EXS82
  USB: serial: option: add Fibocom NL668 variants
  thunderbolt: Fix use-after-free in remove_unplugged_switch()

3 years agoMerge tag 'x86-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Dec 2020 19:22:39 +0000 (11:22 -0800)]
Merge tag 'x86-urgent-2020-12-06' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for x86:

   - Make the AMD L3 QoS code and data priorization enable/disable
     mechanism work correctly.

     The control bit was only set/cleared on one of the CPUs in a L3
     domain, but it has to be modified on all CPUs in the domain. The
     initial documentation was not clear about this, but the updated one
     from Oct 2020 spells it out.

   - Fix an off by one in the UV platform detection code which causes
     the UV hubs to be identified wrongly.

     The chip revisions start at 1 not at 0.

   - Fix a long standing bug in the evaluation of prefixes in the
     uprobes code which fails to handle repeated prefixes properly.

     The aggregate size of the prefixes can be larger than the bytes
     array but the code blindly iterated over the aggregate size beyond
     the array boundary. Add a macro to handle this case properly and
     use it at the affected places"

* tag 'x86-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sev-es: Use new for_each_insn_prefix() macro to loop over prefixes bytes
  x86/insn-eval: Use new for_each_insn_prefix() macro to loop over prefixes bytes
  x86/uprobes: Do not use prefixes.nbytes when looping over prefixes.bytes
  x86/platform/uv: Fix UV4 hub revision adjustment
  x86/resctrl: Fix AMD L3 QOS CDP enable/disable

3 years agoMerge tag 'perf-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Dec 2020 19:20:18 +0000 (11:20 -0800)]
Merge tag 'perf-urgent-2020-12-06' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "Two fixes for performance monitoring on X86:

   - Add recursion protection to another callchain invoked from
     x86_pmu_stop() which can recurse back into x86_pmu_stop(). The
     first attempt to fix this missed this extra code path.

   - Use the already filtered status variable to check for PEBS counter
     overflow bits and not the unfiltered full status read from
     IA32_PERF_GLOBAL_STATUS which can have unrelated bits check which
     would be evaluated incorrectly"

* tag 'perf-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Check PEBS status correctly
  perf/x86/intel: Fix a warning on x86_pmu_stop() with large PEBS

3 years agoMerge tag 'irq-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Dec 2020 19:15:55 +0000 (11:15 -0800)]
Merge tag 'irq-urgent-2020-12-06' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of updates for the interrupt subsystem:

   - Make multiqueue devices which use the managed interrupt affinity
     infrastructure work on PowerPC/Pseries. PowerPC does not use the
     generic infrastructure for setting up PCI/MSI interrupts and the
     multiqueue changes failed to update the legacy PCI/MSI
     infrastructure. Make this work by passing the affinity setup
     information down to the mapping and allocation functions.

   - Move Jason Cooper from MAINTAINERS to CREDITS as his mail is
     bouncing and he's not reachable. We hope all is well with him and
     say thanks for his work over the years"

* tag 'irq-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  powerpc/pseries: Pass MSI affinity to irq_create_mapping()
  genirq/irqdomain: Add an irq_create_mapping_affinity() function
  MAINTAINERS: Move Jason Cooper to CREDITS

3 years agoMerge tag 'locking-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 6 Dec 2020 19:11:32 +0000 (11:11 -0800)]
Merge tag 'locking-urgent-2020-12-06' of git://git./linux/kernel/git/tip/tip

Pull intel_idle build fix from Thomas Gleixner:
 "A tiny build fix for a recent change in the intel_idle driver which
  missed a CONFIG dependency and broke the build for certain
  configurations"

* tag 'locking-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  intel_idle: Build fix

3 years agoMerge tag 'kbuild-fixes-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Dec 2020 18:31:39 +0000 (10:31 -0800)]
Merge tag 'kbuild-fixes-v5.10-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Move -Wcast-align to W=3, which tends to be false-positive and there
   is no tree-wide solution.

 - Pass -fmacro-prefix-map to KBUILD_CPPFLAGS because it is a
   preprocessor option and makes sense for .S files as well.

 - Disable -gdwarf-2 for Clang's integrated assembler to avoid warnings.

 - Disable --orphan-handling=warn for LLD 10.0.1 to avoid warnings.

 - Fix undesirable line breaks in *.mod files.

* tag 'kbuild-fixes-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: avoid split lines in .mod files
  kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1
  kbuild: Hoist '--orphan-handling' into Kconfig
  Kbuild: do not emit debug info for assembly with LLVM_IAS=1
  kbuild: use -fmacro-prefix-map for .S sources
  Makefile.extrawarn: move -Wcast-align to W=3

3 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sun, 6 Dec 2020 18:20:59 +0000 (10:20 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "12 patches.

  Subsystems affected by this patch series: mm (memcg, zsmalloc, swap,
  mailmap, selftests, pagecache, hugetlb, pagemap), lib, and coredump"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/mmap.c: fix mmap return value when vma is merged after call_mmap()
  hugetlb_cgroup: fix offline of hugetlb cgroup with reservations
  mm/filemap: add static for function __add_to_page_cache_locked
  userfaultfd: selftests: fix SIGSEGV if huge mmap fails
  tools/testing/selftests/vm: fix build error
  mailmap: add two more addresses of Uwe Kleine-König
  mm/swapfile: do not sleep with a spin lock held
  mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING
  mm: list_lru: set shrinker map bit when child nr_items is not zero
  mm: memcg/slab: fix obj_cgroup_charge() return value handling
  coredump: fix core_pattern parse error
  zlib: export S390 symbols for zlib modules

3 years agomm/mmap.c: fix mmap return value when vma is merged after call_mmap()
Liu Zixian [Sun, 6 Dec 2020 06:15:15 +0000 (22:15 -0800)]
mm/mmap.c: fix mmap return value when vma is merged after call_mmap()

On success, mmap should return the begin address of newly mapped area,
but patch "mm: mmap: merge vma after call_mmap() if possible" set
vm_start of newly merged vma to return value addr.  Users of mmap will
get wrong address if vma is merged after call_mmap().  We fix this by
moving the assignment to addr before merging vma.

We have a driver which changes vm_flags, and this bug is found by our
testcases.

Fixes: d70cec898324 ("mm: mmap: merge vma after call_mmap() if possible")
Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Hongxiang Lou <louhongxiang@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Link: https://lkml.kernel.org/r/20201203085350.22624-1-liuzixian4@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agohugetlb_cgroup: fix offline of hugetlb cgroup with reservations
Mike Kravetz [Sun, 6 Dec 2020 06:15:12 +0000 (22:15 -0800)]
hugetlb_cgroup: fix offline of hugetlb cgroup with reservations

Adrian Moreno was ruuning a kubernetes 1.19 + containerd/docker workload
using hugetlbfs.  In this environment the issue is reproduced by:

 - Start a simple pod that uses the recently added HugePages medium
   feature (pod yaml attached)

 - Start a DPDK app. It doesn't need to run successfully (as in transfer
   packets) nor interact with real hardware. It seems just initializing
   the EAL layer (which handles hugepage reservation and locking) is
   enough to trigger the issue

 - Delete the Pod (or let it "Complete").

This would result in a kworker thread going into a tight loop (top output):

   1425 root      20   0       0      0      0 R  99.7   0.0   5:22.45 kworker/28:7+cgroup_destroy

'perf top -g' reports:

  -   63.28%     0.01%  [kernel]                    [k] worker_thread
     - 49.97% worker_thread
        - 52.64% process_one_work
           - 62.08% css_killed_work_fn
              - hugetlb_cgroup_css_offline
                   41.52% _raw_spin_lock
                 - 2.82% _cond_resched
                      rcu_all_qs
                   2.66% PageHuge
        - 0.57% schedule
           - 0.57% __schedule

We are spinning in the do-while loop in hugetlb_cgroup_css_offline.
Worse yet, we are holding the master cgroup lock (cgroup_mutex) while
infinitely spinning.  Little else can be done on the system as the
cgroup_mutex can not be acquired.

Do note that the issue can be reproduced by simply offlining a hugetlb
cgroup containing pages with reservation counts.

The loop in hugetlb_cgroup_css_offline is moving page counts from the
cgroup being offlined to the parent cgroup.  This is done for each
hstate, and is repeated until hugetlb_cgroup_have_usage returns false.
The routine moving counts (hugetlb_cgroup_move_parent) is only moving
'usage' counts.  The routine hugetlb_cgroup_have_usage is checking for
both 'usage' and 'reservation' counts.  Discussion about what to do with
reservation counts when reparenting was discussed here:

https://lore.kernel.org/linux-kselftest/CAHS8izMFAYTgxym-Hzb_JmkTK1N_S9tGN71uS6MFV+R7swYu5A@mail.gmail.com/

The decision was made to leave a zombie cgroup for with reservation
counts.  Unfortunately, the code checking reservation counts was
incorrectly added to hugetlb_cgroup_have_usage.

To fix the issue, simply remove the check for reservation counts.  While
fixing this issue, a related bug in hugetlb_cgroup_css_offline was
noticed.  The hstate index is not reinitialized each time through the
do-while loop.  Fix this as well.

Fixes: 1adc4d419aa2 ("hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations")
Reported-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201203220242.158165-1-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/filemap: add static for function __add_to_page_cache_locked
Alex Shi [Sun, 6 Dec 2020 06:15:09 +0000 (22:15 -0800)]
mm/filemap: add static for function __add_to_page_cache_locked

  mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes]

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Link: https://lkml.kernel.org/r/1604661895-5495-1-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agouserfaultfd: selftests: fix SIGSEGV if huge mmap fails
Axel Rasmussen [Sun, 6 Dec 2020 06:15:05 +0000 (22:15 -0800)]
userfaultfd: selftests: fix SIGSEGV if huge mmap fails

The error handling in hugetlb_allocate_area() was incorrect for the
hugetlb_shared test case.

Previously the behavior was:

- mmap a hugetlb area
  - If this fails, set the pointer to NULL, and carry on
- mmap an alias of the same hugetlb fd
  - If this fails, munmap the original area

If the original mmap failed, it's likely the second one did too.  If
both failed, we'd blindly try to munmap a NULL pointer, causing a
SIGSEGV.  Instead, "goto fail" so we return before trying to mmap the
alias.

This issue can be hit "in real life" by forgetting to set
/proc/sys/vm/nr_hugepages (leaving it at 0), and then trying to run the
hugetlb_shared test.

Another small improvement is, when the original mmap fails, don't just
print "it failed": perror(), so we can see *why*.  :)

Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Alan Gilbert <dgilbert@redhat.com>
Link: https://lkml.kernel.org/r/20201204203443.2714693-1-axelrasmussen@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agotools/testing/selftests/vm: fix build error
Xingxing Su [Sun, 6 Dec 2020 06:15:02 +0000 (22:15 -0800)]
tools/testing/selftests/vm: fix build error

Only x86 and PowerPC implement the pkey-xxx.h, and an error was reported
when compiling protection_keys.c.

Add a Arch judgment to compile "protection_keys" in the Makefile.

If other arch implement this, add the arch name to the Makefile.
eg:
    ifneq (,$(findstring $(ARCH),powerpc mips ... ))

Following build errors:

    pkey-helpers.h:93:2: error: #error Architecture not supported
     #error Architecture not supported
    pkey-helpers.h:96:20: error: `PKEY_DISABLE_ACCESS' undeclared
     #define PKEY_MASK (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)
                        ^
    protection_keys.c:218:45: error: `PKEY_DISABLE_WRITE' undeclared
     pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
                                                ^

Signed-off-by: Xingxing Su <suxingxing@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Link: https://lkml.kernel.org/r/1606826876-30656-1-git-send-email-suxingxing@loongson.cn
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomailmap: add two more addresses of Uwe Kleine-König
Uwe Kleine-König [Sun, 6 Dec 2020 06:14:58 +0000 (22:14 -0800)]
mailmap: add two more addresses of Uwe Kleine-König

This fixes attribution for the commits (among others)

 - d4097456cd1d ("video/framebuffer: move the probe func into
   .devinit.text in Blackfin LCD driver")

 - 0312e024d6cd ("mfd: mc13xxx: Add support for mc34708")

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20201127213358.3440830-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/swapfile: do not sleep with a spin lock held
Qian Cai [Sun, 6 Dec 2020 06:14:55 +0000 (22:14 -0800)]
mm/swapfile: do not sleep with a spin lock held

We can't call kvfree() with a spin lock held, so defer it.  Fixes a
might_sleep() runtime warning.

Fixes: 873d7bcfd066 ("mm/swapfile.c: use kvzalloc for swap_info_struct allocation")
Signed-off-by: Qian Cai <qcai@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201202151549.10350-1-qcai@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING
Minchan Kim [Sun, 6 Dec 2020 06:14:51 +0000 (22:14 -0800)]
mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING

While I was doing zram testing, I found sometimes decompression failed
since the compression buffer was corrupted.  With investigation, I found
below commit calls cond_resched unconditionally so it could make a
problem in atomic context if the task is reschedule.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:108
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 946, name: memhog
  3 locks held by memhog/946:
   #0: ffff9d01d4b193e8 (&mm->mmap_lock#2){++++}-{4:4}, at: __mm_populate+0x103/0x160
   #1: ffffffffa3d53de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0xa98/0x1160
   #2: ffff9d01d56b8110 (&zspage->lock){.+.+}-{3:3}, at: zs_map_object+0x8e/0x1f0
  CPU: 0 PID: 946 Comm: memhog Not tainted 5.9.3-00011-gc5bfc0287345-dirty #316
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
  Call Trace:
    unmap_kernel_range_noflush+0x2eb/0x350
    unmap_kernel_range+0x14/0x30
    zs_unmap_object+0xd5/0xe0
    zram_bvec_rw.isra.0+0x38c/0x8e0
    zram_rw_page+0x90/0x101
    bdev_write_page+0x92/0xe0
    __swap_writepage+0x94/0x4a0
    pageout+0xe3/0x3a0
    shrink_page_list+0xb94/0xd60
    shrink_inactive_list+0x158/0x460

We can fix this by removing the ZSMALLOC_PGTABLE_MAPPING feature (which
contains the offending calling code) from zsmalloc.

Even though this option showed some amount improvement(e.g., 30%) in
some arm32 platforms, it has been headache to maintain since it have
abused APIs[1](e.g., unmap_kernel_range in atomic context).

Since we are approaching to deprecate 32bit machines and already made
the config option available for only builtin build since v5.8, lastly it
has been not default option in zsmalloc, it's time to drop the option
for better maintenance.

[1] http://lore.kernel.org/linux-mm/20201105170249.387069-1-minchan@kernel.org

Fixes: e47110e90584 ("mm/vunmap: add cond_resched() in vunmap_pmd_range")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Harish Sriram <harish@linux.ibm.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201117202916.GA3856507@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: list_lru: set shrinker map bit when child nr_items is not zero
Yang Shi [Sun, 6 Dec 2020 06:14:48 +0000 (22:14 -0800)]
mm: list_lru: set shrinker map bit when child nr_items is not zero

When investigating a slab cache bloat problem, significant amount of
negative dentry cache was seen, but confusingly they neither got shrunk
by reclaimer (the host has very tight memory) nor be shrunk by dropping
cache.  The vmcore shows there are over 14M negative dentry objects on
lru, but tracing result shows they were even not scanned at all.

Further investigation shows the memcg's vfs shrinker_map bit is not set.
So the reclaimer or dropping cache just skip calling vfs shrinker.  So
we have to reboot the hosts to get the memory back.

I didn't manage to come up with a reproducer in test environment, and
the problem can't be reproduced after rebooting.  But it seems there is
race between shrinker map bit clear and reparenting by code inspection.
The hypothesis is elaborated as below.

The memcg hierarchy on our production environment looks like:

                root
               /    \
          system   user

The main workloads are running under user slice's children, and it
creates and removes memcg frequently.  So reparenting happens very often
under user slice, but no task is under user slice directly.

So with the frequent reparenting and tight memory pressure, the below
hypothetical race condition may happen:

       CPU A                            CPU B
reparent
    dst->nr_items == 0
                                 shrinker:
                                     total_objects == 0
    add src->nr_items to dst
    set_bit
                                     return SHRINK_EMPTY
                                     clear_bit
child memcg offline
    replace child's kmemcg_id with
    parent's (in memcg_offline_kmem())
                                  list_lru_del() between shrinker runs
                                     see parent's kmemcg_id
                                     dec dst->nr_items
reparent again
    dst->nr_items may go negative
    due to concurrent list_lru_del()

                                 The second run of shrinker:
                                     read nr_items without any
                                     synchronization, so it may
                                     see intermediate negative
                                     nr_items then total_objects
                                     may return 0 coincidently

                                     keep the bit cleared
    dst->nr_items != 0
    skip set_bit
    add scr->nr_item to dst

After this point dst->nr_item may never go zero, so reparenting will not
set shrinker_map bit anymore.  And since there is no task under user
slice directly, so no new object will be added to its lru to set the
shrinker map bit either.  That bit is kept cleared forever.

How does list_lru_del() race with reparenting? It is because reparenting
replaces children's kmemcg_id to parent's without protecting from
nlru->lock, so list_lru_del() may see parent's kmemcg_id but actually
deleting items from child's lru, but dec'ing parent's nr_items, so the
parent's nr_items may go negative as commit 2788cf0c401c ("memcg:
reparent list_lrus and free kmemcg_id on css offline") says.

Since it is impossible that dst->nr_items goes negative and
src->nr_items goes zero at the same time, so it seems we could set the
shrinker map bit iff src->nr_items != 0.  We could synchronize
list_lru_count_one() and reparenting with nlru->lock, but it seems
checking src->nr_items in reparenting is the simplest and avoids lock
contention.

Fixes: fae91d6d8be5 ("mm/list_lru.c: set bit in memcg shrinker bitmap on first list_lru item appearance")
Suggested-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org> [4.19]
Link: https://lkml.kernel.org/r/20201202171749.264354-1-shy828301@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: memcg/slab: fix obj_cgroup_charge() return value handling
Roman Gushchin [Sun, 6 Dec 2020 06:14:45 +0000 (22:14 -0800)]
mm: memcg/slab: fix obj_cgroup_charge() return value handling

Commit 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches
for all allocations") introduced a regression into the handling of the
obj_cgroup_charge() return value.  If a non-zero value is returned
(indicating of exceeding one of memory.max limits), the allocation
should fail, instead of falling back to non-accounted mode.

To make the code more readable, move memcg_slab_pre_alloc_hook() and
memcg_slab_post_alloc_hook() calling conditions into bodies of these
hooks.

Fixes: 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations")
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201127161828.GD840171@carbon.dhcp.thefacebook.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agocoredump: fix core_pattern parse error
Menglong Dong [Sun, 6 Dec 2020 06:14:42 +0000 (22:14 -0800)]
coredump: fix core_pattern parse error

'format_corename()' will splite 'core_pattern' on spaces when it is in
pipe mode, and take helper_argv[0] as the path to usermode executable.
It works fine in most cases.

However, if there is a space between '|' and '/file/path', such as
'| /usr/lib/systemd/systemd-coredump %P %u %g', then helper_argv[0] will
be parsed as '', and users will get a 'Core dump to | disabled'.

It is not friendly to users, as the pattern above was valid previously.
Fix this by ignoring the spaces between '|' and '/file/path'.

Fixes: 315c69261dd3 ("coredump: split pipe command whitespace before expanding template")
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Wise <pabs3@bonedaddy.net>
Cc: Jakub Wilk <jwilk@jwilk.net> [https://bugs.debian.org/924398]
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/5fb62870.1c69fb81.8ef5d.af76@mx.google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agozlib: export S390 symbols for zlib modules
Randy Dunlap [Sun, 6 Dec 2020 06:14:38 +0000 (22:14 -0800)]
zlib: export S390 symbols for zlib modules

Fix build errors when ZLIB_INFLATE=m and ZLIB_DEFLATE=m and ZLIB_DFLTCC=y
by exporting the 2 needed symbols in dfltcc_inflate.c.

Fixes these build errors:

  ERROR: modpost: "dfltcc_inflate" [lib/zlib_inflate/zlib_inflate.ko] undefined!
  ERROR: modpost: "dfltcc_can_inflate" [lib/zlib_inflate/zlib_inflate.ko] undefined!

Fixes: 126196100063 ("lib/zlib: add s390 hardware support for kernel zlib_inflate")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lkml.kernel.org/r/20201123191712.4882-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agokbuild: avoid split lines in .mod files
Masahiro Yamada [Thu, 3 Dec 2020 17:55:51 +0000 (02:55 +0900)]
kbuild: avoid split lines in .mod files

"xargs echo" is not a safe way to remove line breaks because the input
may exceed the command line limit and xargs may break it up into
multiple invocations of echo. This should never happen because
scripts/gen_autoksyms.sh expects all undefined symbols are placed in
the second line of .mod files.

One possible way is to replace "xargs echo" with
"sed ':x;N;$!bx;s/\n/ /g'" or something, but I rewrote the code by
using awk because it is more readable.

This issue was reported by Sami Tolvanen; in his Clang LTO patch set,
$(multi-used-m) is no longer an ELF object, but a thin archive that
contains LLVM bitcode files. llvm-nm prints out symbols for each
archive member separately, which results a lot of dupications, in some
places, beyond the system-defined limit.

This problem must be fixed irrespective of LTO, and we must ensure
zero possibility of having this issue.

Link: https://lkml.org/lkml/2020/12/1/1658
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
3 years agoRevert "mei: virtio: virtualization frontend driver"
Michael S. Tsirkin [Sat, 5 Dec 2020 19:38:46 +0000 (14:38 -0500)]
Revert "mei: virtio: virtualization frontend driver"

This reverts commit d162219c655c8cf8003128a13840d6c1e183fb80.

The device uses a VIRTIO device ID out of a not-for-production range.
Releasing Linux using an ID out of this range will make it conflict with
development setups. An official request to reserve an ID for an MEI
device is yet to be submitted to the virtio TC, thus there's no chance
it will be reserved and fixed in time before the next release.

Once requested it usually takes 2-3 weeks to land in the spec, which
means the device can be supported with the official ID in the next Linux
version if contributors act quickly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Alexander Usyskin <alexander.usyskin@intel.com>
Cc: Wang Yu <yu1.wang@intel.com>
Cc: Liu Shuo <shuo.a.liu@intel.com>
Link: https://lore.kernel.org/r/20201205193625.469773-1-mst@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agox86/sev-es: Use new for_each_insn_prefix() macro to loop over prefixes bytes
Masami Hiramatsu [Thu, 3 Dec 2020 04:51:01 +0000 (13:51 +0900)]
x86/sev-es: Use new for_each_insn_prefix() macro to loop over prefixes bytes

Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper
check must be:

  insn.prefixes.bytes[i] != 0 and i < 4

instead of using insn.prefixes.nbytes. Use the new
for_each_insn_prefix() macro which does it correctly.

Debugged by Kees Cook <keescook@chromium.org>.

 [ bp: Massage commit message. ]

Fixes: 25189d08e516 ("x86/sev-es: Add support for handling IOIO exceptions")
Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/160697106089.3146288.2052422845039649176.stgit@devnote2
3 years agox86/insn-eval: Use new for_each_insn_prefix() macro to loop over prefixes bytes
Masami Hiramatsu [Thu, 3 Dec 2020 04:50:50 +0000 (13:50 +0900)]
x86/insn-eval: Use new for_each_insn_prefix() macro to loop over prefixes bytes

Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper check must
be

  insn.prefixes.bytes[i] != 0 and i < 4

instead of using insn.prefixes.nbytes. Use the new
for_each_insn_prefix() macro which does it correctly.

Debugged by Kees Cook <keescook@chromium.org>.

 [ bp: Massage commit message. ]

Fixes: 32d0b95300db ("x86/insn-eval: Add utility functions to get segment selector")
Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/160697104969.3146288.16329307586428270032.stgit@devnote2
3 years agox86/uprobes: Do not use prefixes.nbytes when looping over prefixes.bytes
Masami Hiramatsu [Thu, 3 Dec 2020 04:50:37 +0000 (13:50 +0900)]
x86/uprobes: Do not use prefixes.nbytes when looping over prefixes.bytes

Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper check must
be

  insn.prefixes.bytes[i] != 0 and i < 4

instead of using insn.prefixes.nbytes.

Introduce a for_each_insn_prefix() macro for this purpose. Debugged by
Kees Cook <keescook@chromium.org>.

 [ bp: Massage commit message, sync with the respective header in tools/
   and drop "we". ]

Fixes: 2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints")
Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/160697103739.3146288.7437620795200799020.stgit@devnote2
3 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sun, 6 Dec 2020 00:16:34 +0000 (16:16 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:
 "A fix for 'RETRIGEN' handling in Atmel touch controllers that was
  causing lost interrupts on systems using edge-triggered interrupts, a
  quirk for i8042 driver, and a couple more fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: atmel_mxt_ts - fix lost interrupts
  Input: xpad - support Ardwiino Controllers
  Input: i8042 - add ByteSpeed touchpad to noloop table
  Input: i8042 - fix error return code in i8042_setup_aux()
  Input: soc_button_array - add missing include

3 years agonet: mscc: ocelot: fix dropping of unknown IPv4 multicast on Seville
Vladimir Oltean [Fri, 4 Dec 2020 17:54:16 +0000 (19:54 +0200)]
net: mscc: ocelot: fix dropping of unknown IPv4 multicast on Seville

The current assumption is that the felix DSA driver has flooding knobs
per traffic class, while ocelot switchdev has a single flooding knob.
This was correct for felix VSC9959 and ocelot VSC7514, but with the
introduction of seville VSC9953, we see a switch driven by felix.c which
has a single flooding knob.

So it is clear that we must do what should have been done from the
beginning, which is not to overwrite the configuration done by ocelot.c
in felix, but instead to teach the common ocelot library about the
differences in our switches, and set up the flooding PGIDs centrally.

The effect that the bogus iteration through FELIX_NUM_TC has upon
seville is quite dramatic. ANA_FLOODING is located at 0x00b548, and
ANA_FLOODING_IPMC is located at 0x00b54c. So the bogus iteration will
actually overwrite ANA_FLOODING_IPMC when attempting to write
ANA_FLOODING[1]. There is no ANA_FLOODING[1] in sevile, just ANA_FLOODING.

And when ANA_FLOODING_IPMC is overwritten with a bogus value, the effect
is that ANA_FLOODING_IPMC gets the value of 0x0003CF7D:
MC6_DATA = 61,
MC6_CTRL = 61,
MC4_DATA = 60,
MC4_CTRL = 0.
Because MC4_CTRL is zero, this means that IPv4 multicast control packets
are not flooded, but dropped. An invalid configuration, and this is how
the issue was actually spotted.

Reported-by: Eldar Gasanov <eldargasanov2@gmail.com>
Reported-by: Maxim Kochetkov <fido_max@inbox.ru>
Tested-by: Eldar Gasanov <eldargasanov2@gmail.com>
Fixes: 84705fc16552 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
Fixes: 3c7b51bd39b2 ("net: dsa: felix: allow flooding for all traffic classes")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201204175416.1445937-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 5 Dec 2020 23:27:22 +0000 (15:27 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Some more I2C driver updates. IMX updates are a tad bigger, but not
  exceptionally big"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mlxbf: Fix the return check of devm_ioremap and ioremap
  i2c: mlxbf: select CONFIG_I2C_SLAVE
  i2c: imx: Don't generate STOP condition if arbitration has been lost
  i2c: imx: Check for I2SR_IAL after every byte
  i2c: imx: Fix reset of I2SR_IAL flag
  i2c: qcom: Fix IRQ error misassignement
  i2c: qup: Fix error return code in qup_i2c_bam_schedule_desc()

3 years agoMerge tag 'block-5.10-2020-12-05' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 5 Dec 2020 22:45:30 +0000 (14:45 -0800)]
Merge tag 'block-5.10-2020-12-05' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
 "Single fix for an issue with chunk_sectors and stacked devices"

* tag 'block-5.10-2020-12-05' of git://git.kernel.dk/linux-block:
  block: use gcd() to fix chunk_sectors limit stacking

3 years agoMerge tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 5 Dec 2020 22:39:59 +0000 (14:39 -0800)]
Merge tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
 "Just a small fix this time, for an issue with 32-bit compat apps and
  buffer selection with recvmsg"

* tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block:
  io_uring: fix recvmsg setup with compat buf-select

3 years agonet: marvell: prestera: Fix error return code in prestera_port_create()
Zhang Changzhong [Fri, 4 Dec 2020 08:49:42 +0000 (16:49 +0800)]
net: marvell: prestera: Fix error return code in prestera_port_create()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Link: https://lore.kernel.org/r/1607071782-34006-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agovrf: packets with lladdr src needs dst at input with orig_iif when needs strict
Stephen Suryaputra [Fri, 4 Dec 2020 03:06:04 +0000 (22:06 -0500)]
vrf: packets with lladdr src needs dst at input with orig_iif when needs strict

Depending on the order of the routes to fe80::/64 are installed on the
VRF table, the NS for the source link-local address of the originator
might be sent to the wrong interface.

This patch ensures that packets with link-local addr source is doing a
lookup with the orig_iif when the destination addr indicates that it
is strict.

Add the reproducer as a use case in self test script fcnal-test.sh.

Fixes: b4869aa2f881 ("net: vrf: ipv6 support for local traffic to local addresses")
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20201204030604.18828-1-ssuryaextr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agocan: softing: softing_netdev_open(): fix error handling
Zhang Qilong [Fri, 4 Dec 2020 13:35:06 +0000 (14:35 +0100)]
can: softing: softing_netdev_open(): fix error handling

If softing_netdev_open() fails, we should call close_candev() to avoid
reference leak.

Fixes: 03fd3cf5a179d ("can: add driver for Softing card")
Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Link: https://lore.kernel.org/r/20201202151632.1343786-1-zhangqilong3@huawei.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20201204133508.742120-2-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoch_ktls: fix build warning for ipv4-only config
Arnd Bergmann [Thu, 3 Dec 2020 22:26:16 +0000 (23:26 +0100)]
ch_ktls: fix build warning for ipv4-only config

When CONFIG_IPV6 is disabled, clang complains that a variable
is uninitialized for non-IPv4 data:

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1046:6: error: variable 'cntrl1' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
        if (tx_info->ip_family == AF_INET) {
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1059:2: note: uninitialized use occurs here
        cntrl1 |= T6_TXPKT_ETHHDR_LEN_V(maclen - ETH_HLEN) |
        ^~~~~~

Replace the preprocessor conditional with the corresponding C version,
and make the ipv4 case unconditional in this configuration to improve
readability and avoid the warning.

Fixes: 86716b51d14f ("ch_ktls: Update cheksum information")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20201203222641.964234-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'powerpc-5.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 5 Dec 2020 19:16:21 +0000 (11:16 -0800)]
Merge tag 'powerpc-5.10-5' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Some more powerpc fixes for 5.10:

   - Three commits fixing possible missed TLB invalidations for
     multi-threaded processes when CPUs are hotplugged in and out.

   - A fix for a host crash triggerable by host userspace (qemu) in KVM
     on Power9.

   - A fix for a host crash in machine check handling when running HPT
     guests on a HPT host.

   - One commit fixing potential missed TLB invalidations when using the
     hash MMU on Power9 or later.

   - A regression fix for machines with CPUs on node 0 but no memory.

  Thanks to Aneesh Kumar K.V, CĂ©dric Le Goater, Greg Kurz, Milan
  Mohanty, Milton Miller, Nicholas Piggin, Paul Mackerras, and Srikar
  Dronamraju"

* tag 'powerpc-5.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/powernv: Fix memory corruption when saving SLB entries on MCE
  KVM: PPC: Book3S HV: XIVE: Fix vCPU id sanity check
  powerpc/numa: Fix a regression on memoryless node 0
  powerpc/64s: Trim offlined CPUs from mm_cpumasks
  kernel/cpu: add arch override for clear_tasks_mm_cpumask() mm handling
  powerpc/64s/pseries: Fix hash tlbiel_all_isa300 for guest kernels
  powerpc/64s: Fix hash ISA v3.0 TLBIEL instruction generation

3 years agoMerge tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 5 Dec 2020 19:09:07 +0000 (11:09 -0800)]
Merge tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Three smb3 fixes (two for stable) fixing

   - a null pointer issue in a DFS error path

   - a problem with excessive padding when mounted with "idsfromsid"
     causing owner fields to get corrupted

   - a more recent problem with compounded reparse point query found in
     testing to the Linux kernel server"

* tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: refactor create_sd_buf() and and avoid corrupting the buffer
  cifs: add NULL check for ses->tcon_ipc
  smb3: set COMPOUND_FID to FileID field of subsequent compound request

3 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 5 Dec 2020 18:59:21 +0000 (10:59 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four small fixes in two drivers.

  The mpt3sas fixes are all problems with timeout under unusual
  conditions, and the storvsc is a missed incoming packet validation
  and a missed error return"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mpt3sas: Increase IOCInit request timeout to 30s
  scsi: mpt3sas: Fix ioctl timeout
  scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()
  scsi: storvsc: Fix error return in storvsc_probe()

3 years agoMerge tag 'for-5.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 5 Dec 2020 18:51:25 +0000 (10:51 -0800)]
Merge tag 'for-5.10/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull fix for device mapper fixes from Mike Snitzer:
 "Apologies for the glaring bug I introduced with my previous pull
  request!

  Fix incorrect branching at top of blk_max_size_offset()"

* tag 'for-5.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  block: fix incorrect branching in blk_max_size_offset()

3 years agoi2c: mlxbf: Fix the return check of devm_ioremap and ioremap
Wang Xiaojun [Thu, 3 Dec 2020 01:46:47 +0000 (20:46 -0500)]
i2c: mlxbf: Fix the return check of devm_ioremap and ioremap

devm_ioremap and ioremap may return NULL which cannot be checked
by IS_ERR.

Signed-off-by: Wang Xiaojun <wangxiaojun11@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
3 years agoi2c: mlxbf: select CONFIG_I2C_SLAVE
Arnd Bergmann [Thu, 3 Dec 2020 22:32:50 +0000 (23:32 +0100)]
i2c: mlxbf: select CONFIG_I2C_SLAVE

If this is not enabled, the interfaces used in this driver do not work:

drivers/i2c/busses/i2c-mlxbf.c:1888:3: error: implicit declaration of function 'i2c_slave_event' [-Werror,-Wimplicit-function-declaration]
                i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value);
                ^
drivers/i2c/busses/i2c-mlxbf.c:1888:26: error: use of undeclared identifier 'I2C_SLAVE_WRITE_REQUESTED'
                i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value);
                                       ^
drivers/i2c/busses/i2c-mlxbf.c:1890:32: error: use of undeclared identifier 'I2C_SLAVE_WRITE_RECEIVED'
                ret = i2c_slave_event(slave, I2C_SLAVE_WRITE_RECEIVED,
                                             ^
drivers/i2c/busses/i2c-mlxbf.c:1892:26: error: use of undeclared identifier 'I2C_SLAVE_STOP'
                i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
                                       ^

Fixes: b5b5b32081cd ("i2c: mlxbf: I2C SMBus driver for Mellanox BlueField SoC")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
3 years agomac80211: mesh: fix mesh_pathtbl_init() error path
Eric Dumazet [Fri, 4 Dec 2020 16:24:28 +0000 (08:24 -0800)]
mac80211: mesh: fix mesh_pathtbl_init() error path

If tbl_mpp can not be allocated, we call mesh_table_free(tbl_path)
while tbl_path rhashtable has not yet been initialized, which causes
panics.

Simply factorize the rhashtable_init() call into mesh_table_alloc()

WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __flush_work kernel/workqueue.c:3040 [inline]
WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136
Modules linked in:
CPU: 1 PID: 8474 Comm: syz-executor663 Not tainted 5.10.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__flush_work kernel/workqueue.c:3040 [inline]
RIP: 0010:__cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136
Code: 5d c3 e8 bf ae 29 00 0f 0b e9 f0 fd ff ff e8 b3 ae 29 00 0f 0b 43 80 3c 3e 00 0f 85 31 ff ff ff e9 34 ff ff ff e8 9c ae 29 00 <0f> 0b e9 dc fe ff ff 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 7d fd ff
RSP: 0018:ffffc9000165f5a0 EFLAGS: 00010293
RAX: ffffffff814b7064 RBX: 0000000000000001 RCX: ffff888021c80000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888024039ca0 R08: dffffc0000000000 R09: fffffbfff1dd3e64
R10: fffffbfff1dd3e64 R11: 0000000000000000 R12: 1ffff920002cbebd
R13: ffff888024039c88 R14: 1ffff11004807391 R15: dffffc0000000000
FS:  0000000001347880(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000140 CR3: 000000002cc0a000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 rhashtable_free_and_destroy+0x25/0x9c0 lib/rhashtable.c:1137
 mesh_table_free net/mac80211/mesh_pathtbl.c:69 [inline]
 mesh_pathtbl_init+0x287/0x2e0 net/mac80211/mesh_pathtbl.c:785
 ieee80211_mesh_init_sdata+0x2ee/0x530 net/mac80211/mesh.c:1591
 ieee80211_setup_sdata+0x733/0xc40 net/mac80211/iface.c:1569
 ieee80211_if_add+0xd5c/0x1cd0 net/mac80211/iface.c:1987
 ieee80211_add_iface+0x59/0x130 net/mac80211/cfg.c:125
 rdev_add_virtual_intf net/wireless/rdev-ops.h:45 [inline]
 nl80211_new_interface+0x563/0xb40 net/wireless/nl80211.c:3855
 genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0xe4e/0x1280 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x190/0x3a0 net/netlink/af_netlink.c:2494
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x780/0x930 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x9a8/0xd40 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg net/socket.c:671 [inline]
 ____sys_sendmsg+0x519/0x800 net/socket.c:2353
 ___sys_sendmsg net/socket.c:2407 [inline]
 __sys_sendmsg+0x2b1/0x360 net/socket.c:2440
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 60854fd94573 ("mac80211: mesh: convert path table to rhashtable")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20201204162428.2583119-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years ago[SECURITY] fix namespaced fscaps when !CONFIG_SECURITY
Serge Hallyn [Mon, 16 Nov 2020 03:55:31 +0000 (21:55 -0600)]
[SECURITY] fix namespaced fscaps when !CONFIG_SECURITY

Namespaced file capabilities were introduced in 8db6c34f1dbc .
When userspace reads an xattr for a namespaced capability, a
virtualized representation of it is returned if the caller is
in a user namespace owned by the capability's owning rootid.
The function which performs this virtualization was not hooked
up if CONFIG_SECURITY=n.  Therefore in that case the original
xattr was shown instead of the virtualized one.

To test this using libcap-bin (*1),

$ v=$(mktemp)
$ unshare -Ur setcap cap_sys_admin-eip $v
$ unshare -Ur setcap -v cap_sys_admin-eip $v
/tmp/tmp.lSiIFRvt8Y: OK

"setcap -v" verifies the values instead of setting them, and
will check whether the rootid value is set.  Therefore, with
this bug un-fixed, and with CONFIG_SECURITY=n, setcap -v will
fail:

$ v=$(mktemp)
$ unshare -Ur setcap cap_sys_admin=eip $v
$ unshare -Ur setcap -v cap_sys_admin=eip $v
nsowner[got=1000, want=0],/tmp/tmp.HHDiOOl9fY differs in []

Fix this bug by calling cap_inode_getsecurity() in
security_inode_getsecurity() instead of returning
-EOPNOTSUPP, when CONFIG_SECURITY=n.

*1 - note, if libcap is too old for getcap to have the '-n'
option, then use verify-caps instead.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209689
Cc: Hervé Guillemet <herve@guillemet.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Serge Hallyn <shallyn@cisco.com>
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
3 years agoopenvswitch: fix error return code in validate_and_copy_dec_ttl()
Wang Hai [Fri, 4 Dec 2020 11:43:14 +0000 (19:43 +0800)]
openvswitch: fix error return code in validate_and_copy_dec_ttl()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Changing 'return start' to 'return action_start' can fix this bug.

Fixes: 69929d4c49e1 ("net: openvswitch: fix TTL decrement action netlink message format")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20201204114314.1596-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: bridge: vlan: fix error return code in __vlan_add()
Zhang Changzhong [Fri, 4 Dec 2020 08:48:56 +0000 (16:48 +0800)]
net: bridge: vlan: fix error return code in __vlan_add()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: f8ed289fab84 ("bridge: vlan: use br_vlan_(get|put)_master to deal with refcounts")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/1607071737-33875-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoipv4: fix error return code in rtm_to_fib_config()
Zhang Changzhong [Fri, 4 Dec 2020 08:48:14 +0000 (16:48 +0800)]
ipv4: fix error return code in rtm_to_fib_config()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/1607071695-33740-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoethernet: select CONFIG_CRC32 as needed
Arnd Bergmann [Thu, 3 Dec 2020 23:20:37 +0000 (00:20 +0100)]
ethernet: select CONFIG_CRC32 as needed

A number of ethernet drivers require crc32 functionality to be
avaialable in the kernel, causing a link error otherwise:

arm-linux-gnueabi-ld: drivers/net/ethernet/agere/et131x.o: in function `et1310_setup_device_for_multicast':
et131x.c:(.text+0x5918): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/cadence/macb_main.o: in function `macb_start_xmit':
macb_main.c:(.text+0x4b88): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/faraday/ftgmac100.o: in function `ftgmac100_set_rx_mode':
ftgmac100.c:(.text+0x2b38): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fec_main.o: in function `set_multicast_list':
fec_main.c:(.text+0x6120): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fman/fman_dtsec.o: in function `dtsec_add_hash_mac_address':
fman_dtsec.c:(.text+0x830): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fman/fman_dtsec.o:fman_dtsec.c:(.text+0xb68): more undefined references to `crc32_le' follow
arm-linux-gnueabi-ld: drivers/net/ethernet/netronome/nfp/nfpcore/nfp_hwinfo.o: in function `nfp_hwinfo_read':
nfp_hwinfo.c:(.text+0x250): undefined reference to `crc32_be'
arm-linux-gnueabi-ld: nfp_hwinfo.c:(.text+0x288): undefined reference to `crc32_be'
arm-linux-gnueabi-ld: drivers/net/ethernet/netronome/nfp/nfpcore/nfp_resource.o: in function `nfp_resource_acquire':
nfp_resource.c:(.text+0x144): undefined reference to `crc32_be'
arm-linux-gnueabi-ld: nfp_resource.c:(.text+0x158): undefined reference to `crc32_be'
arm-linux-gnueabi-ld: drivers/net/ethernet/nxp/lpc_eth.o: in function `lpc_eth_set_multicast_list':
lpc_eth.c:(.text+0x1934): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_flow_tbl_do':
rocker_ofdpa.c:(.text+0x2e08): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_flow_tbl_del':
rocker_ofdpa.c:(.text+0x3074): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_port_fdb':
arm-linux-gnueabi-ld: drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.o: in function `mlx5dr_ste_calc_hash_index':
dr_ste.c:(.text+0x354): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/net/ethernet/microchip/lan743x_main.o: in function `lan743x_netdev_set_multicast':
lan743x_main.c:(.text+0x5dc4): undefined reference to `crc32_le'

Add the missing 'select CRC32' entries in Kconfig for each of them.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Acked-by: Mark Einon <mark.einon@gmail.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Link: https://lore.kernel.org/r/20201203232114.1485603-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: pass the correct size when freeing DMA memory
Alex Elder [Thu, 3 Dec 2020 21:51:06 +0000 (15:51 -0600)]
net: ipa: pass the correct size when freeing DMA memory

When the coherent memory is freed in gsi_trans_pool_exit_dma(), we
are mistakenly passing the size of a single element in the pool
rather than the actual allocated size.  Fix this bug.

Fixes: 9dd441e4ed575 ("soc: qcom: ipa: GSI transactions")
Reported-by: Stephen Boyd <swboyd@chromium.org>
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20201203215106.17450-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoblock: fix incorrect branching in blk_max_size_offset()
Mike Snitzer [Fri, 4 Dec 2020 22:21:03 +0000 (17:21 -0500)]
block: fix incorrect branching in blk_max_size_offset()

If non-zero 'chunk_sectors' is passed in to blk_max_size_offset() that
override will be incorrectly ignored.

Old blk_max_size_offset() branching, prior to commit 3ee16db390b4,
must be used only if passed 'chunk_sectors' override is zero.

Fixes: 3ee16db390b4 ("dm: fix IO splitting")
Cc: stable@vger.kernel.org # 5.9
Reported-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
3 years agonet/sched: fq_pie: initialize timer earlier in fq_pie_init()
Davide Caratti [Thu, 3 Dec 2020 18:40:47 +0000 (19:40 +0100)]
net/sched: fq_pie: initialize timer earlier in fq_pie_init()

with the following tdc testcase:

 83be: (qdisc, fq_pie) Create FQ-PIE with invalid number of flows

as fq_pie_init() fails, fq_pie_destroy() is called to clean up. Since the
timer is not yet initialized, it's possible to observe a splat like this:

  INFO: trying to register non-static key.
  the code is fine but needs lockdep annotation.
  turning off the locking correctness validator.
  CPU: 0 PID: 975 Comm: tc Not tainted 5.10.0-rc4+ #298
  Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
  Call Trace:
   dump_stack+0x99/0xcb
   register_lock_class+0x12dd/0x1750
   __lock_acquire+0xfe/0x3970
   lock_acquire+0x1c8/0x7f0
   del_timer_sync+0x49/0xd0
   fq_pie_destroy+0x3f/0x80 [sch_fq_pie]
   qdisc_create+0x916/0x1160
   tc_modify_qdisc+0x3c4/0x1630
   rtnetlink_rcv_msg+0x346/0x8e0
   netlink_unicast+0x439/0x630
   netlink_sendmsg+0x719/0xbf0
   sock_sendmsg+0xe2/0x110
   ____sys_sendmsg+0x5ba/0x890
   ___sys_sendmsg+0xe9/0x160
   __sys_sendmsg+0xd3/0x170
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [...]
  ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
  WARNING: CPU: 0 PID: 975 at lib/debugobjects.c:508 debug_print_object+0x162/0x210
  [...]
  Call Trace:
   debug_object_assert_init+0x268/0x380
   try_to_del_timer_sync+0x6a/0x100
   del_timer_sync+0x9e/0xd0
   fq_pie_destroy+0x3f/0x80 [sch_fq_pie]
   qdisc_create+0x916/0x1160
   tc_modify_qdisc+0x3c4/0x1630
   rtnetlink_rcv_msg+0x346/0x8e0
   netlink_rcv_skb+0x120/0x380
   netlink_unicast+0x439/0x630
   netlink_sendmsg+0x719/0xbf0
   sock_sendmsg+0xe2/0x110
   ____sys_sendmsg+0x5ba/0x890
   ___sys_sendmsg+0xe9/0x160
   __sys_sendmsg+0xd3/0x170
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

fix it moving timer_setup() before any failure, like it was done on 'red'
with former commit 608b4adab178 ("net_sched: initialize timer earlier in
red_init()").

Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/r/2e78e01c504c633ebdff18d041833cf2e079a3a4.1607020450.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotracing: Fix userstacktrace option for instances
Steven Rostedt (VMware) [Fri, 4 Dec 2020 21:36:16 +0000 (16:36 -0500)]
tracing: Fix userstacktrace option for instances

When the instances were able to use their own options, the userstacktrace
option was left hardcoded for the top level. This made the instance
userstacktrace option bascially into a nop, and will confuse users that set
it, but nothing happens (I was confused when it happened to me!)

Cc: stable@vger.kernel.org
Fixes: 16270145ce6b ("tracing: Add trace options for core options to instances")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoMerge tag 'for-5.10/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Fri, 4 Dec 2020 21:28:39 +0000 (13:28 -0800)]
Merge tag 'for-5.10/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix DM's bio splitting changes that were made during v5.9. This
   restores splitting in terms of varied per-target ti->max_io_len
   rather than use block core's single stacked 'chunk_sectors' limit.

 - Like DM crypt, update DM integrity to not use crypto drivers that
   have CRYPTO_ALG_ALLOCATES_MEMORY set.

 - Fix DM writecache target's argument parsing and status display.

 - Remove needless BUG() from dm writecache's persistent_memory_claim()

 - Remove old gcc workaround in DM cache target's block_div() for ARM
   link errors now that gcc >= 4.9 is required.

 - Fix RCU locking in dm_blk_report_zones and dm_dax_zero_page_range.

 - Remove old, and now frowned upon, BUG_ON(in_interrupt()) in
   dm_table_event().

 - Remove invalid sparse annotations from dm_prepare_ioctl() and
   dm_unprepare_ioctl().

* tag 'for-5.10/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: remove invalid sparse __acquires and __releases annotations
  dm: fix double RCU unlock in dm_dax_zero_page_range() error path
  dm: fix IO splitting
  dm writecache: remove BUG() and fail gracefully instead
  dm table: Remove BUG_ON(in_interrupt())
  dm: fix bug with RCU locking in dm_blk_report_zones
  Revert "dm cache: fix arm link errors with inline"
  dm writecache: fix the maximum number of arguments
  dm writecache: advance the number of arguments when reporting max_age
  dm integrity: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORY

3 years agodm: remove invalid sparse __acquires and __releases annotations
Mike Snitzer [Fri, 4 Dec 2020 20:25:18 +0000 (15:25 -0500)]
dm: remove invalid sparse __acquires and __releases annotations

Fixes sparse warnings:
drivers/md/dm.c:508:12: warning: context imbalance in 'dm_prepare_ioctl' - wrong count at exit
drivers/md/dm.c:543:13: warning: context imbalance in 'dm_unprepare_ioctl' - wrong count at exit

Fixes: 971888c46993f ("dm: hold DM table for duration of ioctl rather than use blkdev_get")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
3 years agodm: fix double RCU unlock in dm_dax_zero_page_range() error path
Mike Snitzer [Fri, 4 Dec 2020 20:19:27 +0000 (15:19 -0500)]
dm: fix double RCU unlock in dm_dax_zero_page_range() error path

Remove redundant dm_put_live_table() in dm_dax_zero_page_range() error
path to fix sparse warning:
drivers/md/dm.c:1208:9: warning: context imbalance in 'dm_dax_zero_page_range' - unexpected unlock

Fixes: cdf6cdcd3b99a ("dm,dax: Add dax zero_page_range operation")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
3 years agodm: fix IO splitting
Mike Snitzer [Mon, 30 Nov 2020 15:57:43 +0000 (10:57 -0500)]
dm: fix IO splitting

Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account
for target-specific splitting") caused a couple regressions:
1) Using lcm_not_zero() when stacking chunk_sectors was a bug because
   chunk_sectors must reflect the most limited of all devices in the
   IO stack.
2) DM targets that set max_io_len but that do _not_ provide an
   .iterate_devices method no longer had there IO split properly.

And commit 5091cdec56fa ("dm: change max_io_len() to use
blk_max_size_offset()") also caused a regression where DM no longer
supported varied (per target) IO splitting. The implication being the
potential for severely reduced performance for IO stacks that use a DM
target like dm-cache to hide performance limitations of a slower
device (e.g. one that requires 4K IO splitting).

Coming full circle: Fix all these issues by discontinuing stacking
chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(),
add optional chunk_sectors override argument to blk_max_size_offset()
and update DM's max_io_len() to pass ti->max_io_len to its
blk_max_size_offset() call.

Passing in an optional chunk_sectors override to blk_max_size_offset()
allows for code reuse of block's centralized calculation for max IO
size based on provided offset and split boundary.

Fixes: 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting")
Fixes: 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()")
Cc: stable@vger.kernel.org
Reported-by: John Dorminy <jdorminy@redhat.com>
Reported-by: Bruce Johnston <bjohnsto@redhat.com>
Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
3 years agoMerge tag 'mac80211-for-net-2020-12-04' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 4 Dec 2020 19:26:28 +0000 (11:26 -0800)]
Merge tag 'mac80211-for-net-2020-12-04' of git://git./linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Three small fixes:
 * initialize some data to avoid using stack garbage
 * fix 6 GHz channel selection in mac80211
 * correctly restart monitor mode interfaces in mac80211

* tag 'mac80211-for-net-2020-12-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
  mac80211: set SDATA_STATE_RUNNING for monitor interfaces
  cfg80211: initialize rekey_data
  mac80211: fix return value of ieee80211_chandef_he_6ghz_oper
====================

Link: https://lore.kernel.org/r/20201204122017.118099-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'drm-fixes-2020-12-04' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 4 Dec 2020 17:25:22 +0000 (09:25 -0800)]
Merge tag 'drm-fixes-2020-12-04' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "This week's regular fixes.

  i915 has fixes for a few races, use-after-free, and gpu hangs. Tegra
  just has some minor fixes that I didn't see much point in hanging on
  to. The nouveau fix is for all pre-nv50 cards and was reported a few
  times. Otherwise it's just some amdgpu, and a few misc fixes.

  Summary:

  amdgpu:
   - SMU11 manual fan fix
   - Renoir display clock fix
   - VCN3 dynamic powergating fix

  i915:
   - Program mocs:63 for cache eviction on gen9 (Chris)
   - Protect context lifetime with RCU (Chris)
   - Split the breadcrumb spinlock between global and contexts (Chris)
   - Retain default context state across shrinking (Venkata)
   - Limit frequency drop to RPe on parking (Chris)
   - Return earlier from intel_modeset_init() without display (Jani)
   - Defer initial modeset until after GGTT is initialized (Chris)

  nouveau:
   - pre-nv50 regression fix

  rockchip:
   - uninitialised LVDS property fix

  omap:
   - bridge fix

  panel:
   - race fix

  mxsfb:
   - fence sync fix
   - modifiers fix

  tegra:
   - idr init fix
   - sor fixes
   - output/of cleanup fix"

* tag 'drm-fixes-2020-12-04' of git://anongit.freedesktop.org/drm/drm: (22 commits)
  drm/amdgpu/vcn3.0: remove old DPG workaround
  drm/amdgpu/vcn3.0: stall DPG when WPTR/RPTR reset
  drm/amd/display: Init clock value by current vbios CLKs
  drm/amdgpu/pm/smu11: Fix fan set speed bug
  drm/i915/display: Defer initial modeset until after GGTT is initialised
  drm/i915/display: return earlier from intel_modeset_init() without display
  drm/i915/gt: Limit frequency drop to RPe on parking
  drm/i915/gt: Retain default context state across shrinking
  drm/i915/gt: Split the breadcrumb spinlock between global and contexts
  drm/i915/gt: Protect context lifetime with RCU
  drm/i915/gt: Program mocs:63 for cache eviction on gen9
  drm/omap: sdi: fix bridge enable/disable
  drm/panel: sony-acx565akm: Fix race condition in probe
  drm/rockchip: Avoid uninitialized use of endpoint id in LVDS
  drm/tegra: sor: Disable clocks on error in tegra_sor_init()
  drm/nouveau: make sure ret is initialized in nouveau_ttm_io_mem_reserve
  drm: mxsfb: Implement .format_mod_supported
  drm: mxsfb: fix fence synchronization
  drm/tegra: output: Do not put OF node twice
  drm/tegra: replace idr_init() by idr_init_base()
  ...

3 years agotty: Fix ->session locking
Jann Horn [Thu, 3 Dec 2020 01:25:05 +0000 (02:25 +0100)]
tty: Fix ->session locking

Currently, locking of ->session is very inconsistent; most places
protect it using the legacy tty mutex, but disassociate_ctty(),
__do_SAK(), tiocspgrp() and tiocgsid() don't.
Two of the writers hold the ctrl_lock (because they already need it for
->pgrp), but __proc_set_tty() doesn't do that yet.

On a PREEMPT=y system, an unprivileged user can theoretically abuse
this broken locking to read 4 bytes of freed memory via TIOCGSID if
tiocgsid() is preempted long enough at the right point. (Other things
might also go wrong, especially if root-only ioctls are involved; I'm
not sure about that.)

Change the locking on ->session such that:

 - tty_lock() is held by all writers: By making disassociate_ctty()
   hold it. This should be fine because the same lock can already be
   taken through the call to tty_vhangup_session().
   The tricky part is that we need to shorten the area covered by
   siglock to be able to take tty_lock() without ugly retry logic; as
   far as I can tell, this should be fine, since nothing in the
   signal_struct is touched in the `if (tty)` branch.
 - ctrl_lock is held by all writers: By changing __proc_set_tty() to
   hold the lock a little longer.
 - All readers that aren't holding tty_lock() hold ctrl_lock: By
   adding locking to tiocgsid() and __do_SAK(), and expanding the area
   covered by ctrl_lock in tiocspgrp().

Cc: stable@kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agotty: Fix ->pgrp locking in tiocspgrp()
Jann Horn [Thu, 3 Dec 2020 01:25:04 +0000 (02:25 +0100)]
tty: Fix ->pgrp locking in tiocspgrp()

tiocspgrp() takes two tty_struct pointers: One to the tty that userspace
passed to ioctl() (`tty`) and one to the TTY being changed (`real_tty`).
These pointers are different when ioctl() is called with a master fd.

To properly lock real_tty->pgrp, we must take real_tty->ctrl_lock.

This bug makes it possible for racing ioctl(TIOCSPGRP, ...) calls on
both sides of a PTY pair to corrupt the refcount of `struct pid`,
leading to use-after-free errors.

Fixes: 47f86834bbd4 ("redo locking of tty->pgrp")
CC: stable@kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxsk: Return error code if force_zc is set
Zhang Changzhong [Fri, 4 Dec 2020 10:21:16 +0000 (18:21 +0800)]
xsk: Return error code if force_zc is set

If force_zc is set, we should exit out with an error, not fall back to
copy mode.

Fixes: 921b68692abb ("xsk: Enable sharing of dma mappings")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/1607077277-41995-1-git-send-email-zhangchangzhong@huawei.com
3 years agousb: gadget: f_fs: Use local copy of descriptors for userspace copy
Vamsi Krishna Samavedam [Mon, 30 Nov 2020 20:34:53 +0000 (12:34 -0800)]
usb: gadget: f_fs: Use local copy of descriptors for userspace copy

The function may be unbound causing the ffs_ep and its descriptors
to be freed while userspace is in the middle of an ioctl requesting
the same descriptors. Avoid dangling pointer reference by first
making a local copy of desctiptors before releasing the spinlock.

Fixes: c559a3534109 ("usb: gadget: f_fs: add ioctl returning ep descriptor")
Reviewed-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Vamsi Krishna Samavedam <vskrishn@codeaurora.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201130203453.28154-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: ohci-omap: Fix descriptor conversion
Linus Walleij [Mon, 30 Nov 2020 08:30:33 +0000 (09:30 +0100)]
usb: ohci-omap: Fix descriptor conversion

There were a bunch of issues with the patch converting the
OMAP1 OSK board to use descriptors for controlling the USB
host:

- The chip label was incorrect
- The GPIO offset was off-by-one
- The code should use sleeping accessors

This patch tries to fix all issues at the same time.

Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Fixes: 15d157e87443 ("usb: ohci-omap: Convert to use GPIO descriptors")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20201130083033.29435-1-linus.walleij@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>