Linus Torvalds [Fri, 12 Nov 2021 18:56:25 +0000 (10:56 -0800)]
thermal: int340x: fix build on 32-bit targets
[ Upstream commit
d9c8e52ff9e84ff1a406330f9ea4de7c5eb40282 ]
Commit
aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot
64 bit RFIM responses") started using 'readq()' to read 64-bit status
responses from the int340x hardware.
That's all fine and good, but on 32-bit targets a 64-bit 'readq()' is
ambiguous, since it's no longer an atomic access. Some hardware might
require 64-bit accesses, and other hardware might want low word first or
high word first.
It's quite likely that the driver isn't relevant in a 32-bit environment
any more, and there's a patch floating around to just make it depend on
X86_64, but let's make it buildable on x86-32 anyway.
The driver previously just read the low 32 bits, so the hardware
certainly is ok with 32-bit reads, and in a little-endian environment
the low word first model is the natural one.
So just add the include for the 'io-64-nonatomic-lo-hi.h' version.
Fixes:
aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Willem de Bruijn [Thu, 11 Nov 2021 11:57:17 +0000 (06:57 -0500)]
selftests/net: udpgso_bench_rx: fix port argument
[ Upstream commit
d336509cb9d03970911878bb77f0497f64fda061 ]
The below commit added optional support for passing a bind address.
It configures the sockaddr bind arguments before parsing options and
reconfigures on options -b and -4.
This broke support for passing port (-p) on its own.
Configure sockaddr after parsing all arguments.
Fixes:
3327a9c46352 ("selftests: add functionals test for UDP GRO")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Rahul Lakkireddy [Thu, 11 Nov 2021 10:25:16 +0000 (15:55 +0530)]
cxgb4: fix eeprom len when diagnostics not implemented
[ Upstream commit
4ca110bf8d9b31a60f8f8ff6706ea147d38ad97c ]
Ensure diagnostics monitoring support is implemented for the SFF 8472
compliant port module and set the correct length for ethtool port
module eeprom read.
Fixes:
f56ec6766dcf ("cxgb4: Add support for ethtool i2c dump")
Signed-off-by: Manoj Malviya <manojmalviya@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dust Li [Wed, 10 Nov 2021 07:02:34 +0000 (15:02 +0800)]
net/smc: fix sk_refcnt underflow on linkdown and fallback
[ Upstream commit
e5d5aadcf3cd59949316df49c27cb21788d7efe4 ]
We got the following WARNING when running ab/nginx
test with RDMA link flapping (up-down-up).
The reason is when smc_sock fallback and at linkdown
happens simultaneously, we may got the following situation:
__smc_lgr_terminate()
--> smc_conn_kill()
--> smc_close_active_abort()
smc_sock->sk_state = SMC_CLOSED
sock_put(smc_sock)
smc_sock was set to SMC_CLOSED and sock_put() been called
when terminate the link group. But later application call
close() on the socket, then we got:
__smc_release():
if (smc_sock->fallback)
smc_sock->sk_state = SMC_CLOSED
sock_put(smc_sock)
Again we set the smc_sock to CLOSED through it's already
in CLOSED state, and double put the refcnt, so the following
warning happens:
refcount_t: underflow; use-after-free.
WARNING: CPU: 5 PID: 860 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
Modules linked in:
CPU: 5 PID: 860 Comm: nginx Not tainted 5.10.46+ #403
Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014
RIP: 0010:refcount_warn_saturate+0x8d/0xf0
Code: 05 5c 1e b5 01 01 e8 52 25 bc ff 0f 0b c3 80 3d 4f 1e b5 01 00 75 ad 48
RSP: 0018:
ffffc90000527e50 EFLAGS:
00010286
RAX:
0000000000000026 RBX:
ffff8881300df2c0 RCX:
0000000000000027
RDX:
0000000000000000 RSI:
ffff88813bd58040 RDI:
ffff88813bd58048
RBP:
0000000000000000 R08:
0000000000000003 R09:
0000000000000001
R10:
ffff8881300df2c0 R11:
ffffc90000527c78 R12:
ffff8881300df340
R13:
ffff8881300df930 R14:
ffff88810b3dad80 R15:
ffff8881300df4f8
FS:
00007f739de8fb80(0000) GS:
ffff88813bd40000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
000000000a01b008 CR3:
0000000111b64003 CR4:
00000000003706e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
smc_release+0x353/0x3f0
__sock_release+0x3d/0xb0
sock_close+0x11/0x20
__fput+0x93/0x230
task_work_run+0x65/0xa0
exit_to_user_mode_prepare+0xf9/0x100
syscall_exit_to_user_mode+0x27/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This patch adds check in __smc_release() to make
sure we won't do an extra sock_put() and set the
socket to CLOSED when its already in CLOSED state.
Fixes:
51f1de79ad8e (net/smc: replace sock_put worker by socket refcounting)
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eiichi Tsukata [Tue, 9 Nov 2021 00:15:02 +0000 (00:15 +0000)]
vsock: prevent unnecessary refcnt inc for nonblocking connect
[ Upstream commit
c7cd82b90599fa10915f41e3dd9098a77d0aa7b6 ]
Currently vosck_connect() increments sock refcount for nonblocking
socket each time it's called, which can lead to memory leak if
it's called multiple times because connect timeout function decrements
sock refcount only once.
Fixes it by making vsock_connect() return -EALREADY immediately when
sock state is already SS_CONNECTING.
Fixes:
d021c344051a ("VSOCK: Introduce VM Sockets")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Marek Behún [Mon, 8 Nov 2021 21:49:18 +0000 (22:49 +0100)]
net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
[ Upstream commit
bb7bbb6e36474933540c24ae1f1ad651b843981f ]
Commit
bfe301ebbc94 ("net: mvpp2: convert to use
mac_prepare()/mac_finish()") introduced a bug wherein it leaves the MAC
RESET register asserted after mac_finish(), due to wrong order of
function calls.
Before it was:
.mac_config()
mvpp22_mode_reconfigure()
assert reset
mvpp2_xlg_config()
deassert reset
Now it is:
.mac_prepare()
.mac_config()
mvpp2_xlg_config()
deassert reset
.mac_finish()
mvpp2_xlg_config()
assert reset
Obviously this is wrong.
This bug is triggered when phylink tries to change the PHY interface
mode from a GMAC mode (sgmii, 1000base-x, 2500base-x) to XLG mode
(10gbase-r, xaui). The XLG mode does not work since reset is left
asserted. Only after
ifconfig down && ifconfig up
is called will the XLG mode work.
Move the call to mvpp22_mode_reconfigure() to .mac_prepare()
implementation. Since some of the subsequent functions need to know
whether the interface is being changed, we unfortunately also need to
pass around the new interface mode before setting port->phy_interface.
Fixes:
bfe301ebbc94 ("net: mvpp2: convert to use mac_prepare()/mac_finish()")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christophe JAILLET [Mon, 8 Nov 2021 21:28:55 +0000 (22:28 +0100)]
net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
[ Upstream commit
7a166854b4e24c57d56b3eba9fe1594985ee0a2c ]
It is spurious to allocate a bitmap without initializing it.
So, better safe than sorry, initialize it to 0 at least to have some known
values.
While at it, switch to the devm_bitmap_ API which is less verbose.
Fixes:
4b41d3436796 ("net: ethernet: ti: cpsw: allow untagged traffic on host port")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Vladimir Oltean [Mon, 8 Nov 2021 20:28:54 +0000 (22:28 +0200)]
net: stmmac: allow a tc-taprio base-time of zero
[ Upstream commit
f64ab8e4f368f48afb08ae91928e103d17b235e9 ]
Commit
fe28c53ed71d ("net: stmmac: fix taprio configuration when
base_time is in the past") allowed some base time values in the past,
but apparently not all, the base-time value of 0 (Jan 1st 1970) is still
explicitly denied by the driver.
Remove the bogus check.
Fixes:
b60189e0392f ("net: stmmac: Integrate EST with TAPRIO scheduler API")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Guangbin Huang [Wed, 10 Nov 2021 13:42:56 +0000 (21:42 +0800)]
net: hns3: allow configure ETS bandwidth of all TCs
[ Upstream commit
688db0c7a4a69ddc8b8143a1cac01eb20082a3aa ]
Currently, driver only allow configuring ETS bandwidth of TCs according
to the max TC number queried from firmware. However, the hardware actually
supports 8 TCs and users may need to configure ETS bandwidth of all TCs,
so remove the restriction.
Fixes:
330baff5423b ("net: hns3: add ETS TC weight setting in SSU module")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yufeng Mo [Wed, 10 Nov 2021 13:42:53 +0000 (21:42 +0800)]
net: hns3: fix kernel crash when unload VF while it is being reset
[ Upstream commit
e140c7983e3054be0652bf914f4454f16c5520b0 ]
When fully configure VLANs for a VF, then unload the VF while
triggering a reset to PF, will cause a kernel crash because the
irq is already uninit.
[ 293.177579] ------------[ cut here ]------------
[ 293.183502] kernel BUG at drivers/pci/msi.c:352!
[ 293.189547] Internal error: Oops - BUG: 0 [#1] SMP
......
[ 293.390124] Workqueue: hclgevf hclgevf_service_task [hclgevf]
[ 293.402627] pstate:
80c00009 (Nzcv daif +PAN +UAO)
[ 293.414324] pc : free_msi_irqs+0x19c/0x1b8
[ 293.425429] lr : free_msi_irqs+0x18c/0x1b8
[ 293.436545] sp :
ffff00002716fbb0
[ 293.446950] x29:
ffff00002716fbb0 x28:
0000000000000000
[ 293.459519] x27:
0000000000000000 x26:
ffff45b91ea16b00
[ 293.472183] x25:
0000000000000000 x24:
ffffa587b08f4700
[ 293.484717] x23:
ffffc591ac30e000 x22:
ffffa587b08f8428
[ 293.497190] x21:
ffffc591ac30e300 x20:
0000000000000000
[ 293.509594] x19:
ffffa58a062a8300 x18:
0000000000000000
[ 293.521949] x17:
0000000000000000 x16:
ffff45b91dcc3f48
[ 293.534013] x15:
0000000000000000 x14:
0000000000000000
[ 293.545883] x13:
0000000000000040 x12:
0000000000000228
[ 293.557508] x11:
0000000000000020 x10:
0000000000000040
[ 293.568889] x9 :
ffff45b91ea1e190 x8 :
ffffc591802d0000
[ 293.580123] x7 :
ffffc591802d0148 x6 :
0000000000000120
[ 293.591190] x5 :
ffffc591802d0000 x4 :
0000000000000000
[ 293.602015] x3 :
0000000000000000 x2 :
0000000000000000
[ 293.612624] x1 :
00000000000004a4 x0 :
ffffa58a1e0c6b80
[ 293.623028] Call trace:
[ 293.630340] free_msi_irqs+0x19c/0x1b8
[ 293.638849] pci_disable_msix+0x118/0x140
[ 293.647452] pci_free_irq_vectors+0x20/0x38
[ 293.656081] hclgevf_uninit_msi+0x44/0x58 [hclgevf]
[ 293.665309] hclgevf_reset_rebuild+0x1ac/0x2e0 [hclgevf]
[ 293.674866] hclgevf_reset+0x358/0x400 [hclgevf]
[ 293.683545] hclgevf_reset_service_task+0xd0/0x1b0 [hclgevf]
[ 293.693325] hclgevf_service_task+0x4c/0x2e8 [hclgevf]
[ 293.702307] process_one_work+0x1b0/0x448
[ 293.710034] worker_thread+0x54/0x468
[ 293.717331] kthread+0x134/0x138
[ 293.724114] ret_from_fork+0x10/0x18
[ 293.731324] Code:
f940b000 b4ffff00 a903e7b8 f90017b6 (
d4210000)
This patch fixes the problem by waiting for the VF reset done
while unloading the VF.
Fixes:
e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jie Wang [Wed, 10 Nov 2021 13:42:51 +0000 (21:42 +0800)]
net: hns3: fix pfc packet number incorrect after querying pfc parameters
[ Upstream commit
0b653a81a26d66ffe526a54c2177e24fb1400301 ]
Currently, driver will send command to firmware to query pfc packet number
when user uses dcb tool to get pfc parameters. However, the periodic
service task will also periodically query and record MAC statistics,
including pfc packet number.
As the hardware registers of statistics is cleared after reading, it will
cause pfc packet number of MAC statistics are not correct after using dcb
tool to get pfc parameters.
To fix this problem, when user uses dcb tool to get pfc parameters, driver
updates MAC statistics firstly and then get pfc packet number from MAC
statistics.
Fixes:
64fd2300fcc1 ("net: hns3: add support for querying pfc puase packets statistic")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jie Wang [Wed, 10 Nov 2021 13:42:50 +0000 (21:42 +0800)]
net: hns3: fix ROCE base interrupt vector initialization bug
[ Upstream commit
beb27ca451a57a1c0e52b5268703f3c3173c1f8c ]
Currently, NIC init ROCE interrupt vector with MSIX interrupt. But ROCE use
pci_irq_vector() to get interrupt vector, which adds the relative interrupt
vector again and gets wrong interrupt vector.
So fixes it by assign relative interrupt vector to ROCE instead of MSIX
interrupt vector and delete the unused struct member base_msi_vector
declaration of hclgevf_dev.
Fixes:
46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric Dumazet [Mon, 8 Nov 2021 18:08:15 +0000 (10:08 -0800)]
net/sched: sch_taprio: fix undefined behavior in ktime_mono_to_any
[ Upstream commit
6dc25401cba4d428328eade8ceae717633fdd702 ]
1) if q->tk_offset == TK_OFFS_MAX, then get_tcp_tstamp() calls
ktime_mono_to_any() with out-of-bound value.
2) if q->tk_offset is changed in taprio_parse_clockid(),
taprio_get_time() might also call ktime_mono_to_any()
with out-of-bound value as sysbot found:
UBSAN: array-index-out-of-bounds in kernel/time/timekeeping.c:908:27
index 3 is out of range for type 'ktime_t *[3]'
CPU: 1 PID: 25668 Comm: kworker/u4:0 Not tainted 5.15.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
__ubsan_handle_out_of_bounds.cold+0x62/0x6c lib/ubsan.c:291
ktime_mono_to_any+0x1d4/0x1e0 kernel/time/timekeeping.c:908
get_tcp_tstamp net/sched/sch_taprio.c:322 [inline]
get_packet_txtime net/sched/sch_taprio.c:353 [inline]
taprio_enqueue_one+0x5b0/0x1460 net/sched/sch_taprio.c:420
taprio_enqueue+0x3b1/0x730 net/sched/sch_taprio.c:485
dev_qdisc_enqueue+0x40/0x300 net/core/dev.c:3785
__dev_xmit_skb net/core/dev.c:3869 [inline]
__dev_queue_xmit+0x1f6e/0x3630 net/core/dev.c:4194
batadv_send_skb_packet+0x4a9/0x5f0 net/batman-adv/send.c:108
batadv_iv_ogm_send_to_if net/batman-adv/bat_iv_ogm.c:393 [inline]
batadv_iv_ogm_emit net/batman-adv/bat_iv_ogm.c:421 [inline]
batadv_iv_send_outstanding_bat_ogm_packet+0x6d7/0x8e0 net/batman-adv/bat_iv_ogm.c:1701
process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
kthread+0x405/0x4f0 kernel/kthread.c:327
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Fixes:
7ede7b03484b ("taprio: make clock reference conversions easier")
Fixes:
54002066100b ("taprio: Adjust timestamps for TCP packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vedang Patel <vedang.patel@intel.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/20211108180815.1822479-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Marek Behún [Thu, 4 Nov 2021 17:17:47 +0000 (18:17 +0100)]
net: dsa: mv88e6xxx: Don't support >1G speeds on 6191X on ports other than 10
[ Upstream commit
dc2fc9f03c5c410d8f01c2206b3d529f80b13733 ]
Model 88E6191X only supports >1G speeds on port 10. Port 0 and 9 are
only 1G.
Fixes:
de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20211104171747.10509-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Evan Quan [Sat, 9 Oct 2021 09:35:36 +0000 (17:35 +0800)]
drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
[ Upstream commit
4fc30ea780e0a5c1c019bc2e44f8523e1eed9051 ]
There was a change(below) target for such issue:
d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload")
But the fix for VI ASICs was missing there. This is a supplement for
that.
Fixes:
d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Muchun Song [Tue, 9 Nov 2021 02:35:19 +0000 (18:35 -0800)]
seq_file: fix passing wrong private data
[ Upstream commit
10a6de19cad6efb9b49883513afb810dc265fca2 ]
DEFINE_PROC_SHOW_ATTRIBUTE() is supposed to be used to define a series
of functions and variables to register proc file easily. And the users
can use proc_create_data() to pass their own private data and get it
via seq->private in the callback. Unfortunately, the proc file system
use PDE_DATA() to get private data instead of inode->i_private. So fix
it. Fortunately, there only one user of it which does not pass any
private data, so this bug does not break any in-tree codes.
Link: https://lkml.kernel.org/r/20211029032638.84884-1-songmuchun@bytedance.com
Fixes:
97a32539b956 ("proc: convert everything to "struct proc_ops"")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Florent Revest <revest@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andrew Halaney [Tue, 9 Nov 2021 02:34:27 +0000 (18:34 -0800)]
init: make unknown command line param message clearer
[ Upstream commit
8bc2b3dca7292347d8e715fb723c587134abe013 ]
The prior message is confusing users, which is the exact opposite of the
goal. If the message is being seen, one of the following situations is
happening:
1. the param is misspelled
2. the param is not valid due to the kernel configuration
3. the param is intended for init but isn't after the '--'
delineator on the command line
To make that more clear to the user, explicitly mention "kernel command
line" and also note that the params are still passed to user space to
avoid causing any alarm over params intended for init.
Link: https://lkml.kernel.org/r/20211013223502.96756-1-ahalaney@redhat.com
Fixes:
86d1919a4fb0 ("init: print out unknown kernel parameters")
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Imre Deak [Tue, 26 Oct 2021 22:50:59 +0000 (01:50 +0300)]
drm/i915/fb: Fix rounding error in subsampled plane size calculation
[ Upstream commit
90ab96f3872eae816f4e07deaa77322a91237960 ]
For NV12 FBs with odd main surface tile-row height the CCS surface
height was incorrectly calculated 1 less than the actual value. Fix this
by rounding up the result of divison. For consistency do the same for
the CCS surface width calculation.
Fixes:
b3e57bccd68a ("drm/i915/tgl: Gen-12 render decompression")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-2-imre.deak@intel.com
(cherry picked from commit
2ee5ef9c934ad26376c9282171e731e6c0339815)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan Carpenter [Tue, 9 Nov 2021 11:47:36 +0000 (14:47 +0300)]
gve: Fix off by one in gve_tx_timeout()
[ Upstream commit
1c360cc1cc883fbdf0a258b4df376571fbeac5ee ]
The priv->ntfy_blocks[] has "priv->num_ntfy_blks" elements so this >
needs to be >= to prevent an off by one bug. The priv->ntfy_blocks[]
array is allocated in gve_alloc_notify_blocks().
Fixes:
87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnd Bergmann [Wed, 3 Nov 2021 15:33:12 +0000 (16:33 +0100)]
dmaengine: stm32-dma: avoid 64-bit division in stm32_dma_get_max_width
[ Upstream commit
2498363310e9b5e5de0e104709adc35c9f3ff7d9 ]
Using the % operator on a 64-bit variable is expensive and can
cause a link failure:
arm-linux-gnueabi-ld: drivers/dma/stm32-dma.o: in function `stm32_dma_get_max_width':
stm32-dma.c:(.text+0x170): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/dma/stm32-dma.o: in function `stm32_dma_set_xfer_param':
stm32-dma.c:(.text+0x1cd4): undefined reference to `__aeabi_uldivmod'
As we know that we just want to check the alignment in
stm32_dma_get_max_width(), there is no need for a full division, and
using a simple mask is a faster replacement.
Same in stm32_dma_set_xfer_param(), change this to only allow burst
transfers if the address is a multiple of the length.
stm32_dma_get_best_burst just after will take buf_len into account to fix
burst in case of misalignment.
Fixes:
b20fd5fa310c ("dmaengine: stm32-dma: fix stm32_dma_get_max_width")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211103153312.41483-1-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Amelie Delaunay [Mon, 11 Oct 2021 09:42:59 +0000 (11:42 +0200)]
dmaengine: stm32-dma: fix burst in case of unaligned memory address
[ Upstream commit
af229d2c2557b5cf2a3b1eb39847ec1de7446873 ]
Theorically, address pointers used by STM32 DMA must be chosen so as to
ensure that all transfers within a burst block are aligned on the address
boundary equal to the size of the transfer.
If this is always the case for peripheral addresses on STM32, it is not for
memory addresses if the user doesn't respect this alignment constraint.
To avoid a weird behavior of the DMA controller in this case (no error
triggered but data are not transferred as expected), force no burst.
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211011094259.315023-4-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jussi Maki [Wed, 3 Nov 2021 20:47:36 +0000 (13:47 -0700)]
bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg
[ Upstream commit
b2c4618162ec615a15883a804cce7e27afecfa58 ]
The current conversion of skb->data_end reads like this:
; data_end = (void*)(long)skb->data_end;
559: (79) r1 = *(u64 *)(r2 +200) ; r1 = skb->data
560: (61) r11 = *(u32 *)(r2 +112) ; r11 = skb->len
561: (0f) r1 += r11
562: (61) r11 = *(u32 *)(r2 +116)
563: (1f) r1 -= r11
But similar to the case in
84f44df664e9 ("bpf: sock_ops sk access may stomp
registers when dst_reg = src_reg"), the code will read an incorrect skb->len
when src == dst. In this case we end up generating this xlated code:
; data_end = (void*)(long)skb->data_end;
559: (79) r1 = *(u64 *)(r1 +200) ; r1 = skb->data
560: (61) r11 = *(u32 *)(r1 +112) ; r11 = (skb->data)->len
561: (0f) r1 += r11
562: (61) r11 = *(u32 *)(r1 +116)
563: (1f) r1 -= r11
... where line 560 is the reading 4B of (skb->data + 112) instead of the
intended skb->len Here the skb pointer in r1 gets set to skb->data and the
later deref for skb->len ends up following skb->data instead of skb.
This fixes the issue similarly to the patch mentioned above by creating an
additional temporary variable and using to store the register when dst_reg =
src_reg. We name the variable bpf_temp_reg and place it in the cb context for
sk_skb. Then we restore from the temp to ensure nothing is lost.
Fixes:
16137b09a66f2 ("bpf: Compute data_end dynamically with JIT code")
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-6-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
John Fastabend [Wed, 3 Nov 2021 20:47:35 +0000 (13:47 -0700)]
bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
[ Upstream commit
e0dc3b93bd7bcff8c3813d1df43e0908499c7cf0 ]
Strparser is reusing the qdisc_skb_cb struct to stash the skb message handling
progress, e.g. offset and length of the skb. First this is poorly named and
inherits a struct from qdisc that doesn't reflect the actual usage of cb[] at
this layer.
But, more importantly strparser is using the following to access its metadata.
(struct _strp_msg *)((void *)skb->cb + offsetof(struct qdisc_skb_cb, data))
Where _strp_msg is defined as:
struct _strp_msg {
struct strp_msg strp; /* 0 8 */
int accum_len; /* 8 4 */
/* size: 12, cachelines: 1, members: 2 */
/* last cacheline: 12 bytes */
};
So we use 12 bytes of ->data[] in struct. However in BPF code running parser
and verdict the user has read capabilities into the data[] array as well. Its
not too problematic, but we should not be exposing internal state to BPF
program. If its really needed then we can use the probe_read() APIs which allow
reading kernel memory. And I don't believe cb[] layer poses any API breakage by
moving this around because programs can't depend on cb[] across layers.
In order to fix another issue with a ctx rewrite we need to stash a temp
variable somewhere. To make this work cleanly this patch builds a cb struct
for sk_skb types called sk_skb_cb struct. Then we can use this consistently
in the strparser, sockmap space. Additionally we can start allowing ->cb[]
write access after this.
Fixes:
604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-5-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
John Fastabend [Wed, 3 Nov 2021 20:47:34 +0000 (13:47 -0700)]
bpf, sockmap: Fix race in ingress receive verdict with redirect to self
[ Upstream commit
c5d2177a72a1659554922728fc407f59950aa929 ]
A socket in a sockmap may have different combinations of programs attached
depending on configuration. There can be no programs in which case the socket
acts as a sink only. There can be a TX program in this case a BPF program is
attached to sending side, but no RX program is attached. There can be an RX
program only where sends have no BPF program attached, but receives are hooked
with BPF. And finally, both TX and RX programs may be attached. Giving us the
permutations:
None, Tx, Rx, and TxRx
To date most of our use cases have been TX case being used as a fast datapath
to directly copy between local application and a userspace proxy. Or Rx cases
and TxRX applications that are operating an in kernel based proxy. The traffic
in the first case where we hook applications into a userspace application looks
like this:
AppA redirect AppB
Tx <-----------> Rx
| |
+ +
TCP <--> lo <--> TCP
In this case all traffic from AppA (after 3whs) is copied into the AppB
ingress queue and no traffic is ever on the TCP recieive_queue.
In the second case the application never receives, except in some rare error
cases, traffic on the actual user space socket. Instead the send happens in
the kernel.
AppProxy socket pool
sk0 ------------->{sk1,sk2, skn}
^ |
| |
| v
ingress lb egress
TCP TCP
Here because traffic is never read off the socket with userspace recv() APIs
there is only ever one reader on the sk receive_queue. Namely the BPF programs.
However, we've started to introduce a third configuration where the BPF program
on receive should process the data, but then the normal case is to push the
data into the receive queue of AppB.
AppB
recv() (userspace)
-----------------------
tcp_bpf_recvmsg() (kernel)
| |
| |
| |
ingress_msgQ |
| |
RX_BPF |
| |
v v
sk->receive_queue
This is different from the App{A,B} redirect because traffic is first received
on the sk->receive_queue.
Now for the issue. The tcp_bpf_recvmsg() handler first checks the ingress_msg
queue for any data handled by the BPF rx program and returned with PASS code
so that it was enqueued on the ingress msg queue. Then if no data exists on
that queue it checks the socket receive queue. Unfortunately, this is the same
receive_queue the BPF program is reading data off of. So we get a race. Its
possible for the recvmsg() hook to pull data off the receive_queue before the
BPF hook has a chance to read it. It typically happens when an application is
banging on recv() and getting EAGAINs. Until they manage to race with the RX
BPF program.
To fix this we note that before this patch at attach time when the socket is
loaded into the map we check if it needs a TX program or just the base set of
proto bpf hooks. Then it uses the above general RX hook regardless of if we
have a BPF program attached at rx or not. This patch now extends this check to
handle all cases enumerated above, TX, RX, TXRX, and none. And to fix above
race when an RX program is attached we use a new hook that is nearly identical
to the old one except now we do not let the recv() call skip the RX BPF program.
Now only the BPF program pulls data from sk->receive_queue and recv() only
pulls data from the ingress msgQ post BPF program handling.
With this resolved our AppB from above has been up and running for many hours
without detecting any errors. We do this by correlating counters in RX BPF
events and the AppB to ensure data is never skipping the BPF program. Selftests,
was not able to detect this because we only run them for a short period of time
on well ordered send/recvs so we don't get any of the noise we see in real
application environments.
Fixes:
51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-4-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
John Fastabend [Wed, 3 Nov 2021 20:47:33 +0000 (13:47 -0700)]
bpf, sockmap: Remove unhash handler for BPF sockmap usage
[ Upstream commit
b8b8315e39ffaca82e79d86dde26e9144addf66b ]
We do not need to handle unhash from BPF side we can simply wait for the
close to happen. The original concern was a socket could transition from
ESTABLISHED state to a new state while the BPF hook was still attached.
But, we convinced ourself this is no longer possible and we also improved
BPF sockmap to handle listen sockets so this is no longer a problem.
More importantly though there are cases where unhash is called when data is
in the receive queue. The BPF unhash logic will flush this data which is
wrong. To be correct it should keep the data in the receive queue and allow
a receiving application to continue reading the data. This may happen when
tcp_abort() is received for example. Instead of complicating the logic in
unhash simply moving all this to tcp_close() hook solves this.
Fixes:
51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-3-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnd Bergmann [Fri, 5 Nov 2021 07:54:03 +0000 (08:54 +0100)]
arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
[ Upstream commit
c7c386fbc20262c1d911c615c65db6a58667d92c ]
gcc warns about undefined behavior the vmalloc code when building
with CONFIG_ARM64_PA_BITS_52, when the 'idx++' in the argument to
__phys_to_pte_val() is evaluated twice:
mm/vmalloc.c: In function 'vmap_pfn_apply':
mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point]
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ~~~~~~~~~^~
arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte'
25 | #define __pte(x) ((pte_t) { (x) } )
| ^
arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val'
80 | __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
| ^~~~~~~~~~~~~~~~~
mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte'
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ^~~~~~~
I have no idea why this never showed up earlier, but the safest
workaround appears to be changing those macros into inline functions
so the arguments get evaluated only once.
Cc: Matthew Wilcox <willy@infradead.org>
Fixes:
75387b92635e ("arm64: handle 52-bit physical addresses in page table entries")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reiji Watanabe [Mon, 1 Nov 2021 04:54:21 +0000 (21:54 -0700)]
arm64: arm64_ftr_reg->name may not be a human-readable string
[ Upstream commit
9dc232a8ab18bb20f1dcb03c8e049e3607f3ed15 ]
The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes:
one as the system register encoding (used for the sys_id field of
__ftr_reg_entry), and the other as the register name (stringified
and used for the name field of arm64_ftr_reg), which is debug
information. The id argument is supposed to be a macro that
indicates an encoding of the register (eg. SYS_ID_AA64PFR0_EL1, etc).
ARM64_FTR_REG(), which also has the same id argument,
uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro.
Since the id argument is completely macro-expanded before it is
substituted into a macro body of ARM64_FTR_REG_OVERRIDE(),
the stringified id in the body of ARM64_FTR_REG_OVERRIDE is not
a human-readable register name, but a string of numeric bitwise
operations.
Fix this so that human-readable register names are available as
debug information.
Fixes:
8f266a5d878a ("arm64: cpufeature: Add global feature override facility")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christophe JAILLET [Sun, 7 Nov 2021 20:13:07 +0000 (21:13 +0100)]
litex_liteeth: Fix a double free in the remove function
[ Upstream commit
c45231a7668d6b632534f692b10592ea375b55b0 ]
'netdev' is a managed resource allocated in the probe using
'devm_alloc_etherdev()'.
It must not be freed explicitly in the remove function.
Fixes:
ee7da21ac4c3 ("net: Add driver for LiteX's LiteETH network interface")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Chengfeng Ye [Fri, 5 Nov 2021 13:36:36 +0000 (06:36 -0700)]
nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails
[ Upstream commit
9fec40f850658e00a14a7dd9e06f7fbc7e59cc4a ]
skb is already freed by dev_kfree_skb in pn533_fill_fragment_skbs,
but follow error handler branch when pn533_fill_fragment_skbs()
fails, skb is freed again, results in double free issue. Fix this
by not free skb in error path of pn533_fill_fragment_skbs.
Fixes:
963a82e07d4e ("NFC: pn533: Split large Tx frames in chunks")
Fixes:
93ad42020c2d ("NFC: pn533: Target mode Tx fragmentation support")
Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric Dumazet [Fri, 5 Nov 2021 21:42:14 +0000 (14:42 -0700)]
llc: fix out-of-bound array index in llc_sk_dev_hash()
[ Upstream commit
8ac9dfd58b138f7e82098a4e0a0d46858b12215b ]
Both ifindex and LLC_SK_DEV_HASH_ENTRIES are signed.
This means that (ifindex % LLC_SK_DEV_HASH_ENTRIES) is negative
if @ifindex is negative.
We could simply make LLC_SK_DEV_HASH_ENTRIES unsigned.
In this patch I chose to use hash_32() to get more entropy
from @ifindex, like llc_sk_laddr_hashfn().
UBSAN: array-index-out-of-bounds in ./include/net/llc.h:75:26
index -43 is out of range for type 'hlist_head [64]'
CPU: 1 PID: 20999 Comm: syz-executor.3 Not tainted 5.15.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
__ubsan_handle_out_of_bounds.cold+0x62/0x6c lib/ubsan.c:291
llc_sk_dev_hash include/net/llc.h:75 [inline]
llc_sap_add_socket+0x49c/0x520 net/llc/llc_conn.c:697
llc_ui_bind+0x680/0xd70 net/llc/af_llc.c:404
__sys_bind+0x1e9/0x250 net/socket.c:1693
__do_sys_bind net/socket.c:1704 [inline]
__se_sys_bind net/socket.c:1702 [inline]
__x64_sys_bind+0x6f/0xb0 net/socket.c:1702
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa503407ae9
Fixes:
6d2e3ea28446 ("llc: use a device based hash table to speed up multicast delivery")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ian Rogers [Sat, 6 Nov 2021 05:37:33 +0000 (22:37 -0700)]
perf bpf: Add missing free to bpf_event__print_bpf_prog_info()
[ Upstream commit
88c42f4d6cb249eb68524282f8d4cc32f9059984 ]
If btf__new() is called then there needs to be a corresponding btf__free().
Fixes:
f8dfeae009effc0b ("perf bpf: Show more BPF program info in print_bpf_prog_info()")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211106053733.3580931-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan Carpenter [Fri, 5 Nov 2021 20:45:12 +0000 (13:45 -0700)]
zram: off by one in read_block_state()
[ Upstream commit
a88e03cf3d190cf46bc4063a9b7efe87590de5f4 ]
snprintf() returns the number of bytes it would have printed if there
were space. But it does not count the NUL terminator. So that means
that if "count == copied" then this has already overflowed by one
character.
This bug likely isn't super harmful in real life.
Link: https://lkml.kernel.org/r/20210916130404.GA25094@kili
Fixes:
c0265342bff4 ("zram: introduce zram memory tracking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Miaohe Lin [Fri, 5 Nov 2021 20:45:03 +0000 (13:45 -0700)]
mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
[ Upstream commit
afe8605ca45424629fdddfd85984b442c763dc47 ]
There is one possible race window between zs_pool_dec_isolated() and
zs_unregister_migration() because wait_for_isolated_drain() checks the
isolated count without holding class->lock and there is no order inside
zs_pool_dec_isolated(). Thus the below race window could be possible:
zs_pool_dec_isolated zs_unregister_migration
check pool->destroying != 0
pool->destroying = true;
smp_mb();
wait_for_isolated_drain()
wait for pool->isolated_pages == 0
atomic_long_dec(&pool->isolated_pages);
atomic_long_read(&pool->isolated_pages) == 0
Since we observe the pool->destroying (false) before atomic_long_dec()
for pool->isolated_pages, waking pool->migration_wait up is missed.
Fix this by ensure checking pool->destroying happens after the
atomic_long_dec(&pool->isolated_pages).
Link: https://lkml.kernel.org/r/20210708115027.7557-1-linmiaohe@huawei.com
Fixes:
701d678599d0 ("mm/zsmalloc.c: fix race condition in zs_destroy_pool")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Henry Burns <henryburns@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Marc Kleine-Budde [Tue, 19 Oct 2021 15:00:04 +0000 (17:00 +0200)]
can: mcp251xfd: mcp251xfd_chip_start(): fix error handling for mcp251xfd_chip_rx_int_enable()
[ Upstream commit
69c55f6e7669d46bb40e41f6e2b218428178368a ]
This patch fixes the error handling for mcp251xfd_chip_rx_int_enable().
Instead just returning the error, properly shut down the chip.
Link: https://lore.kernel.org/all/20211106201526.44292-2-mkl@pengutronix.de
Fixes:
55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Vincent Mailhol [Tue, 26 Oct 2021 18:07:40 +0000 (03:07 +0900)]
can: etas_es58x: es58x_rx_err_msg(): fix memory leak in error path
[ Upstream commit
d9447f768bc8c60623e4bb3ce65b8f4654d33a50 ]
In es58x_rx_err_msg(), if can->do_set_mode() fails, the function
directly returns without calling netif_rx(skb). This means that the
skb previously allocated by alloc_can_err_skb() is not freed. In other
terms, this is a memory leak.
This patch simply removes the return statement in the error branch and
let the function continue.
Issue was found with GCC -fanalyzer, please follow the link below for
details.
Fixes:
8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces")
Link: https://lore.kernel.org/all/20211026180740.1953265-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Alex Deucher [Wed, 3 Nov 2021 19:52:53 +0000 (15:52 -0400)]
drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handling
[ Upstream commit
e9c76719c1e99caf95e70de74170291b9457bbc1 ]
sysfs_emit and sysfs_emit_at requrie a page boundary
aligned buf address. Make them happy!
v2: fix sysfs_emit -> sysfs_emit_at missed conversions
Cc: Lang Yu <lang.yu@amd.com>
Cc: Darren Powell <darren.powell@amd.com>
Fixes:
6db0c87a0a8e ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Fabio Estevam [Thu, 4 Nov 2021 00:11:12 +0000 (21:11 -0300)]
Revert "drm/imx: Annotate dma-fence critical section in commit path"
[ Upstream commit
14d9a37c952588930d7226953359fea3ab956d39 ]
This reverts commit
f4b34faa08428d813fc3629f882c503487f94a12.
Since commit
f4b34faa0842 ("drm/imx: Annotate dma-fence critical section in
commit path") the following possible circular dependency is detected:
[ 5.001811] ======================================================
[ 5.001817] WARNING: possible circular locking dependency detected
[ 5.001824] 5.14.9-01225-g45da36cc6fcc-dirty #1 Tainted: G W
[ 5.001833] ------------------------------------------------------
[ 5.001838] kworker/u8:0/7 is trying to acquire lock:
[ 5.001848]
c1752080 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x40/0x294
[ 5.001903]
[ 5.001903] but task is already holding lock:
[ 5.001909]
c176df78 (dma_fence_map){++++}-{0:0}, at: imx_drm_atomic_commit_tail+0x10/0x160
[ 5.001957]
[ 5.001957] which lock already depends on the new lock.
...
Revert it for now.
Tested on a imx6q-sabresd.
Fixes:
f4b34faa0842 ("drm/imx: Annotate dma-fence critical section in commit path")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104001112.4035691-1-festevam@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnd Bergmann [Fri, 29 Oct 2021 12:02:38 +0000 (14:02 +0200)]
drm: fb_helper: improve CONFIG_FB dependency
[ Upstream commit
9d6366e743f37d36ef69347924ead7bcc596076e ]
My previous patch correctly addressed the possible link failure, but as
Jani points out, the dependency is now stricter than it needs to be.
Change it again, to allow DRM_FBDEV_EMULATION to be used when
DRM_KMS_HELPER and FB are both loadable modules and DRM is linked into
the kernel.
As a side-effect, the option is now only visible when at least one DRM
driver makes use of DRM_KMS_HELPER. This is better, because the option
has no effect otherwise.
Fixes:
606b102876e3 ("drm: fb_helper: fix CONFIG_FB dependency")
Suggested-by: Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211029120307.1407047-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 27 Oct 2021 03:35:53 +0000 (11:35 +0800)]
selftests/bpf/xdp_redirect_multi: Limit the tests in netns
[ Upstream commit
8955c1a329873385775081e029d9a7c6aa9037e1 ]
As I want to test both DEVMAP and DEVMAP_HASH in XDP multicast redirect, I
limited DEVMAP max entries to a small value for performace. When the test
runs after amount of interface creating/deleting tests. The interface index
will exceed the map max entries and xdp_redirect_multi will error out with
"Get interfacesInterface index to large".
Fix this issue by limit the tests in netns and specify the ifindex when
creating interfaces.
Fixes:
d23292476297 ("selftests/bpf: Add xdp_redirect_multi test")
Reported-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-5-liuhangbin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 27 Oct 2021 03:35:52 +0000 (11:35 +0800)]
selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
[ Upstream commit
648c3677062fbd14d754b853daebb295426771e8 ]
No need to kill tcpdump with -9.
Fixes:
d23292476297 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-4-liuhangbin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 27 Oct 2021 03:35:51 +0000 (11:35 +0800)]
selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
[ Upstream commit
f53ea9dbf78d42a10e2392b5c59362ccc224fd1d ]
The arp request number triggered by ping none exist address is not accurate,
which may lead the test false negative/positive. Change to use arping to
accurate the arp number. Also do not use grep pattern match for dot.
Fixes:
d23292476297 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-3-liuhangbin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 27 Oct 2021 03:35:50 +0000 (11:35 +0800)]
selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
[ Upstream commit
8b4ac13abe7d82da0e0d22a9ba2e27301559a93e ]
The xdp_redirect_multi test logs are created in selftest folder and not cleaned
after test. Let's creat a tmp dir and remove the logs after testing.
Fixes:
d23292476297 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-2-liuhangbin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mehrdad Arshad Rad [Thu, 4 Nov 2021 17:13:54 +0000 (10:13 -0700)]
libbpf: Fix lookup_and_delete_elem_flags error reporting
[ Upstream commit
64165ddf8ea184631c65e3bbc8d59f6d940590ca ]
Fix bpf_map_lookup_and_delete_elem_flags() to pass the return code through
libbpf_err_errno() as we do similarly in bpf_map_lookup_and_delete_elem().
Fixes:
f12b65432728 ("libbpf: Streamline error reporting for low-level APIs")
Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211104171354.11072-1-arshad.rad@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Rafael J. Wysocki [Thu, 4 Nov 2021 21:54:17 +0000 (22:54 +0100)]
ACPI: PM: Fix device wakeup power reference counting error
[ Upstream commit
452a3e723f75880757acf87b053935c43aa89f89 ]
Fix a device wakeup power reference counting error introduced by
commit
a2d7b2e004af ("ACPI: PM: Fix sharing of wakeup power
resources") because of a coding mistake.
Fixes:
a2d7b2e004af ("ACPI: PM: Fix sharing of wakeup power resources")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kai Song [Wed, 6 Oct 2021 14:19:26 +0000 (22:19 +0800)]
mfd: altera-sysmgr: Fix a mistake caused by resource_size conversion
[ Upstream commit
fae2570d629cdd72f0611d015fc4ba705ae5422b ]
The resource_size defines that:
res->end - res->start + 1;
The origin original code is:
sysmgr_config.max_register = res->end - res->start - 3;
So, the correct fix is that:
sysmgr_config.max_register = resource_size(res) - 4;
Fixes:
d12edf9661a4 ("mfd: altera-sysmgr: Use resource_size function on resource object")
Signed-off-by: Kai Song <songkai01@inspur.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20211006141926.6120-1-songkai01@inspur.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mark Brown [Fri, 24 Sep 2021 14:33:47 +0000 (15:33 +0100)]
mfd: sprd: Add SPI device ID table
[ Upstream commit
c5c7f0677107052060037583b9c8c15d818afb04 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes:
96c8395e2166 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang7@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210924143347.14721-4-broonie@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mark Brown [Fri, 24 Sep 2021 14:33:46 +0000 (15:33 +0100)]
mfd: cpcap: Add SPI device ID table
[ Upstream commit
d5fa8592b773f4da2b04e7333cd37efec5e4ca43 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes:
96c8395e2166 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210924143347.14721-3-broonie@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Krzysztof Kozlowski [Fri, 28 May 2021 11:51:26 +0000 (07:51 -0400)]
mfd: core: Add missing of_node_put for loop iteration
[ Upstream commit
002be81140075e17a1ebd5c3c55e356fbab0ddad ]
Early exits from for_each_child_of_node() should decrement the
node reference counter. Reported by Coccinelle:
drivers/mfd/mfd-core.c:197:2-24: WARNING:
Function "for_each_child_of_node" should have of_node_put() before goto around lines 209.
Fixes:
c94bb233a9fe ("mfd: Make MFD core code Device Tree and IRQ domain aware")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210528115126.18370-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Takashi Iwai [Fri, 5 Nov 2021 10:21:03 +0000 (11:21 +0100)]
ALSA: memalloc: Catch call with NULL snd_dma_buffer pointer
[ Upstream commit
dce9446192439eaac81c21f517325fb473735e53 ]
Although we've covered all calls with NULL dma buffer pointer, so far,
there may be still some else in the wild. For catching such a case
more easily, add a WARN_ON_ONCE() in snd_dma_get_ops().
Fixes:
37af81c5998f ("ALSA: core: Abstract memory alloc helpers")
Link: https://lore.kernel.org/r/20211105102103.28148-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnd Bergmann [Thu, 4 Nov 2021 13:34:42 +0000 (14:34 +0100)]
octeontx2-pf: select CONFIG_NET_DEVLINK
[ Upstream commit
9cbc3367968de69017a87a1118b62490ac1bdd0a ]
The octeontx2 pf nic driver failsz to link when the devlink support
is not reachable:
aarch64-linux-ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_devlink.o: in function `otx2_dl_mcam_count_get':
otx2_devlink.c:(.text+0x10): undefined reference to `devlink_priv'
aarch64-linux-ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_devlink.o: in function `otx2_dl_mcam_count_validate':
otx2_devlink.c:(.text+0x50): undefined reference to `devlink_priv'
aarch64-linux-ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_devlink.o: in function `otx2_dl_mcam_count_set':
otx2_devlink.c:(.text+0xd0): undefined reference to `devlink_priv'
aarch64-linux-ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_devlink.o: in function `otx2_devlink_info_get':
otx2_devlink.c:(.text+0x150): undefined reference to `devlink_priv'
This is already selected by the admin function driver, but not the
actual nic, which might be built-in when the af driver is not.
Fixes:
2da489432747 ("octeontx2-pf: devlink params support to set mcam entry count")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Huang Guobin [Tue, 2 Nov 2021 09:37:33 +0000 (17:37 +0800)]
bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failed
[ Upstream commit
b93c6a911a3fe926b00add28f3b932007827c4ca ]
When I do fuzz test for bonding device interface, I got the following
use-after-free Calltrace:
==================================================================
BUG: KASAN: use-after-free in bond_enslave+0x1521/0x24f0
Read of size 8 at addr
ffff88825bc11c00 by task ifenslave/7365
CPU: 5 PID: 7365 Comm: ifenslave Tainted: G E 5.15.0-rc1+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
Call Trace:
dump_stack_lvl+0x6c/0x8b
print_address_description.constprop.0+0x48/0x70
kasan_report.cold+0x82/0xdb
__asan_load8+0x69/0x90
bond_enslave+0x1521/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f19159cf577
Code: b3 66 90 48 8b 05 11 89 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 78
RSP: 002b:
00007ffeb3083c78 EFLAGS:
00000246 ORIG_RAX:
0000000000000010
RAX:
ffffffffffffffda RBX:
00007ffeb3084bca RCX:
00007f19159cf577
RDX:
00007ffeb3083ce0 RSI:
0000000000008990 RDI:
0000000000000003
RBP:
00007ffeb3084bc4 R08:
0000000000000040 R09:
0000000000000000
R10:
00007ffeb3084bc0 R11:
0000000000000246 R12:
00007ffeb3083ce0
R13:
0000000000000000 R14:
0000000000000000 R15:
00007ffeb3083cb0
Allocated by task 7365:
kasan_save_stack+0x23/0x50
__kasan_kmalloc+0x83/0xa0
kmem_cache_alloc_trace+0x22e/0x470
bond_enslave+0x2e1/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 7365:
kasan_save_stack+0x23/0x50
kasan_set_track+0x20/0x30
kasan_set_free_info+0x24/0x40
__kasan_slab_free+0xf2/0x130
kfree+0xd1/0x5c0
slave_kobj_release+0x61/0x90
kobject_put+0x102/0x180
bond_sysfs_slave_add+0x7a/0xa0
bond_enslave+0x11b6/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Last potentially related work creation:
kasan_save_stack+0x23/0x50
kasan_record_aux_stack+0xb7/0xd0
insert_work+0x43/0x190
__queue_work+0x2e3/0x970
delayed_work_timer_fn+0x3e/0x50
call_timer_fn+0x148/0x470
run_timer_softirq+0x8a8/0xc50
__do_softirq+0x107/0x55f
Second to last potentially related work creation:
kasan_save_stack+0x23/0x50
kasan_record_aux_stack+0xb7/0xd0
insert_work+0x43/0x190
__queue_work+0x2e3/0x970
__queue_delayed_work+0x130/0x180
queue_delayed_work_on+0xa7/0xb0
bond_enslave+0xe25/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at
ffff88825bc11c00
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 0 bytes inside of
1024-byte region [
ffff88825bc11c00,
ffff88825bc12000)
The buggy address belongs to the page:
page:
ffffea00096f0400 refcount:1 mapcount:0 mapping:
0000000000000000 index:0x0 pfn:0x25bc10
head:
ffffea00096f0400 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x57ff00000010200(slab|head|node=1|zone=2|lastcpupid=0x7ff)
raw:
057ff00000010200 ffffea0009a71c08 ffff888240001968 ffff88810004dbc0
raw:
0000000000000000 00000000000a000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88825bc11b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88825bc11b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>
ffff88825bc11c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88825bc11c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88825bc11d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Put new_slave in bond_sysfs_slave_add() will cause use-after-free problems
when new_slave is accessed in the subsequent error handling process. Since
new_slave will be put in the subsequent error handling process, remove the
unnecessary put to fix it.
In addition, when sysfs_create_file() fails, if some files have been crea-
ted successfully, we need to call sysfs_remove_file() to remove them.
Since there are sysfs_create_files() & sysfs_remove_files() can be used,
use these two functions instead.
Fixes:
7afcaec49696 (bonding: use kobject_put instead of _del after kobject_add)
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jason Gunthorpe [Tue, 19 Oct 2021 23:27:31 +0000 (20:27 -0300)]
drm/ttm: remove ttm_bo_vm_insert_huge()
[ Upstream commit
0d979509539ed1df883a30d442177ca7be609565 ]
The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.
get_user_pages_fast() considers any page that passed pmd_huge() as
usable:
if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
pmd_devmap(pmd))) {
And vmf_insert_pfn_pmd_prot() unconditionally sets
entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE.
As such gup_huge_pmd() will try to deref a struct page:
head = try_grab_compound_head(pmd_page(orig), refs, flags);
and thus crash.
Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.
Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.
Fixes:
314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
[danvet: Update subject per Thomas' &Christian's review]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luis Chamberlain [Wed, 3 Nov 2021 16:40:23 +0000 (09:40 -0700)]
block: fix device_add_disk() kobject_create_and_add() error handling
[ Upstream commit
fe7d064fa3faec5d8157029fb8720b4fddc9e1e8 ]
Commit
83cbce957446 ("block: add error handling for device_add_disk /
add_disk") added error handling to device_add_disk(), however the goto
label for the kobject_create_and_add() failure did not set the return
value correctly, and so we can end up in a situation where
kobject_create_and_add() fails but we report success.
Fixes:
83cbce957446 ("block: add error handling for device_add_disk / add_disk")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211103164023.1384821-1-mcgrof@kernel.org
[axboe: fold in followup fix from Wu Bo <wubo40@huawei.com>]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Heiner Kallweit [Wed, 3 Nov 2021 21:08:28 +0000 (22:08 +0100)]
net: phy: fix duplex out of sync problem while changing settings
[ Upstream commit
a4db9055fdb9cf607775c66d39796caf6439ec92 ]
As reported by Zhang there's a small issue if in forced mode the duplex
mode changes with the link staying up [0]. In this case the MAC isn't
notified about the change.
The proposed patch relies on the phylib state machine and ignores the
fact that there are drivers that uses phylib but not the phylib state
machine. So let's don't change the behavior for such drivers and fix
it w/o re-adding state PHY_FORCING for the case that phylib state
machine is used.
[0] https://lore.kernel.org/netdev/
a5c26ffd-4ee4-a5e6-4103-
873208ce0dc5@huawei.com/T/
Fixes:
2bd229df5e2e ("net: phy: remove state PHY_FORCING")
Reported-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Tested-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/7b8b9456-a93f-abbc-1dc5-a2c2542f932c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Rafael J. Wysocki [Wed, 3 Nov 2021 18:43:47 +0000 (19:43 +0100)]
cpufreq: intel_pstate: Clear HWP desired on suspend/shutdown and offline
[ Upstream commit
dbea75fe18f60e364de6d994fc938a24ba249d81 ]
Commit
a365ab6b9dfb ("cpufreq: intel_pstate: Implement the
->adjust_perf() callback") caused intel_pstate to use nonzero HWP
desired values in certain usage scenarios, but it did not prevent
them from being leaked into the confugirations in which HWP desired
is expected to be 0.
The failing scenarios are switching the driver from the passive
mode to the active mode and starting a new kernel via kexec() while
intel_pstate is running in the passive mode.
To address this issue, ensure that HWP desired will be cleared on
offline and suspend/shutdown.
Fixes:
a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Tested-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Selvin Xavier [Sat, 11 Sep 2021 10:03:05 +0000 (03:03 -0700)]
PCI: Do not enable AtomicOps on VFs
[ Upstream commit
5ec0a6fcb60ea430f8ee7e0bec22db9b22f856d3 ]
Host crashes when pci_enable_atomic_ops_to_root() is called for VFs with
virtual buses. The virtual buses added to SR-IOV have bus->self set to NULL
and host crashes due to this.
PID: 4481 TASK:
ffff89c6941b0000 CPU: 53 COMMAND: "bash"
...
#3 [
ffff9a9481713808] oops_end at
ffffffffb9025cd6
#4 [
ffff9a9481713828] page_fault_oops at
ffffffffb906e417
#5 [
ffff9a9481713888] exc_page_fault at
ffffffffb9a0ad14
#6 [
ffff9a94817138b0] asm_exc_page_fault at
ffffffffb9c00ace
[exception RIP: pcie_capability_read_dword+28]
RIP:
ffffffffb952fd5c RSP:
ffff9a9481713960 RFLAGS:
00010246
RAX:
0000000000000001 RBX:
ffff89c6b1096000 RCX:
0000000000000000
RDX:
ffff9a9481713990 RSI:
0000000000000024 RDI:
0000000000000000
RBP:
0000000000000080 R8:
0000000000000008 R9:
ffff89c64341a2f8
R10:
0000000000000002 R11:
0000000000000000 R12:
ffff89c648bab000
R13:
0000000000000000 R14:
0000000000000000 R15:
ffff89c648bab0c8
ORIG_RAX:
ffffffffffffffff CS: 0010 SS: 0018
#7 [
ffff9a9481713988] pci_enable_atomic_ops_to_root at
ffffffffb95359a6
#8 [
ffff9a94817139c0] bnxt_qplib_determine_atomics at
ffffffffc08c1a33 [bnxt_re]
#9 [
ffff9a94817139d0] bnxt_re_dev_init at
ffffffffc08ba2d1 [bnxt_re]
Per PCIe r5.0, sec 9.3.5.10, the AtomicOp Requester Enable bit in Device
Control 2 is reserved for VFs. The PF value applies to all associated VFs.
Return -EINVAL if pci_enable_atomic_ops_to_root() is called for a VF.
Link: https://lore.kernel.org/r/1631354585-16597-1-git-send-email-selvin.xavier@broadcom.com
Fixes:
35f5ace5dea4 ("RDMA/bnxt_re: Enable global atomic ops if platform supports")
Fixes:
430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Tetsuo Handa [Wed, 3 Nov 2021 23:04:33 +0000 (16:04 -0700)]
ataflop: remove ataflop_probe_lock mutex
[ Upstream commit
4ddb85d36613c45bde00d368bf9f357bd0708a0c ]
Commit
bf9c0538e485b591 ("ataflop: use a separate gendisk for each media
format") introduced ataflop_probe_lock mutex, but forgot to unlock the
mutex when atari_floppy_init() (i.e. module loading) succeeded. This will
result in double lock deadlock if ataflop_probe() is called. Also,
unregister_blkdev() must not be called from atari_floppy_init() with
ataflop_probe_lock held when atari_floppy_init() failed, for
ataflop_probe() waits for ataflop_probe_lock with major_names_lock held
(i.e. AB-BA deadlock).
__register_blkdev() needs to be called last in order to avoid calling
ataflop_probe() when atari_floppy_init() is about to fail, for memory for
completing already-started ataflop_probe() safely will be released as soon
as atari_floppy_init() released ataflop_probe_lock mutex.
As with commit
8b52d8be86d72308 ("loop: reorder loop_exit"),
unregister_blkdev() needs to be called first in order to avoid calling
ataflop_alloc_disk() from ataflop_probe() after del_gendisk() from
atari_floppy_exit().
By relocating __register_blkdev() / unregister_blkdev() as explained above,
we can remove ataflop_probe_lock mutex, for probe function and __exit
function are serialized by major_names_lock mutex.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes:
bf9c0538e485b591 ("ataflop: use a separate gendisk for each media format")
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20211103230437.1639990-11-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luis Chamberlain [Mon, 27 Sep 2021 22:03:01 +0000 (15:03 -0700)]
block/ataflop: provide a helper for cleanup up an atari disk
[ Upstream commit
deae1138d04758c7f8939fcb8aee330bc37e3015 ]
Instead of using two separate code paths for cleaning up an atari disk,
use one. We take the more careful approach to check for *all* disk
types, as is done on exit. The init path didn't have that check as
the alternative disk types are only probed for later, they are not
initialized by default.
Yes, there is a shared tag for all disks.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210927220302.1073499-14-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luis Chamberlain [Mon, 27 Sep 2021 22:03:00 +0000 (15:03 -0700)]
block/ataflop: add registration bool before calling del_gendisk()
[ Upstream commit
573effb298011d3fcabc9b12025cf637f8a07911 ]
The ataflop assumes del_gendisk() is safe to call, this is only
true because add_disk() does not return a failure, but that will
change soon. And so, before we get to adding error handling for
that case, let's make sure we keep track of which disks actually
get registered. Then we use this to only call del_gendisk for them.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210927220302.1073499-13-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luis Chamberlain [Mon, 27 Sep 2021 22:02:59 +0000 (15:02 -0700)]
block/ataflop: use the blk_cleanup_disk() helper
[ Upstream commit
44a469b6acae6ad05c4acca8429467d1d50a8b8d ]
Use the helper to replace two lines with one.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210927220302.1073499-12-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luis Chamberlain [Wed, 3 Nov 2021 23:04:28 +0000 (16:04 -0700)]
nvdimm/pmem: cleanup the disk if pmem_release_disk() is yet assigned
[ Upstream commit
accf58afb689f81daadde24080ea1164ad2db75f ]
Prior to devm being able to use pmem_release_disk() there are other
failure which can occur for which we must account for and release the
disk for. Address those few cases.
Fixes:
3dd60fb9d95d ("nvdimm/pmem: stop using q_usage_count as external pgmap refcount")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20211103230437.1639990-6-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Chenyuan Mi [Tue, 7 Sep 2021 12:26:33 +0000 (20:26 +0800)]
drm/nouveau/svm: Fix refcount leak bug and missing check against null bug
[ Upstream commit
6bb8c2d51811eb5e6504f49efe3b089d026009d2 ]
The reference counting issue happens in one exception handling path of
nouveau_svmm_bind(). When cli->svm.svmm is null, the function forgets
to decrease the refcount of mm increased by get_task_mm(), causing a
refcount leak.
Fix this issue by using mmput() to decrease the refcount in the
exception handling path.
Also, the function forgets to do check against null when get mm
by get_task_mm().
Fix this issue by adding null check after get mm by get_task_mm().
Signed-off-by: Chenyuan Mi <cymi20@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Fixes:
822cab6150d3 ("drm/nouveau/svm: check for SVM initialized before migrating")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210907122633.16665-1-cymi20@fudan.edu.cn
Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/14
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andrea Righi [Thu, 4 Nov 2021 10:46:13 +0000 (11:46 +0100)]
selftests: net: properly support IPv6 in GSO GRE test
[ Upstream commit
a985442fdecb59504e3a2f1cfdd3c53af017ea5b ]
Explicitly pass -6 to netcat when the test is using IPv6 to prevent
failures.
Also make sure to pass "-N" to netcat to close the socket after EOF on
the client side, otherwise we would always hit the timeout and the test
would fail.
Without this fix applied:
TEST: GREv6/v4 - copy file w/ TSO [FAIL]
TEST: GREv6/v4 - copy file w/ GSO [FAIL]
TEST: GREv6/v6 - copy file w/ TSO [FAIL]
TEST: GREv6/v6 - copy file w/ GSO [FAIL]
With this fix applied:
TEST: GREv6/v4 - copy file w/ TSO [ OK ]
TEST: GREv6/v4 - copy file w/ GSO [ OK ]
TEST: GREv6/v6 - copy file w/ TSO [ OK ]
TEST: GREv6/v6 - copy file w/ GSO [ OK ]
Fixes:
025efa0a82df ("selftests: add simple GSO GRE test")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Avri Altman [Sun, 31 Oct 2021 12:36:54 +0000 (14:36 +0200)]
scsi: ufs: ufshpb: Properly handle max-single-cmd
[ Upstream commit
9ec5128a8b5631d652ed06b37e0166f337802f90 ]
The spec recommends that for transfer length larger than the max-single-cmd
attribute (bMAX_DATA_SIZE_FOR_HPB_SINGLE_CMD) it is possible to couple
pre-requests with the HPB-READ command. Being a recommendation, using
pre-requests can be perceived merely as a means of optimization. A common
practice was to send pre-requests for chunks within some interval, and
leave the READ10 untouched if larger.
Now that the pre-request flows have been removed, all the commands are
single commands. Properly handle this attribute and do not send HPB-READ
for transfer lengths larger than max-single-cmd.
[mkp: resolve conflict]
Fixes:
09d9e4d04187 ("scsi: ufs: ufshpb: Remove HPB2.0 flows")
Link: https://lore.kernel.org/r/20211031123654.17719-1-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bean Huo [Wed, 29 Sep 2021 20:06:38 +0000 (22:06 +0200)]
scsi: ufs: core: Fix NULL pointer dereference
[ Upstream commit
1da3b0141e74c18c2377d4c2655406a90a87742f ]
Calling ufshcd_rpm_{get/put}_sync() prior to ufshcd_scsi_add_wlus() being
called will trigger a NULL pointer dereference. This is because
hba->sdev_ufs_device is initialized in ufshcd_scsi_add_wlus().
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000348
Mem abort info:
ESR = 0x96000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[
0000000000000348] user address but active_mm is swapper
Internal error: Oops:
96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 91 Comm: kworker/u16:1 Not tainted 5.15.0-rc1-beanhuo-linaro-1423
Hardware name: MicronRB (DT)
Workqueue: events_unbound async_run_entry_fn
pstate:
20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : pm_runtime_drop_link+0x128/0x338
lr : ufshpb_get_dev_info+0x8c/0x148
sp :
ffff800012573c10
x29:
ffff800012573c10 x28:
0000000000000000 x27:
0000000000000003
x26:
ffff000001d21298 x25:
000000005abcea60 x24:
ffff800011d89000
x23:
0000000000000001 x22:
ffff000001d21880 x21:
ffff000001ec9300
x20:
0000000000000004 x19:
0000000000000198 x18:
ffffffffffffffff
x17:
0000000000000000 x16:
0000000000000000 x15:
0000000000041400
x14:
5eee00201100200a x13:
000000000000bb03 x12:
0000000000000000
x11:
0000000000000100 x10:
0200000000000000 x9 :
bb0000021a162c01
x8 :
0302010021021003 x7 :
0000000000000000 x6 :
ffff800012573af0
x5 :
0000000000000001 x4 :
0000000000000001 x3 :
0000000000000200
x2 :
0000000000000348 x1 :
0000000000000348 x0 :
ffff80001095308c
Call trace:
pm_runtime_drop_link+0x128/0x338
ufshpb_get_dev_info+0x8c/0x148
ufshcd_probe_hba+0xda0/0x11b8
ufshcd_async_scan+0x34/0x330
async_run_entry_fn+0x38/0x180
process_one_work+0x1f4/0x498
worker_thread+0x48/0x480
kthread+0x140/0x158
ret_from_fork+0x10/0x20
Code:
88027c01 35ffffa2 17fff6c4 f9800051 (
885f7c40)
---[ end trace
2ba541335f595c95 ]
ufshpb_get_dev_info() is only called during asynchronous scanning and at
that time pm_runtime_get_sync() has been called:
...
/* Hold auto suspend until async scan completes */
pm_runtime_get_sync(dev);
atomic_set(&hba->scsi_block_reqs_cnt, 0);
...
ufshcd_async_scan()
ufshcd_probe_hba(hba, true);
ufshcd_device_params_init(hba);
ufshpb_get_dev_info();
...
pm_runtime_put_sync(hba->dev);
Remove ufshcd_rpm_{get/put}_sync() from ufshpb_get_dev_info() to fix this
problem.
Link: https://lore.kernel.org/r/20210929200640.828611-2-huobean@gmail.com
Fixes:
351b3a849ac7 ("scsi: ufs: ufshpb: Use proper power management API")
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Daejun Park [Thu, 2 Sep 2021 00:35:34 +0000 (09:35 +0900)]
scsi: ufs: ufshpb: Use proper power management API
[ Upstream commit
351b3a849ac7d92449dc75c43db8a857b38387ea ]
In ufshpb, pm_runtime_{get,put}_sync() are used to avoid unwanted runtime
suspend during query requests. Whereas commit
b294ff3e3449 ("scsi: ufs:
core: Enable power management for wlun") modified the driver core to use
ufshcd_rpm_{get,put}_sync() APIs.
Switch to these APIs in HPB module as well.
Link: https://lore.kernel.org/r/20210902003534epcms2p1937a0f0eeb48a441cb69f5ef13ff8430@epcms2p1
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jackie Liu [Fri, 22 Oct 2021 01:02:01 +0000 (09:02 +0800)]
scsi: bsg: Fix errno when scsi_bsg_register_queue() fails
[ Upstream commit
5f7cf82c1d7373fcf9e1062f5654efd5fa2b9211 ]
When the value of error is printed, it will always be 0. We should print
the correct error code when scsi_bsg_register_queue() fails.
Link: https://lore.kernel.org/r/20211022010201.426746-1-liu.yun@linux.dev
Fixes:
ead09dd3aed5 ("scsi: bsg: Simplify device registration")
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luis Chamberlain [Wed, 3 Nov 2021 16:58:43 +0000 (09:58 -0700)]
nvdimm/btt: do not call del_gendisk() if not needed
[ Upstream commit
3aefb5ee843fbe4789d03bb181e190d462df95e4 ]
del_gendisk() should not called if the disk has not been added. Fix this.
Fixes:
41cd8b70c37a ("libnvdimm, btt: add support for blk integrity")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20211103165843.1402142-1-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christophe JAILLET [Sun, 27 Jun 2021 11:46:24 +0000 (13:46 +0200)]
PCI: j721e: Fix j721e_pcie_probe() error path
[ Upstream commit
496bb18483cc0474913e81e18a6b313aaea4c120 ]
If an error occurs after a successful cdns_pcie_init_phy() call, it must be
undone by a cdns_pcie_disable_phy() call, as already done above and below.
Update the goto to branch at the correct place of the error handling path.
Link: https://lore.kernel.org/r/db477b0cb444891a17c4bb424467667dc30d0bab.1624794264.git.christophe.jaillet@wanadoo.fr
Fixes:
49e0efdce791 ("PCI: j721e: Add support to provide refclk to PCIe connector")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hans de Goede [Sun, 31 Oct 2021 15:31:35 +0000 (16:31 +0100)]
ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses
[ Upstream commit
009a789443fe4c8e6b1ecb7c16b4865c026184cd ]
The handling of PMIC register reads through writing 0 to address 4
of the OpRegion is wrong. Instead of returning the read value
through the value64, which is a no-op for function == ACPI_WRITE calls,
store the value and then on a subsequent function == ACPI_READ with
address == 3 (the address for the value field of the OpRegion)
return the stored value.
This has been tested on a Xiaomi Mi Pad 2 and makes the ACPI battery dev
there mostly functional (unfortunately there are still other issues).
Here are the SET() / GET() functions of the PMIC ACPI device,
which use this OpRegion, which clearly show the new behavior to
be correct:
OperationRegion (REGS, 0x8F, Zero, 0x50)
Field (REGS, ByteAcc, NoLock, Preserve)
{
CLNT, 8,
SA, 8,
OFF, 8,
VAL, 8,
RWM, 8
}
Method (GET, 3, Serialized)
{
If ((AVBE == One))
{
CLNT = Arg0
SA = Arg1
OFF = Arg2
RWM = Zero
If ((AVBG == One))
{
GPRW = Zero
}
}
Return (VAL) /* \_SB_.PCI0.I2C7.PMI5.VAL_ */
}
Method (SET, 4, Serialized)
{
If ((AVBE == One))
{
CLNT = Arg0
SA = Arg1
OFF = Arg2
VAL = Arg3
RWM = One
If ((AVBG == One))
{
GPRW = One
}
}
}
Fixes:
0afa877a5650 ("ACPI / PMIC: intel: add REGS operation region support")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Daniel Thompson [Tue, 2 Nov 2021 17:31:58 +0000 (17:31 +0000)]
kdb: Adopt scheduler's task classification
[ Upstream commit
b77dbc86d60459b42ab375e4e23172e7245f2854 ]
Currently kdb contains some open-coded routines to generate a summary
character for each task. This code currently issues warnings, is
almost certainly broken and won't make sense to any kernel dev who
has ever used /proc to examine task states.
Fix both the warning and the potential for confusion by adopting the
scheduler's task classification. Whilst doing this we also simplify the
filtering by using mask strings directly (which means we don't have to
guess all the characters the scheduler might give us).
Unfortunately we can't quite match the scheduler classification completely.
We add four extra states: - for idle loops and i, m and s for sleeping
system daemons (which means kthreads in one of the I, M and S states).
These extra states are used to manage the filters for tools to make the
output of ps and bta less noisy.
Note: The Fixes below is the last point the original dubious code was
moved; it was not introduced by that patch. However it gives us
the last point to which this patch can be easily backported.
Happily that should be enough to cover the introduction of
CONFIG_WERROR!
Fixes:
2f064a59a11f ("sched: Change task_struct::state")
Link: https://lore.kernel.org/r/20211102173158.3315227-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Brett Creeley [Thu, 9 Sep 2021 21:38:08 +0000 (14:38 -0700)]
ice: Fix not stopping Tx queues for VFs
[ Upstream commit
b385cca47363316c6d9a74ae9db407bbc281f815 ]
When a VF is removed and/or reset its Tx queues need to be
stopped from the PF. This is done by calling the ice_dis_vf_qs()
function, which calls ice_vsi_stop_lan_tx_rings(). Currently
ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA.
Unfortunately, this is causing the Tx queues to not be disabled in some
cases and when the VF tries to re-enable/reconfigure its Tx queues over
virtchnl the op is failing. This is because a VF can be reset and/or
removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues
were already configured via ice_vsi_cfg_single_txq() in the
VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit
is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always
happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.
This was causing the following error message when loading the ice
driver, creating VFs, and modifying VF trust in an endless loop:
[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Fix this by always calling ice_dis_vf_qs() and silencing the error
message in ice_vsi_stop_tx_ring() since the calling code ignores the
return anyway. Also, all other places that call ice_vsi_stop_tx_ring()
catch the error, so this doesn't affect those flows since there was no
change to the values the function returns.
Other solutions were considered (i.e. tracking which VF queues had been
"started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed
more complicated than it was worth. This solution also brings in the
chance for other unexpected conditions due to invalid state bit checks.
So, the proposed solution seemed like the best option since there is no
harm in failing to stop Tx queues that were never started.
This issue can be seen using the following commands:
for i in {0..50}; do
rmmod ice
modprobe ice
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
sleep 2
echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
done
Fixes:
77ca27c41705 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sylwester Dziedziuch [Thu, 6 May 2021 15:40:03 +0000 (08:40 -0700)]
ice: Fix replacing VF hardware MAC to existing MAC filter
[ Upstream commit
ce572a5b88d5ca6737b5e23da9892792fd708ad3 ]
VF was not able to change its hardware MAC address in case
the new address was already present in the MAC filter list.
Change the handling of VF add mac request to not return
if requested MAC address is already present on the list
and check if its hardware MAC needs to be updated in this case.
Fixes:
ed4c068d46f6 ("ice: Enable ip link show on the PF to display VF unicast MAC(s)")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Vladimir Oltean [Tue, 2 Nov 2021 19:31:22 +0000 (21:31 +0200)]
net: dsa: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
[ Upstream commit
92f62485b3715882cd397b0cbd80a96d179b86d6 ]
Normally it is expected that the dsa_device_ops :: rcv() method finishes
parsing the DSA tag and consumes it, then never looks at it again.
But commit
c0bcf537667c ("net: dsa: ocelot: add hardware timestamping
support for Felix") added support for RX timestamping in a very
unconventional way. On this switch, a partial timestamp is available in
the DSA header, but the driver got away with not parsing that timestamp
right away, but instead delayed that parsing for a little longer:
dsa_switch_rcv():
nskb = cpu_dp->rcv(skb, dev); <------------- not here
-> ocelot_rcv()
...
skb = nskb;
skb_push(skb, ETH_HLEN);
skb->pkt_type = PACKET_HOST;
skb->protocol = eth_type_trans(skb, skb->dev);
...
if (dsa_skb_defer_rx_timestamp(p, skb)) <--- but here
-> felix_rxtstamp()
return 0;
When in felix_rxtstamp(), this driver accounted for the fact that
eth_type_trans() happened in the meanwhile, so it got a hold of the
extraction header again by subtracting (ETH_HLEN + OCELOT_TAG_LEN) bytes
from the current skb->data.
This worked for quite some time but was quite fragile from the very
beginning. Not to mention that having DSA tag parsing split in two
different files, under different folders (net/dsa/tag_ocelot.c vs
drivers/net/dsa/ocelot/felix.c) made it quite non-obvious for patches to
come that they might break this.
Finally, the blamed commit does the following: at the end of
ocelot_rcv(), it checks whether the skb payload contains a VLAN header.
If it does, and this port is under a VLAN-aware bridge, that VLAN ID
might not be correct in the sense that the packet might have suffered
VLAN rewriting due to TCAM rules (VCAP IS1). So we consume the VLAN ID
from the skb payload using __skb_vlan_pop(), and take the classified
VLAN ID from the DSA tag, and construct a hwaccel VLAN tag with the
classified VLAN, and the skb payload is VLAN-untagged.
The big problem is that __skb_vlan_pop() does:
memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
__skb_pull(skb, VLAN_HLEN);
aka it moves the Ethernet header 4 bytes to the right, and pulls 4 bytes
from the skb headroom (effectively also moving skb->data, by definition).
So for felix_rxtstamp()'s fragile logic, all bets are off now.
Instead of having the "extraction" pointer point to the DSA header,
it actually points to 4 bytes _inside_ the extraction header.
Corollary, the last 4 bytes of the "extraction" header are in fact 4
stale bytes of the destination MAC address from the Ethernet header,
from prior to the __skb_vlan_pop() movement.
So of course, RX timestamps are completely bogus when the system is
configured in this way.
The fix is actually very simple: just don't structure the code like that.
For better or worse, the DSA PTP timestamping API does not offer a
straightforward way for drivers to present their RX timestamps, but
other drivers (sja1105) have established a simple mechanism to carry
their RX timestamp from dsa_device_ops :: rcv() all the way to
dsa_switch_ops :: port_rxtstamp() and even later. That mechanism is to
simply save the partial timestamp to the skb->cb, and complete it later.
Question: why don't we simply populate the skb's struct
skb_shared_hwtstamps from ocelot_rcv(), and bother with this
complication of propagating the timestamp to felix_rxtstamp()?
Answer: dsa_switch_ops :: port_rxtstamp() answers the question whether
PTP packets need sleepable context to retrieve the full RX timestamp.
Currently felix_rxtstamp() answers "no, thanks" to that question, and
calls ocelot_ptp_gettime64() from softirq atomic context. This is
understandable, since Felix VSC9959 is a PCIe memory-mapped switch, so
hardware access does not require sleeping. But the felix driver is
preparing for the introduction of other switches where hardware access
is over a slow bus like SPI or MDIO:
https://lore.kernel.org/lkml/
20210814025003.2449143-1-colin.foster@in-advantage.com/
So I would like to keep this code structure, so the rework needed when
that driver will need PTP support will be minimal (answer "yes, I need
deferred context for this skb's RX timestamp", then the partial
timestamp will still be found in the skb->cb.
Fixes:
ea440cd2d9b2 ("net: dsa: tag_ocelot: use VLAN information from tagging header when available")
Reported-by: Po Liu <po.liu@nxp.com>
Cc: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ziyang Xuan [Tue, 2 Nov 2021 02:12:18 +0000 (10:12 +0800)]
net: vlan: fix a UAF in vlan_dev_real_dev()
[ Upstream commit
563bcbae3ba233c275c244bfce2efe12938f5363 ]
The real_dev of a vlan net_device may be freed after
unregister_vlan_dev(). Access the real_dev continually by
vlan_dev_real_dev() will trigger the UAF problem for the
real_dev like following:
==================================================================
BUG: KASAN: use-after-free in vlan_dev_real_dev+0xf9/0x120
Call Trace:
kasan_report.cold+0x83/0xdf
vlan_dev_real_dev+0xf9/0x120
is_eth_port_of_netdev_filter.part.0+0xb1/0x2c0
is_eth_port_of_netdev_filter+0x28/0x40
ib_enum_roce_netdev+0x1a3/0x300
ib_enum_all_roce_netdevs+0xc7/0x140
netdevice_event_work_handler+0x9d/0x210
...
Freed by task 9288:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0xfc/0x130
slab_free_freelist_hook+0xdd/0x240
kfree+0xe4/0x690
kvfree+0x42/0x50
device_release+0x9f/0x240
kobject_put+0x1c8/0x530
put_device+0x1b/0x30
free_netdev+0x370/0x540
ppp_destroy_interface+0x313/0x3d0
...
Move the put_device(real_dev) to vlan_dev_free(). Ensure
real_dev not be freed before vlan_dev unregistered.
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+e4df4e1389e28972e955@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Stafford Horne [Wed, 3 Nov 2021 11:19:33 +0000 (20:19 +0900)]
openrisc: fix SMP tlb flush NULL pointer dereference
[ Upstream commit
27dff9a9c247d4e38d82c2e7234914cfe8499294 ]
Throughout the OpenRISC kernel port VMA is passed as NULL when flushing
kernel tlb entries. Somehow this was missed when I was testing
c28b27416da9 ("openrisc: Implement proper SMP tlb flushing") and now the
SMP kernel fails to completely boot.
In OpenRISC VMA is used only to determine which cores need to have their
TLB entries flushed.
This patch updates the logic to flush tlbs on all cores when the VMA is
passed as NULL. Also, we update places VMA is passed as NULL to use
flush_tlb_kernel_range instead. Now, the only place VMA is passed as
NULL is in the implementation of flush_tlb_kernel_range.
Fixes:
c28b27416da9 ("openrisc: Implement proper SMP tlb flushing")
Reported-by: Jan Henrik Weinstock <jan.weinstock@rwth-aachen.de>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jakub Kicinski [Tue, 2 Nov 2021 22:02:36 +0000 (15:02 -0700)]
ethtool: fix ethtool msg len calculation for pause stats
[ Upstream commit
1aabe578dd86e9f2867c4db4fba9a15f4ba1825d ]
ETHTOOL_A_PAUSE_STAT_MAX is the MAX attribute id,
so we need to subtract non-stats and add one to
get a count (IOW -2+1 == -1).
Otherwise we'll see:
ethnl cmd 21: calculated reply length 40, but consumed 52
Fixes:
9a27a33027f2 ("ethtool: add standard pause stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 3 Nov 2021 02:44:59 +0000 (10:44 +0800)]
kselftests/net: add missed toeplitz.sh/toeplitz_client.sh to Makefile
[ Upstream commit
17b67370c38de2a878debf39dcbc704a206af4d0 ]
When generating the selftests to another folder, the toeplitz.sh
and toeplitz_client.sh are missing as they are not in Makefile, e.g.
make -C tools/testing/selftests/ install \
TARGETS="net" INSTALL_PATH=/tmp/kselftests
Making them under TEST_PROGS_EXTENDED as they test NIC hardware features
and are not intended to be run from kselftests.
Fixes:
5ebfb4cc3048 ("selftests/net: toeplitz test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 3 Nov 2021 02:44:58 +0000 (10:44 +0800)]
kselftests/net: add missed vrf_strict_mode_test.sh test to Makefile
[ Upstream commit
8883deb50eb6529ae1fd4641e402da8ab4f720d2 ]
When generating the selftests to another folder, the
vrf_strict_mode_test.sh test will miss as it is not in Makefile, e.g.
make -C tools/testing/selftests/ install \
TARGETS="net" INSTALL_PATH=/tmp/kselftests
Fixes:
8735e6eaa438 ("selftests: add selftest for the VRF strict mode")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 3 Nov 2021 02:44:57 +0000 (10:44 +0800)]
kselftests/net: add missed SRv6 tests
[ Upstream commit
653e7f19b4a0a632cead2390281bde352d3d3273 ]
When generating the selftests to another folder, the SRv6 tests are
missing as they are not in Makefile, e.g.
make -C tools/testing/selftests/ install \
TARGETS="net" INSTALL_PATH=/tmp/kselftests
Fixes:
03a0b567a03d ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
Fixes:
2195444e09b4 ("selftests: add selftest for the SRv6 End.DT4 behavior")
Fixes:
2bc035538e16 ("selftests: add selftest for the SRv6 End.DT6 (VRF) behavior")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 3 Nov 2021 02:44:56 +0000 (10:44 +0800)]
kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile
[ Upstream commit
b99ac1841147eefd8d8b52fcf00d7d917949ae7f ]
When generating the selftests to another folder, the include file
setup_loopback.sh/setup_veth.sh for gro.sh/gre_gro.sh are missing as
they are not in Makefile, e.g.
make -C tools/testing/selftests/ install \
TARGETS="net" INSTALL_PATH=/tmp/kselftests
Fixes:
7d1575014a63 ("selftests/net: GRO coalesce test")
Fixes:
9af771d2ec04 ("selftests/net: allow GRO coalesce test on veth")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hangbin Liu [Wed, 3 Nov 2021 02:44:55 +0000 (10:44 +0800)]
kselftests/net: add missed icmp.sh test to Makefile
[ Upstream commit
ca3676f94b8f40f52d285f9aef36dfd6725bfc14 ]
When generating the selftests to another folder, the icmp.sh test will
miss as it is not in Makefile, e.g.
make -C tools/testing/selftests/ install \
TARGETS="net" INSTALL_PATH=/tmp/kselftests
Fixes:
7e9838b7915e ("selftests/net: Add icmp.sh for testing ICMP dummy address responses")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Maxim Kiselev [Mon, 1 Nov 2021 15:23:41 +0000 (18:23 +0300)]
net: davinci_emac: Fix interrupt pacing disable
[ Upstream commit
d52bcb47bdf971a59a2467975d2405fcfcb2fa19 ]
This patch allows to use 0 for `coal->rx_coalesce_usecs` param to
disable rx irq coalescing.
Previously we could enable rx irq coalescing via ethtool
(For ex: `ethtool -C eth0 rx-usecs 2000`) but we couldn't disable
it because this part rejects 0 value:
if (!coal->rx_coalesce_usecs)
return -EINVAL;
Fixes:
84da2658a619 ("TI DaVinci EMAC : Implement interrupt pacing functionality.")
Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20211101152343.4193233-1-bigunclemax@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Beld Zhang [Tue, 2 Nov 2021 18:32:08 +0000 (12:32 -0600)]
io-wq: fix max-workers not correctly set on multi-node system
[ Upstream commit
71c9ce27bb57c59d8d7f5298e730c8096eef3d1f ]
In io-wq.c:io_wq_max_workers(), new_count[] was changed right after each
node's value was set. This caused the following node getting the setting
of the previous one.
Returned values are copied from node 0.
Fixes:
2e480058ddc2 ("io-wq: provide a way to limit max number of workers")
Signed-off-by: Beld Zhang <beldzhang@gmail.com>
[axboe: minor fixups]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yu Kuai [Tue, 2 Nov 2021 01:52:35 +0000 (09:52 +0800)]
nbd: fix possible overflow for 'first_minor' in nbd_dev_add()
[ Upstream commit
940c264984fd1457918393c49674f6b39ee16506 ]
If 'part_shift' is not zero, then 'index << part_shift' might
overflow to a value that is not greater than '0xfffff', then sysfs
might complains about duplicate creation.
Fixes:
b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20211102015237.2309763-3-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yu Kuai [Tue, 2 Nov 2021 01:52:34 +0000 (09:52 +0800)]
nbd: fix max value for 'first_minor'
[ Upstream commit
e4c4871a73944353ea23e319de27ef73ce546623 ]
commit
b1a811633f73 ("block: nbd: add sanity check for first_minor")
checks that 'first_minor' should not be greater than 0xff, which is
wrong. Whitout the commit, the details that when user pass 0x100000,
it ends up create sysfs dir "/sys/block/43:0" are as follows:
nbd_dev_add
disk->first_minor = index << part_shift
-> default part_shift is 5, first_minor is 0x2000000
device_add_disk
ddev->devt = MKDEV(disk->major, disk->first_minor)
-> (0x2b << 20) | (0x2000000) = 0x2b00000
device_add
device_create_sys_dev_entry
format_dev_t
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));
-> got 43:0
sysfs_create_link -> /sys/block/43:0
By the way, with the wrong fix, when part_shift is the default value,
only 8 ndb devices can be created since 8 << 5 is greater than 0xff.
Since the max bits for 'first_minor' should be the same as what
MKDEV() does, which is 20. Change the upper bound of 'first_minor'
from 0xff to 0xfffff.
Fixes:
b1a811633f73 ("block: nbd: add sanity check for first_minor")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20211102015237.2309763-2-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
YueHaibing [Fri, 8 Oct 2021 07:44:17 +0000 (15:44 +0800)]
xen-pciback: Fix return in pm_ctrl_init()
[ Upstream commit
4745ea2628bb43a7ec34b71763b5a56407b33990 ]
Return NULL instead of passing to ERR_PTR while err is zero,
this fix smatch warnings:
drivers/xen/xen-pciback/conf_space_capability.c:163
pm_ctrl_init() warn: passing zero to 'ERR_PTR'
Fixes:
a92336a1176b ("xen/pciback: Drop two backends, squash and cleanup some code.")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211008074417.8260-1-yuehaibing@huawei.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sander Vanheule [Thu, 28 Oct 2021 08:52:43 +0000 (10:52 +0200)]
gpio: realtek-otto: fix GPIO line IRQ offset
[ Upstream commit
585a07079909ba9061ddd88214c36653e1aef71a ]
The irqchip uses one domain for all GPIO lines, so the line offset
should be determined w.r.t. the first line of the first port, not the
first line of the triggered port.
Fixes:
0d82fb1127fb ("gpio: Add Realtek Otto GPIO support")
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christophe JAILLET [Thu, 19 Aug 2021 20:48:08 +0000 (22:48 +0200)]
i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()'
[ Upstream commit
7f98960c046ee1136e7096aee168eda03aef8a5d ]
A successful 'clk_prepare()' call should be balanced by a corresponding
'clk_unprepare()' call in the error handling path of the probe, as already
done in the remove function.
More specifically, 'clk_prepare_enable()' is used, but 'clk_disable()' is
also already called. So just the unprepare step has still to be done.
Update the error handling path accordingly.
Fixes:
75d31c2372e4 ("i2c: xlr: add support for Sigma Designs controller variant")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dave Jiang [Mon, 25 Oct 2021 15:01:04 +0000 (08:01 -0700)]
dmaengine: idxd: fix resource leak on dmaengine driver disable
[ Upstream commit
a3e340c1574b6679f5b333221284d0959095da52 ]
The wq resources needs to be released before the kernel type is reset by
__drv_disable_wq(). With dma channels unregistered and wq quiesced, all the
wq resources for dmaengine can be freed. There is no need to wait until wq
is disabled. With the wq->type being reset to "unknown", the driver is
skipping the freeing of the resources.
Fixes:
0cda4f6986a3 ("dmaengine: idxd: create dmaengine driver for wq 'device'")
Reported-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163517405099.3484556.12521975053711345244.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Trond Myklebust [Wed, 27 Oct 2021 01:56:40 +0000 (21:56 -0400)]
NFSv4: Fix a regression in nfs_set_open_stateid_locked()
[ Upstream commit
01d29f87fcfef38d51ce2b473981a5c1e861ac0a ]
If we already hold open state on the client, yet the server gives us a
completely different stateid to the one we already hold, then we
currently treat it as if it were an out-of-sequence update, and wait for
5 seconds for other updates to come in.
This commit fixes the behaviour so that we immediately start processing
of the new stateid, and then leave it to the call to
nfs4_test_and_free_stateid() to decide what to do with the old stateid.
Fixes:
b4868b44c562 ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:11 +0000 (04:54 -0700)]
scsi: qla2xxx: edif: Fix EDIF bsg
[ Upstream commit
9fd26c633e8ab5a291c0241533efff161bbe5570 ]
Various EDIF bsgs did not properly fill out the reply_payload_rcv_len
field. This causes app to parse empty data in the return payload.
Link: https://lore.kernel.org/r/20211026115412.27691-13-njavali@marvell.com
Fixes:
7ebb336e45ef ("scsi: qla2xxx: edif: Add start + stop bsgs")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:09 +0000 (04:54 -0700)]
scsi: qla2xxx: edif: Increase ELS payload
[ Upstream commit
0f6d600a26e89d31d8381b324fc970f72579a126 ]
Currently, firmware limits ELS payload to FC frame size/2112. This patch
adjusts memory buffer size to be able to handle max ELS payload.
Link: https://lore.kernel.org/r/20211026115412.27691-11-njavali@marvell.com
Fixes:
84318a9f01ce ("scsi: qla2xxx: edif: Add send, receive, and accept for auth_els")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:05 +0000 (04:54 -0700)]
scsi: qla2xxx: edif: Flush stale events and msgs on session down
[ Upstream commit
b1af26c245545a289b331c7b71996ecd88321540 ]
On session down, driver will flush all stale messages and doorbell
events. This prevents authentication application from having to process
stale data.
Link: https://lore.kernel.org/r/20211026115412.27691-7-njavali@marvell.com
Fixes:
4de067e5df12 ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Karunakara Merugu <kmerugu@marvell.com>
Signed-off-by: Karunakara Merugu <kmerugu@marvell.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:04 +0000 (04:54 -0700)]
scsi: qla2xxx: edif: Fix app start delay
[ Upstream commit
b492d6a4880fddce098472dec5086d37802c68d3 ]
Current driver does unnecessary pause for each session to get to certain
state before allowing the app start call to return. In larger environment,
this introduces a long delay. Originally the delay was meant to
synchronize app and driver. However, the with current implementation the
two sides use various events to synchronize their state.
The same is applied to the authentication failure call.
Link: https://lore.kernel.org/r/20211026115412.27691-6-njavali@marvell.com
Fixes:
4de067e5df12 ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:03 +0000 (04:54 -0700)]
scsi: qla2xxx: edif: Fix app start fail
[ Upstream commit
8e6d5df3cb32dddf558a52414d29febecb660396 ]
On app start, all sessions need to be reset to see if secure connection can
be made. Fix the broken check which prevents that process.
Link: https://lore.kernel.org/r/20211026115412.27691-5-njavali@marvell.com
Fixes:
4de067e5df12 ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:02 +0000 (04:54 -0700)]
scsi: qla2xxx: Turn off target reset during issue_lip
[ Upstream commit
0b7a9fd934a68ebfc1019811b7bdc1742072ad7b ]
When user uses issue_lip to do link bounce, driver sends additional target
reset to remote device before resetting the link. The target reset would
affect other paths with active I/Os. This patch will remove the unnecessary
target reset.
Link: https://lore.kernel.org/r/20211026115412.27691-4-njavali@marvell.com
Fixes:
5854771e314e ("[SCSI] qla2xxx: Add ISPFX00 specific bus reset routine")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:01 +0000 (04:54 -0700)]
scsi: qla2xxx: Fix gnl list corruption
[ Upstream commit
c98c5daaa24b583cba1369b7d167f93c6ae7299c ]
Current code does list element deletion and addition in and out of lock
protection. This patch moves deletion behind lock.
list_add double add: new=
ffff9130b5eb89f8, prev=
ffff9130b5eb89f8,
next=
ffff9130c6a715f0.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:31!
invalid opcode: 0000 [#1] SMP PTI
CPU: 1 PID: 182395 Comm: kworker/1:37 Kdump: loaded Tainted: G W OE
--------- - - 4.18.0-193.el8.x86_64 #1
Hardware name: HP ProLiant DL160 Gen8, BIOS J03 02/10/2014
Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
RIP: 0010:__list_add_valid+0x41/0x50
Code: 85 94 00 00 00 48 39 c7 74 0b 48 39 d7 74 06 b8 01 00 00 00 c3 48 89 f2
4c 89 c1 48 89 fe 48 c7 c7 60 83 ad 97 e8 4d bd ce ff <0f> 0b 0f 1f 00 66 2e
0f 1f 84 00 00 00 00 00 48 8b 07 48 8b 57 08
RSP: 0018:
ffffaba306f47d68 EFLAGS:
00010046
RAX:
0000000000000058 RBX:
ffff9130b5eb8800 RCX:
0000000000000006
RDX:
0000000000000000 RSI:
0000000000000096 RDI:
ffff9130b7456a00
RBP:
ffff9130c6a70a58 R08:
000000000008d7be R09:
0000000000000001
R10:
0000000000000000 R11:
0000000000000001 R12:
ffff9130c6a715f0
R13:
ffff9130b5eb8824 R14:
ffff9130b5eb89f8 R15:
ffff9130b5eb89f8
FS:
0000000000000000(0000) GS:
ffff9130b7440000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007efcaaef11a0 CR3:
000000005200a002 CR4:
00000000000606e0
Call Trace:
qla24xx_async_gnl+0x113/0x3c0 [qla2xxx]
? qla2x00_iocb_work_fn+0x53/0x80 [qla2xxx]
? process_one_work+0x1a7/0x3b0
? worker_thread+0x30/0x390
? create_worker+0x1a0/0x1a0
? kthread+0x112/0x130
Link: https://lore.kernel.org/r/20211026115412.27691-3-njavali@marvell.com
Fixes:
726b85487067 ("qla2xxx: Add framework for async fabric discovery")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Quinn Tran [Tue, 26 Oct 2021 11:54:00 +0000 (04:54 -0700)]
scsi: qla2xxx: Relogin during fabric disturbance
[ Upstream commit
bb2ca6b3f09ac20e8357d257d0557ab5ddf6adcd ]
For RSCN of type "Area, Domain, or Fabric", which indicate a portion or
entire fabric was disturbed, current driver does not set the scan_need flag
to indicate a session was affected by the disturbance. This in turn can
lead to I/O timeout and delay of relogin. Hence initiate relogin in the
event of fabric disturbance.
Link: https://lore.kernel.org/r/20211026115412.27691-2-njavali@marvell.com
Fixes:
1560bafdff9e ("scsi: qla2xxx: Use complete switch scan for RSCN events")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dmitry Bogdanov [Mon, 18 Oct 2021 13:57:53 +0000 (16:57 +0300)]
scsi: target: core: Remove from tmr_list during LUN unlink
[ Upstream commit
12b6fcd0ea7f3cb7c3b34668fc678779924123ae ]
Currently TMF commands are removed from de_device.dev_tmf_list at the very
end of se_cmd lifecycle. However, se_lun unlinks from se_cmd upon a command
status (response) being queued in transport layer. This means that LUN and
backend device can be deleted in the meantime and a panic will occur:
target_tmr_work()
cmd->se_tfo->queue_tm_rsp(cmd); // send abort_rsp to a wire
transport_lun_remove_cmd(cmd) // unlink se_cmd from se_lun
- // - // - // -
<<<--- lun remove
<<<--- core backend device remove
- // - // - // -
qlt_handle_abts_completion()
tfo->free_mcmd()
transport_generic_free_cmd()
target_put_sess_cmd()
core_tmr_release_req() {
if (dev) { // backend device, can not be null
spin_lock_irqsave(&dev->se_tmr_lock, flags); //<<<--- CRASH
Call Trace:
NIP [
c000000000e1683c] _raw_spin_lock_irqsave+0x2c/0xc0
LR [
c00800000e433338] core_tmr_release_req+0x40/0xa0 [target_core_mod]
Call Trace:
(unreliable)
0x0
target_put_sess_cmd+0x2a0/0x370 [target_core_mod]
transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
tcm_qla2xxx_complete_mcmd+0x28/0x50 [tcm_qla2xxx]
process_one_work+0x2c4/0x5c0
worker_thread+0x88/0x690
For the iSCSI protocol this is easily reproduced:
- Send some SCSI sommand
- Send Abort of that command over iSCSI
- Remove LUN on target
- Send next iSCSI command to acknowledge the Abort_Response
- Target panics
There is no need to keep the command in tmr_list until response completion,
so move the removal from tmr_list from the response completion to the
response queueing when the LUN is unlinked. Move the removal from state
list too as it is a subject to the same race condition.
Link: https://lore.kernel.org/r/20211018135753.15297-1-d.bogdanov@yadro.com
Fixes:
c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6")
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>