platform/kernel/linux-rpi.git
2 years agomisc: alcor_pci: Fix an error handling path
Christophe JAILLET [Sun, 6 Feb 2022 08:39:54 +0000 (09:39 +0100)]
misc: alcor_pci: Fix an error handling path

[ Upstream commit 5b3dc949f554379edcb8ef6111aa5ecb78feb798 ]

A successful ida_simple_get() should be balanced by a corresponding
ida_simple_remove().

Add the missing call in the error handling path of the probe.

While at it, switch to ida_alloc()/ida_free() instead to
ida_simple_get()/ida_simple_remove().
The latter is deprecated and more verbose.

Fixes: 4f556bc04e3c ("misc: cardreader: add new Alcor Micro Cardreader PCI driver")
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/918a9875b7f67b7f8f123c4446452603422e8c5e.1644136776.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agofsi: Aspeed: Fix a potential double free
Christophe JAILLET [Sun, 9 Jan 2022 21:56:10 +0000 (22:56 +0100)]
fsi: Aspeed: Fix a potential double free

[ Upstream commit 83ba7e895debc529803a7a258653f2fe9bf3bf40 ]

A struct device can never be devm_alloc()'ed.
Here, it is embedded in "struct fsi_master", and "struct fsi_master" is
embedded in "struct fsi_master_aspeed".

Since "struct device" is embedded, the data structure embedding it must be
released with the release function, as is already done here.

So use kzalloc() instead of devm_kzalloc() when allocating "aspeed" and
update all error handling branches accordingly.

This prevent a potential double free().

This also fix another issue if opb_readl() fails. Instead of a direct
return, it now jumps in the error handling path.

Fixes: 606397d67f41 ("fsi: Add ast2600 master driver")
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/2c123f8b0a40dc1a061fae982169fe030b4f47e6.1641765339.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopps: clients: gpio: Propagate return value from pps_gpio_probe
Robert Hancock [Wed, 12 Jan 2022 20:52:14 +0000 (14:52 -0600)]
pps: clients: gpio: Propagate return value from pps_gpio_probe

[ Upstream commit abaca3179b41d4b3b115f27814ee36f6fb45e897 ]

If the pps-gpio driver was probed prior to the GPIO device it uses, the
devm_gpiod_get call returned an -EPROBE_DEFER error, but pps_gpio_probe
replaced that error code with -EINVAL, causing the pps-gpio probe to
fail and not be retried later. Propagate the error return value so that
deferred probe works properly.

Fixes: 161520451dfa (pps: new client driver using GPIO)
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: Rodolfo Giometti <giometti@enneenne.com>
Link: https://lore.kernel.org/r/20220112205214.2060954-1-robert.hancock@calian.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopwm: lpc18xx-sct: Initialize driver data and hardware before pwmchip_add()
Uwe Kleine-König [Wed, 10 Nov 2021 08:49:48 +0000 (09:49 +0100)]
pwm: lpc18xx-sct: Initialize driver data and hardware before pwmchip_add()

[ Upstream commit 0401f24cd238ae200a23a13925f98de3d2c883b8 ]

When a driver calls pwmchip_add() it has to be prepared to immediately
get its callbacks called. So move allocation of driver data and hardware
initialization before the call to pwmchip_add().

This fixes a potential NULL pointer exception and a race condition on
register writes.

Fixes: 841e6f90bb78 ("pwm: NXP LPC18xx PWM/SCT driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agofsi: scom: Remove retries in indirect scoms
Joel Stanley [Tue, 7 Dec 2021 03:38:11 +0000 (14:08 +1030)]
fsi: scom: Remove retries in indirect scoms

[ Upstream commit ab1b79159ad5a6dc4e4994b49737f7feb13b7155 ]

In commit f72ddbe1d7b7 ("fsi: scom: Remove retries") the retries were
removed from get and put scoms. That patch missed the retires in get and
put indirect scom.

For the same reason, remove them from the scom driver to allow the
caller to decide to retry.

This removes the following special case which would have caused the
retry code to return early:

 -       if ((ind_data & XSCOM_DATA_IND_COMPLETE) || (err != SCOM_PIB_BLOCKED))
 -               return 0;

I believe this case is handled.

Fixes: f72ddbe1d7b7 ("fsi: scom: Remove retries")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211207033811.518981-3-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agofsi: scom: Fix error handling
Joel Stanley [Tue, 7 Dec 2021 03:38:10 +0000 (14:08 +1030)]
fsi: scom: Fix error handling

[ Upstream commit d46fddd52d11eb6a3a7ed836f9f273e9cf8cd01c ]

SCOM error handling is made complex by trying to pass around two bits of
information: the function return code, and a status parameter that
represents the CFAM error status register.

The commit f72ddbe1d7b7 ("fsi: scom: Remove retries") removed the
"hidden" retries in the SCOM driver, in preference of allowing the
calling code (userspace or driver) to decide how to handle a failed
SCOM. However it introduced a bug by attempting to be smart about the
return codes that were "errors" and which were ok to fall through to the
status register parsing.

We get the following errors:

 - EINVAL or ENXIO, for indirect scoms where the value is invalid
 - EINVAL, where the size or address is incorrect
 - EIO or ETIMEOUT, where FSI write failed (aspeed master)
 - EAGAIN, where the master detected a crc error (GPIO master only)
 - EBUSY, where the bus is disabled (GPIO master in external mode)

In all of these cases we should fail the SCOM read/write and return the
error.

Thanks to Dan Carpenter for the detailed bug report.

Fixes: f72ddbe1d7b7 ("fsi: scom: Remove retries")
Link: https://lists.ozlabs.org/pipermail/linux-fsi/2021-November/000235.html
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211207033811.518981-2-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agomxser: fix xmit_buf leak in activate when LSR == 0xff
Jiri Slaby [Mon, 24 Jan 2022 07:14:24 +0000 (08:14 +0100)]
mxser: fix xmit_buf leak in activate when LSR == 0xff

[ Upstream commit cd3a4907ee334b40d7aa880c7ab310b154fd5cd4 ]

When LSR is 0xff in ->activate() (rather unlike), we return an error.
Provided ->shutdown() is not called when ->activate() fails, nothing
actually frees the buffer in this case.

Fix this by properly freeing the buffer in a designated label. We jump
there also from the "!info->type" if now too.

Fixes: 6769140d3047 ("tty: mxser: use the tty_port_open method")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20220124071430.14907-6-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agostaging: r8188eu: release_firmware is not called if allocation fails
Michael Straube [Fri, 7 Jan 2022 10:36:04 +0000 (11:36 +0100)]
staging: r8188eu: release_firmware is not called if allocation fails

[ Upstream commit 39850edf2befe27bcb3d6c37b6ee76d2ee4df903 ]

In function load_firmware() release_firmware() is not called if the
allocation of pFirmware->szFwBuffer fails or if fw->size is greater
than FW_8188E_SIZE.

Move the call to release_firmware() to the exit label at the end of
the function to fix this.

Fixes: 8cd574e6af54 ("staging: r8188eu: introduce new hal dir for RTL8188eu driver")
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20220107103620.15648-4-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agostaging: r8188eu: convert DBG_88E_LEVEL call in hal/rtl8188e_hal_init.c
Phillip Potter [Mon, 29 Nov 2021 00:20:39 +0000 (00:20 +0000)]
staging: r8188eu: convert DBG_88E_LEVEL call in hal/rtl8188e_hal_init.c

[ Upstream commit 5ec394d58bdba731c2a33645be7018e71f72f287 ]

Convert DBG_88E_LEVEL macro call in hal/rtl8188e_hal_init.c to plain
dev_dbg call, as although the information is potentially useful, we should
be exposing it using standard kernel debugging functionality.

Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/r/20211129002041.865-6-phil@philpotter.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agokernel/resource: fix kfree() of bootmem memory again
Miaohe Lin [Wed, 23 Mar 2022 23:07:18 +0000 (16:07 -0700)]
kernel/resource: fix kfree() of bootmem memory again

[ Upstream commit 0cbcc92917c5de80f15c24d033566539ad696892 ]

Since commit ebff7d8f270d ("mem hotunplug: fix kfree() of bootmem
memory"), we could get a resource allocated during boot via
alloc_resource().  And it's required to release the resource using
free_resource().  Howerver, many people use kfree directly which will
result in kernel BUG.  In order to fix this without fixing every call
site, just leak a couple of bytes in such corner case.

Link: https://lkml.kernel.org/r/20220217083619.19305-1-linmiaohe@huawei.com
Fixes: ebff7d8f270d ("mem hotunplug: fix kfree() of bootmem memory")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agodrivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool
Sondhauß, Jan [Wed, 23 Mar 2022 08:47:33 +0000 (08:47 +0000)]
drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool

[ Upstream commit 2844e2434385819f674d1fb4130c308c50ba681e ]

cpsw_ethtool_begin directly returns the result of pm_runtime_get_sync
when successful.
pm_runtime_get_sync returns -error code on failure and 0 on successful
resume but also 1 when the device is already active. So the common case
for cpsw_ethtool_begin is to return 1. That leads to inconsistent calls
to pm_runtime_put in the call-chain so that pm_runtime_put is called
one too many times and as result leaving the cpsw dev behind suspended.

The suspended cpsw dev leads to an access violation later on by
different parts of the cpsw driver.

Fix this by calling the return-friendly pm_runtime_resume_and_get
function.

Fixes: d43c65b05b84 ("ethtool: runtime-resume netdev parent in ethnl_ops_begin")
Signed-off-by: Jan Sondhauss <jan.sondhauss@wago.com>
Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com>
Link: https://lore.kernel.org/r/20220323084725.65864-1-jan.sondhauss@wago.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoice: don't allow to run ice_send_event_to_aux() in atomic ctx
Alexander Lobakin [Wed, 23 Mar 2022 12:43:53 +0000 (13:43 +0100)]
ice: don't allow to run ice_send_event_to_aux() in atomic ctx

[ Upstream commit 5a3156932da06f09953764de113419f254086faf ]

ice_send_event_to_aux() eventually descends to mutex_lock()
(-> might_sched()), so it must not be called under non-task
context. However, at least two fixes have happened already for the
bug splats occurred due to this function being called from atomic
context.
To make the emergency landings softer, bail out early when executed
in non-task context emitting a warn splat only once. This way we
trade some events being potentially lost for system stability and
avoid any related hangs and crashes.

Fixes: 348048e724a0e ("ice: Implement iidc operations")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoice: fix 'scheduling while atomic' on aux critical err interrupt
Alexander Lobakin [Wed, 23 Mar 2022 12:43:52 +0000 (13:43 +0100)]
ice: fix 'scheduling while atomic' on aux critical err interrupt

[ Upstream commit 32d53c0aa3a7b727243473949bad2a830b908edc ]

There's a kernel BUG splat on processing aux critical error
interrupts in ice_misc_intr():

[ 2100.917085] BUG: scheduling while atomic: swapper/15/0/0x00010000
...
[ 2101.060770] Call Trace:
[ 2101.063229]  <IRQ>
[ 2101.065252]  dump_stack+0x41/0x60
[ 2101.068587]  __schedule_bug.cold.100+0x4c/0x58
[ 2101.073060]  __schedule+0x6a4/0x830
[ 2101.076570]  schedule+0x35/0xa0
[ 2101.079727]  schedule_preempt_disabled+0xa/0x10
[ 2101.084284]  __mutex_lock.isra.7+0x310/0x420
[ 2101.088580]  ? ice_misc_intr+0x201/0x2e0 [ice]
[ 2101.093078]  ice_send_event_to_aux+0x25/0x70 [ice]
[ 2101.097921]  ice_misc_intr+0x220/0x2e0 [ice]
[ 2101.102232]  __handle_irq_event_percpu+0x40/0x180
[ 2101.106965]  handle_irq_event_percpu+0x30/0x80
[ 2101.111434]  handle_irq_event+0x36/0x53
[ 2101.115292]  handle_edge_irq+0x82/0x190
[ 2101.119148]  handle_irq+0x1c/0x30
[ 2101.122480]  do_IRQ+0x49/0xd0
[ 2101.125465]  common_interrupt+0xf/0xf
[ 2101.129146]  </IRQ>
...

As Andrew correctly mentioned previously[0], the following call
ladder happens:

ice_misc_intr() <- hardirq
  ice_send_event_to_aux()
    device_lock()
      mutex_lock()
        might_sleep()
          might_resched() <- oops

Add a new PF state bit which indicates that an aux critical error
occurred and serve it in ice_service_task() in process context.
The new ice_pf::oicr_err_reg is read-write in both hardirq and
process contexts, but only 3 bits of non-critical data probably
aren't worth explicit synchronizing (and they're even in the same
byte [31:24]).

[0] https://lore.kernel.org/all/YeSRUVmrdmlUXHDn@lunn.ch

Fixes: 348048e724a0e ("ice: Implement iidc operations")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Michal Kubiak <michal.kubiak@intel.com>
Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agomfd: asic3: Add missing iounmap() on error asic3_mfd_probe
Miaoqian Lin [Mon, 7 Mar 2022 07:29:47 +0000 (07:29 +0000)]
mfd: asic3: Add missing iounmap() on error asic3_mfd_probe

[ Upstream commit e84ee1a75f944a0fe3c277aaa10c426603d2b0bc ]

Add the missing iounmap() before return from asic3_mfd_probe
in the error handling case.

Fixes: 64e8867ba809 ("mfd: tmio_mmc hardware abstraction for CNF area")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20220307072947.5369-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agotipc: fix the timer expires after interval 100ms
Hoang Le [Mon, 21 Mar 2022 04:22:29 +0000 (11:22 +0700)]
tipc: fix the timer expires after interval 100ms

[ Upstream commit 6a7d8cff4a3301087dd139293e9bddcf63827282 ]

In the timer callback function tipc_sk_timeout(), we're trying to
reschedule another timeout to retransmit a setup request if destination
link is congested. But we use the incorrect timeout value
(msecs_to_jiffies(100)) instead of (jiffies + msecs_to_jiffies(100)),
so that the timer expires immediately, it's irrelevant for original
description.

In this commit we correct the timeout value in sk_reset_timer()

Fixes: 6787927475e5 ("tipc: buffer overflow handling in listener socket")
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20220321042229.314288-1-hoang.h.le@dektech.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agonet: dsa: fix panic on shutdown if multi-chip tree failed to probe
Vladimir Oltean [Fri, 18 Mar 2022 19:54:43 +0000 (21:54 +0200)]
net: dsa: fix panic on shutdown if multi-chip tree failed to probe

[ Upstream commit 8fd36358ce82382519b50b05f437493e1e00c4a9 ]

DSA probing is atypical because a tree of devices must probe all at
once, so out of N switches which call dsa_tree_setup_routing_table()
during probe, for (N - 1) of them, "complete" will return false and they
will exit probing early. The Nth switch will set up the whole tree on
their behalf.

The implication is that for (N - 1) switches, the driver binds to the
device successfully, without doing anything. When the driver is bound,
the ->shutdown() method may run. But if the Nth switch has failed to
initialize the tree, there is nothing to do for the (N - 1) driver
instances, since the slave devices have not been created, etc. Moreover,
dsa_switch_shutdown() expects that the calling @ds has been in fact
initialized, so it jumps at dereferencing the various data structures,
which is incorrect.

Avoid the ensuing NULL pointer dereferences by simply checking whether
the Nth switch has previously set "ds->setup = true" for the switch
which is currently shutting down. The entire setup is serialized under
dsa2_mutex which we already hold.

Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220318195443.275026-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoopenvswitch: always update flow key after nat
Aaron Conole [Fri, 18 Mar 2022 12:43:19 +0000 (08:43 -0400)]
openvswitch: always update flow key after nat

[ Upstream commit 60b44ca6bd7518dd38fa2719bc9240378b6172c3 ]

During NAT, a tuple collision may occur.  When this happens, openvswitch
will make a second pass through NAT which will perform additional packet
modification.  This will update the skb data, but not the flow key that
OVS uses.  This means that future flow lookups, and packet matches will
have incorrect data.  This has been supported since
5d50aa83e2c8 ("openvswitch: support asymmetric conntrack").

That commit failed to properly update the sw_flow_key attributes, since
it only called the ovs_ct_nat_update_key once, rather than each time
ovs_ct_nat_execute was called.  As these two operations are linked, the
ovs_ct_nat_execute() function should always make sure that the
sw_flow_key is updated after a successful call through NAT infrastructure.

Fixes: 5d50aa83e2c8 ("openvswitch: support asymmetric conntrack")
Cc: Dumitru Ceara <dceara@redhat.com>
Cc: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20220318124319.3056455-1-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agotcp: ensure PMTU updates are processed during fastopen
Jakub Kicinski [Mon, 21 Mar 2022 16:59:57 +0000 (09:59 -0700)]
tcp: ensure PMTU updates are processed during fastopen

[ Upstream commit ed0c99dc0f499ff8b6e75b5ae6092ab42be1ad39 ]

tp->rx_opt.mss_clamp is not populated, yet, during TFO send so we
rise it to the local MSS. tp->mss_cache is not updated, however:

tcp_v6_connect():
  tp->rx_opt.mss_clamp = IPV6_MIN_MTU - headers;
  tcp_connect():
     tcp_connect_init():
       tp->mss_cache = min(mtu, tp->rx_opt.mss_clamp)
     tcp_send_syn_data():
       tp->rx_opt.mss_clamp = tp->advmss

After recent fixes to ICMPv6 PTB handling we started dropping
PMTU updates higher than tp->mss_cache. Because of the stale
tp->mss_cache value PMTU updates during TFO are always dropped.

Thanks to Wei for helping zero in on the problem and the fix!

Fixes: c7bb4b89033b ("ipv6: tcp: drop silly ICMPv6 packet too big messages")
Reported-by: Andre Nash <alnash@fb.com>
Reported-by: Neil Spring <ntspring@fb.com>
Reviewed-by: Wei Wang <weiwan@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220321165957.1769954-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agonet: bcmgenet: Use stronger register read/writes to assure ordering
Jeremy Linton [Thu, 10 Mar 2022 04:53:58 +0000 (22:53 -0600)]
net: bcmgenet: Use stronger register read/writes to assure ordering

[ Upstream commit 8d3ea3d402db94b61075617e71b67459a714a502 ]

GCC12 appears to be much smarter about its dependency tracking and is
aware that the relaxed variants are just normal loads and stores and
this is causing problems like:

[  210.074549] ------------[ cut here ]------------
[  210.079223] NETDEV WATCHDOG: enabcm6e4ei0 (bcmgenet): transmit queue 1 timed out
[  210.086717] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:529 dev_watchdog+0x234/0x240
[  210.095044] Modules linked in: genet(E) nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat]
[  210.146561] ACPI CPPC: PCC check channel failed for ss: 0. ret=-110
[  210.146927] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G            E     5.17.0-rc7G12+ #58
[  210.153226] CPPC Cpufreq:cppc_scale_freq_workfn: failed to read perf counters
[  210.161349] Hardware name: Raspberry Pi Foundation Raspberry Pi 4 Model B/Raspberry Pi 4 Model B, BIOS EDK2-DEV 02/08/2022
[  210.161353] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  210.161358] pc : dev_watchdog+0x234/0x240
[  210.161364] lr : dev_watchdog+0x234/0x240
[  210.161368] sp : ffff8000080a3a40
[  210.161370] x29: ffff8000080a3a40 x28: ffffcd425af87000 x27: ffff8000080a3b20
[  210.205150] x26: ffffcd425aa00000 x25: 0000000000000001 x24: ffffcd425af8ec08
[  210.212321] x23: 0000000000000100 x22: ffffcd425af87000 x21: ffff55b142688000
[  210.219491] x20: 0000000000000001 x19: ffff55b1426884c8 x18: ffffffffffffffff
[  210.226661] x17: 64656d6974203120 x16: 0000000000000001 x15: 6d736e617274203a
[  210.233831] x14: 2974656e65676d63 x13: ffffcd4259c300d8 x12: ffffcd425b07d5f0
[  210.241001] x11: 00000000ffffffff x10: ffffcd425b07d5f0 x9 : ffffcd4258bdad9c
[  210.248171] x8 : 00000000ffffdfff x7 : 000000000000003f x6 : 0000000000000000
[  210.255341] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000001000
[  210.262511] x2 : 0000000000001000 x1 : 0000000000000005 x0 : 0000000000000044
[  210.269682] Call trace:
[  210.272133]  dev_watchdog+0x234/0x240
[  210.275811]  call_timer_fn+0x3c/0x15c
[  210.279489]  __run_timers.part.0+0x288/0x310
[  210.283777]  run_timer_softirq+0x48/0x80
[  210.287716]  __do_softirq+0x128/0x360
[  210.291392]  __irq_exit_rcu+0x138/0x140
[  210.295243]  irq_exit_rcu+0x1c/0x30
[  210.298745]  el1_interrupt+0x38/0x54
[  210.302334]  el1h_64_irq_handler+0x18/0x24
[  210.306445]  el1h_64_irq+0x7c/0x80
[  210.309857]  arch_cpu_idle+0x18/0x2c
[  210.313445]  default_idle_call+0x4c/0x140
[  210.317470]  cpuidle_idle_call+0x14c/0x1a0
[  210.321584]  do_idle+0xb0/0x100
[  210.324737]  cpu_startup_entry+0x30/0x8c
[  210.328675]  secondary_start_kernel+0xe4/0x110
[  210.333138]  __secondary_switched+0x94/0x98

The assumption when these were relaxed seems to be that device memory
would be mapped non reordering, and that other constructs
(spinlocks/etc) would provide the barriers to assure that packet data
and in memory rings/queues were ordered with respect to device
register reads/writes. This itself seems a bit sketchy, but the real
problem with GCC12 is that it is moving the actual reads/writes around
at will as though they were independent operations when in truth they
are not, but the compiler can't know that. When looking at the
assembly dumps for many of these routines its possible to see very
clean, but not strictly in program order operations occurring as the
compiler would be free to do if these weren't actually register
reads/write operations.

Its possible to suppress the timeout with a liberal bit of dma_mb()'s
sprinkled around but the device still seems unable to reliably
send/receive data. A better plan is to use the safer readl/writel
everywhere.

Since this partially reverts an older commit, which notes the use of
the relaxed variants for performance reasons. I would suggest that
any performance problems with this commit are targeted at relaxing only
the performance critical code paths after assuring proper barriers.

Fixes: 69d2ea9c79898 ("net: bcmgenet: Use correct I/O accessors")
Reported-by: Peter Robinson <pbrobinson@gmail.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Peter Robinson <pbrobinson@gmail.com>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220310045358.224350-1-jeremy.linton@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoPCI: Avoid broken MSI on SB600 USB devices
Bjorn Helgaas [Mon, 21 Mar 2022 18:34:46 +0000 (13:34 -0500)]
PCI: Avoid broken MSI on SB600 USB devices

[ Upstream commit 63cd736f449445edcd7f0bcc7d84453e9beec0aa ]

Some ATI SB600 USB adapters advertise MSI, but if INTx is disabled by
setting PCI_COMMAND_INTX_DISABLE, MSI doesn't work either.  The PCI/PCIe
specs do not require software to set PCI_COMMAND_INTX_DISABLE when enabling
MSI, but Linux has done that for many years.

Mick reported that 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI
devices") broke these devices.  Prior to 306c54d0edb6, they used INTx.
Starting with 306c54d0edb6, they use MSI, and and the fact that Linux sets
PCI_COMMAND_INTX_DISABLE means both INTx and MSI are disabled on these
devices.

Avoid this SB600 defect by disabling MSI so we use INTx as before.

Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
Link: https://lore.kernel.org/r/20220321183446.1108325-1-helgaas@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215690
Link: https://lore.kernel.org/all/PxIByDyBRcsbpcmVhGSNDFAoUcMmb78ctXCkw6fbpx25TGlCHvA6SJjjFkNr1FfQZMntYPTNyvEnblxzAZ8a6jP9ddLpKeCN6Chi_2FuexU=@protonmail.com/
Link: https://lore.kernel.org/r/20220314101448.90074-1-andriy.shevchenko@linux.intel.com
BugLink: https://lore.kernel.org/all/20200702143045.23429-1-andriy.shevchenko@linux.intel.com/
Reported-by: Mick Lorain <micklorain@protonmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoselftests/bpf/test_lirc_mode2.sh: Exit with proper code
Hangbin Liu [Mon, 21 Mar 2022 02:41:49 +0000 (10:41 +0800)]
selftests/bpf/test_lirc_mode2.sh: Exit with proper code

[ Upstream commit ec80906b0fbd7be11e3e960813b977b1ffe5f8fe ]

When test_lirc_mode2_user exec failed, the test report failed but still
exit with 0. Fix it by exiting with an error code.

Another issue is for the LIRCDEV checking. With bash -n, we need to quote
the variable, or it will always be true. So if test_lirc_mode2_user was
not run, just exit with skip code.

Fixes: 6bdd533cee9a ("bpf: add selftest for lirc_mode2 type program")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220321024149.157861-1-liuhangbin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopowerpc/pseries: Fix use after free in remove_phb_dynamic()
Michael Ellerman [Fri, 18 Mar 2022 03:42:19 +0000 (14:42 +1100)]
powerpc/pseries: Fix use after free in remove_phb_dynamic()

[ Upstream commit fe2640bd7a62f1f7c3f55fbda31084085075bc30 ]

In remove_phb_dynamic() we use &phb->io_resource, after we've called
device_unregister(&host_bridge->dev). But the unregister may have freed
phb, because pcibios_free_controller_deferred() is the release function
for the host_bridge.

If there are no outstanding references when we call device_unregister()
then phb will be freed out from under us.

This has gone mainly unnoticed, but with slub_debug and page_poison
enabled it can lead to a crash:

  PID: 7574   TASK: c0000000d492cb80  CPU: 13  COMMAND: "drmgr"
   #0 [c0000000e4f075a0] crash_kexec at c00000000027d7dc
   #1 [c0000000e4f075d0] oops_end at c000000000029608
   #2 [c0000000e4f07650] __bad_page_fault at c0000000000904b4
   #3 [c0000000e4f076c0] do_bad_slb_fault at c00000000009a5a8
   #4 [c0000000e4f076f0] data_access_slb_common_virt at c000000000008b30
   Data SLB Access [380] exception frame:
   R0:  c000000000167250    R1:  c0000000e4f07a00    R2:  c000000002a46100
   R3:  c000000002b39ce8    R4:  00000000000000c0    R5:  00000000000000a9
   R6:  3894674d000000c0    R7:  0000000000000000    R8:  00000000000000ff
   R9:  0000000000000100    R10: 6b6b6b6b6b6b6b6b    R11: 0000000000008000
   R12: c00000000023da80    R13: c0000009ffd38b00    R14: 0000000000000000
   R15: 000000011c87f0f0    R16: 0000000000000006    R17: 0000000000000003
   R18: 0000000000000002    R19: 0000000000000004    R20: 0000000000000005
   R21: 000000011c87ede8    R22: 000000011c87c5a8    R23: 000000011c87d3a0
   R24: 0000000000000000    R25: 0000000000000001    R26: c0000000e4f07cc8
   R27: c00000004d1cc400    R28: c0080000031d00e8    R29: c00000004d23d800
   R30: c00000004d1d2400    R31: c00000004d1d2540
   NIP: c000000000167258    MSR: 8000000000009033    OR3: c000000000e9f474
   CTR: 0000000000000000    LR:  c000000000167250    XER: 0000000020040003
   CCR: 0000000024088420    MQ:  0000000000000000    DAR: 6b6b6b6b6b6b6ba3
   DSISR: c0000000e4f07920     Syscall Result: fffffffffffffff2
   [NIP  : release_resource+56]
   [LR   : release_resource+48]
   #5 [c0000000e4f07a00] release_resource at c000000000167258  (unreliable)
   #6 [c0000000e4f07a30] remove_phb_dynamic at c000000000105648
   #7 [c0000000e4f07ab0] dlpar_remove_slot at c0080000031a09e8 [rpadlpar_io]
   #8 [c0000000e4f07b50] remove_slot_store at c0080000031a0b9c [rpadlpar_io]
   #9 [c0000000e4f07be0] kobj_attr_store at c000000000817d8c
  #10 [c0000000e4f07c00] sysfs_kf_write at c00000000063e504
  #11 [c0000000e4f07c20] kernfs_fop_write_iter at c00000000063d868
  #12 [c0000000e4f07c70] new_sync_write at c00000000054339c
  #13 [c0000000e4f07d10] vfs_write at c000000000546624
  #14 [c0000000e4f07d60] ksys_write at c0000000005469f4
  #15 [c0000000e4f07db0] system_call_exception at c000000000030840
  #16 [c0000000e4f07e10] system_call_vectored_common at c00000000000c168

To avoid it, we can take a reference to the host_bridge->dev until we're
done using phb. Then when we drop the reference the phb will be freed.

Fixes: 2dd9c11b9d4d ("powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)")
Reported-by: David Dai <zdai@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Link: https://lore.kernel.org/r/20220318034219.1188008-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoi2c: mux: demux-pinctrl: do not deactivate a master that is not active
Peter Rosin [Wed, 2 Mar 2022 11:22:35 +0000 (12:22 +0100)]
i2c: mux: demux-pinctrl: do not deactivate a master that is not active

[ Upstream commit 1a22aabf20adf89cb216f566913196128766f25b ]

Attempting to rollback the activation of the current master when
the current master has not been activated is bad. priv->cur_chan
and priv->cur_adap are both still zeroed out and the rollback
may result in attempts to revert an of changeset that has not been
applied and do result in calls to both del and put the zeroed out
i2c_adapter. Maybe it crashes, or whatever, but it's bad in any
case.

Fixes: e9d1a0a41d44 ("i2c: mux: demux-pinctrl: Fix an error handling path in 'i2c_demux_pinctrl_probe()'")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agonetfilter: conntrack: Add and use nf_ct_set_auto_assign_helper_warned()
Phil Sutter [Wed, 2 Mar 2022 21:02:55 +0000 (22:02 +0100)]
netfilter: conntrack: Add and use nf_ct_set_auto_assign_helper_warned()

[ Upstream commit 31d0bb9763efad30377505f3467f958d1ebe1e3d ]

The function sets the pernet boolean to avoid the spurious warning from
nf_ct_lookup_helper() when assigning conntrack helpers via nftables.

Fixes: 1a64edf54f55 ("netfilter: nft_ct: add helper set support")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoi2c: meson: Fix wrong speed use from probe
Lucas Tanure [Tue, 1 Mar 2022 07:26:00 +0000 (07:26 +0000)]
i2c: meson: Fix wrong speed use from probe

[ Upstream commit cb13aa16f34f794a9cee2626862af8a95f0f0ee9 ]

Having meson_i2c_set_clk_div after i2c_add_adapter
causes issues for client drivers that try to use
the bus before the requested speed is applied.

The bus can be used just after i2c_add_adapter, so
move i2c_add_adapter to the final step as
meson_i2c_set_clk_div needs to be called before
the bus is used.

Fixes: 09af1c2fa490 ("i2c: meson: set clock divider in probe instead of setting it for each transfer")
Signed-off-by: Lucas Tanure <tanure@linux.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoaf_netlink: Fix shift out of bounds in group mask calculation
Petr Machata [Thu, 17 Mar 2022 14:53:06 +0000 (15:53 +0100)]
af_netlink: Fix shift out of bounds in group mask calculation

[ Upstream commit 0caf6d9922192dd1afa8dc2131abfb4df1443b9f ]

When a netlink message is received, netlink_recvmsg() fills in the address
of the sender. One of the fields is the 32-bit bitfield nl_groups, which
carries the multicast group on which the message was received. The least
significant bit corresponds to group 1, and therefore the highest group
that the field can represent is 32. Above that, the UB sanitizer flags the
out-of-bounds shift attempts.

Which bits end up being set in such case is implementation defined, but
it's either going to be a wrong non-zero value, or zero, which is at least
not misleading. Make the latter choice deterministic by always setting to 0
for higher-numbered multicast groups.

To get information about membership in groups >= 32, userspace is expected
to use nl_pktinfo control messages[0], which are enabled by NETLINK_PKTINFO
socket option.
[0] https://lwn.net/Articles/147608/

The way to trigger this issue is e.g. through monitoring the BRVLAN group:

# bridge monitor vlan &
# ip link add name br type bridge

Which produces the following citation:

UBSAN: shift-out-of-bounds in net/netlink/af_netlink.c:162:19
shift exponent 32 is too large for 32-bit type 'int'

Fixes: f7fa9b10edbb ("[NETLINK]: Support dynamic number of multicast groups per netlink family")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/2bef6aabf201d1fc16cca139a744700cff9dcb04.1647527635.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoipv4: Fix route lookups when handling ICMP redirects and PMTU updates
Guillaume Nault [Thu, 17 Mar 2022 12:45:09 +0000 (13:45 +0100)]
ipv4: Fix route lookups when handling ICMP redirects and PMTU updates

[ Upstream commit 544b4dd568e3b09c1ab38a759d3187e7abda11a0 ]

The PMTU update and ICMP redirect helper functions initialise their fl4
variable with either __build_flow_key() or build_sk_flow_key(). These
initialisation functions always set ->flowi4_scope with
RT_SCOPE_UNIVERSE and might set the ECN bits of ->flowi4_tos. This is
not a problem when the route lookup is later done via
ip_route_output_key_hash(), which properly clears the ECN bits from
->flowi4_tos and initialises ->flowi4_scope based on the RTO_ONLINK
flag. However, some helpers call fib_lookup() directly, without
sanitising the tos and scope fields, so the route lookup can fail and,
as a result, the ICMP redirect or PMTU update aren't taken into
account.

Fix this by extracting the ->flowi4_tos and ->flowi4_scope sanitisation
code into ip_rt_fix_tos(), then use this function in handlers that call
fib_lookup() directly.

Note 1: We can't sanitise ->flowi4_tos and ->flowi4_scope in a central
place (like __build_flow_key() or flowi4_init_output()), because
ip_route_output_key_hash() expects non-sanitised values. When called
with sanitised values, it can erroneously overwrite RT_SCOPE_LINK with
RT_SCOPE_UNIVERSE in ->flowi4_scope. Therefore we have to be careful to
sanitise the values only for those paths that don't call
ip_route_output_key_hash().

Note 2: The problem is mostly about sanitising ->flowi4_tos. Having
->flowi4_scope initialised with RT_SCOPE_UNIVERSE instead of
RT_SCOPE_LINK probably wasn't really a problem: sockets with the
SOCK_LOCALROUTE flag set (those that'd result in RTO_ONLINK being set)
normally shouldn't receive ICMP redirects or PMTU updates.

Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoBluetooth: btmtksdio: Fix kernel oops in btmtksdio_interrupt
Yake Yang [Wed, 16 Mar 2022 23:15:19 +0000 (07:15 +0800)]
Bluetooth: btmtksdio: Fix kernel oops in btmtksdio_interrupt

[ Upstream commit b062a0b9c1dc1ff63094337dccfe1568d5b62023 ]

Fix the following kernel oops in btmtksdio_interrrupt

[   14.339134]  btmtksdio_interrupt+0x28/0x54
[   14.339139]  process_sdio_pending_irqs+0x68/0x1a0
[   14.339144]  sdio_irq_work+0x40/0x70
[   14.339154]  process_one_work+0x184/0x39c
[   14.339160]  worker_thread+0x228/0x3e8
[   14.339168]  kthread+0x148/0x3ac
[   14.339176]  ret_from_fork+0x10/0x30

That happened because hdev->power_on is already called before
sdio_set_drvdata which btmtksdio_interrupt handler relies on is not
properly set up.

The details are shown as the below: hci_register_dev would run
queue_work(hdev->req_workqueue, &hdev->power_on) as WQ_HIGHPRI
workqueue_struct to complete the power-on sequeunce and thus hci_power_on
may run before sdio_set_drvdata is done in btmtksdio_probe.

The hci_dev_do_open in hci_power_on would initialize the device and enable
the interrupt and thus it is possible that btmtksdio_interrupt is being
called right before sdio_set_drvdata is filled out.

When btmtksdio_interrupt is being called and sdio_set_drvdata is not filled
, the kernel oops is going to happen because btmtksdio_interrupt access an
uninitialized pointer.

Fixes: 9aebfd4a2200 ("Bluetooth: mediatek: add support for MediaTek MT7663S and MT7668S SDIO devices")
Reviewed-by: Mark Chen <markyawenchen@gmail.com>
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoBluetooth: call hci_le_conn_failed with hdev lock in hci_le_conn_failed
Niels Dossche [Wed, 16 Mar 2022 15:33:50 +0000 (16:33 +0100)]
Bluetooth: call hci_le_conn_failed with hdev lock in hci_le_conn_failed

[ Upstream commit 9fa6b4cda3b414e990f008f45f9bcecbcb54d4d1 ]

hci_le_conn_failed function's documentation says that the caller must
hold hdev->lock. The only callsite that does not hold that lock is
hci_le_conn_failed. The other 3 callsites hold the hdev->lock very
locally. The solution is to hold the lock during the call to
hci_le_conn_failed.

Fixes: 3c857757ef6e ("Bluetooth: Add directed advertising support through connect()")
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoBluetooth: hci_uart: add missing NULL check in h5_enqueue
Pavel Skripkin [Sun, 13 Mar 2022 17:49:36 +0000 (20:49 +0300)]
Bluetooth: hci_uart: add missing NULL check in h5_enqueue

[ Upstream commit 32cb08e958696908a9aad5e49a78d74f7e32fffb ]

Syzbot hit general protection fault in __pm_runtime_resume(). The problem
was in missing NULL check.

hu->serdev can be NULL and we should not blindly pass &serdev->dev
somewhere, since it will cause GPF.

Reported-by: syzbot+b9bd12fbed3485a3e51f@syzkaller.appspotmail.com
Fixes: d9dd833cf6d2 ("Bluetooth: hci_h5: Add runtime suspend")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoselftests/bpf: Fix error reporting from sock_fields programs
Jakub Sitnicki [Thu, 17 Mar 2022 11:39:17 +0000 (12:39 +0100)]
selftests/bpf: Fix error reporting from sock_fields programs

[ Upstream commit a4c9fe0ed4a13e25e43fcd44d9f89bc19ba8fbb7 ]

The helper macro that records an error in BPF programs that exercise sock
fields access has been inadvertently broken by adaptation work that
happened in commit b18c1f0aa477 ("bpf: selftest: Adapt sock_fields test to
use skel and global variables").

BPF_NOEXIST flag cannot be used to update BPF_MAP_TYPE_ARRAY. The operation
always fails with -EEXIST, which in turn means the error never gets
recorded, and the checks for errors always pass.

Revert the change in update flags.

Fixes: b18c1f0aa477 ("bpf: selftest: Adapt sock_fields test to use skel and global variables")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-2-jakub@cloudflare.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoaf_unix: Fix some data-races around unix_sk(sk)->oob_skb.
Kuniyuki Iwashima [Thu, 17 Mar 2022 03:08:08 +0000 (12:08 +0900)]
af_unix: Fix some data-races around unix_sk(sk)->oob_skb.

[ Upstream commit e82025c623e2bf04d162bafceb66a59115814479 ]

Out-of-band data automatically places a "mark" showing wherein the
sequence the out-of-band data would have been.  If the out-of-band data
implies cancelling everything sent so far, the "mark" is helpful to flush
them.  When the socket's read pointer reaches the "mark", the ioctl() below
sets a non zero value to the arg `atmark`:

The out-of-band data is queued in sk->sk_receive_queue as well as ordinary
data and also saved in unix_sk(sk)->oob_skb.  It can be used to test if the
head of the receive queue is the out-of-band data meaning the socket is at
the "mark".

While testing that, unix_ioctl() reads unix_sk(sk)->oob_skb locklessly.
Thus, all accesses to oob_skb need some basic protection to avoid
load/store tearing which KCSAN detects when these are called concurrently:

  - ioctl(fd_a, SIOCATMARK, &atmark, sizeof(atmark))
  - send(fd_b_connected_to_a, buf, sizeof(buf), MSG_OOB)

BUG: KCSAN: data-race in unix_ioctl / unix_stream_sendmsg

write to 0xffff888003d9cff0 of 8 bytes by task 175 on cpu 1:
 unix_stream_sendmsg (net/unix/af_unix.c:2087 net/unix/af_unix.c:2191)
 sock_sendmsg (net/socket.c:705 net/socket.c:725)
 __sys_sendto (net/socket.c:2040)
 __x64_sys_sendto (net/socket.c:2048)
 do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)

read to 0xffff888003d9cff0 of 8 bytes by task 176 on cpu 0:
 unix_ioctl (net/unix/af_unix.c:3101 (discriminator 1))
 sock_do_ioctl (net/socket.c:1128)
 sock_ioctl (net/socket.c:1242)
 __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:874 fs/ioctl.c:860 fs/ioctl.c:860)
 do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)

value changed: 0xffff888003da0c00 -> 0xffff888003da0d00

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 176 Comm: unix_race_oob_i Not tainted 5.17.0-rc5-59529-g83dc4c2af682 #12
Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoibmvnic: fix race between xmit and reset
Sukadev Bhattiprolu [Thu, 17 Mar 2022 01:12:31 +0000 (18:12 -0700)]
ibmvnic: fix race between xmit and reset

[ Upstream commit 4219196d1f662cb10a462eb9e076633a3fc31a15 ]

There is a race between reset and the transmit paths that can lead to
ibmvnic_xmit() accessing an scrq after it has been freed in the reset
path. It can result in a crash like:

Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0080000016189f8
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic]
LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
Call Trace:
[c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable)
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c9cfcc] sch_direct_xmit+0xec/0x330
[c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0
[c000000000c00ad4] __dev_queue_xmit+0x394/0x730
[c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding]
[c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding]
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c00ca4] __dev_queue_xmit+0x564/0x730
[c000000000cf97e0] neigh_hh_output+0xd0/0x180
[c000000000cfa69c] ip_finish_output2+0x31c/0x5c0
[c000000000cfd244] __ip_queue_xmit+0x194/0x4f0
[c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0
[c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0
[c000000000d2d984] tcp_retransmit_skb+0x34/0x130
[c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0
[c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330
[c000000000d317bc] tcp_write_timer+0x5c/0x200
[c000000000243270] call_timer_fn+0x50/0x1c0
[c000000000243704] __run_timers.part.0+0x324/0x460
[c000000000243894] run_timer_softirq+0x54/0xa0
[c000000000ea713c] __do_softirq+0x15c/0x3e0
[c000000000166258] __irq_exit_rcu+0x158/0x190
[c000000000166420] irq_exit+0x20/0x40
[c00000000002853c] timer_interrupt+0x14c/0x2b0
[c000000000009a00] decrementer_common_virt+0x210/0x220
--- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c

The immediate cause of the crash is the access of tx_scrq in the following
snippet during a reset, where the tx_scrq can be either NULL or an address
that will soon be invalid:

ibmvnic_xmit()
{
...
tx_scrq = adapter->tx_scrq[queue_num];
txq = netdev_get_tx_queue(netdev, queue_num);
ind_bufp = &tx_scrq->ind_buf;

if (test_bit(0, &adapter->resetting)) {
...
}

But beyond that, the call to ibmvnic_xmit() itself is not safe during a
reset and the reset path attempts to avoid this by stopping the queue in
ibmvnic_cleanup(). However just after the queue was stopped, an in-flight
ibmvnic_complete_tx() could have restarted the queue even as the reset is
progressing.

Since the queue was restarted we could get a call to ibmvnic_xmit() which
can then access the bad tx_scrq (or other fields).

We cannot however simply have ibmvnic_complete_tx() check the ->resetting
bit and skip starting the queue. This can race at the "back-end" of a good
reset which just restarted the queue but has not cleared the ->resetting
bit yet. If we skip restarting the queue due to ->resetting being true,
the queue would remain stopped indefinitely potentially leading to transmit
timeouts.

IOW ->resetting is too broad for this purpose. Instead use a new flag
that indicates whether or not the queues are active. Only the open/
reset paths control when the queues are active. ibmvnic_complete_tx()
and others wake up the queue only if the queue is marked active.

So we will have:
A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open()

->resetting = true
->tx_queues_active = false
disable tx queues
...
->tx_queues_active = true
start tx queues

B. Tx interrupt in ibmvnic_complete_tx():

if (->tx_queues_active)
netif_wake_subqueue();

To ensure that ->tx_queues_active and state of the queues are consistent,
we need a lock which:

- must also be taken in the interrupt path (ibmvnic_complete_tx())
- shared across the multiple queues in the adapter (so they don't
  become serialized)

Use rcu_read_lock() and have the reset thread synchronize_rcu() after
updating the ->tx_queues_active state.

While here, consolidate a few boolean fields in ibmvnic_adapter for
better alignment.

Based on discussions with Brian King and Dany Madden.

Fixes: 7ed5b31f4a66 ("net/ibmvnic: prevent more than one thread from running in reset")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agobareudp: use ipv6_mod_enabled to check if IPv6 enabled
Hangbin Liu [Tue, 15 Mar 2022 06:26:18 +0000 (14:26 +0800)]
bareudp: use ipv6_mod_enabled to check if IPv6 enabled

[ Upstream commit e077ed58c243afc197bc2a2ba0e1ff61135e4ec2 ]

bareudp_create_sock() use AF_INET6 by default if IPv6 CONFIG enabled.
But if user start kernel with ipv6.disable=1, the bareudp sock will
created failed, which cause the interface open failed even with ethertype
ip. e.g.

 # ip link add bareudp1 type bareudp dstport 2 ethertype ip
 # ip link set bareudp1 up
 RTNETLINK answers: Address family not supported by protocol

Fix it by using ipv6_mod_enabled() to check if IPv6 enabled. There is
no need to check IS_ENABLED(CONFIG_IPV6) as ipv6_mod_enabled() will
return false when CONFIG_IPV6 no enabled in include/linux/ipv6.h.

Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20220315062618.156230-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agocan: isotp: support MSG_TRUNC flag when reading from socket
Oliver Hartkopp [Wed, 16 Mar 2022 16:42:58 +0000 (17:42 +0100)]
can: isotp: support MSG_TRUNC flag when reading from socket

[ Upstream commit 42bf50a1795a1854d48717b7361dbdbce496b16b ]

When providing the MSG_TRUNC flag via recvmsg() syscall the return value
provides the real length of the packet or datagram, even when it was longer
than the passed buffer.

Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
Link: https://github.com/linux-can/can-utils/issues/347#issuecomment-1065932671
Link: https://lore.kernel.org/all/20220316164258.54155-3-socketcan@hartkopp.net
Suggested-by: Derek Will <derekrobertwill@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agocan: isotp: return -EADDRNOTAVAIL when reading from unbound socket
Oliver Hartkopp [Wed, 16 Mar 2022 16:42:57 +0000 (17:42 +0100)]
can: isotp: return -EADDRNOTAVAIL when reading from unbound socket

[ Upstream commit 30ffd5332e06316bd69a654c06aa033872979b7c ]

When reading from an unbound can-isotp socket the syscall blocked
indefinitely. As unbound sockets (without given CAN address information)
do not make sense anyway we directly return -EADDRNOTAVAIL on read()
analogue to the known behavior from sendmsg().

Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
Link: https://github.com/linux-can/can-utils/issues/349
Link: https://lore.kernel.org/all/20220316164258.54155-2-socketcan@hartkopp.net
Suggested-by: Derek Will <derekrobertwill@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agocan: isotp: sanitize CAN ID checks in isotp_bind()
Oliver Hartkopp [Wed, 16 Mar 2022 16:42:56 +0000 (17:42 +0100)]
can: isotp: sanitize CAN ID checks in isotp_bind()

commit 3ea566422cbde9610c2734980d1286ab681bb40e upstream.

Syzbot created an environment that lead to a state machine status that
can not be reached with a compliant CAN ID address configuration.
The provided address information consisted of CAN ID 0x6000001 and 0xC28001
which both boil down to 11 bit CAN IDs 0x001 in sending and receiving.

Sanitize the SFF/EFF CAN ID values before performing the address checks.

Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
Link: https://lore.kernel.org/all/20220316164258.54155-1-socketcan@hartkopp.net
Reported-by: syzbot+2339c27f5c66c652843e@syzkaller.appspotmail.com
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agomt76: mt7921: fix mt7921_queues_acq implementation
Lorenzo Bianconi [Thu, 3 Mar 2022 08:46:41 +0000 (09:46 +0100)]
mt76: mt7921: fix mt7921_queues_acq implementation

[ Upstream commit 849ee6ac9dd3efd0a57cbc98b9a9d6ae87374aff ]

Fix mt7921_queues_acq implementation according to the vendor sdk.

Fixes: 474a9f21e2e20 ("mt76: mt7921: add debugfs support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agonetfilter: flowtable: Fix QinQ and pppoe support for inet table
Pablo Neira Ayuso [Wed, 16 Mar 2022 10:19:43 +0000 (11:19 +0100)]
netfilter: flowtable: Fix QinQ and pppoe support for inet table

[ Upstream commit 0492d857636e1c52cd71594a723c4b26a7b31978 ]

nf_flow_offload_inet_hook() does not check for 802.1q and PPPoE.
Fetch inner ethertype from these encapsulation protocols.

Fixes: 72efd585f714 ("netfilter: flowtable: add pppoe support")
Fixes: 4cd91f7c290f ("netfilter: flowtable: add vlan support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agodrm/i915/display: Fix HPD short pulse handling for eDP
José Roberto de Souza [Fri, 11 Mar 2022 18:51:48 +0000 (10:51 -0800)]
drm/i915/display: Fix HPD short pulse handling for eDP

[ Upstream commit 3a84fd1ed53582b31e843a152ee3219e9e4ccb8c ]

Commit 13ea6db2cf24 ("drm/i915/edp: Ignore short pulse when panel
powered off") completely broke short pulse handling for eDP as it is
usually generated by sink when it is displaying image and there is
some error or status that source needs to handle.

When power panel is enabled, this state is enough to power aux
transactions and VDD override is disabled, so intel_pps_have_power()
is always returning false causing short pulses to be ignored.

So here better naming this function that intends to check if aux
lines are powered to avoid the endless cycle mentioned in the commit
being fixed and fixing the check for what it is intended.

v2:
- renamed to intel_pps_have_panel_power_or_vdd()
- fixed indentation

Fixes: 13ea6db2cf24 ("drm/i915/edp: Ignore short pulse when panel powered off")
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220311185149.110527-1-jose.souza@intel.com
(cherry picked from commit 8f0c1c0949b609acfad62b8d5f742a3b5e7b05ab)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoUSB: storage: ums-realtek: fix error code in rts51x_read_mem()
Dan Carpenter [Fri, 4 Mar 2022 07:35:04 +0000 (10:35 +0300)]
USB: storage: ums-realtek: fix error code in rts51x_read_mem()

[ Upstream commit b07cabb8361dc692522538205552b1b9dab134be ]

The rts51x_read_mem() function should return negative error codes.
Currently if the kmalloc() fails it returns USB_STOR_TRANSPORT_ERROR (3)
which is treated as success by the callers.

Fixes: 065e60964e29 ("ums_realtek: do not use stack memory for DMA")
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220304073504.GA26464@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agosamples/bpf, xdpsock: Fix race when running for fix duration of time
Niklas Söderlund [Tue, 15 Mar 2022 10:29:48 +0000 (11:29 +0100)]
samples/bpf, xdpsock: Fix race when running for fix duration of time

[ Upstream commit 8fa42d78f6354bb96ad3a079dcbef528ca9fa9e0 ]

When running xdpsock for a fix duration of time before terminating
using --duration=<n>, there is a race condition that may cause xdpsock
to terminate immediately.

When running for a fixed duration of time the check to determine when to
terminate execution is in is_benchmark_done() and is being executed in
the context of the poller thread,

    if (opt_duration > 0) {
            unsigned long dt = (get_nsecs() - start_time);

            if (dt >= opt_duration)
                    benchmark_done = true;
    }

However start_time is only set after the poller thread have been
created. This leaves a small window when the poller thread is starting
and calls is_benchmark_done() for the first time that start_time is not
yet set. In that case start_time have its initial value of 0 and the
duration check fails as it do not correlate correctly for the
applications start time and immediately sets benchmark_done which in
turn terminates the xdpsock application.

Fix this by setting start_time before creating the poller thread.

Fixes: d3f11b018f6c ("samples/bpf: xdpsock: Add duration option to specify how long to run")
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220315102948.466436-1-niklas.soderlund@corigine.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agobpf, sockmap: Fix double uncharge the mem of sk_msg
Wang Yufen [Fri, 4 Mar 2022 08:11:45 +0000 (16:11 +0800)]
bpf, sockmap: Fix double uncharge the mem of sk_msg

[ Upstream commit 2486ab434b2c2a14e9237296db00b1e1b7ae3273 ]

If tcp_bpf_sendmsg is running during a tear down operation, psock may be
freed.

tcp_bpf_sendmsg()
 tcp_bpf_send_verdict()
  sk_msg_return()
  tcp_bpf_sendmsg_redir()
   unlikely(!psock))
     sk_msg_free()

The mem of msg has been uncharged in tcp_bpf_send_verdict() by
sk_msg_return(), and would be uncharged by sk_msg_free() again. When psock
is null, we can simply returning an error code, this would then trigger
the sk_msg_free_nocharge in the error path of __SK_REDIRECT and would have
the side effect of throwing an error up to user space. This would be a
slight change in behavior from user side but would look the same as an
error if the redirect on the socket threw an error.

This issue can cause the following info:
WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220304081145.2037182-5-wangyufen@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agobpf, sockmap: Fix more uncharged while msg has more_data
Wang Yufen [Fri, 4 Mar 2022 08:11:44 +0000 (16:11 +0800)]
bpf, sockmap: Fix more uncharged while msg has more_data

[ Upstream commit 84472b436e760ba439e1969a9e3c5ae7c86de39d ]

In tcp_bpf_send_verdict(), if msg has more data after
tcp_bpf_sendmsg_redir():

tcp_bpf_send_verdict()
 tosend = msg->sg.size  //msg->sg.size = 22220
 case __SK_REDIRECT:
  sk_msg_return()  //uncharged msg->sg.size(22220) sk->sk_forward_alloc
  tcp_bpf_sendmsg_redir() //after tcp_bpf_sendmsg_redir, msg->sg.size=11000
 goto more_data;
 tosend = msg->sg.size  //msg->sg.size = 11000
 case __SK_REDIRECT:
  sk_msg_return()  //uncharged msg->sg.size(11000) to sk->sk_forward_alloc

The msg->sg.size(11000) has been uncharged twice, to fix we can charge the
remaining msg->sg.size before goto more data.

This issue can cause the following info:
WARNING: CPU: 0 PID: 9860 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
Call Trace:
 <TASK>
 inet_csk_destroy_sock+0x55/0x110
 __tcp_close+0x279/0x470
 tcp_close+0x1f/0x60
 inet_release+0x3f/0x80
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x92/0x250
 task_work_run+0x6a/0xa0
 do_exit+0x33b/0xb60
 do_group_exit+0x2f/0xa0
 get_signal+0xb6/0x950
 arch_do_signal_or_restart+0xac/0x2a0
 ? vfs_write+0x237/0x290
 exit_to_user_mode_prepare+0xa9/0x200
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x46/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 </TASK>

WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220304081145.2037182-4-wangyufen@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agobpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
Wang Yufen [Fri, 4 Mar 2022 08:11:43 +0000 (16:11 +0800)]
bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full

[ Upstream commit 9c34e38c4a870eb30b13f42f5b44f42e9d19ccb8 ]

If tcp_bpf_sendmsg() is running while sk msg is full. When sk_msg_alloc()
returns -ENOMEM error, tcp_bpf_sendmsg() goes to wait_for_memory. If partial
memory has been alloced by sk_msg_alloc(), that is, msg_tx->sg.size is
greater than osize after sk_msg_alloc(), memleak occurs. To fix we use
sk_msg_trim() to release the allocated memory, then goto wait for memory.

Other call paths of sk_msg_alloc() have the similar issue, such as
tls_sw_sendmsg(), so handle sk_msg_trim logic inside sk_msg_alloc(),
as Cong Wang suggested.

This issue can cause the following info:
WARNING: CPU: 3 PID: 7950 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
Call Trace:
 <TASK>
 inet_csk_destroy_sock+0x55/0x110
 __tcp_close+0x279/0x470
 tcp_close+0x1f/0x60
 inet_release+0x3f/0x80
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x92/0x250
 task_work_run+0x6a/0xa0
 do_exit+0x33b/0xb60
 do_group_exit+0x2f/0xa0
 get_signal+0xb6/0x950
 arch_do_signal_or_restart+0xac/0x2a0
 exit_to_user_mode_prepare+0xa9/0x200
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x46/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 </TASK>

WARNING: CPU: 3 PID: 2094 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 kthread+0xe6/0x110
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220304081145.2037182-3-wangyufen@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agobpf, sockmap: Fix memleak in sk_psock_queue_msg
Wang Yufen [Fri, 4 Mar 2022 08:11:42 +0000 (16:11 +0800)]
bpf, sockmap: Fix memleak in sk_psock_queue_msg

[ Upstream commit 938d3480b92fa5e454b7734294f12a7b75126f09 ]

If tcp_bpf_sendmsg is running during a tear down operation we may enqueue
data on the ingress msg queue while tear down is trying to free it.

 sk1 (redirect sk2)                         sk2
 -------------------                      ---------------
tcp_bpf_sendmsg()
 tcp_bpf_send_verdict()
  tcp_bpf_sendmsg_redir()
   bpf_tcp_ingress()
                                          sock_map_close()
                                           lock_sock()
    lock_sock() ... blocking
                                           sk_psock_stop
                                            sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
                                           release_sock(sk);
    lock_sock()
    sk_mem_charge()
    get_page()
    sk_psock_queue_msg()
     sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED);
      drop_sk_msg()
    release_sock()

While drop_sk_msg(), the msg has charged memory form sk by sk_mem_charge
and has sg pages need to put. To fix we use sk_msg_free() and then kfee()
msg.

This issue can cause the following info:
WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
Call Trace:
 <IRQ>
 inet_csk_destroy_sock+0x55/0x110
 tcp_rcv_state_process+0xe5f/0xe90
 ? sk_filter_trim_cap+0x10d/0x230
 ? tcp_v4_do_rcv+0x161/0x250
 tcp_v4_do_rcv+0x161/0x250
 tcp_v4_rcv+0xc3a/0xce0
 ip_protocol_deliver_rcu+0x3d/0x230
 ip_local_deliver_finish+0x54/0x60
 ip_local_deliver+0xfd/0x110
 ? ip_protocol_deliver_rcu+0x230/0x230
 ip_rcv+0xd6/0x100
 ? ip_local_deliver+0x110/0x110
 __netif_receive_skb_one_core+0x85/0xa0
 process_backlog+0xa4/0x160
 __napi_poll+0x29/0x1b0
 net_rx_action+0x287/0x300
 __do_softirq+0xff/0x2fc
 do_softirq+0x79/0x90
 </IRQ>

WARNING: CPU: 0 PID: 531 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x175/0x1b0
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 ? process_one_work+0x3c0/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 9635720b7c88 ("bpf, sockmap: Fix memleak on ingress msg enqueue")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220304081145.2037182-2-wangyufen@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRDMA/mlx5: Fix memory leak in error flow for subscribe event routine
Yongzhi Liu [Fri, 11 Mar 2022 17:06:01 +0000 (09:06 -0800)]
RDMA/mlx5: Fix memory leak in error flow for subscribe event routine

[ Upstream commit 087f9c3f2309ed183f7e4b85ae57121d8663224d ]

In case the second xa_insert() fails, the obj_event is not released.  Fix
the error unwind flow to free that memory to avoid a memory leak.

Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX")
Link: https://lore.kernel.org/r/1647018361-18266-1-git-send-email-lyz_cs@pku.edu.cn
Signed-off-by: Yongzhi Liu <lyz_cs@pku.edu.cn>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRevert "RDMA/core: Fix ib_qp_usecnt_dec() called when error"
Leon Romanovsky [Wed, 9 Mar 2022 18:42:13 +0000 (20:42 +0200)]
Revert "RDMA/core: Fix ib_qp_usecnt_dec() called when error"

[ Upstream commit 7922d3de4d270a9aedb71212fc0d5ae697ced516 ]

This reverts commit 7c4a539ec38f4ce400a0f3fcb5ff6c940fcd67bb. which causes
to the following error in mlx4.

  Destroy of kernel CQ shouldn't fail
  WARNING: CPU: 4 PID: 18064 at include/rdma/ib_verbs.h:3936 mlx4_ib_dealloc_xrcd+0x12e/0x1b0 [mlx4_ib]
  Modules linked in: bonding ib_ipoib ip_gre ipip tunnel4 geneve rdma_ucm nf_tables ib_umad mlx4_en mlx4_ib ib_uverbs mlx4_core ip6_gre gre ip6_tunnel tunnel6 iptable_raw openvswitch nsh rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay fuse [last unloaded: mlx4_core]
  CPU: 4 PID: 18064 Comm: ibv_xsrq_pingpo Not tainted 5.17.0-rc7_master_62c6ecb #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:mlx4_ib_dealloc_xrcd+0x12e/0x1b0 [mlx4_ib]
  Code: 1e 93 08 00 40 80 fd 01 0f 87 fa f1 04 00 83 e5 01 0f 85 2b ff ff ff 48 c7 c7 20 4f b6 a0 c6 05 fd 92 08 00 01 e8 47 c9 82 e2 <0f> 0b e9 11 ff ff ff 0f b6 2d eb 92 08 00 40 80 fd 01 0f 87 b1 f1
  RSP: 0018:ffff8881a4957750 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff8881ac4b6800 RCX: 0000000000000000
  RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed103492aedc
  RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8884d2e378eb
  R10: ffffed109a5c6f1d R11: 0000000000000001 R12: ffff888132620000
  R13: ffff8881a4957a90 R14: ffff8881aa2d4000 R15: ffff8881a4957ad0
  FS:  00007f0401747740(0000) GS:ffff8884d2e00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055f8ae036118 CR3: 000000012fe94005 CR4: 0000000000370ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   ib_dealloc_xrcd_user+0xce/0x120 [ib_core]
   ib_uverbs_dealloc_xrcd+0xad/0x210 [ib_uverbs]
   uverbs_free_xrcd+0xe8/0x190 [ib_uverbs]
   destroy_hw_idr_uobject+0x7a/0x130 [ib_uverbs]
   uverbs_destroy_uobject+0x164/0x730 [ib_uverbs]
   uobj_destroy+0x72/0xf0 [ib_uverbs]
   ib_uverbs_cmd_verbs+0x19fb/0x3110 [ib_uverbs]
   ib_uverbs_ioctl+0x169/0x260 [ib_uverbs]
   __x64_sys_ioctl+0x856/0x1550
   do_syscall_64+0x3d/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 7c4a539ec38f ("RDMA/core: Fix ib_qp_usecnt_dec() called when error")
Link: https://lore.kernel.org/r/74c11029adaf449b3b9228a77cc82f39e9e892c8.1646851220.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRDMA/irdma: Prevent some integer underflows
Dan Carpenter [Mon, 7 Mar 2022 12:59:28 +0000 (15:59 +0300)]
RDMA/irdma: Prevent some integer underflows

[ Upstream commit 6f6dbb819dfc1a35bcb8b709b5c83a3ea8beff75 ]

My static checker complains that:

    drivers/infiniband/hw/irdma/ctrl.c:3605 irdma_sc_ceq_init()
    warn: can subtract underflow 'info->dev->hmc_fpm_misc.max_ceqs'?

It appears that "info->dev->hmc_fpm_misc.max_ceqs" comes from the firmware
in irdma_sc_parse_fpm_query_buf() so, yes, there is a chance that it could
be zero.  Even if we trust the firmware, it's easy enough to change the
condition just as a hardenning measure.

Fixes: 3f49d6842569 ("RDMA/irdma: Implement HW Admin Queue OPs")
Link: https://lore.kernel.org/r/20220307125928.GE16710@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopower: ab8500_chargalg: Use CLOCK_MONOTONIC
Linus Walleij [Tue, 8 Mar 2022 15:44:25 +0000 (16:44 +0100)]
power: ab8500_chargalg: Use CLOCK_MONOTONIC

[ Upstream commit c22fca40522e2be8af168f3087d87d85e404ea72 ]

The HRTimer in the AB8500 charging code is using CLOCK_REALTIME
to set an alarm some hours forward in time +/- 5 min for a safety
timer.

I have observed that this will sometimes fire sporadically
early when charging a battery with the result that
charging stops.

As CLOCK_REALTIME can be subject to adjustments of time from
sources such as NTP, this cannot be trusted and will likely
for example fire events if the clock is set forward some hours
by say NTP.

Use CLOCK_MONOTONIC as indicated in other instances and the
problem goes away. Also initialize the timer to REL mode
as this is what will be used later.

Fixes: 257107ae6b9b ("ab8500-chargalg: Use hrtimer")
Cc: Lee Jones <lee.jones@linaro.org>
Suggested-by: Matti Vaittinen <mazziesaccount@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agomtd: rawnand: atmel: fix refcount issue in atmel_nand_controller_init
Xin Xiong [Fri, 4 Mar 2022 08:53:32 +0000 (16:53 +0800)]
mtd: rawnand: atmel: fix refcount issue in atmel_nand_controller_init

[ Upstream commit fecbd4a317c95d73c849648c406bcf1b6a0ec1cf ]

The reference counting issue happens in several error handling paths
on a refcounted object "nc->dmac". In these paths, the function simply
returns the error code, forgetting to balance the reference count of
"nc->dmac", increased earlier by dma_request_channel(), which may
cause refcount leaks.

Fix it by decrementing the refcount of specific object in those error
paths.

Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Co-developed-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Co-developed-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220304085330.3610-1-xiongx18@fudan.edu.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoMIPS: pgalloc: fix memory leak caused by pgd_free()
Yaliang Wang [Thu, 10 Mar 2022 11:31:16 +0000 (19:31 +0800)]
MIPS: pgalloc: fix memory leak caused by pgd_free()

[ Upstream commit 2bc5bab9a763d520937e4f3fe8df51c6a1eceb97 ]

pgd page is freed by generic implementation pgd_free() since commit
f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
however, there are scenarios that the system uses more than one page as
the pgd table, in such cases the generic implementation pgd_free() won't
be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
will be set as "1", which will cause allocating two pages as the pgd
table. Well, at the same time, the generic implementation pgd_free()
just free one pgd page, which will result in the memory leak.

The memory leak can be easily detected by executing shell command:
"while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"

Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoMIPS: RB532: fix return value of __setup handler
Randy Dunlap [Sat, 12 Mar 2022 04:20:26 +0000 (20:20 -0800)]
MIPS: RB532: fix return value of __setup handler

[ Upstream commit 8755d57ba1ff910666572fab9e32890e8cc6ed3b ]

__setup() handlers should return 1 to obsolete_checksetup() in
init/main.c to indicate that the boot option has been handled.
A return of 0 causes the boot option/value to be listed as an Unknown
kernel parameter and added to init's (limited) argument or environment
strings. Also, error return codes don't mean anything to
obsolete_checksetup() -- only non-zero (usually 1) or zero.
So return 1 from setup_kmac().

Fixes: 9e21c7e40b7e ("MIPS: RB532: Replace parse_mac_addr() with mac_pton().")
Fixes: 73b4390fb234 ("[MIPS] Routerboard 532: Support for base system")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
From: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Phil Sutter <n0-1@freewrt.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Daniel Walter <dwalter@google.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agomips: cdmm: Fix refcount leak in mips_cdmm_phys_base
Miaoqian Lin [Wed, 9 Mar 2022 09:17:10 +0000 (09:17 +0000)]
mips: cdmm: Fix refcount leak in mips_cdmm_phys_base

[ Upstream commit 4528668ca331f7ce5999b7746657b46db5b3b785 ]

The of_find_compatible_node() function returns a node pointer with
refcount incremented, We should use of_node_put() on it when done
Add the missing of_node_put() to release the refcount.

Fixes: 2121aa3e2312 ("mips: cdmm: Add mti,mips-cdmm dtb node support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoath10k: Fix error handling in ath10k_setup_msa_resources
Miaoqian Lin [Tue, 8 Mar 2022 07:02:38 +0000 (07:02 +0000)]
ath10k: Fix error handling in ath10k_setup_msa_resources

[ Upstream commit 9747a78d5f758a5284751a10aee13c30d02bd5f1 ]

The device_node pointer is returned by of_parse_phandle() with refcount
incremented. We should use of_node_put() on it when done.

This function only calls of_node_put() in the regular path.
And it will cause refcount leak in error path.

Fixes: 727fec790ead ("ath10k: Setup the msa resources before qmi init")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220308070238.19295-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agovxcan: enable local echo for sent CAN frames
Oliver Hartkopp [Wed, 9 Mar 2022 12:04:16 +0000 (13:04 +0100)]
vxcan: enable local echo for sent CAN frames

[ Upstream commit 259bdba27e32368b4404f69d613b1c1014c07cbf ]

The vxcan driver provides a pair of virtual CAN interfaces to exchange
CAN traffic between different namespaces - analogue to veth.

In opposite to the vcan driver the local sent CAN traffic on this interface
is not echo'ed back but only sent to the remote peer. This is unusual and
can be easily fixed by removing IFF_ECHO from the netdevice flags that
are set for vxcan interfaces by default at startup.

Without IFF_ECHO set on driver level, the local sent CAN frames are echo'ed
in af_can.c in can_send(). This patch makes vxcan interfaces adopt the
same local echo behavior and procedures as known from the vcan interfaces.

Fixes: a8f820a380a2 ("can: add Virtual CAN Tunnel driver (vxcan)")
Link: https://lore.kernel.org/all/20220309120416.83514-5-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopowerpc: 8xx: fix a return value error in mpc8xx_pic_init
Hangyu Hua [Wed, 23 Feb 2022 07:02:23 +0000 (15:02 +0800)]
powerpc: 8xx: fix a return value error in mpc8xx_pic_init

[ Upstream commit 3fd46e551f67f4303c3276a0d6cd20baf2d192c4 ]

mpc8xx_pic_init() should return -ENOMEM instead of 0 when
irq_domain_add_linear() return NULL. This cause mpc8xx_pics_init to continue
executing even if mpc8xx_pic_host is NULL.

Fixes: cc76404feaed ("powerpc/8xx: Fix possible device node reference leak")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220223070223.26845-1-hbh25y@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoscsi: mpt3sas: Fix incorrect 4GB boundary check
Sreekanth Reddy [Thu, 3 Mar 2022 14:02:30 +0000 (19:32 +0530)]
scsi: mpt3sas: Fix incorrect 4GB boundary check

[ Upstream commit 208cc9fe6f21112b5cc6cb87065fb8ab66e79316 ]

The driver must perform its 4GB boundary check using the pool's DMA address
instead of using the virtual address.

Link: https://lore.kernel.org/r/20220303140230.13098-1-sreekanth.reddy@broadcom.com
Fixes: d6adc251dd2f ("scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoplatform/x86: huawei-wmi: check the return value of device_create_file()
Jia-Ju Bai [Thu, 3 Mar 2022 02:24:21 +0000 (18:24 -0800)]
platform/x86: huawei-wmi: check the return value of device_create_file()

[ Upstream commit c91a5b1c221a58d008485cf7d02ccce73108b119 ]

The function device_create_file() in huawei_wmi_battery_add() can fail,
so its return value should be checked.

Fixes: 355a070b09ab ("platform/x86: huawei-wmi: Add battery charging thresholds")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Link: https://lore.kernel.org/r/20220303022421.313-1-baijiaju1990@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoselftests/bpf: Make test_lwt_ip_encap more stable and faster
Felix Maurer [Thu, 3 Mar 2022 11:15:26 +0000 (12:15 +0100)]
selftests/bpf: Make test_lwt_ip_encap more stable and faster

[ Upstream commit d23a8720327d33616f584d76c80824bfa4699be6 ]

In test_lwt_ip_encap, the ingress IPv6 encap test failed from time to
time. The failure occured when an IPv4 ping through the IPv6 GRE
encapsulation did not receive a reply within the timeout. The IPv4 ping
and the IPv6 ping in the test used different timeouts (1 sec for IPv4
and 6 sec for IPv6), probably taking into account that IPv6 might need
longer to successfully complete. However, when IPv4 pings (with the
short timeout) are encapsulated into the IPv6 tunnel, the delays of IPv6
apply.

The actual reason for the long delays with IPv6 was that the IPv6
neighbor discovery sometimes did not complete in time. This was caused
by the outgoing interface only having a tentative link local address,
i.e., not having completed DAD for that lladdr. The ND was successfully
retried after 1 sec but that was too late for the ping timeout.

The IPv6 addresses for the test were already added with nodad. However,
for the lladdrs, DAD was still performed. We now disable DAD in the test
netns completely and just assume that the two lladdrs on each veth pair
do not collide. This removes all the delays for IPv6 traffic in the
test.

Without the delays, we can now also reduce the delay of the IPv6 ping to
1 sec. This makes the whole test complete faster because we don't need
to wait for the excessive timeout for each IPv6 ping that is supposed
to fail.

Fixes: 0fde56e4385b0 ("selftests: bpf: add test_lwt_ip_encap selftest")
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/4987d549d48b4e316cd5b3936de69c8d4bc75a4f.1646305899.git.fmaurer@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agolibbpf: Unmap rings when umem deleted
lic121 [Tue, 1 Mar 2022 13:26:23 +0000 (13:26 +0000)]
libbpf: Unmap rings when umem deleted

[ Upstream commit 9c6e6a80ee741adf6cb3cfd8eef7d1554f91fceb ]

xsk_umem__create() does mmap for fill/comp rings, but xsk_umem__delete()
doesn't do the unmap. This works fine for regular cases, because
xsk_socket__delete() does unmap for the rings. But for the case that
xsk_socket__create_shared() fails, umem rings are not unmapped.

fill_save/comp_save are checked to determine if rings have already be
unmapped by xsk. If fill_save and comp_save are NULL, it means that the
rings have already been used by xsk. Then they are supposed to be
unmapped by xsk_socket__delete(). Otherwise, xsk_umem__delete() does the
unmap.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Cheng Li <lic121@chinatelecom.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220301132623.GA19995@vscode.7~
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agomfd: mc13xxx: Add check for mc13xxx_irq_request
Jiasheng Jiang [Thu, 24 Feb 2022 02:23:31 +0000 (10:23 +0800)]
mfd: mc13xxx: Add check for mc13xxx_irq_request

[ Upstream commit e477e51a41cb5d6034f3c5ea85a71ad4613996b9 ]

As the potential failure of the devm_request_threaded_irq(),
it should be better to check the return value of the
mc13xxx_irq_request() and return error if fails.

Fixes: 8e00593557c3 ("mfd: Add mc13892 support to mc13xxx")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20220224022331.3208275-1-jiasheng@iscas.ac.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopowerpc/64s: Don't use DSISR for SLB faults
Michael Ellerman [Tue, 22 Feb 2022 11:34:49 +0000 (22:34 +1100)]
powerpc/64s: Don't use DSISR for SLB faults

[ Upstream commit d4679ac8ea2e5078704aa1c026db36580cc1bf9a ]

Since commit 46ddcb3950a2 ("powerpc/mm: Show if a bad page fault on data
is read or write.") we use page_fault_is_write(regs->dsisr) in
__bad_page_fault() to determine if the fault is for a read or write, and
change the message printed accordingly.

But SLB faults, aka Data Segment Interrupts, don't set DSISR (Data
Storage Interrupt Status Register) to a useful value. All ISA versions
from v2.03 through v3.1 specify that the Data Segment Interrupt sets
DSISR "to an undefined value". As far as I can see there's no mention of
SLB faults setting DSISR in any BookIV content either.

This manifests as accesses that should be a read being incorrectly
reported as writes, for example, using the xmon "dump" command:

  0:mon> d 0x5deadbeef0000000
  5deadbeef0000000
  [359526.415354][    C6] BUG: Unable to handle kernel data access on write at 0x5deadbeef0000000
  [359526.415611][    C6] Faulting instruction address: 0xc00000000010a300
  cpu 0x6: Vector: 380 (Data SLB Access) at [c00000000ffbf400]
      pc: c00000000010a300: mread+0x90/0x190

If we disassemble the PC, we see a load instruction:

  0:mon> di c00000000010a300
  c00000000010a300 89490000      lbz     r10,0(r9)

We can also see in exceptions-64s.S that the data_access_slb block
doesn't set IDSISR=1, which means it doesn't load DSISR into pt_regs. So
the value we're using to determine if the fault is a read/write is some
stale value in pt_regs from a previous page fault.

Rework the printing logic to separate the SLB fault case out, and only
print read/write in the cases where we can determine it.

The result looks like eg:

  0:mon> d 0x5deadbeef0000000
  5deadbeef0000000
  [  721.779525][    C6] BUG: Unable to handle kernel data access at 0x5deadbeef0000000
  [  721.779697][    C6] Faulting instruction address: 0xc00000000014cbe0
  cpu 0x6: Vector: 380 (Data SLB Access) at [c00000000ffbf390]

  0:mon> d 0
  0000000000000000
  [  742.793242][    C6] BUG: Kernel NULL pointer dereference at 0x00000000
  [  742.793316][    C6] Faulting instruction address: 0xc00000000014cbe0
  cpu 0x6: Vector: 380 (Data SLB Access) at [c00000000ffbf390]

Fixes: 46ddcb3950a2 ("powerpc/mm: Show if a bad page fault on data is read or write.")
Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Link: https://lore.kernel.org/r/20220222113449.319193-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopowerpc/sysdev: fix incorrect use to determine if list is empty
Jakob Koschel [Mon, 28 Feb 2022 14:24:33 +0000 (15:24 +0100)]
powerpc/sysdev: fix incorrect use to determine if list is empty

[ Upstream commit fa1321b11bd01752f5be2415e74a0e1a7c378262 ]

'gtm' will *always* be set by list_for_each_entry().
It is incorrect to assume that the iterator value will be NULL if the
list is empty.

Instead of checking the pointer it should be checked if
the list is empty.

Fixes: 83ff9dcf375c ("powerpc/sysdev: implement FSL GTM support")
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220228142434.576226-1-jakobkoschel@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoMIPS: Sanitise Cavium switch cases in TLB handler synthesizers
Maciej W. Rozycki [Fri, 4 Mar 2022 21:13:11 +0000 (21:13 +0000)]
MIPS: Sanitise Cavium switch cases in TLB handler synthesizers

[ Upstream commit 6ddcba9d480b6bcced4223a729794dfa6becb7eb ]

It makes no sense to fall through to `break'.  Therefore reorder the
switch statements so as to have the Cavium cases first, followed by the
default case, which improves readability and pacifies code analysis
tools.  No change in semantics, assembly produced is exactly the same.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: bc431d2153cc ("MIPS: Fix fall-through warnings for Clang")
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agomips: DEC: honor CONFIG_MIPS_FP_SUPPORT=n
Randy Dunlap [Fri, 4 Mar 2022 16:24:27 +0000 (08:24 -0800)]
mips: DEC: honor CONFIG_MIPS_FP_SUPPORT=n

[ Upstream commit 97bf0395c226907e1a9b908511a35192bf1e09bb ]

Include the DECstation interrupt handler in opting out of
FPU support.

Fixes a linker error:

mips-linux-ld: arch/mips/dec/int-handler.o: in function `fpu':
(.text+0x148): undefined reference to `handle_fpe_int'

Fixes: 183b40f992c8 ("MIPS: Allow FP support to be disabled")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: linux-mips@vger.kernel.org
Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agodrm/msm/a6xx: Fix missing ARRAY_SIZE() check
Rob Clark [Sat, 5 Mar 2022 17:34:03 +0000 (09:34 -0800)]
drm/msm/a6xx: Fix missing ARRAY_SIZE() check

[ Upstream commit cca96584b35765bf9eb5f38ca55a144ea2ba0de4 ]

Fixes: f6d62d091cfd ("drm/msm/a6xx: add support for Adreno 660 GPU")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220305173405.914989-1-robdclark@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agonet: axienet: fix RX ring refill allocation failure handling
Robert Hancock [Sat, 5 Mar 2022 02:24:37 +0000 (20:24 -0600)]
net: axienet: fix RX ring refill allocation failure handling

[ Upstream commit 7a7d340ba4d9351e4c8847b898a2b996727a922a ]

If a memory allocation error occurred during an attempt to refill a slot
in the RX ring after the packet was received, the hardware tail pointer
would still have been updated to point to or past the slot which remained
marked as previously completed. This would likely result in the DMA engine
raising an error when it eventually tried to use that slot again.

If a slot cannot be refilled, then just stop processing and do not move
the tail pointer past it. On the next attempt, we should skip receiving
the packet from the empty slot and just try to refill it again.

This failure mode has not actually been observed, but was found as part
of other driver updates.

Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoPCI: Reduce warnings on possible RW1C corruption
Mark Tomlinson [Thu, 6 Aug 2020 04:14:55 +0000 (16:14 +1200)]
PCI: Reduce warnings on possible RW1C corruption

[ Upstream commit 92c45b63ce22c8898aa41806e8d6692bcd577510 ]

For hardware that only supports 32-bit writes to PCI there is the
possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
messages was introduced by fb2659230120, but rate-limiting is not the best
choice here. Some devices may not show the warnings they should if another
device has just produced a bunch of warnings. Also, the number of messages
can be a nuisance on devices which are otherwise working fine.

Change the ratelimit to a single warning per bus. This ensures no bus is
'starved' of emitting a warning and also that there isn't a continuous
stream of warnings. It would be preferable to have a warning per device,
but the pci_dev structure is not available here, and a lookup from devfn
would be far too slow.

Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
Link: https://lore.kernel.org/r/20200806041455.11070-1-mark.tomlinson@alliedtelesis.co.nz
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRDMA/core: Fix ib_qp_usecnt_dec() called when error
Yajun Deng [Thu, 3 Mar 2022 02:42:32 +0000 (10:42 +0800)]
RDMA/core: Fix ib_qp_usecnt_dec() called when error

[ Upstream commit 7c4a539ec38f4ce400a0f3fcb5ff6c940fcd67bb ]

ib_destroy_qp() would called by ib_create_qp_user() if error, the former
contains ib_qp_usecnt_dec(), but ib_qp_usecnt_inc() was not called before.

So move ib_qp_usecnt_inc() into create_qp().

Fixes: d2b10794fc13 ("RDMA/core: Create clean QP creations interface for uverbs")
Link: https://lore.kernel.org/r/20220303024232.2847388-1-yajun.deng@linux.dev
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoIB/hfi1: Allow larger MTU without AIP
Mike Marciniszyn [Tue, 8 Feb 2022 19:25:09 +0000 (14:25 -0500)]
IB/hfi1: Allow larger MTU without AIP

[ Upstream commit b135e324d7a2e7fa0a7ef925076136e799b79f44 ]

The AIP code signals the phys_mtu in the following query_port()
fragment:

props->phys_mtu = HFI1_CAP_IS_KSET(AIP) ? hfi1_max_mtu :
ib_mtu_enum_to_int(props->max_mtu);

Using the largest MTU possible should not depend on AIP.

Fix by unconditionally using the hfi1_max_mtu value.

Fixes: 6d72344cf6c4 ("IB/ipoib: Increase ipoib Datagram mode MTU's upper limit")
Link: https://lore.kernel.org/r/1644348309-174874-1-git-send-email-mike.marciniszyn@cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopower: supply: wm8350-power: Add missing free in free_charger_irq
Jiasheng Jiang [Thu, 3 Mar 2022 09:43:22 +0000 (17:43 +0800)]
power: supply: wm8350-power: Add missing free in free_charger_irq

[ Upstream commit 6dee930f6f6776d1e5a7edf542c6863b47d9f078 ]

In free_charger_irq(), there is no free for 'WM8350_IRQ_CHG_FAST_RDY'.
Therefore, it should be better to add it in order to avoid the memory leak.

Fixes: 14431aa0c5a4 ("power_supply: Add support for WM8350 PMU")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopower: supply: wm8350-power: Handle error for wm8350_register_irq
Jiasheng Jiang [Fri, 4 Mar 2022 01:57:51 +0000 (09:57 +0800)]
power: supply: wm8350-power: Handle error for wm8350_register_irq

[ Upstream commit b0b14b5ba11bec56fad344a4a0b2e16449cc8b94 ]

As the potential failure of the wm8350_register_irq(),
it should be better to check it and return error if fails.
Also, use 'free_' in order to avoid same code.

Fixes: 14431aa0c5a4 ("power_supply: Add support for WM8350 PMU")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoi2c: xiic: Make bus names unique
Robert Hancock [Thu, 27 Jan 2022 17:50:13 +0000 (11:50 -0600)]
i2c: xiic: Make bus names unique

[ Upstream commit 1d366c2f9df8279df2adbb60471f86fc40a1c39e ]

This driver is for an FPGA logic core, so there can be arbitrarily many
instances of the bus on a given system. Previously all of the I2C bus
names were "xiic-i2c" which caused issues with lm_sensors when trying to
map human-readable names to sensor inputs because it could not properly
distinguish the busses, for example. Append the platform device name to
the I2C bus name so it is unique between different instances.

Fixes: e1d5b6598cdc ("i2c: Add support for Xilinx XPS IIC Bus Interface")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agohv_balloon: rate-limit "Unhandled message" warning
Anssi Hannula [Tue, 22 Feb 2022 14:14:00 +0000 (16:14 +0200)]
hv_balloon: rate-limit "Unhandled message" warning

[ Upstream commit 1d7286729aa616772be334eb908e11f527e1e291 ]

For a couple of times I have encountered a situation where

  hv_balloon: Unhandled message: type: 12447

is being flooded over 1 million times per second with various values,
filling the log and consuming cycles, making debugging difficult.

Add rate limiting to the message.

Most other Hyper-V drivers already have similar rate limiting in their
message callbacks.

The cause of the floods in my case was probably fixed by 96d9d1fa5cd5
("Drivers: hv: balloon: account for vmbus packet header in
max_pkt_size").

Fixes: 9aa8b50b2b3d ("Drivers: hv: Add Hyper-V balloon driver")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220222141400.98160-1-anssi.hannula@bitwise.fi
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoKVM: x86/emulator: Defer not-present segment check in __load_segment_descriptor()
Hou Wenlong [Tue, 8 Feb 2022 09:34:03 +0000 (17:34 +0800)]
KVM: x86/emulator: Defer not-present segment check in __load_segment_descriptor()

[ Upstream commit ca85f002258fdac3762c57d12d5e6e401b6a41af ]

Per Intel's SDM on the "Instruction Set Reference", when
loading segment descriptor, not-present segment check should
be after all type and privilege checks. But the emulator checks
it first, then #NP is triggered instead of #GP if privilege fails
and segment is not present. Put not-present segment check after
type and privilege checks in __load_segment_descriptor().

Fixes: 38ba30ba51a00 (KVM: x86 emulator: Emulate task switch in emulator.c)
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Message-Id: <52573c01d369f506cadcf7233812427cf7db81a7.1644292363.git.houwenlong.hwl@antgroup.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoKVM: x86: Fix emulation in writing cr8
Zhenzhong Duan [Thu, 10 Feb 2022 09:45:06 +0000 (17:45 +0800)]
KVM: x86: Fix emulation in writing cr8

[ Upstream commit f66af9f222f08d5b11ea41c1bd6c07a0f12daa07 ]

In emulation of writing to cr8, one of the lowest four bits in TPR[3:0]
is kept.

According to Intel SDM 10.8.6.1(baremetal scenario):
"APIC.TPR[bits 7:4] = CR8[bits 3:0], APIC.TPR[bits 3:0] = 0";

and SDM 28.3(use TPR shadow):
"MOV to CR8. The instruction stores bits 3:0 of its source operand into
bits 7:4 of VTPR; the remainder of VTPR (bits 3:0 and bits 31:8) are
cleared.";

and AMD's APM 16.6.4:
"Task Priority Sub-class (TPS)-Bits 3 : 0. The TPS field indicates the
current sub-priority to be used when arbitrating lowest-priority messages.
This field is written with zero when TPR is written using the architectural
CR8 register.";

so in KVM emulated scenario, clear TPR[3:0] to make a consistent behavior
as in other scenarios.

This doesn't impact evaluation and delivery of pending virtual interrupts
because processor does not use the processor-priority sub-class to
determine which interrupts to delivery and which to inhibit.

Sub-class is used by hardware to arbitrate lowest priority interrupts,
but KVM just does a round-robin style delivery.

Fixes: b93463aa59d6 ("KVM: Accelerated apic support")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220210094506.20181-1-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopowerpc/Makefile: Don't pass -mcpu=powerpc64 when building 32-bit
Michael Ellerman [Tue, 15 Feb 2022 11:28:58 +0000 (22:28 +1100)]
powerpc/Makefile: Don't pass -mcpu=powerpc64 when building 32-bit

[ Upstream commit 2863dd2db23e0407f6c50b8ba5c0e55abef894f1 ]

When CONFIG_GENERIC_CPU=y (true for all our defconfigs) we pass
-mcpu=powerpc64 to the compiler, even when we're building a 32-bit
kernel.

This happens because we have an ifdef CONFIG_PPC_BOOK3S_64/else block in
the Makefile that was written before 32-bit supported GENERIC_CPU. Prior
to that the else block only applied to 64-bit Book3E.

The GCC man page says -mcpu=powerpc64 "[specifies] a pure ... 64-bit big
endian PowerPC ... architecture machine [type], with an appropriate,
generic processor model assumed for scheduling purposes."

It's unclear how that interacts with -m32, which we are also passing,
although obviously -m32 is taking precedence in some sense, as the
32-bit kernel only contains 32-bit instructions.

This was noticed by inspection, not via any bug reports, but it does
affect code generation. Comparing before/after code generation, there
are some changes to instruction scheduling, and the after case (with
-mcpu=powerpc64 removed) the compiler seems more keen to use r8.

Fix it by making the else case only apply to Book3E 64, which excludes
32-bit.

Fixes: 0e00a8c9fd92 ("powerpc: Allow CPU selection also on PPC32")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220215112858.304779-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopowerpc/mm/numa: skip NUMA_NO_NODE onlining in parse_numa_properties()
Daniel Henrique Barboza [Thu, 24 Feb 2022 18:23:12 +0000 (15:23 -0300)]
powerpc/mm/numa: skip NUMA_NO_NODE onlining in parse_numa_properties()

[ Upstream commit 749ed4a20657bcea66a6e082ca3dc0d228cbec80 ]

Executing node_set_online() when nid = NUMA_NO_NODE results in an
undefined behavior. node_set_online() will call node_set_state(), into
__node_set(), into set_bit(), and since NUMA_NO_NODE is -1 we'll end up
doing a negative shift operation inside
arch/powerpc/include/asm/bitops.h. This potential UB was detected
running a kernel with CONFIG_UBSAN.

The behavior was introduced by commit 10f78fd0dabb ("powerpc/numa: Fix a
regression on memoryless node 0"), where the check for nid > 0 was
removed to fix a problem that was happening with nid = 0, but the result
is that now we're trying to online NUMA_NO_NODE nids as well.

Checking for nid >= 0 will allow node 0 to be onlined while avoiding
this UB with NUMA_NO_NODE.

Fixes: 10f78fd0dabb ("powerpc/numa: Fix a regression on memoryless node 0")
Reported-by: Ping Fang <pifang@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220224182312.1012527-1-danielhb413@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agolibbpf: Skip forward declaration when counting duplicated type names
Xu Kuohai [Tue, 1 Mar 2022 05:32:49 +0000 (00:32 -0500)]
libbpf: Skip forward declaration when counting duplicated type names

[ Upstream commit 4226961b0019b2e1612029e8950a9e911affc995 ]

Currently if a declaration appears in the BTF before the definition, the
definition is dumped as a conflicting name, e.g.:

    $ bpftool btf dump file vmlinux format raw | grep "'unix_sock'"
    [81287] FWD 'unix_sock' fwd_kind=struct
    [89336] STRUCT 'unix_sock' size=1024 vlen=14

    $ bpftool btf dump file vmlinux format c | grep "struct unix_sock"
    struct unix_sock;
    struct unix_sock___2 { <--- conflict, the "___2" is unexpected
    struct unix_sock___2 *unix_sk;

This causes a compilation error if the dump output is used as a header file.

Fix it by skipping declaration when counting duplicated type names.

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220301053250.1464204-2-xukuohai@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agogpu: host1x: Fix a memory leak in 'host1x_remove()'
Christophe JAILLET [Sun, 7 Nov 2021 21:16:36 +0000 (22:16 +0100)]
gpu: host1x: Fix a memory leak in 'host1x_remove()'

[ Upstream commit 025c6643a81564f066d8381b9e2f4603e0f8438f ]

Add a missing 'host1x_channel_list_free()' call in the remove function,
as already done in the error handling path of the probe function.

Fixes: 8474b02531c4 ("gpu: host1x: Refactor channel allocation code")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRDMA/irdma: Remove incorrect masking of PD
Mustafa Ismail [Fri, 25 Feb 2022 16:32:11 +0000 (10:32 -0600)]
RDMA/irdma: Remove incorrect masking of PD

[ Upstream commit 17850f2b0b4b806e47cc44df94186bfc2cdd490b ]

The PD id is masked with 0x7fff, while PD can be 18 bits for GEN2 HW.
Remove the masking as it should not be needed and can cause incorrect PD
id to be used.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20220225163211.127-4-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRDMA/irdma: Fix Passthrough mode in VM
Mustafa Ismail [Fri, 25 Feb 2022 16:32:10 +0000 (10:32 -0600)]
RDMA/irdma: Fix Passthrough mode in VM

[ Upstream commit b200189626b5cefbaf8be9cadd7a28215e065bb9 ]

Using PCI_FUNC macro in a VM, when the device is in passthrough mode does
not provide the real function instance. This means that currently, devices
will not probe unless the instance in the VM matches the instance in the
host.

Fix this by getting the pf_id from the LAN during the probe.

Fixes: 8498a30e1b94 ("RDMA/irdma: Register auxiliary driver and implement private channel OPs")
Link: https://lore.kernel.org/r/20220225163211.127-3-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRDMA/irdma: Fix netdev notifications for vlan's
Mustafa Ismail [Fri, 25 Feb 2022 16:32:09 +0000 (10:32 -0600)]
RDMA/irdma: Fix netdev notifications for vlan's

[ Upstream commit 6702bc14744847842a87fed21a795b6e8bab6965 ]

Currently, events on vlan netdevs are being ignored. Fix this by finding
the real netdev and processing the notifications for vlan netdevs.

Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions")
Link: https://lore.kernel.org/r/20220225163211.127-2-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoxsk: Fix race at socket teardown
Magnus Karlsson [Mon, 28 Feb 2022 09:45:52 +0000 (10:45 +0100)]
xsk: Fix race at socket teardown

[ Upstream commit 18b1ab7aa76bde181bdb1ab19a87fa9523c32f21 ]

Fix a race in the xsk socket teardown code that can lead to a NULL pointer
dereference splat. The current xsk unbind code in xsk_unbind_dev() starts by
setting xs->state to XSK_UNBOUND, sets xs->dev to NULL and then waits for any
NAPI processing to terminate using synchronize_net(). After that, the release
code starts to tear down the socket state and free allocated memory.

  BUG: kernel NULL pointer dereference, address: 00000000000000c0
  PGD 8000000932469067 P4D 8000000932469067 PUD 0
  Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 25 PID: 69132 Comm: grpcpp_sync_ser Tainted: G          I       5.16.0+ #2
  Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.2.10 03/09/2015
  RIP: 0010:__xsk_sendmsg+0x2c/0x690
  [...]
  RSP: 0018:ffffa2348bd13d50 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000040 RCX: ffff8d5fc632d258
  RDX: 0000000000400000 RSI: ffffa2348bd13e10 RDI: ffff8d5fc5489800
  RBP: ffffa2348bd13db0 R08: 0000000000000000 R09: 00007ffffffff000
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff8d5fc5489800
  R13: ffff8d5fcb0f5140 R14: ffff8d5fcb0f5140 R15: 0000000000000000
  FS:  00007f991cff9400(0000) GS:ffff8d6f1f700000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000c0 CR3: 0000000114888005 CR4: 00000000001706e0
  Call Trace:
  <TASK>
  ? aa_sk_perm+0x43/0x1b0
  xsk_sendmsg+0xf0/0x110
  sock_sendmsg+0x65/0x70
  __sys_sendto+0x113/0x190
  ? debug_smp_processor_id+0x17/0x20
  ? fpregs_assert_state_consistent+0x23/0x50
  ? exit_to_user_mode_prepare+0xa5/0x1d0
  __x64_sys_sendto+0x29/0x30
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x44/0xae

There are two problems with the current code. First, setting xs->dev to NULL
before waiting for all users to stop using the socket is not correct. The
entry to the data plane functions xsk_poll(), xsk_sendmsg(), and xsk_recvmsg()
are all guarded by a test that xs->state is in the state XSK_BOUND and if not,
it returns right away. But one process might have passed this test but still
have not gotten to the point in which it uses xs->dev in the code. In this
interim, a second process executing xsk_unbind_dev() might have set xs->dev to
NULL which will lead to a crash for the first process. The solution here is
just to get rid of this NULL assignment since it is not used anymore. Before
commit 42fddcc7c64b ("xsk: use state member for socket synchronization"),
xs->dev was the gatekeeper to admit processes into the data plane functions,
but it was replaced with the state variable xs->state in the aforementioned
commit.

The second problem is that synchronize_net() does not wait for any process in
xsk_poll(), xsk_sendmsg(), or xsk_recvmsg() to complete, which means that the
state they rely on might be cleaned up prematurely. This can happen when the
notifier gets called (at driver unload for example) as it uses xsk_unbind_dev().
Solve this by extending the RCU critical region from just the ndo_xsk_wakeup
to the whole functions mentioned above, so that both the test of xs->state ==
XSK_BOUND and the last use of any member of xs is covered by the RCU critical
section. This will guarantee that when synchronize_net() completes, there will
be no processes left executing xsk_poll(), xsk_sendmsg(), or xsk_recvmsg() and
state can be cleaned up safely. Note that we need to drop the RCU lock for the
skb xmit path as it uses functions that might sleep. Due to this, we have to
retest the xs->state after we grab the mutex that protects the skb xmit code
from, among a number of things, an xsk_unbind_dev() being executed from the
notifier at the same time.

Fixes: 42fddcc7c64b ("xsk: use state member for socket synchronization")
Reported-by: Elza Mathew <elza.mathew@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20220228094552.10134-1-magnus.karlsson@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agobpf, arm64: Feed byte-offset into bpf line info
Hou Tao [Sat, 26 Feb 2022 12:19:06 +0000 (20:19 +0800)]
bpf, arm64: Feed byte-offset into bpf line info

[ Upstream commit dda7596c109fc382876118627e29db7607cde35d ]

insn_to_jit_off passed to bpf_prog_fill_jited_linfo() is calculated in
instruction granularity instead of bytes granularity, but BPF line info
requires byte offset.

bpf_prog_fill_jited_linfo() will be the last user of ctx.offset before
it is freed, so convert the offset into byte-offset before calling into
bpf_prog_fill_jited_linfo() in order to fix the line info dump on arm64.

Fixes: 37ab566c178d ("bpf: arm64: Enable arm64 jit to provide bpf_line_info")
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220226121906.5709-3-houtao1@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agobpf, arm64: Call build_prologue() first in first JIT pass
Hou Tao [Sat, 26 Feb 2022 12:19:05 +0000 (20:19 +0800)]
bpf, arm64: Call build_prologue() first in first JIT pass

[ Upstream commit 68e4f238b0e9d3670a1612ad900a6e98b2b3f7dd ]

BPF line info needs ctx->offset to be the instruction offset in the whole JITed
image instead of the body itself, so also call build_prologue() first in first
JIT pass.

Fixes: 37ab566c178d ("bpf: arm64: Enable arm64 jit to provide bpf_line_info")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220226121906.5709-2-houtao1@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agodrm/bridge: cdns-dsi: Make sure to to create proper aliases for dt
Nishanth Menon [Tue, 21 Sep 2021 17:40:59 +0000 (12:40 -0500)]
drm/bridge: cdns-dsi: Make sure to to create proper aliases for dt

[ Upstream commit ffb5c099aaa13ab7f73c29ea6ae26bce8d7575ae ]

Add MODULE_DEVICE_TABLE to the device tree table to create required
aliases needed for module to be loaded with device tree based platform.

Fixes: e19233955d9e ("drm/bridge: Add Cadence DSI driver")
Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210921174059.17946-1-nm@ti.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoscsi: hisi_sas: Change permission of parameter prot_mask
Xiang Chen [Thu, 24 Feb 2022 11:51:24 +0000 (19:51 +0800)]
scsi: hisi_sas: Change permission of parameter prot_mask

[ Upstream commit c4e070457a93705e56ed06b3910d9e5fe56d3be3 ]

Currently the permission of parameter prot_mask is 0x0, which means that
the member does not appear in sysfs. Change it as other module parameters
to 0444 for world-readable.

[mkp: s/v3/v2/]

Link: https://lore.kernel.org/r/1645703489-87194-2-git-send-email-john.garry@huawei.com
Fixes: d6a9000b81be ("scsi: hisi_sas: Add support for DIF feature for v2 hw")
Reported-by: Yihang Li <liyihang6@hisilicon.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopower: supply: bq24190_charger: Fix bq24190_vbus_is_enabled() wrong false return
Hans de Goede [Sat, 12 Feb 2022 16:48:16 +0000 (17:48 +0100)]
power: supply: bq24190_charger: Fix bq24190_vbus_is_enabled() wrong false return

[ Upstream commit f7731754fdce33dad19be746f647d6ac47c5d695 ]

The datasheet says that the BQ24190_REG_POC_CHG_CONFIG bits can
have a value of either 10(0x2) or 11(0x3) for OTG (5V boost regulator)
mode.

Sofar bq24190_vbus_is_enabled() was only checking for 10 but some BIOS-es
uses 11 when enabling the regulator at boot.

Make bq24190_vbus_is_enabled() also check for 11 so that it does not
wrongly returns false when the bits are set to 11.

Fixes: 66b6bef2c4e0 ("power: supply: bq24190_charger: Export 5V boost converter as regulator")
Cc: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agodrm/tegra: Fix reference leak in tegra_dsi_ganged_probe
Miaoqian Lin [Mon, 10 Jan 2022 06:53:16 +0000 (06:53 +0000)]
drm/tegra: Fix reference leak in tegra_dsi_ganged_probe

[ Upstream commit 221e3638feb8bc42143833c9a704fa89b6c366bb ]

The reference taken by 'of_find_device_by_node()' must be released when
not needed anymore. Add put_device() call to fix this.

Fixes: e94236cde4d5 ("drm/tegra: dsi: Add ganged mode support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoext2: correct max file size computing
Zhang Yi [Sat, 12 Feb 2022 05:05:32 +0000 (13:05 +0800)]
ext2: correct max file size computing

[ Upstream commit 50b3a818991074177a56c87124c7a7bdf5fa4f67 ]

We need to calculate the max file size accurately if the total blocks
that can address by block tree exceed the upper_limit. But this check is
not correct now, it only compute the total data blocks but missing
metadata blocks are needed. So in the case of "data blocks < upper_limit
&& total blocks > upper_limit", we will get wrong result. Fortunately,
this case could not happen in reality, but it's confused and better to
correct the computing.

  bits   data blocks   metadatablocks   upper_limit
  10        16843020            66051    2147483647
  11       134480396           263171    1073741823
  12      1074791436          1050627     536870911 (*)
  13      8594130956          4198403     268435455 (*)
  14     68736258060         16785411     134217727 (*)
  15    549822930956         67125251      67108863 (*)
  16   4398314962956        268468227      33554431 (*)

  [*] Need to calculate in depth.

Fixes: 1c2d14212b15 ("ext2: Fix underflow in ext2_max_size()")
Link: https://lore.kernel.org/r/20220212050532.179055-1-yi.zhang@huawei.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agopower: supply: sbs-charger: Don't cancel work that is not initialized
Christophe JAILLET [Sun, 13 Feb 2022 17:07:03 +0000 (18:07 +0100)]
power: supply: sbs-charger: Don't cancel work that is not initialized

[ Upstream commit de85193cff0d94d030a53656d8fcc41794807bef ]

This driver can use an interrupt or polling in order get the charger's
status.

When using polling, a delayed work is used.

However, the remove() function unconditionally call
cancel_delayed_work_sync(), even if the delayed work is not used and is not
initialized.

In order to fix it, use devm_delayed_work_autocancel() and remove the now
useless remove() function.

Fixes: feb583e37f8a ("power: supply: add sbs-charger driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoTOMOYO: fix __setup handlers return values
Randy Dunlap [Tue, 22 Feb 2022 21:45:33 +0000 (13:45 -0800)]
TOMOYO: fix __setup handlers return values

[ Upstream commit 39844b7e3084baecef52d1498b5fa81afa2cefa9 ]

__setup() handlers should return 1 if the parameter is handled.
Returning 0 causes the entire string to be added to init's
environment strings (limited to 32 strings), unnecessarily polluting it.

Using the documented strings "TOMOYO_loader=string1" and
"TOMOYO_trigger=string2" causes an Unknown parameter message:
  Unknown kernel command line parameters
    "BOOT_IMAGE=/boot/bzImage-517rc5 TOMOYO_loader=string1 \
     TOMOYO_trigger=string2", will be passed to user space.

and these strings are added to init's environment string space:
  Run /sbin/init as init process
    with arguments:
     /sbin/init
    with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc5
     TOMOYO_loader=string1
     TOMOYO_trigger=string2

With this change, these __setup handlers act as expected,
and init's environment is not polluted with these strings.

Fixes: 0e4ae0e0dec63 ("TOMOYO: Make several options configurable.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: https://lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: James Morris <jmorris@namei.org>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: tomoyo-dev-en@lists.osdn.me
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agodrm/amd/display: Remove vupdate_int_entry definition
Maíra Canal [Tue, 22 Feb 2022 13:16:57 +0000 (10:16 -0300)]
drm/amd/display: Remove vupdate_int_entry definition

[ Upstream commit 3679b8518cd213c25d555553ef212e233faf698c ]

Remove the vupdate_int_entry definition and utilization to avoid the
following warning by Clang:

drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:410:2:
warning: initializer overrides prior initialization of this subobject
[-Winitializer-overrides]
    vupdate_no_lock_int_entry(0),
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:280:39:
note: expanded from macro 'vupdate_no_lock_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:404:2:
note: previous initialization is here
    vupdate_int_entry(0),
    ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:269:39:
note: expanded from macro 'vupdate_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:411:2:
warning: initializer overrides prior initialization of this subobject
[-Winitializer-overrides]
    vupdate_no_lock_int_entry(1),
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:280:39:
note: expanded from macro 'vupdate_no_lock_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:405:2:
note: previous initialization is here
    vupdate_int_entry(1),
    ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:269:39:
note: expanded from macro 'vupdate_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:412:2:
warning: initializer overrides prior initialization of this subobject
[-Winitializer-overrides]
    vupdate_no_lock_int_entry(2),
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:280:39:
note: expanded from macro 'vupdate_no_lock_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:406:2:
note: previous initialization is here
    vupdate_int_entry(2),
    ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:269:39:
note: expanded from macro 'vupdate_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:413:2:
warning: initializer overrides prior initialization of this subobject
[-Winitializer-overrides]
    vupdate_no_lock_int_entry(3),
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:280:39:
note: expanded from macro 'vupdate_no_lock_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:407:2:
note: previous initialization is here
    vupdate_int_entry(3),
    ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:269:39:
note: expanded from macro 'vupdate_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:414:2:
warning: initializer overrides prior initialization of this subobject
[-Winitializer-overrides]
    vupdate_no_lock_int_entry(4),
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:280:39:
note: expanded from macro 'vupdate_no_lock_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:408:2:
note: previous initialization is here
    vupdate_int_entry(4),
    ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:269:39:
note: expanded from macro 'vupdate_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:415:2:
warning: initializer overrides prior initialization of this subobject
[-Winitializer-overrides]
    vupdate_no_lock_int_entry(5),
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:280:39:
note: expanded from macro 'vupdate_no_lock_int_entry'
    [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
    ^~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:409:2:
note: previous initialization is here
    vupdate_int_entry(5),
    ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn21/irq_service_dcn21.c:269:39:
note: expanded from macro 'vupdate_int_entry'
        [DC_IRQ_SOURCE_VUPDATE1 + reg_num] = {\
        ^~
6 warnings generated.

Fixes: 688f97ed3f5e ("drm/amd/display: Add vupdate_no_lock interrupts for DCN2.1")
Signed-off-by: Maíra Canal <maira.canal@usp.br>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoRDMA/mlx5: Fix the flow of a miss in the allocation of a cache ODP MR
Aharon Landau [Tue, 15 Feb 2022 17:55:30 +0000 (19:55 +0200)]
RDMA/mlx5: Fix the flow of a miss in the allocation of a cache ODP MR

[ Upstream commit 2f0e60d5e9f96341a0c8a01be8878cdb3b29ff20 ]

When an ODP MR cache entry is empty and trying to allocate it, increment
the ent->miss counter and call to queue_adjust_cache_locked() to verify
the entry is balanced.

Fixes: aad719dcf379 ("RDMA/mlx5: Allow MRs to be created in the cache synchronously")
Link: https://lore.kernel.org/r/09503e295276dcacc92cb1d8aef1ad0961c99dc1.1644947594.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoscsi: pm8001: Fix abort all task initialization
Damien Le Moal [Sun, 20 Feb 2022 03:17:55 +0000 (12:17 +0900)]
scsi: pm8001: Fix abort all task initialization

[ Upstream commit 7f12845c8389855dbcc67baa068b6832dc4a396e ]

In pm80xx_send_abort_all(), the n_elem field of the ccb used is not
initialized to 0. This missing initialization sometimes lead to the task
completion path seeing the ccb with a non-zero n_elem resulting in the
execution of invalid dma_unmap_sg() calls in pm8001_ccb_task_free(),
causing a crash such as:

[  197.676341] RIP: 0010:iommu_dma_unmap_sg+0x6d/0x280
[  197.700204] RSP: 0018:ffff889bbcf89c88 EFLAGS: 00010012
[  197.705485] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff83d0bda0
[  197.712687] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88810dffc0d0
[  197.719887] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8881c790098b
[  197.727089] R10: ffffed1038f20131 R11: 0000000000000001 R12: 0000000000000000
[  197.734296] R13: ffff88810dffc0d0 R14: 0000000000000010 R15: 0000000000000000
[  197.741493] FS:  0000000000000000(0000) GS:ffff889bbcf80000(0000) knlGS:0000000000000000
[  197.749659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  197.755459] CR2: 00007f16c1b42734 CR3: 0000000004814000 CR4: 0000000000350ee0
[  197.762656] Call Trace:
[  197.765127]  <IRQ>
[  197.767162]  pm8001_ccb_task_free+0x5f1/0x820 [pm80xx]
[  197.772364]  ? do_raw_spin_unlock+0x54/0x220
[  197.776680]  pm8001_mpi_task_abort_resp+0x2ce/0x4f0 [pm80xx]
[  197.782406]  process_oq+0xe85/0x7890 [pm80xx]
[  197.786817]  ? lock_acquire+0x194/0x490
[  197.790697]  ? handle_irq_event+0x10e/0x1b0
[  197.794920]  ? mpi_sata_completion+0x2d70/0x2d70 [pm80xx]
[  197.800378]  ? __wake_up_bit+0x100/0x100
[  197.804340]  ? lock_is_held_type+0x98/0x110
[  197.808565]  pm80xx_chip_isr+0x94/0x130 [pm80xx]
[  197.813243]  tasklet_action_common.constprop.0+0x24b/0x2f0
[  197.818785]  __do_softirq+0x1b5/0x82d
[  197.822485]  ? do_raw_spin_unlock+0x54/0x220
[  197.826799]  __irq_exit_rcu+0x17e/0x1e0
[  197.830678]  irq_exit_rcu+0xa/0x20
[  197.834114]  common_interrupt+0x78/0x90
[  197.840051]  </IRQ>
[  197.844236]  <TASK>
[  197.848397]  asm_common_interrupt+0x1e/0x40

Avoid this issue by always initializing the ccb n_elem field to 0 in
pm8001_send_abort_all(), pm8001_send_read_log() and
pm80xx_send_abort_all().

Link: https://lore.kernel.org/r/20220220031810.738362-17-damien.lemoal@opensource.wdc.com
Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoscsi: pm8001: Fix NCQ NON DATA command completion handling
Damien Le Moal [Sun, 20 Feb 2022 03:17:54 +0000 (12:17 +0900)]
scsi: pm8001: Fix NCQ NON DATA command completion handling

[ Upstream commit 1d6736c3e162061dc811c76e605f35ef3234bffa ]

NCQ NON DATA is an NCQ command with the DMA_NONE DMA direction and so a
register-device-to-host-FIS response is expected for it.

However, for an IO_SUCCESS case, mpi_sata_completion() expects a
set-device-bits-FIS for any ata task with an use_ncq field true, which
includes NCQ NON DATA commands.

Fix this to correctly treat NCQ NON DATA commands as non-data by also
testing for the DMA_NONE DMA direction.

Link: https://lore.kernel.org/r/20220220031810.738362-16-damien.lemoal@opensource.wdc.com
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoscsi: pm8001: Fix NCQ NON DATA command task initialization
Damien Le Moal [Sun, 20 Feb 2022 03:17:53 +0000 (12:17 +0900)]
scsi: pm8001: Fix NCQ NON DATA command task initialization

[ Upstream commit aa028141ab0bc62c44a84d42f09db35d82df82a2 ]

In the pm8001_chip_sata_req() and pm80xx_chip_sata_req() functions, all
tasks with a DMA direction of DMA_NONE (no data transfer) are initialized
using the ATAP value 0x04. However, NCQ NON DATA commands, while being
DMA_NONE commands are NCQ commands and need to be initialized using the
value 0x07 for ATAP, similarly to other NCQ commands.

Make sure that NCQ NON DATA command tasks are initialized similarly to
other NCQ commands by also testing the task "use_ncq" field in addition to
the DMA direction. While at it, reorganize the code into a chain of if -
else if - else to avoid useless affectations and debug messages.

Link: https://lore.kernel.org/r/20220220031810.738362-15-damien.lemoal@opensource.wdc.com
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 years agoscsi: pm8001: Fix le32 values handling in pm80xx_chip_sata_req()
Damien Le Moal [Sun, 20 Feb 2022 03:17:50 +0000 (12:17 +0900)]
scsi: pm8001: Fix le32 values handling in pm80xx_chip_sata_req()

[ Upstream commit fd6d0e376211d7ed759db96b0fbd9a1cee67d462 ]

Make sure that the __le32 fields of struct sata_cmd are manipulated after
applying the correct endian conversion. That is, use cpu_to_le32() for
assigning values and le32_to_cpu() for consulting a field value.  In
particular, make sure that the calculations for the 4G boundary check are
done using CPU endianness and *not* little endian values. With these fixes,
many sparse warnings are removed.

While at it, fix some code identation and add blank lines after variable
declarations and in some other places to make this code more readable.

Link: https://lore.kernel.org/r/20220220031810.738362-12-damien.lemoal@opensource.wdc.com
Fixes: 0ecdf00ba6e5 ("[SCSI] pm80xx: 4G boundary fix.")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>