platform/kernel/linux-starfive.git
3 years agomptcp: Retransmit DATA_FIN
Mat Martineau [Fri, 23 Apr 2021 16:40:33 +0000 (09:40 -0700)]
mptcp: Retransmit DATA_FIN

With this change, the MPTCP-level retransmission timer is used to resend
DATA_FIN. The retranmit timer is not stopped while waiting for a
MPTCP-level ACK of DATA_FIN, and retransmitted DATA_FINs are sent on all
subflows. The retry interval starts at TCP_RTO_MIN and then doubles on
each attempt, up to TCP_RTO_MAX.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/146
Fixes: 43b54c6ee382 ("mptcp: Use full MPTCP-level disconnect state machine")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb
Phillip Potter [Thu, 22 Apr 2021 23:49:45 +0000 (00:49 +0100)]
net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb

Modify the header size check in geneve6_xmit_skb and geneve_xmit_skb
to use pskb_inet_may_pull rather than pskb_network_may_pull. This fixes
two kernel selftest failures introduced by the commit introducing the
checks:
IPv4 over geneve6: PMTU exceptions
IPv4 over geneve6: PMTU exceptions - nexthop objects

It does this by correctly accounting for the fact that IPv4 packets may
transit over geneve IPv6 tunnels (and vice versa), and still fixes the
uninit-value bug fixed by the original commit.

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 6628ddfec758 ("net: geneve: check skb is large enough for IPv4/IPv6 header")
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoopenvswitch: meter: remove rate from the bucket size calculation
Ilya Maximets [Wed, 21 Apr 2021 13:57:47 +0000 (15:57 +0200)]
openvswitch: meter: remove rate from the bucket size calculation

Implementation of meters supposed to be a classic token bucket with 2
typical parameters: rate and burst size.

Burst size in this schema is the maximum number of bytes/packets that
could pass without being rate limited.

Recent changes to userspace datapath made meter implementation to be
in line with the kernel one, and this uncovered several issues.

The main problem is that maximum bucket size for unknown reason
accounts not only burst size, but also the numerical value of rate.
This creates a lot of confusion around behavior of meters.

For example, if rate is configured as 1000 pps and burst size set to 1,
this should mean that meter will tolerate bursts of 1 packet at most,
i.e. not a single packet above the rate should pass the meter.
However, current implementation calculates maximum bucket size as
(rate + burst size), so the effective bucket size will be 1001.  This
means that first 1000 packets will not be rate limited and average
rate might be twice as high as the configured rate.  This also makes
it practically impossible to configure meter that will have burst size
lower than the rate, which might be a desirable configuration if the
rate is high.

Inability to configure low values of a burst size and overall inability
for a user to predict what will be a maximum and average rate from the
configured parameters of a meter without looking at the OVS and kernel
code might be also classified as a security issue, because drop meters
are frequently used as a way of protection from DoS attacks.

This change removes rate from the calculation of a bucket size, making
it in line with the classic token bucket algorithm and essentially
making the rate and burst tolerance being predictable from a users'
perspective.

Same change proposed for the userspace implementation.

Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'RTL8211E-RGMII-D'
David S. Miller [Thu, 22 Apr 2021 22:08:35 +0000 (15:08 -0700)]
Merge branch 'RTL8211E-RGMII-D'

Kunihiko Hayashi says:

====================
Change phy-mode to RGMII-ID to enable delay pins for RTL8211E

UniPhier PXs2, LD20, and PXs3 boards have RTL8211E ethernet phy, and the
phy have the RX/TX delays of RGMII interface using pull-ups on the RXDLY
and TXDLY pins.

After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the delays are working correctly, however, "rgmii" means
no delay and the phy doesn't work. So need to set the phy-mode to
"rgmii-id" to show that RX/TX delays are enabled.

Changes since v1:
- Fix the commit message
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoarm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
Kunihiko Hayashi [Thu, 22 Apr 2021 17:31:49 +0000 (02:31 +0900)]
arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E

UniPhier LD20 and PXs3 boards have RTL8211E ethernet phy, and the phy have
the RX/TX delays of RGMII interface using pull-ups on the RXDLY and TXDLY
pins.

After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the delays are working correctly, however, "rgmii" means
no delay and the phy doesn't work. So need to set the phy-mode to
"rgmii-id" to show that RX/TX delays are enabled.

Fixes: c73730ee4c9a ("arm64: dts: uniphier: add AVE ethernet node")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
Kunihiko Hayashi [Thu, 22 Apr 2021 17:31:48 +0000 (02:31 +0900)]
ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E

UniPhier PXs2 boards have RTL8211E ethernet phy, and the phy have the RX/TX
delays of RGMII interface using pull-ups on the RXDLY and TXDLY pins.

After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the delays are working correctly, however, "rgmii" means
no delay and the phy doesn't work. So need to set the phy-mode to
"rgmii-id" to show that RX/TX delays are enabled.

Fixes: e3cc931921d2 ("ARM: dts: uniphier: add AVE ethernet node")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt_en: fix ternary sign extension bug in bnxt_show_temp()
Dan Carpenter [Thu, 22 Apr 2021 09:10:28 +0000 (12:10 +0300)]
bnxt_en: fix ternary sign extension bug in bnxt_show_temp()

The problem is that bnxt_show_temp() returns long but "rc" is an int
and "len" is a u32.  With ternary operations the type promotion is quite
tricky.  The negative "rc" is first promoted to u32 and then to long so
it ends up being a high positive value instead of a a negative as we
intended.

Fix this by removing the ternary.

Fixes: d69753fa1ecb ("bnxt_en: return proper error codes in bnxt_show_temp")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: marvell: fix m88e1111_set_downshift
Maxim Kochetkov [Thu, 22 Apr 2021 10:46:44 +0000 (13:46 +0300)]
net: phy: marvell: fix m88e1111_set_downshift

Changing downshift params without software reset has no effect,
so call genphy_soft_reset() after change downshift params.

As the datasheet says:
Changes to these bits are disruptive to the normal operation therefore,
any changes to these registers must be followed by software reset
to take effect.

Fixes: 5c6bc5199b5d ("net: phy: marvell: add downshift support for M88E1111")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: marvell: fix m88e1011_set_downshift
Maxim Kochetkov [Thu, 22 Apr 2021 10:46:43 +0000 (13:46 +0300)]
net: phy: marvell: fix m88e1011_set_downshift

Changing downshift params without software reset has no effect,
so call genphy_soft_reset() after change downshift params.

As the datasheet says:
Changes to these bits are disruptive to the normal operation therefore,
any changes to these registers must be followed by software reset
to take effect.

Fixes: 911af5e149bb ("net: phy: marvell: fix downshift function naming")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoneighbour: Prevent Race condition in neighbour subsytem
Chinmay Agarwal [Wed, 21 Apr 2021 19:42:22 +0000 (01:12 +0530)]
neighbour: Prevent Race condition in neighbour subsytem

Following Race Condition was detected:

<CPU A, t0>: Executing: __netif_receive_skb() ->__netif_receive_skb_core()
-> arp_rcv() -> arp_process().arp_process() calls __neigh_lookup() which
takes a reference on neighbour entry 'n'.
Moves further along, arp_process() and calls neigh_update()->
__neigh_update(). Neighbour entry is unlocked just before a call to
neigh_update_gc_list.

This unlocking paves way for another thread that may take a reference on
the same and mark it dead and remove it from gc_list.

<CPU B, t1> - neigh_flush_dev() is under execution and calls
neigh_mark_dead(n) marking the neighbour entry 'n' as dead. Also n will be
removed from gc_list.
Moves further along neigh_flush_dev() and calls
neigh_cleanup_and_release(n), but since reference count increased in t1,
'n' couldn't be destroyed.

<CPU A, t3>- Code hits neigh_update_gc_list, with neighbour entry
set as dead.

<CPU A, t4> - arp_process() finally calls neigh_release(n), destroying
the neighbour entry and we have a destroyed ntry still part of gc_list.

Fixes: eb4e8fac00d1("neighbour: Prevent a dead entry from updating gc_list")
Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobonding: 3ad: Fix the conflict between bond_update_slave_arr and the state machine
jinyiting [Wed, 21 Apr 2021 08:38:21 +0000 (16:38 +0800)]
bonding: 3ad: Fix the conflict between bond_update_slave_arr and the state machine

The bond works in mode 4, and performs down/up operations on the bond
that is normally negotiated. The probability of bond-> slave_arr is NULL

Test commands:
   ifconfig bond1 down
   ifconfig bond1 up

The conflict occurs in the following process:

__dev_open (CPU A)
--bond_open
  --queue_delayed_work(bond->wq,&bond->ad_work,0);
  --bond_update_slave_arr
    --bond_3ad_get_active_agg_info

ad_work(CPU B)
--bond_3ad_state_machine_handler
  --ad_agg_selection_logic

ad_work runs on cpu B. In the function ad_agg_selection_logic, all
agg->is_active will be cleared. Before the new active aggregator is
selected on CPU B, bond_3ad_get_active_agg_info failed on CPU A,
bond->slave_arr will be set to NULL. The best aggregator in
ad_agg_selection_logic has not changed, no need to update slave arr.

The conflict occurred in that ad_agg_selection_logic clears
agg->is_active under mode_lock, but bond_open -> bond_update_slave_arr
is inspecting agg->is_active outside the lock.

Also, bond_update_slave_arr is normal for potential sleep when
allocating memory, so replace the WARN_ON with a call to might_sleep.

Signed-off-by: jinyiting <jinyiting@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qrtr: Avoid potential use after free in MHI send
Bjorn Andersson [Wed, 21 Apr 2021 17:40:07 +0000 (10:40 -0700)]
net: qrtr: Avoid potential use after free in MHI send

It is possible that the MHI ul_callback will be invoked immediately
following the queueing of the skb for transmission, leading to the
callback decrementing the refcount of the associated sk and freeing the
skb.

As such the dereference of skb and the increment of the sk refcount must
happen before the skb is queued, to avoid the skb to be used after free
and potentially the sk to drop its last refcount..

Fixes: 6e728f321393 ("net: qrtr: Add MHI transport layer")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: intel-xway: enable integrated led functions
Martin Schiller [Wed, 21 Apr 2021 05:50:47 +0000 (07:50 +0200)]
net: phy: intel-xway: enable integrated led functions

The Intel xway phys offer the possibility to deactivate the integrated
LED function and to control the LEDs manually.
If this was set by the bootloader, it must be ensured that the
integrated LED function is enabled for all LEDs when loading the driver.

Before commit 6e2d85ec0559 ("net: phy: Stop with excessive soft reset")
the LEDs were enabled by a soft-reset of the PHY (using
genphy_soft_reset). Initialize the XWAY_MDIO_LED with it's default
value (which is applied during a soft reset) instead of adding back
the soft reset. This brings back the default LED configuration while
still preventing an excessive amount of soft resets.

Fixes: 6e2d85ec0559 ("net: phy: Stop with excessive soft reset")
Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: renesas: ravb: Fix a stuck issue when a lot of frames are received
Yoshihiro Shimoda [Wed, 21 Apr 2021 04:52:46 +0000 (13:52 +0900)]
net: renesas: ravb: Fix a stuck issue when a lot of frames are received

When a lot of frames were received in the short term, the driver
caused a stuck of receiving until a new frame was received. For example,
the following command from other device could cause this issue.

    $ sudo ping -f -l 1000 -c 1000 <this driver's ipaddress>

The previous code always cleared the interrupt flag of RX but checks
the interrupt flags in ravb_poll(). So, ravb_poll() could not call
ravb_rx() in the next time until a new RX frame was received if
ravb_rx() returned true. To fix the issue, always calls ravb_rx()
regardless the interrupt flags condition.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: fix TSO and TBS feature enabling during driver open
Ong Boon Leong [Wed, 21 Apr 2021 09:11:49 +0000 (17:11 +0800)]
net: stmmac: fix TSO and TBS feature enabling during driver open

TSO and TBS cannot co-exist and current implementation requires two
fixes:

 1) stmmac_open() does not need to call stmmac_enable_tbs() because
    the MAC is reset in stmmac_init_dma_engine() anyway.
 2) Inside stmmac_hw_setup(), we should call stmmac_enable_tso() for
    TX Q that is _not_ configured for TBS.

Fixes: 579a25a854d4 ("net: stmmac: Initial support for TBS")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonfp: devlink: initialize the devlink port attribute "lanes"
Yinjun Zhang [Wed, 21 Apr 2021 09:24:15 +0000 (11:24 +0200)]
nfp: devlink: initialize the devlink port attribute "lanes"

The number of lanes of devlink port should be correctly initialized
when registering the port, so that the input check when running
"devlink port split <port> count <N>" can pass.

Fixes: a21cf0a8330b ("devlink: Add a new devlink port lanes attribute and pass to netlink")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'wireless-drivers-2021-04-21' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Wed, 21 Apr 2021 17:18:09 +0000 (10:18 -0700)]
Merge tag 'wireless-drivers-2021-04-21' of git://git./linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for v5.12

As there was -rc8 release, one more important fix for v5.12.

iwlwifi

* fix spinlock warning in gen2 devices
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: davinci_emac: Fix incorrect masking of tx and rx error channel
Colin Ian King [Tue, 20 Apr 2021 17:16:14 +0000 (18:16 +0100)]
net: davinci_emac: Fix incorrect masking of tx and rx error channel

The bit-masks used for the TXERRCH and RXERRCH (tx and rx error channels)
are incorrect and always lead to a zero result. The mask values are
currently the incorrect post-right shifted values, fix this by setting
them to the currect values.

(I double checked these against the TMS320TCI6482 data sheet, section
5.30, page 127 to ensure I had the correct mask values for the TXERRCH
and RXERRCH fields in the MACSTATUS register).

Addresses-Coverity: ("Operands don't affect result")
Fixes: a6286ee630f6 ("net: Add TI DaVinci EMAC driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: marvell: prestera: fix port event handling on init
Vadym Kochan [Tue, 20 Apr 2021 13:31:51 +0000 (16:31 +0300)]
net: marvell: prestera: fix port event handling on init

For some reason there might be a crash during ports creation if port
events are handling at the same time  because fw may send initial
port event with down state.

The crash points to cancel_delayed_work() which is called when port went
is down.  Currently I did not find out the real cause of the issue, so
fixed it by cancel port stats work only if previous port's state was up
& runnig.

The following is the crash which can be triggered:

[   28.311104] Unable to handle kernel paging request at virtual address
000071775f776600
[   28.319097] Mem abort info:
[   28.321914]   ESR = 0x96000004
[   28.324996]   EC = 0x25: DABT (current EL), IL = 32 bits
[   28.330350]   SET = 0, FnV = 0
[   28.333430]   EA = 0, S1PTW = 0
[   28.336597] Data abort info:
[   28.339499]   ISV = 0, ISS = 0x00000004
[   28.343362]   CM = 0, WnR = 0
[   28.346354] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100bf7000
[   28.352842] [000071775f776600] pgd=0000000000000000,
p4d=0000000000000000
[   28.359695] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   28.365310] Modules linked in: prestera_pci(+) prestera
uio_pdrv_genirq
[   28.372005] CPU: 0 PID: 1291 Comm: kworker/0:1H Not tainted
5.11.0-rc4 #1
[   28.378846] Hardware name: DNI AmazonGo1 A7040 board (DT)
[   28.384283] Workqueue: prestera_fw_wq prestera_fw_evt_work_fn
[prestera_pci]
[   28.391413] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[   28.397468] pc : get_work_pool+0x48/0x60
[   28.401442] lr : try_to_grab_pending+0x6c/0x1b0
[   28.406018] sp : ffff80001391bc60
[   28.409358] x29: ffff80001391bc60 x28: 0000000000000000
[   28.414725] x27: ffff000104fc8b40 x26: ffff80001127de88
[   28.420089] x25: 0000000000000000 x24: ffff000106119760
[   28.425452] x23: ffff00010775dd60 x22: ffff00010567e000
[   28.430814] x21: 0000000000000000 x20: ffff80001391bcb0
[   28.436175] x19: ffff00010775deb8 x18: 00000000000000c0
[   28.441537] x17: 0000000000000000 x16: 000000008d9b0e88
[   28.446898] x15: 0000000000000001 x14: 00000000000002ba
[   28.452261] x13: 80a3002c00000002 x12: 00000000000005f4
[   28.457622] x11: 0000000000000030 x10: 000000000000000c
[   28.462985] x9 : 000000000000000c x8 : 0000000000000030
[   28.468346] x7 : ffff800014400000 x6 : ffff000106119758
[   28.473708] x5 : 0000000000000003 x4 : ffff00010775dc60
[   28.479068] x3 : 0000000000000000 x2 : 0000000000000060
[   28.484429] x1 : 000071775f776600 x0 : ffff00010775deb8
[   28.489791] Call trace:
[   28.492259]  get_work_pool+0x48/0x60
[   28.495874]  cancel_delayed_work+0x38/0xb0
[   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
[   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
[   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
[   28.516660]  process_one_work+0x1e8/0x360
[   28.520710]  worker_thread+0x44/0x480
[   28.524412]  kthread+0x154/0x160
[   28.527670]  ret_from_fork+0x10/0x38
[   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
[   28.537429] ---[ end trace 5eced933df3a080b ]---

Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agovsock/virtio: free queued packets when closing socket
Stefano Garzarella [Tue, 20 Apr 2021 11:07:27 +0000 (13:07 +0200)]
vsock/virtio: free queued packets when closing socket

As reported by syzbot [1], there is a memory leak while closing the
socket. We partially solved this issue with commit ac03046ece2b
("vsock/virtio: free packets during the socket release"), but we
forgot to drain the RX queue when the socket is definitely closed by
the scheduled work.

To avoid future issues, let's use the new virtio_transport_remove_sock()
to drain the RX queue before removing the socket from the af_vsock lists
calling vsock_remove_sock().

[1] https://syzkaller.appspot.com/bug?extid=24452624fc4c571eedd9

Fixes: ac03046ece2b ("vsock/virtio: free packets during the socket release")
Reported-and-tested-by: syzbot+24452624fc4c571eedd9@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'sfc-txq-lookups'
David S. Miller [Wed, 21 Apr 2021 00:03:53 +0000 (17:03 -0700)]
Merge branch 'sfc-txq-lookups'

Edward Cree says:

====================
sfc: fix TXQ lookups

The TXQ handling changes in 12804793b17c ("sfc: decouple TXQ type from label")
 which were made as part of the support for encap offloads on EF10 caused some
 breakage on Siena (5000- and 6000-series) NICs, which caused null-dereference
 kernel panics.
This series fixes those issues, and also a similarly incorrect code-path on
 EF10 which worked by chance.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosfc: ef10: fix TX queue lookup in TX event handling
Edward Cree [Tue, 20 Apr 2021 12:29:35 +0000 (13:29 +0100)]
sfc: ef10: fix TX queue lookup in TX event handling

We're starting from a TXQ label, not a TXQ type, so
 efx_channel_get_tx_queue() is inappropriate.  This worked by chance,
 because labels and types currently match on EF10, but we shouldn't
 rely on that.

Fixes: 12804793b17c ("sfc: decouple TXQ type from label")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosfc: farch: fix TX queue lookup in TX event handling
Edward Cree [Tue, 20 Apr 2021 12:28:28 +0000 (13:28 +0100)]
sfc: farch: fix TX queue lookup in TX event handling

We're starting from a TXQ label, not a TXQ type, so
 efx_channel_get_tx_queue() is inappropriate (and could return NULL,
 leading to panics).

Fixes: 12804793b17c ("sfc: decouple TXQ type from label")
Cc: stable@vger.kernel.org
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosfc: farch: fix TX queue lookup in TX flush done handling
Edward Cree [Tue, 20 Apr 2021 12:27:22 +0000 (13:27 +0100)]
sfc: farch: fix TX queue lookup in TX flush done handling

We're starting from a TXQ instance number ('qid'), not a TXQ type, so
 efx_get_tx_queue() is inappropriate (and could return NULL, leading
 to panics).

Fixes: 12804793b17c ("sfc: decouple TXQ type from label")
Reported-by: Trevor Hemsley <themsley@voiceflex.com>
Cc: stable@vger.kernel.org
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMAINTAINERS: update
Lijun Pan [Mon, 19 Apr 2021 21:25:25 +0000 (16:25 -0500)]
MAINTAINERS: update

I am making this change again since I received the following instruction.

"As an IBM employee, you are not allowed to use your gmail account to work
in any way on VNIC. You are not allowed to use your personal email account
as a "hobby". You are an IBM employee 100% of the time.
Please remove yourself completely from the maintainers file.
I grant you a 1 time exception on contributions to VNIC to make this
change."

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: fix a data race when get vlan device
Di Zhu [Mon, 19 Apr 2021 13:56:41 +0000 (21:56 +0800)]
net: fix a data race when get vlan device

We encountered a crash: in the packet receiving process, we got an
illegal VLAN device address, but the VLAN device address saved in vmcore
is correct. After checking the code, we found a possible data
competition:
CPU 0:                             CPU 1:
    (RCU read lock)                  (RTNL lock)
    vlan_do_receive()        register_vlan_dev()
      vlan_find_dev()

        ->__vlan_group_get_device()  ->vlan_group_prealloc_vid()

In vlan_group_prealloc_vid(), We need to make sure that memset()
in kzalloc() is executed before assigning  value to vlan devices array:
=================================
kzalloc()
    ->memset(object, 0, size)

smp_wmb()

vg->vlan_devices_arrays[pidx][vidx] = array;
==================================

Because __vlan_group_get_device() function depends on this order.
otherwise we may get a wrong address from the hardware cache on
another cpu.

So fix it by adding memory barrier instruction to ensure the order
of memory operations.

Signed-off-by: Di Zhu <zhudi21@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agogro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check
Alexander Lobakin [Mon, 19 Apr 2021 12:53:06 +0000 (12:53 +0000)]
gro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check

Commit 38ec4944b593 ("gro: ensure frag0 meets IP header alignment")
did the right thing, but missed the fact that napi_gro_frags() logics
calls for skb_gro_reset_offset() *before* pulling Ethernet header
to the skb linear space.
That said, the introduced check for frag0 address being aligned to 4
always fails for it as Ethernet header is obviously 14 bytes long,
and in case with NET_IP_ALIGN its start is not aligned to 4.

Fix this by adding @nhoff argument to skb_gro_reset_offset() which
tells if an IP header is placed right at the start of frag0 or not.
This restores Fast GRO for napi_gro_frags() that became very slow
after the mentioned commit, and preserves the introduced check to
avoid silent unaligned accesses.

From v1 [0]:
 - inline tiny skb_gro_reset_offset() to let the code be optimized
   more efficively (esp. for the !NET_IP_ALIGN case) (Eric);
 - pull in Reviewed-by from Eric.

[0] https://lore.kernel.org/netdev/20210418114200.5839-1-alobakin@pm.me

Fixes: 38ec4944b593 ("gro: ensure frag0 meets IP header alignment")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ixp4xx: Set the DMA masks explicitly
Linus Walleij [Sun, 18 Apr 2021 18:28:53 +0000 (20:28 +0200)]
net: ethernet: ixp4xx: Set the DMA masks explicitly

The former fix only papered over the actual problem: the
ethernet core expects the netdev .dev member to have the
proper DMA masks set, or there will be BUG_ON() triggered
in kernel/dma/mapping.c.

Fix this by simply copying dma_mask and dma_mask_coherent
from the parent device.

Fixes: e45d0fad4a5f ("net: ethernet: ixp4xx: Use parent dev for DMA pool")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: sched: tapr: prevent cycle_time == 0 in parse_taprio_schedule
Du Cheng [Fri, 16 Apr 2021 23:30:46 +0000 (07:30 +0800)]
net: sched: tapr: prevent cycle_time == 0 in parse_taprio_schedule

There is a reproducible sequence from the userland that will trigger a WARN_ON()
condition in taprio_get_start_time, which causes kernel to panic if configured
as "panic_on_warn". Catch this condition in parse_taprio_schedule to
prevent this condition.

Reported as bug on syzkaller:
https://syzkaller.appspot.com/bug?extid=d50710fd0873a9c6b40c

Reported-by: syzbot+d50710fd0873a9c6b40c@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agovsock/vmci: log once the failed queue pair allocation
Stefano Garzarella [Fri, 16 Apr 2021 10:44:16 +0000 (12:44 +0200)]
vsock/vmci: log once the failed queue pair allocation

VMCI feature is not supported in conjunction with the vSphere Fault
Tolerance (FT) feature.

VMware Tools can repeatedly try to create a vsock connection. If FT is
enabled the kernel logs is flooded with the following messages:

    qp_alloc_hypercall result = -20
    Could not attach to queue pair with -20

"qp_alloc_hypercall result = -20" was hidden by commit e8266c4c3307
("VMCI: Stop log spew when qp allocation isn't possible"), but "Could
not attach to queue pair with -20" is still there flooding the log.

Since the error message can be useful in some cases, print it only once.

Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoiwlwifi: Fix softirq/hardirq disabling in iwl_pcie_gen2_enqueue_hcmd()
Jiri Kosina [Sat, 17 Apr 2021 09:13:39 +0000 (11:13 +0200)]
iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_gen2_enqueue_hcmd()

Analogically to what we did in 2800aadc18a6 ("iwlwifi: Fix softirq/hardirq
disabling in iwl_pcie_enqueue_hcmd()"), we must apply the same fix to
iwl_pcie_gen2_enqueue_hcmd(), as it's being called from exactly the same
contexts.

Reported-by: Heiner Kallweit <hkallweit1@gmail.com
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2104171112390.18270@cbobk.fhfr.pm
3 years agoMerge tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Sat, 17 Apr 2021 16:57:15 +0000 (09:57 -0700)]
Merge tag 'net-5.12-rc8' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.12-rc8, including fixes from netfilter, and
  bpf. BPF verifier changes stand out, otherwise things have slowed
  down.

  Current release - regressions:

   - gro: ensure frag0 meets IP header alignment

   - Revert "net: stmmac: re-init rx buffers when mac resume back"

   - ethernet: macb: fix the restore of cmp registers

  Previous releases - regressions:

   - ixgbe: Fix NULL pointer dereference in ethtool loopback test

   - ixgbe: fix unbalanced device enable/disable in suspend/resume

   - phy: marvell: fix detection of PHY on Topaz switches

   - make tcp_allowed_congestion_control readonly in non-init netns

   - xen-netback: Check for hotplug-status existence before watching

  Previous releases - always broken:

   - bpf: mitigate a speculative oob read of up to map value size by
     tightening the masking window

   - sctp: fix race condition in sctp_destroy_sock

   - sit, ip6_tunnel: Unregister catch-all devices

   - netfilter: nftables: clone set element expression template

   - netfilter: flowtable: fix NAT IPv6 offload mangling

   - net: geneve: check skb is large enough for IPv4/IPv6 header

   - netlink: don't call ->netlink_bind with table lock held"

* tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  netlink: don't call ->netlink_bind with table lock held
  MAINTAINERS: update my email
  bpf: Update selftests to reflect new error states
  bpf: Tighten speculative pointer arithmetic mask
  bpf: Move sanitize_val_alu out of op switch
  bpf: Refactor and streamline bounds check into helper
  bpf: Improve verifier error messages for users
  bpf: Rework ptr_limit into alu_limit and add common error path
  bpf: Ensure off_reg has no mixed signed bounds for all types
  bpf: Move off_reg into sanitize_ptr_alu
  bpf: Use correct permission flag for mixed signed bounds arithmetic
  ch_ktls: do not send snd_una update to TCB in middle
  ch_ktls: tcb close causes tls connection failure
  ch_ktls: fix device connection close
  ch_ktls: Fix kernel panic
  i40e: fix the panic when running bpf in xdpdrv mode
  net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
  net/mlx5e: Fix setting of RS FEC mode
  net/mlx5: Fix setting of devlink traps in switchdev mode
  Revert "net: stmmac: re-init rx buffers when mac resume back"
  ...

3 years agoMerge tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 17 Apr 2021 16:40:44 +0000 (09:40 -0700)]
Merge tag 'libnvdimm-fixes-for-5.12-rc8' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "The largest change is for a regression that landed during -rc1 for
  block-device read-only handling. Vaibhav found a new use for the
  ability (originally introduced by virtio_pmem) to call back to the
  platform to flush data, but also found an original bug in that
  implementation. Lastly, Arnd cleans up some compile warnings in dax.

  This has all appeared in -next with no reported issues.

  Summary:

   - Fix a regression of read-only handling in the pmem driver

   - Fix a compile warning

   - Fix support for platform cache flush commands on powerpc/papr"

* tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC
  libnvdimm: Notify disk drivers to revalidate region read-only
  dax: avoid -Wempty-body warnings

3 years agoMerge tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 17 Apr 2021 16:30:58 +0000 (09:30 -0700)]
Merge tag 'cxl-fixes-for-5.12-rc8' of git://git./linux/kernel/git/cxl/cxl

Pull CXL memory class fixes from Dan Williams:
 "A collection of fixes for the CXL memory class driver introduced in
  this release cycle.

  The driver was primarily developed on a work-in-progress QEMU
  emulation of the interface and we have since found a couple places
  where it hid spec compliance bugs in the driver, or had a spec
  implementation bug itself.

  The biggest change here is replacing a percpu_ref with an rwsem to
  cleanup a couple bugs in the error unwind path during ioctl device
  init. Lastly there were some minor cleanups to not export the
  power-management sysfs-ABI for the ioctl device, use the proper sysfs
  helper for emitting values, and prevent subtle bugs as new
  administration commands are added to the supported list.

  The bulk of it has appeared in -next save for the top commit which was
  found today and validated on a fixed-up QEMU model.

  Summary:

   - Fix support for CXL memory devices with registers offset from the
     BAR base.

   - Fix the reporting of device capacity.

   - Fix the driver commands list definition to be disconnected from the
     UAPI command list.

   - Replace percpu_ref with rwsem to fix initialization error path.

   - Fix leaks in the driver initialization error path.

   - Drop the power/ directory from CXL device sysfs.

   - Use the recommended sysfs helper for attribute 'show'
     implementations"

* tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/mem: Fix memory device capacity probing
  cxl/mem: Fix register block offset calculation
  cxl/mem: Force array size of mem_commands[] to CXL_MEM_COMMAND_ID_MAX
  cxl/mem: Disable cxl device power management
  cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
  cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations
  cxl/mem: Use sysfs_emit() for attribute show routines

3 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 17 Apr 2021 15:38:23 +0000 (08:38 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "12 patches.

  Subsystems affected by this patch series: mm (documentation, kasan,
  and pagemap), csky, ia64, gcov, and lib"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib: remove "expecting prototype" kernel-doc warnings
  gcov: clang: fix clang-11+ build
  mm: ptdump: fix build failure
  mm/mapping_dirty_helpers: guard hugepage pud's usage
  ia64: tools: remove duplicate definition of ia64_mf() on ia64
  ia64: tools: remove inclusion of ia64-specific version of errno.h header
  ia64: fix discontig.c section mismatches
  ia64: remove duplicate entries in generic_defconfig
  csky: change a Kconfig symbol name to fix e1000 build error
  kasan: remove redundant config option
  kasan: fix hwasan build for gcc
  mm: eliminate "expecting prototype" kernel-doc warnings

3 years agocxl/mem: Fix memory device capacity probing
Dan Williams [Sat, 17 Apr 2021 00:43:30 +0000 (17:43 -0700)]
cxl/mem: Fix memory device capacity probing

The CXL Identify Memory Device output payload emits capacity in 256MB
units. The driver is treating the capacity field as bytes. This was
missed because QEMU reports bytes when it should report bytes / 256MB.

Fixes: 8adaf747c9f0 ("cxl/mem: Find device capabilities")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/161862021044.3259705.7008520073059739760.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
3 years agonetlink: don't call ->netlink_bind with table lock held
Florian Westphal [Fri, 16 Apr 2021 19:29:13 +0000 (21:29 +0200)]
netlink: don't call ->netlink_bind with table lock held

When I added support to allow generic netlink multicast groups to be
restricted to subscribers with CAP_NET_ADMIN I was unaware that a
genl_bind implementation already existed in the past.

It was reverted due to ABBA deadlock:

1. ->netlink_bind gets called with the table lock held.
2. genetlink bind callback is invoked, it grabs the genl lock.

But when a new genl subsystem is (un)registered, these two locks are
taken in reverse order.

One solution would be to revert again and add a comment in genl
referring 1e82a62fec613, "genetlink: remove genl_bind").

This would need a second change in mptcp to not expose the raw token
value anymore, e.g.  by hashing the token with a secret key so userspace
can still associate subflow events with the correct mptcp connection.

However, Paolo Abeni reminded me to double-check why the netlink table is
locked in the first place.

I can't find one.  netlink_bind() is already called without this lock
when userspace joins a group via NETLINK_ADD_MEMBERSHIP setsockopt.
Same holds for the netlink_unbind operation.

Digging through the history, commit f773608026ee1
("netlink: access nlk groups safely in netlink bind and getname")
expanded the lock scope.

commit 3a20773beeeeade ("net: netlink: cap max groups which will be considered in netlink_bind()")
... removed the nlk->ngroups access that the lock scope
extension was all about.

Reduce the lock scope again and always call ->netlink_bind without
the table lock.

The Fixes tag should be vs. the patch mentioned in the link below,
but that one got squash-merged into the patch that came earlier in the
series.

Fixes: 4d54cc32112d8d ("mptcp: avoid lock_fast usage in accept path")
Link: https://lore.kernel.org/mptcp/20210213000001.379332-8-mathew.j.martineau@linux.intel.com/T/#u
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Sean Tranchetti <stranche@codeaurora.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'io_uring-5.12-2021-04-16' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 16 Apr 2021 23:18:53 +0000 (16:18 -0700)]
Merge tag 'io_uring-5.12-2021-04-16' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
 "Fix for a potential hang at exit with SQPOLL from Pavel"

* tag 'io_uring-5.12-2021-04-16' of git://git.kernel.dk/linux-block:
  io_uring: fix early sqd_list removal sqpoll hangs

3 years agolib: remove "expecting prototype" kernel-doc warnings
Randy Dunlap [Fri, 16 Apr 2021 22:46:26 +0000 (15:46 -0700)]
lib: remove "expecting prototype" kernel-doc warnings

Fix various kernel-doc warnings in lib/ due to missing or erroneous
function names.

Add kernel-doc for some function parameters that was missing.  Use
kernel-doc "Return:" notation in earlycpio.c.

Quietens the following warnings:

  lib/earlycpio.c:61: warning: expecting prototype for cpio_data find_cpio_data(). Prototype was for find_cpio_data() instead

  lib/lru_cache.c:640: warning: expecting prototype for lc_dump(). Prototype was for lc_seq_dump_details() instead
  lru_cache.c:90: warning: Function parameter or member 'cache' not described in 'lc_create'

  lib/parman.c:368: warning: expecting prototype for parman_item_del(). Prototype was for parman_item_remove() instead
  parman.c:309: warning: Excess function parameter 'prority' description in 'parman_prio_init'

  lib/radix-tree.c:703: warning: expecting prototype for __radix_tree_insert(). Prototype was for radix_tree_insert() instead
  radix-tree.c:180: warning: Excess function parameter 'addr' description in 'radix_tree_find_next_bit'
  radix-tree.c:180: warning: Excess function parameter 'size' description in 'radix_tree_find_next_bit'
  radix-tree.c:931: warning: Function parameter or member 'iter' not described in 'radix_tree_iter_replace'

Link: https://lkml.kernel.org/r/20210411221756.15461-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agogcov: clang: fix clang-11+ build
Johannes Berg [Fri, 16 Apr 2021 22:46:23 +0000 (15:46 -0700)]
gcov: clang: fix clang-11+ build

With clang-11+, the code is broken due to my kvmalloc() conversion
(which predated the clang-11 support code) leaving one vmalloc() in
place.  Fix that.

Link: https://lkml.kernel.org/r/20210412214210.6e1ecca9cdc5.I24459763acf0591d5e6b31c7e3a59890d802f79c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: ptdump: fix build failure
Christophe Leroy [Fri, 16 Apr 2021 22:46:20 +0000 (15:46 -0700)]
mm: ptdump: fix build failure

READ_ONCE() cannot be used for reading PTEs.  Use ptep_get() instead, to
avoid the following errors:

    CC      mm/ptdump.o
  In file included from <command-line>:
  mm/ptdump.c: In function 'ptdump_pte_entry':
  include/linux/compiler_types.h:320:38: error: call to '__compiletime_assert_207' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
    320 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |                                      ^
  include/linux/compiler_types.h:301:4: note: in definition of macro '__compiletime_assert'
    301 |    prefix ## suffix();    \
        |    ^~~~~~
  include/linux/compiler_types.h:320:2: note: in expansion of macro '_compiletime_assert'
    320 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |  ^~~~~~~~~~~~~~~~~~~
  include/asm-generic/rwonce.h:36:2: note: in expansion of macro 'compiletime_assert'
     36 |  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
        |  ^~~~~~~~~~~~~~~~~~
  include/asm-generic/rwonce.h:49:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
     49 |  compiletime_assert_rwonce_type(x);    \
        |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  mm/ptdump.c:114:14: note: in expansion of macro 'READ_ONCE'
    114 |  pte_t val = READ_ONCE(*pte);
        |              ^~~~~~~~~
  make[2]: *** [mm/ptdump.o] Error 1

See commit 481e980a7c19 ("mm: Allow arches to provide ptep_get()") and
commit c0e1c8c22beb ("powerpc/8xx: Provide ptep_get() with 16k pages")
for details.

Link: https://lkml.kernel.org/r/912b349e2bcaa88939904815ca0af945740c6bd4.1618478922.git.christophe.leroy@csgroup.eu
Fixes: 30d621f6723b ("mm: add generic ptdump")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/mapping_dirty_helpers: guard hugepage pud's usage
Zack Rusin [Fri, 16 Apr 2021 22:46:18 +0000 (15:46 -0700)]
mm/mapping_dirty_helpers: guard hugepage pud's usage

Mapping dirty helpers have, so far, been only used on X86, but a port of
vmwgfx to ARM64 exposed a problem which results in a compilation error
on ARM64 systems:

  mm/mapping_dirty_helpers.c: In function `wp_clean_pud_entry':
  mm/mapping_dirty_helpers.c:172:32: error: implicit declaration of function `pud_dirty'; did you mean `pmd_dirty'? [-Werror=implicit-function-declaration]

This is due to the fact that mapping_dirty_helpers code assumes that
pud_dirty is always defined, which is not the case for architectures
that don't define CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.

ARM64 arch is a little inconsistent when it comes to PUD hugepage
helpers, e.g. it defines pud_young but not pud_dirty but regardless of
that the core kernel code shouldn't assume that any of the PUD hugepage
helpers are available unless CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
is defined.  This prevents compilation errors whenever one of the
drivers is ported to new architectures.

Link: https://lkml.kernel.org/r/20210409165151.694574-1-zackr@vmware.com
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Thomas Hellstrm (Intel) <thomas_os@shipmail.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoia64: tools: remove duplicate definition of ia64_mf() on ia64
John Paul Adrian Glaubitz [Fri, 16 Apr 2021 22:46:15 +0000 (15:46 -0700)]
ia64: tools: remove duplicate definition of ia64_mf() on ia64

The ia64_mf() macro defined in tools/arch/ia64/include/asm/barrier.h is
already defined in <asm/gcc_intrin.h> on ia64 which causes libbpf
failing to build:

    CC       /usr/src/linux/tools/bpf/bpftool//libbpf/staticobjs/libbpf.o
  In file included from /usr/src/linux/tools/include/asm/barrier.h:24,
                   from /usr/src/linux/tools/include/linux/ring_buffer.h:4,
                   from libbpf.c:37:
  /usr/src/linux/tools/include/asm/../../arch/ia64/include/asm/barrier.h:43: error: "ia64_mf" redefined [-Werror]
     43 | #define ia64_mf()       asm volatile ("mf" ::: "memory")
        |
  In file included from /usr/include/ia64-linux-gnu/asm/intrinsics.h:20,
                   from /usr/include/ia64-linux-gnu/asm/swab.h:11,
                   from /usr/include/linux/swab.h:8,
                   from /usr/include/linux/byteorder/little_endian.h:13,
                   from /usr/include/ia64-linux-gnu/asm/byteorder.h:5,
                   from /usr/src/linux/tools/include/uapi/linux/perf_event.h:20,
                   from libbpf.c:36:
  /usr/include/ia64-linux-gnu/asm/gcc_intrin.h:382: note: this is the location of the previous definition
    382 | #define ia64_mf() __asm__ volatile ("mf" ::: "memory")
        |
  cc1: all warnings being treated as errors

Thus, remove the definition from tools/arch/ia64/include/asm/barrier.h.

Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoia64: tools: remove inclusion of ia64-specific version of errno.h header
John Paul Adrian Glaubitz [Fri, 16 Apr 2021 22:46:12 +0000 (15:46 -0700)]
ia64: tools: remove inclusion of ia64-specific version of errno.h header

There is no longer an ia64-specific version of the errno.h header below
arch/ia64/include/uapi/asm/, so trying to build tools/bpf fails with:

    CC       /usr/src/linux/tools/bpf/bpftool/btf_dumper.o
  In file included from /usr/src/linux/tools/include/linux/err.h:8,
                   from btf_dumper.c:11:
  /usr/src/linux/tools/include/uapi/asm/errno.h:13:10: fatal error: ../../../arch/ia64/include/uapi/asm/errno.h: No such file or directory
     13 | #include "../../../arch/ia64/include/uapi/asm/errno.h"
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  compilation terminated.

Thus, just remove the inclusion of the ia64-specific errno.h so that the
build will use the generic errno.h header on this target which was used
there anyway as the ia64-specific errno.h was just a wrapper for the
generic header.

Fixes: c25f867ddd00 ("ia64: remove unneeded uapi asm-generic wrappers")
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoia64: fix discontig.c section mismatches
Randy Dunlap [Fri, 16 Apr 2021 22:46:09 +0000 (15:46 -0700)]
ia64: fix discontig.c section mismatches

Fix IA64 discontig.c Section mismatch warnings.

When CONFIG_SPARSEMEM=y and CONFIG_MEMORY_HOTPLUG=y, the functions
computer_pernodesize() and scatter_node_data() should not be marked as
__meminit because they are needed after init, on any memory hotplug
event.  Also, early_nr_cpus_node() is called by compute_pernodesize(),
so early_nr_cpus_node() cannot be __meminit either.

  WARNING: modpost: vmlinux.o(.text.unlikely+0x1612): Section mismatch in reference from the function arch_alloc_nodedata() to the function .meminit.text:compute_pernodesize()
  The function arch_alloc_nodedata() references the function __meminit compute_pernodesize().
  This is often because arch_alloc_nodedata lacks a __meminit annotation or the annotation of compute_pernodesize is wrong.

  WARNING: modpost: vmlinux.o(.text.unlikely+0x1692): Section mismatch in reference from the function arch_refresh_nodedata() to the function .meminit.text:scatter_node_data()
  The function arch_refresh_nodedata() references the function __meminit scatter_node_data().
  This is often because arch_refresh_nodedata lacks a __meminit annotation or the annotation of scatter_node_data is wrong.

  WARNING: modpost: vmlinux.o(.text.unlikely+0x1502): Section mismatch in reference from the function compute_pernodesize() to the function .meminit.text:early_nr_cpus_node()
  The function compute_pernodesize() references the function __meminit early_nr_cpus_node().
  This is often because compute_pernodesize lacks a __meminit annotation or the annotation of early_nr_cpus_node is wrong.

Link: https://lkml.kernel.org/r/20210411001201.3069-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoia64: remove duplicate entries in generic_defconfig
Randy Dunlap [Fri, 16 Apr 2021 22:46:06 +0000 (15:46 -0700)]
ia64: remove duplicate entries in generic_defconfig

Fix ia64 generic_defconfig duplicate entries, as warned by:

  arch/ia64/configs/generic_defconfig: warning: override: reassigning to symbol ATA:  => 58
  arch/ia64/configs/generic_defconfig: warning: override: reassigning to symbol ATA_PIIX:  => 59

These 2 symbols still have the same value as in the removed lines.

Link: https://lkml.kernel.org/r/20210411020255.18052-1-rdunlap@infradead.org
Fixes: c331649e6371 ("ia64: Use libata instead of the legacy ide driver in defconfigs")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agocsky: change a Kconfig symbol name to fix e1000 build error
Randy Dunlap [Fri, 16 Apr 2021 22:46:03 +0000 (15:46 -0700)]
csky: change a Kconfig symbol name to fix e1000 build error

e1000's #define of CONFIG_RAM_BASE conflicts with a Kconfig symbol in
arch/csky/Kconfig.

The symbol in e1000 has been around longer, so change arch/csky/ to use
DRAM_BASE instead of RAM_BASE to remove the conflict.  (although e1000
is also a 2-line change)

Link: https://lkml.kernel.org/r/20210411055335.7111-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Guo Ren <guoren@kernel.org>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agokasan: remove redundant config option
Walter Wu [Fri, 16 Apr 2021 22:46:00 +0000 (15:46 -0700)]
kasan: remove redundant config option

CONFIG_KASAN_STACK and CONFIG_KASAN_STACK_ENABLE both enable KASAN stack
instrumentation, but we should only need one config, so that we remove
CONFIG_KASAN_STACK_ENABLE and make CONFIG_KASAN_STACK workable.  see [1].

When enable KASAN stack instrumentation, then for gcc we could do no
prompt and default value y, and for clang prompt and default value n.

This patch fixes the following compilation warning:

  include/linux/kasan.h:333:30: warning: 'CONFIG_KASAN_STACK' is not defined, evaluates to 0 [-Wundef]

[akpm@linux-foundation.org: fix merge snafu]

Link: https://bugzilla.kernel.org/show_bug.cgi?id=210221
Link: https://lkml.kernel.org/r/20210226012531.29231-1-walter-zh.wu@mediatek.com
Fixes: d9b571c885a8 ("kasan: fix KASAN_STACK dependency for HW_TAGS")
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agokasan: fix hwasan build for gcc
Arnd Bergmann [Fri, 16 Apr 2021 22:45:57 +0000 (15:45 -0700)]
kasan: fix hwasan build for gcc

gcc-11 adds support for -fsanitize=kernel-hwaddress, so it becomes
possible to enable CONFIG_KASAN_SW_TAGS.

Unfortunately this fails to build at the moment, because the
corresponding command line arguments use llvm specific syntax.

Change it to use the cc-param macro instead, which works on both clang
and gcc.

[elver@google.com: fixup for "kasan: fix hwasan build for gcc"]
Link: https://lkml.kernel.org/r/YHQZVfVVLE/LDK2v@elver.google.com
Link: https://lkml.kernel.org/r/20210323124112.1229772-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: eliminate "expecting prototype" kernel-doc warnings
Randy Dunlap [Fri, 16 Apr 2021 22:45:54 +0000 (15:45 -0700)]
mm: eliminate "expecting prototype" kernel-doc warnings

Fix stray kernel-doc warnings in mm/ due to mis-typed or missing function
names.

Quietens these kernel-doc warnings:

  mm/mmu_gather.c:264: warning: expecting prototype for tlb_gather_mmu(). Prototype was for __tlb_gather_mmu() instead
  mm/oom_kill.c:180: warning: expecting prototype for Check whether unreclaimable slab amount is greater than(). Prototype was for should_dump_unreclaim_slab() instead
  mm/shuffle.c:155: warning: expecting prototype for shuffle_free_memory(). Prototype was for __shuffle_free_memory() instead

Link: https://lkml.kernel.org/r/20210411210642.11362-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Fri, 16 Apr 2021 22:48:08 +0000 (15:48 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2021-04-17

The following pull-request contains BPF updates for your *net* tree.

We've added 10 non-merge commits during the last 9 day(s) which contain
a total of 8 files changed, 175 insertions(+), 111 deletions(-).

The main changes are:

1) Fix a potential NULL pointer dereference in libbpf's xsk
   umem handling, from Ciara Loftus.

2) Mitigate a speculative oob read of up to map value size by
   tightening the masking window, from Daniel Borkmann.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMAINTAINERS: update my email
Lijun Pan [Fri, 16 Apr 2021 04:18:13 +0000 (23:18 -0500)]
MAINTAINERS: update my email

Update my email and change myself to Reviewer.

Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobpf: Update selftests to reflect new error states
Daniel Borkmann [Wed, 24 Mar 2021 13:52:31 +0000 (14:52 +0100)]
bpf: Update selftests to reflect new error states

Update various selftest error messages:

 * The 'Rx tried to sub from different maps, paths, or prohibited types'
   is reworked into more specific/differentiated error messages for better
   guidance.

 * The change into 'value -4294967168 makes map_value pointer be out of
   bounds' is due to moving the mixed bounds check into the speculation
   handling and thus occuring slightly later than above mentioned sanity
   check.

 * The change into 'math between map_value pointer and register with
   unbounded min value' is similarly due to register sanity check coming
   before the mixed bounds check.

 * The case of 'map access: known scalar += value_ptr from different maps'
   now loads fine given masks are the same from the different paths (despite
   max map value size being different).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Tighten speculative pointer arithmetic mask
Daniel Borkmann [Wed, 24 Mar 2021 09:38:26 +0000 (10:38 +0100)]
bpf: Tighten speculative pointer arithmetic mask

This work tightens the offset mask we use for unprivileged pointer arithmetic
in order to mitigate a corner case reported by Piotr and Benedict where in
the speculative domain it is possible to advance, for example, the map value
pointer by up to value_size-1 out-of-bounds in order to leak kernel memory
via side-channel to user space.

Before this change, the computed ptr_limit for retrieve_ptr_limit() helper
represents largest valid distance when moving pointer to the right or left
which is then fed as aux->alu_limit to generate masking instructions against
the offset register. After the change, the derived aux->alu_limit represents
the largest potential value of the offset register which we mask against which
is just a narrower subset of the former limit.

For minimal complexity, we call sanitize_ptr_alu() from 2 observation points
in adjust_ptr_min_max_vals(), that is, before and after the simulated alu
operation. In the first step, we retieve the alu_state and alu_limit before
the operation as well as we branch-off a verifier path and push it to the
verification stack as we did before which checks the dst_reg under truncation,
in other words, when the speculative domain would attempt to move the pointer
out-of-bounds.

In the second step, we retrieve the new alu_limit and calculate the absolute
distance between both. Moreover, we commit the alu_state and final alu_limit
via update_alu_sanitation_state() to the env's instruction aux data, and bail
out from there if there is a mismatch due to coming from different verification
paths with different states.

Reported-by: Piotr Krysiuk <piotras@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Benedict Schlueter <benedict.schlueter@rub.de>
3 years agobpf: Move sanitize_val_alu out of op switch
Daniel Borkmann [Wed, 24 Mar 2021 10:25:39 +0000 (11:25 +0100)]
bpf: Move sanitize_val_alu out of op switch

Add a small sanitize_needed() helper function and move sanitize_val_alu()
out of the main opcode switch. In upcoming work, we'll move sanitize_ptr_alu()
as well out of its opcode switch so this helps to streamline both.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Refactor and streamline bounds check into helper
Daniel Borkmann [Tue, 23 Mar 2021 14:05:48 +0000 (15:05 +0100)]
bpf: Refactor and streamline bounds check into helper

Move the bounds check in adjust_ptr_min_max_vals() into a small helper named
sanitize_check_bounds() in order to simplify the former a bit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Improve verifier error messages for users
Daniel Borkmann [Tue, 23 Mar 2021 08:30:01 +0000 (09:30 +0100)]
bpf: Improve verifier error messages for users

Consolidate all error handling and provide more user-friendly error messages
from sanitize_ptr_alu() and sanitize_val_alu().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Rework ptr_limit into alu_limit and add common error path
Daniel Borkmann [Tue, 23 Mar 2021 08:04:10 +0000 (09:04 +0100)]
bpf: Rework ptr_limit into alu_limit and add common error path

Small refactor with no semantic changes in order to consolidate the max
ptr_limit boundary check.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Ensure off_reg has no mixed signed bounds for all types
Daniel Borkmann [Tue, 23 Mar 2021 07:51:02 +0000 (08:51 +0100)]
bpf: Ensure off_reg has no mixed signed bounds for all types

The mixed signed bounds check really belongs into retrieve_ptr_limit()
instead of outside of it in adjust_ptr_min_max_vals(). The reason is
that this check is not tied to PTR_TO_MAP_VALUE only, but to all pointer
types that we handle in retrieve_ptr_limit() and given errors from the latter
propagate back to adjust_ptr_min_max_vals() and lead to rejection of the
program, it's a better place to reside to avoid anything slipping through
for future types. The reason why we must reject such off_reg is that we
otherwise would not be able to derive a mask, see details in 9d7eceede769
("bpf: restrict unknown scalars of mixed signed bounds for unprivileged").

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Move off_reg into sanitize_ptr_alu
Daniel Borkmann [Mon, 22 Mar 2021 14:45:52 +0000 (15:45 +0100)]
bpf: Move off_reg into sanitize_ptr_alu

Small refactor to drag off_reg into sanitize_ptr_alu(), so we later on can
use off_reg for generalizing some of the checks for all pointer types.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Use correct permission flag for mixed signed bounds arithmetic
Daniel Borkmann [Tue, 23 Mar 2021 07:32:59 +0000 (08:32 +0100)]
bpf: Use correct permission flag for mixed signed bounds arithmetic

We forbid adding unknown scalars with mixed signed bounds due to the
spectre v1 masking mitigation. Hence this also needs bypass_spec_v1
flag instead of allow_ptr_leaks.

Fixes: 2c78ee898d8f ("bpf: Implement CAP_BPF")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agoMerge tag 'riscv-for-linus-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 16 Apr 2021 17:05:42 +0000 (10:05 -0700)]
Merge tag 'riscv-for-linus-5.12-rc8' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "A handful of fixes:

   - a fix to properly select SPARSEMEM_STATIC on rv32

   - a few fixes to kprobes

  I don't generally like sending stuff this late, but these all seem
  pretty safe"

* tag 'riscv-for-linus-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: keep interrupts disabled for BREAKPOINT exception
  riscv: kprobes/ftrace: Add recursion protection to the ftrace callback
  riscv: add do_page_fault and do_trap_break into the kprobes blacklist
  riscv: Fix spelling mistake "SPARSEMEM" to "SPARSMEM"

3 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 16 Apr 2021 16:45:30 +0000 (09:45 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Fix kernel compilation when using the LLVM integrated assembly.

  A recent commit (2decad92f473, "arm64: mte: Ensure TIF_MTE_ASYNC_FAULT
  is set atomically") broke the kernel build when using the LLVM
  integrated assembly (only noticeable with clang-12 as MTE is not
  supported by earlier versions and the code in question not compiled).
  The Fixes: tag in the commit refers to the original patch introducing
  subsections for the alternative code sequences"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: alternatives: Move length validation in alternative_{insn, endif}

3 years agoMerge tag 'drm-fixes-2021-04-16' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 16 Apr 2021 14:49:04 +0000 (07:49 -0700)]
Merge tag 'drm-fixes-2021-04-16' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Daniel Vetter:
 "I pinged the usual suspects, only intel fixes pending"

* tag 'drm-fixes-2021-04-16' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/display/vlv_dsi: Do not skip panel_pwr_cycle_delay when disabling the panel
  drm/i915: Don't zero out the Y plane's watermarks
  drm/i915/dpcd_bl: Don't try vesa interface unless specified by VBT

3 years agoriscv: keep interrupts disabled for BREAKPOINT exception
Jisheng Zhang [Mon, 29 Mar 2021 18:16:24 +0000 (02:16 +0800)]
riscv: keep interrupts disabled for BREAKPOINT exception

Current riscv's kprobe handlers are run with both preemption and
interrupt enabled, this violates kprobe requirements. Fix this issue
by keeping interrupts disabled for BREAKPOINT exception.

Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported")
Cc: stable@vger.kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
[Palmer: add a comment]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
3 years agoriscv: kprobes/ftrace: Add recursion protection to the ftrace callback
Jisheng Zhang [Mon, 29 Mar 2021 18:14:40 +0000 (02:14 +0800)]
riscv: kprobes/ftrace: Add recursion protection to the ftrace callback

Currently, the riscv's kprobes(powerred by ftrace) handler is
preemptible. Futher check indicates we miss something similar as the
commit c536aa1c5b17 ("kprobes/ftrace: Add recursion protection to the
ftrace callback"), so do similar modifications as the commit does.

Fixes: 829adda597fe ("riscv: Add KPROBES_ON_FTRACE supported")
Cc: stable@vger.kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
3 years agoriscv: add do_page_fault and do_trap_break into the kprobes blacklist
Jisheng Zhang [Mon, 29 Mar 2021 18:12:26 +0000 (02:12 +0800)]
riscv: add do_page_fault and do_trap_break into the kprobes blacklist

These two functions are used to implement the kprobes feature so they
can't be kprobed.

Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported")
Cc: stable@vger.kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
3 years agoriscv: Fix spelling mistake "SPARSEMEM" to "SPARSMEM"
Kefeng Wang [Mon, 29 Mar 2021 03:13:07 +0000 (11:13 +0800)]
riscv: Fix spelling mistake "SPARSEMEM" to "SPARSMEM"

There is a spelling mistake when SPARSEMEM Kconfig copy.

Fixes: a5406a7ff56e ("riscv: Correct SPARSEMEM configuration")
Cc: stable@vger.kernel.org
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
3 years agocxl/mem: Fix register block offset calculation
Ben Widawsky [Thu, 15 Apr 2021 23:26:08 +0000 (16:26 -0700)]
cxl/mem: Fix register block offset calculation

The "Register Offset Low" register of a "DVSEC Register Locator"
contains the 64K aligned offset for the registers along with the BAR
indicator and an id. The implementation was treating the "Register Block
Offset Low" field a value rather than as a pre-aligned component of the
64-bit offset. So, just mask, don't mask and shift (FIELD_GET).

The user visible result of this bug is that the driver fails to bind to
the device after none of the required blocks are found.

This was missed earlier because the primary development done in the QEMU
environment only uses 0 offsets, i.e. 0 shifted is still 0.

Fixes: 8adaf747c9f0 ("cxl/mem: Find device capabilities")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20210415232610.603273-1-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
3 years agoMerge branch 'ch_tlss-fixes'
David S. Miller [Thu, 15 Apr 2021 23:55:49 +0000 (16:55 -0700)]
Merge branch 'ch_tlss-fixes'

Vinay Kumar Yadav says:

====================
chelsio/ch_ktls: chelsio inline tls driver bug fixes

This series of patches fix following bugs in Chelsio inline tls driver.
Patch1: kernel panic.
Patch2: connection close issue.
Patch3: tcb close call issue.
Patch4: unnecessary snd_una update.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoch_ktls: do not send snd_una update to TCB in middle
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:48 +0000 (13:17 +0530)]
ch_ktls: do not send snd_una update to TCB in middle

snd_una update should not be done when the same skb is being
sent out.chcr_short_record_handler() sends it again even
though SND_UNA update is already sent for the skb in
chcr_ktls_xmit(), which causes mismatch in un-acked
TCP seq number, later causes problem in sending out
complete record.

Fixes: 429765a149f1 ("chcr: handle partial end part of a record")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoch_ktls: tcb close causes tls connection failure
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:47 +0000 (13:17 +0530)]
ch_ktls: tcb close causes tls connection failure

HW doesn't need marking TCB closed. This TCB state change
sometimes causes problem to the new connection which gets
the same tid.

Fixes: 34aba2c45024 ("cxgb4/chcr : Register to tls add and del callback")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoch_ktls: fix device connection close
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:46 +0000 (13:17 +0530)]
ch_ktls: fix device connection close

When sge queue is full and chcr_ktls_xmit_wr_complete()
returns failure, skb is not freed if it is not the last tls record in
this skb, causes refcount never gets freed and tls_dev_del()
never gets called on this connection.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoch_ktls: Fix kernel panic
Vinay Kumar Yadav [Thu, 15 Apr 2021 07:47:45 +0000 (13:17 +0530)]
ch_ktls: Fix kernel panic

Taking page refcount is not ideal and causes kernel panic
sometimes. It's better to take tx_ctx lock for the complete
skb transmit, to avoid page cleanup if ACK received in middle.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'mlx5-fixes-2021-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Thu, 15 Apr 2021 23:43:29 +0000 (16:43 -0700)]
Merge tag 'mlx5-fixes-2021-04-14' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-04-14

This series provides 3 small fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoi40e: fix the panic when running bpf in xdpdrv mode
Jason Xing [Wed, 14 Apr 2021 02:34:28 +0000 (10:34 +0800)]
i40e: fix the panic when running bpf in xdpdrv mode

Fix this panic by adding more rules to calculate the value of @rss_size_max
which could be used in allocating the queues when bpf is loaded, which,
however, could cause the failure and then trigger the NULL pointer of
vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
cpus are online and then allocates 256 queues on the machine with 32 cpus
online actually.

Once the load of bpf begins, the log will go like this "failed to get
tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
failed".

Thus, I attach the key information of the crash-log here.

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
Call Trace:
[2160294.717292]  ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
[2160294.717666]  dev_xdp_install+0x4f/0x70
[2160294.718036]  dev_change_xdp_fd+0x11f/0x230
[2160294.718380]  ? dev_disable_lro+0xe0/0xe0
[2160294.718705]  do_setlink+0xac7/0xe70
[2160294.719035]  ? __nla_parse+0xed/0x120
[2160294.719365]  rtnl_newlink+0x73b/0x860

Fixes: 41c445ff0f48 ("i40e: main driver core")
Co-developed-by: Shujin Li <lishujin@kuaishou.com>
Signed-off-by: Shujin Li <lishujin@kuaishou.com>
Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'acpi-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 15 Apr 2021 17:53:39 +0000 (10:53 -0700)]
Merge tag 'acpi-5.12-rc8' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Restore the initrd-based ACPI table override functionality broken by
  one of the recent fixes"

* tag 'acpi-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: x86: Call acpi_boot_table_init() after acpi_table_upgrade()

3 years agoMerge tag 'gpio-fixes-for-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 15 Apr 2021 17:48:51 +0000 (10:48 -0700)]
Merge tag 'gpio-fixes-for-v5.12-rc8' of git://git./linux/kernel/git/brgl/linux

Pull gpio fix from Bartosz Golaszewski:
 "A single fix for an older problem with the sysfs interface: do not
  allow exporting GPIO lines which were marked invalid by the driver"

* tag 'gpio-fixes-for-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sysfs: Obey valid_mask

3 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Linus Torvalds [Thu, 15 Apr 2021 17:43:18 +0000 (10:43 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:
 "The changes are all device/driver specific fixes:

   - EV_KEY and EV_ABS regression fix for Wacom from Ping Cheng

   - BIOS-specific quirk to fix some of the AMD_SFH-based systems, from
     Hans de Goede

   - other small error handling fixes and device ID additions"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: wacom: set EV_KEY and EV_ABS only for non-HID_GENERIC type of devices
  AMD_SFH: Add DMI quirk table for BIOS-es which don't set the activestatus bits
  AMD_SFH: Add sensor_mask module parameter
  AMD_SFH: Removed unused activecontrolstatus member from the amd_mp2_dev struct
  HID: wacom: Assign boolean values to a bool variable
  HID cp2112: fix support for multiple gpiochips
  HID: alps: fix error return code in alps_input_configured()
  HID: asus: Add support for 2021 ASUS N-Key keyboard
  HID: google: add don USB id

3 years agoarm64: alternatives: Move length validation in alternative_{insn, endif}
Nathan Chancellor [Wed, 14 Apr 2021 00:08:04 +0000 (17:08 -0700)]
arm64: alternatives: Move length validation in alternative_{insn, endif}

After commit 2decad92f473 ("arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is
set atomically"), LLVM's integrated assembler fails to build entry.S:

<instantiation>:5:7: error: expected assembly-time absolute expression
 .org . - (664b-663b) + (662b-661b)
      ^
<instantiation>:6:7: error: expected assembly-time absolute expression
 .org . - (662b-661b) + (664b-663b)
      ^

The root cause is LLVM's assembler has a one-pass design, meaning it
cannot figure out these instruction lengths when the .org directive is
outside of the subsection that they are in, which was changed by the
.arch_extension directive added in the above commit.

Apply the same fix from commit 966a0acce2fc ("arm64/alternatives: move
length validation inside the subsection") to the alternative_endif
macro, shuffling the .org directives so that the length validation
happen will always happen in the same subsections. alternative_insn has
not shown any issue yet but it appears that it could have the same issue
in the future so just preemptively change it.

Fixes: f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences")
Cc: <stable@vger.kernel.org> # 5.8.x
Link: https://github.com/ClangBuiltLinux/linux/issues/1347
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20210414000803.662534-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Thu, 15 Apr 2021 17:23:44 +0000 (10:23 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:
 "Just a few driver fixes here"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elants_i2c - drop zero-checking of ABS_MT_TOUCH_MAJOR resolution
  Input: elants_i2c - fix division by zero if firmware reports zero phys size
  Input: nspire-keypad - enable interrupts only when opened
  Input: i8042 - fix Pegatron C15B ID entry
  Input: n64joy - fix return value check in n64joy_probe()
  Input: s6sy761 - fix coordinate read bit shift

3 years agoMerge tag 'drm-intel-fixes-2021-04-15' of git://anongit.freedesktop.org/drm/drm-intel...
Daniel Vetter [Thu, 15 Apr 2021 13:24:17 +0000 (15:24 +0200)]
Merge tag 'drm-intel-fixes-2021-04-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Display panel & power related fixes:

- Backlight fix (Lyude)
- Display watermark fix (Ville)
- VLV panel power fix (Hans)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YHg4nz/ndzDRmPjd@intel.com
3 years agonet/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
wenxu [Fri, 9 Apr 2021 05:33:48 +0000 (13:33 +0800)]
net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta

In the nft_offload there is the mate flow_dissector with no
ingress_ifindex but with ingress_iftype that only be used
in the software. So if the mask of ingress_ifindex in meta is
0, this meta check should be bypass.

Fixes: 6d65bc64e232 ("net/mlx5e: Add mlx5e_flower_parse_meta support")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5e: Fix setting of RS FEC mode
Aya Levin [Sun, 11 Apr 2021 06:33:12 +0000 (09:33 +0300)]
net/mlx5e: Fix setting of RS FEC mode

Change register setting from bit number to bit mask.

Fixes: b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Fix setting of devlink traps in switchdev mode
Aya Levin [Mon, 12 Apr 2021 14:50:08 +0000 (17:50 +0300)]
net/mlx5: Fix setting of devlink traps in switchdev mode

Prevent setting of devlink traps on the uplink while in switchdev mode.
In this mode, it is the SW switch responsibility to handle both packets
with a mismatch in destination MAC or VLAN ID. Therefore, there are no
flow steering tables to trap undesirable packets and driver crashes upon
setting a trap.

Fixes: 241dc159391f ("net/mlx5: Notify on trap action by blocking event")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
David S. Miller [Wed, 14 Apr 2021 21:12:12 +0000 (14:12 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-04-14

This series contains updates to ixgbe and ice drivers.

Alex Duyck fixes a NULL pointer dereference for ixgbe.

Yongxin Liu fixes an unbalanced enable/disable which was causing a call
trace with suspend for ixgbe.

Colin King fixes a potential infinite loop for ice.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoRevert "net: stmmac: re-init rx buffers when mac resume back"
Thierry Reding [Wed, 14 Apr 2021 15:10:07 +0000 (17:10 +0200)]
Revert "net: stmmac: re-init rx buffers when mac resume back"

This reverts commit 9c63faaa931e443e7abbbee9de0169f1d4710546, which
introduces a suspend/resume regression on Jetson TX2 boards that can be
reproduced every time. Given that the issue that this was supposed to
fix only occurs very sporadically the safest course of action is to
revert before v5.12 and then we can have another go at fixing the more
rare issue in the next release (and perhaps backport it if necessary).

The root cause of the observed problem seems to be that when the system
is suspended, some packets are still in transit. When the descriptors
for these buffers are cleared on resume, the descriptors become invalid
and cause a fatal bus error.

Link: https://lore.kernel.org/r/708edb92-a5df-ecc4-3126-5ab36707e275@nvidia.com/
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agocavium/liquidio: Fix duplicate argument
Wan Jiabing [Wed, 14 Apr 2021 11:31:48 +0000 (19:31 +0800)]
cavium/liquidio: Fix duplicate argument

Fix the following coccicheck warning:

./drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h:413:6-28:
duplicated argument to & or |

The CN6XXX_INTR_M1UPB0_ERR here is duplicate.
Here should be CN6XXX_INTR_M1UNB0_ERR.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: macb: fix the restore of cmp registers
Claudiu Beznea [Wed, 14 Apr 2021 11:20:29 +0000 (14:20 +0300)]
net: macb: fix the restore of cmp registers

Commit a14d273ba159 ("net: macb: restore cmp registers on resume path")
introduces the restore of CMP registers on resume path. In case the IP
doesn't support type 2 screeners (zero on DCFG8 register) the
struct macb::rx_fs_list::list is not initialized and thus the
list_for_each_entry(item, &bp->rx_fs_list.list, list) loop introduced in
commit a14d273ba159 ("net: macb: restore cmp registers on resume path")
will access an uninitialized list leading to crash. Thus, initialize
the struct macb::rx_fs_list::list without taking into account if the
IP supports type 2 screeners or not.

Fixes: a14d273ba159 ("net: macb: restore cmp registers on resume path")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'for-5.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 14 Apr 2021 20:23:54 +0000 (13:23 -0700)]
Merge tag 'for-5.12/dm-fixes-3' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
 "Fix DM verity target FEC support's RS roots IO to always be aligned.

  This fixes a previous stable@ fix that overcorrected for a different
  configuration that also resulted in misaligned roots IO"

* tag 'for-5.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm verity fec: fix misaligned RS roots IO

3 years agovrf: fix a comment about loopback device
Nicolas Dichtel [Wed, 14 Apr 2021 10:03:25 +0000 (12:03 +0200)]
vrf: fix a comment about loopback device

This is a leftover of the below commit.

Fixes: 4f04256c983a ("net: vrf: Drop local rtable and rt6_info")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodoc: move seg6_flowlabel to seg6-sysctl.rst
Nicolas Dichtel [Wed, 14 Apr 2021 10:00:27 +0000 (12:00 +0200)]
doc: move seg6_flowlabel to seg6-sysctl.rst

Let's have all seg6 sysctl at the same place.

Fixes: a6dc6670cd7e ("ipv6: sr: Add documentation for seg_flowlabel sysctl")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ibmvnic-napi-fixes'
David S. Miller [Wed, 14 Apr 2021 20:10:58 +0000 (13:10 -0700)]
Merge branch 'ibmvnic-napi-fixes'

Lijun Pan says:

====================
ibmvnic: correctly call NAPI APIs

This series correct some misuse of NAPI APIs in the driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: remove duplicate napi_schedule call in open function
Lijun Pan [Wed, 14 Apr 2021 07:46:16 +0000 (02:46 -0500)]
ibmvnic: remove duplicate napi_schedule call in open function

Remove the unnecessary napi_schedule() call in __ibmvnic_open() since
interrupt_rx() calls napi_schedule_prep/__napi_schedule during every
receive interrupt.

Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: remove duplicate napi_schedule call in do_reset function
Lijun Pan [Wed, 14 Apr 2021 07:46:15 +0000 (02:46 -0500)]
ibmvnic: remove duplicate napi_schedule call in do_reset function

During adapter reset, do_reset/do_hard_reset calls ibmvnic_open(),
which will calls napi_schedule if previous state is VNIC_CLOSED
(i.e, the reset case, and "ifconfig down" case). So there is no need
for do_reset to call napi_schedule again at the end of the function
though napi_schedule will neglect the request if napi is already
scheduled.

Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: avoid calling napi_disable() twice
Lijun Pan [Wed, 14 Apr 2021 07:46:14 +0000 (02:46 -0500)]
ibmvnic: avoid calling napi_disable() twice

__ibmvnic_open calls napi_disable without checking whether NAPI polling
has already been disabled or not. This could cause napi_disable
being called twice, which could generate deadlock. For example,
the first napi_disable will spin until NAPI_STATE_SCHED is cleared
by napi_complete_done, then set it again.
When napi_disable is called the second time, it will loop infinitely
because no dev->poll will be running to clear NAPI_STATE_SCHED.

To prevent above scenario from happening, call ibmvnic_napi_disable()
which checks if napi is disabled or not before calling napi_disable.

Fixes: bfc32f297337 ("ibmvnic: Move resource initialization to its own routine")
Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agor8169: don't advertise pause in jumbo mode
Heiner Kallweit [Wed, 14 Apr 2021 08:47:10 +0000 (10:47 +0200)]
r8169: don't advertise pause in jumbo mode

It has been reported [0] that using pause frames in jumbo mode impacts
performance. There's no available chip documentation, but vendor
drivers r8168 and r8125 don't advertise pause in jumbo mode. So let's
do the same, according to Roman it fixes the issue.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=212617

Fixes: 9cf9b84cc701 ("r8169: make use of phy_set_asym_pause")
Reported-by: Roman Mamedov <rm+bko@romanrm.net>
Tested-by: Roman Mamedov <rm+bko@romanrm.net>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoethtool: pause: make sure we init driver stats
Jakub Kicinski [Wed, 14 Apr 2021 03:46:14 +0000 (20:46 -0700)]
ethtool: pause: make sure we init driver stats

The intention was for pause statistics to not be reported
when driver does not have the relevant callback (only
report an empty netlink nest). What happens currently
we report all 0s instead. Make sure statistics are
initialized to "not set" (which is -1) so the dumping
code skips them.

Fixes: 9a27a33027f2 ("ethtool: add standard pause stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoio_uring: fix early sqd_list removal sqpoll hangs
Pavel Begunkov [Tue, 13 Apr 2021 10:43:00 +0000 (11:43 +0100)]
io_uring: fix early sqd_list removal sqpoll hangs

[  245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
[  245.463334] task:iou-sqp-1374    state:D flags:0x00004000
[  245.463345] Call Trace:
[  245.463352]  __schedule+0x36b/0x950
[  245.463376]  schedule+0x68/0xe0
[  245.463385]  __io_uring_cancel+0xfb/0x1a0
[  245.463407]  do_exit+0xc0/0xb40
[  245.463423]  io_sq_thread+0x49b/0x710
[  245.463445]  ret_from_fork+0x22/0x30

It happens when sqpoll forgot to run park_task_work and goes to exit,
then exiting user may remove ctx from sqd_list, and so corresponding
io_sq_thread() -> io_uring_cancel_sqpoll() won't be executed. Hopefully
it just stucks in do_exit() in this case.

Fixes: dbe1bdbb39db ("io_uring: handle signals for IO threads like a normal thread")
Reported-by: Joakim Hassila <joj@mac.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agodm verity fec: fix misaligned RS roots IO
Jaegeuk Kim [Wed, 14 Apr 2021 15:28:28 +0000 (08:28 -0700)]
dm verity fec: fix misaligned RS roots IO

commit df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to
block size") introduced the possibility for misaligned roots IO
relative to the underlying device's logical block size. E.g. Android's
default RS roots=2 results in dm_bufio->block_size=1024, which causes
the following EIO if the logical block size of the device is 4096,
given v->data_dev_block_bits=12:

E sd 0    : 0:0:0: [sda] tag#30 request not aligned to the logical block size
E blk_update_request: I/O error, dev sda, sector 10368424 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
E device-mapper: verity-fec: 254:8: FEC 9244672: parity read failed (block 18056): -5

Fix this by onlu using f->roots for dm_bufio blocksize IFF it is
aligned to v->data_dev_block_bits.

Fixes: df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size")
Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>