David S. Miller [Fri, 18 Oct 2019 17:19:43 +0000 (10:19 -0700)]
Merge branch 'vsock-virtio-make-the-credit-mechanism-more-robust'
Stefano Garzarella says:
====================
vsock/virtio: make the credit mechanism more robust
This series makes the credit mechanism implemented in the
virtio-vsock devices more robust.
Patch 1 sends an update to the remote peer when the buf_alloc
change.
Patch 2 prevents a malicious peer (especially the guest) can
consume all the memory of the other peer, discarding packets
when the credit available is not respected.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Thu, 17 Oct 2019 12:44:03 +0000 (14:44 +0200)]
vsock/virtio: discard packets if credit is not respected
If the remote peer doesn't respect the credit information
(buf_alloc, fwd_cnt), sending more data than it can send,
we should drop the packets to prevent a malicious peer
from using all of our memory.
This is patch follows the VIRTIO spec: "VIRTIO_VSOCK_OP_RW data
packets MUST only be transmitted when the peer has sufficient
free buffer space for the payload"
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Thu, 17 Oct 2019 12:44:02 +0000 (14:44 +0200)]
vsock/virtio: send a credit update when buffer size is changed
When the user application set a new buffer size value, we should
update the remote peer about this change, since it uses this
information to calculate the credit available.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 17 Oct 2019 07:11:03 +0000 (10:11 +0300)]
mlxsw: spectrum_trap: Push Ethernet header before reporting trap
devlink maintains packets and bytes statistics for each trap. Since
eth_type_trans() was called to set the skb's protocol, the data pointer
no longer points to the start of the packet and the bytes accounting is
off by 14 bytes.
Fix this by pushing the skb's data pointer to the start of the packet.
Fixes: b5ce611fd96e ("mlxsw: spectrum: Add devlink-trap support")
Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
Tested-by: Alex Kushnarov <alexanderk@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 17 Oct 2019 01:00:56 +0000 (18:00 -0700)]
net: ensure correct skb->tstamp in various fragmenters
Thomas found that some forwarded packets would be stuck
in FQ packet scheduler because their skb->tstamp contained
timestamps far in the future.
We thought we addressed this point in commit
8203e2d844d3
("net: clear skb->tstamp in forwarding paths") but there
is still an issue when/if a packet needs to be fragmented.
In order to meet EDT requirements, we have to make sure all
fragments get the original skb->tstamp.
Note that this original skb->tstamp should be zero in
forwarding path, but might have a non zero value in
output path if user decided so.
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Thomas Bartschies <Thomas.Bartschies@cvk.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 18 Oct 2019 17:00:07 +0000 (10:00 -0700)]
Merge branch 'net-bcmgenet-restore-internal-EPHY-support'
Doug Berger says:
====================
net: bcmgenet: restore internal EPHY support
I managed to get my hands on an old BCM97435SVMB board to do some
testing with the latest kernel and uncovered a number of things
that managed to get broken over the years (some by me ;).
This commit set attempts to correct the errors I observed in my
testing.
The first commit applies to all internal PHYs to restore proper
reporting of link status when a link comes up.
The second commit restores the soft reset to the initialization of
the older internal EPHYs used by 40nm Set-Top Box devices.
The third corrects a bug I introduced when removing excessive soft
resets by altering the initialization sequence in a way that keeps
the GENETv3 MAC interface happy.
Finally, I observed a number of issues when manually configuring
the network interface of the older EPHYs that appear to be resolved
by the fourth commit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Berger [Wed, 16 Oct 2019 23:06:32 +0000 (16:06 -0700)]
net: bcmgenet: reset 40nm EPHY on energy detect
The EPHY integrated into the 40nm Set-Top Box devices can falsely
detect energy when connected to a disabled peer interface. When the
peer interface is enabled the EPHY will detect and report the link
as active, but on occasion may get into a state where it is not
able to exchange data with the connected GENET MAC. This issue has
not been observed when the link parameters are auto-negotiated;
however, it has been observed with a manually configured link.
It has been empirically determined that issuing a soft reset to the
EPHY when energy is detected prevents it from getting into this bad
state.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Berger [Wed, 16 Oct 2019 23:06:31 +0000 (16:06 -0700)]
net: bcmgenet: soft reset 40nm EPHYs before MAC init
It turns out that the "Workaround for putting the PHY in IDDQ mode"
used by the internal EPHYs on 40nm Set-Top Box chips when powering
down puts the interface to the GENET MAC in a state that can cause
subsequent MAC resets to be incomplete.
Rather than restore the forced soft reset when powering up internal
PHYs, this commit moves the invocation of phy_init_hw earlier in
the MAC initialization sequence to just before the MAC reset in the
open and resume functions. This allows the interface to be stable
and allows the MAC resets to be successful.
The bcmgenet_mii_probe() function is split in two to accommodate
this. The new function bcmgenet_mii_connect() handles the first
half of the functionality before the MAC initialization, and the
bcmgenet_mii_config() function is extended to provide the remaining
PHY configuration following the MAC initialization.
Fixes: 484bfa1507bf ("Revert "net: bcmgenet: Software reset EPHY after power on"")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Berger [Wed, 16 Oct 2019 23:06:30 +0000 (16:06 -0700)]
net: phy: bcm7xxx: define soft_reset for 40nm EPHY
The internal 40nm EPHYs use a "Workaround for putting the PHY in
IDDQ mode." These PHYs require a soft reset to restore functionality
after they are powered back up.
This commit defines the soft_reset function to use genphy_soft_reset
during phy_init_hw to accommodate this.
Fixes: 6e2d85ec0559 ("net: phy: Stop with excessive soft reset")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Berger [Wed, 16 Oct 2019 23:06:29 +0000 (16:06 -0700)]
net: bcmgenet: don't set phydev->link from MAC
When commit
28b2e0d2cd13 ("net: phy: remove parameter new_link from
phy_mac_interrupt()") removed the new_link parameter it set the
phydev->link state from the MAC before invoking phy_mac_interrupt().
However, once commit
88d6272acaaa ("net: phy: avoid unneeded MDIO
reads in genphy_read_status") was added this initialization prevents
the proper determination of the connection parameters by the function
genphy_read_status().
This commit removes that initialization to restore the proper
functionality.
Fixes: 88d6272acaaa ("net: phy: avoid unneeded MDIO reads in genphy_read_status")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Wed, 16 Oct 2019 21:14:08 +0000 (05:14 +0800)]
net: Update address for MediaTek ethernet driver in MAINTAINERS
Update maintainers for MediaTek ethernet driver with Mark Lee.
He is familiar with MediaTek mt762x series ethernet devices and
will keep following maintenance from the vendor side.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Mark Lee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang [Wed, 16 Oct 2019 19:03:15 +0000 (12:03 -0700)]
ipv4: fix race condition between route lookup and invalidation
Jesse and Ido reported the following race condition:
<CPU A, t0> - Received packet A is forwarded and cached dst entry is
taken from the nexthop ('nhc->nhc_rth_input'). Calls skb_dst_set()
<t1> - Given Jesse has busy routers ("ingesting full BGP routing tables
from multiple ISPs"), route is added / deleted and rt_cache_flush() is
called
<CPU B, t2> - Received packet B tries to use the same cached dst entry
from t0, but rt_cache_valid() is no longer true and it is replaced in
rt_cache_route() by the newer one. This calls dst_dev_put() on the
original dst entry which assigns the blackhole netdev to 'dst->dev'
<CPU A, t3> - dst_input(skb) is called on packet A and it is dropped due
to 'dst->dev' being the blackhole netdev
There are 2 issues in the v4 routing code:
1. A per-netns counter is used to do the validation of the route. That
means whenever a route is changed in the netns, users of all routes in
the netns needs to redo lookup. v6 has an implementation of only
updating fn_sernum for routes that are affected.
2. When rt_cache_valid() returns false, rt_cache_route() is called to
throw away the current cache, and create a new one. This seems
unnecessary because as long as this route does not change, the route
cache does not need to be recreated.
To fully solve the above 2 issues, it probably needs quite some code
changes and requires careful testing, and does not suite for net branch.
So this patch only tries to add the deleted cached rt into the uncached
list, so user could still be able to use it to receive packets until
it's done.
Fixes: 95c47f9cf5e0 ("ipv4: call dst_dev_put() properly")
Signed-off-by: Wei Wang <weiwan@google.com>
Reported-by: Ido Schimmel <idosch@idosch.org>
Reported-by: Jesse Hathaway <jesse@mbuki-mvuki.org>
Tested-by: Jesse Hathaway <jesse@mbuki-mvuki.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Wed, 16 Oct 2019 18:52:09 +0000 (20:52 +0200)]
ipv4: Return -ENETUNREACH if we can't create route but saddr is valid
...instead of -EINVAL. An issue was found with older kernel versions
while unplugging a NFS client with pending RPCs, and the wrong error
code here prevented it from recovering once link is back up with a
configured address.
Incidentally, this is not an issue anymore since commit
4f8943f80883
("SUNRPC: Replace direct task wakeups from softirq context"), included
in 5.2-rc7, had the effect of decoupling the forwarding of this error
by using SO_ERROR in xs_wake_error(), as pointed out by Benjamin
Coddington.
To the best of my knowledge, this isn't currently causing any further
issue, but the error code doesn't look appropriate anyway, and we
might hit this in other paths as well.
In detail, as analysed by Gonzalo Siero, once the route is deleted
because the interface is down, and can't be resolved and we return
-EINVAL here, this ends up, courtesy of inet_sk_rebuild_header(),
as the socket error seen by tcp_write_err(), called by
tcp_retransmit_timer().
In turn, tcp_write_err() indirectly calls xs_error_report(), which
wakes up the RPC pending tasks with a status of -EINVAL. This is then
seen by call_status() in the SUN RPC implementation, which aborts the
RPC call calling rpc_exit(), instead of handling this as a
potentially temporary condition, i.e. as a timeout.
Return -EINVAL only if the input parameters passed to
ip_route_output_key_hash_rcu() are actually invalid (this is the case
if the specified source address is multicast, limited broadcast or
all zeroes), but return -ENETUNREACH in all cases where, at the given
moment, the given source address doesn't allow resolving the route.
While at it, drop the initialisation of err to -ENETUNREACH, which
was added to __ip_route_output_key() back then by commit
0315e3827048 ("net: Fix behaviour of unreachable, blackhole and
prohibit routes"), but actually had no effect, as it was, and is,
overwritten by the fib_lookup() return code assignment, and anyway
ignored in all other branches, including the if (fl4->saddr) one:
I find this rather confusing, as it would look like -ENETUNREACH is
the "default" error, while that statement has no effect.
Also note that after commit
fc75fc8339e7 ("ipv4: dont create routes
on down devices"), we would get -ENETUNREACH if the device is down,
but -EINVAL if the source address is specified and we can't resolve
the route, and this appears to be rather inconsistent.
Reported-by: Stefan Walter <walteste@inf.ethz.ch>
Analysed-by: Benjamin Coddington <bcodding@redhat.com>
Analysed-by: Gonzalo Siero <gsierohu@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Vasut [Wed, 16 Oct 2019 13:35:07 +0000 (15:35 +0200)]
net: phy: micrel: Update KSZ87xx PHY name
The KSZ8795 PHY ID is in fact used by KSZ8794/KSZ8795/KSZ8765 switches.
Update the PHY ID and name to reflect that, as this family of switches
is commonly refered to as KSZ87xx
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: George McCollister <george.mccollister@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Vasut [Wed, 16 Oct 2019 13:35:06 +0000 (15:35 +0200)]
net: phy: micrel: Discern KSZ8051 and KSZ8795 PHYs
The KSZ8051 PHY and the KSZ8794/KSZ8795/KSZ8765 switch share exactly the
same PHY ID. Since KSZ8051 is higher in the ksphy_driver[] list of PHYs
in the micrel PHY driver, it is used even with the KSZ87xx switch. This
is wrong, since the KSZ8051 configures registers of the PHY which are
not present on the simplified KSZ87xx switch PHYs and misconfigures
other registers of the KSZ87xx switch PHYs.
Fortunatelly, it is possible to tell apart the KSZ8051 PHY from the
KSZ87xx switch by checking the Basic Status register Bit 0, which is
read-only and indicates presence of the Extended Capability Registers.
The KSZ8051 PHY has those registers while the KSZ87xx switch does not.
This patch implements simple check for the presence of this bit for
both the KSZ8051 PHY and KSZ87xx switch, to let both use the correct
PHY driver instance.
Fixes: 9d162ed69f51 ("net: phy: micrel: add support for KSZ8795")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: George McCollister <george.mccollister@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Vasut [Wed, 16 Oct 2019 13:33:24 +0000 (15:33 +0200)]
net: dsa: microchip: Add shared regmap mutex
The KSZ driver uses one regmap per register width (8/16/32), each with
it's own lock, but accessing the same set of registers. In theory, it
is possible to create a race condition between these regmaps, although
the underlying bus (SPI or I2C) locking should assure nothing bad will
really happen and the accesses would be correct.
To make the driver do the right thing, add one single shared mutex for
all the regmaps used by the driver instead. This assures that even if
some future hardware is on a bus which does not serialize the accesses
the same way SPI or I2C does, nothing bad will happen.
Note that the status_mutex was unused and only initied, hence it was
renamed and repurposed as the regmap mutex.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: George McCollister <george.mccollister@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Vasut [Wed, 16 Oct 2019 13:33:23 +0000 (15:33 +0200)]
net: dsa: microchip: Do not reinit mutexes on KSZ87xx
The KSZ87xx driver calls mutex_init() on mutexes already inited in
ksz_common.c ksz_switch_register(). Do not do it twice, drop the
reinitialization.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: George McCollister <george.mccollister@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks (Codethink) [Wed, 16 Oct 2019 08:22:05 +0000 (09:22 +0100)]
net: stmmac: fix argument to stmmac_pcs_ctrl_ane()
The stmmac_pcs_ctrl_ane() expects a register address as
argument 1, but for some reason the mac_device_info is
being passed.
Fix the warning (and possible bug) from sparse:
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:2613:17: warning: incorrect type in argument 1 (different address spaces)
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:2613:17: expected void [noderef] <asn:2> *ioaddr
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:2613:17: got struct mac_device_info *hw
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 17 Oct 2019 19:27:29 +0000 (15:27 -0400)]
Merge branch 'dpaa2-eth-misc-fixes'
Ioana Ciornei says:
====================
dpaa2-eth: misc fixes
This patch set adds a couple of fixes around updating configuration on MAC
change. Depending on when MC connects the DPNI to a MAC, both the MAC
address and TX FQIDs should be updated everytime there is a change in
configuration.
Changes in v2:
- used reverse christmas tree ordering in patch 2/2
Changes in v3:
- add a missing new line
- go back to FQ based enqueueing after a transient error
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Radulescu [Wed, 16 Oct 2019 07:36:23 +0000 (10:36 +0300)]
dpaa2-eth: Fix TX FQID values
Depending on when MC connects the DPNI to a MAC, Tx FQIDs may
not be available during probe time.
Read the FQIDs each time the link goes up to avoid using invalid
values. In case an error occurs or an invalid value is retrieved,
fall back to QDID-based enqueueing.
Fixes: 1fa0f68c9255 ("dpaa2-eth: Use FQ-based DPIO enqueue API")
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florin Chiculita [Wed, 16 Oct 2019 07:36:22 +0000 (10:36 +0300)]
dpaa2-eth: add irq for the dpmac connect/disconnect event
Add IRQ for the DPNI endpoint change event, resolving the issue
when a dynamically created DPNI gets a randomly generated hw address
when the endpoint is a DPMAC object.
Signed-off-by: Florin Chiculita <florinlaurentiu.chiculita@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Thu, 17 Oct 2019 13:25:47 +0000 (15:25 +0200)]
usb: hso: obey DMA rules in tiocmget
The serial state information must not be embedded into another
data structure, as this interferes with cache handling for DMA
on architectures without cache coherence..
That would result in data corruption on some architectures
Allocating it separately.
v2: fix syntax error
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Biao Huang [Tue, 15 Oct 2019 03:24:44 +0000 (11:24 +0800)]
net: stmmac: disable/enable ptp_ref_clk in suspend/resume flow
disable ptp_ref_clk in suspend flow, and enable it in resume flow.
Fixes: f573c0b9c4e0 ("stmmac: move stmmac_clk, pclk, clk_ptp_ref and stmmac_rst to platform structure")
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonglong Liu [Wed, 16 Oct 2019 02:30:39 +0000 (10:30 +0800)]
net: phy: Fix "link partner" information disappear issue
Some drivers just call phy_ethtool_ksettings_set() to set the
links, for those phy drivers that use genphy_read_status(), if
autoneg is on, and the link is up, than execute "ethtool -s
ethx autoneg on" will cause "link partner" information disappear.
The call trace is phy_ethtool_ksettings_set()->phy_start_aneg()
->linkmode_zero(phydev->lp_advertising)->genphy_read_status(),
the link didn't change, so genphy_read_status() just return, and
phydev->lp_advertising is zero now.
This patch moves the clear operation of lp_advertising from
phy_start_aneg() to genphy_read_lpa()/genphy_c45_read_lpa(), and
if autoneg on and autoneg not complete, just clear what the
generic functions care about.
Fixes: 88d6272acaaa ("net: phy: avoid unneeded MDIO reads in genphy_read_status")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 14 Oct 2019 13:04:38 +0000 (06:04 -0700)]
rxrpc: use rcu protection while reading sk->sk_user_data
We need to extend the rcu_read_lock() section in rxrpc_error_report()
and use rcu_dereference_sk_user_data() instead of plain access
to sk->sk_user_data to make sure all rules are respected.
The compiler wont reload sk->sk_user_data at will, and RCU rules
prevent memory beeing freed too soon.
Fixes: f0308fb07080 ("rxrpc: Fix possible NULL pointer access in ICMP handling")
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 16 Oct 2019 07:04:38 +0000 (00:04 -0700)]
Revert "blackhole_netdev: fix syzkaller reported issue"
This reverts commit
b0818f80c8c1bc215bba276bd61c216014fab23b.
Started seeing weird behavior after this patch especially in
the IPv6 code path. Haven't root caused it, but since this was
applied to net branch, taking a precautionary measure to revert
it and look / analyze those failures
Revert this now and I'll send a better fix after analysing / fixing
the weirdness observed.
CC: Eric Dumazet <edumazet@google.com>
CC: Wei Wang <weiwan@google.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Valentin Vidic [Tue, 15 Oct 2019 20:20:20 +0000 (22:20 +0200)]
net: usb: sr9800: fix uninitialized local variable
Make sure res does not contain random value if the call to
sr_read_cmd fails for some reason.
Reported-by: syzbot+f1842130bbcfb335bac1@syzkaller.appspotmail.com
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 15 Oct 2019 17:45:47 +0000 (10:45 -0700)]
net: bcmgenet: Fix RGMII_MODE_EN value for GENET v1/2/3
The RGMII_MODE_EN bit value was 0 for GENET versions 1 through 3, and
became 6 for GENET v4 and above, account for that difference.
Fixes: aa09677cba42 ("net: bcmgenet: add MDIO routines")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks (Codethink) [Tue, 15 Oct 2019 16:17:48 +0000 (17:17 +0100)]
net: stmmac: make tc_flow_parsers static
The tc_flow_parsers is not used outside of the driver, so
make it static to avoid the following sparse warning:
drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c:516:3: warning: symbol 'tc_flow_parsers' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks (Codethink) [Tue, 15 Oct 2019 16:15:58 +0000 (17:15 +0100)]
davinci_cpdma: make cpdma_chan_split_pool static
The cpdma_chan_split_pool() function is not used outside of
the driver, so make it static to avoid the following sparse
warning:
drivers/net/ethernet/ti/davinci_cpdma.c:725:5: warning: symbol 'cpdma_chan_split_pool' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Bogendoerfer [Tue, 15 Oct 2019 14:42:45 +0000 (16:42 +0200)]
net: i82596: fix dma_alloc_attr for sni_82596
Commit
7f683b920479 ("i825xx: switch to switch to dma_alloc_attrs")
switched dma allocation over to dma_alloc_attr, but didn't convert
the SNI part to request consistent DMA memory. This broke sni_82596
since driver doesn't do dma_cache_sync for performance reasons.
Fix this by using different DMA_ATTRs for lasi_82596 and sni_82596.
Fixes: 7f683b920479 ("i825xx: switch to switch to dma_alloc_attrs")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 15 Oct 2019 07:24:38 +0000 (15:24 +0800)]
sctp: change sctp_prot .no_autobind with true
syzbot reported a memory leak:
BUG: memory leak, unreferenced object 0xffff888120b3d380 (size 64):
backtrace:
[...] slab_alloc mm/slab.c:3319 [inline]
[...] kmem_cache_alloc+0x13f/0x2c0 mm/slab.c:3483
[...] sctp_bucket_create net/sctp/socket.c:8523 [inline]
[...] sctp_get_port_local+0x189/0x5a0 net/sctp/socket.c:8270
[...] sctp_do_bind+0xcc/0x200 net/sctp/socket.c:402
[...] sctp_bindx_add+0x4b/0xd0 net/sctp/socket.c:497
[...] sctp_setsockopt_bindx+0x156/0x1b0 net/sctp/socket.c:1022
[...] sctp_setsockopt net/sctp/socket.c:4641 [inline]
[...] sctp_setsockopt+0xaea/0x2dc0 net/sctp/socket.c:4611
[...] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3147
[...] __sys_setsockopt+0x10f/0x220 net/socket.c:2084
[...] __do_sys_setsockopt net/socket.c:2100 [inline]
It was caused by when sending msgs without binding a port, in the path:
inet_sendmsg() -> inet_send_prepare() -> inet_autobind() ->
.get_port/sctp_get_port(), sp->bind_hash will be set while bp->port is
not. Later when binding another port by sctp_setsockopt_bindx(), a new
bucket will be created as bp->port is not set.
sctp's autobind is supposed to call sctp_autobind() where it does all
things including setting bp->port. Since sctp_autobind() is called in
sctp_sendmsg() if the sk is not yet bound, it should have skipped the
auto bind.
THis patch is to avoid calling inet_autobind() in inet_send_prepare()
by changing sctp_prot .no_autobind with true, also remove the unused
.get_port.
Reported-by: syzbot+d44f7bbebdea49dbc84a@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vinicius Costa Gomes [Mon, 14 Oct 2019 20:38:22 +0000 (13:38 -0700)]
sched: etf: Fix ordering of packets with same txtime
When a application sends many packets with the same txtime, they may
be transmitted out of order (different from the order in which they
were enqueued).
This happens because when inserting elements into the tree, when the
txtime of two packets are the same, the new packet is inserted at the
left side of the tree, causing the reordering. The only effect of this
change should be that packets with the same txtime will be transmitted
in the order they are enqueued.
The application in question (the AVTP GStreamer plugin, still in
development) is sending video traffic, in which each video frame have
a single presentation time, the problem is that when packetizing,
multiple packets end up with the same txtime.
The receiving side was rejecting packets because they were being
received out of order.
Fixes: 25db26a91364 ("net/sched: Introduce the ETF Qdisc")
Reported-by: Ederson de Souza <ederson.desouza@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 14 Oct 2019 18:22:30 +0000 (11:22 -0700)]
net: avoid potential infinite loop in tc_ctl_action()
tc_ctl_action() has the ability to loop forever if tcf_action_add()
returns -EAGAIN.
This special case has been done in case a module needed to be loaded,
but it turns out that tcf_add_notify() could also return -EAGAIN
if the socket sk_rcvbuf limit is hit.
We need to separate the two cases, and only loop for the module
loading case.
While we are at it, add a limit of 10 attempts since unbounded
loops are always scary.
syzbot repro was something like :
socket(PF_NETLINK, SOCK_RAW|SOCK_NONBLOCK, NETLINK_ROUTE) = 3
write(3, ..., 38) = 38
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [0], 4) = 0
sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{..., 388}], msg_controllen=0, msg_flags=0x10}, ...)
NMI backtrace for cpu 0
CPU: 0 PID: 1054 Comm: khungtaskd Not tainted 5.4.0-rc1+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
watchdog+0x9d0/0xef0 kernel/hung_task.c:289
kthread+0x361/0x430 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 8859 Comm: syz-executor910 Not tainted 5.4.0-rc1+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:arch_local_save_flags arch/x86/include/asm/paravirt.h:751 [inline]
RIP: 0010:lockdep_hardirqs_off+0x1df/0x2e0 kernel/locking/lockdep.c:3453
Code: 5c 08 00 00 5b 41 5c 41 5d 5d c3 48 c7 c0 58 1d f3 88 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 d3 00 00 00 <48> 83 3d 21 9e 99 07 00 0f 84 b9 00 00 00 9c 58 0f 1f 44 00 00 f6
RSP: 0018:
ffff8880a6f3f1b8 EFLAGS:
00000046
RAX:
1ffffffff11e63ab RBX:
ffff88808c9c6080 RCX:
0000000000000000
RDX:
dffffc0000000000 RSI:
0000000000000000 RDI:
ffff88808c9c6914
RBP:
ffff8880a6f3f1d0 R08:
ffff88808c9c6080 R09:
fffffbfff16be5d1
R10:
fffffbfff16be5d0 R11:
0000000000000003 R12:
ffffffff8746591f
R13:
ffff88808c9c6080 R14:
ffffffff8746591f R15:
0000000000000003
FS:
00000000011e4880(0000) GS:
ffff8880ae900000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
ffffffffff600400 CR3:
00000000a8920000 CR4:
00000000001406e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
trace_hardirqs_off+0x62/0x240 kernel/trace/trace_preemptirq.c:45
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
_raw_spin_lock_irqsave+0x6f/0xcd kernel/locking/spinlock.c:159
__wake_up_common_lock+0xc8/0x150 kernel/sched/wait.c:122
__wake_up+0xe/0x10 kernel/sched/wait.c:142
netlink_unlock_table net/netlink/af_netlink.c:466 [inline]
netlink_unlock_table net/netlink/af_netlink.c:463 [inline]
netlink_broadcast_filtered+0x705/0xb80 net/netlink/af_netlink.c:1514
netlink_broadcast+0x3a/0x50 net/netlink/af_netlink.c:1534
rtnetlink_send+0xdd/0x110 net/core/rtnetlink.c:714
tcf_add_notify net/sched/act_api.c:1343 [inline]
tcf_action_add+0x243/0x370 net/sched/act_api.c:1362
tc_ctl_action+0x3b5/0x4bc net/sched/act_api.c:1410
rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5386
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5404
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:657
___sys_sendmsg+0x803/0x920 net/socket.c:2311
__sys_sendmsg+0x105/0x1d0 net/socket.c:2356
__do_sys_sendmsg net/socket.c:2365 [inline]
__se_sys_sendmsg net/socket.c:2363 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440939
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+cf0adbb9c28c8866c788@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Nishad Kamdar [Mon, 14 Oct 2019 16:21:20 +0000 (21:51 +0530)]
net: dsa: sja1105: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style
in header files related to Distributed Switch Architecture
drivers for NXP SJA1105 series Ethernet switch support.
It uses an expilict block comment for the SPDX License
Identifier.
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 14 Oct 2019 13:47:57 +0000 (06:47 -0700)]
tcp: fix a possible lockdep splat in tcp_done()
syzbot found that if __inet_inherit_port() returns an error,
we call tcp_done() after inet_csk_prepare_forced_close(),
meaning the socket lock is no longer held.
We might fix this in a different way in net-next, but
for 5.4 it seems safer to relax the lockdep check.
Fixes: d983ea6f16b8 ("tcp: add rcu protection around tp->fastopen_rsk")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Oct 2019 01:03:35 +0000 (18:03 -0700)]
Merge branch 'Update-MT7629-to-support-PHYLINK-API'
MarkLee says:
====================
Update MT7629 to support PHYLINK API
This patch set has two goals :
1. Fix mt7629 GMII mode issue after apply mediatek
PHYLINK support patch.
2. Update mt7629 dts to reflect the latest dt-binding
with PHYLINK support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
MarkLee [Mon, 14 Oct 2019 07:15:18 +0000 (15:15 +0800)]
arm: dts: mediatek: Update mt7629 dts to reflect the latest dt-binding
* Removes mediatek,physpeed property from dtsi that is useless in PHYLINK
* Use the fixed-link property speed = <2500> to set the phy in 2.5Gbit.
* Set gmac1 to gmii mode that connect to a internal gphy
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MarkLee [Mon, 14 Oct 2019 07:15:17 +0000 (15:15 +0800)]
net: ethernet: mediatek: Fix MT7629 missing GMII mode support
In the original design, mtk_phy_connect function will set ge_mode=1
if phy-mode is GMII(PHY_INTERFACE_MODE_GMII) and then set the correct
ge_mode to ETHSYS_SYSCFG0 register. This logic was broken after apply
mediatek PHYLINK patch(Fixes tag), the new mtk_mac_config function will
not set ge_mode=1 for GMII mode hence the final ETHSYS_SYSCFG0 setting
will be incorrect for mt7629 GMII mode. This patch add the missing logic
back to fix it.
Fixes: b8fc9f30821e ("net: ethernet: mediatek: Add basic PHYLINK support")
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Oct 2019 00:14:48 +0000 (17:14 -0700)]
Merge branch 'mpls-push-pop-fix'
Davide Caratti says:
====================
net/sched: fix wrong behavior of MPLS push/pop action
this series contains two fixes for TC 'act_mpls', that try to address
two problems that can be observed configuring simple 'push' / 'pop'
operations:
- patch 1/2 avoids dropping non-MPLS packets that pass through the MPLS
'pop' action.
- patch 2/2 fixes corruption of the L2 header that occurs when 'push'
or 'pop' actions are configured in TC egress path.
v2: - change commit message in patch 1/2 to better describe that the
patch impacts only TC, thanks to Simon Horman
- fix missing documentation of 'mac_len' in patch 2/2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Sat, 12 Oct 2019 11:55:07 +0000 (13:55 +0200)]
net/sched: fix corrupted L2 header with MPLS 'push' and 'pop' actions
the following script:
# tc qdisc add dev eth0 clsact
# tc filter add dev eth0 egress protocol ip matchall \
> action mpls push protocol mpls_uc label 0x355aa bos 1
causes corruption of all IP packets transmitted by eth0. On TC egress, we
can't rely on the value of skb->mac_len, because it's 0 and a MPLS 'push'
operation will result in an overwrite of the first 4 octets in the packet
L2 header (e.g. the Destination Address if eth0 is an Ethernet); the same
error pattern is present also in the MPLS 'pop' operation. Fix this error
in act_mpls data plane, computing 'mac_len' as the difference between the
network header and the mac header (when not at TC ingress), and use it in
MPLS 'push'/'pop' core functions.
v2: unbreak 'make htmldocs' because of missing documentation of 'mac_len'
in skb_mpls_pop(), reported by kbuild test robot
CC: Lorenzo Bianconi <lorenzo@kernel.org>
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Sat, 12 Oct 2019 11:55:06 +0000 (13:55 +0200)]
net: avoid errors when trying to pop MLPS header on non-MPLS packets
the following script:
# tc qdisc add dev eth0 clsact
# tc filter add dev eth0 egress matchall action mpls pop
implicitly makes the kernel drop all packets transmitted by eth0, if they
don't have a MPLS header. This behavior is uncommon: other encapsulations
(like VLAN) just let the packet pass unmodified. Since the result of MPLS
'pop' operation would be the same regardless of the presence / absence of
MPLS header(s) in the original packet, we can let skb_mpls_pop() return 0
when dealing with non-MPLS packets.
For the OVS use-case, this is acceptable because __ovs_nla_copy_actions()
already ensures that MPLS 'pop' operation only occurs with packets having
an MPLS Ethernet type (and there are no other callers in current code, so
the semantic change should be ok).
v2: better documentation of use-cases for skb_mpls_pop(), thanks to Simon
Horman
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nishad Kamdar [Sat, 12 Oct 2019 13:12:28 +0000 (18:42 +0530)]
net: cavium: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style
in header files related to Cavium Ethernet drivers.
For C header files Documentation/process/license-rules.rst
mandates C-like comments (opposed to C source files where
C++ style should be used)
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nishad Kamdar [Sat, 12 Oct 2019 12:18:56 +0000 (17:48 +0530)]
net: dsa: microchip: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style
in header files related to Distributed Switch Architecture
drivers for Microchip KSZ series switch support.
For C header files Documentation/process/license-rules.rst
mandates C-like comments (opposed to C source files where
C++ style should be used)
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Sat, 12 Oct 2019 04:03:33 +0000 (21:03 -0700)]
net: ethernet: broadcom: have drivers select DIMLIB as needed
NET_VENDOR_BROADCOM is intended to control a kconfig menu only.
It should not have anything to do with code generation.
As such, it should not select DIMLIB for all drivers under
NET_VENDOR_BROADCOM. Instead each driver that needs DIMLIB should
select it (being the symbols SYSTEMPORT, BNXT, and BCMGENET).
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907021810220.13058@ramsan.of.borg/
Fixes: 4f75da3666c0 ("linux/dim: Move implementation to .c files")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Uwe Kleine-König <uwe@kleine-koenig.org>
Cc: Tal Gilboa <talgi@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 12 Oct 2019 02:43:03 +0000 (20:43 -0600)]
net: Update address for vrf and l3mdev in MAINTAINERS
Use my kernel.org address for all entries in MAINTAINERS.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 11 Oct 2019 19:53:49 +0000 (12:53 -0700)]
net: bcmgenet: Set phydev->dev_flags only for internal PHYs
phydev->dev_flags is entirely dependent on the PHY device driver which
is going to be used, setting the internal GENET PHY revision in those
bits only makes sense when drivers/net/phy/bcm7xxx.c is the PHY driver
being used.
Fixes: 487320c54143 ("net: bcmgenet: communicate integrated PHY revision to PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 12 Oct 2019 01:14:55 +0000 (18:14 -0700)]
blackhole_netdev: fix syzkaller reported issue
While invalidating the dst, we assign backhole_netdev instead of
loopback device. However, this device does not have idev pointer
and hence no ip6_ptr even if IPv6 is enabled. Possibly this has
triggered the syzbot reported crash.
The syzbot report does not have reproducer, however, this is the
only device that doesn't have matching idev created.
Crash instruction is :
static inline bool ip6_ignore_linkdown(const struct net_device *dev)
{
const struct inet6_dev *idev = __in6_dev_get(dev);
return !!idev->cnf.ignore_routes_with_linkdown; <= crash
}
Also ipv6 always assumes presence of idev and never checks for it
being NULL (as does the above referenced code). So adding a idev
for the blackhole_netdev to avoid this class of crashes in the future.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Oct 2019 16:40:57 +0000 (12:40 -0400)]
Merge tag 'wireless-drivers-for-davem-2019-10-15' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 5.4
Second set of fixes for 5.4. ath10k regression and iwlwifi BAD_COMMAND
bug are the ones getting most reports at the moment.
ath10k
* fix throughput regression on QCA98XX
iwlwifi
* fix initialization of 3168 devices (the infamous BAD_COMMAND bug)
* other smaller fixes
rt2x00
* don't include input-polldev.h header
* fix hw reset to work during first 5 minutes of system run
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Oct 2019 00:01:53 +0000 (17:01 -0700)]
Merge branch 'aquantia-fixes'
Igor Russkikh says:
====================
Aquantia/Marvell AQtion atlantic driver fixes 10/2019
Here is a set of various bugfixes, to be considered for stable as well.
V2: double space removed
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Fri, 11 Oct 2019 13:45:23 +0000 (13:45 +0000)]
net: aquantia: correctly handle macvlan and multicast coexistence
macvlan and multicast handling is now mixed up.
The explicit issue is that macvlan interface gets broken (no traffic)
after clearing MULTICAST flag on the real interface.
We now do separate logic and consider both ALLMULTI and MULTICAST
flags on the device.
Fixes: 11ba961c9161 ("net: aquantia: Fix IFF_ALLMULTI flag functionality")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Fri, 11 Oct 2019 13:45:22 +0000 (13:45 +0000)]
net: aquantia: do not pass lro session with invalid tcp checksum
Individual descriptors on LRO TCP session should be checked
for CRC errors. It was discovered that HW recalculates
L4 checksums on LRO session and does not break it up on bad L4
csum.
Thus, driver should aggregate HW LRO L4 statuses from all individual
buffers of LRO session and drop packet if one of the buffers has bad
L4 checksum.
Fixes: f38f1ee8aeb2 ("net: aquantia: check rx csum for all packets in LRO session")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 11 Oct 2019 13:45:20 +0000 (13:45 +0000)]
net: aquantia: when cleaning hw cache it should be toggled
>From HW specification to correctly reset HW caches (this is a required
workaround when stopping the device), register bit should actually
be toggled.
It was previosly always just set. Due to the way driver stops HW this
never actually caused any issues, but it still may, so cleaning this up.
Fixes: 7a1bb49461b1 ("net: aquantia: fix potential IOMMU fault after driver unbind")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 11 Oct 2019 13:45:19 +0000 (13:45 +0000)]
net: aquantia: temperature retrieval fix
Chip temperature is a two byte word, colocated internally with cable
length data. We do all readouts from HW memory by dwords, thus
we should clear extra high bytes, otherwise temperature output
gets weird as soon as we attach a cable to the NIC.
Fixes: 8f8940118654 ("net: aquantia: add infrastructure to readout chip temperature")
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miaoqing Pan [Thu, 29 Aug 2019 02:45:12 +0000 (10:45 +0800)]
ath10k: fix latency issue for QCA988x
(kvalo: cherry picked from commit
1340cc631bd00431e2f174525c971f119df9efa1 in
wireless-drivers-next to wireless-drivers as this a frequently reported
regression)
Bad latency is found on QCA988x, the issue was introduced by
commit
4504f0e5b571 ("ath10k: sdio: workaround firmware UART
pin configuration bug"). If uart_pin_workaround is false, this
change will set uart pin even if uart_print is false.
Tested HW: QCA9880
Tested FW: 10.2.4-1.0-00037
Fixes: 4504f0e5b571 ("ath10k: sdio: workaround firmware UART pin configuration bug")
Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
YueHaibing [Fri, 11 Oct 2019 09:46:53 +0000 (17:46 +0800)]
netdevsim: Fix error handling in nsim_fib_init and nsim_fib_exit
In nsim_fib_init(), if register_fib_notifier failed, nsim_fib_net_ops
should be unregistered before return.
In nsim_fib_exit(), unregister_fib_notifier should be called before
nsim_fib_net_ops be unregistered, otherwise may cause use-after-free:
BUG: KASAN: use-after-free in nsim_fib_event_nb+0x342/0x570 [netdevsim]
Read of size 8 at addr
ffff8881daaf4388 by task kworker/0:3/3499
CPU: 0 PID: 3499 Comm: kworker/0:3 Not tainted 5.3.0-rc7+ #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Workqueue: ipv6_addrconf addrconf_dad_work [ipv6]
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xa9/0x10e lib/dump_stack.c:113
print_address_description+0x65/0x380 mm/kasan/report.c:351
__kasan_report+0x149/0x18d mm/kasan/report.c:482
kasan_report+0xe/0x20 mm/kasan/common.c:618
nsim_fib_event_nb+0x342/0x570 [netdevsim]
notifier_call_chain+0x52/0xf0 kernel/notifier.c:95
__atomic_notifier_call_chain+0x78/0x140 kernel/notifier.c:185
call_fib_notifiers+0x30/0x60 net/core/fib_notifier.c:30
call_fib6_entry_notifiers+0xc1/0x100 [ipv6]
fib6_add+0x92e/0x1b10 [ipv6]
__ip6_ins_rt+0x40/0x60 [ipv6]
ip6_ins_rt+0x84/0xb0 [ipv6]
__ipv6_ifa_notify+0x4b6/0x550 [ipv6]
ipv6_ifa_notify+0xa5/0x180 [ipv6]
addrconf_dad_completed+0xca/0x640 [ipv6]
addrconf_dad_work+0x296/0x960 [ipv6]
process_one_work+0x5c0/0xc00 kernel/workqueue.c:2269
worker_thread+0x5c/0x670 kernel/workqueue.c:2415
kthread+0x1d7/0x200 kernel/kthread.c:255
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Allocated by task 3388:
save_stack+0x19/0x80 mm/kasan/common.c:69
set_track mm/kasan/common.c:77 [inline]
__kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:493
kmalloc include/linux/slab.h:557 [inline]
kzalloc include/linux/slab.h:748 [inline]
ops_init+0xa9/0x220 net/core/net_namespace.c:127
__register_pernet_operations net/core/net_namespace.c:1135 [inline]
register_pernet_operations+0x1d4/0x420 net/core/net_namespace.c:1212
register_pernet_subsys+0x24/0x40 net/core/net_namespace.c:1253
nsim_fib_init+0x12/0x70 [netdevsim]
veth_get_link_ksettings+0x2b/0x50 [veth]
do_one_initcall+0xd4/0x454 init/main.c:939
do_init_module+0xe0/0x330 kernel/module.c:3490
load_module+0x3c2f/0x4620 kernel/module.c:3841
__do_sys_finit_module+0x163/0x190 kernel/module.c:3931
do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 3534:
save_stack+0x19/0x80 mm/kasan/common.c:69
set_track mm/kasan/common.c:77 [inline]
__kasan_slab_free+0x130/0x180 mm/kasan/common.c:455
slab_free_hook mm/slub.c:1423 [inline]
slab_free_freelist_hook mm/slub.c:1474 [inline]
slab_free mm/slub.c:3016 [inline]
kfree+0xe9/0x2d0 mm/slub.c:3957
ops_free net/core/net_namespace.c:151 [inline]
ops_free_list.part.7+0x156/0x220 net/core/net_namespace.c:184
ops_free_list net/core/net_namespace.c:182 [inline]
__unregister_pernet_operations net/core/net_namespace.c:1165 [inline]
unregister_pernet_operations+0x221/0x2a0 net/core/net_namespace.c:1224
unregister_pernet_subsys+0x1d/0x30 net/core/net_namespace.c:1271
nsim_fib_exit+0x11/0x20 [netdevsim]
nsim_module_exit+0x16/0x21 [netdevsim]
__do_sys_delete_module kernel/module.c:1015 [inline]
__se_sys_delete_module kernel/module.c:958 [inline]
__x64_sys_delete_module+0x244/0x330 kernel/module.c:958
do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 59c84b9fcf42 ("netdevsim: Restore per-network namespace accounting for fib entries")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cédric Le Goater [Fri, 11 Oct 2019 05:52:54 +0000 (07:52 +0200)]
net/ibmvnic: Fix EOI when running in XIVE mode.
pSeries machines on POWER9 processors can run with the XICS (legacy)
interrupt mode or with the XIVE exploitation interrupt mode. These
interrupt contollers have different interfaces for interrupt
management : XICS uses hcalls and XIVE loads and stores on a page.
H_EOI being a XICS interface the enable_scrq_irq() routine can fail
when the machine runs in XIVE mode.
Fix that by calling the EOI handler of the interrupt chip.
Fixes: f23e0643cd0b ("ibmvnic: Clear pending interrupt after device reset")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandre Belloni [Thu, 10 Oct 2019 20:46:06 +0000 (22:46 +0200)]
net: lpc_eth: avoid resetting twice
__lpc_eth_shutdown is called after __lpc_eth_reset but it is already
calling __lpc_eth_reset. Avoid resetting the IP twice.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 13 Oct 2019 17:13:08 +0000 (10:13 -0700)]
Merge branch 'tcp-address-KCSAN-reports-in-tcp_poll-part-I'
Eric Dumazet says:
====================
tcp: address KCSAN reports in tcp_poll() (part I)
This all started with a KCSAN report (included
in "tcp: annotate tp->rcv_nxt lockless reads" changelog)
tcp_poll() runs in a lockless way. This means that about
all accesses of tcp socket fields done in tcp_poll() context
need annotations otherwise KCSAN will complain about data-races.
While doing this detective work, I found a more serious bug,
addressed by the first patch ("tcp: add rcu protection around
tp->fastopen_rsk").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:46 +0000 (20:17 -0700)]
tcp: annotate sk->sk_wmem_queued lockless reads
For the sake of tcp_poll(), there are few places where we fetch
sk->sk_wmem_queued while this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make sure write
sides use corresponding WRITE_ONCE() to avoid store-tearing.
sk_wmem_queued_add() helper is added so that we can in
the future convert to ADD_ONCE() or equivalent if/when
available.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:45 +0000 (20:17 -0700)]
tcp: annotate sk->sk_sndbuf lockless reads
For the sake of tcp_poll(), there are few places where we fetch
sk->sk_sndbuf while this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make sure write
sides use corresponding WRITE_ONCE() to avoid store-tearing.
Note that other transports probably need similar fixes.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:44 +0000 (20:17 -0700)]
tcp: annotate sk->sk_rcvbuf lockless reads
For the sake of tcp_poll(), there are few places where we fetch
sk->sk_rcvbuf while this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make sure write
sides use corresponding WRITE_ONCE() to avoid store-tearing.
Note that other transports probably need similar fixes.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:43 +0000 (20:17 -0700)]
tcp: annotate tp->urg_seq lockless reads
There two places where we fetch tp->urg_seq while
this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make
sure write side use corresponding WRITE_ONCE() to avoid
store-tearing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:42 +0000 (20:17 -0700)]
tcp: annotate tp->snd_nxt lockless reads
There are few places where we fetch tp->snd_nxt while
this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make
sure write sides use corresponding WRITE_ONCE() to avoid
store-tearing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:41 +0000 (20:17 -0700)]
tcp: annotate tp->write_seq lockless reads
There are few places where we fetch tp->write_seq while
this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make
sure write sides use corresponding WRITE_ONCE() to avoid
store-tearing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:40 +0000 (20:17 -0700)]
tcp: annotate tp->copied_seq lockless reads
There are few places where we fetch tp->copied_seq while
this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make
sure write sides use corresponding WRITE_ONCE() to avoid
store-tearing.
Note that tcp_inq_hint() was already using READ_ONCE(tp->copied_seq)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:39 +0000 (20:17 -0700)]
tcp: annotate tp->rcv_nxt lockless reads
There are few places where we fetch tp->rcv_nxt while
this field can change from IRQ or other cpu.
We need to add READ_ONCE() annotations, and also make
sure write sides use corresponding WRITE_ONCE() to avoid
store-tearing.
Note that tcp_inq_hint() was already using READ_ONCE(tp->rcv_nxt)
syzbot reported :
BUG: KCSAN: data-race in tcp_poll / tcp_queue_rcv
write to 0xffff888120425770 of 4 bytes by interrupt on cpu 0:
tcp_rcv_nxt_update net/ipv4/tcp_input.c:3365 [inline]
tcp_queue_rcv+0x180/0x380 net/ipv4/tcp_input.c:4638
tcp_rcv_established+0xbf1/0xf50 net/ipv4/tcp_input.c:5616
tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1542
tcp_v4_rcv+0x1a03/0x1bf0 net/ipv4/tcp_ipv4.c:1923
ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204
ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
NF_HOOK include/linux/netfilter.h:305 [inline]
NF_HOOK include/linux/netfilter.h:299 [inline]
ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
dst_input include/net/dst.h:442 [inline]
ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
NF_HOOK include/linux/netfilter.h:305 [inline]
NF_HOOK include/linux/netfilter.h:299 [inline]
ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
__netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
__netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
napi_skb_finish net/core/dev.c:5671 [inline]
napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
read to 0xffff888120425770 of 4 bytes by task 7254 on cpu 1:
tcp_stream_is_readable net/ipv4/tcp.c:480 [inline]
tcp_poll+0x204/0x6b0 net/ipv4/tcp.c:554
sock_poll+0xed/0x250 net/socket.c:1256
vfs_poll include/linux/poll.h:90 [inline]
ep_item_poll.isra.0+0x90/0x190 fs/eventpoll.c:892
ep_send_events_proc+0x113/0x5c0 fs/eventpoll.c:1749
ep_scan_ready_list.constprop.0+0x189/0x500 fs/eventpoll.c:704
ep_send_events fs/eventpoll.c:1793 [inline]
ep_poll+0xe3/0x900 fs/eventpoll.c:1930
do_epoll_wait+0x162/0x180 fs/eventpoll.c:2294
__do_sys_epoll_pwait fs/eventpoll.c:2325 [inline]
__se_sys_epoll_pwait fs/eventpoll.c:2311 [inline]
__x64_sys_epoll_pwait+0xcd/0x170 fs/eventpoll.c:2311
do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7254 Comm: syz-fuzzer Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 11 Oct 2019 03:17:38 +0000 (20:17 -0700)]
tcp: add rcu protection around tp->fastopen_rsk
Both tcp_v4_err() and tcp_v6_err() do the following operations
while they do not own the socket lock :
fastopen = tp->fastopen_rsk;
snd_una = fastopen ? tcp_rsk(fastopen)->snt_isn : tp->snd_una;
The problem is that without appropriate barrier, the compiler
might reload tp->fastopen_rsk and trigger a NULL deref.
request sockets are protected by RCU, we can simply add
the missing annotations and barriers to solve the issue.
Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 12 Oct 2019 18:21:56 +0000 (11:21 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:
====================
pull-request: bpf 2019-10-12
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) a bunch of small fixes. Nothing critical.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Thu, 10 Oct 2019 14:52:34 +0000 (15:52 +0100)]
rxrpc: Fix possible NULL pointer access in ICMP handling
If an ICMP packet comes in on the UDP socket backing an AF_RXRPC socket as
the UDP socket is being shut down, rxrpc_error_report() may get called to
deal with it after sk_user_data on the UDP socket has been cleared, leading
to a NULL pointer access when this local endpoint record gets accessed.
Fix this by just returning immediately if sk_user_data was NULL.
The oops looks like the following:
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
...
RIP: 0010:rxrpc_error_report+0x1bd/0x6a9
...
Call Trace:
? sock_queue_err_skb+0xbd/0xde
? __udp4_lib_err+0x313/0x34d
__udp4_lib_err+0x313/0x34d
icmp_unreach+0x1ee/0x207
icmp_rcv+0x25b/0x28f
ip_protocol_deliver_rcu+0x95/0x10e
ip_local_deliver+0xe9/0x148
__netif_receive_skb_one_core+0x52/0x6e
process_backlog+0xdc/0x177
net_rx_action+0xf9/0x270
__do_softirq+0x1b6/0x39a
? smpboot_register_percpu_thread+0xce/0xce
run_ksoftirqd+0x1d/0x42
smpboot_thread_fn+0x19e/0x1b3
kthread+0xf1/0xf6
? kthread_delayed_work_timer_fn+0x83/0x83
ret_from_fork+0x24/0x30
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Reported-by: syzbot+611164843bd48cc2190c@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 11 Oct 2019 02:16:06 +0000 (19:16 -0700)]
Merge branch 'smc-fixes'
Karsten Graul says:
====================
Fixes for -net, addressing two races in SMC receive path and
add a missing cleanup when the link group creating fails with
ISM devices and a VLAN id.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Karsten Graul [Thu, 10 Oct 2019 08:16:11 +0000 (10:16 +0200)]
net/smc: receive pending data after RCV_SHUTDOWN
smc_rx_recvmsg() first checks if data is available, and then if
RCV_SHUTDOWN is set. There is a race when smc_cdc_msg_recv_action() runs
in between these 2 checks, receives data and sets RCV_SHUTDOWN.
In that case smc_rx_recvmsg() would return from receive without to
process the available data.
Fix that with a final check for data available if RCV_SHUTDOWN is set.
Move the check for data into a function and call it twice.
And use the existing helper smc_rx_data_available().
Fixes: 952310ccf2d8 ("smc: receive data from RMBE")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Karsten Graul [Thu, 10 Oct 2019 08:16:10 +0000 (10:16 +0200)]
net/smc: receive returns without data
smc_cdc_rxed_any_close_or_senddone() is used as an end condition for the
receive loop. This conflicts with smc_cdc_msg_recv_action() which could
run in parallel and set the bits checked by
smc_cdc_rxed_any_close_or_senddone() before the receive is processed.
In that case we could return from receive with no data, although data is
available. The same applies to smc_rx_wait().
Fix this by checking for RCV_SHUTDOWN only, which is set in
smc_cdc_msg_recv_action() after the receive was actually processed.
Fixes: 952310ccf2d8 ("smc: receive data from RMBE")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Ursula Braun [Thu, 10 Oct 2019 08:16:09 +0000 (10:16 +0200)]
net/smc: fix SMCD link group creation with VLAN id
If creation of an SMCD link group with VLAN id fails, the initial
smc_ism_get_vlan() step has to be reverted as well.
Fixes: c6ba7c9ba43d ("net/smc: add base infrastructure for SMC-D and ISM")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Jacob Keller [Wed, 9 Oct 2019 19:18:31 +0000 (12:18 -0700)]
net: update net_dim documentation after rename
Commit
8960b38932be ("linux/dim: Rename externally used net_dim
members") renamed the net_dim API, removing the "net_" prefix from the
structures and functions. The patch didn't update the net_dim.txt
documentation file.
Fix the documentation so that its examples match the current code.
Fixes: 8960b38932be ("linux/dim: Rename externally used net_dim members", 2019-06-25)
Fixes: c002bd529d71 ("linux/dim: Rename externally exposed macros", 2019-06-25)
Fixes: 4f75da3666c0 ("linux/dim: Move implementation to .c files")
Cc: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Heiner Kallweit [Wed, 9 Oct 2019 18:55:48 +0000 (20:55 +0200)]
r8169: fix jumbo packet handling on resume from suspend
Mariusz reported that invalid packets are sent after resume from
suspend if jumbo packets are active. It turned out that his BIOS
resets chip settings to non-jumbo on resume. Most chip settings are
re-initialized on resume from suspend by calling rtl_hw_start(),
so let's add configuring jumbo to this function.
There's nothing wrong with the commit marked as fixed, it's just
the first one where the patch applies cleanly.
Fixes: 7366016d2d4c ("r8169: read common register for PCI commit")
Reported-by: Mariusz Bialonczyk <manio@skyboo.net>
Tested-by: Mariusz Bialonczyk <manio@skyboo.net>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 22:41:03 +0000 (15:41 -0700)]
net: silence KCSAN warnings about sk->sk_backlog.len reads
sk->sk_backlog.len can be written by BH handlers, and read
from process contexts in a lockless way.
Note the write side should also use WRITE_ONCE() or a variant.
We need some agreement about the best way to do this.
syzbot reported :
BUG: KCSAN: data-race in tcp_add_backlog / tcp_grow_window.isra.0
write to 0xffff88812665f32c of 4 bytes by interrupt on cpu 1:
sk_add_backlog include/net/sock.h:934 [inline]
tcp_add_backlog+0x4a0/0xcc0 net/ipv4/tcp_ipv4.c:1737
tcp_v4_rcv+0x1aba/0x1bf0 net/ipv4/tcp_ipv4.c:1925
ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204
ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
NF_HOOK include/linux/netfilter.h:305 [inline]
NF_HOOK include/linux/netfilter.h:299 [inline]
ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
dst_input include/net/dst.h:442 [inline]
ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
NF_HOOK include/linux/netfilter.h:305 [inline]
NF_HOOK include/linux/netfilter.h:299 [inline]
ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
__netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
__netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
napi_skb_finish net/core/dev.c:5671 [inline]
napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
virtnet_receive drivers/net/virtio_net.c:1323 [inline]
virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
napi_poll net/core/dev.c:6352 [inline]
net_rx_action+0x3ae/0xa50 net/core/dev.c:6418
read to 0xffff88812665f32c of 4 bytes by task 7292 on cpu 0:
tcp_space include/net/tcp.h:1373 [inline]
tcp_grow_window.isra.0+0x6b/0x480 net/ipv4/tcp_input.c:413
tcp_event_data_recv+0x68f/0x990 net/ipv4/tcp_input.c:717
tcp_rcv_established+0xbfe/0xf50 net/ipv4/tcp_input.c:5618
tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1542
sk_backlog_rcv include/net/sock.h:945 [inline]
__release_sock+0x135/0x1e0 net/core/sock.c:2427
release_sock+0x61/0x160 net/core/sock.c:2943
tcp_recvmsg+0x63b/0x1a30 net/ipv4/tcp.c:2181
inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
sock_recvmsg_nosec net/socket.c:871 [inline]
sock_recvmsg net/socket.c:889 [inline]
sock_recvmsg+0x92/0xb0 net/socket.c:885
sock_read_iter+0x15f/0x1e0 net/socket.c:967
call_read_iter include/linux/fs.h:1864 [inline]
new_sync_read+0x389/0x4f0 fs/read_write.c:414
__vfs_read+0xb1/0xc0 fs/read_write.c:427
vfs_read fs/read_write.c:461 [inline]
vfs_read+0x143/0x2c0 fs/read_write.c:446
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 7292 Comm: syz-fuzzer Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 22:32:35 +0000 (15:32 -0700)]
net: annotate sk->sk_rcvlowat lockless reads
sock_rcvlowat() or int_sk_rcvlowat() might be called without the socket
lock for example from tcp_poll().
Use READ_ONCE() to document the fact that other cpus might change
sk->sk_rcvlowat under us and avoid KCSAN splats.
Use WRITE_ONCE() on write sides too.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 22:21:13 +0000 (15:21 -0700)]
net: silence KCSAN warnings around sk_add_backlog() calls
sk_add_backlog() callers usually read sk->sk_rcvbuf without
owning the socket lock. This means sk_rcvbuf value can
be changed by other cpus, and KCSAN complains.
Add READ_ONCE() annotations to document the lockless nature
of these reads.
Note that writes over sk_rcvbuf should also use WRITE_ONCE(),
but this will be done in separate patches to ease stable
backports (if we decide this is relevant for stable trees).
BUG: KCSAN: data-race in tcp_add_backlog / tcp_recvmsg
write to 0xffff88812ab369f8 of 8 bytes by interrupt on cpu 1:
__sk_add_backlog include/net/sock.h:902 [inline]
sk_add_backlog include/net/sock.h:933 [inline]
tcp_add_backlog+0x45a/0xcc0 net/ipv4/tcp_ipv4.c:1737
tcp_v4_rcv+0x1aba/0x1bf0 net/ipv4/tcp_ipv4.c:1925
ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204
ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
NF_HOOK include/linux/netfilter.h:305 [inline]
NF_HOOK include/linux/netfilter.h:299 [inline]
ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
dst_input include/net/dst.h:442 [inline]
ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
NF_HOOK include/linux/netfilter.h:305 [inline]
NF_HOOK include/linux/netfilter.h:299 [inline]
ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
__netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
__netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
napi_skb_finish net/core/dev.c:5671 [inline]
napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
virtnet_receive drivers/net/virtio_net.c:1323 [inline]
virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
napi_poll net/core/dev.c:6352 [inline]
net_rx_action+0x3ae/0xa50 net/core/dev.c:6418
read to 0xffff88812ab369f8 of 8 bytes by task 7271 on cpu 0:
tcp_recvmsg+0x470/0x1a30 net/ipv4/tcp.c:2047
inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
sock_recvmsg_nosec net/socket.c:871 [inline]
sock_recvmsg net/socket.c:889 [inline]
sock_recvmsg+0x92/0xb0 net/socket.c:885
sock_read_iter+0x15f/0x1e0 net/socket.c:967
call_read_iter include/linux/fs.h:1864 [inline]
new_sync_read+0x389/0x4f0 fs/read_write.c:414
__vfs_read+0xb1/0xc0 fs/read_write.c:427
vfs_read fs/read_write.c:461 [inline]
vfs_read+0x143/0x2c0 fs/read_write.c:446
ksys_read+0xd5/0x1b0 fs/read_write.c:587
__do_sys_read fs/read_write.c:597 [inline]
__se_sys_read fs/read_write.c:595 [inline]
__x64_sys_read+0x4c/0x60 fs/read_write.c:595
do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 7271 Comm: syz-fuzzer Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 22:10:15 +0000 (15:10 -0700)]
tcp: annotate lockless access to tcp_memory_pressure
tcp_memory_pressure is read without holding any lock,
and its value could be changed on other cpus.
Use READ_ONCE() to annotate these lockless reads.
The write side is already using atomic ops.
Fixes: b8da51ebb1aa ("tcp: introduce tcp_under_memory_pressure()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 21:51:20 +0000 (14:51 -0700)]
net: add {READ|WRITE}_ONCE() annotations on ->rskq_accept_head
reqsk_queue_empty() is called from inet_csk_listen_poll() while
other cpus might write ->rskq_accept_head value.
Use {READ|WRITE}_ONCE() to avoid compiler tricks
and potential KCSAN splats.
Fixes: fff1f3001cc5 ("tcp: add a spinlock to protect struct request_sock_queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 19:55:53 +0000 (12:55 -0700)]
net: avoid possible false sharing in sk_leave_memory_pressure()
As mentioned in https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE#it-may-improve-performance
a C compiler can legally transform :
if (memory_pressure && *memory_pressure)
*memory_pressure = 0;
to :
if (memory_pressure)
*memory_pressure = 0;
Fixes: 0604475119de ("tcp: add TCPMemoryPressuresChrono counter")
Fixes: 180d8cd942ce ("foundations of per-cgroup memory pressure controlling.")
Fixes: 3ab224be6d69 ("[NET] CORE: Introducing new memory accounting interface.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 16:20:02 +0000 (09:20 -0700)]
tun: remove possible false sharing in tun_flow_update()
As mentioned in https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE#it-may-improve-performance
a C compiler can legally transform
if (e->queue_index != queue_index)
e->queue_index = queue_index;
to :
e->queue_index = queue_index;
Note that the code using jiffies has no issue, since jiffies
has volatile attribute.
if (e->updated != jiffies)
e->updated = jiffies;
Fixes: 83b1bc122cab ("tun: align write-heavy flow entry members to a cache line")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Zhang Yu <zhangyu31@baidu.com>
Cc: Wang Li <wangli39@baidu.com>
Cc: Li RongQing <lirongqing@baidu.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Wed, 9 Oct 2019 16:19:13 +0000 (09:19 -0700)]
netfilter: conntrack: avoid possible false sharing
As hinted by KCSAN, we need at least one READ_ONCE()
to prevent a compiler optimization.
More details on :
https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE#it-may-improve-performance
sysbot report :
BUG: KCSAN: data-race in __nf_ct_refresh_acct / __nf_ct_refresh_acct
read to 0xffff888123eb4f08 of 4 bytes by interrupt on cpu 0:
__nf_ct_refresh_acct+0xd4/0x1b0 net/netfilter/nf_conntrack_core.c:1796
nf_ct_refresh_acct include/net/netfilter/nf_conntrack.h:201 [inline]
nf_conntrack_tcp_packet+0xd40/0x3390 net/netfilter/nf_conntrack_proto_tcp.c:1161
nf_conntrack_handle_packet net/netfilter/nf_conntrack_core.c:1633 [inline]
nf_conntrack_in+0x410/0xaa0 net/netfilter/nf_conntrack_core.c:1727
ipv4_conntrack_in+0x27/0x40 net/netfilter/nf_conntrack_proto.c:178
nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline]
nf_hook_slow+0x83/0x160 net/netfilter/core.c:512
nf_hook include/linux/netfilter.h:260 [inline]
NF_HOOK include/linux/netfilter.h:303 [inline]
ip_rcv+0x12f/0x1a0 net/ipv4/ip_input.c:523
__netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
__netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
napi_skb_finish net/core/dev.c:5671 [inline]
napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
virtnet_receive drivers/net/virtio_net.c:1323 [inline]
virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
napi_poll net/core/dev.c:6352 [inline]
net_rx_action+0x3ae/0xa50 net/core/dev.c:6418
__do_softirq+0x115/0x33f kernel/softirq.c:292
write to 0xffff888123eb4f08 of 4 bytes by task 7191 on cpu 1:
__nf_ct_refresh_acct+0xfb/0x1b0 net/netfilter/nf_conntrack_core.c:1797
nf_ct_refresh_acct include/net/netfilter/nf_conntrack.h:201 [inline]
nf_conntrack_tcp_packet+0xd40/0x3390 net/netfilter/nf_conntrack_proto_tcp.c:1161
nf_conntrack_handle_packet net/netfilter/nf_conntrack_core.c:1633 [inline]
nf_conntrack_in+0x410/0xaa0 net/netfilter/nf_conntrack_core.c:1727
ipv4_conntrack_local+0xbe/0x130 net/netfilter/nf_conntrack_proto.c:200
nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline]
nf_hook_slow+0x83/0x160 net/netfilter/core.c:512
nf_hook include/linux/netfilter.h:260 [inline]
__ip_local_out+0x1f7/0x2b0 net/ipv4/ip_output.c:114
ip_local_out+0x31/0x90 net/ipv4/ip_output.c:123
__ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
ip_queue_xmit+0x45/0x60 include/net/ip.h:236
__tcp_transmit_skb+0xdeb/0x1cd0 net/ipv4/tcp_output.c:1158
__tcp_send_ack+0x246/0x300 net/ipv4/tcp_output.c:3685
tcp_send_ack+0x34/0x40 net/ipv4/tcp_output.c:3691
tcp_cleanup_rbuf+0x130/0x360 net/ipv4/tcp.c:1575
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7191 Comm: syz-fuzzer Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: cc16921351d8 ("netfilter: conntrack: avoid same-timeout update")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Nicolas Dichtel [Wed, 9 Oct 2019 09:19:10 +0000 (11:19 +0200)]
netns: fix NLM_F_ECHO mechanism for RTM_NEWNSID
The flag NLM_F_ECHO aims to reply to the user the message notified to all
listeners.
It was not the case with the command RTM_NEWNSID, let's fix this.
Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids")
Reported-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Tested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Daniele Palmas [Wed, 9 Oct 2019 09:07:18 +0000 (11:07 +0200)]
net: usb: qmi_wwan: add Telit 0x1050 composition
This patch adds support for Telit FN980 0x1050 composition
0x1050: tty, adb, rmnet, tty, tty, tty, tty
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
YueHaibing [Wed, 9 Oct 2019 03:10:52 +0000 (11:10 +0800)]
act_mirred: Fix mirred_init_module error handling
If tcf_register_action failed, mirred_device_notifier
should be unregistered.
Fixes: 3b87956ea645 ("net sched: fix race in mirred device removal")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Vinicius Costa Gomes [Tue, 8 Oct 2019 23:20:07 +0000 (16:20 -0700)]
net: taprio: Fix returning EINVAL when configuring without flags
When configuring a taprio instance if "flags" is not specified (or
it's zero), taprio currently replies with an "Invalid argument" error.
So, set the return value to zero after we are done with all the
checks.
Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Jakub Kicinski [Thu, 10 Oct 2019 00:57:36 +0000 (17:57 -0700)]
Merge branch 's390-qeth-fixes'
Julian Wiedmann says:
====================
s390/qeth: fixes 2019-10-08
Alexandra fixes two issues in the initialization code for vnicc cmds.
One is an uninitialized variable when a cmd fails, the other that we
wouldn't recover correctly if the device's supported features changed.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Alexandra Winter [Tue, 8 Oct 2019 16:21:07 +0000 (18:21 +0200)]
s390/qeth: Fix initialization of vnicc cmd masks during set online
Without this patch, a command bit in the supported commands mask is only
ever set to unsupported during set online. If a command is ever marked as
unsupported (e.g. because of error during qeth_l2_vnicc_query_cmds),
subsequent successful initialization (offline/online) would not bring it
back.
Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Alexandra Winter [Tue, 8 Oct 2019 16:21:06 +0000 (18:21 +0200)]
s390/qeth: Fix error handling during VNICC initialization
Smatch discovered the use of uninitialized variable sup_cmds
in error paths.
Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Randy Dunlap [Tue, 8 Oct 2019 15:39:04 +0000 (08:39 -0700)]
phylink: fix kernel-doc warnings
Fix kernel-doc warnings in phylink.c:
../drivers/net/phy/phylink.c:595: warning: Function parameter or member 'config' not described in 'phylink_create'
../drivers/net/phy/phylink.c:595: warning: Excess function parameter 'ndev' description in 'phylink_create'
Fixes: 8796c8923d9c ("phylink: add documentation for kernel APIs")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Xin Long [Tue, 8 Oct 2019 11:09:23 +0000 (19:09 +0800)]
sctp: add chunks to sk_backlog when the newsk sk_socket is not set
This patch is to fix a NULL-ptr deref in selinux_socket_connect_helper:
[...] kasan: GPF could be caused by NULL-ptr deref or user memory access
[...] RIP: 0010:selinux_socket_connect_helper+0x94/0x460
[...] Call Trace:
[...] selinux_sctp_bind_connect+0x16a/0x1d0
[...] security_sctp_bind_connect+0x58/0x90
[...] sctp_process_asconf+0xa52/0xfd0 [sctp]
[...] sctp_sf_do_asconf+0x785/0x980 [sctp]
[...] sctp_do_sm+0x175/0x5a0 [sctp]
[...] sctp_assoc_bh_rcv+0x285/0x5b0 [sctp]
[...] sctp_backlog_rcv+0x482/0x910 [sctp]
[...] __release_sock+0x11e/0x310
[...] release_sock+0x4f/0x180
[...] sctp_accept+0x3f9/0x5a0 [sctp]
[...] inet_accept+0xe7/0x720
It was caused by that the 'newsk' sk_socket was not set before going to
security sctp hook when processing asconf chunk with SCTP_PARAM_ADD_IP
or SCTP_PARAM_SET_PRIMARY:
inet_accept()->
sctp_accept():
lock_sock():
lock listening 'sk'
do_softirq():
sctp_rcv(): <-- [1]
asconf chunk arrives and
enqueued in 'sk' backlog
sctp_sock_migrate():
set asoc's sk to 'newsk'
release_sock():
sctp_backlog_rcv():
lock 'newsk'
sctp_process_asconf() <-- [2]
unlock 'newsk'
sock_graft():
set sk_socket <-- [3]
As it shows, at [1] the asconf chunk would be put into the listening 'sk'
backlog, as accept() was holding its sock lock. Then at [2] asconf would
get processed with 'newsk' as asoc's sk had been set to 'newsk'. However,
'newsk' sk_socket is not set until [3], while selinux_sctp_bind_connect()
would deref it, then kernel crashed.
Here to fix it by adding the chunk to sk_backlog until newsk sk_socket is
set when .accept() is done.
Note that sk->sk_socket can be NULL when the sock is closed, so SOCK_DEAD
flag is also needed to check in sctp_newsk_ready().
Thanks to Ondrej for reviewing the code.
Fixes: d452930fd3b9 ("selinux: Add SCTP support")
Reported-by: Ying Xu <yinxu@redhat.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Mon, 7 Oct 2019 22:43:01 +0000 (15:43 -0700)]
bonding: fix potential NULL deref in bond_update_slave_arr
syzbot got a NULL dereference in bond_update_slave_arr() [1],
happening after a failure to allocate bond->slave_arr
A workqueue (bond_slave_arr_handler) is supposed to retry
the allocation later, but if the slave is removed before
the workqueue had a chance to complete, bond->slave_arr
can still be NULL.
[1]
Failed to build slave-array.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
Modules linked in:
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:bond_update_slave_arr.cold+0xc6/0x198 drivers/net/bonding/bond_main.c:4039
RSP: 0018:
ffff88018fe33678 EFLAGS:
00010246
RAX:
dffffc0000000000 RBX:
0000000000000000 RCX:
ffffc9000290b000
RDX:
0000000000000000 RSI:
ffffffff82b63037 RDI:
ffff88019745ea20
RBP:
ffff88018fe33760 R08:
ffff880170754280 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000000 R12:
0000000000000000
R13:
ffff88019745ea00 R14:
0000000000000000 R15:
ffff88018fe338b0
FS:
00007febd837d700(0000) GS:
ffff8801dad00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000004540a0 CR3:
00000001c242e005 CR4:
00000000001626f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
[<
ffffffff82b5b45e>] __bond_release_one+0x43e/0x500 drivers/net/bonding/bond_main.c:1923
[<
ffffffff82b5b966>] bond_release drivers/net/bonding/bond_main.c:2039 [inline]
[<
ffffffff82b5b966>] bond_do_ioctl+0x416/0x870 drivers/net/bonding/bond_main.c:3562
[<
ffffffff83ae25f4>] dev_ifsioc+0x6f4/0x940 net/core/dev_ioctl.c:328
[<
ffffffff83ae2e58>] dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:495
[<
ffffffff83995ffd>] sock_do_ioctl+0x1bd/0x300 net/socket.c:1088
[<
ffffffff83996a80>] sock_ioctl+0x300/0x5d0 net/socket.c:1196
[<
ffffffff81b124db>] vfs_ioctl fs/ioctl.c:47 [inline]
[<
ffffffff81b124db>] file_ioctl fs/ioctl.c:501 [inline]
[<
ffffffff81b124db>] do_vfs_ioctl+0xacb/0x1300 fs/ioctl.c:688
[<
ffffffff81b12dc6>] SYSC_ioctl fs/ioctl.c:705 [inline]
[<
ffffffff81b12dc6>] SyS_ioctl+0xb6/0xe0 fs/ioctl.c:696
[<
ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305
[<
ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Antonio Borneo [Mon, 7 Oct 2019 15:43:05 +0000 (17:43 +0200)]
net: stmmac: fix disabling flexible PPS output
Accordingly to Synopsys documentation [1] and [2], when bit PPSEN0
in register MAC_PPS_CONTROL is set it selects the functionality
command in the same register, otherwise selects the functionality
control.
Command functionality is required to either enable (command 0x2)
and disable (command 0x5) the flexible PPS output, but the bit
PPSEN0 is currently set only for enabling.
Set the bit PPSEN0 to properly disable flexible PPS output.
Tested on STM32MP15x, based on dwmac 4.10a.
[1] DWC Ethernet QoS Databook 4.10a October 2014
[2] DWC Ethernet QoS Databook 5.00a September 2017
Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Fixes: 9a8a02c9d46d ("net: stmmac: Add Flexible PPS support")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Antonio Borneo [Mon, 7 Oct 2019 15:43:04 +0000 (17:43 +0200)]
net: stmmac: fix length of PTP clock's name string
The field "name" in struct ptp_clock_info has a fixed size of 16
chars and is used as zero terminated string by clock_name_show()
in drivers/ptp/ptp_sysfs.c
The current initialization value requires 17 chars to fit also the
null termination, and this causes overflow to the next bytes in
the struct when the string is read as null terminated:
hexdump -C /sys/class/ptp/ptp0/clock_name
00000000 73 74 6d 6d 61 63 5f 70 74 70 5f 63 6c 6f 63 6b |stmmac_ptp_clock|
00000010 a0 ac b9 03 0a |.....|
where the extra 4 bytes (excluding the newline) after the string
represent the integer 0x03b9aca0 =
62500000 assigned to the field
"max_adj" that follows "name" in the same struct.
There is no strict requirement for the "name" content and in the
comment in ptp_clock_kernel.h it's reported it should just be 'A
short "friendly name" to identify the clock'.
Replace it with "stmmac ptp".
Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Fixes: 92ba6888510c ("stmmac: add the support for PTP hw clock driver")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Kalle Valo [Wed, 9 Oct 2019 18:10:12 +0000 (21:10 +0300)]
Merge tag 'iwlwifi-for-kalle-2019-10-09' of git://git./linux/kernel/git/iwlwifi/iwlwifi-fixes
First batch of fixes intended for v5.4
* fix for an ACPI table parsing bug;
* a fix for a NULL pointer dereference in the cfg with specific
devices;
* fix the rb_allocator;
* prevent multiple phy configuration with new devices;
* fix a race-condition in the rx queue;
* prevent a couple of memory leaks;
* fix initialization of 3168 devices (the infamous BAD_COMMAND bug);
* fix recognition of some newer systems with integrated MAC;
Luca Coelho [Tue, 8 Oct 2019 10:21:02 +0000 (13:21 +0300)]
iwlwifi: pcie: change qu with jf devices to use qu configuration
There were a bunch of devices with qu and jf that were loading the
configuration with pu and jf, which is wrong. Fix them all
accordingly. Additionally, remove 0x1010 and 0x1210 subsytem IDs from
the list, since they are obviously wrong, and 0x0044 and 0x0244, which
were duplicate.
Cc: stable@vger.kernel.org # 5.1+
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Luca Coelho [Tue, 8 Oct 2019 10:10:53 +0000 (13:10 +0300)]
iwlwifi: exclude GEO SAR support for 3168
We currently support two NICs in FW version 29, namely 7265D and 3168.
Out of these, only 7265D supports GEO SAR, so adjust the function that
checks for it accordingly.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Fixes: f5a47fae6aa3 ("iwlwifi: mvm: fix version check for GEO_TX_POWER_LIMIT support")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Navid Emamdoost [Fri, 27 Sep 2019 20:56:04 +0000 (15:56 -0500)]
iwlwifi: pcie: fix memory leaks in iwl_pcie_ctxt_info_gen3_init
In iwl_pcie_ctxt_info_gen3_init there are cases that the allocated dma
memory is leaked in case of error.
DMA memories prph_scratch, prph_info, and ctxt_info_gen3 are allocated
and initialized to be later assigned to trans_pcie. But in any error case
before such assignment the allocated memories should be released.
First of such error cases happens when iwl_pcie_init_fw_sec fails.
Current implementation correctly releases prph_scratch. But in two
sunsequent error cases where dma_alloc_coherent may fail, such
releases are missing.
This commit adds release for prph_scratch when allocation for
prph_info fails, and adds releases for prph_scratch and prph_info when
allocation for ctxt_info_gen3 fails.
Fixes: 2ee824026288 ("iwlwifi: pcie: support context information for 22560 devices")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>