Alex Elder [Wed, 24 Nov 2021 20:25:08 +0000 (14:25 -0600)]
net: ipa: explicitly disable HOLB drop during setup
During setup, ipa_endpoint_program() programs each endpoint with
various configuration parameters. One of those registers defines
whether to drop packets when a head-of-line blocking condition is
detected on an RX endpoint. We currently assume this is disabled;
instead, explicitly set it to be disabled.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 24 Nov 2021 20:25:07 +0000 (14:25 -0600)]
net: ipa: rework how HOL_BLOCK handling is specified
The head-of-line block (HOLB) drop timer is only meaningful when
dropping packets due to blocking is enabled. Given that, redefine
the interface so the timer is specified when enabling HOLB drop, and
use a different function when disabling.
To enable and disable HOLB drop, these functions will now be used:
ipa_endpoint_init_hol_block_enable(endpoint, milliseconds)
ipa_endpoint_init_hol_block_disable(endpoint)
The existing ipa_endpoint_init_hol_block_enable() becomes a helper
function, renamed ipa_endpoint_init_hol_block_en(), and used with
ipa_endpoint_init_hol_block_timer() to enable HOLB block on an
endpoint.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 24 Nov 2021 20:25:06 +0000 (14:25 -0600)]
net: ipa: zero unused portions of filter table memory
Not all filter table entries are used. Only certain endpoints
support filtering, and the table begins with a bitmap indicating
which endpoints use the "slots" that follow for filter rules.
Currently, unused filter table entries are not initialized.
Instead, zero-fill the entire unused portion of the filter table
memory regions, to make it more obvious that memory is unused (and
not subsequently modified).
This is not strictly necessary, but the result is reassuring when
looking at filter table memory.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 24 Nov 2021 20:25:05 +0000 (14:25 -0600)]
net: ipa: kill ipa_modem_init()
A recent commit made disabling the SMP2P "setup ready" interrupt
unrelated to ipa_modem_stop(). Given that, it seems fitting to get
rid of ipa_modem_init() and ipa_modem_exit() (which are trivial
wrapper functions), and call ipa_smp2p_init() and ipa_smp2p_exit()
directly instead.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Thu, 25 Nov 2021 12:58:08 +0000 (14:58 +0200)]
net: dsa: felix: enable cut-through forwarding between ports by default
The VSC9959 switch embedded within NXP LS1028A (and that version of
Ocelot switches only) supports cut-through forwarding - meaning it can
start the process of looking up the destination ports for a packet, and
forward towards those ports, before the entire packet has been received
(as opposed to the store-and-forward mode).
The up side is having lower forwarding latency for large packets. The
down side is that frames with FCS errors are forwarded instead of being
dropped. However, erroneous frames do not result in incorrect updates of
the FDB or incorrect policer updates, since these processes are deferred
inside the switch to the end of frame. Since the switch starts the
cut-through forwarding process after all packet headers (including IP,
if any) have been processed, packets with large headers and small
payload do not see the benefit of lower forwarding latency.
There are two cases that need special attention.
The first is when a packet is multicast (or flooded) to multiple
destinations, one of which doesn't have cut-through forwarding enabled.
The switch deals with this automatically by disabling cut-through
forwarding for the frame towards all destination ports.
The second is when a packet is forwarded from a port of lower link speed
towards a port of higher link speed. This is not handled by the hardware
and needs software intervention.
Since we practically need to update the cut-through forwarding domain
from paths that aren't serialized by the rtnl_mutex (phylink
mac_link_down/mac_link_up ops), this means we need to serialize physical
link events with user space updates of bonding/bridging domains.
Enabling cut-through forwarding is done per {egress port, traffic class}.
I don't see any reason why this would be a configurable option as long
as it works without issues, and there doesn't appear to be any user
space configuration tool to toggle this on/off, so this patch enables
cut-through forwarding on all eligible ports and traffic classes.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211125125808.2383984-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Thu, 25 Nov 2021 12:58:07 +0000 (14:58 +0200)]
net: ocelot: remove "bridge" argument from ocelot_get_bridge_fwd_mask
The only called takes ocelot_port->bridge and passes it as the "bridge"
argument to this function, which then compares it with
ocelot_port->bridge. This is not useful.
Instead, we would like this function to return 0 if ocelot_port->bridge
is not present, which is what this patch does.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211125125808.2383984-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Colin Ian King [Thu, 25 Nov 2021 00:29:32 +0000 (00:29 +0000)]
net: dsa: qca8k: Fix spelling mistake "Mismateched" -> "Mismatched"
There is a spelling mistake in a netdev_err error message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20211125002932.49217-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ong Boon Leong [Wed, 24 Nov 2021 11:40:19 +0000 (19:40 +0800)]
net: stmmac: perserve TX and RX coalesce value during XDP setup
When XDP program is loaded, it is desirable that the previous TX and RX
coalesce values are not re-inited to its default value. This prevents
unnecessary re-configurig the coalesce values that were working fine
before.
Fixes: ac746c8520d9 ("net: stmmac: enhance XDP ZC driver level switching performance")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20211124114019.3949125-1-boon.leong.ong@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yang Yingliang [Wed, 24 Nov 2021 08:40:48 +0000 (16:40 +0800)]
tsnep: Add missing of_node_put() in tsnep_mdio_init()
The node pointer is returned by of_get_child_by_name() with
refcount incremented in tsnep_mdio_init(). Calling of_node_put()
to aovid the refcount leak in tsnep_mdio_init().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211124084048.175456-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tonghao Zhang [Thu, 25 Nov 2021 02:54:44 +0000 (10:54 +0800)]
veth: use ethtool_sprintf instead of snprintf
use ethtools api ethtool_sprintf to instead of snprintf.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Link: https://lore.kernel.org/r/20211125025444.13115-1-xiangxia.m.yue@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Wed, 24 Nov 2021 15:44:43 +0000 (15:44 +0000)]
net: macb: convert to phylink_generic_validate()
Populate the supported interfaces bitmap and MAC capabilities mask for
the macb driver and remove the old validate implementation.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1mpuRv-00D4rb-Lz@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 24 Nov 2021 20:44:40 +0000 (21:44 +0100)]
r8169: disable detection of chip version 60
It seems only XID 609 made it to the mass market. Therefore let's
disable detection of the other RTL8125a XID's. If nobody complains
we can remove support for RTL_GIGA_MAC_VER_60 later.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/2cd3df01-5f8b-08dd-6def-3f31a3014bde@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej Żenczykowski [Tue, 23 Nov 2021 22:32:08 +0000 (14:32 -0800)]
net-ipv6: changes to ->tclass (via IPV6_TCLASS) should sk_dst_reset()
This is to match ipv4 behaviour, see __ip_sock_set_tos()
implementation.
Technically for ipv6 this might not be required because normally we
do not allow tclass to influence routing, yet the cli tooling does
support it:
lpk11:~# ip -6 rule add pref 5 tos 45 lookup 5
lpk11:~# ip -6 rule
5: from all tos 0x45 lookup 5
and in general dscp/tclass based routing does make sense.
We already have cases where dscp can affect vlan priority and/or
transmit queue (especially on wifi).
So let's just make things match. Easier to reason about and no harm.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20211123223208.1117871-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej Żenczykowski [Tue, 23 Nov 2021 22:31:54 +0000 (14:31 -0800)]
net-ipv6: do not allow IPV6_TCLASS to muck with tcp's ECN
This is to match ipv4 behaviour, see __ip_sock_set_tos()
implementation at ipv4/ip_sockglue.c:579
void __ip_sock_set_tos(struct sock *sk, int val)
{
if (sk->sk_type == SOCK_STREAM) {
val &= ~INET_ECN_MASK;
val |= inet_sk(sk)->tos & INET_ECN_MASK;
}
if (inet_sk(sk)->tos != val) {
inet_sk(sk)->tos = val;
sk->sk_priority = rt_tos2priority(val);
sk_dst_reset(sk);
}
}
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211123223154.1117794-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej Żenczykowski [Tue, 23 Nov 2021 20:37:15 +0000 (12:37 -0800)]
net: allow SO_MARK with CAP_NET_RAW
A CAP_NET_RAW capable process can already spoof (on transmit) anything
it desires via raw packet sockets... There is no good reason to not
allow it to also be able to play routing tricks on packets from its
own normal sockets.
There is a desire to be able to use SO_MARK for routing table selection
(via ip rule fwmark) from within a user process without having to run
it as root. Granting it CAP_NET_RAW is much less dangerous than
CAP_NET_ADMIN (CAP_NET_RAW doesn't permit persistent state change,
while CAP_NET_ADMIN does - by for example allowing the reconfiguration
of the routing tables and/or bringing up/down devices).
Let's keep CAP_NET_ADMIN for persistent state changes,
while using CAP_NET_RAW for non-configuration related stuff.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20211123203715.193413-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej Żenczykowski [Tue, 23 Nov 2021 20:37:02 +0000 (12:37 -0800)]
net: allow CAP_NET_RAW to setsockopt SO_PRIORITY
CAP_NET_ADMIN is and should continue to be about configuring the
system as a whole, not about configuring per-socket or per-packet
parameters.
Sending and receiving raw packets is what CAP_NET_RAW is all about.
It can already send packets with any VLAN tag, and any IPv4 TOS
mark, and any IPv6 TCLASS mark, simply by virtue of building
such a raw packet. Not to mention using any protocol and source/
/destination ip address/port tuple.
These are the fields that networking gear uses to prioritize packets.
Hence, a CAP_NET_RAW process is already capable of affecting traffic
prioritization after it hits the wire. This change makes it capable
of affecting traffic prioritization even in the host at the nic and
before that in the queueing disciplines (provided skb->priority is
actually being used for prioritization, and not the TOS/TCLASS field)
Hence it makes sense to allow a CAP_NET_RAW process to set the
priority of sockets and thus packets it sends.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20211123203702.193221-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ansuel Smith [Tue, 23 Nov 2021 15:44:46 +0000 (16:44 +0100)]
net: dsa: qca8k: fix warning in LAG feature
Fix warning reported by bot.
Make sure hash is init to 0 and fix wrong logic for hash_type in
qca8k_lag_can_offload.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: def975307c01 ("net: dsa: qca8k: add LAG support")
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211123154446.31019-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rahul Lakkireddy [Tue, 23 Nov 2021 15:47:17 +0000 (21:17 +0530)]
cxgb4: allow reading unrecognized port module eeprom
Even if firmware fails to recognize the plugged-in port module type,
allow reading port module EEPROM anyway. This helps in obtaining
necessary diagnostics information for debugging and analysis.
Signed-off-by: Manoj Malviya <manojmalviya@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Link: https://lore.kernel.org/r/1637682437-31407-1-git-send-email-rahul.lakkireddy@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Wed, 24 Nov 2021 10:11:22 +0000 (12:11 +0200)]
net: bridge: Allow base 16 inputs in sysfs
Cited commit converted simple_strtoul() to kstrtoul() as suggested by
the former's documentation. However, it also forced all the inputs to be
decimal resulting in user space breakage.
Fix by setting the base to '0' so that the base is automatically
detected.
Before:
# ip link add name br0 type bridge vlan_filtering 1
# echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol
bash: echo: write error: Invalid argument
After:
# ip link add name br0 type bridge vlan_filtering 1
# echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol
# echo $?
0
Fixes: 520fbdf7fb19 ("net/bridge: replace simple_strtoul to kstrtol")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20211124101122.3321496-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 25 Nov 2021 01:21:46 +0000 (17:21 -0800)]
Merge branch 'gro-remove-redundant-rcu_read_lock'
Eric Dumazet says:
====================
gro: remove redundant rcu_read_lock
Recent trees got an increase of rcu_read_{lock|unlock} costs,
it is time to get rid of the not needed pairs.
====================
Link: https://lore.kernel.org/r/20211123225608.2155163-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 23 Nov 2021 22:56:08 +0000 (14:56 -0800)]
gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers
All gro_complete() handlers are called from napi_gro_complete()
while rcu_read_lock() has been called.
There is no point stacking more rcu_read_lock()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 23 Nov 2021 22:56:07 +0000 (14:56 -0800)]
gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers
All gro_receive() handlers are called from dev_gro_receive()
while rcu_read_lock() has been called.
There is no point stacking more rcu_read_lock()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gerhard Engleder [Wed, 24 Nov 2021 20:52:25 +0000 (21:52 +0100)]
tsnep: Fix resource_size cocci warning
The following warning is fixed, by removing the unused resource size:
drivers/net/ethernet/engleder/tsnep_main.c:1155:21-24:
WARNING: Suspicious code. resource_size is maybe missing with io
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/20211124205225.13985-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yang Li [Wed, 24 Nov 2021 02:36:24 +0000 (10:36 +0800)]
tsnep: fix platform_no_drv_owner.cocci warning
Remove .owner field if calls are used which set it automatically
Eliminate the following coccicheck warning:
./drivers/net/ethernet/engleder/tsnep_main.c:1263:3-8: No need to set
.owner here. The core will do it.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/1637721384-70836-2-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Wed, 24 Nov 2021 14:12:26 +0000 (14:12 +0000)]
Merge branch 'hns3-next'
Guangbin Huang says:
====================
net: hns3: updates for -next
This series includes some updates for the HNS3 ethernet driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Wed, 24 Nov 2021 01:06:54 +0000 (09:06 +0800)]
net: hns3: add dql info when tx timeout
When tx timeout occurs, the info of dql maybe helpful, so print
these info to hns3_get_tx_timeo_queue_info().
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Wed, 24 Nov 2021 01:06:53 +0000 (09:06 +0800)]
net: hns3: debugfs add drop packet statistics of multicast and broadcast for igu
Currently, there is no way to get drop packet number of multicast and
broadcast in IGU hardware module, it is not convenient to find problem
when multicast packet or broadcast packet is dropped in IGU, so this
patch adds statistics for them in debugfs.
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Wed, 24 Nov 2021 01:06:52 +0000 (09:06 +0800)]
net: hns3: format the output of the MAC address
Printing the whole MAC addresse may bring security risks. Therefore,
the MAC address is partially encrypted to improve security.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Wed, 24 Nov 2021 01:06:51 +0000 (09:06 +0800)]
net: hns3: add log for workqueue scheduled late
When the mbx or reset message arrives, the driver is informed
through an interrupt. This task can be processed only after
the workqueue is scheduled. In some cases, this workqueue
scheduling takes a long time. As a result, the mbx or reset
service task cannot be processed in time. So add some warning
message to improve debugging efficiency for this case.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiapeng Chong [Wed, 24 Nov 2021 10:09:56 +0000 (18:09 +0800)]
lan78xx: Clean up some inconsistent indenting
Eliminate the follow smatch warning:
drivers/net/usb/lan78xx.c:4961 lan78xx_resume() warn: inconsistent
indenting.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Nov 2021 04:16:22 +0000 (20:16 -0800)]
Merge branch 'dccp-tcp-minor-fixes-for-inet_csk_listen_start'
Kuniyuki Iwashima says:
====================
dccp/tcp: Minor fixes for inet_csk_listen_start().
The first patch removes an unused argument, and the second removes a stale
comment.
====================
Link: https://lore.kernel.org/r/20211122101622.50572-1-kuniyu@amazon.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Mon, 22 Nov 2021 10:16:22 +0000 (19:16 +0900)]
dccp: Inline dccp_listen_start().
This patch inlines dccp_listen_start() and removes a stale comment in
inet_dccp_listen() so that it looks like inet_listen().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Reviewed-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Mon, 22 Nov 2021 10:16:21 +0000 (19:16 +0900)]
dccp/tcp: Remove an unused argument in inet_csk_listen_start().
The commit
1295e2cf3065 ("inet: minor optimization for backlog setting in
listen(2)") added change so that sk_max_ack_backlog is initialised earlier
in inet_dccp_listen() and inet_listen(). Since then, we no longer use
backlog in inet_csk_listen_start(), so let's remove it.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kurt Kanzenbach [Mon, 22 Nov 2021 11:19:31 +0000 (12:19 +0100)]
net: stmmac: Calculate CDC error only once
The clock domain crossing error (CDC) is calculated at every fetch of Tx or Rx
timestamps. It includes a division. Especially on arm32 based systems it is
expensive. It also requires two conditionals in the hotpath.
Add a compensation value cache to struct plat_stmmacenet_data and subtract it
unconditionally in the RX/TX functions which spares the conditionals.
The value is initialized to 0 and if supported calculated in the PTP
initialization code.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20211122111931.135135-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 23 Nov 2021 01:24:47 +0000 (17:24 -0800)]
net: remove .ndo_change_proto_down
.ndo_change_proto_down was added seemingly to enable out-of-tree
implementations. Over 2.5yrs later we still have no real users
upstream. Hardwire the generic implementation for now, we can
revert once real users materialize. (rocker is a test vehicle,
not a user.)
We need to drop the optimization on the sysfs side, because
unlike ndos priv_flags will be changed at runtime, so we'd
need READ_ONCE/WRITE_ONCE everywhere..
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Nov 2021 12:17:24 +0000 (12:17 +0000)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-11-22
Shiraz Saleem says:
Currently E800 devices come up as RoCEv2 devices by default.
This series add supports for users to configure iWARP or RoCEv2 functionality
per PCI function. devlink parameters is used to realize this and is keyed
off similar work in [1].
[1] https://lore.kernel.org/linux-rdma/
20210810132424.9129-1-parav@nvidia.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Nov 2021 12:14:49 +0000 (12:14 +0000)]
Merge branch 'mvpp2-5gbase-r-support'
Marek Behún says:
====================
Add 5gbase-r support for mvpp2
this adds support for 5gbase-r for mvpp2 driver. Current versions of
TF-A firmware support changing the PHY to 5gbase-r via SMC calls, at
least on Macchiatobin.
Tested on Macchiatobin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Mon, 22 Nov 2021 20:51:11 +0000 (21:51 +0100)]
net: marvell: mvpp2: Add support for 5gbase-r
Add support for PHY_INTERFACE_MODE_5GBASER.
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Mon, 22 Nov 2021 20:51:10 +0000 (21:51 +0100)]
phy: marvell: phy-mvebu-cp110-comphy: add support for 5gbase-r
Add support for PHY_INTERFACE_MODE_5GBASER mode within the Marvell CP110
common PHY driver.
This is currently only supported via SMC calls to TF-A. Legacy support
may be added later, if needed.
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerhard Engleder [Mon, 22 Nov 2021 20:32:25 +0000 (21:32 +0100)]
tsnep: Fix set MAC address
Commit
4dfb9982644b ("tsn: Fix build.") fixed compilation with const
dev_addr. In tsnep_netdev_set_mac_address() the call of ether_addr_copy()
was replaced with dev_set_mac_address(), which calls
ndo_set_mac_address(). This results in an endless recursive loop because
ndo_set_mac_address is set to tsnep_netdev_set_mac_address.
Call eth_hw_addr_set() instead of dev_set_mac_address() in
ndo_set_mac_address()/tsnep_netdev_set_mac_address() to copy the address
as intended.
[ 26.563303] Insufficient stack space to handle exception!
[ 26.563312] ESR: 0x96000047 -- DABT (current EL)
[ 26.563317] FAR: 0xffff80000a507fc0
[ 26.563320] Task stack: [0xffff80000a508000..0xffff80000a50c000]
[ 26.563324] IRQ stack: [0xffff80000a0c0000..0xffff80000a0c4000]
[ 26.563327] Overflow stack: [0xffff00007fbaf2b0..0xffff00007fbb02b0]
[ 26.563333] CPU: 3 PID: 381 Comm: ifconfig Not tainted 5.16.0-rc1-zynqmp #60
[ 26.563340] Hardware name: TSN endpoint (DT)
[ 26.563343] pstate:
a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 26.563351] pc : inetdev_event+0x4/0x560
[ 26.563364] lr : raw_notifier_call_chain+0x54/0x78
[ 26.563372] sp :
ffff80000a508040
[ 26.563374] x29:
ffff80000a508040 x28:
ffff00000132b800 x27:
0000000000000000
[ 26.563386] x26:
0000000000000000 x25:
ffff800000ea5058 x24:
0904030201020001
[ 26.563396] x23:
ffff800000ea5058 x22:
ffff80000a5080e0 x21:
0000000000000009
[ 26.563405] x20:
00000000fffffffa x19:
ffff80000a009510 x18:
0000000000000000
[ 26.563414] x17:
0000000000000000 x16:
0000000000000000 x15:
0000ffffd1341030
[ 26.563422] x14:
ffffffffffffffff x13:
0000000000000020 x12:
0101010101010101
[ 26.563432] x11:
0000000000000020 x10:
0101010101010101 x9 :
7f7f7f7f7f7f7f7f
[ 26.563441] x8 :
7f7f7f7f7f7f7f7f x7 :
fefefeff30677364 x6 :
0000000080808080
[ 26.563450] x5 :
0000000000000000 x4 :
ffff800008dee170 x3 :
ffff80000a50bd42
[ 26.563459] x2 :
ffff80000a5080e0 x1 :
0000000000000009 x0 :
ffff80000a0092d0
[ 26.563470] Kernel panic - not syncing: kernel stack overflow
[ 26.563474] CPU: 3 PID: 381 Comm: ifconfig Not tainted 5.16.0-rc1-zynqmp #60
[ 26.563481] Hardware name: TSN endpoint (DT)
[ 26.563484] Call trace:
[ 26.563486] dump_backtrace+0x0/0x1b0
[ 26.563497] show_stack+0x18/0x68
[ 26.563504] dump_stack_lvl+0x68/0x84
[ 26.563513] dump_stack+0x18/0x34
[ 26.563519] panic+0x164/0x324
[ 26.563524] nmi_panic+0x64/0x98
[ 26.563533] panic_bad_stack+0x108/0x128
[ 2k6.563539] handle_bad_stack+0x38/0x68
[ 26.563548] __bad_stack+0x88/0x8c
[ 26.563553] inetdev_event+0x4/0x560
[ 26.563560] call_netdevice_notifiers_info+0x58/0xa8
[ 26.563569] dev_set_mac_address+0x78/0x110
[ 26.563576] tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
[ 26.563591] dev_set_mac_address+0xc4/0x110
[ 26.563599] tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
...
[ 26.565444] dev_set_mac_address+0xc4/0x110
[ 26.565452] tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
[ 26.565462] dev_set_mac_address+0xc4/0x110
[ 26.565469] dev_set_mac_address_user+0x44/0x68
[ 26.565477] dev_ifsioc+0x30c/0x568
[ 26.565483] dev_ioctl+0x124/0x3f0
[ 26.565489] sock_do_ioctl+0xb4/0xf8
[ 26.565497] sock_ioctl+0x2f4/0x398
[ 26.565503] __arm64_sys_ioctl+0xa8/0xe8
[ 26.565511] invoke_syscall+0x44/0x108
[ 26.565520] el0_svc_common.constprop.3+0x94/0xf8
[ 26.565527] do_el0_svc+0x24/0x88
[ 26.565534] el0_svc+0x20/0x50
[ 26.565541] el0t_64_sync_handler+0x90/0xb8
[ 26.565548] el0t_64_sync+0x180/0x184
[ 26.565556] SMP: stopping secondary CPUs
[ 26.565622] Kernel Offset: disabled
[ 26.565624] CPU features: 0x0,
00004002,
00000846
[ 26.565628] Memory Limit: none
[ 27.843428] ---[ end Kernel panic - not syncing: kernel stack overflow ]---
Fixes: 4dfb9982644b ("tsn: Fix build.")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Nov 2021 11:53:16 +0000 (11:53 +0000)]
Merge branch 'qca8k-mirror-and-lag-support'
Ansuel Smith says:
====================
Add mirror and LAG support to qca8k
With the continue of adding 'Multiple feature to qca8k'
The switch supports mirror mode and LAG.
In mirror mode a port is set as mirror and other port are configured
to both igress or egress mode. With no port configured for mirror,
the mirror port is disabled and reverted to normal port.
The switch supports max 4 LAG with 4 different member max.
Current supported mode is Hash mode in both L2 or L2+3 mode.
There is a problematic implementation for the hash mode where
with multiple LAG configured, someone has to remove them to
change the hash mode as it's global.
When a member of the LAG is disconnected, the traffic is redirected
to the other port.
Some warning are present from checkpatch but can't really be fixed
as it would result in making the regs less readable.
(They really did their best with the LAG reg logic and complexity)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Tue, 23 Nov 2021 02:59:11 +0000 (03:59 +0100)]
net: dsa: qca8k: add LAG support
Add LAG support to this switch. In Documentation this is described as
trunk mode. A max of 4 LAGs are supported and each can support up to 4
port. The current tx mode supported is Hash mode with both L2 and L2+3
mode.
When no port are present in the trunk, the trunk is disabled in the
switch.
When a port is disconnected, the traffic is redirected to the other
available port.
The hash mode is global and each LAG require to have the same hash mode
set. To change the hash mode when multiple LAG are configured, it's
required to remove each LAG and set the desired hash mode to the last.
An error is printed when it's asked to set a not supported hadh mode.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Tue, 23 Nov 2021 02:59:10 +0000 (03:59 +0100)]
net: dsa: qca8k: add support for mirror mode
The switch supports mirror mode. Only one port can set as mirror port and
every other port can set to both ingress and egress mode. The mirror
port is disabled and reverted to normal operation once every port is
removed from sending packet to it.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yajun Deng [Tue, 23 Nov 2021 02:54:30 +0000 (10:54 +0800)]
neigh: introduce neigh_confirm() helper function
Add neigh_confirm() for the confirmed member in struct neighbour,
it can be called as an independent unit by other functions.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr [Tue, 23 Nov 2021 07:50:46 +0000 (15:50 +0800)]
mctp: Add MCTP-over-serial transport binding
This change adds a MCTP Serial transport binding, as defined by DMTF
specificiation DSP0253 - "MCTP Serial Transport Binding". This is
implemented as a new serial line discipline, and can be attached to
arbitrary tty devices.
From the Kconfig description:
This driver provides an MCTP-over-serial interface, through a
serial line-discipline, as defined by DMTF specification "DSP0253 -
MCTP Serial Transport Binding". By attaching the ldisc to a serial
device, we get a new net device to transport MCTP packets.
This allows communication with external MCTP endpoints which use
serial as their transport. It can also be used as an easy way to
provide MCTP connectivity between virtual machines, by forwarding
data between simple virtual serial devices.
Say y here if you need to connect to MCTP endpoints over serial. To
compile as a module, use m; the module will be called mctp-serial.
Once the N_MCTP line discipline is set [using ioctl(TCIOSETD)], we get a
new netdev suitable for MCTP communication.
The 'mctp' utility[1] provides a simple wrapper for this ioctl, using
'link serial <device>':
# mctp link serial /dev/ttyS0 &
# mctp link
dev lo index 1 address 0x00:00:00:00:00:00 net 1 mtu 65536 up
dev mctpserial0 index 5 address 0x(no-addr) net 1 mtu 68 down
[1]: https://github.com/CodeConstruct/mctp
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Nov 2021 11:44:31 +0000 (11:44 +0000)]
Merge branch 'mlxsw-updates'
Ido Schimmel says:
====================
mlxsw: Various updates
Patch #1 removes deadcode reported by Coverity.
Patch #2 adds a shutdown method in the PCI driver to ensure the kexeced
kernel starts working with a device that is in a sane state.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Danielle Ratson [Tue, 23 Nov 2021 07:54:47 +0000 (09:54 +0200)]
mlxsw: pci: Add shutdown method in PCI driver
On an arm64 platform with the Spectrum ASIC, after loading and executing
a new kernel via kexec, the following trace [1] is observed. This seems
to be caused by the fact that the device is not properly shutdown before
executing the new kernel.
Fix this by implementing a shutdown method which mirrors the remove
method, as recommended by the kexec maintainer [2][3].
[1]
BUG: Bad page state in process devlink pfn:22f73d
page:
fffffe00089dcf40 refcount:-1 mapcount:0 mapping:
0000000000000000 index:0x0
flags: 0x2ffff00000000000()
raw:
2ffff00000000000 0000000000000000 ffffffff089d0201 0000000000000000
raw:
0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
page dumped because: nonzero _refcount
Modules linked in:
CPU: 1 PID: 16346 Comm: devlink Tainted: G B
5.8.0-rc6-custom-273020-gac6b365b1bf5 #44
Hardware name: Marvell Armada 7040 TX4810M (DT)
Call trace:
dump_backtrace+0x0/0x1d0
show_stack+0x1c/0x28
dump_stack+0xbc/0x118
bad_page+0xcc/0xf8
check_free_page_bad+0x80/0x88
__free_pages_ok+0x3f8/0x418
__free_pages+0x38/0x60
kmem_freepages+0x200/0x2a8
slab_destroy+0x28/0x68
slabs_destroy+0x60/0x90
___cache_free+0x1b4/0x358
kfree+0xc0/0x1d0
skb_free_head+0x2c/0x38
skb_release_data+0x110/0x1a0
skb_release_all+0x2c/0x38
consume_skb+0x38/0x130
__dev_kfree_skb_any+0x44/0x50
mlxsw_pci_rdq_fini+0x8c/0xb0
mlxsw_pci_queue_fini.isra.0+0x28/0x58
mlxsw_pci_queue_group_fini+0x58/0x88
mlxsw_pci_aqs_fini+0x2c/0x60
mlxsw_pci_fini+0x34/0x50
mlxsw_core_bus_device_unregister+0x104/0x1d0
mlxsw_devlink_core_bus_device_reload_down+0x2c/0x48
devlink_reload+0x44/0x158
devlink_nl_cmd_reload+0x270/0x290
genl_rcv_msg+0x188/0x2f0
netlink_rcv_skb+0x5c/0x118
genl_rcv+0x3c/0x50
netlink_unicast+0x1bc/0x278
netlink_sendmsg+0x194/0x390
__sys_sendto+0xe0/0x158
__arm64_sys_sendto+0x2c/0x38
el0_svc_common.constprop.0+0x70/0x168
do_el0_svc+0x28/0x88
el0_sync_handler+0x88/0x190
el0_sync+0x140/0x180
[2]
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1195432.html
[3]
https://patchwork.kernel.org/project/linux-scsi/patch/
20170212214920.28866-1-anton@ozlabs.org/#
20116693
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Danielle Ratson [Tue, 23 Nov 2021 07:54:46 +0000 (09:54 +0200)]
mlxsw: spectrum_router: Remove deadcode in mlxsw_sp_rif_mac_profile_find
The function idr_for_each_entry() already checks that the next entry in
the IDR is not NULL.
Therefore, checking that again in every iteration leads to deadcode.
Remove the unnecessary check in order to avoid that.
Addresses-Coverity: ("Logically dead code")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shiraz Saleem [Mon, 18 Oct 2021 23:16:03 +0000 (18:16 -0500)]
RDMA/irdma: Set protocol based on PF rdma_mode flag
Set the RDMA protocol to use at driver bind time based on the ice PF's
rdma_mode flag.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Shiraz Saleem [Mon, 18 Oct 2021 23:16:02 +0000 (18:16 -0500)]
net/ice: Add support for enable_iwarp and enable_roce devlink param
Allow support for 'enable_iwarp' and 'enable_roce' devlink params to turn
on/off iWARP or RoCE protocol support for E800 devices.
For example, a user can turn on iWARP functionality with,
devlink dev param set pci/0000:07:00.0 name enable_iwarp value true cmode runtime
This add an iWARP auxiliary rdma device, ice.iwarp.<>, under this PF.
A user request to enable both iWARP and RoCE under the same PF is rejected
since this device does not support both protocols simultaneously on the
same port.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Shiraz Saleem [Mon, 18 Oct 2021 23:16:01 +0000 (18:16 -0500)]
devlink: Add 'enable_iwarp' generic device param
Add a new device generic parameter to enable and disable
iWARP functionality on a multi-protocol RDMA device.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
David S. Miller [Mon, 22 Nov 2021 15:35:17 +0000 (15:35 +0000)]
Merge branch 'qca8k-next'
Ansuel Smith says:
====================
Multiple cleanup and feature for qca8k
This is a reduced version of the old massive series.
Refer to the changelog to know what is removed from this.
We clean and convert the driver to GENMASK FIELD_PREP to clean multiple
use of various naming scheme. (example we have a mix of _MASK, _S _M,
and various name) The idea is to ""simplify"" and remove some shift and
data handling by using FIELD API.
The patch contains various checkpatch warning and are ignored to not
create more mess in the header file. (fixing the too long line warning
would results in regs definition less readable)
We conver the driver to regmap API as ipq40xx SoC is based on the same
reg structure and we need to generilize the read/write access to split
the driver to commond and specific code.
The conversion to regmap is NOT done for the read/write/rmw operation,
the function are reworked to use the regmap helper instead.
This is done to keep the patch delta low but will come sooner or later
when the code will be split.
Any new feature will directly use the regmap helper and the reg
set/clear and the busy wait function are migrated to regmap helper as
the use of these function is low.
We also add a minor patch for MIB counter as qca8337 is missing 2 extra
counter, support for mdb and ageing settings.
v3:
- Try to reduce regmap conversion patch
v2:
- Move regmap init to sw_probe instead of moving switch id check.
- Removed LAGs, mirror extra features will come later in another
smaller series.
- Squash 2 ageing patch
- Add more description about the regmap patch
- Rework mdb patch to do operation under the same lock
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:48 +0000 (16:23 +0100)]
net: dsa: qca8k: add support for mdb_add/del
Add support for mdb add/del function. The ARL table is used to insert
the rule. The rule will be searched, deleted and reinserted with the
port mask updated. The function will check if the rule has to be updated
or insert directly with no deletion of the old rule.
If every port is removed from the port mask, the rule is removed.
The rule is set STATIC in the ARL table (aka it doesn't age) to not be
flushed by fast age function.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:47 +0000 (16:23 +0100)]
net: dsa: qca8k: add set_ageing_time support
qca8k support setting ageing time in step of 7s. Add support for it and
set the max value accepted of 7645m.
Documentation talks about support for 10000m but that values doesn't
make sense as the value doesn't match the max value in the reg.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:46 +0000 (16:23 +0100)]
net: dsa: qca8k: add support for port fast aging
The switch supports fast aging by flushing any rule in the ARL
table for a specific port.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:45 +0000 (16:23 +0100)]
net: dsa: qca8k: add additional MIB counter and make it dynamic
We are currently missing 2 additionals MIB counter present in QCA833x
switch.
QC832x switch have 39 MIB counter and QCA833X have 41 MIB counter.
Add the additional MIB counter and rework the MIB function to print the
correct supported counter from the match_data struct.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:44 +0000 (16:23 +0100)]
net: dsa: qca8k: initial conversion to regmap helper
Convert any qca8k set/clear/pool to regmap helper and add
missing config to regmap_config struct.
Read/write/rmw operation are reworked to use the regmap helper
internally to keep the delta of this patch low. These additional
function will then be dropped when the code split will be proposed.
Ipq40xx SoC have the internal switch based on the qca8k regmap but use
mmio for read/write/rmw operation instead of mdio.
In preparation for the support of this internal switch, convert the
driver to regmap API to later split the driver to common and specific
code. The overhead introduced by the use of regamp API is marginal as the
internal mdio will bypass it by using its direct access and regmap will be
used only by configuration functions or fdb access.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:43 +0000 (16:23 +0100)]
net: dsa: qca8k: move regmap init in probe and set it mandatory
In preparation for regmap conversion, move regmap init in the probe
function and make it mandatory as any read/write/rmw operation will be
converted to regmap API.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:42 +0000 (16:23 +0100)]
net: dsa: qca8k: remove extra mutex_init in qca8k_setup
Mutex is already init in sw_probe. Remove the extra init in qca8k_setup.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:41 +0000 (16:23 +0100)]
net: dsa: qca8k: convert to GENMASK/FIELD_PREP/FIELD_GET
Convert and try to standardize bit fields using
GENMASK/FIELD_PREP/FIELD_GET macros. Rework some logic to support the
standard macro and tidy things up. No functional change intended.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Mon, 22 Nov 2021 15:23:40 +0000 (16:23 +0100)]
net: dsa: qca8k: remove redundant check in parse_port_config
The very next check for port 0 and 6 already makes sure we don't go out
of bounds with the ports_config delay table.
Remove the redundant check.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 22 Nov 2021 15:13:54 +0000 (15:13 +0000)]
Merge branch 'skbuff-struct-group'
Kees Cook says:
====================
skbuff: Switch structure bounds to struct_group()
This is a pair of patches to add struct_group() to struct sk_buff. The
first is needed to work around sparse-specific complaints, and is new
for v2. The second patch is the same as originally sent as v1.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Sun, 21 Nov 2021 00:31:49 +0000 (16:31 -0800)]
skbuff: Switch structure bounds to struct_group()
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
Replace the existing empty member position markers "headers_start" and
"headers_end" with a struct_group(). This will allow memcpy() and sizeof()
to more easily reason about sizes, and improve readability.
"pahole" shows no size nor member offset changes to struct sk_buff.
"objdump -d" shows no object code changes (outside of WARNs affected by
source line number changes).
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> # drivers/net/wireguard/*
Link: https://lore.kernel.org/lkml/20210728035006.GD35706@embeddedor
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Sun, 21 Nov 2021 00:31:48 +0000 (16:31 -0800)]
skbuff: Move conditional preprocessor directives out of struct sk_buff
In preparation for using the struct_group() macro in struct sk_buff,
move the conditional preprocessor directives out of the region of struct
sk_buff that will be enclosed by struct_group(). While GCC and Clang are
happy with conditional preprocessor directives here, sparse is not, even
under -Wno-directive-within-macro[1], as would be seen under a C=1 build:
net/core/filter.c: note: in included file (through include/linux/netlink.h, include/linux/sock_diag.h):
./include/linux/skbuff.h:820:1: warning: directive in macro's argument list
./include/linux/skbuff.h:822:1: warning: directive in macro's argument list
./include/linux/skbuff.h:846:1: warning: directive in macro's argument list
./include/linux/skbuff.h:848:1: warning: directive in macro's argument list
Additionally remove empty macro argument definitions and usage.
"objdump -d" shows no object code differences.
[1] https://www.spinics.net/lists/linux-sparse/msg10857.html
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 22 Nov 2021 14:24:56 +0000 (15:24 +0100)]
sections: global data can be in .bss
When checking an address is located in a global data section also check
for the .bss section as global variables initialized to 0 can be in
there (-fzero-initialized-in-bss).
This was found when looking at ensure_safe_net_sysctl which was failing
to detect non-init sysctl pointing to a global data section when the
data was in the .bss section.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yajun Deng [Mon, 22 Nov 2021 07:02:36 +0000 (15:02 +0800)]
arp: Remove #ifdef CONFIG_PROC_FS
proc_create_net() and remove_proc_entry() already contain the case
whether to define CONFIG_PROC_FS, so remove #ifdef CONFIG_PROC_FS.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sun, 21 Nov 2021 21:56:39 +0000 (22:56 +0100)]
hv_netvsc: Use bitmap_zalloc() when applicable
'send_section_map' is a bitmap. So use 'bitmap_zalloc()' to simplify code,
improve the semantic and avoid some open-coded arithmetic in allocator
arguments.
Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.
While at it, change an '== NULL' test into a '!'.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sun, 21 Nov 2021 19:12:54 +0000 (20:12 +0100)]
qed: Use the bitmap API to simplify some functions
'cid_map' is a bitmap. So use 'bitmap_zalloc()' to simplify code,
improve the semantic and avoid some open-coded arithmetic in allocator
arguments.
Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.
Also change some 'memset()' into 'bitmap_zero()' to keep consistency. This
is also much less verbose.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sun, 21 Nov 2021 18:01:03 +0000 (19:01 +0100)]
net-sysfs: Slightly optimize 'xps_queue_show()'
The 'mask' bitmap is local to this function. So the non-atomic
'__set_bit()' can be used to save a few cycles.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sun, 21 Nov 2021 15:32:04 +0000 (16:32 +0100)]
rds: Fix a typo in a comment
s/cold/could/
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-By: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yacov Simhony [Sun, 21 Nov 2021 15:02:53 +0000 (17:02 +0200)]
Fix coverity issue 'Uninitialized scalar variable"
There are three boolean variable which were not initialized and later
being used in the code.
Signed-off-by: Yacov Simhony <ysimhony@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 20 Nov 2021 17:15:10 +0000 (09:15 -0800)]
pcmcia: hide the MAC address helpers if !NET
pcmcia_get_mac_from_cis is only called from networking and
recent changes made it call dev_addr_mod() which is itself
only defined if NET.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: adeef3e32146 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 22 Nov 2021 13:56:22 +0000 (13:56 +0000)]
tsn: Fix build.
Due to const dev_addr changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
M Chetan Kumar [Sat, 20 Nov 2021 16:21:55 +0000 (21:51 +0530)]
net: wwan: iosm: device trace collection using relayfs
This patch brings in support for device trace collection.
It implements relayfs interface for pushing device trace
from kernel space to user space.
Driver gets the debugfs base directory associated to WWAN
Device and creates trace_control and trace debugfs for
device tracing. Both trace_control & trace debugfs are
created under /sys/kernel/debug/wwan/wwan0/.
In order to collect device trace on trace0 interface, user
need to write 1 to trace_ctl interface.
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
M Chetan Kumar [Sat, 20 Nov 2021 16:21:54 +0000 (21:51 +0530)]
net: wwan: common debugfs base dir for wwan device
This patch set brings in a common debugfs base directory
i.e. /sys/kernel/debugfs/wwan/ in WWAN Subsystem for a
WWAN device instance. So that it avoids driver polluting
debugfs root with unrelated directories & possible name
collusion.
Having a common debugfs base directory for WWAN drivers
eases user to match control devices with debugfs entries.
WWAN Subsystem creates dentry (/sys/kernel/debugfs/wwan)
on module load & removes dentry on module unload.
When driver registers a new wwan device, dentry (wwanX)
is created for WWAN device instance & on driver unregister
dentry is removed.
New API is introduced to return the wwan device instance
dentry so that driver can create debugfs entries under it.
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 20 Nov 2021 15:31:19 +0000 (07:31 -0800)]
octeon: constify netdev->dev_addr
Argument of a helper is missing a const.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: adeef3e32146 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Sat, 20 Nov 2021 00:29:53 +0000 (16:29 -0800)]
net: mana: Add XDP support
Add support of XDP for the MANA driver.
Supported XDP actions:
XDP_PASS, XDP_TX, XDP_DROP, XDP_ABORTED
XDP actions not yet supported:
XDP_REDIRECT
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 22 Nov 2021 13:19:04 +0000 (13:19 +0000)]
Merge branch 'tsn-endpoint-driver'
Gerhard Engleder says:
====================
TSN endpoint Ethernet MAC driver
This series adds a driver for my FPGA based TSN endpoint Ethernet MAC.
It also includes the device tree binding.
The device is designed as Ethernet MAC for TSN networks. It will be used
in PLCs with real-time requirements up to isochronous communication with
protocols like OPC UA Pub/Sub.
v3:
- set MAC mode based on PHY information (Andrew Lunn)
- remove/postpone loopback mode interface (Andrew Lunn)
- add suppress_preamble node support (Andrew Lunn)
- add mdio timeout (Andrew Lunn)
- no need to call phy_start_aneg (Andrew Lunn)
- remove unreachable code (Andrew Lunn)
- move 'struct napi_struct' closer to queues (Vinicius Costa Gomes)
- remove unused variable (kernel test robot)
- switch from mdio interrupt to polling
- mdio register without PHY address flag
- thread safe interrupt enable register
- add PTP_1588_CLOCK_OPTIONAL dependency to Kconfig
- introduce dmadev for DMA allocation
- mdiobus for platforms without device tree
- prepare MAC address support for platforms without device tree
- add missing interrupt disable to probe error path
v2:
- add C45 check (Andrew Lunn)
- forward phy_connect_direct() return value (Andrew Lunn)
- use phy_remove_link_mode() (Andrew Lunn)
- do not touch PHY directly, use PHY subsystem (Andrew Lunn)
- remove management data lock (Andrew Lunn)
- use phy_loopback (Andrew Lunn)
- remove GMII2RGMII handling, use xgmiitorgmii (Andrew Lunn)
- remove char device for direct TX/RX queue access (Andrew Lunn)
- mdio node for mdiobus (Rob Herring)
- simplify compatible node (Rob Herring)
- limit number of items of reg and interrupts nodes (Rob Herring)
- restrict phy-connection-type node (Rob Herring)
- reference to mdio.yaml under mdio node (Rob Herring)
- remove device tree (Michal Simek)
- fix %llx warning (kernel test robot)
- fix unused tmp variable warning (kernel test robot)
- add missing of_node_put() for of_parse_phandle()
- use devm_mdiobus_alloc()
- simplify mdiobus read/write
- reduce required nodes
- ethtool priv flags interface for loopback
- add missing static for some functions
- remove obsolete hardware defines
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerhard Engleder [Fri, 19 Nov 2021 22:58:26 +0000 (23:58 +0100)]
tsnep: Add TSN endpoint Ethernet MAC driver
The TSN endpoint Ethernet MAC is a FPGA based network device for
real-time communication.
It is integrated as Ethernet controller with ethtool and PTP support.
For real-time communcation TC_SETUP_QDISC_TAPRIO is supported.
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerhard Engleder [Fri, 19 Nov 2021 22:58:25 +0000 (23:58 +0100)]
dt-bindings: net: Add tsnep Ethernet controller
The TSN endpoint Ethernet MAC is a FPGA based network device for
real-time communication.
It is integrated as normal Ethernet controller with
ethernet-controller.yaml and mdio.yaml.
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerhard Engleder [Fri, 19 Nov 2021 22:58:24 +0000 (23:58 +0100)]
dt-bindings: Add vendor prefix for Engleder
Engleder develops FPGA based controllers for real-time communication.
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King (Oracle) [Fri, 19 Nov 2021 16:28:06 +0000 (16:28 +0000)]
net: phylink: handle NA interface mode in phylink_fwnode_phy_connect()
Commit
4904b6ea1f9db ("net: phy: phylink: Use PHY device interface if
N/A") introduced handling for the phy interface mode where this is not
known at phylink creation time. This was never added to the OF/fwnode
paths, but is necessary when the phy is present in DT, but the phy-mode
is not specified.
Add this handling.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Anderson [Fri, 19 Nov 2021 15:58:09 +0000 (10:58 -0500)]
net: phylink: Add helpers for c22 registers without MDIO
Some devices expose memory-mapped c22-compliant PHYs. Because these
devices do not have an MDIO bus, we cannot use the existing helpers.
Refactor the existing helpers to allow supplying the values for c22
registers directly, instead of using MDIO to access them. Only get_state
and set_advertisement are converted, since they contain the most complex
logic. Because set_advertisement is never actually used outside
phylink_mii_c22_pcs_config, move the MDIO-writing part into that
function. Because some modes do not need the advertisement register set
at all, we use -EINVAL for this purpose.
Additionally, a new function phylink_pcs_enable_an is provided to
determine whether to enable autonegotiation.
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 19 Nov 2021 15:43:32 +0000 (07:43 -0800)]
net: annotate accesses to dev->gso_max_segs
dev->gso_max_segs is written under RTNL protection, or when the device is
not yet visible, but is read locklessly.
Add netif_set_gso_max_segs() helper.
Add the READ_ONCE()/WRITE_ONCE() pairs, and use netif_set_gso_max_segs()
where we can to better document what is going on.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 19 Nov 2021 15:43:31 +0000 (07:43 -0800)]
net: annotate accesses to dev->gso_max_size
dev->gso_max_size is written under RTNL protection, or when the device is
not yet visible, but is read locklessly.
Add the READ_ONCE()/WRITE_ONCE() pairs, and use netif_set_gso_max_size()
where we can to better document what is going on.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 22 Nov 2021 12:31:50 +0000 (12:31 +0000)]
Merge branch 'ethtool-copybreak'
Guangbin Huang says:
====================
ethtool: add support to set/get tx copybreak buf size and rx buf len
This series add support to set/get tx copybreak buf size and rx buf len via
ethtool and hns3 driver implements them.
Tx copybreak buf size is used for tx copybreak feature which for small size
packet or frag. Use ethtool --get-tunable command to get it, and ethtool
--set-tunable command to set it, examples are as follow:
1. set tx spare buf size to 102400:
$ ethtool --set-tunable eth1 tx-buf-size 102400
2. get tx spare buf size:
$ ethtool --get-tunable eth1 tx-buf-size
tx-buf-size: 102400
Rx buf len is buffer length of each rx BD. Use ethtool -g command to get
it, and ethtool -G command to set it, examples are as follow:
1. set rx buf len to 4096
$ ethtool -G eth1 rx-buf-len 4096
2. get rx buf len
$ ethtool -g eth1
...
RX Buf Len: 4096
Change log:
V5 -> V6
1.Fix compile error for divers/s390.
V4 -> V5
1.Change struct ethtool_ringparam_ext to kernel_ethtool_ringparam.
2.change "__u32 rx_buf_len" to "u32 rx_buf_len".
V3 -> V4
1.Fix a few allmodconfig compile warning.
2.Add more '=' synbol to ethtool-netlink.rst to refine format.
3.Move definement of struct ethtool_ringparam_ext to include/linux/ethtool.h.
4.Move related modify of rings_fill_reply() from patch 4/6 to patch 3/6.
V2 -> V3
1.Remove documentation for tx copybreak buf size, there is description for
it in userspace ethtool.
2.Move extending parameters for get/set_ringparam function from patch3/6
to patch 4/6.
V1 -> V2
1.Add documentation for rx buf len and tx copybreak buf size.
2.Extend structure ringparam_ext for extenal ring params.
3.Change type of ETHTOOL_A_RINGS_RX_BUF_LEN from NLA_U32 to
NLA_POLICY_MIN(NLA_U32, 1).
4.Add supported_ring_params in ethtool_ops to indicate if support external
params.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Thu, 18 Nov 2021 12:12:45 +0000 (20:12 +0800)]
net: hns3: remove the way to set tx spare buf via module parameter
The way to set tx spare buf via module parameter is not such
convenient as the way to set it via ethtool.
So,remove the way to set tx spare buf via module parameter.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Thu, 18 Nov 2021 12:12:44 +0000 (20:12 +0800)]
net: hns3: add support to set/get rx buf len via ethtool for hns3 driver
Rx buf len is for rx BD buffer size, support setting it via ethtool -G
parameter and getting it via ethtool -g parameter.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Thu, 18 Nov 2021 12:12:43 +0000 (20:12 +0800)]
ethtool: extend ringparam setting/getting API with rx_buf_len
Add two new parameters kernel_ringparam and extack for
.get_ringparam and .set_ringparam to extend more ring params
through netlink.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Thu, 18 Nov 2021 12:12:42 +0000 (20:12 +0800)]
ethtool: add support to set/get rx buf len via ethtool
Add support to set rx buf len via ethtool -G parameter and get
rx buf len via ethtool -g parameter.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Thu, 18 Nov 2021 12:12:41 +0000 (20:12 +0800)]
net: hns3: add support to set/get tx copybreak buf size via ethtool for hns3 driver
Tx copybreak buf size is used for tx copybreak feature, the feature is
used for small size packet or frag. It adds a queue based tx shared
bounce buffer to memcpy the small packet when the len of xmitted skb is
below tx_copybreak(value to distinguish small size and normal size),
and reduce the overhead of dma map and unmap when IOMMU is on.
Support setting it via ethtool --set-tunable parameter and getting
it via ethtool --get-tunable parameter.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Thu, 18 Nov 2021 12:12:40 +0000 (20:12 +0800)]
ethtool: add support to set/get tx copybreak buf size via ethtool
Add support for ethtool to set/get tx copybreak buf size.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 20 Nov 2021 14:11:00 +0000 (14:11 +0000)]
Merge branch 'mptcp-more-socket-options'
Mat Martineau says:
====================
mptcp: More socket option support
These patches add MPTCP socket support for a few additional socket
options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT, IPV6_FREEBIND, and
IPV6_TRANSPARENT.
Patch 1 exposes __ip_sock_set_tos() for use in patch 2.
Patch 2 adds IP_TOS support.
Patches 3 and 4 add the freebind and transparent support, with a
selftest for the latter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 19 Nov 2021 20:41:37 +0000 (12:41 -0800)]
selftests: mptcp: add tproxy test case
No hard dependencies here, just skip if test environ lacks
nft binary or the needed kernel config options.
The test case spawns listener in ns2 but ns1 will connect
to the ip address of ns4.
policy routing + tproxy rule will redirect packets to ns2 instead
of forward.
v3: - update mptcp/config (Mat Martineau)
- more verbose SKIP messages in mptcp_connect.sh
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 19 Nov 2021 20:41:36 +0000 (12:41 -0800)]
mptcp: sockopt: add SOL_IP freebind & transparent options
These options also need to be set before bind, so do the sync of msk to
new ssk socket a bit earlier.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Poorva Sonparote [Fri, 19 Nov 2021 20:41:35 +0000 (12:41 -0800)]
mptcp: Support for IP_TOS for MPTCP setsockopt()
SOL_IP provides a way to configure network layer attributes in a
socket. This patch adds support for IP_TOS for setsockopt(.. ,SOL_IP, ..)
Support for SOL_IP is added in mptcp_setsockopt() and IP_TOS is handled in
a private function. The idea here is to take in the value passed for IP_TOS
and set it to the current subflow, open subflows as well new subflows that
might be created after the initial call to setsockopt(). This sync is done
using sync_socket_options(.., ssk) and setting the value of tos using
__ip_sock_set_tos(ssk,..).
The patch has been tested using the packetdrill script here -
https://github.com/multipath-tcp/mptcp_net-next/issues/220#issuecomment-
947863717
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/220
Signed-off-by: Poorva Sonparote <psonparo@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Poorva Sonparote [Fri, 19 Nov 2021 20:41:34 +0000 (12:41 -0800)]
ipv4: Exposing __ip_sock_set_tos() in ip.h
Making the static function __ip_sock_set_tos() from net/ipv4/ip_sockglue.c
accessible by declaring it in include/net/ip.h
The reason for doing this is to use this function to set IP_TOS value in
mptcp_setsockopt() without the lock.
Signed-off-by: Poorva Sonparote <psonparo@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 20 Nov 2021 12:26:09 +0000 (12:26 +0000)]
Merge branch 'dev_addr-const'
Jakub Kicinski says:
====================
net: constify netdev->dev_addr
Take care of a few stragglers and make netdev->dev_addr const.
netdev->dev_addr can be held on the address tree like any other
address now.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 19 Nov 2021 14:21:55 +0000 (06:21 -0800)]
net: kunit: add a test for dev_addr_lists
Add a KUnit test for the dev_addr API.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 19 Nov 2021 14:21:54 +0000 (06:21 -0800)]
dev_addr_list: put the first addr on the tree
Since all netdev->dev_addr modifications go via dev_addr_mod()
we can put it on the list. When address is change remove it
and add it back.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>