platform/kernel/linux-starfive.git
Antoine Tenart [Tue, 7 Dec 2021 14:57:24 +0000 (15:57 +0100)]
net-sysfs: update the queue counts in the unregistration path

When updating Rx and Tx queue kobjects, the queue count should always be
updated to match the queue kobjects count. This was not done in the net
device unregistration path; fix it. Tracking all queue count updates
will allow a follow-up patch to detect illegal updates.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 9 Dec 2021 01:59:03 +0000 (17:59 -0800)]
Merge branch 'wwan-debugfs-tweaks'

Sergey Ryazanov says:

====================
WWAN debugfs tweaks

This is a follow-up series to the just-applied IOSM (and WWAN) debugfs
interface support [1]. The series has two main goals:
1. move the driver-specific debugfs knobs to a subdirectory;
2. make the debugfs interface optional for both IOSM and for the WWAN
   core.

As for the first part, I must say that it was my mistake. I suggested
placing debugfs entries under a common per WWAN device directory, but I
missed the driver subdirectory in the example, so it became:

/sys/kernel/debugfs/wwan/wwan0/trace

Since trace collection is a driver-specific feature, it is better
to keep it under the driver-specific subdirectory:

/sys/kernel/debugfs/wwan/wwan0/iosm/trace

It is desirable to be able to entirely disable the debugfs interface.
Reasons for disabling it include security and consumed storage space.
See the detailed rationale with a usage example in the 4th patch.

The changes themselves are relatively simple, but require a code
rearrangement. So to make changes clear, I chose to split them into
preparatory and main changes and properly describe each of them.

The IOSM part is compile-tested only since I do not have an
IOSM-supported device, so it needs an Ack from the driver developers.

I would like to thank Johannes Berg and Leon Romanovsky. Their
suggestions and comments helped a lot to rework the initial
over-engineered solution into something less confusing and much
simpler. Thanks!

1. https://lore.kernel.org/netdev/20211120162155.1216081-1-m.chetan.kumar@linux.intel.com
2. https://patchwork.kernel.org/project/netdevbpf/patch/20211204174033.950528-1-arnd@kernel.org/
====================

Link: https://lore.kernel.org/r/20211207092140.19142-1-ryazanov.s.a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sergey Ryazanov [Tue, 7 Dec 2021 09:21:40 +0000 (12:21 +0300)]
net: wwan: make debugfs optional

The debugfs interface is optional for regular modem use. Some distros
and users will want to disable this feature for security or kernel
size reasons. So add a configuration option that allows completely
disabling the debugfs interface of the WWAN devices.

The primary use case considered for this option was embedded firmware.
For example, in OpenWrt, you cannot completely disable debugfs, as a
lot of wireless stuff can only be configured and monitored with the
debugfs knobs. At the same time, reducing the size of a kernel and
modules is an essential task in the world of embedded software.
Disabling the WWAN and IOSM debugfs interfaces allows us to save 50K
(x86-64 build) of space for module storage. Not much, but already
considerable when you only have 16MB of storage.

So it is hard to just disable debugfs as a whole. Users need a
fine-grained set of options to control which debugfs interfaces are
important and should be available and which are not.

The new configuration symbol is enabled by default and is hidden under
the EXPERT option, so a regular user will not be bothered by yet another
configuration question, while an embedded distro maintainer will be
able to reduce the final image size a little more.

Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
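
Note: the usual way to make such an interface optional is to keep the
real code behind a configuration symbol and supply an inline stub
otherwise, so callers build unchanged either way. The following is a
minimal userspace sketch of that pattern, not the code from this patch.
DEMO_WWAN_DEBUGFS, struct demo_dir and both functions are hypothetical
names, and returning NULL from the stub is an assumption made for the
sketch.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig symbol; uncomment to build the real code. */
/* #define DEMO_WWAN_DEBUGFS 1 */

struct demo_dir { const char *name; };

#ifdef DEMO_WWAN_DEBUGFS
static struct demo_dir demo_root = { "wwan0" };

/* Real implementation: hand back a directory the driver can populate. */
static struct demo_dir *demo_get_debugfs_dir(void)
{
	return &demo_root;
}
#else
/* Stub: callers compile unchanged, but no debugfs code is built in. */
static inline struct demo_dir *demo_get_debugfs_dir(void)
{
	return NULL;
}
#endif

/* Driver side: returns 1 if the trace knob was created, 0 if it was skipped. */
static int demo_driver_init_trace(void)
{
	struct demo_dir *dir = demo_get_debugfs_dir();

	if (!dir)
		return 0;	/* interface disabled: carry on without the knob */
	printf("creating %s/iosm/trace\n", dir->name);
	return 1;
}
```

With the symbol left undefined, as above, the driver-side function takes
the early return and nothing debugfs-related is compiled in.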
Sergey Ryazanov [Tue, 7 Dec 2021 09:21:39 +0000 (12:21 +0300)]
net: wwan: iosm: move debugfs knobs into a subdir

The modem traces collection is a device (and so driver) specific option.
Therefore, move the related debugfs files into a driver-specific
subdirectory under the common per WWAN device directory.

Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sergey Ryazanov [Tue, 7 Dec 2021 09:21:38 +0000 (12:21 +0300)]
net: wwan: iosm: allow trace port be uninitialized

Collecting modem firmware traces is optional for the regular modem use.
There are not many reasons for aborting device initialization due to an
inability to initialize the trace port and (or) its debugfs interface.
So, demote the initialization failure error message into a warning and do
not break the initialization sequence in this case. Rework packet
processing and deinitialization so that they do not crash in case of
an uninitialized trace port.

This change is mainly a preparation for an upcoming configuration option
introduction that will help disable driver debugfs functionality.

Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
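
Note: the behaviour described above (warn instead of abort, then
tolerate the missing port in the data path and on teardown) can be
sketched in plain C as follows. This is an illustration under assumed
names; struct demo_modem, struct demo_trace and the demo_* functions do
not come from the IOSM driver.

```c
#include <stdio.h>
#include <stdlib.h>

struct demo_trace { unsigned long rx_packets; };
struct demo_modem { struct demo_trace *trace; /* NULL when unavailable */ };

/* Hypothetical init step that may fail, e.g. when debugfs is unavailable. */
static struct demo_trace *demo_trace_init(int should_fail)
{
	if (should_fail)
		return NULL;
	return calloc(1, sizeof(struct demo_trace));
}

/* Device init: a missing trace port is reported as a warning, not an error. */
static int demo_modem_init(struct demo_modem *m, int trace_fails)
{
	m->trace = demo_trace_init(trace_fails);
	if (!m->trace)
		fprintf(stderr, "warning: modem trace initialization failed\n");
	return 0;	/* the rest of the bring-up continues regardless */
}

/* Packet path: skip trace accounting when the port is absent. */
static void demo_trace_rx(struct demo_modem *m)
{
	if (m->trace)
		m->trace->rx_packets++;
}

/* Teardown must tolerate the uninitialized case as well. */
static void demo_modem_deinit(struct demo_modem *m)
{
	free(m->trace);		/* free(NULL) is a no-op */
	m->trace = NULL;
}
```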
Sergey Ryazanov [Tue, 7 Dec 2021 09:21:37 +0000 (12:21 +0300)]
net: wwan: iosm: consolidate trace port init code

Move the channel related structures initialization from
ipc_imem_channel_init() to ipc_trace_init() and call it directly. On the
one hand, this makes the trace port initialization symmetric to the
deinitialization, that is, it removes the additional wrapper.

On the other hand, this change consolidates the trace port related code
into a single source file, which facilitates an upcoming disabling of
this functionality by a user choice.

Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 9 Dec 2021 01:06:57 +0000 (17:06 -0800)]
Merge tag 'linux-can-next-for-5.17-20211208' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
can-next 2021-12-08

The first patch is by Vincent Mailhol and replaces the custom CAN
units with the generic ones from linux/units.h.

The next 3 patches are by Evgeny Boger and add Allwinner R40 support
to the sun4i CAN driver.

Andy Shevchenko contributes 4 patches to the hi311x CAN driver,
consisting of cleanups and converting the driver to the device
property API.

* tag 'linux-can-next-for-5.17-20211208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: hi311x: hi3110_can_probe(): convert to use dev_err_probe()
  can: hi311x: hi3110_can_probe(): make use of device property API
  can: hi311x: hi3110_can_probe(): try to get crystal clock rate from property
  can: hi311x: hi3110_can_probe(): use devm_clk_get_optional() to get the input clock
  ARM: dts: sun8i: r40: add node for CAN controller
  can: sun4i_can: add support for R40 CAN controller
  dt-bindings: net: can: add support for Allwinner R40 CAN controller
  can: bittiming: replace CAN units with the generic ones from linux/units.h
====================

Link: https://lore.kernel.org/r/20211208125055.223141-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 8 Dec 2021 22:31:18 +0000 (14:31 -0800)]
Merge branch 'rework-dsa-bridge-tx-forwarding-offload-api'

Vladimir Oltean says:

====================
Rework DSA bridge TX forwarding offload API

This change set is preparation work for DSA support of bridge FDB
isolation. It replaces struct net_device *dp->bridge_dev with a struct
dsa_bridge *dp->bridge that contains some extra information about that
bridge, like a unique number kept by DSA.

Up until now we computed that number only with the bridge TX forwarding
offload feature, but it will be needed for other features too, like for
isolation of FDB entries belonging to different bridges. Hardware
implementations vary, but one common pattern seems to be the presence of
a FID field which can be associated with that bridge number kept by DSA.
The idea was outlined here:
https://patchwork.kernel.org/project/netdevbpf/patch/20210818120150.892647-16-vladimir.oltean@nxp.com/
(the difference being that with this new proposal, drivers would not
need to call dsa_bridge_num_find, instead the bridge_num would be part
of the struct dsa_bridge :: num passed as argument).

No functional change is intended for drivers that don't already make use
of the bridge TX forwarding offload. I've tested the changes on the
felix, sja1105 and mv88e6xxx drivers, but nonetheless I'm copying all
DSA driver maintainers due to API changes that are taking place.

Compared to v1 and v2, the number of patches is larger, but the content
is mostly the same, just split up hopefully a bit better for review.
====================

Link: https://lore.kernel.org/r/20211206165758.1553882-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:58 +0000 (18:57 +0200)]
net: dsa: eliminate dsa_switch_ops :: port_bridge_tx_fwd_{,un}offload

We don't really need new switch API for these, and with new switches
which intend to add support for this feature, it will become cumbersome
to maintain.

The change consists in restructuring the two drivers that implement this
offload (sja1105 and mv88e6xxx) such that the offload is enabled and
disabled from the ->port_bridge_{join,leave} methods instead of the old
->port_bridge_tx_fwd_{,un}offload.

The only non-trivial change is that mv88e6xxx_map_virtual_bridge_to_pvt()
has been moved to avoid a forward declaration, and the
mv88e6xxx_reg_lock() calls from inside it have been removed, since
locking is now done from mv88e6xxx_port_bridge_{join,leave}.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:57 +0000 (18:57 +0200)]
net: dsa: add a "tx_fwd_offload" argument to ->port_bridge_join

This is a preparation patch for the removal of the DSA switch methods
->port_bridge_tx_fwd_offload() and ->port_bridge_tx_fwd_unoffload().
The plan is for the switch to report whether it offloads TX forwarding
directly as a response to the ->port_bridge_join() method.

This change deals with the noisy portion of converting all existing
function prototypes to take this new boolean pointer argument.
The bool is placed in the cross-chip notifier structure for bridge join,
and a reference to it is provided to drivers. In the next change, DSA
will then actually look at this value instead of calling
->port_bridge_tx_fwd_offload().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
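
Note: the "boolean pointer argument" described above is an ordinary
out-parameter. A minimal sketch of the idea, with a deliberately
simplified callback signature: struct demo_switch_ops and the demo_*
functions are hypothetical, and the real ->port_bridge_join() takes
more arguments than shown here.

```c
#include <stdbool.h>

struct demo_switch_ops {
	/* The driver reports TX forwarding offload through the out-parameter. */
	int (*port_bridge_join)(int port, bool *tx_fwd_offload);
};

/* A driver that offloads TX forwarding says so while joining. */
static int demo_join_with_offload(int port, bool *tx_fwd_offload)
{
	(void)port;
	*tx_fwd_offload = true;
	return 0;
}

/* A driver without the feature simply leaves the flag alone. */
static int demo_join_plain(int port, bool *tx_fwd_offload)
{
	(void)port;
	(void)tx_fwd_offload;	/* stays at the caller's default of false */
	return 0;
}

/* Core side: one call both joins the bridge and learns about the offload. */
static bool demo_core_join(const struct demo_switch_ops *ops, int port)
{
	bool tx_fwd_offload = false;

	if (ops->port_bridge_join(port, &tx_fwd_offload))
		return false;
	return tx_fwd_offload;
}
```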
Vladimir Oltean [Mon, 6 Dec 2021 16:57:56 +0000 (18:57 +0200)]
net: dsa: keep the bridge_dev and bridge_num as part of the same structure

The main desire behind this is to provide coherent bridge information to
the fast path without locking.

For example, right now we set dp->bridge_dev and dp->bridge_num from
separate code paths, so it is theoretically possible for a packet
transmission to read these two port properties consecutively and find a
bridge number which does not correspond with the bridge device.

Another desire is to start passing more complex bridge information to
dsa_switch_ops functions. For example, with FDB isolation, it is
expected that drivers will need to be passed the bridge which requested
an FDB/MDB entry to be offloaded, and along with that bridge_dev, the
associated bridge_num should be passed too, in case the driver might
want to implement an isolation scheme based on that number.

We already pass the {bridge_dev, bridge_num} pair to the TX forwarding
offload switch API, however we'd like to remove that and squash it into
the basic bridge join/leave API. So that means we need to pass this
pair to the bridge join/leave API.

During dsa_port_bridge_leave, first we unset dp->bridge_dev, then we
call the driver's .port_bridge_leave with what used to be our
dp->bridge_dev, but provided as an argument.

When bridge_dev and bridge_num get folded into a single structure, we
need to preserve this behavior in dsa_port_bridge_leave: we need a copy
of what used to be in dp->bridge.

Switch drivers check bridge membership by comparing dp->bridge_dev with
the provided bridge_dev, but now, if we provide the struct dsa_bridge as
a pointer, they cannot keep comparing dp->bridge to the provided
pointer, since this only points to an on-stack copy. To make this
obvious and prevent driver writers from forgetting and doing stupid
things, in this new API, the struct dsa_bridge is provided as a full
structure (not very large, contains an int and a pointer) instead of a
pointer. An explicit comparison function needs to be used to determine
bridge membership: dsa_port_offloads_bridge().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
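
Note: a small sketch of why a by-value bridge argument needs a
comparison helper. On the leave path the driver receives a copy, so the
address of that copy never equals the pointer stored in a port; only
the contents can be compared. The struct layout ("an int and a
pointer") follows the description above, but the field names and all
demo_* identifiers are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_net_device { const char *name; };

/* "An int and a pointer"; the field names are assumed. */
struct demo_bridge {
	struct demo_net_device *dev;
	unsigned int num;
};

struct demo_port {
	struct demo_bridge *bridge;	/* NULL when not under a bridge */
};

/* Compare by content: the argument may be a copy, so its address is meaningless. */
static bool demo_port_offloads_bridge(const struct demo_port *dp,
				      const struct demo_bridge *bridge)
{
	return dp->bridge && dp->bridge->dev == bridge->dev;
}

/* Leave path: copy first, clear the port, then hand the copy to the driver. */
static struct demo_bridge demo_port_bridge_leave(struct demo_port *dp)
{
	struct demo_bridge copy = *dp->bridge;

	dp->bridge = NULL;
	return copy;
}
```

A port that is still in the bridge matches the copy through the helper,
while a direct pointer comparison against the copy would always fail.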
Vladimir Oltean [Mon, 6 Dec 2021 16:57:55 +0000 (18:57 +0200)]
net: dsa: export bridging offload helpers to drivers

Move the static inline helpers from net/dsa/dsa_priv.h to
include/net/dsa.h, so that drivers can call functions such as
dsa_port_offloads_bridge_dev(), which will be necessary after the
transition to a more complex bridge structure.

More functions than are needed right now are being moved, but this is
done for uniformity.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:54 +0000 (18:57 +0200)]
net: dsa: rename dsa_port_offloads_bridge to dsa_port_offloads_bridge_dev

Currently the majority of dsa_port_bridge_dev_get() calls in drivers is
just to check whether a port is under the bridge device provided as
argument by the DSA API.

We'd like to change that DSA API so that a more complex structure is
provided as argument. To keep things more generic, and considering that
the new complex structure will be provided by value and not by
reference, direct comparisons between dp->bridge and the provided bridge
will be broken. The generic way to do the checking would simply be to
do something like dsa_port_offloads_bridge(dp, &bridge).

But there's a problem: we already have a function named that way, which
actually takes a bridge_dev net_device as argument. Rename it so that we
can use dsa_port_offloads_bridge for something else.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:53 +0000 (18:57 +0200)]
net: dsa: hide dp->bridge_dev and dp->bridge_num in drivers behind helpers

The location of the bridge device pointer and number is going to change.
It is not going to be kept individually per port, but in a common
structure allocated dynamically and which will have lockdep validation.

Use the helpers to access these elements so that we have a migration
path to the new organization.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:52 +0000 (18:57 +0200)]
net: dsa: hide dp->bridge_dev and dp->bridge_num in the core behind helpers

The location of the bridge device pointer and number is going to change.
It is not going to be kept individually per port, but in a common
structure allocated dynamically and which will have lockdep validation.

Create helpers to access these elements so that we have a migration path
to the new organization.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:51 +0000 (18:57 +0200)]
net: dsa: mv88e6xxx: compute port vlan membership based on dp->bridge_dev comparison

The goal of this change is to reduce mv88e6xxx_port_vlan() to a form
where dsa_port_bridge_same() can be used, since the dp->bridge_dev
pointer will be hidden in a future change.

To do that, we observe that the "br" pointer is deduced from a
dp->bridge_dev in both cases (of a physical switch port as well as a
virtual bridge). So instead of keeping the "br" pointer, we can just
keep the "dp" pointer from which "br" gets derived.

In the last iteration over switch ports, we must use another iterator
variable, "other_dp", since now we use the "dp" variable to keep an
indirect reference to the bridge. While at it, the old code used to
filter only the ports which were part of the same switch as "ds".
There exists a dedicated DSA port iterator for that:
dsa_switch_for_each_port (which skips the ports in the tree that belong
to non-local switches), so we can just use that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:50 +0000 (18:57 +0200)]
net: dsa: mv88e6xxx: iterate using dsa_switch_for_each_user_port in mv88e6xxx_port_check_hw_vlan

Avoid a plethora of dsa_to_port() calls (some hidden behind
dsa_is_*_port and some in plain sight) by keeping two struct dsa_port
references: one to the port passed as argument, and another to the other
ports of the switch that we're iterating over.

This isn't called from the DSA initialization path, so there is no risk
that we have user ports without a dp->slave populated. So the combined
checks that a port isn't a DSA port, a CPU port, or doesn't have a slave
net device (therefore is unused), are strictly equivalent to the simple
check that the port is a user port. This is already handled by the DSA
iterator.

i gets replaced by other_dp->index, dsa_is_*_port calls get replaced by
dsa_port_is_*, and dsa_to_port gets replaced by the respective pointer
directly.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
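
Note: the iterator-based style described above can be sketched with a
filtering for-each macro. This is a userspace illustration only; the
demo_* types and the macro are hypothetical and much simpler than the
DSA port list, but the shape (a loop macro followed by an "if" filter,
with the loop variable being the port pointer itself) is the same idea.

```c
enum demo_port_type { DEMO_UNUSED, DEMO_CPU, DEMO_DSA, DEMO_USER };

struct demo_sw_port { int index; enum demo_port_type type; };
struct demo_switch { struct demo_sw_port ports[5]; int num_ports; };

/* Visit only user ports; the filter lives in the iterator, not in each caller. */
#define demo_for_each_user_port(dp, ds) \
	for ((dp) = (ds)->ports; (dp) < (ds)->ports + (ds)->num_ports; (dp)++) \
		if ((dp)->type == DEMO_USER)

/* Count the user ports other than "port", without any index-to-port lookups. */
static int demo_count_other_user_ports(struct demo_switch *ds, int port)
{
	struct demo_sw_port *other_dp;
	int count = 0;

	demo_for_each_user_port(other_dp, ds) {
		if (other_dp->index == port)
			continue;
		count++;
	}
	return count;
}
```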
Vladimir Oltean [Mon, 6 Dec 2021 16:57:49 +0000 (18:57 +0200)]
net: dsa: mt7530: iterate using dsa_switch_for_each_user_port in bridging ops

Avoid repeated calls to dsa_to_port() (some hidden behind dsa_is_user_port
and some in plain sight) by keeping two struct dsa_port references: one
to the port passed as argument, and another to the other ports of the
switch that we're iterating over.

dsa_to_port(ds, i) gets replaced by other_dp, i gets replaced by
other_port which is derived from other_dp->index, dsa_is_user_port is
handled by the DSA iterator.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:48 +0000 (18:57 +0200)]
net: dsa: assign a bridge number even without TX forwarding offload

The service where DSA assigns a unique bridge number for each forwarding
domain is useful even for drivers which do not implement the TX
forwarding offload feature.

For example, drivers might use the dp->bridge_num for FDB isolation.

So rename ds->num_fwd_offloading_bridges to ds->max_num_bridges, and
calculate a unique bridge_num for all drivers that set this value.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 6 Dec 2021 16:57:47 +0000 (18:57 +0200)]
net: dsa: make dp->bridge_num one-based

I have seen too many bugs already due to the fact that we must encode an
invalid dp->bridge_num as a negative value, because the natural tendency
is to check that invalid value using (!dp->bridge_num). Latest example
can be seen in commit 1bec0f05062c ("net: dsa: fix bridge_num not
getting cleared after ports leaving the bridge").

Convert the existing users to assume that dp->bridge_num == 0 is the
encoding for invalid, and valid bridge numbers start from 1.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
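
Note: a minimal sketch of a one-based allocator where 0 encodes "no
bridge number", so that the natural check (!num) is the correct one.
The limit and the demo_* names are made up for the illustration and do
not reflect how DSA stores its numbers.

```c
#include <stdbool.h>

#define DEMO_MAX_BRIDGES 4	/* hypothetical limit for this sketch */

static bool demo_in_use[DEMO_MAX_BRIDGES + 1];	/* index 0 is never used */

/* Returns a number in 1..DEMO_MAX_BRIDGES, or 0 when none is free. */
static unsigned int demo_bridge_num_get(void)
{
	for (unsigned int n = 1; n <= DEMO_MAX_BRIDGES; n++) {
		if (!demo_in_use[n]) {
			demo_in_use[n] = true;
			return n;
		}
	}
	return 0;
}

/* Releasing 0 is harmless: it never referred to a real bridge number. */
static void demo_bridge_num_put(unsigned int n)
{
	if (n)
		demo_in_use[n] = false;
}
```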
Andy Shevchenko [Mon, 6 Dec 2021 16:55:42 +0000 (18:55 +0200)]
can: hi311x: hi3110_can_probe(): convert to use dev_err_probe()

When probing is deferred, the reason is saved for further debugging.
Besides that, it's fine to call dev_err_probe() in ->probe() when the
error code is known. Convert the driver to use dev_err_probe().

Link: https://lore.kernel.org/all/20211206165542.69887-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Andy Shevchenko [Mon, 6 Dec 2021 16:55:41 +0000 (18:55 +0200)]
can: hi311x: hi3110_can_probe(): make use of device property API

Make use of the device property API in this driver so that both
OF-based and ACPI-based systems can use this driver.

Link: https://lore.kernel.org/all/20211206165542.69887-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Andy Shevchenko [Mon, 6 Dec 2021 16:55:40 +0000 (18:55 +0200)]
can: hi311x: hi3110_can_probe(): try to get crystal clock rate from property

In some configurations, mainly ACPI-based, the clock frequency of the
device is supplied by the very well established 'clock-frequency'
property. Hence, try to get it from the property as a last resort if no
other providers are available.

Link: https://lore.kernel.org/all/20211206165542.69887-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Andy Shevchenko [Mon, 6 Dec 2021 16:55:39 +0000 (18:55 +0200)]
can: hi311x: hi3110_can_probe(): use devm_clk_get_optional() to get the input clock

It's not clear what the intention was behind the redundant use of
IS_ERR() around the clock pointer, since with the error check of
devm_clk_get() followed by a bailout it can't be invalid.

Simplify the code which fetches the input clock by using
devm_clk_get_optional(). This will allow switching to the device
property approach in the future.

Link: https://lore.kernel.org/all/20211206165542.69887-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Evgeny Boger [Mon, 22 Nov 2021 10:46:16 +0000 (13:46 +0300)]
ARM: dts: sun8i: r40: add node for CAN controller

Allwinner R40 (also known as A40i, T3, V40) has a CAN controller. The
controller is the same as in the earlier A10 and A20 SoCs, but needs
the reset line to be deasserted before use.

This patch adds a CAN node and the corresponding pinctrl descriptions.

Link: https://lore.kernel.org/all/20211122104616.537156-4-boger@wirenboard.com
Signed-off-by: Evgeny Boger <boger@wirenboard.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Evgeny Boger [Mon, 22 Nov 2021 10:46:15 +0000 (13:46 +0300)]
can: sun4i_can: add support for R40 CAN controller

Allwinner R40 (also known as A40i, T3, V40) has a CAN controller. The
controller is the same as in the earlier A10 and A20 SoCs, but needs
the reset line to be deasserted before use.

This patch adds a new compatible for the R40 CAN controller. Depending
on the compatible, the reset line can be requested from DT.

Link: https://lore.kernel.org/all/20211122104616.537156-3-boger@wirenboard.com
Signed-off-by: Evgeny Boger <boger@wirenboard.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Evgeny Boger [Mon, 22 Nov 2021 10:46:14 +0000 (13:46 +0300)]
dt-bindings: net: can: add support for Allwinner R40 CAN controller

Allwinner R40 (also known as A40i, T3, V40) has a CAN controller. The
controller is the same as in the earlier A10 and A20 SoCs, but needs
the reset line to be deasserted before use.

This patch introduces a new compatible for the R40 CAN controller with
a required resets property.

Link: https://lore.kernel.org/all/20211122104616.537156-2-boger@wirenboard.com
Signed-off-by: Evgeny Boger <boger@wirenboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Vincent Mailhol [Wed, 24 Nov 2021 01:45:36 +0000 (10:45 +0900)]
can: bittiming: replace CAN units with the generic ones from linux/units.h

In [1], we introduced a set of units in linux/can/bittiming.h. Since
then, generic SI prefixes were added to linux/units.h in [2]. Those
new prefixes can perfectly replace the CAN-specific ones.

This patch replaces all occurrences of the CAN units with their
corresponding prefix (from linux/units) and the unit (as a comment)
according to the table below.

 CAN units   SI metric prefix (from linux/units) + unit (as a comment)
 ----------------------------------------------------------------------
 CAN_KBPS    KILO /* BPS */
 CAN_MBPS    MEGA /* BPS */
 CAN_MHZ     MEGA /* Hz */

The definitions are then removed from linux/can/bittiming.h.

[1] commit 1d7750760b70 ("can: bittiming: add CAN_KBPS, CAN_MBPS and
CAN_MHZ macros")

[2] commit 26471d4a6cf8 ("units: Add SI metric prefix definitions")

Link: https://lore.kernel.org/all/20211124014536.782550-1-mailhol.vincent@wanadoo.fr
Suggested-by: Jimmy Assarsson <extja@kvaser.com>
Suggested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
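
Note: the replacement convention from the table (generic prefix as the
multiplier, unit kept as a trailing comment) looks roughly like this in
use. The sketch defines local DEMO_KILO and DEMO_MEGA stand-ins rather
than including linux/units.h, and the structure, its values and the
helper are invented for illustration only.

```c
/* Local stand-ins for the generic SI prefixes; names and values are assumed. */
#define DEMO_KILO 1000UL
#define DEMO_MEGA 1000000UL

struct demo_bittiming_const {
	unsigned long bitrate_max;
	unsigned long clock_freq;
};

/* Prefix as the multiplier, unit as a comment, as in the table. */
static const struct demo_bittiming_const demo_limits = {
	.bitrate_max = 8 * DEMO_MEGA /* BPS */,
	.clock_freq = 80 * DEMO_MEGA /* Hz */,
};

/* Hypothetical check: is a requested bitrate within the allowed range? */
static int demo_bitrate_ok(unsigned long bps)
{
	return bps >= 10 * DEMO_KILO /* BPS */ && bps <= demo_limits.bitrate_max;
}
```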
Jakub Kicinski [Wed, 8 Dec 2021 06:01:07 +0000 (22:01 -0800)]
Merge branch 's390-net-updates-2021-12-06'

Alexandra Winter says:

====================
s390/net: updates 2021-12-06

This brings some maintenance improvements and removes some
unnecessary code checks.
====================

Link: https://lore.kernel.org/r/20211207090452.1155688-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 7 Dec 2021 09:04:52 +0000 (10:04 +0100)]
s390/qeth: remove check for packing mode in qeth_check_outbound_queue()

If qeth_check_outbound_queue() finds a partially filled TX buffer on
the queue and flushes it, then the queue _must_ have been in packing
mode.

Remove the redundant check when updating the relevant statistics.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 7 Dec 2021 09:04:51 +0000 (10:04 +0100)]
s390/qeth: fine-tune .ndo_select_queue()

Avoid a conditional branch for L2 devices when selecting the TX queue,
and have shared logic for OSA devices.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 7 Dec 2021 09:04:50 +0000 (10:04 +0100)]
s390/qeth: don't offer .ndo_bridge_* ops for OSA devices

qeth_l2_detect_dev2br_support() will only set brport_hw_features for IQD
devices. So qeth_l2_bridge_getlink() and qeth_l2_bridge_setlink() will
always return -EOPNOTSUPP on OSA devices. Just don't offer these
callbacks instead.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 7 Dec 2021 09:04:49 +0000 (10:04 +0100)]
s390/qeth: split up L2 netdev_ops

Splitting up the netdev_ops allows for fine-tuning some of the ndo's
in subsequent patches.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 7 Dec 2021 09:04:48 +0000 (10:04 +0100)]
s390/qeth: simplify qeth_receive_skb()

Now that the OSN code is gone, we don't need the second switch statement
in the RX path. And getting rid of its (unreachable) default case is a
nice simplification.

Also don't pass in the full HW header, all we still need is a flag to
indicate whether the skb can use CSO. This we can already obtain during
the first peek at the HW header.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kees Cook [Tue, 7 Dec 2021 06:32:17 +0000 (22:32 -0800)]
hv_sock: Extract hvs_send_data() helper that takes only header

When building under -Warray-bounds, the compiler is especially
conservative when faced with casts from a smaller object to a larger
object. While this has found many real bugs, there are some cases that
are currently false positives (like here). With this as one of the last
few instances of the warning in the kernel before -Warray-bounds can be
enabled globally, rearrange the functions so that there is a header-only
version of hvs_send_data(). Silences this warning:

net/vmw_vsock/hyperv_transport.c: In function 'hvs_shutdown_lock_held.constprop':
net/vmw_vsock/hyperv_transport.c:231:32: warning: array subscript 'struct hvs_send_buf[0]' is partly outside array bounds of 'struct vmpipe_proto_header[1]' [-Warray-bounds]
  231 |         send_buf->hdr.pkt_type = 1;
      |         ~~~~~~~~~~~~~~~~~~~~~~~^~~
net/vmw_vsock/hyperv_transport.c:465:36: note: while referencing 'hdr'
  465 |         struct vmpipe_proto_header hdr;
      |                                    ^~~

This change results in no executable instruction differences.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20211207063217.2591451-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: felix: use kmemdup() to replace kmalloc + memcpy
Yihao Han [Tue, 7 Dec 2021 06:44:18 +0000 (22:44 -0800)]
net: dsa: felix: use kmemdup() to replace kmalloc + memcpy

Fix the following coccicheck warnings:
/drivers/net/dsa/ocelot/felix_vsc9959.c:1627:13-20:
WARNING opportunity for kmemdup
/drivers/net/dsa/ocelot/felix_vsc9959.c:1506:16-23:
WARNING opportunity for kmemdup

Signed-off-by: Yihao Han <hanyihao@vivo.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211207064419.38632-1-hanyihao@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
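
For readers unfamiliar with the pattern that coccicheck flags here, the following is a minimal userspace sketch of what a memdup-style helper does. The name memdup_sketch() is hypothetical and uses malloc(); the kernel's kmemdup() additionally takes a gfp_t allocation flag.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of kmemdup(): allocate len bytes and
 * copy src into them in one step. Returns NULL on allocation failure.
 */
static void *memdup_sketch(const void *src, size_t len)
{
	void *dst = malloc(len);

	if (dst)
		memcpy(dst, src, len);
	return dst;
}
```

Folding the allocation and the copy into one call removes the window in which the two sizes can drift apart, which is presumably why the open-coded kmalloc() + memcpy() pair is reported as an opportunity.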
2 years agoMerge branch 'prepare-ocelot-for-external-interface-control'
Jakub Kicinski [Wed, 8 Dec 2021 05:44:54 +0000 (21:44 -0800)]
Merge branch 'prepare-ocelot-for-external-interface-control'

Colin Foster says:

====================
prepare ocelot for external interface control

This patch set is derived from an attempt to include external control
for a VSC751[1234] chip via SPI. That patch set has grown large and is
getting unwieldy for reviewers and the developers... me.

I'm breaking out the changes from that patch set. Some are trivial
  net: dsa: ocelot: remove unnecessary pci_bar variables
  net: dsa: ocelot: felix: Remove requirement for PCS in felix devices

some are required for SPI
  net: dsa: ocelot: felix: add interface for custom regmaps

and some are just to expose code to be shared
  net: mscc: ocelot: split register definitions to a separate file

The entirety of this patch set should have essentially no impact on the
system performance.
====================

Link: https://lore.kernel.org/r/20211207170030.1406601-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: split register definitions to a separate file
Colin Foster [Tue, 7 Dec 2021 17:00:30 +0000 (09:00 -0800)]
net: mscc: ocelot: split register definitions to a separate file

Moving these to a separate file will allow them to be shared with other
drivers.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: ocelot: felix: add interface for custom regmaps
Colin Foster [Tue, 7 Dec 2021 17:00:29 +0000 (09:00 -0800)]
net: dsa: ocelot: felix: add interface for custom regmaps

Add an interface so that non-mmio regmaps can be used.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: ocelot: felix: Remove requirement for PCS in felix devices
Colin Foster [Tue, 7 Dec 2021 17:00:28 +0000 (09:00 -0800)]
net: dsa: ocelot: felix: Remove requirement for PCS in felix devices

Existing felix devices all have an initialized pcs array. Future devices
might not, so run a NULL check on the array before dereferencing it so
that those future drivers do not crash at this point.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: ocelot: remove unnecessary pci_bar variables
Colin Foster [Tue, 7 Dec 2021 17:00:27 +0000 (09:00 -0800)]
net: dsa: ocelot: remove unnecessary pci_bar variables

The pci_bar variables for the switch and imdio don't make sense for the
generic felix driver. Move them to felix_vsc9959 to limit scope and
simplify the felix_info struct.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: hns3: Fix spelling mistake "faile" -> "failed"
Colin Ian King [Mon, 6 Dec 2021 09:12:07 +0000 (09:12 +0000)]
net: hns3: Fix spelling mistake "faile" -> "failed"

There is a spelling mistake in a dev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20211206091207.113648-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'wireless-drivers-next-2021-12-07' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Wed, 8 Dec 2021 05:01:16 +0000 (21:01 -0800)]
Merge tag 'wireless-drivers-next-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.17

First set of patches for v5.17. The biggest change is the iwlmei
driver for Intel's AMT devices. Also now WCN6855 support in ath11k
should be usable.

Major changes:

ath10k
 * fetch (pre-)calibration data via nvmem subsystem

ath11k
 * enable 802.11 power save mode in station mode for qca6390 and wcn6855
 * trace log support
 * proper board file detection for WCN6855 based on PCI ids
 * BSS color change support

rtw88
 * add debugfs file to force lowest basic rate
 * add quirk to disable PCI ASPM on HP 250 G7 Notebook PC

mwifiex
 * add quirk to disable deep sleep with certain hardware revision in
  Surface Book 2 devices

iwlwifi
 * add iwlmei driver for co-operating with Intel's Active Management
   Technology (AMT) devices

* tag 'wireless-drivers-next-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next: (87 commits)
  iwlwifi: mei: fix linking when tracing is not enabled
  rtlwifi: rtl8192de: Style clean-ups
  mwl8k: Use named struct for memcpy() region
  intersil: Use struct_group() for memcpy() region
  libertas_tf: Use struct_group() for memcpy() region
  libertas: Use struct_group() for memcpy() region
  wlcore: no need to initialise statics to false
  rsi: Fix out-of-bounds read in rsi_read_pkt()
  rsi: Fix use-after-free in rsi_rx_done_handler()
  brcmfmac: Configure keep-alive packet on suspend
  wilc1000: remove '-Wunused-but-set-variable' warning in chip_wakeup()
  iwlwifi: mvm: read the rfkill state and feed it to iwlmei
  iwlwifi: mvm: add vendor commands needed for iwlmei
  iwlwifi: integrate with iwlmei
  iwlwifi: mei: add debugfs hooks
  iwlwifi: mei: add the driver to allow cooperation with CSME
  mei: bus: add client dma interface
  mwifiex: Ignore BTCOEX events from the 88W8897 firmware
  mwifiex: Ensure the version string from the firmware is 0-terminated
  mwifiex: Add quirk to disable deep sleep with certain hardware revision
  ...
====================

Link: https://lore.kernel.org/r/20211207144211.A9949C341C1@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'net-second-round-of-netdevice-refcount-tracking'
Jakub Kicinski [Wed, 8 Dec 2021 04:45:10 +0000 (20:45 -0800)]
Merge branch 'net-second-round-of-netdevice-refcount-tracking'

Eric Dumazet says:

====================
net: second round of netdevice refcount tracking

The most interesting part of this series is probably
("inet: add net device refcount tracker to struct fib_nh_common")
but only future reports will confirm this guess.
====================

Link: https://lore.kernel.org/r/20211207013039.1868645-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: sched: act_mirred: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:39 +0000 (17:30 -0800)]
net: sched: act_mirred: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoopenvswitch: add net device refcount tracker to struct vport
Eric Dumazet [Tue, 7 Dec 2021 01:30:38 +0000 (17:30 -0800)]
openvswitch: add net device refcount tracker to struct vport

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonetlink: add net device refcount tracker to struct ethnl_req_info
Eric Dumazet [Tue, 7 Dec 2021 01:30:37 +0000 (17:30 -0800)]
netlink: add net device refcount tracker to struct ethnl_req_info

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/smc: add net device tracker to struct smc_pnetentry
Eric Dumazet [Tue, 7 Dec 2021 01:30:36 +0000 (17:30 -0800)]
net/smc: add net device tracker to struct smc_pnetentry

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agopktgen add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:35 +0000 (17:30 -0800)]
pktgen add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agollc: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:34 +0000 (17:30 -0800)]
llc: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoax25: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:33 +0000 (17:30 -0800)]
ax25: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoinet: add net device refcount tracker to struct fib_nh_common
Eric Dumazet [Tue, 7 Dec 2021 01:30:32 +0000 (17:30 -0800)]
inet: add net device refcount tracker to struct fib_nh_common

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: switchdev: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:31 +0000 (17:30 -0800)]
net: switchdev: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: watchdog: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:30 +0000 (17:30 -0800)]
net: watchdog: add net device refcount tracker

Add a netdevice_tracker inside struct net_device, to track
the self reference when a device has an active watchdog timer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bridge: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:29 +0000 (17:30 -0800)]
net: bridge: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agovlan: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:28 +0000 (17:30 -0800)]
vlan: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: eql: add net device refcount tracker
Eric Dumazet [Tue, 7 Dec 2021 01:30:27 +0000 (17:30 -0800)]
net: eql: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mptcp-new-features-for-mptcp-sockets-and-netlink-pm'
Jakub Kicinski [Tue, 7 Dec 2021 19:36:36 +0000 (11:36 -0800)]
Merge branch 'mptcp-new-features-for-mptcp-sockets-and-netlink-pm'

Mat Martineau says:

====================
mptcp: New features for MPTCP sockets and netlink PM

This collection of patches adds MPTCP socket support for a few socket
options, ioctls, and one ancillary data type (specifics for each are
listed below). There's also a patch modifying the netlink MPTCP path
manager API to allow setting the backup flag on a configured interface
using the endpoint ID instead of the full IP address.

Patches 1 & 2: TCP_INQ cmsg and selftests.

Patches 3 & 4: SIOCINQ, OUTQ, and OUTQNSD ioctls and selftests.

Patch 5: Change backup flag using endpoint ID.

Patches 6 & 7: IP_TOS socket option and selftests.

Patches 8-10: TCP_CORK and TCP_NODELAY socket options. Includes a tcp
change to expose __tcp_sock_set_cork() and __tcp_sock_set_nodelay() for
use by MPTCP.
====================

Link: https://lore.kernel.org/r/20211203223541.69364-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: support TCP_CORK and TCP_NODELAY
Maxim Galaganov [Fri, 3 Dec 2021 22:35:41 +0000 (14:35 -0800)]
mptcp: support TCP_CORK and TCP_NODELAY

First, add cork and nodelay fields to the mptcp_sock structure
so they can be used in sync_socket_options(), and fill them on setsockopt
while holding the msk socket lock.

Then, on setsockopt set proper tcp_sk(ssk)->nonagle values for subflows
by calling __tcp_sock_set_cork() or __tcp_sock_set_nodelay() on the ssk
while holding the ssk socket lock.

tcp_push_pending_frames() will be invoked on the ssk if a cork was cleared
or nodelay was set. Also set MPTCP_PUSH_PENDING bit by calling
mptcp_check_and_set_pending(). This will lead to __mptcp_push_pending()
being called inside mptcp_release_cb() with new tcp_sk(ssk)->nonagle.

Also add getsockopt support for TCP_CORK and TCP_NODELAY.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Maxim Galaganov <max@internet.ru>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
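
From userspace, the options added by this patch are used the same way as on a plain TCP socket. The following is a minimal sketch, assuming a Linux system; the helper names are hypothetical, and the fallback to IPPROTO_TCP is only there so the sketch also runs on kernels built without MPTCP.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* value from the Linux UAPI headers */
#endif

/*
 * Open an MPTCP stream socket, falling back to plain TCP when the
 * kernel does not support MPTCP. Returns the fd, or -1 on error.
 */
static int open_stream_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0)
		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	return fd;
}

/*
 * Set a boolean TCP-level option (TCP_CORK or TCP_NODELAY) and read it
 * back. Returns the value reported by getsockopt(), or -1 on error.
 */
static int set_and_get_tcp_flag(int fd, int optname, int on)
{
	int val = on;
	socklen_t len = sizeof(val);

	if (setsockopt(fd, IPPROTO_TCP, optname, &val, sizeof(val)) < 0)
		return -1;
	if (getsockopt(fd, IPPROTO_TCP, optname, &val, &len) < 0)
		return -1;
	return val;
}
```

On an MPTCP socket running a kernel without this patch, the setsockopt() call is expected to fail, so callers should treat -1 as "not supported" rather than as a fatal error.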
2 years agomptcp: expose mptcp_check_and_set_pending
Maxim Galaganov [Fri, 3 Dec 2021 22:35:40 +0000 (14:35 -0800)]
mptcp: expose mptcp_check_and_set_pending

Expose the mptcp_check_and_set_pending() function for use inside MPTCP
sockopt code. The next patch will call it when TCP_CORK is cleared or
TCP_NODELAY is set on the MPTCP socket in order to push pending data
from mptcp_release_cb().

Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Maxim Galaganov <max@internet.ru>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: expose __tcp_sock_set_cork and __tcp_sock_set_nodelay
Maxim Galaganov [Fri, 3 Dec 2021 22:35:39 +0000 (14:35 -0800)]
tcp: expose __tcp_sock_set_cork and __tcp_sock_set_nodelay

Expose __tcp_sock_set_cork() and __tcp_sock_set_nodelay() for use in
MPTCP setsockopt code -- namely for syncing MPTCP socket options with
subflows inside sync_socket_options() while already holding the subflow
socket lock.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Maxim Galaganov <max@internet.ru>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mptcp: check IP_TOS in/out are the same
Florian Westphal [Fri, 3 Dec 2021 22:35:38 +0000 (14:35 -0800)]
selftests: mptcp: check IP_TOS in/out are the same

Check that getsockopt(IP_TOS) returns what setsockopt(IP_TOS) did set
right before.

Also check that socklen_t == 0 and -1 input values match those
of normal tcp sockets.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: getsockopt: add support for IP_TOS
Florian Westphal [Fri, 3 Dec 2021 22:35:37 +0000 (14:35 -0800)]
mptcp: getsockopt: add support for IP_TOS

An earlier patch added IP_TOS setsockopt support; this patch allows
getting the value set by an earlier setsockopt.

Extend mptcp_put_int_option() to handle u8 input/output by
adding the required cast.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: allow changing the "backup" bit by endpoint id
Davide Caratti [Fri, 3 Dec 2021 22:35:36 +0000 (14:35 -0800)]
mptcp: allow changing the "backup" bit by endpoint id

A non-zero 'id' is sufficient to identify MPTCP endpoints: allow changing
the value of the 'backup' bit by simply specifying the endpoint id.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/158
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mptcp: add inq test case
Florian Westphal [Fri, 3 Dec 2021 22:35:35 +0000 (14:35 -0800)]
selftests: mptcp: add inq test case

Client and server use a unix socket connection to communicate
outside of the mptcp connection.

This allows the consumer to know in advance how many bytes have been
(or will be) sent by the peer, which in turn allows stricter checks
on the byte counts reported by the TCP_INQ cmsg.

Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
Florian Westphal [Fri, 3 Dec 2021 22:35:34 +0000 (14:35 -0800)]
mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls

Allow querying the in-sequence data ready for read(), the total bytes in
the write queue, and the total bytes in the write queue that have not yet
been sent.

v2: remove unneeded READ_ONCE() (Paolo Abeni)
v3: check for new data unconditionally in SIOCINQ ioctl (Mat Martineau)

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
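
All three ioctls follow the usual Linux convention of writing an int byte count through the third argument. The following is a minimal sketch of a caller; the helper name socket_queue_bytes() is hypothetical, and nothing in it is specific to MPTCP.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>	/* SIOCINQ, SIOCOUTQ, SIOCOUTQNSD */

/*
 * Query one of the queue-size ioctls on a socket:
 *   SIOCINQ     - in-sequence bytes ready for read()
 *   SIOCOUTQ    - total bytes in the write queue
 *   SIOCOUTQNSD - bytes in the write queue not yet sent
 * Returns the byte count, or -1 if the ioctl fails.
 */
static int socket_queue_bytes(int fd, unsigned long request)
{
	int bytes = 0;

	if (ioctl(fd, request, &bytes) < 0)
		return -1;
	return bytes;
}
```

A caller would use it as `socket_queue_bytes(fd, SIOCINQ)` on a connected socket; a return of -1 on an MPTCP socket most likely means the running kernel predates this patch.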
2 years agoselftests: mptcp: add TCP_INQ support
Florian Westphal [Fri, 3 Dec 2021 22:35:33 +0000 (14:35 -0800)]
selftests: mptcp: add TCP_INQ support

Do checks on the returned inq counter.

Fail on:
1. Huge value (> 1 kbyte, test case files are 1 kb)
2. last hint larger than returned bytes when read was short
3. erroneous indication of EOF.

Case 3 happens when a hint of X bytes reads X-1 on the next call
but the next recvmsg returns more data (instead of EOF).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: add TCP_INQ cmsg support
Florian Westphal [Fri, 3 Dec 2021 22:35:32 +0000 (14:35 -0800)]
mptcp: add TCP_INQ cmsg support

Support the TCP_INQ setsockopt.

This is a boolean that tells recvmsg path to include the remaining
in-sequence bytes in the cmsg data.

v2: do not use CB(skb)->offset, increment map_seq instead (Paolo Abeni)
v3: adjust CB(skb)->map_seq when taking skb from ofo queue (Paolo Abeni)

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/224
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agovrf: use dev_replace_track() for better tracking
Eric Dumazet [Tue, 7 Dec 2021 05:56:03 +0000 (21:56 -0800)]
vrf: use dev_replace_track() for better tracking

vrf_rt6_release() and vrf_rtable_release() change dst->dev.

Instead of

dev_hold(ndev);
dev_put(odev);

We should use

dev_replace_track(odev, ndev, &dst->dev_tracker, GFP_KERNEL);

If we do not transfer dst->dev_tracker to the new device,
we will get warnings from ref_tracker_dir_exit() when odev
is finally dismantled.

Fixes: 9038c320001d ("net: dst: add net device refcount tracking to dst_entry")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211207055603.1926372-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
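
The difference between the two sequences can be illustrated with a small userspace model. This is only a sketch: every type and function name below is hypothetical, the per-device `tracked` counter stands in for the list of outstanding trackers, and none of it is the kernel implementation.

```c
#include <stddef.h>

/* Hypothetical model of a device with a refcount and tracker count. */
struct dev_sketch {
	int refcnt;
	int tracked;	/* number of trackers still registered on this device */
};

/* Hypothetical tracker: remembers which device it is registered on. */
struct tracker_sketch {
	struct dev_sketch *dev;
};

static void hold_track(struct dev_sketch *dev, struct tracker_sketch *tr)
{
	dev->refcnt++;
	dev->tracked++;
	tr->dev = dev;
}

static void put_track(struct dev_sketch *dev, struct tracker_sketch *tr)
{
	dev->refcnt--;
	dev->tracked--;
	tr->dev = NULL;
}

/* Plain hold/put: the refcounts move, but the tracker stays on odev. */
static void replace_untracked(struct dev_sketch *odev, struct dev_sketch *ndev)
{
	ndev->refcnt++;
	odev->refcnt--;
}

/* Tracked replace: the tracker moves together with the reference. */
static void replace_track(struct dev_sketch *odev, struct dev_sketch *ndev,
			  struct tracker_sketch *tr)
{
	if (odev)
		put_track(odev, tr);
	if (ndev)
		hold_track(ndev, tr);
}
```

In this model, replace_untracked() leaves odev with a zero refcount but one registered tracker, which corresponds to the stale entry that would be reported when odev is dismantled.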
2 years agonet/smc: Clear memory when release and reuse buffer
Tony Lu [Fri, 3 Dec 2021 11:33:31 +0000 (12:33 +0100)]
net/smc: Clear memory when release and reuse buffer

Currently, buffers are cleared when smc connections are created and
when buffers are reused. This slows down the establishment of new
connections. In most cases, applications want to establish
connections as quickly as possible.

This patch moves memset() from the connection creation path to the
release and buffer unuse path. This trades faster connection
establishment for slower release.

Test environments:
- CPU Intel Xeon Platinum 8 core, mem 32 GiB, nic Mellanox CX4
- socket sndbuf / rcvbuf: 16384 / 131072 bytes
- w/o first round, 5 rounds, avg, 100 conns batch per round
- smc_buf_create() use bpftrace kprobe, introduces extra latency

Latency benchmarks for smc_buf_create():
  w/o patch : 19040.0 ns
  w/  patch :  1932.6 ns
  ratio :        10.2% (-89.8%)

Latency benchmarks for socket create and connect:
  w/o patch :   143.3 us
  w/  patch :   102.2 us
  ratio :        71.3% (-28.7%)

The latency of establishing connections is reduced by 28.7%.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20211203113331.2818873-1-kgraul@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "net: hns3: add void before function which don't receive ret"
Guangbin Huang [Sat, 4 Dec 2021 01:24:48 +0000 (09:24 +0800)]
Revert "net: hns3: add void before function which don't receive ret"

This reverts commit 5ac4f180bd07116c1e57858bc3f6741adbca3eb6.

Sorry for not noticing that the function devlink_register() has already
been declared as void, so this patch needs to be reverted.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/20211204012448.51360-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: prestera: replace zero-length array with flexible-array member
José Expósito [Sat, 4 Dec 2021 17:13:49 +0000 (18:13 +0100)]
net: prestera: replace zero-length array with flexible-array member

One-element and zero-length arrays are deprecated and should be
replaced with flexible-array members:
https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Replace zero-length array with flexible-array member and make use
of the struct_size() helper.

Link: https://github.com/KSPP/linux/issues/78
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Reviewed-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Tested-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20211204171349.22776-1-jose.exposito89@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
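
The conversion pattern can be sketched in userspace C as follows. The struct and helper names are hypothetical, and msg_size() is only a stand-in for the kernel's struct_size() helper, mirroring its behaviour of saturating to SIZE_MAX on overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical message with a trailing variable-length payload. */
struct msg {
	uint32_t count;
	uint32_t vals[];	/* flexible-array member instead of vals[0] */
};

/*
 * Userspace stand-in for struct_size(): header size plus count
 * elements, or SIZE_MAX if the computation would overflow.
 */
static size_t msg_size(size_t count)
{
	if (count > (SIZE_MAX - sizeof(struct msg)) / sizeof(uint32_t))
		return SIZE_MAX;
	return sizeof(struct msg) + count * sizeof(uint32_t);
}

/* Allocate a zeroed message with room for count values. */
static struct msg *msg_alloc(uint32_t count)
{
	size_t size = msg_size(count);
	struct msg *m;

	if (size == SIZE_MAX)
		return NULL;
	m = calloc(1, size);
	if (m)
		m->count = count;
	return m;
}
```

Computing the size in one checked helper avoids the open-coded `sizeof(*m) + n * sizeof(elem)` arithmetic, which can silently wrap for large n.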
2 years agonet: wwan: iosm: select CONFIG_RELAY
Arnd Bergmann [Sat, 4 Dec 2021 17:40:25 +0000 (18:40 +0100)]
net: wwan: iosm: select CONFIG_RELAY

The iosm driver started using relayfs, but is missing the Kconfig
logic to ensure it's built into the kernel:

x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_create_buf_file_handler':
iosm_ipc_trace.c:(.text+0x16): undefined reference to `relay_file_operations'
x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_subbuf_start_handler':
iosm_ipc_trace.c:(.text+0x31): undefined reference to `relay_buf_full'
x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_ctrl_file_write':
iosm_ipc_trace.c:(.text+0xd5): undefined reference to `relay_flush'
x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_port_rx':

Fixes: 00ef32565b9b ("net: wwan: iosm: device trace collection using relayfs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Link: https://lore.kernel.org/r/20211204174033.950528-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fix recent csum changes
Eric Dumazet [Sat, 4 Dec 2021 04:53:56 +0000 (20:53 -0800)]
net: fix recent csum changes

Vladimir reported csum issues after my recent change in skb_postpull_rcsum().

The issue is the following:

initial skb->csum is the csum of

[part to be pulled][rest of packet]

Old code:
 skb->csum = csum_sub(skb->csum, csum_partial(pull, pull_length, 0));

New code:
 skb->csum = ~csum_partial(pull, pull_length, ~skb->csum);

This is broken if the csum of [pulled part]
happens to be equal to skb->csum, because the end
result of skb->csum is 0 in the new code, instead
of being 0xffffffff.

David Laight suggested using

skb->csum = -csum_partial(pull, pull_length, -skb->csum);

I based my patches on existing code present in include/net/seg6.h,
update_csum_diff4() and update_csum_diff16() which might need
a similar fix.

I guess that my tests, mostly pulling 40 bytes of IPv6 header
were not providing enough entropy to hit this bug.

v2: added wsum_negate() to make sparse happy.

Fixes: 29c3002644bd ("net: optimize skb_postpull_rcsum()")
Fixes: 0bd28476f636 ("gro: optimize skb_gro_postpull_rcsum()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: David Lebrun <dlebrun@google.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211204045356.3659278-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
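
The three variants discussed above can be compared with a small userspace model. This is a sketch under stated assumptions: csum_add32() and csum_partial_sketch() are simplified stand-ins for the kernel's csum_add() and csum_partial() (the latter is architecture-specific in the kernel), and the pull_*() names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit ones'-complement addition with end-around carry. */
static uint32_t csum_add32(uint32_t a, uint32_t b)
{
	uint32_t res = a + b;

	return res + (res < b);
}

/* Simplified stand-in for csum_partial(): add n 16-bit words to sum. */
static uint32_t csum_partial_sketch(const uint16_t *words, size_t n,
				    uint32_t sum)
{
	uint32_t part = 0;
	size_t i;

	for (i = 0; i < n; i++)
		part = csum_add32(part, words[i]);
	return csum_add32(sum, part);
}

/* Old code: csum_sub(csum, csum_partial(pull, len, 0)). */
static uint32_t pull_old(uint32_t csum, const uint16_t *w, size_t n)
{
	return csum_add32(csum, ~csum_partial_sketch(w, n, 0));
}

/* Broken new code: ~csum_partial(pull, len, ~csum). */
static uint32_t pull_broken(uint32_t csum, const uint16_t *w, size_t n)
{
	return ~csum_partial_sketch(w, n, ~csum);
}

/* Suggested fix: -csum_partial(pull, len, -csum). */
static uint32_t pull_fixed(uint32_t csum, const uint16_t *w, size_t n)
{
	return 0u - csum_partial_sketch(w, n, 0u - csum);
}
```

With the pulled words {0x1234, 0x5678}, whose sum is 0x68ac, and an initial csum of 0x68ac, this model yields 0xffffffff for the old and fixed variants and 0 for the broken one. For an initial csum of 0x10000 all three agree on 0x9754, which is consistent with the bug needing a specific collision to show up.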
2 years agoMerge branch 'net-add-preliminary-netdev-refcount-tracking'
Jakub Kicinski [Tue, 7 Dec 2021 00:06:03 +0000 (16:06 -0800)]
Merge branch 'net-add-preliminary-netdev-refcount-tracking'

Eric Dumazet says:

====================
net: add preliminary netdev refcount tracking

The first two patches add generic infrastructure that will be used
to track refcount increments/decrements.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

The third patch adds dev_hold_track() and dev_put_track() helpers
(CONFIG_NET_DEV_REFCNT_TRACKER)

Then a series of 20 patches converts some dev_hold()/dev_put()
pairs to the new helpers: dev_hold_track() and dev_put_track().

Hopefully this will be used by developers and syzbot to
root cause bugs that cause netdevice dismantle freezes.

With CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
some class of bugs, but too late (when too many dev_put()
were happening).

Another series will be sent after this one is merged.

v3: moved NET_DEV_REFCNT_TRACKER to net/Kconfig.debug
    added "depends on DEBUG_KERNEL && STACKTRACE_SUPPORT"
    to hopefully get rid of kbuild reports for ARCH=nios2
    Reworded patch 3 changelog.
    Added missing htmldocs (Jakub)

v2: added four additional patches,
    added netdev_tracker_alloc() and netdev_tracker_free()
    addressed build error (kernel bots),
    use GFP_ATOMIC in test_ref_tracker_timer_func()
====================

Link: https://lore.kernel.org/r/20211205042217.982127-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
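
The pairing idea from the cover letter can be modelled in a few lines of userspace C. This is only a sketch: all names are hypothetical, and an `owner` string stands in for the stack trace that the real cookie stores.

```c
#include <stdlib.h>

/* Cookie shared by one increment and its matching decrement. */
struct cookie_sketch {
	const char *owner;		/* stands in for the saved stack trace */
	struct cookie_sketch *next;
};

struct netdev_sketch {
	int refcnt;
	struct cookie_sketch *holds;	/* outstanding, unpaired increments */
};

/* Take a reference and return the cookie that must be used to drop it. */
static struct cookie_sketch *hold_sketch(struct netdev_sketch *dev,
					 const char *owner)
{
	struct cookie_sketch *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->owner = owner;
	c->next = dev->holds;
	dev->holds = c;
	dev->refcnt++;
	return c;
}

/* Drop a reference; fails with -1 if the cookie is not held on dev. */
static int put_sketch(struct netdev_sketch *dev, struct cookie_sketch *c)
{
	struct cookie_sketch **pp;

	for (pp = &dev->holds; *pp; pp = &(*pp)->next) {
		if (*pp == c) {
			*pp = c->next;
			free(c);
			dev->refcnt--;
			return 0;
		}
	}
	return -1;
}

/* Number of increments that have not been paired with a decrement. */
static int leaked_holds(const struct netdev_sketch *dev)
{
	const struct cookie_sketch *c;
	int n = 0;

	for (c = dev->holds; c; c = c->next)
		n++;
	return n;
}
```

Because every decrement has to name its cookie, an extra put is rejected immediately, and whatever remains on the list at dismantle time identifies the owners that leaked a reference.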
2 years agonetpoll: add net device refcount tracker to struct netpoll
Eric Dumazet [Sun, 5 Dec 2021 04:22:17 +0000 (20:22 -0800)]
netpoll: add net device refcount tracker to struct netpoll

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipmr, ip6mr: add net device refcount tracker to struct vif_device
Eric Dumazet [Sun, 5 Dec 2021 04:22:16 +0000 (20:22 -0800)]
ipmr, ip6mr: add net device refcount tracker to struct vif_device

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: failover: add net device refcount tracker
Eric Dumazet [Sun, 5 Dec 2021 04:22:15 +0000 (20:22 -0800)]
net: failover: add net device refcount tracker

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: linkwatch: add net device refcount tracker
Eric Dumazet [Sun, 5 Dec 2021 04:22:14 +0000 (20:22 -0800)]
net: linkwatch: add net device refcount tracker

Add a netdevice_tracker inside struct net_device, to track
the self reference when a device is in lweventlist.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/sched: add net device refcount tracker to struct Qdisc
Eric Dumazet [Sun, 5 Dec 2021 04:22:13 +0000 (20:22 -0800)]
net/sched: add net device refcount tracker to struct Qdisc

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv4: add net device refcount tracker to struct in_device
Eric Dumazet [Sun, 5 Dec 2021 04:22:12 +0000 (20:22 -0800)]
ipv4: add net device refcount tracker to struct in_device

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: add net device refcount tracker to struct inet6_dev
Eric Dumazet [Sun, 5 Dec 2021 04:22:11 +0000 (20:22 -0800)]
ipv6: add net device refcount tracker to struct inet6_dev

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add net device refcount tracker to struct netdev_adjacent
Eric Dumazet [Sun, 5 Dec 2021 04:22:10 +0000 (20:22 -0800)]
net: add net device refcount tracker to struct netdev_adjacent

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add net device refcount tracker to struct neigh_parms
Eric Dumazet [Sun, 5 Dec 2021 04:22:09 +0000 (20:22 -0800)]
net: add net device refcount tracker to struct neigh_parms

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add net device refcount tracker to struct pneigh_entry
Eric Dumazet [Sun, 5 Dec 2021 04:22:08 +0000 (20:22 -0800)]
net: add net device refcount tracker to struct pneigh_entry

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add net device refcount tracker to struct neighbour
Eric Dumazet [Sun, 5 Dec 2021 04:22:07 +0000 (20:22 -0800)]
net: add net device refcount tracker to struct neighbour

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: add net device refcount tracker to struct ip6_tnl
Eric Dumazet [Sun, 5 Dec 2021 04:22:06 +0000 (20:22 -0800)]
ipv6: add net device refcount tracker to struct ip6_tnl

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosit: add net device refcount tracking to ip_tunnel
Eric Dumazet [Sun, 5 Dec 2021 04:22:05 +0000 (20:22 -0800)]
sit: add net device refcount tracking to ip_tunnel

Note that other ip_tunnel users do not seem to hold a reference
on tunnel->dev. This probably needs some investigation.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: add net device refcount tracker to rt6_probe_deferred()
Eric Dumazet [Sun, 5 Dec 2021 04:22:04 +0000 (20:22 -0800)]
ipv6: add net device refcount tracker to rt6_probe_deferred()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dst: add net device refcount tracking to dst_entry
Eric Dumazet [Sun, 5 Dec 2021 04:22:03 +0000 (20:22 -0800)]
net: dst: add net device refcount tracking to dst_entry

We want to track all dev_hold()/dev_put() to ease leak hunting.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodrop_monitor: add net device refcount tracker
Eric Dumazet [Sun, 5 Dec 2021 04:22:02 +0000 (20:22 -0800)]
drop_monitor: add net device refcount tracker

We want to track all dev_hold()/dev_put() to ease leak hunting.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago net: add net device refcount tracker to dev_ifsioc()
Eric Dumazet [Sun, 5 Dec 2021 04:22:01 +0000 (20:22 -0800)]
net: add net device refcount tracker to dev_ifsioc()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago net: add net device refcount tracker to ethtool_phys_id()
Eric Dumazet [Sun, 5 Dec 2021 04:22:00 +0000 (20:22 -0800)]
net: add net device refcount tracker to ethtool_phys_id()

This helper might hold a netdev reference for a long time,
so let's add reference tracking.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago net: add net device refcount tracker to struct netdev_queue
Eric Dumazet [Sun, 5 Dec 2021 04:21:59 +0000 (20:21 -0800)]
net: add net device refcount tracker to struct netdev_queue

This will help debug pesky netdev reference leaks.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago net: add net device refcount tracker to struct netdev_rx_queue
Eric Dumazet [Sun, 5 Dec 2021 04:21:58 +0000 (20:21 -0800)]
net: add net device refcount tracker to struct netdev_rx_queue

This helps debug net device refcount leaks.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago net: add net device refcount tracker infrastructure
Eric Dumazet [Sun, 5 Dec 2021 04:21:57 +0000 (20:21 -0800)]
net: add net device refcount tracker infrastructure

Net devices are refcounted. Over the years we have had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

This patch adds dev_hold_track() and dev_put_track().

To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.

netdevice_tracker dev_tracker;
...
dev_hold_track(dev, &dev_tracker, GFP_ATOMIC);
...
dev_put_track(dev, &dev_tracker);

Whenever a leak happens, we will get, at the device dismantle phase,
precise stack traces of the points where dev_hold_track() was called.

We will also get a stack trace if too many dev_put_track() calls are
attempted for the same netdevice_tracker.

This is guarded by the CONFIG_NET_DEV_REFCNT_TRACKER option.
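
The owner-embeds-a-tracker pattern described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every sketch_* name is hypothetical, the SKETCH_REFCNT_TRACKER macro merely stands in for the Kconfig option, and a single "held" flag replaces the stored stack traces. Refusing an unbalanced put is a simplification chosen for this sketch, not a statement about what the kernel helpers do.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Kconfig option; build with -DSKETCH_REFCNT_TRACKER=0 to disable. */
#ifndef SKETCH_REFCNT_TRACKER
#define SKETCH_REFCNT_TRACKER 1
#endif

/* Hypothetical userspace model of a refcounted device. */
struct sketch_netdev {
	int refcnt;	/* plain reference count */
	int tracked;	/* live trackers; only maintained when tracking is on */
};

/* One tracker per owner of a reference. */
typedef struct {
	int held;	/* 1 while the owner's reference is outstanding */
} sketch_netdev_tracker;

/* Take a reference and, when tracking is on, mark the tracker as holding it. */
static void sketch_hold_track(struct sketch_netdev *dev,
			      sketch_netdev_tracker *t)
{
	assert(dev && t);
	dev->refcnt++;
#if SKETCH_REFCNT_TRACKER
	t->held = 1;
	dev->tracked++;
#endif
}

/* Returns 0 on success, -1 when the put has no matching hold. */
static int sketch_put_track(struct sketch_netdev *dev,
			    sketch_netdev_tracker *t)
{
	assert(dev && t);
#if SKETCH_REFCNT_TRACKER
	if (!t->held)
		return -1;	/* unbalanced put: report it, leave refcnt alone */
	t->held = 0;
	dev->tracked--;
#endif
	dev->refcnt--;
	return 0;
}

/* A data structure owning a reference embeds its own tracker. */
struct sketch_queue {
	struct sketch_netdev *dev;
	sketch_netdev_tracker dev_tracker;
};

static void sketch_queue_init(struct sketch_queue *q, struct sketch_netdev *dev)
{
	q->dev = dev;
	q->dev_tracker.held = 0;
	sketch_hold_track(dev, &q->dev_tracker);
}

static int sketch_queue_release(struct sketch_queue *q)
{
	return sketch_put_track(q->dev, &q->dev_tracker);
}
```

With tracking compiled out, the two helpers reduce to a bare increment and decrement, which is the point of guarding the feature behind a config option.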

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago lib: add tests for reference tracker
Eric Dumazet [Sun, 5 Dec 2021 04:21:56 +0000 (20:21 -0800)]
lib: add tests for reference tracker

This module uses the reference tracker and deliberately triggers two issues:

1) Double free of a tracker.

2) Leak of two trackers, one of them allocated from softirq context.

"modprobe test_ref_tracker" would emit the following traces.
(Use scripts/decode_stacktrace.sh if necessary)

[  171.648681] reference already released.
[  171.653213] allocated in:
[  171.656523]  alloctest_ref_tracker_alloc2+0x1c/0x20 [test_ref_tracker]
[  171.656526]  init_module+0x86/0x1000 [test_ref_tracker]
[  171.656528]  do_one_initcall+0x9c/0x220
[  171.656532]  do_init_module+0x60/0x240
[  171.656536]  load_module+0x32b5/0x3610
[  171.656538]  __do_sys_init_module+0x148/0x1a0
[  171.656540]  __x64_sys_init_module+0x1d/0x20
[  171.656542]  do_syscall_64+0x4a/0xb0
[  171.656546]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.656549] freed in:
[  171.659520]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659522]  init_module+0xec/0x1000 [test_ref_tracker]
[  171.659523]  do_one_initcall+0x9c/0x220
[  171.659525]  do_init_module+0x60/0x240
[  171.659527]  load_module+0x32b5/0x3610
[  171.659529]  __do_sys_init_module+0x148/0x1a0
[  171.659532]  __x64_sys_init_module+0x1d/0x20
[  171.659534]  do_syscall_64+0x4a/0xb0
[  171.659536]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659575] ------------[ cut here ]------------
[  171.659576] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:112 ref_tracker_free+0x224/0x270
[  171.659581] Modules linked in: test_ref_tracker(+)
[  171.659591] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S                5.16.0-smp-DEV #290
[  171.659595] RIP: 0010:ref_tracker_free+0x224/0x270
[  171.659599] Code: 5e 41 5f 5d c3 48 c7 c7 04 9c 74 a6 31 c0 e8 62 ee 67 00 83 7b 14 00 75 1a 83 7b 18 00 75 30 4c 89 ff 4c 89 f6 e8 9c 00 69 00 <0f> 0b bb ea ff ff ff eb ae 48 c7 c7 3a 0a 77 a6 31 c0 e8 34 ee 67
[  171.659601] RSP: 0018:ffff89058ba0bbd0 EFLAGS: 00010286
[  171.659603] RAX: 0000000000000029 RBX: ffff890586b19780 RCX: 08895bff57c7d100
[  171.659604] RDX: c0000000ffff7fff RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.659606] RBP: ffff89058ba0bc88 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.659607] R10: 00000000ffff7fff R11: 0000000000000000 R12: 000000008f000000
[  171.659608] R13: 0000000000000014 R14: 0000000000000282 R15: ffffffffc0407000
[  171.659609] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.659611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.659613] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.659614] Call Trace:
[  171.659615]  <TASK>
[  171.659631]  ? alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659633]  ? init_module+0x105/0x1000 [test_ref_tracker]
[  171.659636]  ? do_one_initcall+0x9c/0x220
[  171.659638]  ? do_init_module+0x60/0x240
[  171.659641]  ? load_module+0x32b5/0x3610
[  171.659644]  ? __do_sys_init_module+0x148/0x1a0
[  171.659646]  ? __x64_sys_init_module+0x1d/0x20
[  171.659649]  ? do_syscall_64+0x4a/0xb0
[  171.659652]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659656]  ? 0xffffffffc040a000
[  171.659658]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659660]  init_module+0x105/0x1000 [test_ref_tracker]
[  171.659663]  do_one_initcall+0x9c/0x220
[  171.659666]  do_init_module+0x60/0x240
[  171.659669]  load_module+0x32b5/0x3610
[  171.659672]  __do_sys_init_module+0x148/0x1a0
[  171.659676]  __x64_sys_init_module+0x1d/0x20
[  171.659678]  do_syscall_64+0x4a/0xb0
[  171.659694]  ? exc_page_fault+0x6e/0x140
[  171.659696]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659698] RIP: 0033:0x7f97ea3dbe7a
[  171.659700] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.659701] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.659703] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.659704] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.659705] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.659706] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.659707] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.659709]  </TASK>
[  171.659710] ---[ end trace f5dbd6afa41e60a9 ]---
[  171.659712] leaked reference.
[  171.663393]  alloctest_ref_tracker_alloc0+0x1c/0x20 [test_ref_tracker]
[  171.663395]  test_ref_tracker_timer_func+0x9/0x20 [test_ref_tracker]
[  171.663397]  call_timer_fn+0x31/0x140
[  171.663401]  expire_timers+0x46/0x110
[  171.663403]  __run_timers+0x16f/0x1b0
[  171.663404]  run_timer_softirq+0x1d/0x40
[  171.663406]  __do_softirq+0x148/0x2d3
[  171.663408] leaked reference.
[  171.667101]  alloctest_ref_tracker_alloc1+0x1c/0x20 [test_ref_tracker]
[  171.667103]  init_module+0x81/0x1000 [test_ref_tracker]
[  171.667104]  do_one_initcall+0x9c/0x220
[  171.667106]  do_init_module+0x60/0x240
[  171.667108]  load_module+0x32b5/0x3610
[  171.667111]  __do_sys_init_module+0x148/0x1a0
[  171.667113]  __x64_sys_init_module+0x1d/0x20
[  171.667115]  do_syscall_64+0x4a/0xb0
[  171.667117]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667131] ------------[ cut here ]------------
[  171.667132] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:30 ref_tracker_dir_exit+0x104/0x130
[  171.667136] Modules linked in: test_ref_tracker(+)
[  171.667144] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S      W         5.16.0-smp-DEV #290
[  171.667147] RIP: 0010:ref_tracker_dir_exit+0x104/0x130
[  171.667150] Code: 01 00 00 00 00 ad de 48 89 03 4c 89 63 08 48 89 df e8 20 a0 d5 ff 4c 89 f3 4d 39 ee 75 a8 4c 89 ff 48 8b 75 d0 e8 7c 05 69 00 <0f> 0b eb 0c 4c 89 ff 48 8b 75 d0 e8 6c 05 69 00 41 8b 47 08 83 f8
[  171.667151] RSP: 0018:ffff89058ba0bc68 EFLAGS: 00010286
[  171.667154] RAX: 08895bff57c7d100 RBX: ffffffffc0407010 RCX: 000000000000003b
[  171.667156] RDX: 000000000000003c RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.667157] RBP: ffff89058ba0bc98 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.667159] R10: 00000000ffff7fff R11: 0000000000000000 R12: dead000000000122
[  171.667160] R13: ffffffffc0407010 R14: ffffffffc0407010 R15: ffffffffc0407000
[  171.667162] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.667164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.667166] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.667169] Call Trace:
[  171.667170]  <TASK>
[  171.667171]  ? 0xffffffffc040a000
[  171.667173]  init_module+0x126/0x1000 [test_ref_tracker]
[  171.667175]  do_one_initcall+0x9c/0x220
[  171.667179]  do_init_module+0x60/0x240
[  171.667182]  load_module+0x32b5/0x3610
[  171.667186]  __do_sys_init_module+0x148/0x1a0
[  171.667189]  __x64_sys_init_module+0x1d/0x20
[  171.667192]  do_syscall_64+0x4a/0xb0
[  171.667194]  ? exc_page_fault+0x6e/0x140
[  171.667196]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667199] RIP: 0033:0x7f97ea3dbe7a
[  171.667200] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.667201] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.667203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.667204] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.667205] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.667206] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.667207] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.667209]  </TASK>
[  171.667210] ---[ end trace f5dbd6afa41e60aa ]---
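
The "reference already released." report at the top of the trace requires that a released tracker stays inspectable, so that both its allocation and its first release can still be printed. The userspace sketch below illustrates that idea only. The demo_* names are hypothetical, plain strings stand in for stack traces, and keeping the object alive until an explicit destroy call is an assumption of this sketch rather than a description of lib/ref_tracker.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tracker record; strings stand in for saved stack traces. */
struct demo_ref {
	const char *alloc_site;	/* where the reference was taken */
	const char *free_site;	/* where it was first released */
	int dead;		/* set once the reference has been released */
};

static struct demo_ref *demo_ref_alloc(const char *site)
{
	struct demo_ref *r = calloc(1, sizeof(*r));

	if (r)
		r->alloc_site = site;
	return r;
}

/*
 * Release a reference. The record is marked dead but kept in memory,
 * so a second release can report both the allocation and the first free.
 */
static int demo_ref_free(struct demo_ref *r, const char *site)
{
	assert(r != NULL);
	if (r->dead) {
		fprintf(stderr, "reference already released.\n");
		fprintf(stderr, "allocated in: %s\n", r->alloc_site);
		fprintf(stderr, "freed in: %s\n", r->free_site);
		return -1;
	}
	r->dead = 1;
	r->free_site = site;
	return 0;
}

/* Final teardown of the record itself, once diagnostics are no longer needed. */
static void demo_ref_destroy(struct demo_ref *r)
{
	free(r);
}
```

Keeping dead records around costs memory, which is one reason such checking is normally a debug-only option.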

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago lib: add reference counting tracking infrastructure
Eric Dumazet [Sun, 5 Dec 2021 04:21:55 +0000 (20:21 -0800)]
lib: add reference counting tracking infrastructure

It can be hard to track where references are taken and released.

In networking, we have annoying issues at device or netns dismantle,
and we have had various proposals to make root-causing them easier.

This patch adds new infrastructure pairing refcount increases
and decreases. This will make code self-documenting, because programmers
will have to associate increments with decrements.

This is controlled by CONFIG_REF_TRACKER, which can be selected
by users of this feature.

This adds both CPU and memory costs, and thus should probably be
used with care.
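
The pairing idea can be shown with a minimal userspace sketch. It is not the kernel implementation: all reftrack_* names are hypothetical, a string stands in for the recorded stack trace, and a singly linked list replaces whatever bookkeeping the real infrastructure uses. Each increment hands back a cookie, each decrement must present that cookie, and anything still on the list at teardown is reported as a leak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-reference record. */
struct reftrack_entry {
	struct reftrack_entry *next;
	const char *where;	/* stands in for a saved stack trace */
};

/* Hypothetical per-object directory of outstanding references. */
struct reftrack_dir {
	struct reftrack_entry *live;	/* taken but not yet released */
	int refcount;
};

/* Take a reference and record who took it; *cookie pairs it with its release. */
static int reftrack_get(struct reftrack_dir *dir,
			struct reftrack_entry **cookie, const char *where)
{
	struct reftrack_entry *e;

	assert(dir && cookie);
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->where = where;
	e->next = dir->live;
	dir->live = e;
	dir->refcount++;
	*cookie = e;
	return 0;
}

/* Release the reference identified by *cookie; fails if it is not live. */
static int reftrack_put(struct reftrack_dir *dir,
			struct reftrack_entry **cookie)
{
	struct reftrack_entry **pp;

	assert(dir && cookie);
	for (pp = &dir->live; *pp; pp = &(*pp)->next) {
		if (*pp == *cookie) {
			*pp = (*cookie)->next;	/* unlink from the live list */
			free(*cookie);
			*cookie = NULL;
			dir->refcount--;
			return 0;
		}
	}
	return -1;	/* no matching increment */
}

/* Report every reference still held and return how many there were. */
static int reftrack_report_leaks(struct reftrack_dir *dir)
{
	int leaks = 0;

	assert(dir);
	while (dir->live) {
		struct reftrack_entry *e = dir->live;

		dir->live = e->next;
		fprintf(stderr, "leaked reference taken at %s\n", e->where);
		free(e);
		leaks++;
	}
	return leaks;
}
```

The allocation and list walk on every get and put illustrate why such tracking carries CPU and memory costs.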

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago iwlwifi: mei: fix linking when tracing is not enabled
Emmanuel Grumbach [Wed, 1 Dec 2021 11:34:10 +0000 (13:34 +0200)]
iwlwifi: mei: fix linking when tracing is not enabled

I forgot to add stubs for the case where tracing is disabled, which caused linking errors:

ERROR: modpost: "__SCT__tp_func_iwlmei_sap_data" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__SCT__tp_func_iwlmei_me_msg" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__tracepoint_iwlmei_sap_cmd" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__tracepoint_iwlmei_me_msg" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__SCK__tp_func_iwlmei_me_msg" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__SCK__tp_func_iwlmei_sap_data" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__tracepoint_iwlmei_sap_data" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__SCT__tp_func_iwlmei_sap_cmd" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
ERROR: modpost: "__SCK__tp_func_iwlmei_sap_cmd" [drivers/net/wireless/intel/iwlwifi/mei/iwlmei.ko] undefined!
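
For readers unfamiliar with the pattern, the general shape of such stubs is sketched below in plain C. This is not the code from the patch: the stub_* names and the STUB_TRACING macro are hypothetical, and a counter stands in for a real tracepoint. The point is only that callers always see a function with the same signature, so nothing is left undefined at link time when the feature is compiled out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for a tracing config option. */
#ifndef STUB_TRACING
#define STUB_TRACING 0
#endif

static int stub_trace_events;	/* number of events actually recorded */

#if STUB_TRACING
/* Real variant: records the event. */
static inline void stub_trace_data(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	stub_trace_events++;
}
#else
/* Stub variant: same signature, does nothing, keeps callers building. */
static inline void stub_trace_data(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
}
#endif

/* Caller code is identical in both configurations. */
static size_t stub_send(const void *buf, size_t len)
{
	assert(buf != NULL);
	stub_trace_data(buf, len);
	return len;
}
```

Building with -DSTUB_TRACING=1 selects the recording variant without touching stub_send().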

Fixes: 2da4366f9e2c ("iwlwifi: mei: add the driver to allow cooperation with CSME")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Acked-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20211201113411.130409-1-emmanuel.grumbach@intel.com
2 years ago Merge branch 'qed-enhancements'
Jakub Kicinski [Sat, 4 Dec 2021 02:24:23 +0000 (18:24 -0800)]
Merge branch 'qed-enhancements'

Manish Chopra says:

====================
qed*: enhancements

This series adds the following enhancements to the qed/qede drivers:

patch 1: Improves tx timeout debug data logs.
patch 2: Adds ESL (Enhanced System Lockdown) priv flag cap/status support.

v2:
* Fixed cosmetic issues in both patches
* Added ESL feature description in patch #2

Please consider applying it to "net-next"
====================

Link: https://lore.kernel.org/r/20211202210157.25530-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>