review.tizen.org Git - platform/kernel/linux-starfive.git/log

vmxnet3: Update driver to use ethtool_sprintf

So this patch actually does 3 things.

First it removes a stray white space at the start of the variable
declaration in vmxnet3_get_strings.

Second it flips the logic for the string test so that we exit immediately
if we are not looking for the stats strings. Doing this we can avoid
unnecessary indentation and line wrapping.

Then finally it updates the code to use ethtool_sprintf rather than a
memcpy and pointer increment to write the ethtool strings.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: Update driver to use ethtool_sprintf

Update the code to replace instances of snprintf and a pointer update with
just calling ethtool_sprintf.

Also replace the char pointer with a u8 pointer to avoid having to recast
the pointer type.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netvsc: Update driver to use ethtool_sprintf

Replace instances of sprintf or memcpy with a pointer update with
ethtool_sprintf.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ena: Update driver to use ethtool_sprintf

Replace instances of snprintf or memcpy with a pointer update with
ethtool_sprintf.

Acked-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hisilicon: Update drivers to use ethtool_sprintf

Update the hisilicon drivers to make use of ethtool_sprintf. The general
idea is to reduce code size and overhead by replacing the repeated pattern
of string printf statements and ETH_STRING_LEN counter increments.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: Replace nfp_pr_et with ethtool_sprintf

The nfp_pr_et function is nearly identical to ethtool_sprintf except for
the fact that it passes the pointer by value and as a return whereas
ethtool_sprintf passes it as a pointer.

Since they are so close just update nfp to make use of ethtool_sprintf

Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

intel: Update drivers to use ethtool_sprintf

Update the Intel drivers to make use of ethtool_sprintf. The general idea
is to reduce code size and overhead by replacing the repeated pattern of
string printf statements and ETH_STRING_LEN counter increments.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: Add common function for filling out strings

Add a function to handle the common pattern of printing a string into the
ethtool strings interface and incrementing the string pointer by the
ETH_GSTRING_LEN. Most of the drivers end up doing this and several have
implemented their own versions of this function so it would make sense to
consolidate on one implementation.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mlx5-updates-2021-03-16' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-03-16

mlx5 uplink representor netdev persistence.

Before this patchset we used to have separate netdevs for Native NIC mode
and Switchdev mode (uplink representor netdev), meaning that if user
switches modes between Native to Switchdev and vice versa, the driver
would cleanup the current netdev representor and create a new one for the
new mode, such behavior created an administrative nightmare for users,
where users need to be aware of such loss of both data path and control
path configurations, e.g. netdev attributes and arp/route tables,
where the later is more painful.

A simple solution for this is not to replace the netdev in first place
and use a single netdev to serve the uplink/physical port whether it is
in switchdev mode or native mode.

We already have different HW profiles for each netdev mode, in this series
we just replace the HW profile on the fly and we keep the same netdev
attached.

Refactoring: Some refactoring has been made to overcome some technical
difficulties
1) The netdev is created with the maximum amount of tx/rx queues to serve
the two profiles.

2) Some ndos are not supported in some modes, so we added a mode check for
   such cases, e.g legacy sriov ndos must be blocked in switchdev mode.

3) Some mlx5 netdev private attributes need to be moved out of profiles
   and kept in a persistent place, where the netdev is created
   e.g devlink port and other global HW resources

4) The netdev devlink port is now always registered with the switch id

Implementation: the last three patches implement the mechanism now as the
netdev can be shared.

5) Don't recreate the netdev on switchdev mode changes
6) Prevent changing switchdev mode when some netdev operations
are active, mostly when TC rules are being processed.
This is required since the netdev is kept registered while switchdev mode
can be changed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: E-Switch, Protect changing mode while adding rules

We re-use the native NIC port net device instance for the Uplink
representor, a driver currently cannot unbind TC setup callback
actively, hence protect changing E-Switch mode while adding rules.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-Switch, Change mode lock from mutex to rw semaphore

E-Switch mode change routine will take the write lock to prevent any
consumer to access the E-Switch resources while E-Switch is going
through a mode change.

In the next patch
E-Switch consumers (e.g vport representors) will take read_lock prior to
accessing E-Switch resources to prevent E-Switch mode changing in the
middle of the operation.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Do not reload ethernet ports when changing eswitch mode

When switching modes between legacy and switchdev and back, do not
reload ethernet interfaces. just change the profile from nic profile
to uplink rep profile in switchdev mode.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Unregister eth-reps devices first

When we clean all the interfaces, i.e. rescan or reload module,
we need to clean eth-reps devices first, before eth devices.

We will re-use the native NIC port net device instance for the Uplink
representor. Changing eswitch mode will skip destroying the eth device
so the net device won't be destroyed and only change the profile.

Creating uplink eth-rep will initialize the representor related resources.
In that sense when we destroy all devices we first need to destroy
eth-rep devices so uplink eth-rep will clean all representor related
resources and only then destroy the eth device which will destroy rest
of the resources and the net device.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Move devlink port from mlx5e priv to mlx5e resources

We re-use the native NIC port net device instance for the Uplink
representor, and the devlink port.
When changing profiles we reset the mlx5e priv but we should still
use the devlink port so move it to mlx5e resources.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Move mlx5e hw resources into a sub object

This is to separate between resources attributes and other
attributes we will want to use.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Register nic devlink port with switch id

We will re-use the native NIC port net device instance for the Uplink
representor. Since the netdev will be kept registered while we engage
switchdev mode also the devlink will be kept registered.
Register the nic devlink port with switch id so it will be available
when changing profiles.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Move devlink port register and unregister calls

We will re-use the native NIC port net device instance for the Uplink
representor. As such we also don't want to unregister/register the
devlink port as part of the profile.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Verify dev is present in some ndos

We will re-use the native NIC port net device instance for the Uplink
representor. While changing profiles private resources are not
available but some ndos are not checking if the netdev is present.
So for those ndos check the netdev is present in the driver before
accessing the private resources.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Use nic mode netdev ndos and ethtool ops for uplink representor

Remove dedicated uplink rep netdev ndos and ethtools ops.
We will re-use the native NIC port net device instance and ethtool ops for
the Uplink representor.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Add offload stats ndos to nic netdev ops

We will re-use the native NIC port net device instance for the Uplink
representor, hence same ndos must be used.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Distinguish nic and esw offload in tc setup block cb

We will re-use the native NIC port net device instance for the Uplink
representor, hence same ndos will be used.
Now we need to distinguish in the TC callback if the mode is legacy or
switchdev and set the proper flag.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Allow legacy vf ndos only if in legacy mode

We will re-use the native NIC port net device instance for the Uplink
representor. Several VF ndo ops are not relevant in switchdev mode.
Disallow them when eswitch mode is not legacy as a preparation.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Same max num channels for both nic and uplink profiles

In downstream patches NIC netdev can change profile dynamically from
NIC mode to uplink mode and vise-versa. It is required that both profiles
must advertise the same max amount of tx/rx queues.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>

net: Change dev parameter to const in netif_device_present()

Not all ndos check the present bit before calling the ndo and the driver
may want to check it. Sometimes the dev parameter passed as const so we
pass it to netif_device_present() as const.
Since netif_device_present() doesn't modify dev parameter anyway, declare
it as const.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

openvswitch: Warn over-mtu packets only if iface is UP.

It is not unusual to have the bridge port down. Sometimes
it has the old MTU, which is fine since it's not being used.

However, the kernel spams the log with a warning message
when a packet is going to be sent over such port. Fix that
by warning only if the interface is UP.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "net: socket: use BIT() for MSG_*"

This reverts commit 0bb3262c0248d44aea3be31076f44beb82a7b120.

Breaks things on mips64/qemu

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ocelot-mrp'

Horatiu Vultur says:

====================
net: ocelot: Extend MRP

This patch series extends the current support of MRP in Ocelot driver.
Currently the forwarding of the frames happened in SW because all frames
were trapped to CPU. With this patch the MRP frames will be forward in HW.

v1 -> v2:
- create a patch series instead of single patch
- rename ocelot_mrp_find_port to ocelot_mrp_find_partner_port
- rename PGID_MRP to PGID_BLACKHOLE
- use GFP_KERNEL instead of GFP_ATOMIC
- fix other whitespace issues
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ocelot: Remove ocelot_xfh_get_cpuq

Now when extracting frames from CPU the cpuq is not used anymore so
remove it.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ocelot: Extend MRP

This patch extends MRP support for Ocelot. It allows to have multiple
rings and when the node has the MRC role it forwards MRP Test frames in
HW. For MRM there is no change.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ocelot: Add PGID_BLACKHOLE

Add a new PGID that is used not to forward frames anywhere. It is used
by MRP to make sure that MRP Test frames will not reach CPU port.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2021-03-16

This series contains updates to i40e, ixgbe, and ice drivers.

Magnus Karlsson says:

Optimize run_xdp_zc() for the XDP program verdict being XDP_REDIRECT
in the xsk zero-copy path. This path is only used when having AF_XDP
zero-copy on and in that case most packets will be directed to user
space. This provides around 100k extra packets in throughput on my
server when running l2fwd in xdpsock.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlxsw-Add-support-for-egress-and-policy-based-sampling'

Ido Schimmel says:

====================
mlxsw: Add support for egress and policy-based sampling

So far mlxsw only supported ingress sampling using matchall classifier.
This series adds support for egress sampling and policy-based sampling
using flower classifier on Spectrum-2 and newer ASICs. As such, it is
now possible to issue these commands:

# tc filter add dev swp1 egress pref 1 proto all matchall action sample rate 100 group 1

# tc filter add dev swp2 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action sample rate 100 group 2

When performing egress sampling (using either matchall or flower) the
ASIC is able to report the end-to-end latency which is passed to the
psample module.

Series overview:

Patches #1-#3 are preparations without any functional changes

Patch #4 generalizes the idea of sampling triggers and creates a hash
table to track active sampling triggers in preparation for egress and
policy-based triggers. The motivation is explained in the changelog

Patch #5 flips mlxsw to start using this hash table instead of storing
ingress sampling triggers as an attribute of the sampled port

Patch #6 finally adds support for egress sampling using matchall
classifier

Patches #7-#8 add support for policy-based sampling using flower
classifier

Patches #9 extends the mlxsw sampling selftest to cover the new triggers

Patch #10 makes sure that egress sampling configuration only fails on
Spectrum-1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mlxsw: Test egress sampling limitation on Spectrum-1 only

Make sure egress sampling configuration only fails on Spectrum-1, given
that mlxsw now supports it on Spectrum-{2,3}.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mlxsw: Add tc sample tests for new triggers

Test that packets are sampled when tc-sample is used with matchall
egress binding and flower classifier. Verify that when performing
sampling on egress the end-to-end latency is reported as metadata.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_acl: Offload FLOW_ACTION_SAMPLE

Implement support for action sample when used with a flower classifier
by implementing the required sampler_add() / sampler_del() callbacks and
registering an Rx listener for the sampled packets.

The sampler_add() callback returns an error for Spectrum-1 as the
functionality is not supported. In Spectrum-{2,3} the callback creates a
mirroring agent towards the CPU. The agent's identifier is used by the
policy engine code to mirror towards the CPU with probability.

The Rx listener for the sampled packet is registered with the 'policy
engine' mirroring reason and passes trapped packets to the psample
module after looking up their parameters (e.g., sampling group).

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: core_acl_flex_actions: Add mirror sampler action

Add core functionality required to support mirror sampler action in the
policy engine. The switch driver (e.g., 'mlxsw_spectrum') is required to
implement the sampler_add() / sampler_del() callbacks that perform the
necessary configuration before the sampler action can be installed. The
next patch will implement it for Spectrum-{2,3}, while Spectrum-1 will
return an error, given it is not supported.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_matchall: Add support for egress sampling

Allow user space to install a matchall classifier with sample action on
egress. This is only supported on Spectrum-2 onwards, so Spectrum-1 will
continue to return an error.

Programming the hardware to sample on egress is identical to ingress
sampling with the sole change of using a different sampling trigger.

Upon receiving a sampled packet, the sampling trigger (ingress vs.
egress) will be encoded in the mirroring reason in the Completion Queue
Element (CQE). The mirroring reason is used to lookup the sampling
parameters (e.g., psample group) which are passed to the psample module.

Note that locally generated packets that are sampled are simply
consumed. This is done for several reasons.

First, such packets do not have an ingress netdev given that their Rx
local port is the CPU port. This breaks several basic assumptions.

Second, sampling using the same interface (tc), but with flower
classifier will not result in locally generated packets being sampled
given that such packets are not subject to the policy engine.

Third, realistically, this is not a big deal given that the vast
majority of the packets being transmitted through the port are not
locally generated packets.

Fourth, if such packets do need to be sampled, they can be sampled with
a 'skip_hw' filter and reported to the same sampling group as the data
path packets. The software sampling rate can also be adjusted to fit the
rate of the locally generated packets which is much lower than the rate
of the data path traffic.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Start using sampling triggers hash table

Start using the previously introduced sampling triggers hash table to
store sampling parameters instead of storing them as attributes of the
sampled port.

This makes it easier to introduce new sampling triggers.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Track sampling triggers in a hash table

Currently, mlxsw supports a single sampling trigger type (i.e., received
packet). When sampling is configured on an ingress port, the sampling
parameters (e.g., pointer to the psample group) are stored as an
attribute of the port, so that they could be passed to
psample_sample_packet() when a sampled packet is trapped to the CPU.

Subsequent patches are going to add more types of sampling triggers,
making it difficult to maintain the current scheme.

Instead, store all the active sampling triggers with their associated
parameters in a hash table. That way, more trigger types can be easily
added.

The next patch will flip mlxsw to use the hash table instead of the
current scheme.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_matchall: Pass matchall entry to sampling operations

The entry will be required by the next patches, so pass it. No
functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_matchall: Push sampling checks to per-ASIC operations

Push some sampling checks to the per-ASIC operations, as they are no
longer relevant for all ASICs.

The sampling rate validation against the MPSC maximum rate is only
relevant for Spectrum-1, as Spectrum-2 and later ASICs no longer use
MPSC register for sampling.

The ingress / egress validation is pushed down to the per-ASIC
operations since subsequent patches are going to remove it for
Spectrum-2 and later ASICs.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_matchall: Propagate extack further

Due to the differences between Spectrum-1 and later ASICs, some of the
checks currently performed at the common code (where extack is
available) will need to be pushed to the per-ASIC operations.

As a preparation, propagate extack further to maintain proper error
reporting.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dpaa2-switch-small-cleanup'

Ioana Ciornei says:

====================
dpaa2-switch: small cleanup

This patch set addresses various low-hanging issues in both dpaa2-switch
and dpaa2-eth drivers.
Unused ABI functions are removed from dpaa2-switch, all the kernel-doc
warnings are fixed up in both drivers and the coding style for the
remaining ABIs is fixed-up a bit.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-eth: fixup kdoc warnings

Running kernel-doc over the dpaa2-eth driver generates a bunch of
warnings. Fix them up by removing code comments for macros which are
self-explanatory, respecting the kdoc format for macro documentation and
other small changes like describing the expected return values of
functions.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: fit the function declaration on the same line

Multiple ABI function declarations are split unnecessarry on multiple
lines. Fix this so that we have a consistent coding style.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: reduce the size of the if_id bitmap to 64 bits

The maximum number of DPAA2 switch interfaces, including the control
interface, is 64. Even though this restriction existed from the first
place, the command structures which use an interface id bitmap were
poorly described and even though a single uint64_t is enough, all of
them used an array of 4 uint64_t's.
Fix this by reducing the size of the interface id field to a single
uint64_t.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: fix kdoc warnings

Running kernel-doc over the dpaa2-switch driver generates a bunch of
warnings. Fix them up by removing code comments for macros which are
self-explanatory and adding a bit more context for the
dpsw_if_get_port_mac_addr() function and the fields of the
dpsw_vlan_if_cfg structure.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: remove unused ABI functions

Cleanup the dpaa2-switch driver a bit by removing any unused MC firmware
ABI definitions.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: Remove useless error message

Fix the following coccicheck report:

drivers/net/ipa/gsi.c:1341:2-9:
line 1341 is redundant because platform_get_irq() already prints an error

Remove dev_err() messages after platform_get_irq_byname() failures.

Signed-off-by: Zihao Tang <tangzihao1@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'switchdev-dsa-docs'

Vladimir Oltean says:

====================
Documentation updates for switchdev and DSA

Many changes were made to the code but of course the documentation was
not kept up to date. This is an attempt to update some of the verbiage.

The documentation is still not complete, but it's time to make some more
changes to the code first, before documenting the rest.

Changes in v2:
Integrated feedback from Andrew, Florian, Tobias, Ido, George.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: switchdev: fix command for static FDB entries

The "bridge fdb add" command provided in the switchdev documentation is
junk now, not only because it is syntactically incorrect and rejected by
the iproute2 bridge program, but also because it was not updated in
light of Arkadi Sharshevsky's radical switchdev refactoring in commit
29ab586c3d83 ("net: switchdev: Remove bridge bypass support from
switchdev"). Try to explain what the intended usage pattern is with the
new kernel implementation.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: switchdev: clarify device driver behavior

This patch provides details on the expected behavior of switchdev
enabled network devices when operating in a "stand alone" mode, as well
as when being bridge members. This clarifies a number of things that
recently came up during a bug fixing session on the b53 DSA switch
driver.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: add paragraph for the HSR/PRP offload

Add a short summary of the methods that a driver writer must implement
for offloading a HSR/PRP network interface.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: add paragraph for the MRP offload

Add a short summary of the methods that a driver writer must implement
for getting an MRP instance to work on top of a DSA switch.

Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: add paragraph for the LAG offload

Add a short summary of the methods that a driver writer must implement
for offloading a link aggregation group, and what is still missing.

Cc: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: mention integration with devlink

Add a short summary of the devlink features supported by the DSA core.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: document the port_bridge_flags method

The documentation was already lagging behind by not mentioning the old
version of port_bridge_flags (port_set_egress_floods). So now we are
skipping one step and just explaining how a DSA driver should configure
address learning and flooding settings.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: remove TODO about porting more vendor drivers

On one hand, the link is dead and therefore useless.

On the other hand, there are always more drivers to port, but at this
stage, DSA does not need to affirm itself as the driver model to use for
Ethernet-connected switches (since we already have 15 tagging protocols
supported and probably more switch families from various vendors), so
there is nothing actionable to do.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: remove references to switchdev prepare/commit

After the recent series containing commit bae33f2b5afe ("net: switchdev:
remove the transaction structure from port attributes"), there aren't
prepare/commit transactional phases anymore in most of the switchdev
objects/attributes, and as a result, there aren't any in the DSA driver
API either. So remove this piece.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: remove static port count from limitations

After Vivien's series from 2019 containing commits 27d4d19d7c82 ("net:
dsa: remove limitation of switch index value") and ab8ccae122a4 ("net:
dsa: add ports list in the switch fabric"), this is basically no longer
true.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: dsa: rewrite chapter about tagging protocol

The chapter about tagging protocols is out of date because it doesn't
mention all taggers that have been added since last documentation
update. But judging based on that, it will always tend to lag behind,
and there's no good reason why we would enumerate the supported
hardware. Instead we could do something more useful and explain what
there is to know about tagging protocols instead.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: update the graphical representation

While preparing some slides for a customer presentation, I found the
existing high-level view to be a bit confusing, so I modified it a
little bit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-03-15

This series contains updates to e1000e only.

Chen Yu says:

The NIC is put in runtime suspend status when there is no cable connected.
As a result, it is safe to keep non-wakeup NIC in runtime suspended during
s2ram because the system does not rely on the NIC plug event nor WoL to
wake up the system. Besides that, unlike the s2idle, s2ram does not need to
manipulate S0ix settings during suspend.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv4: route.c: simplify procfs code

proc_creat_seq() that directly take a struct seq_operations,
and deal with network namespaces in ->open.

Signed-off-by: Yejune Deng <yejune.deng@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bridge-m,cast-cleanups'

Nikolay Aleksandrov says:

====================
net: bridge: mcast: simplify allow/block EHT code

The set does two minor cleanups of the EHT allow/block handling code:
patch 01 removes code which is unreachable (it was used in initial EHT
versions, but isn't anymore) and prepares the allow/block functions to be
factored out. Patch 02 factors out common allow/block handling code.
There are no functional changes.

v2: send patch 02 and the proper version of both patches
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: mcast: factor out common allow/block EHT handling

We hande EHT state change for ALLOW messages in INCLUDE mode and for
BLOCK messages in EXCLUDE mode similarly - create the new set entries
with the proper filter mode. We also handle EHT state change for ALLOW
messages in EXCLUDE mode and for BLOCK messages in INCLUDE mode in a
similar way - delete the common entries (current set and new set).
Factor out all the common code as follows:
- ALLOW/INCLUDE, BLOCK/EXCLUDE: call __eht_create_set_entries()
- ALLOW/EXCLUDE, BLOCK/INCLUDE: call __eht_del_common_set_entries()

The set entries creation can be reused in __eht_inc_exc() as well.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: mcast: remove unreachable EHT code

In the initial EHT versions there were common functions which handled
allow/block messages for both INCLUDE and EXCLUDE modes, but later they
were separated. It seems I've left some common code which cannot be
reached because the filter mode is checked before calling the respective
functions, i.e. the host filter is always in EXCLUDE mode when using
__eht_allow_excl() and __eht_block_excl() thus we can drop the host_excl
checks inside and simplify the code a bit.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mt7530: support MDB and bridge flag operations

Support port MDB and bridge flag operations.

As the hardware can manage multicast forwarding itself, offload_fwd_mark
can be unconditionally set to true.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bcm6368'

Álvaro Fernández Rojas says:

====================
net: mdio: Add BCM6368 MDIO mux bus controller

This controller is present on BCM6318, BCM6328, BCM6362, BCM6368 and BCM63268
SoCs.

v2: add changes suggested by Andrew Lunn and Jakub Kicinski.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdio: Add BCM6368 MDIO mux bus controller

This controller is present on BCM6318, BCM6328, BCM6362, BCM6368 and BCM63268
SoCs.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: Add bcm6368-mdio-mux bindings

Add documentations for bcm6368 mdio mux driver.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: lapbether: Prevent racing when checking whether the netif is running

There are two "netif_running" checks in this driver. One is in
"lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
sure that the LAPB APIs called in these functions are called before
"lapb_unregister" is called by the "ndo_stop" function.

However, these "netif_running" checks are unreliable, because it's
possible that immediately after "netif_running" returns true, "ndo_stop"
is called (which causes "lapb_unregister" to be called).

This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
reliably check and ensure the netif is running while doing their work.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ipa-qmi-fixes'

Alex Elder says:

====================
net: ipa: QMI fixes

Mani Sadhasivam discovered some errors in the definitions of some
QMI messages used for IPA. This series addresses those errors,
and extends the definition of one message type to include some
newly-defined fields.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: extend the INDICATION_REGISTER request

The specified format of the INDICATION_REGISTER QMI request message
has been extended to support two more optional fields:
  endpoint_desc_ind:
    sender wishes to receive endpoint descriptor information via
    an IPA ENDP_DESC indication QMI message
  bw_change_ind:
    sender wishes to receive bandwidth change information via
    an IPA BW_CHANGE indication QMI message

Add definitions that permit these fields to be formatted and parsed
by the QMI library code.

Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: fix another QMI message definition

The ipa_init_modem_driver_req_ei[] encoding array for the
INIT_MODEM_DRIVER request message has some errors in it.

First, the tlv_type associated with the hw_stats_quota_size field is
wrong; it duplicates the valiue used for the hw_stats_quota_base_addr
field (0x1f) and should use 0x20 instead. The tlv_type value for
the hw_stats_drop_size field also uses the same duplicate value; it
should use 0x22 instead.

Second, there is no definition for the hw_stats_drop_base_addr
field. It is an optional 32-bit enumerated type value.

Finally, the hw_stats_quota_base_addr, hw_stats_quota_size, and
hw_stats_drop_size fields are defined as enumerated types; they
should be unsigned 4-byte values.

Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: fix a duplicated tlv_type value

In the ipa_indication_register_req_ei[] encoding array, the tlv_type
associated with the ipa_mhi_ready_ind field is wrong. It duplicates
the value used for the data_usage_quota_reached field (0x11) and
should use value 0x12 instead. Fix this bug.

Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: fix error return code in sja1105_cls_flower_add()

The return value 'rc' maybe overwrite to 0 in the flow_action_for_each
loop, the error code from the offload not support error handling will
not set. This commit fix it to return -EOPNOTSUPP.

Fixes: 6a56e19902af ("flow_offload: reject configuration of packet-per-second policing in offload drivers")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: b53: spi: allow device tree probing

Add missing of_match_table to allow device tree probing.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdio: Alphabetically sort header inclusion

Alphabetically sort header inclusion

Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ionic-tx-updates'

Shannon Nelson says:

====================
ionic Tx updates

Just as the Rx path recently got a face lift, it is time for the Tx path to
get some attention. The original TSO-to-descriptor mapping was ugly and
convoluted and needed some deep work. This series pulls the dma mapping
out of the descriptor frag mapping loop and makes the dma mapping more
generic for use in the non-TSO case.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: aggregate Tx byte counting calls

Gather the Tx packet and byte counts and call
netdev_tx_completed_queue() only once per clean cycle.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: simplify tx clean

The descriptor mappings are set up the same way whether
or not it is a TSO, so we don't need separate logic for
the two cases.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: generic tx skb mapping

Make the new ionic_tx_map_tso() usable by the non-TSO paths,
and pull the call up a level into ionic_tx() before calling
the csum or no-csum routines.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: simplify TSO descriptor mapping

One issue with the original TSO code was that it was working too
hard to deal with skb layouts that were never going to show up,
such as an skb->data that was longer than a single descriptor's
length. The other issue was trying to arrange the fragment dma
mapping at the same time as figuring out the descriptors needed.
There was just too much going on at the same time.

Now we do the dma mapping first, which sets up the buffers with
skb->data in buf[0] and the remaining frags in buf[1..n-1].
Next we spread the bufs across the descriptors needed, where
each descriptor gets up to mss number of bytes.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-qualcomm-rmnet-stop-using-C-bit-fields'

Alex Elder says:

====================
net: qualcomm: rmnet: stop using C bit-fields

Version 6 is the same as version 5, but has been rebased on updated
net-next/master.  With any luck, the patches I'm sending out this
time won't contain garbage.

Version 5 of this series responds to a suggestion made by Alexander
Duyck, to determine the offset to the checksummed range of a packet
using skb_network_header_len() on patch 2.  I have added his
Reviewed-by tag to all (other) patches, and removed Bjorn's from
patch 2.

The change required some updates to the subsequent patches, and I
reordered some assignments in a minor way in the last patch.

I don't expect any more discussion on this series (but will respond
if there is any).  So at this point I would really appreciate it
if KS and/or Sean would offer a review, or at least acknowledge it.
I presume you two are able to independently test the code as well,
so I request that, and hope you are willing to do so.

Version 4 of this series is here:
  https://lore.kernel.org/netdev/20210315133455.1576188-1-elder@linaro.org
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum header

Replace the use of C bit-fields in the rmnet_map_ul_csum_header
structure with a single two-byte (big endian) structure member,
and use masks to encode or get values within it. The content of
these fields can be accessed using simple bitwise AND and OR
operations on the (host byte order) value of the new structure
member.

Previously rmnet_map_ipv4_ul_csum_header() would update C bit-field
values in host byte order, then forcibly fix their byte order using
a combination of byte swap operations and types.

Instead, just compute the value that needs to go into the new
structure member and save it with a simple byte-order conversion.

Make similar simplifications in rmnet_map_ipv6_ul_csum_header().

Finally, in rmnet_map_checksum_uplink_packet() a set of assignments
zeroes every field in the upload checksum header. Replace that with
a single memset() operation.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum trailer

Replace the use of C bit-fields in the rmnet_map_dl_csum_trailer
structure with a single one-byte field, using constant field masks
to encode or get at embedded values.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: use masks instead of C bit-fields

The actual layout of bits defined in C bit-fields (e.g. int foo : 3)
is implementation-defined.  Structures defined in <linux/if_rmnet.h>
address this by specifying all bit-fields twice, to cover two
possible layouts.

I think this pattern is repetitive and noisy, and I find the whole
notion of compiler "bitfield endianness" to be non-intuitive.

Stop using C bit-fields for the command/data flag and the pad length
fields in the rmnet_map structure, and define a single-byte flags
field instead.  Define a mask for the single-bit "command" flag,
and another mask for the encoded pad length.  The content of both
fields can be accessed using a simple bitwise AND operation.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: kill RMNET_MAP_GET_*() accessor macros

The following macros, defined in "rmnet_map.h", assume a socket
buffer is provided as an argument without any real indication this
is the case.
    RMNET_MAP_GET_MUX_ID()
    RMNET_MAP_GET_CD_BIT()
    RMNET_MAP_GET_PAD()
    RMNET_MAP_GET_CMD_START()
    RMNET_MAP_GET_LENGTH()
What they hide is pretty trivial accessing of fields in a structure,
and it's much clearer to see this if we do these accesses directly.

So rather than using these accessor macros, assign a local
variable of the map header pointer type to the socket buffer data
pointer, and derereference that pointer variable.

In "rmnet_map_data.c", use sizeof(object) rather than sizeof(type)
in one spot.  Also, there's no need to byte swap 0; it's all zeros
irrespective of endianness.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: simplify some byte order logic

In rmnet_map_ipv4_ul_csum_header() and rmnet_map_ipv6_ul_csum_header()
the offset within a packet at which checksumming should commence is
calculated. This calculation involves byte swapping and a forced type
conversion that makes it hard to understand.

Simplify this by computing the offset in host byte order, then
converting the result when assigning it into the header field.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rmnet: mark trailer field endianness

The fields in the checksum trailer structure used for QMAP protocol
RX packets are all big-endian format, so define them that way.

It turns out these fields are never actually used by the RMNet code.
The start offset is always assumed to be zero, and the length is
taken from the other packet headers. So making these fields
explicitly big endian has no effect on the behavior of the code.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: Remove the runtime suspend restriction on CNP+

Although there is platform issue of runtime suspend support
on CNP, it would be more flexible to let the user decide whether
to disable runtime or not because:
1. This can be done in userspace via
echo on > /sys/devices/pci0000\:00/0000\:00\:1f.d/power/control
2. More and more NICs would support runtime suspend, disabling the
runtime suspend on them by default would impact the validation.

Only disable runtime suspend on CNP in case of any user space regression.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

e1000e: Leverage direct_complete to speed up s2ram

The NIC is put in runtime suspend status when there is no cable connected.
As a result, it is safe to keep non-wakeup NIC in runtime suspended during
s2ram because the system does not rely on the NIC plug event nor WoL to wake
up the system. Besides that, unlike the s2idle, s2ram does not need to
manipulate S0ix settings during suspend.

This patch introduces the .prepare() for e1000e so that if the NIC is runtime
suspended the subsequent suspend/resume hooks will be skipped so as to speed
up the s2ram. The pm core will check whether the NIC is a wake up device so
there's no need to check it again in .prepare(). DPM_FLAG_SMART_PREPARE flag
should be set during probe to ask the pci subsystem to honor the driver's
prepare() result. Besides, the NIC remains runtime suspended after resumed
from s2ram as there is no need to resume it.

Tested on i7-2600K with 82579V NIC
Before the patch:
e1000e 0000:00:19.0: pci_pm_suspend+0x0/0x160 returned 0 after 225146 usecs
e1000e 0000:00:19.0: pci_pm_resume+0x0/0x90 returned 0 after 140588 usecs

After the patch:
echo disabled > //sys/devices/pci0000\:00/0000\:00\:19.0/power/wakeup
becomes 0 usecs because the hooks will be skipped.

Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

net: ipa: make ipa_table_hash_support() inline

In review, Alexander Duyck suggested that ipa_table_hash_support()
was trivial enough that it could be implemented as a static inline
function in the header file. But the patch had already been
accepted. Implement his suggestion.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: add Marvell 88X2222 transceiver support

Add basic support for the Marvell 88X2222 multi-speed ethernet
transceiver.

This PHY provides data transmission over fiber-optic as well as Twinax
copper links. The 88X2222 supports 2 ports of 10GBase-R and 1000Base-X
on the line-side interface. The host-side interface supports 4 ports of
10GBase-R, RXAUI, 1000Base-X and 2 ports of XAUI.

This driver, however, supports only XAUI on the host-side and
1000Base-X/10GBase-R on the line-side, for now. The SGMII is also
supported over 1000Base-X. Interrupts are not supported.

Internal registers access compliant with the Clause 45 specification.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'stmmac-clocks'

Joakim Zhang says:

====================
net: stmmac: implement clocks management

This patch set tries to implement clocks management, and takes i.MX platform as an example.

---
ChangeLogs:
V1->V2:
* change to pm runtime mechanism.
* rename function: _enable() -> _config()
* take MDIO bus into account, it needs clocks when interface
is closed.
* reverse Christmass tree.
V2->V3:
* slightly simple the code according to Andrew's suggesstion
and also add tag: Reviewed-by: Andrew Lunn <andrew@lunn.ch>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: dwmac-imx: add platform level clocks management for i.MX

Split clocks settings from init callback into clks_config callback,
which could support platform level clocks management.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: add platform level clocks management

This patch intends to add platform level clocks management. Some
platforms may have their own special clocks, they also need to be
managed dynamically. If you want to manage such clocks, please implement
clks_config callback.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: add clocks management for gmac driver

This patch intends to add clocks management for stmmac driver:

If CONFIG_PM enabled:
1. Keep clocks disabled after driver probed.
2. Enable clocks when up the net device, and disable clocks when down
the net device.

If CONFIG_PM disabled:
Keep clocks always enabled after driver probed.

Note:
1. It is fine for ethtool, since the way of implementing ethtool_ops::begin
in stmmac is only can be accessed when interface is enabled, so the clocks
are ticked.
2. The MDIO bus has a different life cycle to the MAC, need ensure
clocks are enabled when _mdio_read/write() need clocks, because these
functions can be called while the interface it not opened.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-pcs-stmmac=add-C37-AN-SGMII-support'

Ong Boon Leong says:

====================
net: pcs, stmmac: add C37 AN SGMII support

This patch series adds MAC-side SGMII support to stmmac driver and it is
changed as follow:-

1/6: Refactor the current C73 implementation in pcs-xpcs to prepare for
     adding C37 AN later.
2/6: Add MAC-side SGMII C37 AN support to pcs-xpcs
3,4/6: make phylink_parse_mode() to work for non-DT platform so that
       we can use stmmac platform_data to set it.
5/6: Make stmmac_open() to only skip PHY init if C73 is used, otherwise
     C37 AN will need phydev to be connected to phylink.
6/6: Finally, add pcs-xpcs SGMII interface support to Intel mGbE
     controller.

The patch series have been tested on EHL CRB PCH TSN (eth2) controller
that has Marvell 88E1512 PHY attached over SGMII interface and the
iterative tests of speed change (AN) + ping test have been successful.

[63446.009295] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63449.986365] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 1Gbps/Full - flow control off
[63449.987625] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[63451.248064] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63454.082366] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 100Mbps/Full - flow control off
[63454.083650] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[63456.465179] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63459.202367] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 10Mbps/Full - flow control off
[63459.203639] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[63460.882832] intel-eth-pci 0000:00:1e.4 eth2: Link is Down
[63464.322366] intel-eth-pci 0000:00:1e.4 eth2: Link is Up - 1Gbps/Full - flow control off
====================

Signed-off-by: David S. Miller <davem@davemloft.net>