review.tizen.org Git - platform/kernel/linux-starfive.git/log

sched: act: allow user to specify type of HW stats for a filter

Currently, user who is adding an action expects HW to report stats,
however it does not have exact expectations about the stats types.
That is aligned with TCA_ACT_HW_STATS_TYPE_ANY.

Allow user to specify the type of HW stats for an action and require it.

Pass the information down to flow_offload layer.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

flow_offload: introduce "disabled" HW stats type and allow it in mlxsw

Introduce new type for disabled HW stats and allow the value in
mlxsw offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_acl: Ask device for rule stats only if counter was created

Set a flag in case rule counter was created. Only query the device for
stats of a rule, which has the valid counter assigned.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

flow_offload: introduce "delayed" HW stats type and allow it in mlx5

Introduce new type for delayed HW stats and allow the value in
mlx5 offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

flow_offload: introduce "immediate" HW stats type and allow it in mlxsw

Introduce new type for immediate HW stats and allow the value in
mlxsw offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: restrict supported HW stats type to "any"

Currently don't allow actions with any other type to be inserted.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_flower: Do not allow mixing HW stats types for actions

As there is one set of counters for the whole action chain, forbid to
mix the HW stats types.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

flow_offload: check for basic action hw stats type

Introduce flow_action_basic_hw_stats_types_check() helper and use it
in drivers. That sanitizes the drivers which do not have support
for action HW stats types.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ocelot_flower: use flow_offload_has_one_action() helper

Instead of directly checking number of action entries, use
flow_offload_has_one_action() helper.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

flow_offload: Introduce offload of HW stats type

Initially, pass "ANY" (struct is zeroed) to the drivers as that is the
current implicit value coming down to flow_offload. Add a bool
indicating that entries have mixed HW stats type.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ethtool-consolidate-irq-coalescing-other-drivers'

Jakub Kicinski says:

====================
ethtool: consolidate irq coalescing - other drivers

Convert more drivers following the groundwork laid in a recent
patch set [1]. The aim of the effort is to consolidate irq
coalescing parameter validation in the core.

This set converts all the drivers outside of drivers/net/ethernet.

Only vmxnet3 them was checking unsupported parameters.

The aim is to merge this via the net-next tree so we can
convert all drivers and make the checking mandatory.

[1] https://lore.kernel.org/netdev/20200305051542.991898-1-kuba@kernel.org/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

wil6210: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: qlge: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

vmxnet3: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
As a side effect of these changes the error code for
unsupported params changes from EINVAL to EOPNOTSUPP.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDMA/ipoib: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

um: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: Add ipq806x mdio bindings

Add documentations for ipq806x mdio driver.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdio: add ipq8064 mdio driver

Currently ipq806x soc use generic bitbang driver to
comunicate with the gmac ethernet interface.
Add a dedicated driver created by chunkeey to fix this.

Co-developed-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tun-debug'

Michal Kubecek says:

====================
tun: debug messages cleanup

While testing ethtool output for "strange" devices, I noticed confusing and
obviously incorrect message level information for a tun device and sent
a quick fix. The result of the upstream discussion was that tun driver
would rather deserve a more complex cleanup of the way it handles debug
messages.

The main problem is that all debugging statements and setting of message
level are controlled by TUN_DEBUG macro which is only defined if one edits
the source and rebuilds the module, otherwise all DBG1() and tun_debug()
statements do nothing.

This series drops the TUN_DEBUG switch and replaces custom tun_debug()
macro with standard netif_info() so that message level (mask) set and
displayed using ethtool works as expected. Some debugging messages are
dropped as they only notify about entering a function which can be done
easily using ftrace or kprobe.

Patch 1 is a trivial fix for compilation warning with W=1.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tun: drop TUN_DEBUG and tun_debug()

TUN_DEBUG and tun_debug() are no longer used anywhere, drop them.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: replace tun_debug() by netif_info()

The tun driver uses custom macro tun_debug() which is only available if
TUN_DEBUG is set. Replace it by standard netif_ifinfo(). For that purpose,
rename tun_struct::debug to msg_enable and make it u32 and always present.
Finally, make tun_get_msglevel(), tun_set_msglevel() and TUNSETDEBUG ioctl
independent of TUN_DEBUG.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: drop useless debugging statements

Some of the tun_debug() statements only inform us about entering
a function which can be easily achieved with ftrace or kprobe. As
tun_debug() is no-op unless TUN_DEBUG is set which requires editing the
source and recompiling, setting up ftrace or kprobe is easier. Drop these
debug statements.

Also drop the tun_debug() statement informing about SIOCSIFHWADDR ioctl.
We can monitor these through rtnetlink and it makes little sense to log
address changes through ioctl but not changes through rtnetlink. Moreover,
this tun_debug() is called even if the actual address change fails which
makes it even less useful.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: get rid of DBG1() macro

This macro is no-op unless TUN_DEBUG is defined (which requires editing and
recompiling the source) and only does something if variable debug is 2 but
that variable is zero initialized and never set to anything else. Moreover,
the only use of the macro informs about entering function tun_chr_open()
which can be easily achieved using ftrace or kprobe.

Drop DBG1() macro, its only use and global variable debug.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: fix misleading comment format

The comment above tun_flow_save_rps_rxhash() starts with "/**" which
makes it look like kerneldoc comment and results in warnings when
building with W=1. Fix the format to make it look like a normal comment.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

tc-testing: updated tdc tests for basic filter with canid extended match rules

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tc-testing: list kernel options for basic filter with canid ematch.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'PCI-Implement-function-to-read-Device-Serial-Number'

Jacob Keller says:

====================
PCI: Implement function to read Device Serial Number

Several drivers read the Device Serial Number from the PCIe extended
configuration space. Each of these drivers implements a similar approach to
finding the position and then extracting the 8 bytes of data.

Implement a new helper function, pci_get_dsn, which can be used to extract
this data into an 8 byte array.

Modify the bnxt_en, qedf, ice, ixgbe and nfp drivers to use this new
function.

The intent for this is to reduce duplicate code across the various drivers,
and make it easier to write future code that wants to read the DSN. In
particular the ice driver will be using the DSN as its serial number when
implementing the DEVLINK_CMD_INFO_GET.

The new implementation in v2 significantly simplifies some of the callers
which just want to print the value out in MSB order. By returning things as
a u64 in CPU Endian order, the "%016llX" printf format specifier can be used
to correctly format the value.

Per patch changes since v1
  PCI: Introduce pci_get_dsn
  * Update commit message based on feedback from Bjorn Helgaas
  * Modify the function to return a u64 (zero on no capability)
  * This new implementation ensures that the first dword is the lower 32
    bits and the second dword is the upper 32 bits.

  bnxt_en: Use pci_get_dsn()
  * Use the u64 return value from pci_get_dsn()
  * Copy it into the dsn[] array by using put_unaligned_le64
  * Fix a pre-existing typo in the netdev_info error message

  scsi: qedf: Use pci_get_dsn()
  * Use the u64 return value from pci_get_dsn()
  * simplify the snprintf to use "%016llX"
  * remove the unused 'i' variable

  ice: Use pci_get_dsn()
  * Use the u64 return value from pci_get_dsn()
  * simplify the snprintf to use "%016llX"

  ixgbe: Use pci_get_dsn()
  * Use the u64 return value from pci_get_dsn()
  * simplify the snprintf to use "%016llX"

  nfp: Use pci_get_dsn()
  * Added in v2
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: Use pci_get_dsn()

Use the newly added pci_get_dsn() function for obtaining the 64-bit
Device Serial Number in the nfp6000_read_serial and
nfp_6000_get_interface functions.

pci_get_dsn() reports the Device Serial number as a u64 value created by
combining two pci_read_config_dword functions. The lower 16 bits
represent the device interface value, and the next 48 bits represent the
serial value. Use put_unaligned_be32 and put_unaligned_be16 to convert
the serial value portion into a Big Endian formatted serial u8 array.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Use pci_get_dsn()

Replace the open-coded implementation for reading the PCIe DSN with
pci_get_dsn().

The original code used a simple for-loop to read the bytes in order into
a buffer one byte at a time.

The pci_get_dsn() function returns the DSN as a u64, correctly ordering
the upper and lower 32 bit dwords. Simplify the display code by using
%016llX to display the u64 DSN.

This should have equivalent behavior on both Little and Big Endian
systems. The bus will have correctly ordered the dwords in the CPU
endian format, while pci_get_dsn() will correctly order the lower and
higher dwords into a u64.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ice: Use pci_get_dsn()

Replace the open-coded implementation for reading the PCIe DSN with
pci_get_dsn().

The pci_get_dsn() function will perform two pci_read_config_dword calls
to read the lower and upper config dwords. It bitwise ORs them into
a u64 value. Instead of using put_unaligned_le32 to convert the value to
LE32 format, just use the %016llX printf specifier. This will print the
u64 correct, putting the most significant byte of the value first. Since
pci_get_dsn() correctly orders the two dwords into a u64, this should
produce equivalent results in less code.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

scsi: qedf: Use pci_get_dsn()

Replace the open-coded implementation for reading the PCIe DSN with
pci_get_dsn().

The original code used a for-loop that looped over each of the 8 bytes
and copied them into a temporary buffer. pci_get_dsn() uses two calls to
pci_read_config_dword, and correctly bitwise ORs them into a u64. Thus,
we can simplify the snprintf significantly using %016llX on a u64 value.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Use pci_get_dsn()

Replace the open-coded implementation for reading the PCIe DSN with
pci_get_dsn().

Use of put_unaligned_le64 should be correct. pci_get_dsn() will perform
two pci_read_config_dword calls. The first dword will be placed in the
first 32 bits of the u64, while the second dword will be placed in the
upper 32 bits of the u64.

On Little Endian systems, the least significant byte comes first, which
will be the least significant byte of the first dword, followed by the
least significant byte of the second dword. Since the _le32 variations
do not perform byte swapping, we will correctly copy the dwords into the
dsn[] array in the same order as before.

On Big Endian systems, the most significant byte of the second dword
will come first. put_unaligned_le64 will perform a CPU_TO_LE64, which
will swap things correctly before copying. This should also end up with
the correct bytes in the dsn[] array.

While at it, fix a small typo in the netdev_info error message when the
DSN cannot be read.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

PCI: Introduce pci_get_dsn

Several device drivers read their Device Serial Number from the PCIe
extended config space.

Introduce a new helper function, pci_get_dsn(). This function reads the
eight bytes of the DSN and returns them as a u64. If the capability does not
exist for the device, the function returns 0.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmveth: Remove unused page_offset macro

We already have a function called page_offset(), and this macro
is unused, so just delete it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptp: add VMware virtual PTP clock driver

Add a PTP clock driver called ptp_vmw, for guests running on VMware ESXi
hypervisor. The driver attaches to a VMware virtual device called
"precision clock" that provides a mechanism for querying host system time.
Similar to existing virtual PTP clock drivers (e.g. ptp_kvm), ptp_vmw
utilizes the kernel's PTP hardware clock API to implement a clock device
that can be used as a reference in Chrony for synchronizing guest time with
host.

The driver is only applicable to x86 guests running in VMware virtual
machines with precision clock virtual device present. It uses a VMware
specific hypercall mechanism to read time from the device.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Vivek Thampi <vithampi@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'wireless-drivers-next-2020-03-05' of git://git./linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.7

First set of patches for v5.7. Lots of mt76 patches as they missed the
v5.6 deadline and hence they were postponed to the next version.
Otherwise nothing special standing out.

mt76

Major changes:

* dual-band concurrent support for MT7615

* fixes for rx path race conditions

* coverage class support for MT7615

* beacon fixes for USB devices

* MT7615 LED support

* set_antenna support for MT7615

* tracing improvements

* preparation for supporting new USB devices

* tx power fixes

brcmfmac

* support BRCM 4364 found in MacBook Pro 15,2
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bcm63xx_enet: remove redundant variable definitions

in this function,‘ret’ is always assigned,so this's definition
'ret = 0' make no sense.

Signed-off-by: tangbin <tangbin@cmss.chinamobile.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: tulip: Replace zero-length array with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
int stuff;
struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlxsw-Offload-FIFO'

Ido Schimmel says:

====================
mlxsw: Offload FIFO

Petr says:

If an ETS or PRIO band contains an offloaded qdisc, it is possible to
obtain offloaded counters for that band. However, some of the bands will
likely simply contain the default invisible FIFO qdisc, which does not
present the counters.

To remedy this situation, make FIFO offloadable, and offload it by mlxsw
when below PRIO and ETS for the sole purpose of providing counters for the
bands that do not include other qdiscs.

- In patch #1, FIFO is extended to support offloading.
- Patches #2 and #3 restructure bits of mlxsw to facilitate
the offload logic.
- Patch #4 then implements the offload itself.
- Patch #5 changes the ETS selftest to use the new counters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: forwarding: ETS: Use Qdisc counters

Currently the SW-datapath ETS selftests use "ip link" stats to obtain the
number of packets that went through a given band. mlxsw then uses ethtool
per-priority counters.

Instead, change both to use qdiscs. In SW datapath this is the obvious
choice, and now that mlxsw offloads FIFO, this should work on the offloaded
datapath as well. This has the effect of verifying that the FIFO offload
works.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_qdisc: Support offloading of FIFO Qdisc

There are two peculiarities about offloading FIFO:

- sometimes the qdisc has an unspecified handle (it is "invisible")
- it may be created before the qdisc that it will be a child of

These features make the offload a bit more tricky. The approach chosen in
this patch is to make note of all the FIFOs that needed to be rejected
because their parents were not known. Later when the parent is created,
they are offloaded

FIFO is only offloaded for its counters, queue length is ignored.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_qdisc: Add handle parameter to ..._ops.replace

PRIO and ETS will need to check the value of qdisc handle in their
handlers. Add it to the callback and propagate through.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_qdisc: Introduce struct mlxsw_sp_qdisc_state

In order to have a tidy structure where to put information related to Qdisc
offloads, introduce a new structure. Move there the two existing pieces of
data: root_qdisc and tclass_qdiscs. Embed them directly, because there's no
reason to go through pointer anymore. Convert users, update init/fini
functions.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: Make FIFO Qdisc offloadable

Invoke ndo_setup_tc() as appropriate to signal init / replacement,
destroying and dumping of pFIFO / bFIFO Qdisc.

A lot of the FIFO logic is used for pFIFO_head_drop as well, but that's a
semantically very different Qdisc that isn't really in the same boat as
pFIFO / bFIFO. Split some of the functions to keep the Qdisc intact.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ethtool-consolidate-parameter-checking-for-irq-coalescing'

Jakub Kicinski says:

====================
ethtool: consolidate parameter checking for irq coalescing

This set aims to simplify and unify the unsupported irq
coalescing parameter handling.

First patch adds a bitmask which drivers should fill in
in their ethtool_ops structs to declare which parameters
they support. Core will then ensure that driver callback
won't see any parameter outside of that set.

This allows us to save some LoC and make sure all drivers
respond the same to unsupported parameters.

If any parameter driver does not support is set to a value
other than 0 core will return -EINVAL. In the future we can
reject any present but unsupported netlink attribute, without
assuming 0 means unset. We can also add some prints or extack,
perhaps a'la Intel's current code.

I started converting the drivers alphabetically but then
realized that for the first set it's probably best to
address a representative mix of actively developed drivers.

According to my unreliable math there are roughly 69 drivers
in the tree which support some form of interrupt coalescing
settings via ethtool. Of these roughly 17 reject parameters
they don't support.

I hope drivers which ignore the parameters don't care, and
won't care about the slight change in behavior. Once all
drivers are converted we can make the checking mandatory.

I've only tested the e1000e and virtio patches, the rest builds.

v2: fix up ice and virtio conversions
v3: (patch 1)
    - move the (temporary) check if driver defines types
      earlier (Michal)
    - rename used_types -> nonzero_params, and
      coalesce_types -> supported_coalesce_params (Alex)
    - use EOPNOTSUPP instead of EINVAL (Andrew, Michal)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
As a side effect of these changes the error code for
unsupported params changes from EINVAL to EOPNOTSUPP.

v2: correctly handle rx-frames (and adjust the commit msg)
v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

v3: adjust commit message for new member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx5: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

v3: adjust commit message for new member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

v3: adjust commit message for new member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ice: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
As a side effect of these changes the info message about
the bad parameter will no longer be printed. We also
always reject the tx_coalesce_usecs_high param, even
if the target queue pair does not have a TX queue.
Error code changes from EINVAL to EOPNOTSUPP.

v2: allow adaptive TX
v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hisilicon: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
No functional changes.

v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
As a side effect of these changes the error code for
unsupported params changes from EINVAL to EOPNOTSUPP.

v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
No functional changes.

v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
No functional changes.

v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
The error code changes from EINVAL to EOPNOTSUPP.

v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

xgbe: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver correctly rejects all unsupported parameters.
We are only losing the error print.

v3: adjust commit message for new error code and member name

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: add infrastructure for centralized checking of coalescing parameters

Linux supports 22 different interrupt coalescing parameters.
No driver implements them all. Some drivers just ignore the
ones they don't support, while others have to carry a long
list of checks to reject unsupported settings.

To simplify the drivers add the ability to specify inside
ethtool_ops which parameters are supported and let the core
reject attempts to set any other one.

This commit makes the mechanism an opt-in, only drivers which
set ethtool_opts->coalesce_types to a non-zero value will have
the checks enforced.

The same mask is used for global and per queue settings.

v3: - move the (temporary) check if driver defines types
      earlier (Michal)
    - rename used_types -> nonzero_params, and
      coalesce_types -> supported_coalesce_params (Alex)
    - use EOPNOTSUPP instead of EINVAL (Andrew, Michal)

Leaving the long series of ifs for now, it seems nice to
be able to grep for the field and flag names. This will
probably have to be revisited once netlink support lands.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hsr: fix refcnt leak of hsr slave interface

In the commit e0a4b99773d3 ("hsr: use upper/lower device infrastructure"),
dev_get() was removed but dev_put() in the error path wasn't removed.
So, if creating hsr interface command is failed, the reference counter leak
of lower interface would occur.

Test commands:
    ip link add dummy0 type dummy
    ip link add ipvlan0 link dummy0 type ipvlan mode l2
    ip link add ipvlan1 link dummy0 type ipvlan mode l2
    ip link add hsr0 type hsr slave1 ipvlan0 slave2 ipvlan1
    ip link del ipvlan0

Result:
[  633.271992][ T1280] unregister_netdevice: waiting for ipvlan0 to become free. Usage count = -1

Fixes: e0a4b99773d3 ("hsr: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'rmnet-cleanups'

Taehee Yoo says:

====================
net: rmnet: several code cleanup for rmnet module

This patchset is to cleanup rmnet module code.

1. The first patch is to add module alias
rmnet module can not be loaded automatically because there is no
alias name.

2. The second patch is to add extack error message code.
When rmnet netlink command fails, it doesn't print any error message.
So, users couldn't know the exact reason.
In order to tell the exact reason to the user, the extack error message
is used in this patch.

3. The third patch is to use GFP_KERNEL instead of GFP_ATOMIC.
In the sleepable context, GFP_KERNEL can be used.
So, in this patch, GFP_KERNEL is used instead of GFP_ATOMIC.

Change log:
- v1->v2: change error message in the second patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: rmnet: use GFP_KERNEL instead of GFP_ATOMIC

In the current code, rmnet_register_real_device() and rmnet_newlink()
are using GFP_ATOMIC.
But, these functions are allowed to sleep.
So, GFP_KERNEL can be used.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: rmnet: print error message when command fails

When rmnet netlink command fails, it doesn't print any error message.
So, users couldn't know the exact reason.
In order to tell the exact reason to the user, the extack error message
is used in this patch.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: rmnet: add missing module alias

In the current rmnet code, there is no module alias.
So, RTNL couldn't load rmnet module automatically.

Test commands:
    ip link add dummy0 type dummy
    modprobe -rv rmnet
    ip link add rmnet0 link dummy0 type rmnet  mux_id 1

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'marvell10g-tunable-and-power-saving-support'

Russell King says:

====================
marvell10g tunable and power saving support

This patch series adds support for:
- mdix configuration (auto, mdi, mdix)
- energy detect power down (edpd)
- placing in edpd mode at probe

for both the 88x3310 and 88x2110 PHYs.

Antione, could you test this for the 88x2110 PHY please?

v3: fix return code in get_tunable/set_tunable
v2: fix comments from Antione.
====================

Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell10g: place in powersave mode at probe

Place the 88x3310 into powersaving mode when probing, which saves 600mW
per PHY. For both PHYs on the Macchiatobin double-shot, this saves
about 10% of the board idle power.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell10g: add energy detect power down tunable

Add support for the energy detect power down tunable, which saves
around 600mW when the link is down. The 88x3310 supports off, rx-only
and NLP every second. Enable EDPD by default for 88x3310.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell10g: add mdix control

Add support for controlling the MDI-X state of the PHY.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'PCI-Add-and-use-constant-PCI_STATUS_ERROR_BITS-and-helper-pci_status_get_and_clear_errors'

Heiner Kallweit says:

====================
PCI: Add and use constant PCI_STATUS_ERROR_BITS and helper pci_status_get_and_clear_errors

Several drivers have own definitions for this constant, so move it
to the PCI core. In addition in multiple places the following code
sequence is used:
1. Read PCI_STATUS
2. Mask out non-error bits
3. Action based on set error bits
4. Write back set error bits to clear them

As this is a repeated pattern, add a helper to the PCI core.

Most affected drivers are network drivers. But as it's about core
PCI functionality, I suppose the series should go through the PCI
tree.

v2:
- fix formal issue with cover letter
v3:
- fix dumb typo in patch 7
v4:
- add patches 1-3
- move new constant PCI_STATUS_ERROR_BITS to include/linux/pci.h
- small improvements in commit messages
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sound: bt87x: use pci_status_get_and_clear_errors

Use new helper pci_status_get_and_clear_errors() to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

PCI: pci-bridge-emul: Use new constant PCI_STATUS_ERROR_BITS

Use new constant PCI_STATUS_ERROR_BITS to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: skfp: use new constant PCI_STATUS_ERROR_BITS

Use new PCI core constant PCI_STATUS_ERROR_BITS to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sun: use pci_status_get_and_clear_errors

Use new helper pci_status_get_and_clear_errors() to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: use pci_status_get_and_clear_errors

Use new helper pci_status_get_and_clear_errors() to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

PCI: Add pci_status_get_and_clear_errors

Several drivers use the following code sequence:
1. Read PCI_STATUS
2. Mask out non-error bits
3. Action based on error bits set
4. Write back set error bits to clear them

As this is a repeated pattern, add a helper to the PCI core.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

PCI: Add constant PCI_STATUS_ERROR_BITS

This collection of PCI error bits is used in more than one driver,
so move it to the PCI core.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: add PCI_STATUS_PARITY to PCI status error bits

In preparation of factoring out PCI_STATUS error bit handling let drivers
use the same collection of error bits. To facilitate bisecting we do this
in a separate patch per affected driver. For the r8169 driver we have to
add PCI_STATUS_PARITY to the error bits.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: skfp: add PCI_STATUS_REC_TARGET_ABORT to PCI status error bits

In preparation of factoring out PCI_STATUS error bit handling let drivers
use the same collection of error bits. To facilitate bisecting we do this
in a separate patch per affected driver. For the skfp driver we have to
add PCI_STATUS_REC_TARGET_ABORT to the error bits.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: marvell: add PCI_STATUS_SIG_TARGET_ABORT to PCI status error bits

In preparation of factoring out PCI_STATUS error bit handling let drivers
use the same collection of error bits. To facilitate bisecting we do this
in a separate patch per affected driver. For the Marvell drivers we have
to add PCI_STATUS_SIG_TARGET_ABORT to the error bits.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'Allow-unknown-unicast-traffic-to-CPU-for-Felix-DSA'

Vladimir Oltean says:

====================
Allow unknown unicast traffic to CPU for Felix DSA

This is the continuation of the previous "[PATCH net-next] net: mscc:
ocelot: Workaround to allow traffic to CPU in standalone mode":

https://www.spinics.net/lists/netdev/msg631067.html

Following the feedback received from Allan Nielsen, the Ocelot and Felix
drivers were made to use the CPU port module in the same way (patch 1),
and Felix was made to additionally allow unknown unicast frames towards
the CPU port module (patch 2).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: felix: Allow unknown unicast traffic towards the CPU port module

Compared to other DSA switches, in the Ocelot cores, the RX filtering is
a much more important concern.

Firstly, the primary use case for Ocelot is non-DSA, so there isn't any
secondary Ethernet MAC [the DSA master's one] to implicitly drop frames
having a DMAC we are not interested in.  So the switch driver itself
needs to install FDB entries towards the CPU port module (PGID_CPU) for
the MAC address of each switch port, in each VLAN installed on the port.
Every address that is not whitelisted is implicitly dropped. This is in
order to achieve a behavior similar to N standalone net devices.

Secondly, even in the secondary use case of DSA, such as illustrated by
Felix with the NPI port mode, that secondary Ethernet MAC is present,
but its RX filter is bypassed. This is because the DSA tags themselves
are placed before Ethernet, so the DMAC that the switch ports see is
not seen by the DSA master too (since it's shifter to the right).

So RX filtering is pretty important. A good RX filter won't bother the
CPU in case the switch port receives a frame that it's not interested
in, and there exists no other line of defense.

Ocelot is pretty strict when it comes to RX filtering: non-IP multicast
and broadcast traffic is allowed to go to the CPU port module, but
unknown unicast isn't. This means that traffic reception for any other
MAC addresses than the ones configured on each switch port net device
won't work. This includes use cases such as macvlan or bridging with a
non-Ocelot (so-called "foreign") interface. But this seems to be fine
for the scenarios that the Linux system embedded inside an Ocelot switch
is intended for - it is simply not interested in unknown unicast
traffic, as explained in Allan Nielsen's presentation [0].

On the other hand, the Felix DSA switch is integrated in more
general-purpose Linux systems, so it can't afford to drop that sort of
traffic in hardware, even if it will end up doing so later, in software.

Actually, unknown unicast means more for Felix than it does for Ocelot.
Felix doesn't attempt to perform the whitelisting of switch port MAC
addresses towards PGID_CPU at all, mainly because it is too complicated
to be feasible: while the MAC addresses are unique in Ocelot, by default
in DSA all ports are equal and inherited from the DSA master. This adds
into account the question of reference counting MAC addresses (delayed
ocelot_mact_forget), not to mention reference counting for the VLAN IDs
that those MAC addresses are installed in. This reference counting
should be done in the DSA core, and the fact that it wasn't needed so
far is due to the fact that the other DSA switches don't have the DSA
tag placed before Ethernet, so the DSA master is able to whitelist the
MAC addresses in hardware.

So this means that even regular traffic termination on a Felix switch
port happens through flooding (because neither Felix nor Ocelot learn
source MAC addresses from CPU-injected frames).

So far we've explained that whitelisting towards PGID_CPU:
- helps to reduce the likelihood of spamming the CPU with frames it
  won't process very far anyway
- is implemented in the ocelot driver
- is sufficient for the ocelot use cases
- is not feasible in DSA
- breaks use cases in DSA, in the current status (whitelisting enabled
  but no MAC address whitelisted)

So the proposed patch allows unknown unicast frames to be sent to the
CPU port module. This is done for the Felix DSA driver only, as Ocelot
seems to be happy without it.

[0]: https://www.youtube.com/watch?v=B1HhxEcU7Jg

Suggested-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: eliminate confusion between CPU and NPI port

Ocelot has the concept of a CPU port. The CPU port is represented in the
forwarding and the queueing system, but it is not a physical device. The
CPU port can either be accessed via register-based injection/extraction
(which is the case of Ocelot), via Frame-DMA (similar to the first one),
or "connected" to a physical Ethernet port (called NPI in the datasheet)
which is the case of the Felix DSA switch.

In Ocelot the CPU port is at index 11.
In Felix the CPU port is at index 6.

The CPU bit is treated special in the forwarding, as it is never cleared
from the forwarding port mask (once added to it). Other than that, it is
treated the same as a normal front port.

Both Felix and Ocelot should use the CPU port in the same way. This
means that Felix should not use the NPI port directly when forwarding to
the CPU, but instead use the CPU port.

This patch is fixing this such that Felix will use port 6 as its CPU
port, and just use the NPI port to carry the traffic.

Therefore, eliminate the "ocelot->cpu" variable which was holding the
index of the NPI port for Felix, and the index of the CPU port module
for Ocelot, so the variable was actually configuring different things
for different drivers and causing at least part of the confusion.

Also remove the "ocelot->num_cpu_ports" variable, which is the result of
another confusion. The 2 CPU ports mentioned in the datasheet are
because there are two frame extraction channels (register based or DMA
based). This is of no relevance to the driver at the moment, and
invisible to the analyzer module.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Suggested-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'pie-minor-improvements'

Leslie Monis says:

====================
pie: minor improvements

This patch series includes the following minor changes with
respect to the PIE/FQ-PIE qdiscs:

- Patch 1 removes some ambiguity by using the term "backlog"
           instead of "qlen" when referring to the queue length
           in bytes.
- Patch 2 removes redundant type casting on two expressions.
- Patch 3 removes the pie_vars->accu_prob_overflows variable
           without affecting the precision in calculations and
           makes the size of the pie_vars structure exactly 64
           bytes.
- Patch 4 realigns a comment affected by a change in patch 3.

Changes from v1 to v2:
- Kept 8 as the argument to prandom_bytes() instead of changing it
   to 7 as suggested by David Miller.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

pie: realign comment

Realign a comment after the change introduced by the
previous patch.

Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pie: remove pie_vars->accu_prob_overflows

The variable pie_vars->accu_prob is used as an accumulator for
probability values. Since probabilty values are scaled using the
MAX_PROB macro denoting (2^64 - 1), pie_vars->accu_prob is
likely to overflow as it is of type u64.

The variable pie_vars->accu_prob_overflows counts the number of
times the variable pie_vars->accu_prob overflows.

The MAX_PROB macro needs to be equal to at least (2^39 - 1) in
order to do precise calculations without any underflow. Thus
MAX_PROB can be reduced to (2^56 - 1) without affecting the
precision in calculations drastically. Doing so will eliminate
the need for the variable pie_vars->accu_prob_overflows as the
variable pie_vars->accu_prob will never overflow.

Removing the variable pie_vars->accu_prob_overflows also reduces
the size of the structure pie_vars to exactly 64 bytes.

Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pie: remove unnecessary type casting

In function pie_calculate_probability(), the variables alpha and
beta are of type u64. The variables qdelay, qdelay_old and
params->target are of type psched_time_t (which is also u64).
The explicit type casting done when calculating the value for
the variable delta is redundant and not required.

Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pie: use term backlog instead of qlen

Remove ambiguity by using the term backlog instead of qlen when
representing the queue length in bytes.

Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'act_ct-software-offload-of-established-flows-fixes'

Paul Blakey says:

====================
Fixes for tc act_ct software offload of established flows (diff v4->v6)

v4 of the original patchset was accidentally merged while we moved ahead
with v6 review. This two patches are the diff between v4 that was merged and
v6 that was the final revision, which was acked by the community.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: act_ct: Use pskb_network_may_pull()

To make the filler functions more generic, use network
relative skb pulling.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: act_ct: Fix ipv6 lookup of offloaded connections

When checking the protocol number tcf_ct_flow_table_lookup() handles
the flow as if it's always ipv4, while it can be ipv6.

Instead, refactor the code to fetch the tcp header, if available,
in the relevant family (ipv4/ipv6) filler function, and do the
check on the returned tcp header.

Fixes: 46475bb20f4b ("net/sched: act_ct: Software offload of established flows")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: remove unnecessary zeroing coalesce settings

Core already zeroes out the struct ethtool_coalesce structure,
drivers don't have to set every field to 0 individually.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'Wire-up-Ocelot-tc-flower-to-Felix-DSA'

Vladimir Oltean says:

====================
Wire up Ocelot tc-flower to Felix DSA

This series is a proposal on how to wire up the tc-flower callbacks into
DSA. The example taken is the Microchip Felix switch, whose core
implementation is actually located in drivers/net/ethernet/mscc/.

The proposal is largely a compromise solution. The DSA middle layer
handles just enough to get to the interesting stuff (FLOW_CLS_REPLACE,
FLOW_CLS_DESTROY, FLOW_CLS_STATS), but also thin enough to let drivers
decide what filter keys and actions they support without worrying that
the DSA middle layer will grow exponentially. I am far from being an
expert, so I am asking reviewers to please voice your opinion if you
think it can be done differently, with better results.

The bulk of the work was actually refactoring the ocelot driver enough
to allow the VCAP (Versatile Content-Aware Processor) code for vsc7514
and the vsc9959 switch cores to live together.

Flow block offloads have not been tested yet, only filters attached to a
single port. It might be as simple as replacing ocelot_ace_rule_create
with something smarter, it might be more complicated, I haven't tried
yet.

I should point out that the tc-matchall filter offload is not
implemented in the same manner in current mainline. Florian has already
went all the way down into exposing actual per-action callbacks,
starting with port mirroring. Because currently only mirred is supported
by this DSA mid layer, everything else will return -EOPNOTSUPP. So even
though ocelot supports matchall (aka port-based) policers, we don't have
a call path to call into them. Personally I think that this is not
going to scale for tc-matchall (there may be policers, traps, drops,
VLAN retagging, etc etc), and that we should consider whether further
matchall filter/action combinations should be just passed on to drivers
with no interpretation instead.
As for the existing mirroring callbacks in DSA, they can either be kept
as-is, or replaced with simple accessors to TC_CLSMATCHALL_REPLACE and
TC_CLSMATCHALL_DESTROY, just like for flower, and drivers which
currently implement the port mirroring callbacks will need to have some
extra "if" conditions now, in order for them to call their port
mirroring implementations.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: felix: Wire up the ocelot cls_flower methods

Export the cls_flower methods from the ocelot driver and hook them up to
the DSA passthrough layer.

Tables for the VCAP IS2 parameters, as well as half key packing (field
offsets and lengths) need to be defined for the VSC9959 core, as they
are different from Ocelot, mainly due to the different port count.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: Add bypass operations for the flower classifier-action filter

Due to the immense variety of classification keys and actions available
for tc-flower, as well as due to potentially very different DSA switch
capabilities, it doesn't make a lot of sense for the DSA mid layer to
even attempt to interpret these. So just pass them on to the underlying
switch driver.

DSA implements just the standard boilerplate for binding and unbinding
flow blocks to ports, since nobody wants to deal with that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: parameterize the vcap_is2 properties

Remove the definitions for the VCAP IS2 table from ocelot_ace.c, since
it is specific to VSC7514.

The VSC9959 VCAP IS2 table supports more rules (1024 instead of 64) and
has a different width for the action (89 bits instead of 99).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: remove port_pcs_init indirection for VSC7514

The Felix driver is now using its own PHYLINK instance, not calling into
ocelot_adjust_link. So the port_pcs_init function pointer is an
unnecessary indirection. Remove it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: don't rely on preprocessor for vcap key/action packing

The IGR_PORT_MASK key width is different between the 11-port VSC7514 and
the 6-port VSC9959 switches. And since IGR_PORT_MASK is one of the first
fields of a VCAP key entry, it means that all further field
offset/length pairs are shifted between the 2.

The ocelot driver performs packing of VCAP half keys with the help of
some preprocessor macros:

- A set of macros for defining the HKO (Half Key Offset) and HKL (Half
  Key Length) of each possible key field. The offset of each field is
  defined as the sum between the offset and the sum of the previous
  field.

- A set of accessors on top of vcap_key_set for shorter (aka less
  typing) access to the HKO and HKL of each key field.

Since the field offsets and lengths are different between switches,
defining them through the preprocessor isn't going to fly. So introduce
a structure holding (offset, length) pairs and instantiate it in
ocelot_board.c for VSC7514. In a future patch, a similar structure will
be instantiated in felix_vsc9959.c for NXP LS1028A.

The accessors also need to go. They are based on macro name
concatenation, which is horrible to understand and follow.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: spell out full "ocelot" name instead of "oc"

This is a cosmetic patch that makes the name of the driver private
variable be used uniformly in ocelot_ace.c as in the rest of the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: return directly in ocelot_cls_flower_{replace, destroy}

There is no need to check the "ret" variable, one can just return the
function result back to the caller.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: replace "rule" and "ocelot_rule" variable names with "ace"

The "ocelot_rule" variable name is both annoyingly long trying to
distinguish itself from struct flow_rule *rule =
flow_cls_offload_flow_rule(f), as well as actually different from the
"ace" variable name which is used all over the place in ocelot_ace.c and
is referring to the same structure.

And the "rule" variable name is, confusingly, different from f->rule,
but sometimes one has to look up to the beginning of the function to get
an understanding of what structure type is actually being handled.

So let's use the "ace" name wherever possible ("Access Control Entry").

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>