platform/kernel/linux-starfive.git
5 years agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Saeed Mahameed [Tue, 2 Apr 2019 22:43:45 +0000 (15:43 -0700)]
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

This merge commit includes some misc shared code updates from mlx5-next branch needed
for net-next.

1) From Maxim, Remove unused macros and spinlock from mlx5 code.

2) From Aya, Expose Management PCIE info register layout and add rate limit
print macros.

3) From Tariq, Compilation warning fix in fs_core.c

4) From Vu, Huy and Saeed, Improve mlx5 initialization flow:
The goal is to provide a better logical separation of mlx5 core
device initialization flow and will help to seamlessly support
creating different mlx5 device types such as PF, VF and SF
mlx5 sub-function virtual devices.

Mlx5_core driver needs to separate HCA resources from pci resources.
Its initialize/load/unload will be broken into stages:
1. Initialize common data structures
2. Setup function which initializes pci resources (for PF/VF)
   or some other specific resources for virtual device
3. Initialize software objects according to hardware capabilities
4. Load all mlx5_core components

It is also necessary to detach mlx5_core mdev name/message from pci
device mdev->pdev name/message for a clearer report/debug of
different mlx5 device types.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet: sched: don't set tunnel for decap action
Vlad Buslov [Mon, 1 Apr 2019 11:16:59 +0000 (14:16 +0300)]
net: sched: don't set tunnel for decap action

Action tunnel_key doesn't have a metadata/tunnel for release(decap) action.
Drivers do not dereference entry->tunnel pointer for that action type, so
this behavior doesn't result in a crash at the moment. However, this needs
to be corrected as a preparation for updating hardware offloads API to not
rely on rtnl lock, for which flow_action code will copy the tunnel data to
temporary buffer to prevent concurrent action overwrite from
invalidating/freeing it.

Fixes: 3a7b68617de7 ("cls_api: add translator to flow_action representation")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-phy-improve-genphy_c45_read_lpa'
David S. Miller [Tue, 2 Apr 2019 20:16:17 +0000 (13:16 -0700)]
Merge branch 'net-phy-improve-genphy_c45_read_lpa'

Heiner Kallweit says:

====================
net: phy: improve genphy_c45_read_lpa

This series improves genphy_c45_read_lpa:
- Use clause 45 standard register / bit to detect link partner autoneg
  capability.
- Consider that lpa register values may be invalid if "autoneg complete"
  bit isn't set.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: deal properly with autoneg incomplete in genphy_c45_read_lpa
Heiner Kallweit [Sun, 31 Mar 2019 17:54:07 +0000 (19:54 +0200)]
net: phy: deal properly with autoneg incomplete in genphy_c45_read_lpa

The link partner advertisement registers are not guaranteed to contain
valid values if autoneg is incomplete. Therefore, if
MDIO_AN_STAT1_COMPLETE isn't set, let's clear all link partner
capability bits. This also avoids unnecessary register reads if link
is down and phylib is in polling mode.
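
The early exit described above can be sketched in user-space C. This is
only an illustration: struct lp_state, read_lpa_reg() and
read_lpa_sketch() are hypothetical names, and the bit value is assumed
to match MDIO_AN_STAT1_COMPLETE in the kernel's mdio.h.

```c
#include <stdint.h>

/* Bit value assumed to match MDIO_AN_STAT1_COMPLETE in the kernel's mdio.h. */
#define AN_STAT1_COMPLETE 0x0020u

/* Hypothetical stand-in for the PHY state touched by the real function. */
struct lp_state {
    uint32_t lp_advertising; /* link partner capability bits */
    unsigned lpa_reads;      /* number of LPA register reads performed */
};

/* Hypothetical register accessor: counts reads and returns a canned value. */
static uint16_t read_lpa_reg(struct lp_state *st, uint16_t canned)
{
    st->lpa_reads++;
    return canned;
}

/*
 * If autoneg has not completed, the LPA registers may hold stale data,
 * so clear the link partner bits and skip the register read entirely.
 */
static void read_lpa_sketch(struct lp_state *st, uint16_t an_stat1,
                            uint16_t lpa_value)
{
    if (!(an_stat1 & AN_STAT1_COMPLETE)) {
        st->lp_advertising = 0;
        return;
    }
    st->lp_advertising = read_lpa_reg(st, lpa_value);
}
```

With the complete bit clear, the sketch performs no LPA read at all,
which is the saving the message mentions for polling mode.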

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: use c45 standard to detect link partner autoneg capability
Heiner Kallweit [Sun, 31 Mar 2019 17:52:51 +0000 (19:52 +0200)]
net: phy: use c45 standard to detect link partner autoneg capability

Currently mii_lpa_mod_linkmode_lpa_t() checks bit LPA_LPACK to detect
whether link partner supports autoneg. This doesn't work correctly at
least on Aquantia AQCS109 when it negotiates 1000Base-T2 mode.
The "link partner is autoneg-capable" bit as defined by clause 45 is
set, however. So let's switch in general to the clause 45 standard
for link partner autoneg detection.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'genphy_read_abilities'
David S. Miller [Tue, 2 Apr 2019 20:09:56 +0000 (13:09 -0700)]
Merge branch 'genphy_read_abilities'

Heiner Kallweit says:

====================
net: phy: add and use new function genphy_read_abilities

Similar to genphy_c45_pma_read_abilities() add a function to dynamically
detect the abilities of a Clause 22 PHY. This is mainly copied from
genphy_config_init(). Main benefit is that PHY drivers no longer have
to specify whether they support GBit or not (provided they keep to
the C22 standard).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: realtek: use genphy_read_abilities
Heiner Kallweit [Wed, 27 Mar 2019 21:00:32 +0000 (22:00 +0100)]
net: phy: realtek: use genphy_read_abilities

Use new function genphy_read_abilities(). This allows removing all
calls to genphy_config_init().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: use genphy_read_abilities in genphy driver
Heiner Kallweit [Wed, 27 Mar 2019 20:59:33 +0000 (21:59 +0100)]
net: phy: use genphy_read_abilities in genphy driver

Currently the genphy driver populates phydev->supported like this:
First all possible feature bits are set, then genphy_config_init()
reads the available features from the chip and removes all unsupported
features from phydev->supported. This can be simplified by using
genphy_read_abilities().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: add genphy_read_abilities
Heiner Kallweit [Wed, 27 Mar 2019 20:58:44 +0000 (21:58 +0100)]
net: phy: add genphy_read_abilities

Similar to genphy_c45_pma_read_abilities() add a function to dynamically
detect the abilities of a Clause 22 PHY. This is mainly copied from
genphy_config_init().
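
As a rough illustration of what dynamic Clause 22 ability detection
involves, the sketch below decodes the basic status and extended status
register values into a feature mask. The register bit values are assumed
to match the kernel's mii.h; the ABLE_* flags and decode_abilities() are
hypothetical and only stand in for the kernel's link mode bitmap.

```c
#include <stdint.h>

/* Clause 22 register bits; values assumed to match the kernel's mii.h. */
#define BMSR_ANEGCAPABLE   0x0008u
#define BMSR_ESTATEN       0x0100u
#define BMSR_10HALF        0x0800u
#define BMSR_10FULL        0x1000u
#define BMSR_100HALF       0x2000u
#define BMSR_100FULL       0x4000u
#define ESTATUS_1000_THALF 0x1000u
#define ESTATUS_1000_TFULL 0x2000u

/* Hypothetical ability flags, standing in for the kernel's link modes. */
enum {
    ABLE_AUTONEG   = 1u << 0,
    ABLE_10_HALF   = 1u << 1,
    ABLE_10_FULL   = 1u << 2,
    ABLE_100_HALF  = 1u << 3,
    ABLE_100_FULL  = 1u << 4,
    ABLE_1000_HALF = 1u << 5,
    ABLE_1000_FULL = 1u << 6,
};

/* Build the supported mask purely from what the chip reports. */
static uint32_t decode_abilities(uint16_t bmsr, uint16_t estatus)
{
    uint32_t able = 0;

    if (bmsr & BMSR_ANEGCAPABLE) able |= ABLE_AUTONEG;
    if (bmsr & BMSR_10HALF)      able |= ABLE_10_HALF;
    if (bmsr & BMSR_10FULL)      able |= ABLE_10_FULL;
    if (bmsr & BMSR_100HALF)     able |= ABLE_100_HALF;
    if (bmsr & BMSR_100FULL)     able |= ABLE_100_FULL;

    /* The extended status register is only valid if BMSR says so. */
    if (bmsr & BMSR_ESTATEN) {
        if (estatus & ESTATUS_1000_THALF) able |= ABLE_1000_HALF;
        if (estatus & ESTATUS_1000_TFULL) able |= ABLE_1000_FULL;
    }
    return able;
}
```

Because the gigabit bits come from the extended status register, a
driver using this approach would not need to declare GBit support
itself, provided the PHY follows the C22 standard.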

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlx5: Fix false compilation warning
Tariq Toukan [Fri, 29 Mar 2019 22:38:04 +0000 (15:38 -0700)]
net/mlx5: Fix false compilation warning

Fix the following warning:
drivers/net/ethernet/mellanox/mlx5/core//fs_core.c:845:5:
warning: 'err' may be used uninitialized in this function
[-Wmaybe-uninitialized]

No real issue here. This is only a false compiler warning.
The 'err' variable is guaranteed to be initialized by the time it is used.

gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Expose MPEIN (Management PCIE INfo) register layout
Aya Levin [Fri, 29 Mar 2019 22:38:03 +0000 (15:38 -0700)]
net/mlx5: Expose MPEIN (Management PCIE INfo) register layout

Expose PRM layout for handling MPEIN (Management PCIE Info). It will be
used in the downstream patch for querying MPEIN via the driver.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Add rate limit print macros
Aya Levin [Fri, 29 Mar 2019 22:38:02 +0000 (15:38 -0700)]
net/mlx5: Add rate limit print macros

Add rate-limited print macros for the warning and info levels. This protects
the system from bursts of prints that deplete HW resources and spam dmesg.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Add explicit bar address field
Huy Nguyen [Fri, 29 Mar 2019 22:38:01 +0000 (15:38 -0700)]
net/mlx5: Add explicit bar address field

Add bar_addr field to store bar-0 address to avoid calling
pci_resource_start with hard-coded bar-0 as parameter.
Also note that different mlx5 device types will have bar_addr
on different bars.

This patch does not change any functionality.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Replace dev_err/warn/info by mlx5_core_err/warn/info
Huy Nguyen [Fri, 29 Mar 2019 22:38:00 +0000 (15:38 -0700)]
net/mlx5: Replace dev_err/warn/info by mlx5_core_err/warn/info

Replace pci dev_err/warn/info messages with mlx5_core_err/warn/info
messages to provide a better report/debug of different mlx5 device types.

This patch does not change any functionality.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Use dev->priv.name instead of dev_name
Huy Nguyen [Fri, 29 Mar 2019 22:37:59 +0000 (15:37 -0700)]
net/mlx5: Use dev->priv.name instead of dev_name

Use mlx5_core mdev private name in message instead of using pci dev_name
to provide a better report/debug of different mlx5 device types.

This patch does not change any functionality.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Make mlx5_core messages independent from mdev->pdev
Huy Nguyen [Fri, 29 Mar 2019 22:37:58 +0000 (15:37 -0700)]
net/mlx5: Make mlx5_core messages independent from mdev->pdev

Detach mlx5_core mdev messages from pci device mdev->pdev messages and
provide a better report/debug of different mlx5 device types.

This patch does not change any functionality.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
5 years agonet/mlx5: Break load_one into three stages
Saeed Mahameed [Fri, 29 Mar 2019 22:37:57 +0000 (15:37 -0700)]
net/mlx5: Break load_one into three stages

Using foundation from previous patches to factor mlx5_load_one flow
into three stages:
1. mlx5_function_setup() from previous patch to setup function
2. mlx5_init_once() from previous patch to init software objects
according to hw caps
3. New mlx5_load() to load mlx5 components

This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.

This patch does not change any functionality.
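
The three-stage flow can be sketched as below, with each stage undone in
reverse order when a later stage fails. This is a user-space illustration
only: struct dev_sketch, the log characters and the teardown steps are
hypothetical and do not reflect the driver's actual code.

```c
/* Hypothetical device: records which stages ran in a short log. */
struct dev_sketch {
    char log[16];
    int  len;
    int  fail_at; /* stage number that should fail, 0 = never */
};

static void note(struct dev_sketch *d, char c)
{
    if (d->len < 15)
        d->log[d->len++] = c;
    d->log[d->len] = '\0';
}

/* Run one stage; returns -1 without logging if it is set to fail. */
static int stage(struct dev_sketch *d, int n, char c)
{
    if (d->fail_at == n)
        return -1;
    note(d, c);
    return 0;
}

static int load_one_sketch(struct dev_sketch *d)
{
    int err;

    err = stage(d, 1, 'S'); /* 1. function setup */
    if (err)
        return err;
    err = stage(d, 2, 'I'); /* 2. init software objects once */
    if (err)
        goto err_function;
    err = stage(d, 3, 'L'); /* 3. load components */
    if (err)
        goto err_once;
    return 0;

err_once:
    note(d, 'i');           /* undo stage 2 */
err_function:
    note(d, 's');           /* undo stage 1 */
    return err;
}
```

Keeping the stages separate means a different device type could swap
only the first stage while reusing the other two.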

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Function setup/teardown procedures
Saeed Mahameed [Fri, 29 Mar 2019 22:37:56 +0000 (15:37 -0700)]
net/mlx5: Function setup/teardown procedures

Function setup and teardown procedures are the basic procedures that
each mlx5 pci function should perform to boot up a mlx5 device function
and initialize basic communication with FW, before allocating any higher
level software/firmware resources.

This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Move health and page alloc init to mdev_init
Saeed Mahameed [Fri, 29 Mar 2019 22:37:55 +0000 (15:37 -0700)]
net/mlx5: Move health and page alloc init to mdev_init

Software structure initialization should be in mdev_init stage.

This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Split mdev init and pci init
Saeed Mahameed [Fri, 29 Mar 2019 22:37:54 +0000 (15:37 -0700)]
net/mlx5: Split mdev init and pci init

Separate resources initialization from pci initialization.

This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Remove redundant init functions parameter
Saeed Mahameed [Fri, 29 Mar 2019 22:37:53 +0000 (15:37 -0700)]
net/mlx5: Remove redundant init functions parameter

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Remove spinlock support from mlx5_write64
Maxim Mikityanskiy [Fri, 29 Mar 2019 22:37:52 +0000 (15:37 -0700)]
net/mlx5: Remove spinlock support from mlx5_write64

As there is no user of mlx5_write64 that passes a spinlock to
mlx5_write64, remove this functionality and simplify the function.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Remove unused MLX5_*_DOORBELL_LOCK macros
Maxim Mikityanskiy [Fri, 29 Mar 2019 22:37:51 +0000 (15:37 -0700)]
net/mlx5: Remove unused MLX5_*_DOORBELL_LOCK macros

MLX5_*_DOORBELL_LOCK macros provided a way to avoid locking for
mlx5_write64 on 64-bit platforms where it's not necessary. Currently all
calls to mlx5_write64 don't use a spinlock, so the macros became unused.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoopenvswitch: use after free in __ovs_ct_free_action()
Dan Carpenter [Tue, 2 Apr 2019 06:53:14 +0000 (09:53 +0300)]
openvswitch: use after free in __ovs_ct_free_action()

We free "ct_info->ct" and then use it on the next line when we pass it
to nf_ct_destroy_timeout().  This patch swaps the order to avoid the use
after free.
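
The ordering issue can be illustrated with a small user-space sketch.
The names here (struct conn_sketch, destroy_timeout(), put_conn(),
free_action_sketch()) are hypothetical stand-ins, not the openvswitch
or netfilter functions.

```c
#include <stdlib.h>

/* Hypothetical object with an attached timeout policy. */
struct conn_sketch {
    int timeout_attached;
};

static int timeouts_destroyed;

/* Reads and writes the object, so it must run while the object is alive. */
static void destroy_timeout(struct conn_sketch *ct)
{
    if (ct->timeout_attached) {
        ct->timeout_attached = 0;
        timeouts_destroyed++;
    }
}

/* Releases the object; the pointer is invalid after this returns. */
static void put_conn(struct conn_sketch *ct)
{
    free(ct);
}

static void free_action_sketch(struct conn_sketch *ct)
{
    if (!ct)
        return;
    destroy_timeout(ct); /* last use of ct */
    put_conn(ct);        /* free only after every use is done */
}
```

Calling put_conn() first and destroy_timeout() second would be the
use-after-free pattern the patch removes.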

Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotg3: allow ethtool -p to work for NICs in down state
Jon Maxwell [Tue, 2 Apr 2019 05:07:56 +0000 (16:07 +1100)]
tg3: allow ethtool -p to work for NICs in down state

Make tg3 behave like other drivers and let "ethtool -p" identify the
NIC even when it's in the DOWN state. Before this patch it would get an
error as follows if the NIC was down:

# ip link set down dev em4
# ethtool -p em4
Cannot identify NIC: Resource temporarily unavailable

With this patch ethtool identify works regardless of whether the NIC is up
or down as it does for other drivers.

Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomacsec: add noinline tag to avoid a frame size warning
Florian Westphal [Mon, 1 Apr 2019 20:59:09 +0000 (22:59 +0200)]
macsec: add noinline tag to avoid a frame size warning

seen with debug config:
drivers/net/macsec.c: In function 'dump_secy':
drivers/net/macsec.c:2597: warning: the frame size of 2216 bytes is larger
than 2048 bytes [-Wframe-larger-than=]

just mark it with noinline_for_stack, this is netlink dump code.

v2: use 'static noinline_for_stack int' consistently

Cc: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'xmit_more-softnet_data'
David S. Miller [Tue, 2 Apr 2019 01:35:02 +0000 (18:35 -0700)]
Merge branch 'xmit_more-softnet_data'

Florian Westphal says:

====================
net: move skb->xmit_more to percpu softnet data

Eric Dumazet mentioned we could place xmit_more hint in same
spot as device xmit recursion counter, instead of using
an sk_buff flag bit.

This series places xmit_recursion counter and xmit_more hint
in softnet data, filling a hole.

After this, skb->xmit_more is always zero.  Drivers are converted
to use "netdev_xmit_more()" helper instead.

Last patch removes the skb->xmit_more flag.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodrivers: net: aurora: use netdev_xmit_more helper
Florian Westphal [Mon, 1 Apr 2019 14:42:17 +0000 (16:42 +0200)]
drivers: net: aurora: use netdev_xmit_more helper

This is the last driver using always-0 skb->xmit_more.
Switch it to netdev_xmit_more and remove the now unused xmit_more flag
from sk_buff.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodrivers: net: sfc: use netdev_xmit_more helper
Florian Westphal [Mon, 1 Apr 2019 14:42:16 +0000 (16:42 +0200)]
drivers: net: sfc: use netdev_xmit_more helper

skb->xmit_more hint is now always 0, this switches the sfc driver to
use the netdev_xmit_more helper instead.

Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodrivers: mellanox: use netdev_xmit_more() helper
Florian Westphal [Mon, 1 Apr 2019 14:42:15 +0000 (16:42 +0200)]
drivers: mellanox: use netdev_xmit_more() helper

skb->xmit_more hint is now always 0. This switches the mellanox drivers
to the netdev_xmit_more() helper.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Ilya Lesokhin <ilyal@mellanox.com>
Cc: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: move skb->xmit_more hint to softnet data
Florian Westphal [Mon, 1 Apr 2019 14:42:14 +0000 (16:42 +0200)]
net: move skb->xmit_more hint to softnet data

There are two reasons for this.

First, the xmit_more flag conceptually doesn't fit into the skb, as
xmit_more is not a property related to the skb.
It's only a hint to the driver that the stack is about to transmit another
packet immediately.

Second, it was only done this way to not have to pass another argument
to ndo_start_xmit().

We can place xmit_more in the softnet data, next to the device recursion.
The recursion counter is already written to on each transmit. The "more"
indicator is placed right next to it.

Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more
to check the "more packets coming" hint.

skb->xmit_more is retained (but always 0) to not cause build breakage.

This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/
conversions.  Remaining drivers are converted in the next patches.
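
A minimal sketch of the idea, assuming a single static structure in place
of the kernel's per-CPU softnet data. All names ending in _sketch are
hypothetical; only the concept of a helper replacing the skb flag comes
from the patch.

```c
#include <stdbool.h>

/* Stand-in for the per-CPU transmit state: counter and hint side by side. */
struct xmit_state_sketch {
    unsigned short recursion; /* transmit recursion depth */
    bool more;                /* "more packets coming" hint */
};

static struct xmit_state_sketch xmit_state;
static int doorbells; /* how often the fake driver kicked the hardware */

/* Helper the driver calls instead of reading a flag from the packet. */
static bool xmit_more_sketch(void)
{
    return xmit_state.more;
}

/* Fake driver: queue the packet, ring the doorbell only on the last one. */
static void driver_xmit_sketch(int pkt)
{
    (void)pkt;
    if (!xmit_more_sketch())
        doorbells++;
}

/* Fake stack: set the hint before handing each packet to the driver. */
static void stack_xmit_burst_sketch(int n)
{
    for (int i = 0; i < n; i++) {
        xmit_state.more = (i + 1 < n);
        driver_xmit_sketch(i);
    }
    xmit_state.more = false;
}
```

The hint lives with the transmit path state rather than with each
packet, which is why the packet structure no longer needs the flag.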

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: place xmit recursion in softnet data
Florian Westphal [Mon, 1 Apr 2019 14:42:13 +0000 (16:42 +0200)]
net: place xmit recursion in softnet data

This fills a hole in softnet data, so no change in structure size.

Also prepares for xmit_more placement in the same spot;
skb->xmit_more will be removed in followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: add SGMII statistics
Heiner Kallweit [Sun, 31 Mar 2019 15:42:24 +0000 (17:42 +0200)]
net: phy: aquantia: add SGMII statistics

The AQR107 family has SGMII statistics counters. Let's expose them to
ethtool. To interpret the counters correctly one has to be aware that
rx on SGMII side is tx on ethernet side. The counters are populated
by the chip in 100Mbps/1Gbps mode only.

v2:
- add constant AQR107_SGMII_STAT_SZ
- add struct aqr107_priv to be prepared for more private data fields
- let aqr107_get_stat() return U64_MAX in case of an error

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: use rcu_dereference_protected to fetch sk_dst_cache in sk_destruct
Xin Long [Sun, 31 Mar 2019 09:03:02 +0000 (17:03 +0800)]
net: use rcu_dereference_protected to fetch sk_dst_cache in sk_destruct

As Eric noticed, in .sk_destruct, sk->sk_dst_cache update is prevented, and
no barrier is needed for this. So change to use rcu_dereference_protected()
instead of rcu_dereference_check() to fetch sk_dst_cache in there.

v1->v2:
  - no change, repost after net-next is open.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: improve genphy_read_status
Heiner Kallweit [Sat, 30 Mar 2019 09:22:45 +0000 (10:22 +0100)]
net: phy: improve genphy_read_status

This patch improves a few aspects of genphy_read_status():

- Don't initialize lpagb, it's not needed.

- Move initializing phydev->speed et al before the if clause.

- In auto-neg case, skip populating lp_advertising if we
  don't have a link. This avoids quite some unnecessary
  MDIO reads in case of phylib polling mode.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'nfp-flower-improvement-and-SFF-module-EEPROM'
David S. Miller [Tue, 2 Apr 2019 01:05:13 +0000 (18:05 -0700)]
Merge branch 'nfp-flower-improvement-and-SFF-module-EEPROM'

Jakub Kicinski says:

====================
nfp: flower improvement and SFF module EEPROM

The first patch in this series from Pieter improves the
handling of mangle actions in TC flower offload.  These
used to be sent down to the driver in groups, but after
Pablo N's patches they are split out causing suboptimal
expression.

The remaining two patches from Dirk add support for reading
SFF module EEPROM data.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: implement ethtool get module EEPROM
Dirk van der Merwe [Sat, 30 Mar 2019 02:24:43 +0000 (19:24 -0700)]
nfp: implement ethtool get module EEPROM

Now that the NSP provides the ability to read from the SFF modules'
EEPROM, we can use this interface to implement the ethtool callback.

If the NSP only provides partial data, we log the event from within
the driver but pass a success code to ethtool to prevent it from
discarding the partial data.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: nsp: implement read SFF module EEPROM
Dirk van der Merwe [Sat, 30 Mar 2019 02:24:42 +0000 (19:24 -0700)]
nfp: nsp: implement read SFF module EEPROM

The NSP now provides the ability to read from the SFF module EEPROM.
Note that even if an error occurs, the NSP may still provide some of the
data.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: flower: reduce action list size by coalescing mangle actions
Pieter Jansen van Vuuren [Sat, 30 Mar 2019 02:24:41 +0000 (19:24 -0700)]
nfp: flower: reduce action list size by coalescing mangle actions

With the introduction of flow_action_for_each, pedit actions are no
longer grouped together; instead, pedit actions are broken out per
32-bit word. This results in an inefficient use of the action list
that is pushed to hardware, where each 32-bit word becomes its own
action. Therefore we combine groups of 32-bit words before sending
the action list to hardware.
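
The coalescing step can be sketched as follows: consecutive per-word
edits that target the same header are folded into one set action. The
types, the MAX_WORDS limit and coalesce_sketch() are hypothetical and
are not the nfp driver's data structures.

```c
#include <stddef.h>
#include <stdint.h>

enum hdr_sketch { HDR_ETH, HDR_IP4, HDR_TCP };

/* One per-word edit as delivered to the driver (hypothetical layout). */
struct mangle_sketch {
    enum hdr_sketch hdr;
    uint32_t offset;
    uint32_t mask;
    uint32_t val;
};

#define MAX_WORDS 8 /* assumed capacity of one combined action */

/* One combined action covering several words of the same header. */
struct set_action_sketch {
    enum hdr_sketch hdr;
    size_t nwords;
    struct mangle_sketch words[MAX_WORDS];
};

/* Fold runs of same-header edits; returns the number of actions emitted. */
static size_t coalesce_sketch(const struct mangle_sketch *in, size_t n,
                              struct set_action_sketch *out, size_t out_cap)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        struct set_action_sketch *cur = used ? &out[used - 1] : NULL;

        /* Start a new action on a header change or when the current is full. */
        if (!cur || cur->hdr != in[i].hdr || cur->nwords == MAX_WORDS) {
            if (used == out_cap)
                break;
            cur = &out[used++];
            cur->hdr = in[i].hdr;
            cur->nwords = 0;
        }
        cur->words[cur->nwords++] = in[i];
    }
    return used;
}
```

Five edits across three headers would then produce three actions
instead of five.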

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: add suspend / resume callbacks for AQR107 family
Heiner Kallweit [Fri, 29 Mar 2019 20:09:27 +0000 (21:09 +0100)]
net: phy: aquantia: add suspend / resume callbacks for AQR107 family

Add suspend / resume callbacks for AQR107 family. Suspend powers down
the complete chip except MDIO and internal CPU.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: ti: davinci_mdio: switch to readl/writel()
Grygorii Strashko [Fri, 29 Mar 2019 15:58:34 +0000 (17:58 +0200)]
net: ethernet: ti: davinci_mdio: switch to readl/writel()

Switch to the readl/writel() APIs, because this is the recommended
API and the MDIO block is reused on Keystone 2 SoCs
where LE/BE modes are supported.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'cxgb3-undefined-behaviour-and-use-struct_size'
David S. Miller [Mon, 1 Apr 2019 22:01:46 +0000 (15:01 -0700)]
Merge branch 'cxgb3-undefined-behaviour-and-use-struct_size'

Gustavo A. R. Silva says:

====================
cxgb3/l2t: Fix undefined behaviour and use struct_size() helper

This patchset aims to fix an undefined behaviour when using a zero-sized
array and to add the use of the struct_size() helper in kvzalloc().

You might consider the first patch in this series for stable.

More details in the commit logs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb3/l2t: Use struct_size() in kvzalloc()
Gustavo A. R. Silva [Fri, 29 Mar 2019 15:28:41 +0000 (10:28 -0500)]
cxgb3/l2t: Use struct_size() in kvzalloc()

One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    struct boo entry[];
};

size = sizeof(struct foo) + count * sizeof(struct boo);
instance = kvzalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL);

Notice that, in this case, variable size is not necessary, hence
it is removed.

This code was detected with the help of Coccinelle.
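
For readers outside the kernel, the same calculation can be sketched in
user-space C. foo_size() and foo_alloc() are hypothetical helpers written
for this illustration; they mimic the overflow-safe behaviour of
struct_size() by saturating to SIZE_MAX, and calloc() stands in for
kvzalloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct boo {
    int a;
    int b;
};

struct foo {
    int stuff;
    struct boo entry[]; /* flexible array member, must be last */
};

/* Size of a struct foo with `count` entries, or SIZE_MAX on overflow. */
static size_t foo_size(size_t count)
{
    if (count > (SIZE_MAX - sizeof(struct foo)) / sizeof(struct boo))
        return SIZE_MAX;
    return sizeof(struct foo) + count * sizeof(struct boo);
}

/* Zeroed allocation; refuses sizes that overflowed. */
static struct foo *foo_alloc(size_t count)
{
    size_t size = foo_size(count);

    if (size == SIZE_MAX)
        return NULL;
    return calloc(1, size);
}
```

The open-coded form silently wraps around for a huge count, whereas the
saturating form makes the allocation fail instead.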

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb3/l2t: Fix undefined behaviour
Gustavo A. R. Silva [Fri, 29 Mar 2019 15:27:26 +0000 (10:27 -0500)]
cxgb3/l2t: Fix undefined behaviour

The use of a zero-sized array causes undefined behaviour when it is not
the last member in a structure, as is the case here.

Also, the current code makes use of a language extension to the C90
standard, but the preferred mechanism to declare variable-length
types such as this one is a flexible array member, introduced in
C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last, which helps to
maintain high-quality code.

Fixes: e48f129c2f20 ("[SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: read mac address from DT for slave device
Xiaofei Shen [Fri, 29 Mar 2019 05:34:58 +0000 (11:04 +0530)]
net: dsa: read mac address from DT for slave device

Before creating a slave netdevice, get the mac address from DTS and
apply it if it is valid.

Signed-off-by: Xiaofei Shen <xiaofeis@codeaurora.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: fix tcp_inet6_sk() for 32bit kernels
Eric Dumazet [Mon, 1 Apr 2019 10:09:20 +0000 (03:09 -0700)]
tcp: fix tcp_inet6_sk() for 32bit kernels

It turns out that struct ipv6_pinfo is not located as we think.

inet6_sk_generic() and tcp_inet6_sk() disagree on 32bit kernels by 4-bytes,
because struct tcp_sock has 8-bytes alignment,
but ipv6_pinfo size is not a multiple of 8.

sizeof(struct ipv6_pinfo): 116 (not padded to 8)

I actually first coded tcp_inet6_sk() as this patch does, but thought
that "container_of(tcp_sk(sk), struct tcp6_sock, tcp)" was cleaner.

As Julian told me: nobody should use tcp6_sock.inet6
directly, it should be accessed via tcp_inet6_sk() or inet6_sk().

This happened when we added the first u64 field in struct tcp_sock.
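
The 4-byte disagreement can be reproduced with a small sketch. The
structures below are hypothetical and only mimic the layout properties
named above: a 116-byte trailing member with 4-byte alignment inside a
container that needs 8-byte alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* 116 bytes, 4-byte alignment: size is not a multiple of 8. */
struct pinfo_sketch {
    uint32_t words[29];
};

/* Contains a 64-bit field forced to 8-byte alignment. */
struct tcp_sketch {
    uint32_t flags;
    _Alignas(8) uint64_t bytes;
};

/* Container: the trailing member is followed by 4 bytes of tail padding. */
struct tcp6_sketch {
    struct tcp_sketch tcp;
    struct pinfo_sketch inet6;
};

/* Offset as seen by code that names the member directly. */
static size_t offset_by_member(void)
{
    return offsetof(struct tcp6_sketch, inet6);
}

/* Offset as seen by code that subtracts the member size from the total. */
static size_t offset_by_size(void)
{
    return sizeof(struct tcp6_sketch) - sizeof(struct pinfo_sketch);
}
```

The two helpers differ by the tail padding, so both accessors have to
use the same calculation to agree on where the data lives.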

Fixes: 93a77c11ae79 ("tcp: add tcp_inet6_sk() helper")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Bisected-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: use netif_receive_skb_list batching
Heiner Kallweit [Sun, 31 Mar 2019 13:18:48 +0000 (15:18 +0200)]
r8169: use netif_receive_skb_list batching

Use netif_receive_skb_list() instead of napi_gro_receive() to benefit
from batched skb processing.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-spectrum_acl-Get-rid-of-failed_rollback-mechanism'
David S. Miller [Sun, 31 Mar 2019 18:01:23 +0000 (11:01 -0700)]
Merge branch 'mlxsw-spectrum_acl-Get-rid-of-failed_rollback-mechanism'

Ido Schimmel says:

====================
mlxsw: spectrum_acl: Get rid of failed_rollback mechanism

Jiri says:

Currently if vregion rehash fails, it rolls back to the original ERP
set. However, in the unlikely case that the rollback itself fails, the
vregion is left in a zombie state and never gets rehashed again. With
the recent changes, it is possible to try to continue the rollback. Do
it from the last failed ventry.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_acl: Rename rehash_dis trace
Jiri Pirko [Sun, 31 Mar 2019 06:49:41 +0000 (06:49 +0000)]
mlxsw: spectrum_acl: Rename rehash_dis trace

The name of the trace is no longer correct, since there is no disable of
rehash done. So name it "rehash_rollback_failed".

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_acl: Remove failed_rollback dead end
Jiri Pirko [Sun, 31 Mar 2019 06:49:40 +0000 (06:49 +0000)]
mlxsw: spectrum_acl: Remove failed_rollback dead end

Currently if a rollback ends with an error, the vregion is in a zombie
state for the rest of its existence. Instead, try to continue later on
(after the rehash interval) from where the rollback ended.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_acl: Move rehash_dis trace call and err msg to vregion_migrate()
Jiri Pirko [Sun, 31 Mar 2019 06:49:39 +0000 (06:49 +0000)]
mlxsw: spectrum_acl: Move rehash_dis trace call and err msg to vregion_migrate()

Move the call of rehash_dis trace and the error message to
vregion_migrate() next to the failed_rollback flag set.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: spectrum_acl: Remove redundant failed_rollback from migrate_start()
Jiri Pirko [Sun, 31 Mar 2019 06:49:38 +0000 (06:49 +0000)]
mlxsw: spectrum_acl: Remove redundant failed_rollback from migrate_start()

The flag is set by the caller mlxsw_sp_acl_tcam_vregion_migrate() anyway,
so don't set it here.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: bridge: use netif_is_bridge_port()
Julian Wiedmann [Fri, 29 Mar 2019 13:38:19 +0000 (14:38 +0100)]
net: bridge: use netif_is_bridge_port()

Replace the br_port_exists() macro with its twin from netdevice.h

CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoteam: use netif_is_team_port()
Julian Wiedmann [Fri, 29 Mar 2019 13:37:07 +0000 (14:37 +0100)]
team: use netif_is_team_port()

Replace the team_port_exists() macro with its twin from netdevice.h

CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4/cxgb4vf: Display advertised FEC in ethtool
Vishal Kulkarni [Fri, 29 Mar 2019 12:54:03 +0000 (18:24 +0530)]
cxgb4/cxgb4vf: Display advertised FEC in ethtool

This patch advertises Forward Error Correction in ethtool

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: Update 1.23.3.0 as the latest firmware supported.
Vishal Kulkarni [Fri, 29 Mar 2019 11:26:09 +0000 (16:56 +0530)]
cxgb4: Update 1.23.3.0 as the latest firmware supported.

Change t4fw_version.h to update latest firmware version
number to 1.23.3.0.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoopenvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode
wenxu [Thu, 28 Mar 2019 04:43:23 +0000 (12:43 +0800)]
openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode

There is currently no support for the multicast/broadcast aspects
of VXLAN in ovs. In the datapath flow the tun_dst must be specific.
But in the IP_TUNNEL_INFO_BRIDGE mode the tun_dst can not be specific,
and the packet can be forwarded through the fdb table of the vxlan device.
In this mode the broadcast/multicast packet can be sent in the
following way in ovs.

ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
        options:key=1000 options:remote_ip=flow
ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff, \
        action=output:vxlan

bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
        src_vni 1000 vni 1000 self
bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
        src_vni 1000 vni 1000 self

Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: cleanup sk_tx_skb_cache before reuse
Eric Dumazet [Fri, 29 Mar 2019 19:46:17 +0000 (12:46 -0700)]
tcp: cleanup sk_tx_skb_cache before reuse

TCP stack relies on the fact that a freshly allocated skb
has skb->cb[] and skb_shinfo(skb)->tx_flags cleared.

When recycling tx skb, we must ensure these fields are cleared.

Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
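
A minimal userspace sketch of the recycling rule stated above: a cached
buffer must look like a freshly allocated one before reuse. The struct and
function names are hypothetical stand-ins; only the cb[] and tx_flags
fields mirror what the commit message mentions, and the 48-byte size is an
assumption for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for the two skb fields named in the message. */
struct tx_buf_sketch {
	unsigned char cb[48];	/* control block, size assumed for the sketch */
	unsigned char tx_flags;	/* stands in for skb_shinfo(skb)->tx_flags */
};

/* Clear the fields that callers expect to be zero on a fresh buffer. */
static struct tx_buf_sketch *tx_buf_recycle(struct tx_buf_sketch *buf)
{
	memset(buf->cb, 0, sizeof(buf->cb));
	buf->tx_flags = 0;
	return buf;
}
```

Without the reset, stale state from the previous user of the buffer would
leak into the next transmission.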
5 years agoMAINTAINERS: Fix mellanox Innova IPsec
Boris Pismenny [Fri, 29 Mar 2019 17:19:44 +0000 (20:19 +0300)]
MAINTAINERS: Fix mellanox Innova IPsec

The Innova IPsec driver is part of all Innova drivers, and its
maintenance is covered by an existing entry in this file.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: tc-testing: Add pedit tests
Dmytro Linkin [Thu, 28 Mar 2019 16:09:31 +0000 (16:09 +0000)]
selftests: tc-testing: Add pedit tests

Add 36 pedit action tests to check pedit options described in tc-pedit(8)
man page. Test cases can be specified by categories: actions, pedit,
raw_op, layered_op. RAW_OP cases check offset option for u8, u16 and u32
offset size. LAYERED_OP cases check fields option for eth, ip, ip6,
tcp and udp headers.

Include following tests:
377e - Add pedit action with RAW_OP offset u32
a0ca - Add pedit action with RAW_OP offset u32 (INVALID)
dd8a - Add pedit action with RAW_OP offset u16 u16
53db - Add pedit action with RAW_OP offset u16 (INVALID)
5c7e - Add pedit action with RAW_OP offset u8 add value
2893 - Add pedit action with RAW_OP offset u8 quad
3a07 - Add pedit action with RAW_OP offset u8-u16-u8
ab0f - Add pedit action with RAW_OP offset u16-u8-u8
9d12 - Add pedit action with RAW_OP offset u32 set u16 clear u8 invert
ebfa - Add pedit action with RAW_OP offset overflow u32 (INVALID)
f512 - Add pedit action with RAW_OP offset u16 at offmask shift set
c2cb - Add pedit action with RAW_OP offset u32 retain value
86d4 - Add pedit action with LAYERED_OP eth set src & dst
c715 - Add pedit action with LAYERED_OP eth set src (INVALID)
ba22 - Add pedit action with LAYERED_OP eth type set/clear sequence
5810 - Add pedit action with LAYERED_OP ip set src & dst
1092 - Add pedit action with LAYERED_OP ip set ihl & dsfield
02d8 - Add pedit action with LAYERED_OP ip set ttl & protocol
3e2d - Add pedit action with LAYERED_OP ip set ttl (INVALID)
31ae - Add pedit action with LAYERED_OP ip ttl clear/set
486f - Add pedit action with LAYERED_OP ip set duplicate fields
e790 - Add pedit action with LAYERED_OP ip set ce, df, mf, firstfrag,
nofrag fields
6829 - Add pedit action with LAYERED_OP beyond ip set dport & sport
afd8 - Add pedit action with LAYERED_OP beyond ip set icmp_type &
icmp_code
3143 - Add pedit action with LAYERED_OP beyond ip set dport (INVALID)
fc1f - Add pedit action with LAYERED_OP ip6 set src & dst
6d34 - Add pedit action with LAYERED_OP ip6 dst retain value (INVALID)
6f5e - Add pedit action with LAYERED_OP ip6 flow_lbl
6795 - Add pedit action with LAYERED_OP ip6 set payload_len, nexthdr,
hoplimit
1442 - Add pedit action with LAYERED_OP tcp set dport & sport
b7ac - Add pedit action with LAYERED_OP tcp sport set (INVALID)
cfcc - Add pedit action with LAYERED_OP tcp flags set
3bc4 - Add pedit action with LAYERED_OP tcp set dport, sport & flags
fields
f1c8 - Add pedit action with LAYERED_OP udp set dport & sport
d784 - Add pedit action with mixed RAW/LAYERED_OP #1
70ca - Add pedit action with mixed RAW/LAYERED_OP #2

Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Move ipv6 stubs to a separate header file
David Ahern [Fri, 22 Mar 2019 13:06:09 +0000 (06:06 -0700)]
ipv6: Move ipv6 stubs to a separate header file

The number of stubs is growing and has nothing to do with addrconf.
Move the definition of the stubs to a separate header file and update
users. In the move, drop the vxlan specific comment before ipv6_stub.

Code move only; no functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-Move-fib_nh-and-fib6_nh-to-a-common-struct'
David S. Miller [Fri, 29 Mar 2019 17:48:04 +0000 (10:48 -0700)]
Merge branch 'net-Move-fib_nh-and-fib6_nh-to-a-common-struct'

David Ahern says:

====================
net: Move fib_nh and fib6_nh to a common struct

First set of three with the end goal of enabling IPv6 gateways with IPv4
routes.

This set refactors ipv4 and ipv6 code to create init and release
helpers for each protocol and moving common elements to a fib_nh_common
struct.

v3
- split the reject setting into 2 with helper to the checks. This
  avoids changing cfg->fc_flags in fib6_nh_init

v2
- addressed Ido's comments: cleanup on failure path in nh_init helpers,
  ordering in fib6_nh_release, and removal of RTF_GATEWAY from fib6_info
  uses in mlxsw
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Use common nexthop init and release helpers
David Ahern [Thu, 28 Mar 2019 03:53:58 +0000 (20:53 -0700)]
net: Use common nexthop init and release helpers

With fib_nh_common in place, move common initialization and release
code into helpers used by both ipv4 and ipv6. For the moment, the init
is just the lwt encap and the release is both the netdev reference and
the lwt state reference. More will be added later.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Add fib_nh_common and update fib_nh and fib6_nh
David Ahern [Thu, 28 Mar 2019 03:53:57 +0000 (20:53 -0700)]
net: Add fib_nh_common and update fib_nh and fib6_nh

Add fib_nh_common struct with common nexthop attributes. Convert
fib_nh and fib6_nh to use it. Use macros to move existing
fib_nh_* references to the new nh_common.nhc_*.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
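
The embedding-plus-macros approach described above can be sketched in
plain C. This is a simplified illustration, not the kernel definition: the
_sketch struct names are hypothetical and only a few fields are shown.

```c
#include <stdint.h>

/* Attributes shared by IPv4 and IPv6 nexthops (reduced for the sketch). */
struct fib_nh_common_sketch {
	int	nhc_oif;
	uint8_t	nhc_flags;
	int	nhc_weight;
};

/* IPv4 nexthop: embeds the common part, keeps protocol-specific fields. */
struct fib_nh_sketch {
	struct fib_nh_common_sketch nh_common;
	uint32_t fib_nh_saddr;
};

/* IPv6 nexthop: same embedding, different protocol-specific fields. */
struct fib6_nh_sketch {
	struct fib_nh_common_sketch nh_common;
	uint8_t fib_nh_gw6[16];
};

/* Old-style member names keep working and resolve to the common struct. */
#define fib_nh_oif	nh_common.nhc_oif
#define fib_nh_flags	nh_common.nhc_flags
#define fib_nh_weight	nh_common.nhc_weight

/* A helper written once against the common part serves both families. */
static int nhc_oif_sketch(const struct fib_nh_common_sketch *nhc)
{
	return nhc->nhc_oif;
}
```

Because the macros expand to a member path, existing code that writes
nh->fib_nh_oif does not need to change when the field moves.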
5 years agoipv6: Rename fib6_nh entries
David Ahern [Thu, 28 Mar 2019 03:53:56 +0000 (20:53 -0700)]
ipv6: Rename fib6_nh entries

Rename fib6_nh entries that will be moved to a fib_nh_common struct.
Specifically, the device, gateway, flags, and lwtstate are common
with all nexthop definitions. In some places new temporary variables
are declared or local variables renamed to maintain line lengths.

Rename only; no functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: Rename fib_nh entries
David Ahern [Thu, 28 Mar 2019 03:53:55 +0000 (20:53 -0700)]
ipv4: Rename fib_nh entries

Rename fib_nh entries that will be moved to a fib_nh_common struct.
Specifically, the device, oif, gateway, flags, scope, lwtstate,
nh_weight and nh_upper_bound are common with all nexthop definitions.
In the process shorten fib_nh_lwtstate to fib_nh_lws to avoid really
long lines.

Rename only; no functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Change rt6_add_nexthop and rt6_nexthop_info to take fib6_nh
David Ahern [Thu, 28 Mar 2019 03:53:54 +0000 (20:53 -0700)]
ipv6: Change rt6_add_nexthop and rt6_nexthop_info to take fib6_nh

rt6_add_nexthop and rt6_nexthop_info only need the fib6_info for the
gateway flag and the nexthop weight, and the presence of a gateway is now
per-nexthop. Update the signatures to take a fib6_nh and nexthop weight
and better align with the ipv4 versions.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Refactor fib6_ignore_linkdown
David Ahern [Thu, 28 Mar 2019 03:53:53 +0000 (20:53 -0700)]
ipv6: Refactor fib6_ignore_linkdown

fib6_ignore_linkdown takes a fib6_info but only looks at the net_device
and its IPv6 config. Change it to take a net_device over a fib6_info as
its input argument.

In addition, move it to a header file to make the check inline and usable
later with IPv4 code without going through the ipv6 stub, and rename to
ip6_ignore_linkdown since it is only checking the setting based on the
ipv6 struct on a device.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Move gateway checks to a fib6_nh setting
David Ahern [Thu, 28 Mar 2019 03:53:52 +0000 (20:53 -0700)]
ipv6: Move gateway checks to a fib6_nh setting

The gateway setting is not per fib6_info entry but per-fib6_nh. Add a new
fib_nh_has_gw flag to fib6_nh and convert references to RTF_GATEWAY to
the new flag. For IPv6 addresses, the flag is cheaper than checking that
nh_gw is non-0 like IPv4 does.

While this increases fib6_nh by 8-bytes, the effective allocation size of
a fib6_info is unchanged. The 8 bytes is recovered later with a
fib_nh_common change.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Create cleanup helper for fib6_nh
David Ahern [Thu, 28 Mar 2019 03:53:51 +0000 (20:53 -0700)]
ipv6: Create cleanup helper for fib6_nh

Move the fib6_nh cleanup code to a new helper, fib6_nh_release.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Create init helper for fib6_nh
David Ahern [Thu, 28 Mar 2019 03:53:50 +0000 (20:53 -0700)]
ipv6: Create init helper for fib6_nh

Similar to IPv4, consolidate the fib6_nh initialization into a helper.
As a new standalone function, add a cleanup path to put lwtstate on
error.

To avoid modifying fib6_config flags, move the reject check to a helper
that is invoked once by fib6_nh_init to reset the device and then
again in ip6_route_info_create to set the fib6_flags.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: Create cleanup helper for fib_nh
David Ahern [Thu, 28 Mar 2019 03:53:49 +0000 (20:53 -0700)]
ipv4: Create cleanup helper for fib_nh

Move the fib_nh cleanup code from free_fib_info_rcu into a new helper,
fib_nh_release. Move classid accounting into fib_nh_release which is
called per fib_nh to make accounting symmetrical with fib_nh_init.
Export the helper to allow for use with nexthop objects in the
future.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: Create init helper for fib_nh
David Ahern [Thu, 28 Mar 2019 03:53:48 +0000 (20:53 -0700)]
ipv4: Create init helper for fib_nh

Consolidate the fib_nh initialization which is duplicated between
fib_create_info for single path and fib_get_nhs for multipath.
Export the helper to allow for use with nexthop objects in the
future.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: Move IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN to helper
David Ahern [Thu, 28 Mar 2019 03:53:47 +0000 (20:53 -0700)]
ipv4: Move IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN to helper

in_dev lookup followed by IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN check
is called in several places, some with the rcu lock and others with the
rtnl held.

Move the check to a helper similar to what IPv6 has. Since the helper
can be invoked from either context use rcu_dereference_rtnl to
dereference ip_ptr.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: Define fib_get_nhs when CONFIG_IP_ROUTE_MULTIPATH is disabled
David Ahern [Thu, 28 Mar 2019 03:53:46 +0000 (20:53 -0700)]
ipv4: Define fib_get_nhs when CONFIG_IP_ROUTE_MULTIPATH is disabled

Define fib_get_nhs to return EINVAL when CONFIG_IP_ROUTE_MULTIPATH is
not enabled and remove the ifdef check for CONFIG_IP_ROUTE_MULTIPATH
in fib_create_info.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
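
The stub pattern described above, sketched outside the kernel. All names
carry a _sketch or _SKETCH suffix to mark them as hypothetical; the point
is only that the caller no longer needs its own ifdef once a fallback
definition exists.

```c
#include <errno.h>

struct fib_info_sketch {
	int nhs;	/* number of nexthops */
};

#ifdef CONFIG_IP_ROUTE_MULTIPATH_SKETCH
/* Multipath build: record the requested number of nexthops. */
static int fib_get_nhs_sketch(struct fib_info_sketch *fi, int nhs)
{
	fi->nhs = nhs;
	return 0;
}
#else
/* Non-multipath build: the helper exists but always rejects the request. */
static int fib_get_nhs_sketch(struct fib_info_sketch *fi, int nhs)
{
	(void)fi;
	(void)nhs;
	return -EINVAL;
}
#endif

/* The caller is free of preprocessor conditionals. */
static int fib_create_info_sketch(struct fib_info_sketch *fi, int nhs)
{
	if (nhs > 1)
		return fib_get_nhs_sketch(fi, nhs);
	fi->nhs = 1;
	return 0;
}
```

With the config symbol undefined, a multipath request fails with -EINVAL
while the single-path case is unaffected.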
5 years agoMerge branch 'selftests-forwarding-Add-new-test-cases'
David S. Miller [Fri, 29 Mar 2019 00:20:53 +0000 (17:20 -0700)]
Merge branch 'selftests-forwarding-Add-new-test-cases'

Ido Schimmel says:

====================
selftests: forwarding: Add new test cases

This patchset mainly adds new forwarding test cases and performs small
changes in existing infrastructure.

Patches #1-#3 add new test cases for multicast RPF check, PCP and VLAN
matching using flower and tc VLAN modify action.

The rest of the patches are from Petr who says:

In patches #4 and #5, devlink_lib.sh is fixed to first not cause double
inclusion of lib.sh, and then to deduce the device name in a simpler way.

In patch #6, helpers for dealing with shared buffer configuration are
added to devlink_lib.sh.

In patch #7, MC-awareness test is fixed to configure shared buffers
explicitly.

In patch #8, several helpers are extracted from the MC-awareness test
and put into a new mlxsw-specific library, qos_lib.sh.

In patch #9, a new test is added which checks configuration of
strictly-prioritized streams.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: Add a new test for strict priority
Petr Machata [Thu, 28 Mar 2019 12:12:27 +0000 (12:12 +0000)]
selftests: mlxsw: Add a new test for strict priority

Test that when strict priority is configured on a system, the
higher-priority traffic does actually win all the available bandwidth.
The test uses a similar approach to qos_mc_aware.sh to run and account
the traffic.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: Add qos_lib.sh
Petr Machata [Thu, 28 Mar 2019 12:12:26 +0000 (12:12 +0000)]
selftests: mlxsw: Add qos_lib.sh

Extract reusable code from qos_mc_aware.sh and put into a new library.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: qos_mc_aware: Configure shared buffers
Petr Machata [Thu, 28 Mar 2019 12:12:25 +0000 (12:12 +0000)]
selftests: mlxsw: qos_mc_aware: Configure shared buffers

This test runs two streams of traffic from two independent ports to
create congestion on one egress port. It is necessary to configure the
shared buffer thresholds correctly, to make sure that there is traffic
from both streams in the shared buffer. Only then can the test actually
test prioritization among these streams.

Without this configuration, it is possible that one of the streams
takes all of port-pool quota, and the other stream is not even admitted,
thus invalidating the result.

On Spectrum-1, this is not a problem, because MC traffic uses a separate
pool. But for Spectrum-2, MC and UC share the same pool, and the correct
configuration is important.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: devlink_lib: Add shared buffer helpers
Petr Machata [Thu, 28 Mar 2019 12:12:24 +0000 (12:12 +0000)]
selftests: forwarding: devlink_lib: Add shared buffer helpers

Add helpers to obtain, set, and restore a pool size, and a port-pool and
tc-pool threshold.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: devlink_lib: Simplify deduction of DEVLINK_DEV
Petr Machata [Thu, 28 Mar 2019 12:12:23 +0000 (12:12 +0000)]
selftests: forwarding: devlink_lib: Simplify deduction of DEVLINK_DEV

Use devlink -j and jq for more accurate querying. Use cut -f-2 instead
of rev-cut-rev combo.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: devlink_lib: Avoid double sourcing of lib.sh
Petr Machata [Thu, 28 Mar 2019 12:12:23 +0000 (12:12 +0000)]
selftests: forwarding: devlink_lib: Avoid double sourcing of lib.sh

Don't source lib.sh twice and make the script work with ifnames passed
on the command line.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: Test action VLAN modify
Danielle Ratson [Thu, 28 Mar 2019 12:12:21 +0000 (12:12 +0000)]
selftests: forwarding: Test action VLAN modify

Construct a basic topology consisting of two hosts connected using a
VLAN-aware bridge. Put each port in a different VLAN and test that ping
fails.

Add ingress and egress filters with a VLAN modify action and test that
ping passes.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: Add PCP match and VLAN match tests
Amit Cohen [Thu, 28 Mar 2019 12:12:20 +0000 (12:12 +0000)]
selftests: forwarding: Add PCP match and VLAN match tests

Send packets with VLAN and PCP set and check that TC flower filters can
match on these keys.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: Add reverse path forwarding (RPF) test cases
Ido Schimmel [Thu, 28 Mar 2019 12:12:19 +0000 (12:12 +0000)]
selftests: forwarding: Add reverse path forwarding (RPF) test cases

In case a packet is routed using a multicast route whose specified
ingress interface does not match the interface from which the packet was
received, the packet is dropped.

Add IPv4 and IPv6 test cases for above mentioned scenario.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvneta: Add 2500BaseT support
Maxime Chevallier [Wed, 27 Mar 2019 16:31:06 +0000 (17:31 +0100)]
net: mvneta: Add 2500BaseT support

Some PHYs will use the 2500BaseX PHY_INTERFACE_MODE when being linked
with a partner using 2.5GBaseT.

Since we can't autonegotiate this speed between the MAC and the PHY, we
need to have the proper comphy support enabled, to make sure we can
safely advertise 2.5G and 1G in BaseT and be able to switch between both
corresponding PHY interface modes. This is now possible since comphy
support was added to this driver.

This commit adds the 2500BaseT mode to the list of supported modes when
using 2500BaseX, and was tested on a setup with an Armada385 and a
88E2010 PHY, both with and without the comphy node in the DT.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoopenvswitch: Add timeout support to ct action
Yi-Hung Wei [Tue, 26 Mar 2019 18:31:14 +0000 (11:31 -0700)]
openvswitch: Add timeout support to ct action

Add fine-grained timeout support to the conntrack action.
The new OVS_CT_ATTR_TIMEOUT attribute of the conntrack action
specifies a timeout to be associated with this connection.
If no timeout is specified, it acts as is, that is the default
timeout for the connection will be automatically applied.

Example usage:
$ nfct timeout add timeout_1 inet tcp syn_sent 100 established 200
$ ovs-ofctl add-flow br0 in_port=1,ip,tcp,action=ct(commit,timeout=timeout_1)

CC: Pravin Shelar <pshelar@ovn.org>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetfilter: Export nf_ct_{set,destroy}_timeout()
Yi-Hung Wei [Tue, 26 Mar 2019 18:31:13 +0000 (11:31 -0700)]
netfilter: Export nf_ct_{set,destroy}_timeout()

This patch exports nf_ct_set_timeout() and nf_ct_destroy_timeout().
The two functions are derived from xt_ct_destroy_timeout() and
xt_ct_set_timeout() in xt_CT.c, and moved to nf_conntrack_timeout.c
without any functional change.
This is useful for other users (e.g. OVS) that utilize the
finer-grain conntrack timeout feature.

CC: Pablo Neira Ayuso <pablo@netfilter.org>
CC: Pravin Shelar <pshelar@ovn.org>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 's390-next'
David S. Miller [Thu, 28 Mar 2019 19:57:24 +0000 (12:57 -0700)]
Merge branch 's390-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2019-03-28

please apply the following patchset to net-next. This reworks the control
IO code in qeth so that we no longer need to poll for cmd completion,
and refactors the IDX setup code to also use this improved IO path.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: send IDX cmds via qeth_send_control_data()
Julian Wiedmann [Thu, 28 Mar 2019 15:39:28 +0000 (16:39 +0100)]
s390/qeth: send IDX cmds via qeth_send_control_data()

This converts the IDX code to use qeth_send_control_data(), replacing
a bunch of duplicated IO code and unbounded waits. It also allows the
IDX sequence to benefit from the improved timeout & notify
infrastructure, so that we can eliminate the DOWN -> ACTIVATING -> UP
transition in the channel state machine.

The patch looks rather big, but most of it is a straightforward
conversion of the old IDX cmd setup & callbacks to the new model.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: use callback to finalize cmd
Julian Wiedmann [Thu, 28 Mar 2019 15:39:27 +0000 (16:39 +0100)]
s390/qeth: use callback to finalize cmd

To avoid concurrency issues, some parts of the cmd setup are delayed
until qeth_send_control_data() holds the IO channel's irq_pending
"lock". Rather than hard-coding those setup steps for each cmd type,
have the cmd provide a callback. This will make it easier to also issue
IDX commands via qeth_send_control_data().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
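
A rough userspace sketch of the per-command finalize callback described
above. The types and functions here are hypothetical and heavily
simplified: the real driver releases irq_pending when the IO completes,
whereas this model releases it immediately, and the sequence number is
just one example of state that can only be filled in once the channel is
held.

```c
#include <stdbool.h>
#include <stddef.h>

struct channel_sketch {
	bool irq_pending;	/* acts as the channel "lock" */
	unsigned int seqno;	/* next sequence number to hand out */
};

struct cmd_sketch {
	unsigned int seqno;
	/* Optional per-command setup that must run under the "lock". */
	void (*finalize)(struct channel_sketch *ch, struct cmd_sketch *cmd);
};

/* Example callback: stamp the command with the channel's next seqno. */
static void finalize_with_seqno(struct channel_sketch *ch,
				struct cmd_sketch *cmd)
{
	cmd->seqno = ch->seqno++;
}

/* Returns 0 on success, -1 if the channel is busy. */
static int send_control_data_sketch(struct channel_sketch *ch,
				    struct cmd_sketch *cmd)
{
	if (ch->irq_pending)
		return -1;
	ch->irq_pending = true;
	if (cmd->finalize)
		cmd->finalize(ch, cmd);	/* no hard-coded per-type setup */
	/* A real driver would start the IO here and wait for completion. */
	ch->irq_pending = false;
	return 0;
}
```

A new command type only has to supply its own callback; the send path
itself stays unchanged.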
5 years agos390/qeth: let qeth_notify_reply() set the notify reason
Julian Wiedmann [Thu, 28 Mar 2019 15:39:26 +0000 (16:39 +0100)]
s390/qeth: let qeth_notify_reply() set the notify reason

As trivial cleanup before adding more users to qeth_notify_reply(),
move the setup of reply->rc from the caller into the helper.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: clarify default cmd callback
Julian Wiedmann [Thu, 28 Mar 2019 15:39:25 +0000 (16:39 +0100)]
s390/qeth: clarify default cmd callback

Current code makes it look like qeth_send_control_data_cb() is some
sort of default callback for all cmds. But in practice, it is only used
for half of the cmd buffers we issue.
Reduce the confusion by only setting this callback for cmds that
actually want it, and while at it give the callback a name that matches
the established naming scheme.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: don't poll for cmd IO completion
Julian Wiedmann [Thu, 28 Mar 2019 15:39:24 +0000 (16:39 +0100)]
s390/qeth: don't poll for cmd IO completion

All callers are running in process context now, so we can safely sleep
in qeth_send_control_data() while waiting for a cmd to complete.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: convert IP table spinlock to mutex
Julian Wiedmann [Thu, 28 Mar 2019 15:39:23 +0000 (16:39 +0100)]
s390/qeth: convert IP table spinlock to mutex

All users of the lock are running in process context now.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: defer IPv6 address notifier events
Julian Wiedmann [Thu, 28 Mar 2019 15:39:22 +0000 (16:39 +0100)]
s390/qeth: defer IPv6 address notifier events

The inet6addr_chain is atomic. So instead of starting the cmd IO for
SETIP / DELIP straight from the notifier callback, run it from a
workqueue. This is the last step towards removal of cmd IO completion
polling.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: add wrapper for IP table access
Julian Wiedmann [Thu, 28 Mar 2019 15:39:21 +0000 (16:39 +0100)]
s390/qeth: add wrapper for IP table access

Extract a little helper, so that high-level callers can manipulate the
IP table without worrying about the locking. This will make it easier
to convert the code to a different locking primitive later on.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: remove locking for RX modeset cache
Julian Wiedmann [Thu, 28 Mar 2019 15:39:20 +0000 (16:39 +0100)]
s390/qeth: remove locking for RX modeset cache

The L2 and L3 .ndo_set_rx_mode callbacks maintain an address cache
to decide which addresses have changed since the last modeset.

When the card is set offline, qeth_l?_stop_card() drains this cache.
This happens only after 1) the net_device has been detached, and
2) any pending RX modeset has completed. Consequently we can access the
cache lock-free.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: defer RX modesetting
Julian Wiedmann [Thu, 28 Mar 2019 15:39:19 +0000 (16:39 +0100)]
s390/qeth: defer RX modesetting

.ndo_set_rx_mode gets called in process context, but while holding the
addr_list spinlock. Which means we currently can't sleep while
re-programming the HW, and need to poll for IO completion. That's bad,
in particular since receiving the cmd response can fail silently and
we're then polling until the timeout hits.

As a first step towards eliminating the IO completion polling, run the
RX modeset from a work element and only take the addr_list lock while
updating the RX mode address cache.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-call-for-phys_port_name-into-devlink-directly-if-possible'
David S. Miller [Thu, 28 Mar 2019 19:55:31 +0000 (12:55 -0700)]
Merge branch 'net-call-for-phys_port_name-into-devlink-directly-if-possible'

Jiri Pirko says:

===================
net: call for phys_port_name into devlink directly if possible

phys_port_name may be assembled by a helper in devlink. It is currently
the case only for mlxsw driver. Benefit from the get_devlink_port ndo
and call into devlink directly from dev_get_phys_port_name(). That saves
the trip to the driver, simplifies the code and makes it similar to
recently introduced ethtool-devlink compat helpers.

Move bnxt, partly nfp and dsa to let devlink core generate the name too.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>