review.tizen.org Git - platform/kernel/linux-starfive.git/log

net: intel: Remove in_interrupt() warnings

in_interrupt() is ill defined and does not provide what the name
suggests. The usage especially in driver code is deprecated and a tree wide
effort to clean up and consolidate the (ab)usage of in_interrupt() and
related checks is happening.

In this case the checks cover only parts of the contexts in which these
functions cannot be called. They fail to detect preemption or interrupt
disabled invocations.

As the functions which are invoked from the various places contain already
a broad variety of checks (always enabled or debug option dependent) cover
all invalid conditions already, there is no point in having inconsistent
warnings in those drivers.

Just remove them.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec_mpc52xx: Replace in_interrupt() usage

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be seperated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

mpc52xx_fec_stop() uses in_interrupt() to check if it is safe to sleep. All
callers run in well defined contexts.

Pass an argument from the callers indicating whether it is safe to sleep.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: e100: Remove in_interrupt() usage and pointless GFP_ATOMIC allocation

e100_hw_init() invokes e100_self_test() only if in_interrupt() returns
false as e100_self_test() uses msleep() which requires sleepable task
context. The in_interrupt() check is incomplete because in_interrupt()
cannot catch callers from contexts which have just preemption or interrupts
disabled.

e100_hw_init() is invoked from:

  - e100_loopback_test() which clearly is sleepable task context as the
    function uses msleep() itself.

  - e100_up() which clearly is sleepable task context as well because it
    invokes e100_alloc_cbs() abd request_irq() which both require sleepable
    task context due to GFP_KERNEL allocations and mutex_lock() operations.

Remove the pointless in_interrupt() check.

As a side effect of this analysis it turned out that e100_rx_alloc_list()
which is only invoked from e100_loopback_test() and e100_up() pointlessly
uses a GFP_ATOMIC allocation. The next invoked function e100_alloc_cbs() is
using GFP_KERNEL already.

Change the allocation mode in e100_rx_alloc_list() to GFP_KERNEL as well.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cxbg4: Remove pointless in_interrupt() check

t4_sge_stop() is only ever called from task context and the in_interrupt()
check is presumably a leftover from copying t3_sge_stop().

Aside of in_interrupt() being deprecated because it's not providing what it
claims to provide, this check would paper over illegitimate callers.

The functions invoked from t4_sge_stop() contain already warnings to catch
invocations from invalid contexts.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cxgb3: Cleanup in_interrupt() usage

t3_sge_stop() is called from task context and from error handlers in
interrupt context. It relies on in_interrupt() to differentiate the
contexts.

in_interrupt() is deprecated as it is ill defined and does not provide what
it suggests.

Instead of replacing it with some other construct, simply split the
function into t3_sge_stop_dma(), which can be called from any context, and
t3_sge_stop() which can be only called from task context.

This has the advantage that any bogus invocation of t3_sge_stop() from
wrong contexts can be caught by debug kernels instead of being papered over
by the conditional.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: atheros: Remove WARN_ON(in_interrupt())

in_interrupt() is ill defined and does not provide what the name
suggests. The usage especially in driver code is deprecated and a tree wide
effort to clean up and consolidate the (ab)usage of in_interrupt() and
related checks is happening.

In this case the check covers only parts of the contexts in which these
functions cannot be called. It fails to detect preemption or interrupt
disabled invocations.

As the functions which are invoked from at*_reinit_locked() contain a broad
variety of checks (always enabled or debug option dependent) which cover
all invalid conditions already, there is no point in having inconsistent
warnings in those drivers.

Just remove them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: caif: Use netif_rx_any_context()

The usage of in_interrupt() in non-core code is phased out. Ideally the
information of the calling context should be passed by the callers or the
functions be split as appropriate.

cfhsi_rx_desc() and cfhsi_rx_pld() use in_interrupt() to distinguish if
they should use netif_rx() or netif_rx_ni() for receiving packets.

The attempt to consolidate the code by passing an arguemnt or by
distangling it failed due lack of knowledge about this driver and because
the call chains are hard to follow.

As a stop gap use netif_rx_any_context() which invokes the correct code path
depending on context and confines the in_interrupt() usage to core code.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add netif_rx_any_context()

Quite some drivers make conditional decisions based on in_interrupt() to
invoke either netif_rx() or netif_rx_ni().

Conditionals based on in_interrupt() or other variants of preempt count
checks in drivers should not exist for various reasons and Linus clearly
requested to either split the code pathes or pass an argument to the
common functions which provides the context.

This is obviously the correct solution, but for some of the affected
drivers this needs a major rewrite due to their convoluted structure.

As in_interrupt() usage in drivers needs to be phased out, provide
netif_rx_any_context() as a stop gap for these drivers.

This confines the in_interrupt() conditional to core code which in turn
allows to remove the access to this check for driver code and provides one
central place to do further modifications once the driver maze is cleaned
up.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: caif: Remove unused caif SPI driver

While chasing in_interrupt() (ab)use in drivers it turned out that the
caif_spi driver has never been in use since the driver was merged 10 years
ago. There never was any matching code which provides a platform device.

The driver has not seen any update (asided of treewide changes and
cleanups) since 8 years and the maintainers vanished from the planet.

So analysing the potential contexts and the (in)correctness of
in_interrupt() usage is just a pointless exercise.

Remove the cruft.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enic: Cure the enic api locking trainwreck

enic_dev_wait() has a BUG_ON(in_interrupt()).

Chasing the callers of enic_dev_wait() revealed the gems of enic_reset()
and enic_tx_hang_reset() which are both invoked through work queues in
order to be able to call rtnl_lock(). So far so good.

After locking rtnl both functions acquire enic::enic_api_lock which
serializes against the (ab)use from infiniband. This is where the
trainwreck starts.

enic::enic_api_lock is a spin_lock() which implicitly disables preemption,
but both functions invoke a ton of functions under that lock which can
sleep. The BUG_ON(in_interrupt()) does not trigger in that case because it
can't detect the preempt disabled condition.

This clearly has never been tested with any of the mandatory debug options
for 7+ years, which would have caught that for sure.

Cure it by adding a enic_api_busy member to struct enic, which is modified
and evaluated with enic::enic_api_lock held.

If enic_api_devcmd_proxy_by_index() observes enic::enic_api_busy as true,
it drops enic::enic_api_lock and busy waits for enic::enic_api_busy to
become false.

It would be smarter to wait for a completion of that busy period, but
enic_api_devcmd_proxy_by_index() is called with other spin locks held which
obviously can't sleep.

Remove the BUG_ON(in_interrupt()) check as well because it's incomplete and
with proper debugging enabled the problem would have been caught from the
debug checks in schedule_timeout().

Fixes: 0b038566c0ea ("drivers/net: enic: Add an interface for USNIC to interact with firmware")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: include <linux/const.h> for _BITUL

Commit 5d5b4128c4ca ("devlink: introduce flash update overwrite mask")
added a usage of _BITUL to the UAPI <linux/devlink.h> header, but failed
to include the header file where it was defined. It happens that this
does not break any existing kernel include chains because it gets
included through other sources. However, when including the UAPI headers
in a userspace application (such as devlink in iproute2), _BITUL is not
defined.

Fixes: 5d5b4128c4ca ("devlink: introduce flash update overwrite mask")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'cxgb4-ch_ktls-updates-in-net-next'

Rohit Maheshwari says:

====================
cxgb4/ch_ktls: updates in net-next

This series of patches improves connections setup and statistics.

This series is broken down as follows:

Patch 1 fixes the handling of connection setup failure in HW. Driver
shouldn't return success to tls_dev_add, until HW returns success.

Patch 2 avoids the log flood.

Patch 3 adds ktls statistics at port level.

v1->v2:
- removed conn_up from all places.

v2->v3:
- Corrected timeout handling.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4/ch_ktls: ktls stats are added at port level

All the ktls stats were at adapter level, but now changing it
to port level.

Fixes: 62370a4f346d ("cxgb4/chcr: Add ipv6 support and statistics")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Avoid log flood

Changing these logs to dynamic debugs. If issue is seen, these
logs can be enabled at run time.

Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ch_ktls: Issue if connection offload fails

Since driver first return success to tls_dev_add, if req to HW is
successful, but later if HW returns failure, that connection traffic
fails permanently and connection status remains unknown to stack.

v1->v2:
- removed conn_up from all places.

v2->v3:
- Corrected timeout handling.

Fixes: 34aba2c45024 ("cxgb4/chcr : Register to tls add and del callback")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fddi/skfp: Avoid the use of one-element array

One-element arrays are being deprecated[1]. Replace the one-element array
with a simple object of type u_char: 'u_char rm_pad1'[2], once it seems
this is just a placeholder for padding.

[1] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays
[2] https://github.com/KSPP/linux/issues/86

Built-tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/5f72c23f.%2FkPBWcZBu+W6HKH4%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: report rx cookie discards in netlink get

When an L2TPv3 session receives a data frame with an incorrect cookie
l2tp_core logs a warning message and bumps a stats counter to reflect
the fact that the packet has been dropped.

However, the stats counter in question is missing from the l2tp_netlink
get message for tunnel and session instances.

Include the statistic in the netlink get response.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2020-09-29

Here's the main bluetooth-next pull request for 5.10:

- Multiple fixes to suspend/resume handling
- Added mgmt events for controller suspend/resume state
- Improved extended advertising support
- btintel: Enhanced support for next generation controllers
- Added Qualcomm Bluetooth SoC WCN6855 support
- Several other smaller fixes & improvements
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'hns3-next'

Huazhong Tan says:

====================
net: hns3: updates for -next

There are some misc updates for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: dump tqp enable status in debugfs

Adds debugfs to dump tqp enable status.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: debugfs add new command to query device specifications

In order to query specifications of the device, add a new debugfs
command "dev spec" to do that.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: remove unused code in hns3_self_test()

NETIF_F_HW_VLAN_CTAG_FILTER is not set in netdev->hw_feature,
but set in netdev->features.

So the handler of NETIF_F_HW_VLAN_CTAG_FILTER in hns3_self_test() is
always true, remove it.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: Add RoCE VF reset support

Add RoCE VF client reset support by notifying the RoCE VF client
when hns3 VF is resetting and adding a interface to query whether
CMDQ is ready to work.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add UDP segmentation offload support

Add support for UDP segmentation offload to the HNS3 driver
when the device can do it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: rename trace event hns3_over_8bd

Since the maximun BD number may not be 8 now, so rename
hns3_over_8bd() to hns3_over_max_bd().

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: replace macro HNS3_MAX_NON_TSO_BD_NUM

Currently, the driver is able to query the device's specifications,
which includes the maximum BD number of non TSO packet, so replace
macro HNS3_MAX_NON_TSO_BD_NUM with the queried value, and rewrite
macro HNS3_MAX_NON_TSO_SIZE whose value depends on the the maximum
BD number of non TSO packet.

Also, add a new parameter max_non_tso_bd_num to function
hns3_tx_bd_num() and hns3_skb_need_linearized(), then they can get
the maximum BD number of non TSO packet from the caller instead of
calculating by themself, The note of hns3_skb_need_linearized()
should be update as well.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'octeontx2-af-cleanup-and-extend-parser-config'

Stanislaw Kardach says:

====================
octeontx2-af: cleanup and extend parser config

Current KPU configuration data is spread over multiple files which makes
it hard to read. Clean this up by gathering all configuration data in a
single structure and also in a single file (npc_profile.h). This should
increase the readability of KPU handling code (since it always
references same structure), simplify updates to the CAM key extraction
code and allow abstracting out the configuration source.
Additionally extend and fix the parser config to support additional DSA
types, NAT-T-ESP and IPv6 fields.

Patch 1 ensures that CUSTOMx LTYPEs are not aliased with meaningful
LTYPEs where possible.

Patch 2 gathers all KPU profile related data into a single struct and
creates an adapter structure which provides an interface to the KPU
profile for the octeontx2-af driver.

Patches 3-4 add support for Extended DSA, eDSA and Forward DSA.

Patches 5-6 adds IPv6 fields to CAM key extraction and optimize the
parser performance for fragmented IPv6 packets.

Patch 7 refactors ESP handling in the parser to support NAT-T-ESP.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: add parser support for NAT-T-ESP

Add support for NAT-T-ESP to KPU parser configuration. NAT ESP is a UDP
based protocol. So move ESP to LE so that both UDP and ESP can be
extracted.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: optimize parsing of IPv6 fragments

IPv6 fragmented packet may not contain completed layer 4 information.
So stop KPU parsing after setting ipv6 fragmentation flag.

Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: Add IPv6 fields to default MKEX

Added some IPv6 protocol fields to the default MKEX profile.
They include everything from the beginning of IP header and up to
source address. The pattern occupies full KW2 in MCAM entry.
Only one out of two LD registers for this protocol is used.

Signed-off-by: Vidhya Vidhyaraman <vraman@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: fix Extended DSA and eDSA parsing

KPU profile interpret Extended DSA and eDSA by looking source dev. This
was incorrect and it restricts to use few source device ids and also
created confusion while parsing regular DSA tag. With below patch lookup
was based on bit 12 of Word0. This is always zero for DSA tag and it
should be one for Extended DSA and eDSA.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: add parser support for Forward DSA

Marvell Prestera switches supports distributed switch architecture
by inserting Forward DSA tag of 4 bytes right after ethernet SMAC.
This tag don't have a tpid field.

This patch provides parser and extraction support for the same.
Default ldata extraction profile added for FDSA such that Src_port
is extracted and placed inplace of vlanid field. Like extended DSA
and eDSA tags,a special PKIND of 62 is used for this tag.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: cleanup KPU config data

Refactor KPU related NPC code gathering all configuration data in a
structured format and putting it in one place (npc_profile.h).
This increases readability and makes it easier to extend the profile
configuration (as opposed to jumping between multiple header and source
files).

To do this:
* Gather all KPU profile related data into a single adapter struct.
* Convert the built-in MKEX definition to a structured one to streamline
  the MKEX loading.
* Convert LT default register configuration into a structure, keeping
  default protocol settings in same file where identifiers for those
  protocols are defined.
* Add a single point for KPU profile loading, so that its source may
  change in the future once proper interfaces for loading such config
  are in place.

Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: fix LD CUSTOM LTYPE aliasing

Since LD contains LTYPE definitions tweaked toward efficient
NIX_AF_RX_FLOW_KEY_ALG(0..31)_FIELD(0..4) usage, the original location
of NPC_LT_LD_CUSTOM0/1 was aliased with MPLS_IN_* definitions.
Moving custom frame to value 6 and 7 removes the aliasing at the cost of
custom frames being also considered when TCP/UDP RSS algo is configured.

However since the goal of CUSTOM frames is to classify them to a
separate set of RQs, this cost is acceptable.

Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: cls_u32: Replace one-element array with flexible-array member

There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct tc_u_hnode and use the struct_size() helper to calculate the
size for the allocations. Commit 5778d39d070b ("net_sched: fix struct
tc_u_hnode layout in u32") makes it clear that the code is expected to
dynamically allocate divisor + 1 entries for ->ht[] in tc_uhnode. Also,
based on other observations, as the piece of code below:

1232                 for (h = 0; h <= ht->divisor; h++) {
1233                         for (n = rtnl_dereference(ht->ht[h]);
1234                              n;
1235                              n = rtnl_dereference(n->next)) {
1236                                 if (tc_skip_hw(n->flags))
1237                                         continue;
1238
1239                                 err = u32_reoffload_knode(tp, n, add, cb,
1240                                                           cb_priv, extack);
1241                                 if (err)
1242                                         return err;
1243                         }
1244                 }

we can assume that, in general, the code is actually expecting to allocate
that extra space for the one-element array in tc_uhnode, everytime it
allocates memory for instances of tc_uhnode or tc_u_common structures.
That's the reason for passing '1' as the last argument for struct_size()
in the allocation for _root_ht_ and _tp_c_, and 'divisor + 1' in the
allocation code for _ht_.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/5f7062af.z3T9tn9yIPv6h5Ny%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed/qed_ll2: Replace one-element array with flexible-array member

There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct qed_ll2_tx_packet, instead of a one-element array and use the
struct_size() helper to calculate the size for the allocations. Commit
f5823fe6897c ("qed: Add ll2 option to limit the number of bds per packet")
was used as a reference point for these changes.

Also, it's important to notice that flexible-array members should occur
last in any structure, and structures containing such arrays and that
are members of other structures, must also occur last in the containing
structure. That's why _cur_completing_packet_ is now moved to the bottom
in struct qed_ll2_tx_queue. _descq_mem_ and _cur_send_packet_ are also
moved for unification.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/5f707198.PA1UCZ8MYozYZYAR%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: intel: Adding ref clock 1us tic for LPI cntr

Adding reference clock (1us tic) for all LPI timer on Intel platforms.
The reference clock is derived from ptp clk. This also enables all LPI
counter.

Signed-off-by: Rusaimi Amira Ruslan <rusaimi.amira.rusaimi@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-ipa-miscellaneous-cleanups'

Alex Elder says:

====================
net: ipa: miscellaneous cleanups

This series contains some minor cleanups I've been meaning to get
around to for a while.  The first few remove the definitions of some
currently-unused symbols.  Several fix some warnings that are reported
when the build is done with "W=2".  All are simple and have no effect
on the operation of the code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: fix two comments

In ipa_uc_response_hdlr() a comment uses the wrong function name
when it describes where a clock reference is taken. Fix this.

Also fix the comment in ipa_uc_response_hdlr() to correctly refer to
ipa_uc_setup(), which is where the clock reference described here is
taken.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: rename a phandle variable

When "W=2" is supplied to the build command, we get a warning about
shadowing a global declaration (of a typedef) for a variable defined
in ipa_probe(). Rename the variable to get rid of the warning.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: fix two mild warnings

Fix two spots where a variable "channel_id" is unnecessarily
redefined inside loops in "gsi.c". This is warned about if
"W=2" is added to the build command.

Note that this problem is harmless, so there's no need to backport
it as a bugfix.

Remove a comment in gsi_init() about waking the system; the GSI
interrupt does not wake the system any more.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: share field mask values for GSI general interrupt

The GSI general interrupt is managed by three registers: enable;
status; and clear. The three registers have same set of field bits
at the same locations. Use a common set of field masks for all
three registers to avoid duplication.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: share field mask values for GSI global interrupt

The GSI global interrupt is managed by three registers: enable;
status; and clear. The three registers have same set of field bits
at the same locations. Use a common set of field masks for all
three registers to avoid duplication.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: share field mask values for GSI interrupt type

The GSI interrupt type register and interrupt type mask register
have the same field bits at the same locations. Use a common set of
field masks for both registers rather than essentially duplicating
them. The only place the interrupt mask register uses any of these
is in gsi_irq_enable().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: remove unused status structure field masks

Most of the field masks used for fields in a status structure are
unused. Remove their definitions; we can add them back again when
we actually use them to handle arriving status messages. These are
warned about if "W=2" is added to the build command.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: kill unused status exceptions

Only the deaggregation status exception type is ever actually used.
If any other status exception type is reported we basically ignore
it, and consume the packet. Remove the unused definitions of status
exception type symbols; they can be added back when we actually
handle them.

Separately, two consecutive if statements test the same condition
near the top of ipa_endpoint_suspend_one(). Instead, use a single
test with a block that combines the previously-separate lines of
code.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: kill unused status opcodes

Three status opcodes are not currently supported. Symbols
representing their numeric values are defined but never used.
Remove those unused definitions; they can be defined again
when they actually get used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: kill definition of TRE_FLAGS_IEOB_FMASK

In "gsi_trans.c", the field mask TRE_FLAGS_IEOB_FMASK is defined but
never used. Although there's no harm in defining this, remove it
for now and redefine it at some future date if it becomes needed.
This is warned about if "W=2" is added to the build command.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ibmvnic-refactor-some-send-handle-functions'

Lijun Pan says:

====================
ibmvnic: refactor some send/handle functions

This patch series rename and factor some send crq request functions.
The new naming aligns better with handle* functions such that it make
the code easier to read and search by new contributors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: create send_control_ip_offload

Factor send_control_ip_offload out of handle_query_ip_offload_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: create send_query_ip_offload

Factor send_query_ip_offload out of handle_request_cap_rsp to
pair with handle_query_ip_offload_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: rename send_map_query to send_query_map

The new name send_query_map pairs with handle_query_map_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: rename ibmvnic_send_req_caps to send_request_cap

The new name send_request_cap pairs with handle_request_cap_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: rename send_cap_queries to send_query_cap

The new name send_query_cap pairs with handle_query_cap_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: set up 200GBPS speed

Set up the speed according to crq->query_phys_parms.rsp.speed.
Fix IBMVNIC_10GBPS typo.

Fixes: f8d6ae0d27ec ("ibmvnic: Report actual backing device speed and duplex values")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

atm: atmtcp: Constify atmtcp_v_dev_ops

The only usage of atmtcp_v_dev_ops is to pass its address to
atm_dev_register() which takes a pointer to const, and comparing its
address to another address, which does not modify it. Make it const to
allow the compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6gre: avoid tx_error when sending MLD/DAD on external tunnels

similarly to what has been done with commit 9d149045b3c0 ("geneve: change
from tx_error to tx_dropped on missing metadata"), avoid reporting errors
to userspace in case the kernel doesn't find any tunnel information for a
skb that is going to be transmitted: an increase of tx_dropped is enough.

tested with the following script:

# for t in ip6gre ip6gretap ip6erspan; do
> ip link add dev gre6-test0 type $t external
> ip address add dev gre6-test0 2001:db8::1/64
> ip link set dev gre6-test0 up
> sleep 30
> ip -s -j link show dev gre6-test0 | jq \
> '.[0].stats64.tx | {"errors": .errors, "dropped": .dropped}'
> ip link del dev gre6-test0
> done

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-smc-introduce-SMC-Dv2-support'

Karsten Graul says:

====================
net/smc: introduce SMC-Dv2 support

SMC-Dv2 support (see https://www.ibm.com/support/pages/node/6326337)
provides multi-subnet support for SMC-D, eliminating the current
same-subnet restriction. The new version detects if any of the virtual
ISM devices are on the same system and can therefore be used for an
SMC-Dv2 connection. Furthermore, SMC-Dv2 eliminates the need for
PNET IDs on s390.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: CLC decline - V2 enhancements

This patch covers the small SMCD version 2 changes for CLC decline.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: introduce CLC first contact extension

SMC Version 2 defines a first contact extension for CLC accept
and CLC confirm. This patch covers sending and receiving of the
CLC first contact extension.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: CLC accept / confirm V2

The new format of SMCD V2 CLC accept and confirm is introduced,
and building and checking of SMCD V2 CLC accepts / confirms is adapted
accordingly.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: determine accepted ISM devices

SMCD Version 2 allows to propose up to 8 additional ISM devices
offered to the peer as candidates for SMCD communication.
This patch covers the server side, i.e. selection of an ISM device
matching one of the proposed ISM devices, that will be used for
CLC accept

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: build and send V2 CLC proposal

The new format of an SMCD V2 CLC proposal is introduced, and
building and checking of SMCD V2 CLC proposals is adapted
accordingly.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: determine proposed ISM devices

SMCD Version 2 allows to propose up to 8 additional ISM devices
offered to the peer as candidates for SMCD communication.
This patch covers determination of the ISM devices to be proposed.
ISM devices without PNETID are preferred, since ISM devices with
PNETID are a V1 leftover and will disappear over the time.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: introduce list of pnetids for Ethernet devices

SMCD version 2 allows usage of ISM devices with hardware PNETID
only, if an Ethernet net_device exists with the same hardware PNETID.
This requires to maintain a list of pnetids belonging to
Ethernet net_devices, which is covered by this patch.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: introduce CHID callback for ISM devices

With SMCD version 2 the CHIDs of ISM devices are needed for the
CLC handshake.
This patch provides the new callback to retrieve the CHID of an
ISM device.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: introduce System Enterprise ID (SEID)

SMCD version 2 defines a System Enterprise ID (short SEID).
This patch contains the SEID creation and adds the callback to
retrieve the created SEID.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: prepare for more proposed ISM devices

SMCD Version 2 allows proposing of up to 8 ISM devices in addition
to the native ISM device of SMCD Version 1.
This patch prepares the struct smc_init_info to deal with these
additional 8 ISM devices.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: split CLC confirm/accept data to be sent

When sending CLC confirm and CLC accept, separate the trailing
part of the message from the initial part (to be prepared for
future first contact extension).

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: separate find device functions

This patch provides better separation of device determinations
in function smc_listen_work(). No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: CLC header fields renaming

SMCD version 2 defines 2 more bits in the CLC header to specify
version 2 types. This patch prepares better naming of the CLC
header fields. No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: remove constant and introduce helper to check for a pnet id

Use the existing symbol _S instead of SMC_ASCII_BLANK, and introduce a
helper to check if a pnetid is set. No functional change.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Use kobj_to_dev() API

Use kobj_to_dev() instead of container_of().

Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvneta: try to use in-irq pp cache in mvneta_txq_bufs_free

Try to recycle the xdp tx buffer into the in-irq page_pool cache if
mvneta_txq_bufs_free is executed in the NAPI context for XDP_TX use case

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '1GbE' of https://github.com/anguy11/next-queue

Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2020-09-28

This series contains updates to igb, igc, and e1000e drivers.

Sven Auhagen adds XDP support for igb.

Gal Hammer allows for 82576 to display part number string correctly for
igb.

Sasha adds device IDs for i221 and i226 parts. Exposes LPI counters and
removes unused fields in structures for igc. He also adds Meteor Lake
support for e1000e.

For igc, Andre renames IGC_TSYNCTXCTL_VALID to IGC_TSYNCTXCTL_TXTT_0 to
match the datasheet and adds a warning if it's not set when expected.
Removes the PTP Tx timestamp check in igc_ptp_tx_work() as it's already
checked in the watchdog_task. Cleans up some code by removing invalid error
bits, renaming a bit to match datasheet naming, and removing a, now
unneeded, macro.

Vinicius makes changes for igc PTP: removes calling SYSTIMR to latch timer
value, stores PTP time before a reset, and rejects schedules with times in
the future.

v2: Remove 'inline' from igb_xdp_tx_queue_mapping() and igb_rx_offset()
for patch 1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: Add support for Meteor Lake

Add devices IDs for the next LOM generations that will be
available on the next Intel Client platform (Meteor Lake)
This patch provides the initial support for these devices

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Clean up nvm_info structure

flash_bank_size and flash_base_addr field not in use and can
be removed from a nvm_info structure

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Reject schedules with a base_time in the future

When we set the BASET registers of i225 with a base_time in the
future, i225 will "hold" all packets until that base_time is reached,
causing a lot of TX Hangs.

As this behaviour seems contrary to the expectations of the IEEE
802.1Q standard (section 8.6.9, especially 8.6.9.4.5), let's start by
rejecting these types of schedules. If this is too limiting, we can
for example, setup a timer to configure the BASET registers closer to
the start time, only blocking the packets for a "short" while.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Export a way to read the PTP timer

The next patch will need a way to retrieve the current timestamp from
the NIC's PTP clock.

The 'i225' suffix is removed, if anything model specific is needed,
those specifics should be hidden by this function.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Remove reset disable flag

Boolean reset disable flag not applicable for i225 device and
could be removed.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Save PTP time before a reset

Many TSN features depend on the internal PTP clock, so the internal
PTP jumping when the adapter is reset can cause problems, usually in
the form of "TX Hangs" warnings in the driver.

The solution is to save the PTP time before a reset and restore it
after the reset is done. The value of the PTP time is saved before a
reset and we use the difference from CLOCK_MONOTONIC from reset time
to now, to correct what's going to be the new PTP time.

This is heavily inspired by commit bf4bf09bdd91 ("i40e: save PTP time
before a device reset").

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Remove references to SYSTIMR register

In i225, it's no longer necessary to use the SYSTIMR register to
latch the timer value, the timestamp is latched when SYSTIML is read.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Expose LPI counters

Completion to commit 900d1e8b346b ("igc: Add LPI counters")
LPI counters exposed by statistics update method.
A EEE TX LPI counter reflect the transmitter entries EEE (IEEE 802.3az)
into the LPI state. A EEE RX LPI counter reflect the receiver link
partner entries into EEE(IEEE 802.3az) LPI state.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Clean RX descriptor error flags

i225 advanced receive descriptor doesn't have the following extend error
bits: CE, SE, SEQ, CXE. In addition to that, the bit TCPE is called L4E
in the datasheet.

Clean up the code accordingly, and get rid of the macro
IGC_RXDEXT_ERR_FRAME_ERR_MASK since it doesn't make much sense anymore.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Remove timeout check from ptp_tx work

The Tx timestamp timeout is already checked by the watchdog_task
which runs periodically. In addition to that, from the ptp_tx work
perspective, if __IGC_PTP_TX_IN_PROGRESS flag is set we always want
handle the timestamp stored in hardware and update the skb. So remove
the timeout check in igc_ptp_tx_work() function.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Don't reschedule ptp_tx work

The ptp_tx work is scheduled only if TSICR.TXTS bit is set, therefore
TSYNCTXCTL.TXTT_0 bit is expected to be set when we check it igc_ptp_tx_
work(). If it isn't, something is really off and rescheduling the ptp_tx
work to check it later doesn't help much. This patch changes the code to
WARN_ON_ONCE() if this situation ever happens.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Rename IGC_TSYNCTXCTL_VALID macro

Rename the IGC_TSYNCTXCTL_VALID macro to IGC_TSYNCTXCTL_TXTT_0 so it
matches the datasheet.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Add new device ID's

Add new device ID's for the next step of the silicon and
reflect i221 and i226 parts

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igb: read PBA number from flash

Fixed flash presence check for 82576 controllers so the part
number string is read and displayed correctly.

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igb: add XDP support

Add XDP support to the IGB driver.
The implementation follows the IXGBE XDP implementation
closely and I used the following patches as basis:

1. commit 924708081629 ("ixgbe: add XDP support for pass and drop actions")
2. commit 33fdc82f0883 ("ixgbe: add support for XDP_TX action")
3. commit ed93a3987128 ("ixgbe: tweak page counting for XDP_REDIRECT")

Due to the hardware constraints of the devices using the
IGB driver we must share the TX queues with XDP which
means locking the TX queue for XDP.

I ran tests on an older device to get better numbers.
Test machine:

Intel(R) Atom(TM) CPU C2338 @ 1.74GHz (2 Cores)
2x Intel I211

Routing Original Driver Network Stack: 382 Kpps

Routing XDP Redirect (xdp_fwd_kern): 1.48 Mpps
XDP Drop: 1.48 Mpps

Using XDP we can achieve line rate forwarding even on
an older Intel Atom CPU.

Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

Merge branch 'udp_tunnel-convert-Intel-drivers-with-shared-tables'

Jakub Kicinski says:

====================
udp_tunnel: convert Intel drivers with shared tables

This set converts Intel drivers which have the ability to spawn
multiple netdevs, but have only one UDP tunnel port table.

Appropriate support is added to the core infra in patch 1,
followed by netdevsim support and a selftest.

The table sharing works by core attaching the same table
structure to all devices sharing the table. This means the
reference count has to accommodate potentially large values.

Once core is ready i40e and ice are converted. These are
complex drivers, but we got a tested-by from Aaron, so we
should be good :)

Compared to v1 I've made sure the selftest is executable.

Other than that patches 8 and 9 are actually from the Mellanox
conversion series were kept out to avoid Mellanox vs Intel
conflicts.

Last patch is new, some docs to let users knows ethtool
can now display UDP tunnel info.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

docs: vxlan: add info about device features

Add some information about VxLAN-related netdev features
and how to dump port table via ethtool.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: net: add a test for static UDP tunnel ports

Check UDP_TUNNEL_NIC_INFO_STATIC_IANA_VXLAN works as expected.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: support the static IANA VXLAN port flag

Allow setting UDP_TUNNEL_NIC_INFO_STATIC_IANA_VXLAN.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ice: convert to new udp_tunnel infrastructure

Convert ice to the new infra, use share port tables.

Leave a tiny bit more error checking in place than usual,
because this driver really does quite a bit of magic.

We need to calculate the number of VxLAN and GENEVE entries
the firmware has reserved.

Thanks to the conversion the driver will no longer sleep in
an atomic section.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ice: remove unused args from ice_get_open_tunnel_port()

ice_get_open_tunnel_port() is always passed TNL_ALL
as the second parameter.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

i40e: convert to new udp_tunnel infrastructure

Make use of the "shared port table" to convert i40e to the new
infra.

i40e did not have any reference tracking, locking is also dodgy
because rtnl gets released while talking to FW, so port may get
removed from the table while it's getting added etc.

On the good side i40e does not seem to be using the ports for
TX so we can remove the table from the driver state completely.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: net: add a test for shared UDP tunnel info tables

Add a test run of checks validating the shared UDP tunnel port
tables function as we expect.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: shared UDP tunnel port table support

Add the ability to simulate a device with a shared UDP tunnel port
table.

Try to reject the configurations and actions which are not supported
by the core, so we don't get syzcaller etc. warning reports.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: add warnings on unexpected UDP tunnel port errors

We should never see a removal of a port which is not in the table
or adding a port to an occupied entry in the table. To make sure
such errors don't escape the checks in the test script add a
warning/kernel spat.

Error injection will not trigger those, nor should it ever put
us in a bad state.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>