Xin Long [Sun, 6 Nov 2022 20:34:14 +0000 (15:34 -0500)]
net: move the ct helper function to nf_conntrack_helper for ovs and tc
Move ovs_ct_helper from openvswitch to nf_conntrack_helper and rename
as nf_ct_helper so that it can be used in TC act_ct in the next patch.
Note that it also adds the checks for the family and proto, as in TC
act_ct, the packets with correct family and proto are not guaranteed.
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Gal Pressman [Sun, 6 Nov 2022 12:31:27 +0000 (14:31 +0200)]
ethtool: Fail number of channels change when it conflicts with rxnfc
Similar to what we do with the hash indirection table [1], when network
flow classification rules are forwarding traffic to channels greater
than the requested number of channels, fail the operation.
Without this, traffic could be directed to channels which no longer
exist (dropped) after changing number of channels.
[1] commit
d4ab4286276f ("ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}")
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20221106123127.522985-1-gal@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Fri, 4 Nov 2022 19:01:25 +0000 (12:01 -0700)]
ethtool: linkstate: add a statistic for PHY down events
The previous attempt to augment carrier_down (see Link)
was not met with much enthusiasm so let's do the simple
thing of exposing what some devices already maintain.
Add a common ethtool statistic for link going down.
Currently users have to maintain per-driver mapping
to extract the right stat from the vendor-specific ethtool -S
stats. carrier_down does not fit the bill because it counts
a lot of software related false positives.
Add the statistic to the extended link state API to steer
vendors towards implementing all of it.
Implement for bnxt and all Linux-controlled PHYs. mlx5 and (possibly)
enic also have a counter for this but I leave the implementation
to their maintainers.
Link: https://lore.kernel.org/r/20220520004500.2250674-1-kuba@kernel.org
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20221104190125.684910-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 8 Nov 2022 04:00:13 +0000 (20:00 -0800)]
Merge branch 'net-txgbe-fix-two-bugs-in-txgbe_calc_eeprom_checksum'
YueHaibing says:
====================
net: txgbe: Fix two bugs in txgbe_calc_eeprom_checksum
Fix memleak and unsigned comparison bugs in txgbe_calc_eeprom_checksum
====================
Link: https://lore.kernel.org/r/20221105080722.20292-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
YueHaibing [Sat, 5 Nov 2022 08:07:22 +0000 (16:07 +0800)]
net: txgbe: Fix unsigned comparison to zero in txgbe_calc_eeprom_checksum()
The error checks on checksum for a negative error return always fails because
it is unsigned and can never be negative.
Fixes:
049fe5365324 ("net: txgbe: Add operations to interact with firmware")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
YueHaibing [Sat, 5 Nov 2022 08:07:21 +0000 (16:07 +0800)]
net: txgbe: Fix memleak in txgbe_calc_eeprom_checksum()
eeprom_ptrs should be freed before returned.
Fixes:
049fe5365324 ("net: txgbe: Add operations to interact with firmware")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 8 Nov 2022 03:55:52 +0000 (19:55 -0800)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-11-04 (ixgbe, ixgbevf, igb)
This series contains updates to ixgbe, ixgbevf, and igb drivers.
Daniel Willenson adjusts descriptor buffer limits to be based on what
hardware supports instead of using a generic, least common value for
ixgbe.
Ani removes local variable for ixgbe, instead returning conditional result
directly.
Yang Li removes unneeded semicolon for ixgbe.
Jan adds error messaging when VLAN errors are encountered for ixgbevf.
Kees Cook prevents a potential use after free condition and explicitly
rounds up q_vector allocations so that allocations can be correctly
compared to ksize() for igb.
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
igb: Proactively round up to kmalloc bucket size
igb: Do not free q_vector unless new one was allocated
ixgbevf: Add error messages on vlan error
ixgbe: Remove unneeded semicolon
ixgbe: Remove local variable
ixgbe: change MAX_RXD/MAX_TXD based on adapter type
====================
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221104205414.2354973-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tao Chen [Sat, 5 Nov 2022 09:05:04 +0000 (17:05 +0800)]
netlink: Fix potential skb memleak in netlink_ack
Fix coverity issue 'Resource leak'.
We should clean the skb resource if nlmsg_put/append failed.
Fixes:
738136a0e375 ("netlink: split up copies in the ack construction")
Signed-off-by: Tao Chen <chentao.kernel@linux.alibaba.com>
Link: https://lore.kernel.org/r/bff442d62c87de6299817fe1897cc5a5694ba9cc.1667638204.git.chentao.kernel@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Roman Gushchin [Fri, 4 Nov 2022 20:48:37 +0000 (13:48 -0700)]
net: macb: implement live mac addr change
Implement live mac addr change for the macb ethernet driver.
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221104204837.614459-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Fri, 4 Nov 2022 17:13:01 +0000 (17:13 +0000)]
net: remove explicit phylink_generic_validate() references
Virtually all conventional network drivers are now converted to use
phylink_generic_validate() - only DSA drivers and fman_memac remain,
so lets remove the necessity for network drivers to explicitly set
this member, and default to phylink_generic_validate() when unset.
This is possible as .validate must currently be set.
Any remaining instances that have not been addressed by this patch can
be fixed up later.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/E1or0FZ-001tRa-DI@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Fri, 4 Nov 2022 16:53:59 +0000 (16:53 +0000)]
net: lan966x: move unnecessary linux/sfp.h include
lan966x_phylink.c doesn't make use of anything from linux/sfp.h, so
remove this unnecessary include.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1oqzx9-001r9g-HV@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Mon, 7 Nov 2022 12:30:17 +0000 (12:30 +0000)]
Merge branch 'genetlink-per-op-type-policies'
Jakub Kicinski says:
====================
genetlink: support per op type policies
While writing new genetlink families I was increasingly annoyed by the fact
that we don't support different policies for do and dump callbacks.
This makes it hard to do proper input validation for dumps which usually
have a lot more narrow range of accepted attributes.
There is also a minor inconvenience of not supporting different per_doit
and post_doit callbacks per op.
This series addresses those problems by introducing another op format.
v3:
- minor fixes to patch 12 after I took it for a spin with a real family
- adjust commit msg in patch 8
v2: https://lore.kernel.org/all/
20221102213338.194672-1-kuba@kernel.org/
- wait for net changes to propagate
- restore the missing comment in patch 1
- drop extra space in patch 3
- improve commit message in patch 4
v1: https://lore.kernel.org/all/
20221018230728.1039524-1-kuba@kernel.org/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:43 +0000 (12:13 -0700)]
genetlink: convert control family to split ops
Prove that the split ops work.
Sadly we need to keep bug-wards compatibility and specify
the same policy for dump as do, even tho we don't parse
inputs for the dump.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:42 +0000 (12:13 -0700)]
genetlink: allow families to use split ops directly
Let families to hook in the new split ops.
They are more flexible and should not be much larger than
full ops. Each split op is 40B while full op is 48B.
Devlink for example has 54 dos and 19 dumps, 2 of the dumps
do not have a do -> 56 full commands = 2688B.
Split ops would have taken 2920B, so 9% more space while
allowing individual per/post doit and per-type policies.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:41 +0000 (12:13 -0700)]
genetlink: inline old iteration helpers
All dumpers use the iterators now, inline the cmd by index
stuff into iterator code.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:40 +0000 (12:13 -0700)]
genetlink: use iterator in the op to policy map dumping
We can't put the full iterator in the struct ctrl_dump_policy_ctx
because dump context is statically sized by netlink core.
Allocate it dynamically.
Rename policy to dump_map to make the logic a little easier to follow.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:39 +0000 (12:13 -0700)]
genetlink: add iterator for walking family ops
Subsequent changes will expose split op structures to users,
so walking the family ops with just an index will get harder.
Add a structured iterator, convert the simple cases.
Policy dumping needs more careful conversion.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:38 +0000 (12:13 -0700)]
genetlink: inline genl_get_cmd()
All callers go via genl_get_cmd_split() now, so rename it
to genl_get_cmd() remove the original.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:37 +0000 (12:13 -0700)]
genetlink: support split policies in ctrl_dumppolicy_put_op()
Pass do and dump versions of the op to ctrl_dumppolicy_put_op()
so that it can provide a different policy index for the two.
Since we now look at policies, and those are set appropriately
there's no need to look at the GENL_DONT_VALIDATE_DUMP flag.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:36 +0000 (12:13 -0700)]
genetlink: add policies for both doit and dumpit in ctrl_dumppolicy_start()
Separate adding doit and dumpit policies for CTRL_CMD_GETPOLICY.
This has no effect until we actually allow do and dump to come
from different sources as netlink_policy_dump_add_policy()
does deduplication.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:35 +0000 (12:13 -0700)]
genetlink: check for callback type at op load time
Now that genl_get_cmd_split() is informed what type of callback
user is trying to access (do or dump) we can check that this
callback is indeed available and return an error early.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:34 +0000 (12:13 -0700)]
genetlink: load policy based on validation flags
Set the policy and maxattr pointers based on validation flags.
genl_family_rcv_msg_attrs_parse() will do nothing and return NULL
if maxattrs is zero, so no behavior change is expected.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:33 +0000 (12:13 -0700)]
genetlink: introduce split op representation
We currently have two forms of operations - small ops and "full" ops
(or just ops). The former does not have pointers for some of the less
commonly used features (namely dump start/done and policy).
The "full" ops, however, still don't contain all the necessary
information. In particular the policy is per command ID, while
do and dump often accept different attributes. It's also not
possible to define different pre_doit and post_doit callbacks
for different commands within the family.
At the same time a lot of commands do not support dumping and
therefore all the dump-related information is wasted space.
Create a new command representation which can hold info about
a do implementation or a dump implementation, but not both at
the same time.
Use this new representation on the command execution path
(genl_family_rcv_msg) as we either run a do or a dump and
don't have to create a "full" op there.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:32 +0000 (12:13 -0700)]
genetlink: move the private fields in struct genl_family
Move the private fields down to form a "private section".
Use the kdoc "private:" label comment thing to hide them
from the main kdoc comment.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 19:13:31 +0000 (12:13 -0700)]
genetlink: refactor the cmd <> policy mapping dump
The code at the top of ctrl_dumppolicy() dumps mappings between
ops and policies. It supports dumping both the entire family and
single op if dump is filtered. But both of those cases are handled
inside a loop, which makes the logic harder to follow and change.
Refactor to split the two cases more clearly.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Nov 2022 12:20:03 +0000 (12:20 +0000)]
Merge branch 'am65-cpsw-suspend-resume'
Roger Quadros says:
====================
net: ethernet: ti: am65-cpsw: Add suspend/resume support
This series enables PM_SLEEP(suspend/resume) support to
the am65-cpsw network driver.
Dual-emac and Switch mode are tested to work with suspend/resume
on AM62-SK platform.
It can be verified on the following branch
https://github.com/rogerq/linux/commits/for-v6.2/am62-cpsw-lpm-1.0
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Fri, 4 Nov 2022 13:23:10 +0000 (15:23 +0200)]
net: ethernet: ti: am65-cpsw: Fix hardware switch mode on suspend/resume
On low power during system suspend the ALE table context is lost.
Save the ALE contect before suspend and restore it after resume.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Fri, 4 Nov 2022 13:23:09 +0000 (15:23 +0200)]
net: ethernet: ti: am65-cpsw: retain PORT_VLAN_REG after suspend/resume
During suspend resume the context of PORT_VLAN_REG is lost so
save it during suspend and restore it during resume for
host port and slave ports.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Fri, 4 Nov 2022 13:23:08 +0000 (15:23 +0200)]
net: ethernet: ti: cpsw_ale: Add cpsw_ale_restore() helper
This can be used by device driver to restore ALE context.
The data produced by cpsw_ale_dump() can be passed to
cpsw_ale_restore().
This is required as on certain platforms the ALE context
is lost on low power suspend/resume.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Fri, 4 Nov 2022 13:23:07 +0000 (15:23 +0200)]
net: ethernet: ti: am65-cpsw: Add suspend/resume support
Add PM handlers for System suspend/resume.
As DMA driver doesn't yet support suspend/resume we free up
the DMA channels at suspend and acquire and initialize them
at resume.
Move the init/free dma calls to ndo_open/close() hooks so
it is symmetric and easier to invoke from suspend/resume handler.
As CPTS looses contect during suspend/resume, invoke the
necessary CPTS suspend/resume helpers.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Fri, 4 Nov 2022 13:23:06 +0000 (15:23 +0200)]
net: ethernet: ti: am65-cpsw/cpts: Add suspend/resume helpers
CPTS looses context on suspend (e.g. on AM62).
Provide suspend/resume hooks in CPTS driver. These will be
invoked by CPSW driver if CPTS was instantiated by CPSW.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank [Fri, 4 Nov 2022 08:44:41 +0000 (16:44 +0800)]
net: phy: fix yt8521 duplicated argument to & or |
cocci warnings: (new ones prefixed by >>)
>> drivers/net/phy/motorcomm.c:1122:8-35: duplicated argument to & or |
drivers/net/phy/motorcomm.c:1126:8-35: duplicated argument to & or |
drivers/net/phy/motorcomm.c:1130:8-34: duplicated argument to & or |
drivers/net/phy/motorcomm.c:1134:8-34: duplicated argument to & or |
The second YT8521_RC1R_GE_TX_DELAY_xx should be YT8521_RC1R_FE_TX_DELAY_xx.
Fixes:
70479a40954c ("net: phy: Add driver for Motorcomm yt8521 gigabit ethernet phy")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Frank <Frank.Sae@motor-comm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 4 Nov 2022 06:17:36 +0000 (14:17 +0800)]
gve: Fix error return code in gve_prefill_rx_pages()
If alloc_page() fails in gve_prefill_rx_pages(), it should return
an error code in the error path.
Fixes:
82fd151d38d9 ("gve: Reduce alloc and copy costs in the GQ rx path")
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Catherine Sullivan <csully@google.com>
Cc: Shailend Chand <shailend@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shenwei Wang [Fri, 4 Nov 2022 02:47:54 +0000 (21:47 -0500)]
net: fec: simplify the code logic of quirks
Simplify the code logic of handling the quirk of FEC_QUIRK_HAS_RACC.
If a SoC has the RACC quirk, the driver will enable the 16bit shift
by default in the probe function for a better performance.
This patch handles the logic in one place to make the logic simple
and clean. The patch optimizes the fec_enet_xdp_get_tx_queue function
according to Paolo Abeni's comments, and it also exludes the SoCs that
require to do frame swap from XDP support.
Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Chancellor [Thu, 3 Nov 2022 17:01:30 +0000 (10:01 -0700)]
s390/lcs: Fix return type of lcs_start_xmit()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/s390/net/lcs.c:2090:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = lcs_start_xmit,
^~~~~~~~~~~~~~
drivers/s390/net/lcs.c:2097:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = lcs_start_xmit,
^~~~~~~~~~~~~~
->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of lcs_start_xmit() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Chancellor [Thu, 3 Nov 2022 17:01:29 +0000 (10:01 -0700)]
s390/netiucv: Fix return type of netiucv_tx()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/s390/net/netiucv.c:1854:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = netiucv_tx,
^~~~~~~~~~
->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of netiucv_tx() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.
Additionally, while in the area, remove a comment block that is no
longer relevant.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Chancellor [Thu, 3 Nov 2022 17:01:28 +0000 (10:01 -0700)]
s390/ctcm: Fix return type of ctc{mp,}m_tx()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/s390/net/ctcm_main.c:1064:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = ctcm_tx,
^~~~~~~
drivers/s390/net/ctcm_main.c:1072:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = ctcmpc_tx,
^~~~~~~~~
->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of ctc{mp,}m_tx() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.
Additionally, while in the area, remove a comment block that is no
longer relevant.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haijun Liu [Thu, 3 Nov 2022 09:18:29 +0000 (14:48 +0530)]
net: wwan: t7xx: Add NAPI support
Replace the work queue based RX flow with a NAPI implementation
Remove rx_thread and dpmaif_rxq_work.
Enable GRO on RX path.
Introduce dummy network device. its responsibility is
- Binds one NAPI object for each DL HW queue and acts as
the agent of all those network devices.
- Use NAPI object to poll DL packets.
- Helps to dispatch each packet to the network interface.
Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
Co-developed-by: Sreehari Kancharla <sreehari.kancharla@linux.intel.com>
Signed-off-by: Sreehari Kancharla <sreehari.kancharla@linux.intel.com>
Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
Acked-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
Acked-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 3 Nov 2022 09:18:28 +0000 (14:48 +0530)]
net: wwan: t7xx: Use needed_headroom instead of hard_header_len
hard_header_len is used by gro_list_prepare() but on Rx, there
is no header so use needed_headroom instead.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sreehari Kancharla <sreehari.kancharla@linux.intel.com>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cai Huoqing [Thu, 3 Nov 2022 08:05:11 +0000 (16:05 +0800)]
net: hinic: Add support for configuration of rx-vlan-filter by ethtool
When ethtool config rx-vlan-filter, the driver will send
control command to firmware, then set to hardware in this patch.
Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cai Huoqing [Thu, 3 Nov 2022 08:05:10 +0000 (16:05 +0800)]
net: hinic: Add control command support for VF PMD driver in DPDK
HINIC has a mailbox for PF-VF communication and the VF driver
could send port control command to PF driver via mailbox.
The control command only can be set to register in PF,
so add support in PF driver for VF PMD driver control
command when VF PMD driver work with linux PF driver.
Then, no need to add handlers to nic_vf_cmd_msg_handler[],
because the host driver just forwards it to the firmware.
Actually the firmware works on a coprocessor MGMT_CPU(inside the NIC)
which will recv and deal with these commands.
Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cai Huoqing [Thu, 3 Nov 2022 08:05:09 +0000 (16:05 +0800)]
net: hinic: Convert the cmd code from decimal to hex to be more readable
The print cmd code is in hex, so using hex cmd code intead of
decimal is easy to check the value with print info.
Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Kozlowski [Wed, 2 Nov 2022 16:15:12 +0000 (12:15 -0400)]
dt-bindings: net: dsa-port: constrain number of 'reg' in ports
'reg' without any constraints allows multiple items which is not the
intention in DSA port schema (as physical port is expected to have only
one address).
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Kozlowski [Wed, 2 Nov 2022 16:15:11 +0000 (12:15 -0400)]
dt-bindings: net: constrain number of 'reg' in ethernet ports
'reg' without any constraints allows multiple items which is not the
intention for Ethernet controller's port number.
Constrain the 'reg' on AX88178 and LAN95xx USB Ethernet Controllers.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Chiu [Tue, 1 Nov 2022 01:11:39 +0000 (09:11 +0800)]
net: axiemac: add PM callbacks to support suspend/resume
Support basic system-wide suspend and resume functions
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Yang [Fri, 28 Oct 2022 02:01:01 +0000 (10:01 +0800)]
net: mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
Support mode switch properly, which is not available before.
If SoC has two Ethernet controllers, by setting both of them into MII
mode, the first controller enters GMII mode, while the second
controller is effectively disabled. This requires configuring (and
maybe enabling) the second controller in the device tree, even though
it cannot be used.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veerasenareddy Burru [Thu, 3 Nov 2022 06:05:57 +0000 (23:05 -0700)]
octeon_ep: support Octeon device CNF95N
Add support for Octeon device CNF95N.
CNF95N is a Octeon Fusion family product with same PCI NIC
characteristics as CN93 which is currently supported by the driver.
update supported device list in Documentation.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Link: https://lore.kernel.org/r/20221103060600.1858-1-vburru@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 5 Nov 2022 02:54:29 +0000 (19:54 -0700)]
Merge branch 'sfc-add-basic-flower-matches-to-offload'
Edward Cree says:
====================
sfc: add basic flower matches to offload
Support offloading TC flower rules with matches on L2-L4 fields.
====================
Link: https://lore.kernel.org/r/cover.1667412458.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Thu, 3 Nov 2022 15:27:31 +0000 (15:27 +0000)]
sfc: add Layer 4 matches to ef100 TC offload
Support matching on UDP/TCP source and destination ports and TCP flags,
with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Thu, 3 Nov 2022 15:27:30 +0000 (15:27 +0000)]
sfc: add Layer 3 flag matches to ef100 TC offload
Support matching on ip_frag and ip_firstfrag.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Thu, 3 Nov 2022 15:27:29 +0000 (15:27 +0000)]
sfc: add Layer 3 matches to ef100 TC offload
Support matching on IP protocol, Type of Service, Time To Live, source
and destination addresses, with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Thu, 3 Nov 2022 15:27:28 +0000 (15:27 +0000)]
sfc: add Layer 2 matches to ef100 TC offload
Support matching on EtherType, VLANs and ethernet source/destination
addresses, with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Thu, 3 Nov 2022 15:27:27 +0000 (15:27 +0000)]
sfc: check recirc_id match caps before MAE offload
Offloaded TC rules always match on recirc_id in the MAE, so we should
check that the MAE reported support for this match before attempting
to insert the rule.
These checks allow us to fail early, avoiding the PCIe round-trip to
firmware for an MC_CMD_MAE_ACTION_RULE_INSERT that will only fail,
and more importantly providing a more informative error message that
identifies which match field is unsupported.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Nathan Chancellor [Thu, 3 Nov 2022 22:00:32 +0000 (15:00 -0700)]
net: ethernet: renesas: Fix return type of rswitch_start_xmit()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/net/ethernet/renesas/rswitch.c:1533:20: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = rswitch_start_xmit,
^~~~~~~~~~~~~~~~~~
1 error generated.
->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of rswitch_start_xmit()
to match the prototype's to resolve the warning and CFI failure.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221103220032.2142122-1-nathan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhengchao Shao [Fri, 4 Nov 2022 02:25:13 +0000 (10:25 +0800)]
net: remove redundant check in ip_metrics_convert()
Now ip_metrics_convert() is only called by ip_fib_metrics_init(). Before
ip_fib_metrics_init() invokes ip_metrics_convert(), it checks whether
input parameter fc_mx is NULL. Therefore, ip_metrics_convert() doesn't
need to check fc_mx.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20221104022513.168868-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kees Cook [Tue, 18 Oct 2022 09:25:25 +0000 (02:25 -0700)]
igb: Proactively round up to kmalloc bucket size
In preparation for removing the "silently change allocation size"
users of ksize(), explicitly round up all q_vector allocations so that
allocations can be correctly compared to ksize().
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Kees Cook [Tue, 18 Oct 2022 09:25:24 +0000 (02:25 -0700)]
igb: Do not free q_vector unless new one was allocated
Avoid potential use-after-free condition under memory pressure. If the
kzalloc() fails, q_vector will be freed but left in the original
adapter->q_vector[v_idx] array position.
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jan Sokolowski [Thu, 6 Oct 2022 12:44:08 +0000 (14:44 +0200)]
ixgbevf: Add error messages on vlan error
ixgbevf did not provide an error in dmesg if VLAN addition failed.
Add two descriptive failure messages in the kernel log.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Yang Li [Wed, 2 Nov 2022 05:31:39 +0000 (13:31 +0800)]
ixgbe: Remove unneeded semicolon
./drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c:1305:2-3: Unneeded semicolon
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2688
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Anirudh Venkataramanan [Wed, 28 Sep 2022 22:42:10 +0000 (15:42 -0700)]
ixgbe: Remove local variable
Remove local variable "match" and directly return evaluated conditional
instead.
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Daniel Willenson [Mon, 17 Oct 2022 15:52:59 +0000 (11:52 -0400)]
ixgbe: change MAX_RXD/MAX_TXD based on adapter type
Set the length limit for the receive descriptor buffer and transmit
descriptor buffer based on the controller type. The values used are called
out in the controller datasheets as a 'Note:' in the RDLEN and TDLEN
register descriptions.
This allows the user to use ethtool to allocate larger descriptor buffers
in the case where data is received or transmitted too quickly for the
driver to keep up.
Signed-off-by: Daniel Willenson <daniel@veobot.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
David S. Miller [Fri, 4 Nov 2022 10:16:53 +0000 (10:16 +0000)]
Merge branch 'net-ipa-more-endpoints'
Alex Elder says:
====================
net: ipa: support more endpoints
This series adds support for more than 32 IPA endpoints. To do
this, five registers whose bits represent endpoint state are
replicated as needed to represent endpoints beyond 32. For existing
platforms, the number of endpoints is never greater than 32, so
there is just one of each register. IPA v5.0+ supports more than
that though; these changes prepare the code for that.
Beyond that, the IPA fields that represent endpoints in a 32-bit
bitmask are updated to support an arbitrary number of these endpoint
registers. (There is one exception, explained in patch 7.)
The first two patches are some sort of unrelated cleanups, making
use of a helper function introduced recently.
The third and fourth use parameterized functions to determine the
register offset for registers that represent endpoints.
The last five convert fields representing endpoints to allow more
than 32 endpoints to be represented.
Since v1, I have implemented Jakub's suggestions:
- Don't print a message on (bitmap) memory allocation failure
- Do not do "mass null checks" when allocating bitmaps
- Rework some code to ensure error path is sane
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:39 +0000 (17:11 -0500)]
net: ipa: use a bitmap for enabled endpoints
Replace the 32-bit unsigned used to track enabled endpoints with a
Linux bitmap, to allow an arbitrary number of endpoints to be
represented.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:38 +0000 (17:11 -0500)]
net: ipa: use a bitmap for set-up endpoints
Replace the 32-bit unsigned used to track endpoints that have
completed setup with a Linux bitmap, to allow an arbitrary number
of endpoints to be represented.
Rework the error handling in ipa_endpoint_init() so the defined
endpoint bitmap is freed if an error occurs early. Once endpoints
have been initialized, ipa_endpoint_exit() is used to recover if
the set of filtered endpoints is invalid.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:37 +0000 (17:11 -0500)]
net: ipa: support more filtering endpoints
Prior to IPA v5.0, there could be no more than 32 endpoints.
A filter table begins with a bitmap indicating which endpoints have
a filter defined. That bitmap is currently assumed to fit in a
32-bit value.
Starting with IPA v5.0, more than 32 endpoints are supported, so
it's conceivable that a TX endpoint has an ID that exceeds 32.
Increase the size of the field representing endpoints that support
filtering to 64 bits. Rename the bitmap field "filtered".
Unlike other similar fields, we do not use an (arbitrarily long)
Linux bitmap for this purpose. The reason is that if a filter table
ever *did* need to support more than 64 TX endpoints, its format
would change in ways we can't anticipate.
Have ipa_endpoint_init() return a negative errno rather than a mask
that indicates which endpoints support filtering, and have that
function assign the "filtered" field directly.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:36 +0000 (17:11 -0500)]
net: ipa: use a bitmap for available endpoints
Similar to the previous patch, replace the 32-bit unsigned used to
track endpoints supported by hardware with a Linux bitmap, to allow
an arbitrary number of endpoints to be represented.
Move ipa_endpoint_deconfig() above ipa_endpoint_config() and use
it in the error path of the latter function.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:35 +0000 (17:11 -0500)]
net: ipa: use a bitmap for defined endpoints
IPA v5.0 supports more than 32 endpoints, so we will be unable to
represent endpoints defined in the configuration data with a 32-bit
value. To prepare for that, convert the field in the IPA structure
representing defined endpoints to be a Linux bitmap.
Convert loops based on that field into for_each_set_bit() calls over
the new bitmap. Note that the loop in ipa_endpoint_config() still
assumes there are 32 or fewer endpoints (when comparing against the
available endpoint bit mask); that assumption goes away in the next
patch.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:34 +0000 (17:11 -0500)]
net: ipa: add a parameter to suspend registers
The SUSPEND_INFO, SUSPEND_EN, SUSPEND_CLR registers represent
endpoint IDs in a bit mask. When more than 32 endpoints are
supported, these registers will be replicated as needed to represent
the number of supported endpoints. Update the definitions of these
registers to have a stride of 4 bytes, and update the code that
operates them to select the proper offset and bit.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:33 +0000 (17:11 -0500)]
net: ipa: add a parameter to aggregation registers
Starting with IPA v5.0, a single IPA instance can have more than 32
endpoints defined. To handle this, each register that holds a
bitmap of IPA endpoints is replicated as needed to represent the
available endpoints.
To prepare for this, registers that represent endpoint IDs in a bit
mask will be defined to have a parameter, with a stride value of 4
bytes. The first 32 endpoints are represented in the first 32-bit
register, then the next (up to) 32 endpoints at an offset 4 bytes
higher. When accessing such a register, the endpoint ID divided
by 32 determines the offset, and the endpoint ID modulo 32 defines
the endpoint's bit position within the register.
The first two registers we'll update for this are STATE_AGGR_ACTIVE
and AGGR_FORCE_CLOSE.
Until more than 32 endpoints are supported, this change has no
practical effect.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:32 +0000 (17:11 -0500)]
net: ipa: use ipa_table_mem() in ipa_table_reset_add()
Similar to the previous commit, pass flags rather than a memory
region ID to ipa_table_reset_add(), and there use ipa_table_mem() to
look up the memory region affected based on those flags.
Currently all eight of these table memory regions are assumed to
exist, because they all have canaries within them. Stop assuming
that will always be the case, and in ipa_table_reset_add() allow
these memory regions to be non-existent.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 2 Nov 2022 22:11:31 +0000 (17:11 -0500)]
net: ipa: reduce arguments to ipa_table_init_add()
Recently ipa_table_mem() was added as a way to look up one of 8
possible memory regions by indicating whether it was a filter or
route table, hashed or not, and IPv6 or not.
We can simplify the interface to ipa_table_init_add() by passing two
flags to it instead of the opcode and both hashed and non-hashed
memory region IDs. The "filter" and "ipv6" flags are sufficient to
determine the opcode to use, and with ipa_table_mem() can look up
the correct memory region as well.
It's possible to not have hashed tables, but we already verify the
number of entries in a filter or routing table is nonzero. Stop
assuming a hashed table entry exists in ipa_table_init_add().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 1 Nov 2022 10:41:04 +0000 (10:41 +0000)]
rds: remove redundant variable total_payload_len
Variable total_payload_len is being used to accumulate payload lengths
however it is never read or used afterwards. It is redundant and can
be removed.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 4 Nov 2022 04:13:54 +0000 (21:13 -0700)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-11-02 (i40e, iavf)
This series contains updates to i40e and iavf drivers.
Joe Damato adds tracepoint information to i40e_napi_poll to expose helpful
debug information for users who'd like to get a better understanding of
how their NIC is performing as they adjust various parameters and tuning
knobs.
Note: this does not touch any XDP related code paths. This
tracepoint will only work when not using XDP. Care has been taken to avoid
changing control flow in i40e_napi_poll with this change.
Alicja adds error messaging for unsupported duplex settings for i40e.
Ye Xingchen replaces use of __FUNCTION__ with __func__ for iavf.
Bartosz changes tense of device removal message to be more clear on the
action for iavf.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
iavf: Change information about device removal in dmesg
iavf: Replace __FUNCTION__ with __func__
i40e: Add appropriate error message logged for incorrect duplex setting
i40e: Add i40e_napi_poll tracepoint
i40e: Record number of RXes cleaned during NAPI
i40e: Record number TXes cleaned during NAPI
i40e: Store the irq number in i40e_q_vector
====================
Link: https://lore.kernel.org/r/20221102211011.2944983-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 4 Nov 2022 04:12:56 +0000 (21:12 -0700)]
Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-11-02 (e1000e, e1000, igc)
This series contains updates to e1000e, e1000, and igc drivers.
For e1000e, Sasha adds a new board type to help distinguish platforms and
adds device id support for upcoming platforms. He also adds trace points
for CSME flows to aid in debugging.
Ani removes unnecessary kmap_atomic call for e1000 and e1000e.
Muhammad sets speed based transmit offsets for launchtime functionality to
reduce latency for igc.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
igc: Correct the launchtime offset
e1000: Remove unnecessary use of kmap_atomic()
e1000e: Remove unnecessary use of kmap_atomic()
e1000e: Add e1000e trace module
e1000e: Add support for the next LOM generation
e1000e: Separate MTP board type from ADP
====================
Link: https://lore.kernel.org/r/20221102203957.2944396-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Nathan Chancellor [Wed, 2 Nov 2022 16:06:10 +0000 (09:06 -0700)]
hamradio: baycom_epp: Fix return type of baycom_send_packet()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/net/hamradio/baycom_epp.c:1119:25: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = baycom_send_packet,
^~~~~~~~~~~~~~~~~~
1 error generated.
->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of baycom_send_packet()
to match the prototype's to resolve the warning and CFI failure.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221102160610.1186145-1-nathan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Nathan Chancellor [Wed, 2 Nov 2022 16:09:33 +0000 (09:09 -0700)]
net: ethernet: ti: Fix return type of netcp_ndo_start_xmit()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/net/ethernet/ti/netcp_core.c:1944:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = netcp_ndo_start_xmit,
^~~~~~~~~~~~~~~~~~~~
1 error generated.
->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of
netcp_ndo_start_xmit() to match the prototype's to resolve the warning
and CFI failure.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221102160933.1601260-1-nathan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Christophe JAILLET [Wed, 2 Nov 2022 06:36:23 +0000 (07:36 +0100)]
net: usb: Use kstrtobool() instead of strtobool()
strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.
In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.
While at it, include the corresponding header file (<linux/kstrtox.h>).
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/d4432a67b6f769cac0a9ec910ac725298b64e102.1667336095.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 4 Nov 2022 03:48:44 +0000 (20:48 -0700)]
Merge branch 'net-fix-netdev-to-devlink_port-linkage-and-expose-to-user'
Jiri Pirko says:
====================
net: fix netdev to devlink_port linkage and expose to user
Currently, the info about linkage from netdev to the related
devlink_port instance is done using ndo_get_devlink_port().
This is not sufficient, as it is up to the driver to implement it and
some of them don't do that. Also it leads to a lot of unnecessary
boilerplate code in all the drivers.
Instead of that, introduce a possibility for driver to expose this
relationship by new SET_NETDEV_DEVLINK_PORT macro which stores it into
dev->devlink_port. It is ensured by the driver init/fini flows that
the devlink_port pointer does not change during the netdev lifetime.
Devlink port is always registered before netdev register and
unregistered after netdev unregister.
Benefit from this linkage setup and remove explicit calls from driver
to devlink_port_type_eth_set() and clear(). Many of the driver
didn't use it correctly anyway. Let the devlink.c to track associated
netdev events and adjust type and type pointer accordingly. Also
use this events to to keep track on ifname change and remove RTNL lock
taking from devlink_nl_port_fill().
Finally, remove the ndo_get_devlink_port() ndo which is no longer used
and expose devlink_port handle as a new netdev netlink attribute to the
user. That way, during the ifname->devlink_port lookup, userspace app
does not have to dump whole devlink port list and instead it can just
do a simple RTM_GETLINK query.
====================
Link: https://lore.kernel.org/r/20221102160211.662752-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:11 +0000 (17:02 +0100)]
net: expose devlink port over rtnetlink
Expose devlink port handle related to netdev over rtnetlink. Introduce a
new nested IFLA attribute to carry the info. Call into devlink code to
fill-up the nest with existing devlink attributes that are used over
devlink netlink.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:10 +0000 (17:02 +0100)]
net: remove unused ndo_get_devlink_port
Remove ndo_get_devlink_port which is no longer used alongside with the
implementations in drivers.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:09 +0000 (17:02 +0100)]
net: devlink: use devlink_port pointer instead of ndo_get_devlink_port
Use newly introduced devlink_port pointer instead of getting it calling
to ndo_get_devlink_port op.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:08 +0000 (17:02 +0100)]
net: devlink: add not cleared type warning to port unregister
By the time port unregister is called. There should be no type set. Make
sure that the driver cleared it before and warn in case it didn't. This
enforces symmetricity with type set and port register.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:07 +0000 (17:02 +0100)]
net: devlink: store copy netdevice ifindex and ifname to allow port_fill() without RTNL held
To avoid a need to take RTNL mutex in port_fill() function, benefit from
the introduce infrastructure that tracks netdevice notifier events.
Store the ifindex and ifname upon register and change name events.
Remove the rtnl_held bool propagated down to port_fill() function as it
is no longer needed.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:06 +0000 (17:02 +0100)]
net: devlink: remove net namespace check from devlink_nl_port_fill()
It is ensured by the netdevice notifier event processing, that only
netdev pointers from the same net namespaces are filled. Remove the
net namespace check from devlink_nl_port_fill() as it is no longer
needed.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:05 +0000 (17:02 +0100)]
net: devlink: remove netdev arg from devlink_port_type_eth_set()
Since devlink_port_type_eth_set() should no longer be called by any
driver with netdev pointer as it should rather use
SET_NETDEV_DEVLINK_PORT, remove the netdev arg. Add a warn to
type_clear()
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:04 +0000 (17:02 +0100)]
net: make drivers to use SET_NETDEV_DEVLINK_PORT to set devlink_port
Benefit from the previously implemented tracking of netdev events in
devlink code and instead of calling devlink_port_type_eth_set() and
devlink_port_type_clear() to set devlink port type and link to related
netdev, use SET_NETDEV_DEVLINK_PORT() macro to assign devlink_port
pointer to netdevice which is about to be registered.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:03 +0000 (17:02 +0100)]
net: devlink: track netdev with devlink_port assigned
Currently, ethernet drivers are using devlink_port_type_eth_set() and
devlink_port_type_clear() to set devlink port type and link to related
netdev.
Instead of calling them directly, let the driver use
SET_NETDEV_DEVLINK_PORT macro to assign devlink_port pointer and let
devlink to track it. Note the devlink port pointer is static during
the time netdevice is registered.
In devlink code, use per-namespace netdev notifier to track
the netdevices with devlink_port assigned and change the internal
devlink_port type and related type pointer accordingly.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:02 +0000 (17:02 +0100)]
net: devlink: take RTNL in port_fill() function only if it is not held
Follow-up patch is going to introduce a netdevice notifier event
processing which is called with RTNL mutex held. Processing of this will
eventually lead to call to port_notity() and port_fill() which currently
takes RTNL mutex internally. So as a temporary solution, propagate a
bool indicating if the mutex is already held. This will go away in one
of the follow-up patches.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:01 +0000 (17:02 +0100)]
net: devlink: move port_type_netdev_checks() call to __devlink_port_type_set()
As __devlink_port_type_set() is going to be called directly from netdevice
notifier event handle in one of the follow-up patches, move the
port_type_netdev_checks() call there.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:02:00 +0000 (17:02 +0100)]
net: devlink: move port_type_warn_schedule() call to __devlink_port_type_set()
As __devlink_port_type_set() is going to be called directly from netdevice
notifier event handle in one of the follow-up patches, move the
port_type_warn_schedule() call there.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 2 Nov 2022 16:01:59 +0000 (17:01 +0100)]
net: devlink: convert devlink port type-specific pointers to union
Instead of storing type_dev as a void pointer, convert it to union and
use it to store either struct net_device or struct ib_device pointer.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 4 Nov 2022 03:46:36 +0000 (20:46 -0700)]
Merge branch 'bridge-add-mac-authentication-bypass-mab-support'
Ido Schimmel says:
====================
bridge: Add MAC Authentication Bypass (MAB) support
Patch #1 adds MAB support in the bridge driver. See the commit message
for motivation, design choices and implementation details.
Patch #2 adds corresponding test cases.
Follow-up patchsets will add offload support in mlxsw and mv88e6xxx.
====================
Link: https://lore.kernel.org/r/20221101193922.2125323-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hans J. Schultz [Tue, 1 Nov 2022 19:39:22 +0000 (21:39 +0200)]
selftests: forwarding: Add MAC Authentication Bypass (MAB) test cases
Add four test cases to verify MAB functionality:
* Verify that a locked FDB entry can be generated by the bridge,
preventing a host from communicating via the bridge. Test that user
space can clear the "locked" flag by replacing the entry, thereby
authenticating the host and allowing it to communicate via the bridge.
* Test that an entry cannot roam to a locked port, but that it can roam
to an unlocked port.
* Test that MAB can only be enabled on a port that is both locked and
has learning enabled.
* Test that locked FDB entries are flushed from a port when MAB is
disabled.
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hans J. Schultz [Tue, 1 Nov 2022 19:39:21 +0000 (21:39 +0200)]
bridge: Add MAC Authentication Bypass (MAB) support
Hosts that support 802.1X authentication are able to authenticate
themselves by exchanging EAPOL frames with an authenticator (Ethernet
bridge, in this case) and an authentication server. Access to the
network is only granted by the authenticator to successfully
authenticated hosts.
The above is implemented in the bridge using the "locked" bridge port
option. When enabled, link-local frames (e.g., EAPOL) can be locally
received by the bridge, but all other frames are dropped unless the host
is authenticated. That is, unless the user space control plane installed
an FDB entry according to which the source address of the frame is
located behind the locked ingress port. The entry can be dynamic, in
which case learning needs to be enabled so that the entry will be
refreshed by incoming traffic.
There are deployments in which not all the devices connected to the
authenticator (the bridge) support 802.1X. Such devices can include
printers and cameras. One option to support such deployments is to
unlock the bridge ports connecting these devices, but a slightly more
secure option is to use MAB. When MAB is enabled, the MAC address of the
connected device is used as the user name and password for the
authentication.
For MAB to work, the user space control plane needs to be notified about
MAC addresses that are trying to gain access so that they will be
compared against an allow list. This can be implemented via the regular
learning process with the sole difference that learned FDB entries are
installed with a new "locked" flag indicating that the entry cannot be
used to authenticate the device. The flag cannot be set by user space,
but user space can clear the flag by replacing the entry, thereby
authenticating the device.
Locked FDB entries implement the following semantics with regards to
roaming, aging and forwarding:
1. Roaming: Locked FDB entries can roam to unlocked (authorized) ports,
in which case the "locked" flag is cleared. FDB entries cannot roam
to locked ports regardless of MAB being enabled or not. Therefore,
locked FDB entries are only created if an FDB entry with the given {MAC,
VID} does not already exist. This behavior prevents unauthenticated
devices from disrupting traffic destined to already authenticated
devices.
2. Aging: Locked FDB entries age and refresh by incoming traffic like
regular entries.
3. Forwarding: Locked FDB entries forward traffic like regular entries.
If user space detects an unauthorized MAC behind a locked port and
wishes to prevent traffic with this MAC DA from reaching the host, it
can do so using tc or a different mechanism.
Enable the above behavior using a new bridge port option called "mab".
It can only be enabled on a bridge port that is both locked and has
learning enabled. Locked FDB entries are flushed from the port once MAB
is disabled. A new option is added because there are pure 802.1X
deployments that are not interested in notifications about locked FDB
entries.
Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Nov 2022 18:38:42 +0000 (11:38 -0700)]
Merge git://git./linux/kernel/git/netdev/net
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 3 Nov 2022 17:51:59 +0000 (10:51 -0700)]
Merge tag 'net-6.1-rc4' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth and netfilter.
Current release - regressions:
- net: several zerocopy flags fixes
- netfilter: fix possible memory leak in nf_nat_init()
- openvswitch: add missing .resv_start_op
Previous releases - regressions:
- neigh: fix null-ptr-deref in neigh_table_clear()
- sched: fix use after free in red_enqueue()
- dsa: fall back to default tagger if we can't load the one from DT
- bluetooth: fix use-after-free in l2cap_conn_del()
Previous releases - always broken:
- netfilter: netlink notifier might race to release objects
- nfc: fix potential memory leak of skb
- bluetooth: fix use-after-free caused by l2cap_reassemble_sdu
- bluetooth: use skb_put to set length
- eth: tun: fix bugs for oversize packet when napi frags enabled
- eth: lan966x: fixes for when MTU is changed
- eth: dwmac-loongson: fix invalid mdio_node"
* tag 'net-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
vsock: fix possible infinite sleep in vsock_connectible_wait_data()
vsock: remove the unused 'wait' in vsock_connectible_recvmsg()
ipv6: fix WARNING in ip6_route_net_exit_late()
bridge: Fix flushing of dynamic FDB entries
net, neigh: Fix null-ptr-deref in neigh_table_clear()
net/smc: Fix possible leaked pernet namespace in smc_init()
stmmac: dwmac-loongson: fix invalid mdio_node
ibmvnic: Free rwi on reset success
net: mdio: fix undefined behavior in bit shift for __mdiobus_register
Bluetooth: L2CAP: Fix attempting to access uninitialized memory
Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm
Bluetooth: L2CAP: Fix accepting connection request for invalid SPSM
Bluetooth: hci_conn: Fix not restoring ISO buffer count on disconnect
Bluetooth: L2CAP: Fix memory leak in vhci_write
Bluetooth: L2CAP: fix use-after-free in l2cap_conn_del()
Bluetooth: virtio_bt: Use skb_put to set length
Bluetooth: hci_conn: Fix CIS connection dst_type handling
Bluetooth: L2CAP: Fix use-after-free caused by l2cap_reassemble_sdu
netfilter: ipset: enforce documented limit to prevent allocating huge memory
isdn: mISDN: netjet: fix wrong check of device registration
...
Linus Torvalds [Thu, 3 Nov 2022 17:27:28 +0000 (10:27 -0700)]
Merge tag 'powerpc-6.1-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix an endian thinko in the asm-generic compat_arg_u64() which led to
syscall arguments being swapped for some compat syscalls.
- Fix syscall wrapper handling of syscalls with 64-bit arguments on
32-bit kernels, which led to syscall arguments being misplaced.
- A build fix for amdgpu on Book3E with AltiVec disabled.
Thanks to Andreas Schwab, Christian Zigotzky, and Arnd Bergmann.
* tag 'powerpc-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/32: Select ARCH_SPLIT_ARG64
powerpc/32: fix syscall wrappers with 64-bit arguments
asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual()
powerpc/64e: Fix amdgpu build on Book3E w/o AltiVec
Paolo Abeni [Thu, 3 Nov 2022 14:26:14 +0000 (15:26 +0100)]
Merge branch 'add-new-pcp-and-apptrust-attributes-to-dcbnl'
Daniel Machon says:
====================
Add new PCP and APPTRUST attributes to dcbnl
This patch series adds new extension attributes to dcbnl, to support PCP
prioritization (and thereby hw offloadable pcp-based queue
classification) and per-selector trust and trust order. Additionally,
the microchip sparx5 driver has been dcb-enabled to make use of the new
attributes to offload PCP, DSCP and Default prio to the switch, and
implement trust order of selectors.
For pre-RFC discussion see:
https://lore.kernel.org/netdev/Yv9VO1DYAxNduw6A@DEN-LT-70577/
For RFC series see:
https://lore.kernel.org/netdev/
20220915095757.2861822-1-daniel.machon@microchip.com/
In summary: there currently exist no convenient way to offload per-port
PCP-based queue classification to hardware. The DCB subsystem offers
different ways to prioritize through its APP table, but lacks an option
for PCP. Similarly, there is no way to indicate the notion of trust for
APP table selectors. This patch series addresses both topics.
PCP based queue classification:
- 8021Q standardizes the Priority Code Point table (see 6.9.3 of IEEE
Std 802.1Q-2018). This patch series makes it possible, to offload
the PCP classification to said table. The new PCP selector is not a
standard part of the APP managed object, therefore it is
encapsulated in a new non-std extension attribute.
Selector trust:
- ASIC's often has the notion of trust DSCP and trust PCP. The new
attribute makes it possible to specify a trust order of app
selectors, which drivers can then react on.
DCB-enable sparx5 driver:
- Now supports offloading of DSCP, PCP and default priority. Only one
mapping of protocol:priority is allowed. Consecutive mappings of the
same protocol to some new priority, will overwrite the previous. This
is to keep a consistent view of the app table and the hardware.
- Now supports dscp and pcp trust, by use of the introduced
dcbnl_set/getapptrust ops. Sparx5 supports trust orders: [], [dscp],
[pcp] and [dscp, pcp]. For now, only DSCP and PCP selectors are
supported by the driver, everything else is bounced.
Patch #1 introduces a new PCP selector to the APP object, which makes it
possible to encode PCP and DEI in the app triplet and offload it to the
PCP table of the ASIC.
Patch #2 Introduces the new extension attributes
DCB_ATTR_DCB_APP_TRUST_TABLE and DCB_ATTR_DCB_APP_TRUST. Trusted
selectors are passed in the nested DCB_ATTR_DCB_APP_TRUST_TABLE
attribute, and assembled into an array of selectors:
u8 selectors[256];
where lower indexes has higher precedence. In the array, selectors are
stored consecutively, starting from index zero. With a maximum number of
256 unique selectors, the list has the same maximum size.
Patch #3 Sets up the dcbnl ops hook, and adds support for offloading pcp
app entries, to the PCP table of the switch.
Patch #4 Makes use of the dcbnl_set/getapptrust ops, to set a per-port
trust order.
Patch #5 Adds support for offloading dscp app entries to the DSCP table
of the switch.
Patch #6 Adds support for offloading default prio app entries to the
switch.
====================
Link: https://lore.kernel.org/r/20221101094834.2726202-1-daniel.machon@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Tue, 1 Nov 2022 09:48:34 +0000 (10:48 +0100)]
net: microchip: sparx5: add support for offloading default prio
Add support for offloading default prio {ETHERTYPE, 0, prio}.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Tue, 1 Nov 2022 09:48:33 +0000 (10:48 +0100)]
net: microchip: sparx5: add support for offloading dscp table
Add support for offloading dscp app entries. Dscp values are global for
all ports on the sparx5 switch. Therefore, we replicate each dscp app
entry per-port.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>