Bongsu Jeon [Tue, 17 Nov 2020 08:08:24 +0000 (17:08 +0900)]
nfc: s3fwrn5: Remove the max_payload
max_payload is unused.
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Link: https://lore.kernel.org/r/20201117080824epcms2p36f70e06e2d8bd51d1af278b26ca65725@epcms2p3
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 19 Nov 2020 01:34:21 +0000 (17:34 -0800)]
Merge branch 's390-qeth-updates-2020-11-17'
Julian Wiedmann says:
====================
s390/qeth: updates 2020-11-17
This brings some cleanups, and a bunch of improvements for our
.get_link_ksettings() code.
====================
Link: https://lore.kernel.org/r/20201117161520.1089-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 17 Nov 2020 16:15:20 +0000 (17:15 +0100)]
s390/qeth: improve selection of ethtool link modes
The link mode is a combination of port speed and port mode. But we
currently only consider the speed, and then typically select the
corresponding TP-based link mode. For 1G and 10G Fibre links this means
we display the wrong link modes.
Move the SPEED_* switch statements inside the PORT_* cases, and only
consider valid combinations where we can select the corresponding
link mode. Add the relevant link modes (1000baseX, 10000baseSR and
10000baseLR) that were introduced back with
commit
5711a9822144 ("net: ethtool: add support for 1000BaseX and missing 10G link modes").
To differentiate between 10000baseSR and 10000baseLR, use the detailed
media_type information that QUERY OAT provides.
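The nesting described above (port type first, then speed, with the media type breaking the tie for 10G fibre) can be sketched in plain C. This is a minimal illustration, not the driver's code: every enum and function name below is a hypothetical stand-in for the kernel's PORT_*, SPEED_* and link mode constants, and the set of combinations shown is an assumption.

```c
/*
 * Hypothetical stand-ins for the kernel's PORT_*, SPEED_* and
 * ETHTOOL_LINK_MODE_* constants; names and values are illustrative only.
 */
enum sketch_port { SKETCH_PORT_TP, SKETCH_PORT_FIBRE, SKETCH_PORT_OTHER };
enum sketch_media { SKETCH_MEDIA_UNKNOWN, SKETCH_MEDIA_SR, SKETCH_MEDIA_LR };
enum sketch_link_mode {
	SKETCH_MODE_NONE,	/* no valid port/speed combination */
	SKETCH_MODE_1000baseT,
	SKETCH_MODE_10000baseT,
	SKETCH_MODE_1000baseX,
	SKETCH_MODE_10000baseSR,
	SKETCH_MODE_10000baseLR,
};

/* Pick a link mode from the port type first, then from the speed. */
static enum sketch_link_mode
sketch_select_link_mode(enum sketch_port port, int speed_mbps,
			enum sketch_media media)
{
	switch (port) {
	case SKETCH_PORT_TP:
		switch (speed_mbps) {
		case 1000:
			return SKETCH_MODE_1000baseT;
		case 10000:
			return SKETCH_MODE_10000baseT;
		}
		break;
	case SKETCH_PORT_FIBRE:
		switch (speed_mbps) {
		case 1000:
			return SKETCH_MODE_1000baseX;
		case 10000:
			/* SR and LR share port and speed; only the media differs. */
			if (media == SKETCH_MEDIA_SR)
				return SKETCH_MODE_10000baseSR;
			if (media == SKETCH_MEDIA_LR)
				return SKETCH_MODE_10000baseLR;
			break;
		}
		break;
	default:
		break;
	}

	/* Unknown or invalid combination: report nothing rather than guess. */
	return SKETCH_MODE_NONE;
}
```

With this shape, a 1G fibre link yields a 1000baseX mode instead of the TP-based mode that a speed-only lookup would select.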
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 17 Nov 2020 16:15:19 +0000 (17:15 +0100)]
s390/qeth: use QUERY OAT for initial link info
Improve the initial link info with data obtained from QUERY OAT.
Doing so _only_ at initialization time avoids
1. dealing with multi-part replies, and
2. sifting through all the data that may get returned at runtime.
This allows us to determine the correct port type for the 1000BT variant
of recent OSA adapter generations (where the .card_type field in
QUERY CARD INFO is no longer sufficient).
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 17 Nov 2020 16:15:18 +0000 (17:15 +0100)]
s390/qeth: clean up default cases for ethtool link mode
Remove the default case for PORT_* and SPEED_* in our ethtool code.
The only time these could be hit is if qeth_init_link_info() was unable
to determine the port type from an OSA adapter's link_type.
We already throw a message in this case, so reduce the noise and don't
report bad data (ie. it's much more likely that any future link_type
will represent a PORT_FIBRE link ...).
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 17 Nov 2020 16:15:17 +0000 (17:15 +0100)]
s390/qeth: set static link info during initialization
Hard-code the minimal link info at initialization time, after we
obtained the link_type. qeth_get_link_ksettings() can still override
this with more accurate data from QUERY CARD INFO later on.
Don't set arbitrary defaults for unknown OSA link types, they
certainly won't match any future type.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 17 Nov 2020 16:15:16 +0000 (17:15 +0100)]
s390/qeth: improve QUERY CARD INFO processing
Move all the HW reply data parsing into qeth_query_card_info_cb(), and
use common ethtool enums for transporting the information back to the
caller.
Also only look at the .port_speed field when we couldn't determine the
speed from the .card_type field, and introduce some 'default' cases for
SPEED_UNKNOWN, PORT_OTHER and DUPLEX_UNKNOWN.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 17 Nov 2020 16:15:15 +0000 (17:15 +0100)]
s390/qeth: tolerate error when querying card info
By the time that our .get_link_ksettings() code issues a QUERY CARD INFO
cmd to get link-related information, we already set up a good amount of
static link data.
Return this data when the cmd fails, same as when the cmd is not
supported.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kaixu Xia [Tue, 17 Nov 2020 16:15:14 +0000 (17:15 +0100)]
s390/qeth: remove useless if/else
Fix the following coccinelle report:
./drivers/s390/net/qeth_l3_main.c:107:2-4: WARNING: possible condition with no effect (if == else)
Both branches are the same since
commit
ab29c480b194 ("s390/qeth: replace deprecated simple_stroul()"),
so remove them.
Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
[jwi: point to the commit that introduced this]
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Julian Wiedmann [Tue, 17 Nov 2020 16:15:13 +0000 (17:15 +0100)]
s390/qeth: reduce rtnl locking for switchdev events
call_switchdev_notifiers() doesn't require holding the RTNL lock since
commit
ff5cf100110c ("net: switchdev: Change notifier chain to be atomic").
We still need it for the "lost event" slow path, to avoid racing against
a concurrent .ndo_bridge_setlink().
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Tue, 17 Nov 2020 20:34:09 +0000 (21:34 +0100)]
r8169: remove not needed check in rtl8169_start_xmit
In rtl_tx() the released descriptors are zero'ed by
rtl8169_unmap_tx_skb(). And in the beginning of rtl8169_start_xmit()
we check that enough descriptors are free, therefore there's no way
the DescOwn bit can be set here.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/6965d665-6c50-90c5-70e6-0bb335d4ea47@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Tue, 17 Nov 2020 20:25:42 +0000 (21:25 +0100)]
net: bridge: replace struct br_vlan_stats with pcpu_sw_netstats
Struct br_vlan_stats duplicates pcpu_sw_netstats (apart from
br_vlan_stats not defining an alignment requirement), therefore
switch to using the latter one.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/04d25c3d-c5f6-3611-6d37-c2f40243dae2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 19 Nov 2020 00:44:02 +0000 (16:44 -0800)]
Merge branch 'atm-replace-in_interrupt-usage'
Sebastian Andrzej Siewior says:
====================
atm: Replace in_interrupt usage
this mini series contains the removal of in_interrupt() in drivers/atm
====================
Link: https://lore.kernel.org/r/20201116162117.387191-1-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sebastian Andrzej Siewior [Mon, 16 Nov 2020 16:21:16 +0000 (17:21 +0100)]
atm: lanai: Remove in_interrupt() usage
lanai_shutdown_tx_vci() uses in_interrupt() to issue a warning message
if the function was used in a context in which it is not safe to sleep.
The usage of in_interrupt() in driver code is deprecated as it can not always
detect all states where it is not allowed to sleep.
msleep() has debug code which will trigger a warning if used in bad
context.
Remove in_interrupt().
Cc: Chas Williams <3chas3@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sebastian Andrzej Siewior [Mon, 16 Nov 2020 16:21:15 +0000 (17:21 +0100)]
atm: nicstar: Replace in_interrupt() usage
push_scqe() uses in_interrupt() to figure out if it is allowed to sleep.
The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.
Aside from that, in_interrupt() is not correct as it does not catch
preempt-disabled regions, which cannot sleep either.
ns_send() (the only caller of push_scqe()) has the following callers:
- vcc_sendmsg() used as proto_ops::sendmsg is expected to be invoked in
preemptible context.
-> vcc->dev->ops->send() (ns_send())
- atm_vcc::send via atmdev_ops::send either directly (pointer copied by
atm_init_aal34() or atm_init_aal5()) or via atm_send_aal0().
This is invoked by drivers (like br2684, clip, pppoatm, ...) which are
called from net_device_ops::ndo_start_xmit with BH disabled.
Add atmdev_ops::send_bh which is used by callers from BH context
(atm_send_aal*()) and if this callback is missing then ::send is used
instead.
Implement this callback in nicstar and use it to replace in_interrupt().
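The optional-callback fallback can be sketched as follows. This is a userspace illustration under stated assumptions: the struct layout, the sketch_* names and the demo callbacks are hypothetical, and only the "use send_bh if present, else send" rule comes from the description above.

```c
#include <stddef.h>

struct sketch_vcc;

/* Hypothetical, reduced stand-in for atmdev_ops. */
struct sketch_dev_ops {
	/* Invoked from preemptible context. */
	int (*send)(struct sketch_vcc *vcc, const void *buf);
	/* Optional: invoked from BH context, must not sleep. */
	int (*send_bh)(struct sketch_vcc *vcc, const void *buf);
};

struct sketch_vcc {
	const struct sketch_dev_ops *ops;
};

/* Path taken by callers that run with BH disabled. */
static int sketch_send_from_bh(struct sketch_vcc *vcc, const void *buf)
{
	if (vcc->ops->send_bh)
		return vcc->ops->send_bh(vcc, buf);

	/* Driver did not provide a BH-specific callback. */
	return vcc->ops->send(vcc, buf);
}

/* Demo callbacks: the return value records which one ran. */
static int sketch_demo_send(struct sketch_vcc *vcc, const void *buf)
{
	(void)vcc;
	(void)buf;
	return 1;
}

static int sketch_demo_send_bh(struct sketch_vcc *vcc, const void *buf)
{
	(void)vcc;
	(void)buf;
	return 2;
}
```

The point of the design is that the caller, which knows its context, selects the entry point, so the driver no longer has to guess with in_interrupt().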
Cc: Chas Williams <3chas3@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 18 Nov 2020 23:53:51 +0000 (15:53 -0800)]
Merge branch 'net-ipa-ipa-register-cleanup'
Alex Elder says:
====================
net: ipa: IPA register cleanup
This series consists of cleanup patches, almost entirely related to
the definitions for IPA registers. Some comments are updated or
added to provide better information about defined IPA registers.
Other cleanups ensure symbol names and their assigned values are
defined consistently. Some essentially duplicate definitions get
consolidated for simplicity. In a few cases some minor bugs
(missing definitions) are fixed. With these changes, all IPA
register offsets and associated field masks should be correct for
IPA versions 3.5.1, 4.0, 4.1, and 4.2.
====================
Link: https://lore.kernel.org/r/20201116233805.13775-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:38:05 +0000 (17:38 -0600)]
net: ipa: a few last IPA register cleanups
Some last cleanups for the existing IPA register definitions:
- Remove the definition of IPA_REG_ENABLED_PIPES_OFFSET, because
it is not used.
- Use "IPA_" instead of "BAM_" as the prefix on fields associated
with the FLAVOR_0 register. We use GSI (not BAM), but the
fields apply to both GSI and BAM.
- Get rid of the definition of IPA_CS_RSVD; it is never used.
- Add two missing field mask definitions for the INIT_DEAGGR
endpoint register.
- Eliminate a few of the defined sequencer types, because they
are unused. We can add them back when needed.
- Add a field mask to indicate which bit causes an interrupt on
the microcontroller.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:38:04 +0000 (17:38 -0600)]
net: ipa: move definition of enum ipa_irq_id
Move the definition of the ipa_irq_id enumerated type out of
"ipa_interrupt.h" and into "ipa_reg.h", and flesh out its set of
defined values. Each interrupt id indicates a particular type of
IPA interrupt that can be signaled. Their numeric values define bit
positions in the IPA_IRQ_* registers, so their definitions
should accompany the definition of those register offsets.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:38:03 +0000 (17:38 -0600)]
net: ipa: rearrange a few IPA register definitions
Move a few things around in "ipa_reg.h":
- Move the definition of ipa_reg_state_aggr_active_offset() down
a bit in the file so definitions are ordered by offset (for
the lowest supported IPA version) like all other definitions.
- Move the definition of TIMER_FREQUENCY to be immediately above
the definition of ipa_aggr_granularity_val() where it's used.
- Move each register field value enumerated type definition to
immediately follow the definitions of the register and field
it is associated with.
No code functionality is modified by this patch.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:38:02 +0000 (17:38 -0600)]
net: ipa: fix up IPA register comments
Revise or add comments in "ipa_reg.h" to provide more
information, and to improve clarity and consistency.
- Always provide a comment to define when a register or field is
supported (or not) for certain versions of IPA hardware.
- Try to be specific about *which* or *how many* definitions
a comment refers to.
- Move comments stating that ipa->available defines the valid
bits in various registers *above* the register offset
definition, to avoid some checkpatch.pl warnings.
No code is changed by this patch.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:38:01 +0000 (17:38 -0600)]
net: ipa: define enumerated types consistently
Consistently define numeric values for enumerated type members using
hexadecimal (rather than decimal) format values. Align the values
assigned in the same column in each file.
Only assign values where they really matter, for example don't
assign IPA_ENDPOINT_AP_MODEM_TX the value 0.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:38:00 +0000 (17:38 -0600)]
net: ipa: fix BCR register field definitions
The backward compatibility register field masks are defined using
single-bit masks defined with BIT(x) rather than GENMASK(x, x).
Change this one set of definitions to follow the GENMASK() pattern
used everywhere else. Add a few missing field definitions for this
register as well.
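That a single-bit GENMASK(x, x) yields the same value as BIT(x) can be checked with a small userspace sketch. The macros below are 32-bit re-definitions written for illustration (the kernel has its own BIT() and GENMASK()), and the field names and bit positions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace re-definitions for illustration only. */
#define SKETCH_BIT(n)		(UINT32_C(1) << (n))
#define SKETCH_GENMASK(h, l) \
	((UINT32_MAX << (l)) & (UINT32_MAX >> (31 - (h))))

/* Hypothetical field masks: one single-bit field, one multi-bit field. */
#define SKETCH_SINGLE_BIT_FMASK	SKETCH_GENMASK(3, 3)
#define SKETCH_MULTI_BIT_FMASK	SKETCH_GENMASK(7, 4)

/* Both spellings describe the same mask; only the notation differs. */
static_assert(SKETCH_SINGLE_BIT_FMASK == SKETCH_BIT(3),
	      "GENMASK(x, x) must equal BIT(x)");
```

Using one notation for every field keeps single-bit and multi-bit masks visually uniform, which is the consistency the change is after.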
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:37:59 +0000 (17:37 -0600)]
net: ipa: use _FMASK consistently
Several IPA register field masks are defined without the "_FMASK"
suffix naming convention. Rename these, so all field masks are
consistently named.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:37:58 +0000 (17:37 -0600)]
net: ipa: fix two inconsistent IPA register names
Rename two suspend IRQ registers so they follow the IPA_REG_IRQ_xxx
naming convention used elsewhere.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:37:57 +0000 (17:37 -0600)]
net: ipa: support more versions for HOLB timer
IPA version 3.5.1 represents the timer used in avoiding head-of-line
blocking with a simple tick count. IPA v4.2 changes that, instead
splitting the timer field into two parts (base and scale) to
represent the ticks in the timer period.
IPA v4.0 and IPA v4.1 use the same method as IPA v3.5.1. Change the
test in ipa_reg_init_hol_block_timer_val() so the result is correct
for those versions as well.
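A rough sketch of the version test follows. Only the split between "simple tick count" (v3.5.1, v4.0, v4.1) and "base and scale" (v4.2) is taken from the description above. The field widths, the bit layout, the truncating conversion and all sketch_* names are assumptions made for illustration, not the hardware's or the driver's definition.

```c
#include <stdint.h>

/* Hypothetical version enum; the driver has its own. */
enum sketch_ipa_version {
	SKETCH_IPA_VERSION_3_5_1,
	SKETCH_IPA_VERSION_4_0,
	SKETCH_IPA_VERSION_4_1,
	SKETCH_IPA_VERSION_4_2,
};

/* Assumed layout: base in the low bits, scale starting at bit 8. */
#define SKETCH_BASE_BITS	5
#define SKETCH_SCALE_BITS	4
#define SKETCH_SCALE_SHIFT	8
#define SKETCH_BASE_MAX		((UINT32_C(1) << SKETCH_BASE_BITS) - 1)
#define SKETCH_SCALE_MAX	((UINT32_C(1) << SKETCH_SCALE_BITS) - 1)

static uint32_t sketch_hol_block_timer_val(enum sketch_ipa_version version,
					   uint32_t ticks)
{
	uint32_t scale = 0;
	uint32_t base;

	/* Every version before v4.2 takes the tick count as-is. */
	if (version != SKETCH_IPA_VERSION_4_2)
		return ticks;

	/* Find the smallest scale for which (ticks >> scale) fits in base. */
	while ((ticks >> scale) > SKETCH_BASE_MAX && scale < SKETCH_SCALE_MAX)
		scale++;

	base = ticks >> scale;		/* truncates; no rounding here */
	if (base > SKETCH_BASE_MAX)
		base = SKETCH_BASE_MAX;	/* clamp to the largest period */

	return (scale << SKETCH_SCALE_SHIFT) | base;
}
```

Under these assumptions the represented period is base << scale, so a request for 100 ticks on v4.2 is stored as base 25 with scale 2.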
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:37:56 +0000 (17:37 -0600)]
net: ipa: make filter/routing hash enable register variable
For IPA v3.5.1, the IPA filter/routing hash enable register actually
does exist, but it is at offset 0x8c into the IPA register space.
For newer versions of IPA it is at offset 0x148.
Define a new inline function ipa_reg_filt_rout_hash_en_offset() to
return the appropriate value for a given version of IPA hardware.
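Such a helper might look roughly like the sketch below. The two offsets are the ones stated above; the enum and the sketch_ prefix are hypothetical, added so the example compiles on its own outside the kernel.

```c
#include <stdint.h>

/* Hypothetical version enum; the driver has its own. */
enum sketch_ipa_version {
	SKETCH_IPA_VERSION_3_5_1,
	SKETCH_IPA_VERSION_4_0,
	SKETCH_IPA_VERSION_4_1,
	SKETCH_IPA_VERSION_4_2,
};

/* Offset of the filter/routing hash enable register for a given version. */
static inline uint32_t
sketch_filt_rout_hash_en_offset(enum sketch_ipa_version version)
{
	if (version == SKETCH_IPA_VERSION_3_5_1)
		return 0x0000008c;

	return 0x00000148;
}
```

Callers then ask for the offset by version instead of using a fixed constant, so one code path serves all supported hardware.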
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Mon, 16 Nov 2020 23:37:55 +0000 (17:37 -0600)]
net: ipa: share field mask values for IPA hash registers
The IPA filter/routing hash enable register and filter/routing hash
flush register each have four single-bit fields representing the
four hashed tables to be enabled or flushed. The field positions
are identical, so just use a single set of field masks to represent
the fields for both registers.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xie He [Wed, 18 Nov 2020 12:42:26 +0000 (04:42 -0800)]
Documentation: Remove the deleted "framerelay" document from the index
commit
f73659192b0b ("net: wan: Delete the DLCI / SDLA drivers")
deleted "Documentation/networking/framerelay.rst". However, it is still
referenced in "Documentation/networking/index.rst". We need to remove the
reference, too.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Link: https://lore.kernel.org/r/20201118124226.15588-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 18 Nov 2020 19:51:21 +0000 (11:51 -0800)]
Merge branch 'mlxsw-preparations-for-nexthop-objects-support-part-2-2'
Ido Schimmel says:
====================
mlxsw: Preparations for nexthop objects support - part 2/2
This patch set contains the second round of preparations towards nexthop
objects support in mlxsw. Follow up patches can be found here [1].
The patches are mostly small and trivial and contain non-functional
changes aimed at making it easier to integrate nexthop objects with
mlxsw.
Patch #1 is a fix for an issue introduced in previous submission. Found
by Coverity.
[1] https://github.com/idosch/linux/tree/submit/nexthop_objects
====================
Link: https://lore.kernel.org/r/20201117174704.291990-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:47:04 +0000 (19:47 +0200)]
mlxsw: spectrum_router: Allow returning errors from mlxsw_sp_nexthop_group_refresh()
The function is responsible for allocating the adjacency entries used by
the nexthop group and populating them with the adjacency information
such as egress RIF and MAC address.
Allow the function to return an error when it encounters a problem and
have the relevant call sites check it.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:47:03 +0000 (19:47 +0200)]
mlxsw: spectrum_router: Add an indication if a nexthop group can be destroyed
Currently, a nexthop group is destroyed when the last FIB entry is
detached from it.
When nexthop objects are supported, this can no longer be the case, as
the group is a separate object whose lifetime is managed by user space.
Add an indication if a nexthop group can be destroyed and always set it
to true for the existing IPv4 and IPv6 nexthop groups.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:47:02 +0000 (19:47 +0200)]
mlxsw: spectrum_router: Only clear offload indication from valid IPv6 FIB info
When the IPv6 FIB info has a nexthop object, the nexthop offload
indication is set on the nexthop object and not on the FIB info itself.
Therefore, do not try to clear the offload indication from the FIB info
when it has a nexthop object.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:47:01 +0000 (19:47 +0200)]
mlxsw: spectrum_router: Re-order mlxsw_sp_nexthop6_group_get()
Attach the FIB entry to the nexthop group after setting the offload flag
on the IPv6 FIB info (i.e., 'struct fib6_info'). The second operation is
not needed when the nexthop group is a nexthop object. This will allow
us to have a common exit path from the function, regardless of the
nexthop group's type.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:47:00 +0000 (19:47 +0200)]
mlxsw: spectrum_router: Set FIB entry's type based on nexthop group
The previous patch associated a nexthop group with the FIB entry before
the entry's type is determined.
Make use of the nexthop group when determining the entry's type instead
of relying on helpers that assume that the nexthop info is not a nexthop
object (i.e., 'struct nexthop').
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:46:59 +0000 (19:46 +0200)]
mlxsw: spectrum_router: Set FIB entry's type after creating nexthop group
Each FIB entry has a type (e.g., remote, local) that determines how the
entry is programmed to the device. In order to determine if the entry is
local (directly connected) or remote (has a gateway) the relevant FIB
info structures (e.g., 'struct fib_info') are checked.
When entries that use nexthop objects are supported, these checks will
need to be changed to take into account 'struct nexthop'.
Instead, first associate the entry with a nexthop group so that the next
patch could determine the entry's type based on the associated nexthop
group's type.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:46:58 +0000 (19:46 +0200)]
mlxsw: spectrum_router: Pass ifindex to mlxsw_sp_ipip_entry_find_by_decap()
The sole caller of the function will soon only have the ifindex
available, instead of the pointer itself.
Therefore, change the function to take the ifindex as input and have it
get the pointer.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:46:57 +0000 (19:46 +0200)]
mlxsw: spectrum_router: Set ifindex for IPv4 nexthops
The ifindex of the nexthop device was never set for IPv4 nexthops,
unlike IPv6 nexthops. This went unnoticed since only IPv6 nexthops use
it.
Set the ifindex for IPv4 nexthops in order to be consistent with IPv6
and also because it will be used by a later patch.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 17 Nov 2020 17:46:56 +0000 (19:46 +0200)]
mlxsw: spectrum_router: Fix wrong kfree() in error path
The function allocates 'nhgi', not 'nh_grp', so it needs to free the
former in its error path.
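The rule behind the fix (an error path frees what the function itself allocated, not an object owned by the caller) can be shown with a userspace sketch. The struct and function names are hypothetical, malloc-family calls stand in for the kernel allocator, and the simulate_failure flag exists only to exercise the error path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-ins for the group and its configuration. */
struct sketch_nh_grp_info {
	int count;
};

struct sketch_nh_grp {
	struct sketch_nh_grp_info *nhgi;	/* owned by the group */
};

/* Allocates and attaches 'nhgi'; 'nh_grp' itself belongs to the caller. */
static int sketch_nhgi_init(struct sketch_nh_grp *nh_grp, int count,
			    bool simulate_failure)
{
	struct sketch_nh_grp_info *nhgi;

	nhgi = calloc(1, sizeof(*nhgi));
	if (!nhgi)
		return -ENOMEM;

	nh_grp->nhgi = nhgi;
	nhgi->count = count;

	if (simulate_failure)
		goto err_free;

	return 0;

err_free:
	/* Undo only this function's allocation. Freeing nh_grp here would
	 * leave the caller holding a dangling pointer.
	 */
	nh_grp->nhgi = NULL;
	free(nhgi);
	return -EINVAL;
}
```

After a failed call the caller can still use or release nh_grp safely, because nothing it owns has been freed behind its back.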
Fixes:
7f7a417e6a11 ("mlxsw: spectrum_router: Split nexthop group configuration to a different struct")
Addresses-Coverity: ("Memory - corruptions (USE_AFTER_FREE)")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ahmad Fatoum [Tue, 17 Nov 2020 21:38:26 +0000 (22:38 +0100)]
ptp: document struct ptp_clock_request members
It's arguable most people interested in configuring a PPS signal
want it as external output, not as kernel input. PTP_CLK_REQ_PPS
is for input though. Add documentation to nudge readers into
the correct direction.
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20201117213826.18235-1-a.fatoum@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gustavo A. R. Silva [Tue, 17 Nov 2020 17:13:47 +0000 (11:13 -0600)]
nfp: tls: Fix unreachable code issue
Fix the following unreachable code issue:
drivers/net/ethernet/netronome/nfp/crypto/tls.c: In function 'nfp_net_tls_add':
include/linux/compiler_attributes.h:208:41: warning: statement will never be executed [-Wswitch-unreachable]
208 | # define fallthrough __attribute__((__fallthrough__))
| ^~~~~~~~~~~~~
drivers/net/ethernet/netronome/nfp/crypto/tls.c:299:3: note: in expansion of macro 'fallthrough'
299 | fallthrough;
| ^~~~~~~~~~~
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Link: https://lore.kernel.org/r/20201117171347.GA27231@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 17 Nov 2020 22:15:05 +0000 (14:15 -0800)]
Merge branch 'fix-several-bad-kernel-doc-markups'
Mauro Carvalho Chehab says:
====================
Fix several bad kernel-doc markups
Kernel-doc has always been limited by a probably badly documented
rule:
The kernel-doc markups should appear *immediately before* the
function or data structure that it documents.
In other words, if a C file would contain something like this:
/**
 * foo - function foo
 * @args: foo args
 */
static inline void bar(int args);
/**
 * bar - function bar
 * @args: foo args
 */
static inline void foo(void *args);
The output (in ReST format) will be:
.. c:function:: void bar (int args)
function foo
**Parameters**
``int args``
foo args
.. c:function:: void foo (void *args)
function bar
**Parameters**
``void *args``
foo args
Which is clearly a wrong result. Before this changeset,
not even a warning is produced in such cases.
As placing such markups just before the documented
data is a common practice, in most cases this is fine.
However, as patches touch things, identifiers may be
renamed, and people may forget to update the kernel-doc
markups to follow such changes.
This has been happening for quite a while, as there are
lots of files with kernel-doc problems.
This series addresses those issues and adds a file at the
end that will enforce that the identifier matches the
kernel-doc markup, preventing this problem from
recurring as time goes by.
====================
Link: https://lore.kernel.org/r/cover.1605521731.git.mchehab+huawei@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mauro Carvalho Chehab [Mon, 16 Nov 2020 10:17:59 +0000 (11:17 +0100)]
net: core: fix some kernel-doc markups
Some identifiers have different names between their prototypes
and the kernel-doc markup.
In the specific case of netif_subqueue_stopped(), keep the
current markup for __netif_subqueue_stopped(), adding a
new one for netif_subqueue_stopped().
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mauro Carvalho Chehab [Mon, 16 Nov 2020 10:17:58 +0000 (11:17 +0100)]
net: datagram: fix some kernel-doc markups
Some identifiers have different names between their prototypes
and the kernel-doc markup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mauro Carvalho Chehab [Mon, 16 Nov 2020 10:17:57 +0000 (11:17 +0100)]
net: phy: fix kernel-doc markups
Some functions have different names between their prototypes
and the kernel-doc markup.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 17 Nov 2020 21:48:27 +0000 (13:48 -0800)]
Merge branch 'add-ethtool-ntuple-filters-support'
Naveen Mamindlapalli says:
====================
Add ethtool ntuple filters support
This patch series adds support for ethtool ntuple filters, unicast
address filtering, VLAN offload and SR-IOV ndo handlers. All of the
above features are based on the Admin Function(AF) driver support to
install and delete the low level MCAM entries. Each MCAM entry is
programmed with the packet fields to match and what actions to take
if the match succeeds. The PF driver requests AF driver to allocate
set of MCAM entries to be used to install the flows by that PF. The
entries will be freed when the PF driver is unloaded.
* Patches 1 to 4 add AF driver infrastructure to install and
delete the low level MCAM flow entries.
* Patch 5 adds ethtool ntuple filter support.
* Patch 6 adds unicast MAC address filtering.
* Patch 7 adds support for dumping the MCAM entries via debugfs.
* Patches 8 to 10 add support for VLAN offload.
* Patches 10 to 11 add support for SR-IOV ndo handlers.
* Patch 12 adds support to read the MCAM entries.
Misc:
* Removed redundant mailbox NIX_RXVLAN_ALLOC.
====================
Link: https://lore.kernel.org/r/20201114195303.25967-1-naveenm@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Subbaraya Sundeep [Sat, 14 Nov 2020 19:53:03 +0000 (01:23 +0530)]
octeontx2-af: Delete NIX_RXVLAN_ALLOC mailbox message
Since mailbox message for installing flows is in place,
remove the RXVLAN_ALLOC mbox message which is redundant.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Naveen Mamindlapalli [Sat, 14 Nov 2020 19:53:02 +0000 (01:23 +0530)]
octeontx2-af: Add new mbox messages to retrieve MCAM entries
This patch introduces new mailbox messages to retrieve a given
MCAM entry or base flow steering rule of a VF installed by its
parent PF. This helps while updating the existing MCAM rules
without re-framing the whole mailbox request again. The INSTALL
FLOW mailbox consumer can read-modify-write the existing entry.
Similarly while installing new flow rules for a VF, the base
flow steering rule match criteria is copied to the new flow rule
and the deltas are appended to the new rule.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Co-developed-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hariprasad Kelam [Sat, 14 Nov 2020 19:53:01 +0000 (01:23 +0530)]
octeontx2-af: Handle PF-VF mac address changes
This patch handles the VF mac address changes as given below.
1. mac addr configured by VF will be retained until VF module unload.
2. mac addr configured by PF for VF will be retained until power cycle.
3. mac addr configured by PF for its VF can't be overwritten by VF.
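The three rules can be modelled with a small state sketch. This is an illustration only: the struct, the function names, the use of -EPERM and the choice to clear the address on VF unload are assumptions, not the AF driver's mailbox handlers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN	6

/* Hypothetical per-VF state kept by the administrative function. */
struct sketch_vf_mac {
	uint8_t addr[SKETCH_ETH_ALEN];
	bool set_by_pf;		/* true once the PF assigned the address */
};

/* Request coming from the VF itself. */
static int sketch_vf_set_mac(struct sketch_vf_mac *st,
			     const uint8_t mac[SKETCH_ETH_ALEN])
{
	if (st->set_by_pf)
		return -EPERM;	/* rule 3: a PF-assigned address wins */

	memcpy(st->addr, mac, SKETCH_ETH_ALEN);
	return 0;
}

/* Request coming from the parent PF on behalf of its VF. */
static void sketch_pf_set_vf_mac(struct sketch_vf_mac *st,
				 const uint8_t mac[SKETCH_ETH_ALEN])
{
	memcpy(st->addr, mac, SKETCH_ETH_ALEN);
	st->set_by_pf = true;
}

/* VF module unload: forget only an address the VF chose itself. */
static void sketch_vf_unload(struct sketch_vf_mac *st)
{
	if (!st->set_by_pf)	/* rule 1 ends here; rule 2 keeps PF state */
		memset(st->addr, 0, SKETCH_ETH_ALEN);
}
```

In this model, only a reset of the whole state (standing in for a power cycle) would drop a PF-assigned address.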
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Naveen Mamindlapalli [Sat, 14 Nov 2020 19:53:00 +0000 (01:23 +0530)]
octeontx2-pf: Add support for SR-IOV management functions
This patch adds support for ndo_set_vf_mac, ndo_set_vf_vlan
and ndo_get_vf_config handlers. The traffic redirection
based on the VF mac address or vlan id is done by installing
MCAM rules. RX_VTAG_TYPE7 is reserved in each NIXLF for VF VLAN,
which strips the VLAN tag from ingress VLAN traffic. The NIX PF
allocates two MCAM entries for VF VLAN feature, one used for
ingress VTAG strip and another entry for egress VTAG insertion.
This patch also updates the MAC address in PF installed VF VLAN
rule upon receiving nix_lf_start_rx mbox request for VF since
Administrative Function driver will assign a valid MAC addr
in nix_lf_start_rx function.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Co-developed-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hariprasad Kelam [Sat, 14 Nov 2020 19:52:59 +0000 (01:22 +0530)]
octeontx2-pf: Implement ingress/egress VLAN offload
This patch implements egress VLAN offload by appending NIX_SEND_EXT_S
header to NIX_SEND_HDR_S. The VLAN TCI information is specified
in the NIX_SEND_EXT_S. The VLAN offload in the ingress path is
implemented by configuring the NIX_RX_VTAG_ACTION_S to strip and
capture the outer vlan fields. The NIX PF allocates one MCAM entry
for Rx VLAN offload.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vamsi Attunuru [Sat, 14 Nov 2020 19:52:58 +0000 (01:22 +0530)]
octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries
This patch modifies the existing nix_vtag_config mailbox message
to allocate and free TX VTAG entries as requested by a NIX PF.
The TX VTAG entries are a global resource shared by all PFs
and each entry specifies the size of VTAG to insert and the VTAG
header data to insert. The mailbox response contains the entry
index which is used by mailbox requester in configuring the
NPC_TX_VTAG_ACTION for any MCAM entry.
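The allocate/free flow described above can be pictured with a small
userspace sketch. Everything here is hypothetical and not taken from the
driver: the pool size, the struct layout, the restriction to 4- or 8-byte
tags, and the names vtag_pool, vtag_alloc and vtag_free. It only models a
shared pool that hands out an entry index on allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VTAG_POOL_SIZE 8            /* hypothetical pool size */

struct vtag_entry {
	bool     in_use;
	uint8_t  size;              /* assumed VTAG size in bytes: 4 or 8 */
	uint64_t data;              /* VTAG header data to insert */
};

struct vtag_pool {
	struct vtag_entry e[VTAG_POOL_SIZE];   /* shared by all requesters */
};

/* Allocate a free entry; return its index, or -1 if the pool is exhausted. */
static int vtag_alloc(struct vtag_pool *p, uint8_t size, uint64_t data)
{
	assert(size == 4 || size == 8);
	for (int i = 0; i < VTAG_POOL_SIZE; i++) {
		if (!p->e[i].in_use) {
			p->e[i] = (struct vtag_entry){ true, size, data };
			return i;
		}
	}
	return -1;
}

/* Free a previously allocated entry; return 0 on success, -1 on a bad index. */
static int vtag_free(struct vtag_pool *p, int idx)
{
	if (idx < 0 || idx >= VTAG_POOL_SIZE || !p->e[idx].in_use)
		return -1;
	p->e[idx].in_use = false;
	return 0;
}
```

The returned index plays the role of the entry index carried in the
mailbox response; a freed index becomes available to the next requester.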
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Subbaraya Sundeep [Sat, 14 Nov 2020 19:52:57 +0000 (01:22 +0530)]
octeontx2-af: Add debugfs entry to dump the MCAM rules
Add debugfs support to dump the MCAM rules installed using the
NPC_INSTALL_FLOW mbox message. The debugfs file displays the MCAM
entry, the counter (if any), the flow type and the counter hits.
Ethtool dumps only the ntuple flows related to the PF.
The debugfs file gives a system-wide view of the MCAM rules
installed by all the PFs.
Below is example output from reading the debugfs file:
~ # mount -t debugfs none /sys/kernel/debug
~ # cat /sys/kernel/debug/octeontx2/npc/mcam_rules
Installed by: PF1
direction: RX
mcam entry: 227
udp source port 23 mask 0xffff
Forward to: PF1 VF0
action: Direct to queue 0
enabled: yes
counter: 1
hits: 0
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hariprasad Kelam [Sat, 14 Nov 2020 19:52:56 +0000 (01:22 +0530)]
octeontx2-pf: Add support for unicast MAC address filtering
Add unicast MAC address filtering support using the install flow
message. A total of 8 MCAM entries are allocated for adding
unicast MAC filtering rules. If the MCAM allocation fails,
unicast filtering support will not be advertised.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Subbaraya Sundeep [Sat, 14 Nov 2020 19:52:55 +0000 (01:22 +0530)]
octeontx2-pf: Add support for ethtool ntuple filters
This patch adds support for adding and deleting ethtool ntuple
filters. Filters for ether, ipv4, ipv6, tcp, udp and sctp
are supported, as are masks. The supported actions
are drop and direct to a queue. Additionally, the FLOW_EXT
field vlan_tci and FLOW_MAC_EXT are supported.
The NIX PF allocates a total of 32 MCAM entries for
ethtool ntuple filters. The Administrative Function (AF)
installs/deletes the MCAM rules when the NIX PF sends a mailbox
message to install/delete the ntuple filters.
Ethtool ntuple filter support is restricted to PFs for now,
and a PF can install ntuple filters to direct traffic to its
VFs. Hence a separate callback is added for VFs to get/set the RSS
configuration.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Subbaraya Sundeep [Sat, 14 Nov 2020 19:52:54 +0000 (01:22 +0530)]
octeontx2-af: Add mbox messages to install and delete MCAM rules
Added new mailbox messages to install and delete MCAM rules.
These mailbox messages will be used for adding/deleting ethtool
n-tuple filters by NIX PF. The installed MCAM rules are stored
in a list that will be traversed later to delete the MCAM entries
when the interface is brought down or when PCIe FLR is received.
The delete mailbox supports deleting a single MCAM entry or range
of entries or all the MCAM entries owned by the pcifunc. Each MCAM
entry can be associated with a HW match stat entry if the mailbox
requester wants to check the hit count for debugging.
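The three delete modes (single entry, range, or everything owned by the
requester) can be sketched in userspace as follows. This is only an
illustration: the driver keeps its rules in a list, whereas this sketch
uses a flat array, and demo_rule and demo_delete_rules are hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_rule {
	bool     installed;
	uint16_t owner;      /* pcifunc that installed the rule */
};

/*
 * Delete rules owned by 'owner'. With all == true every entry is
 * considered; otherwise only entries in [start, end] are (start == end
 * deletes a single entry). Returns the number of rules deleted, or -1 if
 * the range is invalid (which includes all == true on an empty table).
 */
static int demo_delete_rules(struct demo_rule *tbl, int nr, uint16_t owner,
			     bool all, int start, int end)
{
	assert(tbl && nr >= 0);
	if (all) {
		start = 0;
		end = nr - 1;
	}
	if (start < 0 || end >= nr || start > end)
		return -1;

	int deleted = 0;
	for (int i = start; i <= end; i++) {
		/* Never touch entries that belong to another function. */
		if (tbl[i].installed && tbl[i].owner == owner) {
			tbl[i].installed = false;
			deleted++;
		}
	}
	return deleted;
}
```

The ownership check is the important part of the sketch: a range request
skips entries installed by a different pcifunc rather than deleting them.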
The default unicast DMAC match rule is now added using the install
flow API. The default unicast DMAC match entry installed by the
Administrative Function is saved and can be changed later by the
mailbox user to fit additional fields, or the default MCAM entry
rule action can be used for other flow rules installed later.
Modified rvu_mbox_handler_nix_lf_free mailbox to add a flag to
disable or delete the MCAM entries. The MCAM entries are disabled
when the interface is brought down and deleted in FLR handler.
The disabled MCAM entries will be re-enabled when the interface
is brought up again.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Subbaraya Sundeep [Sat, 14 Nov 2020 19:52:53 +0000 (01:22 +0530)]
octeontx2-af: Generate key field bit mask from KEX profile
The Key Extraction (KEX) profile decides how packet metadata such as
layer information and selected packet data bytes at each layer are
placed in the MCAM search key. This patch reads the configured KEX profile
parameters to find the bit position and bit mask of each field.
The information is used when SW programs the MCAM match data
to match a packet flow and take the appropriate action on the flow. This
patch also verifies that mandatory fields such as channel and DMAC
are not overwritten by the KEX configuration of other fields.
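A minimal sketch of the idea, assuming for simplicity that the whole key
fits in one 64-bit word (the real MCAM key is wider) and using
hypothetical names key_field, field_mask and fields_overlap: a field's
mask follows from its bit position and width, and two fields collide
when their masks share a bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one field inside a 64-bit key word. */
struct key_field {
	unsigned int bit_pos;   /* position of the field's least significant bit */
	unsigned int nr_bits;   /* field width in bits, 1..64 */
};

/* Build the bit mask covering the field. */
static uint64_t field_mask(struct key_field f)
{
	assert(f.nr_bits >= 1 && f.nr_bits <= 64 && f.bit_pos + f.nr_bits <= 64);
	/* Shifting a 64-bit value by 64 is undefined, so handle it separately. */
	uint64_t ones = (f.nr_bits == 64) ? ~0ULL : ((1ULL << f.nr_bits) - 1);
	return ones << f.bit_pos;
}

/* True when two fields share at least one key bit. */
static bool fields_overlap(struct key_field a, struct key_field b)
{
	return (field_mask(a) & field_mask(b)) != 0;
}
```

A check of this kind, applied between a mandatory field and every other
configured field, is one way to detect the overwrite case mentioned above.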
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Subbaraya Sundeep [Sat, 14 Nov 2020 19:52:52 +0000 (01:22 +0530)]
octeontx2-af: Verify MCAM entry channel and PF_FUNC
This patch adds support for verifying the channel number sent by the
mailbox requester before writing an MCAM entry for ingress packets.
Similarly, for egress packets, it verifies the PF_FUNC sent by the
mailbox user.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stanislaw Kardach [Sat, 14 Nov 2020 19:52:51 +0000 (01:22 +0530)]
octeontx2-af: Modify default KEX profile to extract TX packet fields
The current default Key Extraction(KEX) profile can only use RX
packet fields while generating the MCAM search key. The profile
can't be used for matching TX packet fields. This patch modifies
the default KEX profile to add support for extracting TX packet
fields into MCAM search key. Enabled Tx KPU packet parsing by
configuring TX PKIND in tx_parse_cfg.
Modified the KEX profile to extract 2 bytes of VLAN TCI from an
offset of 2 bytes from LB_PTR. The LB_PTR points to the byte offset
where the VLAN header starts. The NPC KPU parser profile has been
modified to point LB_PTR to the starting byte offset of the VLAN header,
that is, to the TPID field.
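The "2 bytes at an offset of 2 bytes from LB_PTR" rule can be illustrated
in userspace. In this sketch lb_ptr is simply a byte offset into a packet
buffer and vlan_tci_at is a hypothetical helper name; the hardware does
the equivalent extraction when it builds the search key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read the 16-bit VLAN TCI, given lb_ptr = byte offset of the VLAN header
 * (i.e. of its TPID field). The TCI follows the 2-byte TPID and is stored
 * in network (big-endian) byte order. Returns 0 on success, -1 if the
 * buffer is too short to hold TPID + TCI at that offset.
 */
static int vlan_tci_at(const uint8_t *pkt, size_t len, size_t lb_ptr,
		       uint16_t *tci)
{
	assert(pkt && tci);
	if (lb_ptr > len || len - lb_ptr < 4)
		return -1;
	*tci = (uint16_t)((pkt[lb_ptr + 2] << 8) | pkt[lb_ptr + 3]);
	return 0;
}
```

For an untagged-MAC-header-plus-VLAN frame, lb_ptr would be 12 (after the
two 6-byte MAC addresses), so the TCI is read from bytes 14 and 15.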
Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xie He [Sat, 14 Nov 2020 15:09:21 +0000 (07:09 -0800)]
net: wan: Delete the DLCI / SDLA drivers
The DLCI driver (dlci.c) implements the Frame Relay protocol. However,
we already have another newer and better implementation of Frame Relay
provided by the HDLC_FR driver (hdlc_fr.c).
The DLCI driver's implementation of Frame Relay is used by only one
hardware driver in the kernel - the SDLA driver (sdla.c).
The SDLA driver provides Frame Relay support for the Sangoma S50x devices.
However, the vendor provides their own driver (along with their own
multi-WAN-protocol implementations including Frame Relay), called WANPIPE.
I believe most users of the hardware would use the vendor-provided WANPIPE
driver instead.
(The WANPIPE driver was even once in the kernel, but was deleted in
commit 8db60bcf3021 ("[WAN]: Remove broken and unmaintained Sangoma
drivers.") because the vendor no longer updated the in-kernel WANPIPE
driver.)
Cc: Mike McLagan <mike.mclagan@linux.org>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Link: https://lore.kernel.org/r/20201114150921.685594-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 17 Nov 2020 19:39:22 +0000 (11:39 -0800)]
Merge branch 'net-hns3-updates-for-next'
Huazhong Tan says:
====================
net: hns3: updates for -next
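There are several updates relating to the interrupt coalesce for
the HNS3 ethernet driver.
#1 adds support for QL (quantity limiting, interrupt coalesce
based on the frame quantity).
#2 queries the maximum value of GL instead of using
a fixed value in code.
#3 adds support for 1us unit GL (gap limiting, interrupt coalesce
based on the gap time).
#4 renames gl_adapt_enable in struct hns3_enet_coalesce to fit
its new usage.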
change log:
V4 - remove #5~#10 from this series, which needs more discussion.
V3 - fix a typo error in #1 reported by Jakub Kicinski.
rewrite #9 commit log.
remove #11 from this series.
V2 - reorder #2 & #3 to fix compiler error.
fix some checkpatch warnings in #10 & #11.
previous version:
V3: https://patchwork.ozlabs.org/project/netdev/cover/1605151998-12633-1-git-send-email-tanhuazhong@huawei.com/
V2: https://patchwork.ozlabs.org/project/netdev/cover/1604892159-19990-1-git-send-email-tanhuazhong@huawei.com/
V1: https://patchwork.ozlabs.org/project/netdev/cover/1604730681-32559-1-git-send-email-tanhuazhong@huawei.com/
====================
Link: https://lore.kernel.org/r/1605514854-11205-1-git-send-email-tanhuazhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Huazhong Tan [Mon, 16 Nov 2020 08:20:54 +0000 (16:20 +0800)]
net: hns3: rename gl_adapt_enable in struct hns3_enet_coalesce
Besides GL(Gap Limiting), QL(Quantity Limiting) can be modified
dynamically when DIM is supported. So rename gl_adapt_enable as
adapt_enable in struct hns3_enet_coalesce.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Huazhong Tan [Mon, 16 Nov 2020 08:20:53 +0000 (16:20 +0800)]
net: hns3: add support for 1us unit GL configuration
For devices of version V3 and above, the GL
configuration can be set in units of 1us, so add support for
configuring this field.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Huazhong Tan [Mon, 16 Nov 2020 08:20:52 +0000 (16:20 +0800)]
net: hns3: add support for querying maximum value of GL
For maintainability and compatibility, add support for querying
the maximum value of GL.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Huazhong Tan [Mon, 16 Nov 2020 08:20:51 +0000 (16:20 +0800)]
net: hns3: add support for configuring interrupt quantity limiting
QL (quantity limiting) means that the hardware supports interrupt
coalescing based on the frame quantity. QL can be configured when
int_ql_max in the device's specification is non-zero, so add support
for configuring it. Also, rename two coalesce init functions to fit
their purpose.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 17 Nov 2020 19:37:13 +0000 (11:37 -0800)]
Merge branch 'net-phy-add-support-for-shared-interrupts-part-2'
Ioana Ciornei says:
====================
net: phy: add support for shared interrupts (part 2)
This patch set aims to actually add support for shared interrupts in
phylib and not only for multi-PHY devices. While we are at it,
streamline the interrupt handling in phylib.
For a bit of context, at the moment, there are multiple phy_driver ops
that deal with this subject:
- .config_intr() - Enable/disable the interrupt line.
- .ack_interrupt() - Should quiesce any interrupts that may have been
fired. It's also used by phylib in conjunction with .config_intr() to
clear any pending interrupts after the line was disabled, and before
it is going to be enabled.
- .did_interrupt() - Intended for multi-PHY devices with a shared IRQ
line and used by phylib to discern which PHY from the package was the
one that actually fired the interrupt.
- .handle_interrupt() - Completely overrides the default interrupt
handling logic from phylib. The PHY driver is responsible for checking
whether any interrupt was fired by the respective PHY and for deciding
accordingly whether it's the one that should trigger the link state machine.
From my point of view, the interrupt handling in phylib has become
somewhat confusing with all these callbacks that actually read the same
PHY register - the interrupt status. A more streamlined approach would
be to just move the responsibility to write an interrupt handler to the
driver (as any other device driver does) and make .handle_interrupt()
the only way to deal with interrupts.
Another advantage with this approach would be that phylib would gain
support for shared IRQs between different PHYs (not just multi-PHY
devices), something which at the moment would require extending every
PHY driver anyway in order to implement their .did_interrupt() callback
and duplicate the same logic as in .ack_interrupt(). The disadvantage
of making .did_interrupt() mandatory would be that we are slightly
changing the semantics of the phylib API and that would increase
confusion instead of reducing it.
What I am proposing is the following:
- As a first step, make the .ack_interrupt() callback optional so that
we do not break any PHY driver amid the transition.
- Every PHY driver gains a .handle_interrupt() implementation that, for
the most part, would look like below:
irq_status = phy_read(phydev, INTR_STATUS);
if (irq_status < 0) {
phy_error(phydev);
return IRQ_NONE;
}
if (!(irq_status & irq_mask))
return IRQ_NONE;
phy_trigger_machine(phydev);
return IRQ_HANDLED;
- Remove each PHY driver's implementation of the .ack_interrupt() by
actually taking care of quiescing any pending interrupts before
enabling/after disabling the interrupt line.
- Finally, after all drivers have been ported, remove the
.ack_interrupt() and .did_interrupt() callbacks from phy_driver.
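To make the control flow of the proposed handler easier to follow, here is
a userspace model of it. This is not kernel code: struct mock_phy and
mock_handle_interrupt are hypothetical, and the register read, phy_error()
and phy_trigger_machine() are replaced by plain fields so the three
outcomes can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the kernel's return value names for readability only. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

struct mock_phy {
	int  intr_status;     /* what a status-register read returns; < 0 = bus error */
	bool error;           /* set where the driver would call phy_error() */
	bool machine_kicked;  /* set where the driver would call phy_trigger_machine() */
};

static enum irqreturn mock_handle_interrupt(struct mock_phy *phy, int irq_mask)
{
	assert(phy);
	int irq_status = phy->intr_status;   /* stands in for phy_read() */

	if (irq_status < 0) {
		phy->error = true;
		return IRQ_NONE;
	}

	/* No interesting bit set: some other device on the shared line fired. */
	if (!(irq_status & irq_mask))
		return IRQ_NONE;

	phy->machine_kicked = true;
	return IRQ_HANDLED;
}
```

Returning IRQ_NONE when no masked status bit is set is what lets several
devices share one line: each handler claims only its own interrupts.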
This patch set is part 2 of the entire change set and it addresses the
changes needed in 9 PHY drivers. The rest can be found on my Github
branch here:
https://github.com/IoanaCiornei/linux/commits/phylib-shared-irq
====================
Link: https://lore.kernel.org/r/20201113165226.561153-1-ciorneiioana@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:26 +0000 (18:52 +0200)]
net: phy: adin: remove the use of the .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:25 +0000 (18:52 +0200)]
net: phy: adin: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:24 +0000 (18:52 +0200)]
net: phy: ste10Xp: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:23 +0000 (18:52 +0200)]
net: phy: ste10Xp: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:22 +0000 (18:52 +0200)]
net: phy: smsc: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Cc: Andre Edich <andre.edich@microchip.com>
Cc: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:21 +0000 (18:52 +0200)]
net: phy: smsc: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Cc: Andre Edich <andre.edich@microchip.com>
Cc: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:20 +0000 (18:52 +0200)]
net: phy: amd: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:19 +0000 (18:52 +0200)]
net: phy: amd: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:18 +0000 (18:52 +0200)]
net: phy: nxp-tja11xx: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Cc: Marek Vasut <marex@denx.de>
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:17 +0000 (18:52 +0200)]
net: phy: nxp-tja11xx: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Cc: Marek Vasut <marex@denx.de>
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:16 +0000 (18:52 +0200)]
net: phy: lxt: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:15 +0000 (18:52 +0200)]
net: phy: lxt: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:14 +0000 (18:52 +0200)]
net: phy: marvell: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Cc: Maxim Kochetkov <fido_max@inbox.ru>
Cc: Baruch Siach <baruch@tkos.co.il>
Cc: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:13 +0000 (18:52 +0200)]
net: phy: marvell: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Cc: Maxim Kochetkov <fido_max@inbox.ru>
Cc: Baruch Siach <baruch@tkos.co.il>
Cc: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:12 +0000 (18:52 +0200)]
net: phy: microchip: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Cc: Nisar Sayed <Nisar.Sayed@microchip.com>
Cc: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:11 +0000 (18:52 +0200)]
net: phy: microchip: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Cc: Nisar Sayed <Nisar.Sayed@microchip.com>
Cc: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:10 +0000 (18:52 +0200)]
net: phy: vitesse: remove the use of .ack_interrupt()
In preparation for removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt) in the 2 places where it is
called (phy_enable_interrupts and phy_disable_interrupts) with
equivalent functionality.
This means that clearing interrupts now becomes something that the PHY
driver is responsible for doing, before enabling interrupts and after
disabling them. Make this driver follow the new contract.
Cc: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Fri, 13 Nov 2020 16:52:09 +0000 (18:52 +0200)]
net: phy: vitesse: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt()) is confusing, so let the PHY
driver directly implement an IRQ handler like any other device driver.
Make this driver follow the new convention.
Cc: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Randy Dunlap [Mon, 16 Nov 2020 21:21:08 +0000 (13:21 -0800)]
net: linux/skbuff.h: combine SKB_EXTENSIONS + KCOV handling
The previous Kconfig patch led to some other build errors as
reported by the 0day bot and my own overnight build testing.
These are all in <linux/skbuff.h> when KCOV is enabled but
SKB_EXTENSIONS is not enabled, so fix those by combining those conditions
in the header file.
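The pattern of guarding helpers on both options at once, with no-op stubs
otherwise, can be sketched in plain C. DEMO_KCOV and DEMO_SKB_EXTENSIONS
are hypothetical stand-ins for the Kconfig symbols, and demo_skb and the
two helpers are illustrative names, not the contents of <linux/skbuff.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two Kconfig options; set to 1 to enable. */
#ifndef DEMO_KCOV
#define DEMO_KCOV 1
#endif
#ifndef DEMO_SKB_EXTENSIONS
#define DEMO_SKB_EXTENSIONS 0
#endif

struct demo_skb {
#if DEMO_SKB_EXTENSIONS
	uint64_t kcov_handle;   /* storage exists only with extensions enabled */
#endif
	int len;
};

#if DEMO_KCOV && DEMO_SKB_EXTENSIONS
/* Real helpers: both the feature and its storage are available. */
static void demo_set_kcov_handle(struct demo_skb *skb, uint64_t handle)
{
	assert(skb);
	skb->kcov_handle = handle;
}

static uint64_t demo_get_kcov_handle(const struct demo_skb *skb)
{
	assert(skb);
	return skb->kcov_handle;
}
#else
/* Stubs: compile everywhere and never touch the missing field. */
static void demo_set_kcov_handle(struct demo_skb *skb, uint64_t handle)
{
	assert(skb);
	(void)handle;
}

static uint64_t demo_get_kcov_handle(const struct demo_skb *skb)
{
	assert(skb);
	return 0;
}
#endif
```

With the defaults above (KCOV on, extensions off) the stubs are selected,
which is the configuration that failed to build when only one condition
was tested.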
Fixes: 6370cc3bbd8a ("net: add kcov handle to skb extensions")
Fixes: 85ce50d337d1 ("net: kcov: don't select SKB_EXTENSIONS when there is no NET")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20201116212108.32465-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sven Van Asbroeck [Mon, 16 Nov 2020 17:01:55 +0000 (12:01 -0500)]
lan743x: replace devicetree phy parse code with library function
The code in this driver which parses the devicetree to determine
the phy/fixed link setup, can be replaced by a single library
function: of_phy_get_and_connect().
Behaviour is identical, except that the library function will
complain when 'phy-connection-type' is omitted, instead of
blindly using PHY_INTERFACE_MODE_NA, which would result in an
invalid phy configuration.
The library function no longer brings out the exact phy_mode,
but the driver doesn't need this, because phy_interface_is_rgmii()
queries the phydev directly. Remove 'phy_mode' from the private
adapter struct.
While we're here, log info about the attached phy on connect.
This is useful because the phy type and connection method are now
fully configurable via the devicetree.
Tested on a lan7430 chip with built-in phy. Verified that adding
fixed-link/phy-connection-type in the devicetree results in a
fixed-link setup. Used ethtool to verify that the devicetree
settings are used.
Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201116170155.26967-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Sun, 15 Nov 2020 15:03:10 +0000 (16:03 +0100)]
net: phy: don't duplicate driver name in phy_attached_print
Currently we print the driver name twice in phy_attached_print():
- phy_dev_info() prints it as part of the device info
- and we print it as part of the info string
This is a little bit ugly, it makes the info harder to read,
especially if the driver name is a little bit longer.
Therefore omit the driver name (if set) in the info string.
Example from r8169 that uses phylib:
old: Generic FE-GE Realtek PHY r8169-300:00: attached PHY driver \
[Generic FE-GE Realtek PHY] (mii_bus:phy_addr=r8169-300:00, irq=IGNORE)
new: Generic FE-GE Realtek PHY r8169-300:00: attached PHY driver \
(mii_bus:phy_addr=r8169-300:00, irq=IGNORE)
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/8ab72586-f079-41d8-84ee-9f6a5bd97b2a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Mon, 16 Nov 2020 16:03:14 +0000 (17:03 +0100)]
r8169: remove nr_frags argument from rtl_tx_slots_avail
The only time when nr_frags isn't MAX_SKB_FRAGS is when entering
rtl8169_start_xmit(). However, we can use MAX_SKB_FRAGS here as well,
because when the queue isn't stopped there should always be room for
MAX_SKB_FRAGS + 1 descriptors.
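The resulting check can be modelled in userspace as below. The ring size,
the fragment limit and all demo_ names are hypothetical; the sketch only
shows why a fixed worst case makes the nr_frags argument unnecessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_TX_DESC   256u   /* hypothetical ring size */
#define DEMO_MAX_SKB_FRAGS 17u    /* hypothetical; the kernel value is config dependent */

struct demo_ring {
	uint32_t cur_tx;    /* next descriptor the driver will fill (free-running) */
	uint32_t dirty_tx;  /* oldest descriptor not yet reclaimed (free-running) */
};

/* True if a worst-case packet (max fragments + 1 head descriptor) still fits. */
static bool demo_tx_slots_avail(const struct demo_ring *r)
{
	/* Unsigned subtraction stays correct when the counters wrap around. */
	uint32_t in_flight = r->cur_tx - r->dirty_tx;

	assert(in_flight <= DEMO_NUM_TX_DESC);
	uint32_t slots_avail = DEMO_NUM_TX_DESC - in_flight;

	/* A packet with n fragments needs n + 1 descriptors. */
	return slots_avail > DEMO_MAX_SKB_FRAGS;
}
```

Because the queue is stopped whenever this returns false, any caller that
finds the queue running can rely on the worst-case room being available.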
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/3d1f2ad7-31d5-2cac-4f4a-394f8a3cab63@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kernel test robot [Mon, 16 Nov 2020 15:34:44 +0000 (16:34 +0100)]
net: phy: mscc: fix excluded_middle.cocci warnings
Condition !A || A && B is equivalent to !A || B.
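The equivalence can be confirmed by checking all four truth assignments.
The function names in this small sketch are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* The condition as originally written. */
static bool cond_original(bool a, bool b)
{
	return !a || (a && b);
}

/* The simplified condition. */
static bool cond_simplified(bool a, bool b)
{
	return !a || b;
}

/* Exhaustively compare both forms; assert() aborts on the first mismatch. */
static void check_equivalence(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			assert(cond_original(a, b) == cond_simplified(a, b));
}
```

The inner "A &&" is redundant because the right-hand side of "||" is only
evaluated when !A is false, that is, when A is already known to be true.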
Generated by: scripts/coccinelle/misc/excluded_middle.cocci
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2011161633240.2682@hadrien
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 17 Nov 2020 17:16:13 +0000 (09:16 -0800)]
Merge branch 'net-dsa-tag_dsa-unify-regular-and-ethertype-dsa-taggers'
Tobias Waldekranz says:
====================
net: dsa: tag_dsa: Unify regular and ethertype DSA taggers
The first patch ports tag_edsa.c's handling of IGMP/MLD traps to
tag_dsa.c. That way, we start from two logically equivalent taggers
that are then merged. The second commit does the heavy lifting of
actually fusing tag_dsa.c and tag_edsa.c. The final one just follows
up with some clean up of existing comments.
v2 -> v3:
- Add the first patch described above as suggested by Andrew.
- Better documentation of TO_SNIFFER and FORWARD tags.
- Spelling.
v1 -> v2:
- Fixed some grammar and whitespace errors.
- Removed unnecessary default value in Kconfig.
- Removed unnecessary #ifdef.
- Split out comment fixes from functional changes.
- Fully document enum dsa_code.
====================
Link: https://lore.kernel.org/r/20201114234558.31203-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tobias Waldekranz [Sat, 14 Nov 2020 23:45:58 +0000 (00:45 +0100)]
net: dsa: tag_dsa: Use a consistent comment style
Use a consistent style of one-line/multi-line comments throughout the
file.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tobias Waldekranz [Sat, 14 Nov 2020 23:45:57 +0000 (00:45 +0100)]
net: dsa: tag_dsa: Unify regular and ethertype DSA taggers
Ethertype DSA encodes exactly the same information in the DSA tag as
the non-ethertype variety. So refactor out the common parts and reuse
them for both protocols.
This ensures tag parsing and generation are always consistent across
all mv88e6xxx chips.
While we are at it, explicitly deal with all possible CPU codes on
receive, making sure to set offload_fwd_mark as appropriate.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tobias Waldekranz [Sat, 14 Nov 2020 23:45:56 +0000 (00:45 +0100)]
net: dsa: tag_dsa: Allow forwarding of redirected IGMP traffic
When receiving an IGMP/MLD frame with a TO_CPU tag, the switch has not
performed any forwarding of it. This means that we should not set the
offload_fwd_mark on the skb, in case a software bridge wants it
forwarded.
This is a port of:
1ed9ec9b08ad ("dsa: Allow forwarding of redirected IGMP traffic")
Which corrected the issue for chips using EDSA tags, but not for those
using regular DSA tags.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
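The receive-side decision described in the IGMP commit above can be sketched as a small predicate. The enum names and values here are hypothetical and do not reflect the hardware tag encoding. Treating every other combination as "already forwarded" is an assumption made to keep the example small.

```c
#include <stdbool.h>

/* Hypothetical names; values are illustrative, not the on-wire encoding. */
enum frame_cmd {
	CMD_TO_CPU,
	CMD_FORWARD,
};

enum cpu_code {
	CODE_IGMP_MLD_TRAP,
	CODE_OTHER,
};

/* Returns true when the skb should carry offload_fwd_mark, i.e. when the
 * switch is assumed to have forwarded the frame in hardware already. */
static bool set_offload_fwd_mark(enum frame_cmd cmd, enum cpu_code code)
{
	/* A trapped IGMP/MLD frame was not forwarded by the switch, so a
	 * software bridge must remain free to forward it. */
	if (cmd == CMD_TO_CPU && code == CODE_IGMP_MLD_TRAP)
		return false;

	/* Assumption for this sketch: everything else is marked. */
	return true;
}
```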
Jakub Kicinski [Mon, 16 Nov 2020 18:46:10 +0000 (10:46 -0800)]
Merge branch 'mptcp-improve-multiple-xmit-streams-support'
Paolo Abeni says:
====================
mptcp: improve multiple xmit streams support
This series improves MPTCP handling of multiple concurrent
xmit streams.
The to-be-transmitted data is enqueued to a subflow only when
the send window is open, keeping the subflows' xmit queues shorter
and allowing for faster switch-over.
The above requires a more accurate msk socket state tracking
and some additional infrastructure to allow pushing the data
pending in the msk xmit queue as soon as the MPTCP's send window
opens (patches 6-10).
As a side effect, the MPTCP socket could enqueue data to subflows
after close() time, to completely spool the data sitting in the
msk xmit queue. Dealing with that requires some infrastructure and
core TCP changes (patches 1-5).
Finally, patches 11-12 introduce a more accurate tracking of the other
end's receive window.
Overall this refactors the MPTCP xmit path without introducing
new features; the new code is covered by the existing self-tests.
v2 -> v3:
- rebased,
- fixed checkpatch issue in patch 1/13
- fixed some state tracking issues in patch 8/13
v1 -> v2:
- this is just a repost, to cope with patchwork issues, no changes
at all
====================
Link: https://lore.kernel.org/r/cover.1605458224.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 16 Nov 2020 09:48:14 +0000 (10:48 +0100)]
mptcp: send explicit ack on delayed ack_seq incr
When the worker moves some bytes from the OoO queue into
the receive queue, msk->ack_seq is updated, but the MPTCP-level
ack carrying that value has to wait for the next ingress packet,
possibly slowing down or hanging the peer.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Florian Westphal [Mon, 16 Nov 2020 09:48:13 +0000 (10:48 +0100)]
mptcp: keep track of advertised windows right edge
Before sending 'x' new bytes also check that the new snd_una would
be within the permitted receive window.
For every ACK that also contains a DSS ack, check whether its tcp-level
receive window would advance the current mptcp window right edge and
update it if so.
Signed-off-by: Florian Westphal <fw@strlen.de>
Co-developed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
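The right-edge tracking described in the commit above can be modelled in a few lines. This is a sketch under stated assumptions: struct msk_window and its helpers are hypothetical, and next_seq stands in for whichever send-side sequence number the real code compares against the window.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the MPTCP-level send window. */
struct msk_window {
	uint64_t next_seq;  /* next MPTCP-level sequence number to send */
	uint64_t wnd_end;   /* right edge of the peer's advertised window */
};

/* 64-bit sequence comparison that tolerates wraparound. */
static bool seq_after(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) > 0;
}

/* For an ACK carrying a DSS ack: ack_seq is the MPTCP-level ack and
 * win is the TCP-level receive window. The edge only ever advances. */
static void window_update(struct msk_window *w, uint64_t ack_seq, uint32_t win)
{
	uint64_t new_end = ack_seq + win;

	if (seq_after(new_end, w->wnd_end))
		w->wnd_end = new_end;
}

/* Sending len new bytes is allowed only if they fit below the edge. */
static bool may_send(const struct msk_window *w, uint32_t len)
{
	return !seq_after(w->next_seq + len, w->wnd_end);
}
```

Keeping only the furthest right edge seen so far means a late or reordered ACK with a smaller window cannot shrink what the sender already believes it may transmit.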
Florian Westphal [Mon, 16 Nov 2020 09:48:12 +0000 (10:48 +0100)]
mptcp: rework poll+nospace handling
MPTCP maintains a status bit, MPTCP_SEND_SPACE, that is set when at
least one subflow and the mptcp socket itself are writeable.
mptcp_poll returns EPOLLOUT if the bit is set.
mptcp_sendmsg makes sure MPTCP_SEND_SPACE gets cleared when the last
write has used up all subflows or the mptcp socket wmem.
This reworks nospace handling as follows:
MPTCP_SEND_SPACE is replaced with MPTCP_NOSPACE, i.e. inverted meaning.
This bit is set when the mptcp socket is not writeable.
The mptcp-level ack path will then schedule the mptcp worker
to allow it to free already-acked data (and reduce wmem usage).
This will then wake userspace processes that wait for a POLLOUT event.
sendmsg will set MPTCP_NOSPACE only when it has to wait for more
wmem (blocking I/O case).
The poll path will set MPTCP_NOSPACE in case the mptcp socket is
not writeable.
Normal tcp-level notification (SOCK_NOSPACE) is only enabled
in case the subflow socket has no available wmem.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
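The inverted-bit scheme from the poll+nospace commit above can be illustrated with a small state machine. Everything here is a hypothetical user-space model: struct msk_state, EV_OUT and the helper names are assumptions, and the wmem accounting is reduced to two counters.

```c
#include <stdbool.h>

#define EV_OUT 0x4u  /* illustrative stand-in for EPOLLOUT */

/* Hypothetical model of the msk write-space state. */
struct msk_state {
	bool nospace;           /* stands in for the MPTCP_NOSPACE bit */
	bool worker_scheduled;  /* worker has been asked to run */
	unsigned int wmem_used;
	unsigned int wmem_limit;
};

static bool msk_writeable(const struct msk_state *m)
{
	return m->wmem_used < m->wmem_limit;
}

/* Poll path: report writeability, or record that someone is waiting. */
static unsigned int msk_poll(struct msk_state *m)
{
	if (msk_writeable(m))
		return EV_OUT;
	m->nospace = true;
	return 0;
}

/* MPTCP-level ack path: only bother the worker if a waiter exists. */
static void msk_ack_received(struct msk_state *m)
{
	if (m->nospace)
		m->worker_scheduled = true;
}

/* Worker: free acked data; returns true when waiters should be woken. */
static bool msk_worker(struct msk_state *m, unsigned int acked)
{
	m->worker_scheduled = false;
	if (acked > m->wmem_used)
		acked = m->wmem_used;
	m->wmem_used -= acked;

	if (m->nospace && msk_writeable(m)) {
		m->nospace = false;
		return true;
	}
	return false;
}
```

With the bit meaning "a writer is waiting" rather than "space is available", the ack path does no extra work in the common case where nobody is blocked.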
Paolo Abeni [Mon, 16 Nov 2020 09:48:11 +0000 (10:48 +0100)]
mptcp: try to push pending data on snd una updates
After the previous patch we may end up with unsent data
in the write buffer. If that buffer is full, the writer
will block indefinitely.
We need to trigger the MPTCP xmit path even for the
subflow rx path, on MPTCP snd_una updates.
Keep things simple and just schedule the work queue if
needed.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 16 Nov 2020 09:48:10 +0000 (10:48 +0100)]
mptcp: move page frag allocation in mptcp_sendmsg()
mptcp_sendmsg() is refactored so that first it copies
the data provided from user space into the send queue,
and then tries to spool the send queue via sendmsg_frag.
There is a subtle change in the MPTCP-level collapsing of
consecutive data fragments: we now allow that only on unsent
data.
The latter doesn't need to deal with msghdr data anymore
and can be simplified considerably.
snd_nxt and write_seq are now tracked independently.
Overall this allows some relevant cleanup and will
allow sending pending MPTCP data on msk una update in
a later patch.
Co-developed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
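The split between copying data in and spooling it out, as described in the commit above, can be sketched with two counters. The struct and helper names are hypothetical; only write_seq and snd_nxt come from the commit message, and the "budget" parameter is an assumption standing in for whatever the subflows can currently accept.

```c
#include <stdint.h>

/* Hypothetical model of the msk send-side sequence tracking. */
struct msk_seq {
	uint64_t write_seq;  /* end of the data copied in from user space */
	uint64_t snd_nxt;    /* next sequence number to hand to a subflow */
};

/* Step 1: copy user data into the send queue. */
static void msk_enqueue(struct msk_seq *s, uint32_t len)
{
	s->write_seq += len;
}

/* Data queued but not yet handed to any subflow. */
static uint64_t msk_pending(const struct msk_seq *s)
{
	return s->write_seq - s->snd_nxt;
}

/* Step 2: spool as much pending data as the budget allows.
 * Returns the number of bytes pushed. */
static uint32_t msk_push(struct msk_seq *s, uint32_t budget)
{
	uint64_t pending = msk_pending(s);
	uint32_t sent = pending < budget ? (uint32_t)pending : budget;

	s->snd_nxt += sent;
	return sent;
}
```

Because the two counters move independently, msk_push() can be called again later, for example when the window opens, without any user-space data being involved.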
Paolo Abeni [Mon, 16 Nov 2020 09:48:09 +0000 (10:48 +0100)]
mptcp: refactor shutdown and close
We must not close the subflows before all the MPTCP-level
data, including the DATA_FIN, has been acked at the MPTCP
level, otherwise we could be unable to retransmit as needed.
__mptcp_wr_shutdown() is responsible for checking for the
correct status and closing all subflows. It is called by the output
path after spooling any data and at shutdown/close time.
In a similar way, __mptcp_destroy_sock() is responsible for cleaning
up the MPTCP-level status, and is called when the msk transitions
to TCP_CLOSE.
The protocol-level close() no longer forces the TCP_CLOSE
status, but orphans the msk socket and all the subflows.
Orphaned msk sockets are forcibly closed after a timeout or
when all MPTCP-level data is acked.
There is a caveat about keeping the orphaned subflows around:
the TCP stack can asynchronously call tcp_cleanup_ulp() on them via
tcp_close(). To prevent accessing freed memory on later MPTCP-level
operations, the msk acquires a reference to each subflow
socket and prevents subflow_ulp_release() from releasing the
subflow context before __mptcp_destroy_sock().
The additional subflow references are released by __mptcp_done(),
and the async ULP release is detected by checking ULP ops. If that
field has already been cleared by the ULP release path, the
dangling context is freed directly by __mptcp_done().
Co-developed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 16 Nov 2020 09:48:08 +0000 (10:48 +0100)]
mptcp: introduce MPTCP snd_nxt
Track the next MPTCP sequence number used on xmit,
currently always equal to write_seq.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>