Vladimir Oltean [Tue, 8 Jun 2021 11:16:51 +0000 (14:16 +0300)]
net: dsa: felix: set TX flow control according to the phylink_mac_link_up resolution
Instead of relying on the static initialization done by ocelot_init_port()
which enables flow control unconditionally, set SYS_PAUSE_CFG_PAUSE_ENA
according to the parameters negotiated by the PHY.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Tue, 8 Jun 2021 13:30:07 +0000 (13:30 +0000)]
net: x25: Use list_for_each_entry() to simplify code in x25_forward.c
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Tue, 8 Jun 2021 13:29:08 +0000 (13:29 +0000)]
ethernet/qlogic: Use list_for_each_entry() to simplify code in qlcnic_hw.c
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 8 Jun 2021 13:10:37 +0000 (16:10 +0300)]
net: stmmac: fix NPD with phylink_set_pcs if there is no MDIO bus
priv->plat->mdio_bus_data is optional, some platforms may not set it,
however we proceed to look straight at priv->plat->mdio_bus_data->has_xpcs.
Since the xpcs is instantiated based on the has_xpcs property, we can
avoid looking at the priv->plat->mdio_bus_data structure altogether and
just check for the presence of the xpcs pointer.
Fixes: 11059740e616 ("net: pcs: xpcs: convert to phylink_pcs_ops")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Tue, 8 Jun 2021 08:13:01 +0000 (08:13 +0000)]
net: lapb: Use list_for_each_entry() to simplify code in lapb_iface.c
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Tue, 8 Jun 2021 08:05:05 +0000 (08:05 +0000)]
net: x25: Use list_for_each_entry() to simplify code in x25_link.c
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Tue, 8 Jun 2021 07:57:37 +0000 (07:57 +0000)]
net: qede: Use list_for_each_entry() to simplify code
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 21:43:31 +0000 (14:43 -0700)]
Merge branch 'hns3-RAS'
Guangbin Huang says:
====================
net: hns3: add RAS compatibility adaptation solution
This patchset adds RAS compatibility adaptation solution for new devices.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiaran Zhang [Tue, 8 Jun 2021 13:08:31 +0000 (21:08 +0800)]
net: hns3: add error handling compatibility during initialization
During initialization, the driver logs and clears the hw errors that
already occurred. For device supports imp-handle ras capability, it
needs handle different error status, otherwise it may cause wrong reset.
So fix it by adding a new processing branch.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiaran Zhang [Tue, 8 Jun 2021 13:08:30 +0000 (21:08 +0800)]
net: hns3: update error recovery module and type
Update error recovery module and type for RoCE.
The enumeration values of module names and error types are not sorted
in sequence. If use the current printing mode, they cannot be correctly
printed.
Use the index mode, If mod_id and type_id match the enumerated value,
display the corresponding information.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiaran Zhang [Tue, 8 Jun 2021 13:08:29 +0000 (21:08 +0800)]
net: hns3: add support for imp-handle ras capability
IMP(Intelligent Management Processor) firmware add a new feature to
handle and consolidate RAS information for new devices, NIC driver
only needs to query the reported RAS information. NIC driver adds
support for this feature.
Driver queries device capability to check whether IMP support this
feature, If yes, execute the new RAS processing branch.
In order to add a method to check whether PF supports imp-handle RAS
feature, add dumping this info in debugfs.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiaran Zhang [Tue, 8 Jun 2021 13:08:28 +0000 (21:08 +0800)]
net: hns3: add the RAS compatibility adaptation solution
To adapt to hardware modification and ensure that the driver is
compatible with the original error handling content, we need to add the
RAS compatibility adaptation solution.
Add a processing branch to the driver during error handling. In the new
processing branch, NIC fault information is integrated by the IMP. An
interaction command is added between the driver and IMP to query
and clear the fault source and interrupt source. The IMP integrates
error information and reports the highest reset level to the driver.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Tue, 8 Jun 2021 13:08:27 +0000 (21:08 +0800)]
net: hns3: add support for handling all errors through MSI-X
Currently, hardware errors can be reported through AER or MSI-X mode.
However, the AER mode is intended to handle only bus errors, but not
hardware errors. On the other hand, virtual machines cannot handle
AER errors. When an AER error is reported, virtual machines will be
suspended. So add support for handling all these hardware errors
through MSI-X mode which depends on a newer version of firmware,
and reserve the handler of the AER mode for compatibility.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 21:41:10 +0000 (14:41 -0700)]
Merge branch 'ena-updates'
Shay Agroskin says:
====================
se build_skb and reorganize some code in ENA
this patchset introduces several changes:
- Use build_skb() on RX side.
This allows to ensure that the headers are in the linear part
- Batch some code into functions and remove some of the code to make it more
readable and less error prone
- Fix RST format and outdated description in ENA documentation
- Improve cache alignment in the code
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:18 +0000 (19:01 +0300)]
net: ena: re-organize code to improve readability
Restructure some ethtool to a switch-case blocks to make it more uniform
with other similar functions.
Also restructure variable declaration to create reversed x-mas tree.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:17 +0000 (19:01 +0300)]
net: ena: Use dev_alloc() in RX buffer allocation
Use dev_alloc() when allocating RX buffers instead of specifying the
allocation flags explicitly. This result in same behaviour with less
code.
Also move the page allocation and its DMA mapping into a function. This
creates a logical block, which may help understanding the code.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:16 +0000 (19:01 +0300)]
net: ena: aggregate doorbell common operations into a function
The ena_ring_tx_doorbell() is introduced to call the doorbell and
increase the driver's corresponding stat.
Signed-off-by: Ido Segev <idose@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:15 +0000 (19:01 +0300)]
net: ena: fix RST format in ENA documentation file
The documentation file used to be written in markdown format but was
converted to reStructuredText (rst).
The converted file doesn't keep up with rst format requirements which
results in hard-to-read text.
This patch fixes the formatting of the file. The patch also
* Highlights and emphasizes some lines to improve readability
* Rephrases some hard-to-understand text
* Updates outdated function descriptions.
* Removes TSO description which falsely claims the driver supports it
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:14 +0000 (19:01 +0300)]
net: ena: Remove module param and change message severity
Remove the module param 'debug' which allows to specify the message
level of the driver. This value can be specified using ethtool command.
Also reduce the message level of LLQ support to be a warning since it is
not an indication of an error.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:13 +0000 (19:01 +0300)]
net: ena: add jiffies of last napi call to stats
There are instances when we want to know when the last napi was
called for debugging.
On stuck / heavy loaded CPUs, the ena napi handler might not be
called for a long period of time. This stat can help us to
determine how much time passed since the last execution of napi.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:12 +0000 (19:01 +0300)]
net: ena: use build_skb() in RX path
This patch converts the RX path to use build_skb() for packets larger
than copybreak (set to 256 by default). This function makes the first
descriptor's page to be the linear part of the sk_buff struct buffer.
Also remove the SKB description from the README since most of it no
longer relevant and the parts that are left don't add information.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:11 +0000 (19:01 +0300)]
net: ena: Improve error logging in driver
Add prints to improve logging of driver's errors.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:10 +0000 (19:01 +0300)]
net: ena: Remove unused code
The ENA_DEFAULT_MIN_RX_BUFF_ALLOC_SIZE macro,
ena_xdp_queues_present() function and SUSPEND_RESUME enums aren't used
in the driver, and so not needed.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Agroskin [Tue, 8 Jun 2021 16:01:09 +0000 (19:01 +0300)]
net: ena: optimize data access in fast-path code
This tweaks several small places to improve the data access in fast
path:
* Remove duplicates of first_interrupt flag and surround it with
WRITE/READ_ONCE macros:
The flag is used to detect HW disorders in its
interrupt communication with the driver. The flag is set when an
interrupt is received and used in the health check function
(ena_timer_service()) to help it find irregularities.
* Reorder some fields in ena_napi struct to take better advantage of
cache access pattern.
* Move XDP TX queue number to a variable to save its calculation for
every packet.
* Use likely in a condition to improve branch prediction
The 'first_interrupt' and 'interrupt_masked' flags were moved to reside
in the same cache line as the first fields of 'napi' struct. This
placement ensures that all memory accessed during upper-half handler
reside in the same cacheline (napi_schedule_irqoff() only accesses
'state' and 'poll_list' fields which are at the beginning of napi
struct).
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 21:39:07 +0000 (14:39 -0700)]
Merge branch 'mlxsw-various-updates'
Ido Schimmel says:
====================
mlxsw: Various updates
This patchset contains various updates for mlxsw. The most significant
change is the long overdue removal of the abort mechanism in the first
two patches.
Patches #1-#2 remove the route abort mechanism. This change is long
overdue and explained in detail in the commit message.
Patch #3 sets ports down in a few selftests that forgot to do so.
Discovered using a BPF tool (WIP) that monitors ASIC resources.
Patch #4 fixes an issue introduced by commit
557c4d2f780c ("selftests:
devlink_lib: add check for devlink device existence").
Patches #5-#8 modify the driver to read transceiver module's temperature
thresholds using MTMP register (when supported) instead of directly from
the module's EEPROM using MCIA register. This is both more reliable and
more efficient as now the module's temperature and thresholds are read
using one transaction instead of three.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mykola Kostenok [Tue, 8 Jun 2021 12:44:14 +0000 (15:44 +0300)]
mlxsw: thermal: Read module temperature thresholds using MTMP register
mlxsw_thermal_module_trips_update() is used to update the trip points of
the module's thermal zone. Currently, this is done by querying the
thresholds from the module's EEPROM via MCIA register. This data does
not pass validation and in some cases can be unreliable. For example,
due to some problem with transceiver module.
Previous patch made it possible to read module's temperature and
thresholds via MTMP register. Therefore, extend
mlxsw_thermal_module_trips_update() to use the thresholds queried from
MTMP, if valid.
This is both more reliable and more efficient than current method, as
temperature and thresholds are queried in one transaction instead of
three. This is significant when working over a slow bus such as I2C.
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mykola Kostenok [Tue, 8 Jun 2021 12:44:13 +0000 (15:44 +0300)]
mlxsw: thermal: Add function for reading module temperature and thresholds
Provide new function mlxsw_thermal_module_temp_and_thresholds_get() for
reading temperature and temperature thresholds by a single operation.
The motivation is to reduce the number of transactions with the device
which is important when operating over a slow bus such as I2C.
Currently, the sole caller of the function is only using it to read the
module's temperature. The next patch will also use it to query the
module's temperature thresholds.
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mykola Kostenok [Tue, 8 Jun 2021 12:44:12 +0000 (15:44 +0300)]
mlxsw: core_env: Read module temperature thresholds using MTMP register
Currently, module temperature thresholds are obtained from Management
Cable Info Access (MCIA) register by specifying the thresholds offsets
within module EEPROM layout. This data does not pass validation and in
some cases can be unreliable. For example, due to some problem with the
module.
Add support for a new feature provided by Management Temperature (MTMP)
register for sanitization of temperature thresholds values.
Extend mlxsw_env_module_temp_thresholds_get() to get temperature
thresholds through MTMP field 'max_operational_temperature' - if it is
not zero, feature is supported. Otherwise fallback to old method and get
the thresholds through MCIA.
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mykola Kostenok [Tue, 8 Jun 2021 12:44:11 +0000 (15:44 +0300)]
mlxsw: reg: Extend MTMP register with new threshold field
Extend Management Temperature (MTMP) register with new field specifying
the maximum temperature threshold.
Extend mlxsw_reg_mtmp_unpack() function with two extra arguments,
providing high and maximum temperature thresholds. For modules, these
thresholds correspond to critical and emergency thresholds that are read
from the module's EEPROM.
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 8 Jun 2021 12:44:10 +0000 (15:44 +0300)]
selftests: devlink_lib: Fix bouncing of netdevsim DEVLINK_DEV
In the commit referenced below, a check was added to devlink_lib that
asserts the existence of a devlink device referenced by $DEVLINK_DEV.
Unfortunately, several netdevsim tests point DEVLINK_DEV at a device that
does not exist at the time that devlink_lib is sourced. Thus these tests
spuriously fail.
Fix this by introducing an override. By setting DEVLINK_DEV to an empty
string, the user declares their intention to handle DEVLINK_DEV management
on their own.
In all netdevsim tests that use devlink_lib and set DEVLINK_DEV, set
instead an empty DEVLINK_DEV just before sourcing devlink_lib, and set it
to the correct value right afterwards.
Fixes: 557c4d2f780c ("selftests: devlink_lib: add check for devlink device existence")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Tue, 8 Jun 2021 12:44:09 +0000 (15:44 +0300)]
selftests: Clean forgotten resources as part of cleanup()
Several tests do not set some ports down as part of their cleanup(),
resulting in IPv6 link-local addresses and associated routes not being
deleted.
These leaks were found using a BPF tool that monitors ASIC resources.
Solve this by setting the ports down at the end of the tests.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Tue, 8 Jun 2021 12:44:08 +0000 (15:44 +0300)]
selftests: router_scale: Do not count failed routes
To check how many routes are installed in hardware, the test runs "ip
route" and greps for "offload", which includes routes with state
"offload_failed".
Till now, this wrong check was not found because after one failure in
route insertion, the driver moved to "abort" mode, which means that user
cannot try to add more routes.
The previous patch removed the abort mechanism and now failed routes are
counted as offloaded.
Fix this by not considering routes with "offload_failed" flag as
offloaded.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Tue, 8 Jun 2021 12:44:07 +0000 (15:44 +0300)]
mlxsw: spectrum_router: Remove abort mechanism
The abort mechanism was introduced in commit
8e05fd7166c6 ("fib: hook
IPv4 fib for hardware offload") with the purpose of falling back to
software-based routing in case of a route programming error in hardware.
The process is irreversible and requires users to reload the offloading
driver or reboot the machine.
While this approach might make sense in theory, it makes very little
sense in practice. In the case of high speed ASICs such as the Spectrum
ASIC, the abort mechanism effectively kills the machine upon a non-fatal
error such as a route programming error.
Such an extreme policy does not belong in the kernel, especially when
user space can simply try to reprogram the route following the
RTM_NEWROUTE failure notification.
Therefore, remove the abort mechanism.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 21:37:16 +0000 (14:37 -0700)]
Merge branch 'dsa-sja1110'
Vladimir Oltean says:
====================
Add NXP SJA1110 support to the sja1105 DSA driver
The NXP SJA1110 is an automotive Ethernet switch with an embedded Arm
Cortex-M7 microcontroller. The switch has 11 ports (10 external + one
for the DSA-style connection to the microcontroller).
The microcontroller can be disabled and the switch can be controlled
over SPI, a la SJA1105 - this is how this driver handles things.
There are some integrated NXP PHYs (100base-T1 and 100base-TX). Their
initialization is handled by their own PHY drivers, the switch is only
concerned with enabling register accesses to them, by registering two
MDIO buses.
Changes in v3:
- Make sure the VLAN retagging port is enabled and functional
- Dropped SGMII PCS from this series
Changes in v2:
- converted nxp,sja1105 DT bindings to YAML
- registered the PCS MDIO bus and forced auto-probing off for all PHY
addresses for this bus
- changed the container node name for the 2 MDIO buses from "mdio" to
"mdios" to avoid matching on the mdio.yaml schema (it's just a
container node, not an MDIO bus)
- fixed an uninitialized "offset" variable usage in
sja1110_pcs_mdio_{read,write}
- using the mdiobus_c45_addr macro instead of open-coding that operation
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 8 Jun 2021 09:25:38 +0000 (12:25 +0300)]
net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX
The SJA1110 contains two types of integrated PHYs: one 100base-TX PHY
and multiple 100base-T1 PHYs.
The access procedure for the 100base-T1 PHYs is also different than it
is for the 100base-TX one. So we register 2 MDIO buses, one for the
base-TX and the other for the base-T1. Each bus has an OF node which is
a child of the "mdio" subnode of the switch, and they are recognized by
compatible string.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 8 Jun 2021 09:25:37 +0000 (12:25 +0300)]
net: dsa: sja1105: make sure the retagging port is enabled for SJA1110
The SJA1110 has an extra configuration in the General Parameters Table
through which the user can select the buffer reservation config.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 8 Jun 2021 09:25:36 +0000 (12:25 +0300)]
net: dsa: sja1105: add support for the SJA1110 switch family
The SJA1110 is basically an SJA1105 with more ports, some integrated
PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which
can be disabled, and the switch core can be controlled by a host running
Linux, over SPI.
This patch contains:
- the static and dynamic config packing functions, for the tables that
are common with SJA1105
- one more static config tables which is "unique" to the SJA1110
(actually it is a rehash of stuff that was placed somewhere else in
SJA1105): the PCP Remapping Table
- a reset and clock configuration procedure for the SJA1110 switch.
This resets just the switch subsystem, and gates off the clock which
powers on the embedded microcontroller.
- an RGMII delay configuration procedure for SJA1110, which is very
similar to SJA1105, but different enough for us to be unable to reuse
it (this is a pattern that repeats itself)
- some adaptations to dynamic config table entries which are no longer
programmed in the same way. For example, to delete a VLAN, you used to
write an entry through the dynamic reconfiguration interface with the
desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN
table entries contain a TYPE_ENTRY field, which must be set to zero
(in a backwards-incompatible way) in order for the entry to be deleted,
or to some other entry for the VLAN to match "inner tagged" or "outer
tagged" packets.
- a similar thing for the static config: the xMII Mode Parameters Table
encoding for SGMII and MII (the latter just when attached to a
100base-TX PHY) just isn't what it used to be in SJA1105. They are
identical, except there is an extra "special" bit which needs to be
set. Set it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 8 Jun 2021 09:25:35 +0000 (12:25 +0300)]
dt-bindings: net: dsa: sja1105: add SJA1110 bindings
There are 4 variations of the SJA1110 switch which have a different set
of MII protocols supported per port. Document the compatible strings.
Also, the SJA1110 optionally supports 2 internal MDIO buses for 2
different types of Ethernet PHYs. Document a container node called
"mdios" which has 2 subnodes "mdio@0" and "mdio@1", identifiable via
compatible string, under which the driver finds the internal PHYs.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 21:33:44 +0000 (14:33 -0700)]
Merge branch 'wwan-improvements'
Sergey Ryazanov says:
====================
net: WWAN subsystem improvements
While working on WWAN netdev creation support, I notice a few things
that could be done to make the wwan subsystem more developer and user
friendly. This series implements them.
The series begins with a WWAN HW simulator designed simplify testing
and make the WWAN subsystem available for a wider audience. The next two
patches are intended to make the code a bit more clearer. This is
followed by a few patches to make the port device naming more
user-friendly. The series is finishes with a set of changes that allow
the WWAN AT port to be used with terminal emulation software.
All changes were tested with the HW simulator that was introduced in
this series, as well as with a Huawei E3372 LTE modem (a CDC-NCM
device), which I finally found on my desk.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:41 +0000 (07:02 +0300)]
net: wwan: core: purge rx queue on port close
Purge the rx queue as soon as a user closes the port, just after the
port stop callback invocation. This is to prevent feeding a user that
will open the port next time with outdated and possibly unrelated
data.
While at it also remove the odd skb_queue_purge() call in the port
device destroy callback. The queue will be purged just before the
callback is ivoncated in the wwan_remove_port() function.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:40 +0000 (07:02 +0300)]
net: wwan: core: implement terminal ioctls for AT port
It is not unreasonable to assume that users will use terminal emulation
software to communicate directly with a WWAN device over the AT port.
But terminal emulators will refuse to work with a device that does not
support terminal IOCTLs (e.g. TCGETS, TCSETS, TIOCMSET, etc.). To make
it possible to interact with the WWAN AT port using a terminal emulator,
implement a minimal set of terminal IOCTLs.
The implementation is rather stub, no passed data are actually used to
control a port behaviour. An obtained configuration is kept inside the
port structure and returned back by a request. The latter is done to
fool a program that will test the configuration status by comparing the
readed back data from the device with earlier configured ones.
Tested with fresh versions of minicom and picocom terminal apps.
MBIM, QMI and other ports for binary protocols can hardly be considered
a terminal device, so terminal IOCTLs are only implemented for the AT
port.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:39 +0000 (07:02 +0300)]
net: wwan: core: implement TIOCINQ ioctl
It is quite common for a userpace program to fetch the buffered amount
of data in the rx queue to avoid the read block. Implement the TIOCINQ
ioctl to make the migration to the WWAN port usage smooth.
Despite the fact that the read call will return no more data than the
size of a first skb in the queue, TIOCINQ returns the entire amount of
buffered data (sum of all queued skbs). This is done to prevent the
breaking of programs that optimize reading, avoiding it if the buffered
amount of data is too small.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:38 +0000 (07:02 +0300)]
net: wwan: core: expand ports number limit
Currently, we limit the total ports number to 256. It is quite common
for PBX or SMS gateway to be equipped with a lot of modems. In now days,
a modem could have 2-4 control ports or even more, what only accelerates
the ports exhausing rate.
To avoid facing the port number limitation issue reports, increase the
limit up the maximum number of minors (i.e. up to 1 << MINORBITS).
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:37 +0000 (07:02 +0300)]
net: wwan: core: make port names more user-friendly
At the moment, the port name is allocated based on the parent device
name, port id and the port type. Where the port id specifies nothing but
the ports registration order and is only used to make the port name
unique.
Most likely, to configure a WWAN device, the user will look for a port
of a specific type (e.g. AT port or MBIM port, etc.). The current naming
scheme can make it difficult to find a port of a specific type.
Consider a WWAN device that has 3 ports: AT port, MBIM port, and another
one AT port. With the global port index, the port names will be:
* wwan0p1at
* wwan0p2mbim
* wwan0p3at
To find the MBIM port, user should know in advance the device ports
composition (i.e. the user should know that the MBIM port is the 2nd
one) or carefully examine the whole ports list. It is not unusual for
USB modems to have a different composition, even if they are build on a
same chipset. Moreover, some modems able to change the ports composition
based on the user's configuration. All this makes port names fully
unpredictable.
To make naming more user-friendly, remove the global port id and
enumerate ports by its type. E.g.:
* wwan0p1at -> wwan0at0
* wwan0p2mbim -> wwan0mbim0
* wwan0p3at -> wwan0at1
With this naming scheme, the first AT port name will always be wwanXat0,
the first MBIM port name will always be wwanXmbim0, etc.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:36 +0000 (07:02 +0300)]
net: wwan: core: spell port device name in lowercase
Usually a device name is spelled in lowercase, let us follow this
practice in the WWAN subsystem as well. The bottom line is that such
name is easier to type.
To keep the device type attribute contents more natural (i.e., spell
abbreviations in uppercase), while making the device name lowercase,
turn the port type strings array to an array of structure that contains
both the port type name and the device name suffix.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:35 +0000 (07:02 +0300)]
net: wwan: core: init port type string array using enum values
This array is indexed by port type. Make it self-descriptive by using
the port type enum values as indices in the array initializer.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:34 +0000 (07:02 +0300)]
net: wwan: make WWAN_PORT_MAX meaning less surprised
It is quite unusual when some value can not be equal to a defined range
max value. Also most subsystems defines FOO_TYPE_MAX as a maximum valid
value. So turn the WAN_PORT_MAX meaning from the number of supported
port types to the maximum valid port type.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:33 +0000 (07:02 +0300)]
wwan_hwsim: add debugfs management interface
wwan_hwsim creates and removes simulated control ports on module loading
and unloading. It would be helpful to be able to create/remove devices
and ports at run-time to trigger wwan port (un-)register actions without
module reloading.
Some simulator objects (e.g. ports) do not have the underling device and
it is not possible to fully manage the simulator via sysfs. wwan_hsim
intend for developers, so implement it as a self-contained debugfs based
management interface.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Ryazanov [Tue, 8 Jun 2021 04:02:32 +0000 (07:02 +0300)]
wwan_hwsim: WWAN device simulator
This driver simulates a set of WWAN device with a set of AT control
ports. It can be used to test WWAN kernel framework as well as user
space tools.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 21:31:43 +0000 (14:31 -0700)]
Merge branch 'stmmac-25gbps'
Michael Sit Wei Hong says:
====================
Enable 2.5Gbps speed for stmmac
Intel mGbE supports 2.5Gbps link speed by overclocking the clock rate
by 2.5 times to support 2.5Gbps link speed. In this mode, the serdes/PHY
operates at a serial baud rate of 3.125 Gbps and the PCS data path and
GMII interface of the MAC operate at 312.5 MHz instead of 125 MHz.
This is configured in the BIOS during boot up. The kernel driver is not able
access to modify the clock rate for 1Gbps/2.5G mode on the fly. The way to
determine the current 1G/2.5G mode is by reading a dedicated adhoc
register through mdio bus.
Changes:
v5 -> v6
patch 1/3
- Check if mdio_bus_data is populated to prevent NULL pointer dereferencing
when accesing mdio_bus_data member
v4 -> v5
patch 1/3
- Rebase to latest code changes after Vladimir's code is merged and fix
build warnings
v3 -> v4
patch 1/3
- Rebase to latest code and Initialize 'found' to 0 to avoid build warning
patch 2/3
- Fix indentation issue from v3
v2 -> v3
patch 1/3
-New patch added to restructure the code. enabling reading the dedicated
adhoc register to determine link speed mode.
patch 2/3
-Restructure for 2.5G speed to use 2500BaseX configuration as the
PHY interface.
patch 3/3
-Restructure to read serdes registers to set max_speed and configure to
use 2500BaseX in 2.5G speeds.
v1 -> v2
patch 1/2
-Remove MAC supported link speed masking
patch 2/2
-Add supported link speed masking in the PCS
iperf3 and ping for 2.5Gbps and regression test on 10M/100M/1000Mbps
is done to prevent regresson issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Voon Weifeng [Tue, 8 Jun 2021 03:51:58 +0000 (11:51 +0800)]
net: stmmac: enable Intel mGbE 2.5Gbps link speed
The Intel mGbE supports 2.5Gbps link speed by increasing the clock rate by
2.5 times of the original rate. In this mode, the serdes/PHY operates at a
serial baud rate of 3.125 Gbps and the PCS data path and GMII interface of
the MAC operate at 312.5 MHz instead of 125 MHz.
For Intel mGbE, the overclocking of 2.5 times clock rate to support 2.5G is
only able to be configured in the BIOS during boot time. Kernel driver has
no access to modify the clock rate for 1Gbps/2.5G mode. The way to
determined the current 1G/2.5G mode is by reading a dedicated adhoc
register through mdio bus. In short, after the system boot up, it is either
in 1G mode or 2.5G mode which not able to be changed on the fly.
Compared to 1G mode, the 2.5G mode selects the 2500BASEX as PHY interface and
disables the xpcs_an_inband. This is to cater for some PHYs that only
supports 2500BASEX PHY interface with no autonegotiation.
v2: remove MAC supported link speed masking
v3: Restructure to introduce intel_speed_mode_2500() to read serdes registers
for max speed supported and select the appropritate configuration.
Use max_speed to determine the supported link speed mask.
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Voon Weifeng [Tue, 8 Jun 2021 03:51:57 +0000 (11:51 +0800)]
net: pcs: add 2500BASEX support for Intel mGbE controller
XPCS IP supports 2500BASEX as PHY interface. It is configured as
autonegotiation disable to cater for PHYs that does not supports 2500BASEX
autonegotiation.
v2: Add supported link speed masking.
v3: Restructure to introduce xpcs_config_2500basex() used to configure the
xpcs for 2.5G speeds. Added 2500BASEX specific information for
configuration.
v4: Fix indentation error
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Voon Weifeng [Tue, 8 Jun 2021 03:51:56 +0000 (11:51 +0800)]
net: stmmac: split xPCS setup from mdio register
This patch is a preparation patch for the enabling of Intel mGbE 2.5Gbps
link speed. The Intel mGbR link speed configuration (1G/2.5G) is depends on
a mdio ADHOC register which can be configured in the bios menu.
As PHY interface might be different for 1G and 2.5G, the mdio bus need be
ready to check the link speed and select the PHY interface before probing
the xPCS.
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 19:10:26 +0000 (12:10 -0700)]
Merge tag 'batadv-next-pullrequest-
20210608' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
pull request for net-next: batman-adv 2021-06-08
here is a feature/cleanup pull request of batman-adv to go into net-next.
Please pull or let me know of any problem!
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- consistently send iface index/name in genlmsg, by Sven Eckelmann
- improve broadcast queueing, by Linus Lüssing (2 patches)
- add support for routable IPv4 multicast with bridged setups,
by Linus Lüssing
- remove repeated declarations, by Shaokun Zhang
- fix spelling mistakes, by Zheng Yongjun
- clean up hard interface handling after dropping sysfs support,
by Sven Eckelmann (4 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Tue, 8 Jun 2021 10:56:09 +0000 (12:56 +0200)]
nvme: NVME_TCP_OFFLOAD should not default to m
The help text for the symbol controlling support for the NVM Express
over Fabrics TCP offload common layer suggests to not enable this
support when unsure.
Hence drop the "default m", which actually means "default y" if
CONFIG_MODULES is not enabled.
Fixes: f0e8cb6106da2703 ("nvme-tcp-offload: Add nvme-tcp-offload - NVMeTCP HW offload ULP")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 19:04:05 +0000 (12:04 -0700)]
Merge branch 'farsync-cleanups'
Peng Li says:
====================
net: farsync: clean up some code style issues
This patchset clean up some code style issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:42 +0000 (16:12 +0800)]
net: farsync: replace comparison to NULL with "fst_card_array[i]"
According to the chackpatch.pl, comparison to NULL could
be written "fst_card_array[i]".
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:41 +0000 (16:12 +0800)]
net: farsync: remove redundant return
Void function return statements are not generally useful.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:40 +0000 (16:12 +0800)]
net: farsync: fix the alignment issue
Alignment should match open parenthesis.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:39 +0000 (16:12 +0800)]
net: farsync: remove redundant parentheses
Unnecessary parentheses around 'port->hwif == X21'.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:38 +0000 (16:12 +0800)]
net: farsync: remove redundant spaces
According to the chackpatch.pl,
space prohibited between function name and open parenthesis '(',
no space is necessary after a cast.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:37 +0000 (16:12 +0800)]
net: farsync: remove redundant braces {}
This patch removes redundant braces {}, to fix the
checkpatch.pl warning:
"braces {} are not necessary for single statement blocks".
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:36 +0000 (16:12 +0800)]
net: farsync: add some required spaces
Add spaces required around that '=' and '*'.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:35 +0000 (16:12 +0800)]
net: farsync: fix the code style issue about macros
Macros with complex values should be enclosed in parentheses.
space required after that ',' .
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:34 +0000 (16:12 +0800)]
net: farsync: code indent use tabs where possible
Code indent should use tabs where possible.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:33 +0000 (16:12 +0800)]
net: farsync: remove trailing whitespaces
This patch removes trailing whitespaces.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:32 +0000 (16:12 +0800)]
net: farsync: fix the comments style issue
Networking block comments don't use an empty /* line,
use /* Comment...
Block comments use * on subsequent lines.
Block comments use a trailing */ on a separate line.
This patch fixes the comments style issues.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:31 +0000 (16:12 +0800)]
net: farsync: remove redundant initialization for statics
Should not initialise statics to 0.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:30 +0000 (16:12 +0800)]
net: farsync: move out assignment in if condition
Should not use assignment in if condition.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:29 +0000 (16:12 +0800)]
net: farsync: fix the code style issue about "foo* bar"
Fix the checkpatch error as "foo * bar" should be "foo *bar".
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:28 +0000 (16:12 +0800)]
net: farsync: add blank line after declarations
This patch fixes the checkpatch error about missing a blank line
after declarations.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 8 Jun 2021 08:12:27 +0000 (16:12 +0800)]
net: farsync: remove redundant blank lines
This patch removes some redundant blank lines.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 Jun 2021 18:41:24 +0000 (11:41 -0700)]
Merge branch 'realtek-dt'
Joakim Zhang says:
====================
net: phy: add dt property for realtek phy
Add dt property for realtek phy.
---
ChangeLogs:
V1->V2:
* store the desired PHYCR1/2 register value in "priv" rather than
using "quirks", per Russell King suggestion, as well as can
cover the bootloader setting.
* change the behavior of ALDPS mode, default is disabled, add dt
property for users to enable it.
* fix dt binding yaml build issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Joakim Zhang [Tue, 8 Jun 2021 03:15:35 +0000 (11:15 +0800)]
net: phy: realtek: add delay to fix RXC generation issue
PHY will delay about 11.5ms to generate RXC clock when switching from
power down to normal operation. Read/write registers would also cause RXC
become unstable and stop for a while during this process. Realtek engineer
suggests 15ms or more delay can workaround this issue.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joakim Zhang [Tue, 8 Jun 2021 03:15:34 +0000 (11:15 +0800)]
net: phy: realtek: add dt property to enable ALDPS mode
If enable Advance Link Down Power Saving (ALDPS) mode, it will change
crystal/clock behavior, which cause RXC clock stop for dozens to hundreds
of miliseconds. This is comfirmed by Realtek engineer. For some MACs, it
needs RXC clock to support RX logic, after this patch, PHY can generate
continuous RXC clock during auto-negotiation.
ALDPS default is disabled after hardware reset, it's more reasonable to
add a property to enable this feature, since ALDPS would introduce side effect.
This patch adds dt property "realtek,aldps-enable" to enable ALDPS mode
per users' requirement.
Jisheng Zhang enables this feature, changes the default behavior. Since
mine patch breaks the rule that new implementation should not break
existing design, so Cc'ed let him know to see if it can be accepted.
Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joakim Zhang [Tue, 8 Jun 2021 03:15:33 +0000 (11:15 +0800)]
net: phy: realtek: add dt property to disable CLKOUT clock
CLKOUT is enabled by default after PHY hardware reset, this patch adds
"realtek,clkout-disable" property for user to disable CLKOUT clock
to save PHY power.
Per RTL8211F guide, a PHY reset should be issued after setting these
bits in PHYCR2 register. After this patch, CLKOUT clock output to be
disabled.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joakim Zhang [Tue, 8 Jun 2021 03:15:32 +0000 (11:15 +0800)]
dt-bindings: net: add dt binding for realtek rtl82xx phy
Add binding for realtek rtl82xx phy.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Tue, 8 Jun 2021 01:26:48 +0000 (03:26 +0200)]
net: Kconfig: indent with tabs instead of spaces
The BAREUDP config option uses spaces instead of tabs for indentation.
The rest of this file uses tabs. Fix this.
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Jun 2021 21:11:47 +0000 (14:11 -0700)]
Merge branch 'page_pool-recycling'
Matteo Croce says:
====================
page_pool: recycle buffers
This is a respin of [1]
This patchset shows the plans for allowing page_pool to handle and
maintain DMA map/unmap of the pages it serves to the driver. For this
to work a return hook in the network core is introduced.
The overall purpose is to simplify drivers, by providing a page
allocation API that does recycling, such that each driver doesn't have
to reinvent its own recycling scheme. Using page_pool in a driver
does not require implementing XDP support, but it makes it trivially
easy to do so. Instead of allocating buffers specifically for SKBs
we now allocate a generic buffer and either wrap it on an SKB
(via build_skb) or create an XDP frame.
The recycling code leverages the XDP recycle APIs.
The Marvell mvpp2 and mvneta drivers are used in this patchset to
demonstrate how to use the API, and tested on a MacchiatoBIN
and EspressoBIN boards respectively.
Please let this going in on a future -rc1 so to allow enough time
to have wider tests.
v7 -> v8:
- use page->lru.next instead of page->index for pfmemalloc
- remove conditional include
- rework page_pool_return_skb_page() so to have less conversions
between page and addresses, and call compound_head() only once
- move some code from skb_free_head() to a new helper skb_pp_recycle()
- misc fixes
v6 -> v7:
- refresh patches against net-next
- remove a redundant call to virt_to_head_page()
- update mvneta benchmarks
v5 -> v6:
- preserve pfmemalloc bit when setting signature
- fix typo in mvneta
- rebase on next-next with the new cache
- don't clear the skb->pp_recycle in pskb_expand_head()
v4 -> v5:
- move the signature so it doesn't alias with page->mapping
- use an invalid pointer as magic
- incorporate Matthew Wilcox's changes for pfmemalloc pages
- move the __skb_frag_unref() changes to a preliminary patch
- refactor some cpp directives
- only attempt recycling if skb->head_frag
- clear skb->pp_recycle in pskb_expand_head()
v3 -> v4:
- store a pointer to page_pool instead of xdp_mem_info
- drop a patch which reduces xdp_mem_info size
- do the recycling in the page_pool code instead of xdp_return
- remove some unused headers include
- remove some useless forward declaration
v2 -> v3:
- added missing SOBs
- CCed the MM people
v1 -> v2:
- fix a commit message
- avoid setting pp_recycle multiple times on mvneta
- squash two patches to avoid breaking bisect
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Mon, 7 Jun 2021 19:02:40 +0000 (21:02 +0200)]
mvneta: recycle buffers
Use the new recycling API for page_pool.
In a drop rate test, the packet rate increased by 10%,
from 296 Kpps to 326 Kpps.
perf top on a stock system shows:
Overhead Shared Object Symbol
23.66% [kernel] [k] __pi___inval_dcache_area
22.85% [mvneta] [k] mvneta_rx_swbm
7.54% [kernel] [k] kmem_cache_alloc
6.49% [kernel] [k] eth_type_trans
3.94% [kernel] [k] dev_gro_receive
3.91% [kernel] [k] __netif_receive_skb_core
3.91% [kernel] [k] kmem_cache_free
3.76% [kernel] [k] page_pool_release_page
3.56% [kernel] [k] free_unref_page
2.40% [kernel] [k] build_skb
1.49% [kernel] [k] skb_release_data
1.45% [kernel] [k] __alloc_pages_bulk
1.30% [kernel] [k] page_frag_free
And this is the same output with recycling enabled:
Overhead Shared Object Symbol
26.41% [kernel] [k] __pi___inval_dcache_area
25.00% [mvneta] [k] mvneta_rx_swbm
8.14% [kernel] [k] kmem_cache_alloc
6.84% [kernel] [k] eth_type_trans
4.44% [kernel] [k] __netif_receive_skb_core
4.38% [kernel] [k] kmem_cache_free
4.16% [kernel] [k] dev_gro_receive
3.21% [kernel] [k] page_pool_put_page
2.41% [kernel] [k] build_skb
1.82% [kernel] [k] skb_release_data
1.61% [kernel] [k] napi_gro_receive
1.25% [kernel] [k] page_pool_refill_alloc_cache
1.16% [kernel] [k] __netif_receive_skb_list_core
We can see that page_pool_release_page(), free_unref_page() and
__alloc_pages_bulk() are no longer on top of the list when receiving
traffic.
The test was done with mausezahn on the TX side with 64 byte raw
ethernet frames.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Mon, 7 Jun 2021 19:02:39 +0000 (21:02 +0200)]
mvpp2: recycle buffers
Use the new recycling API for page_pool.
In a drop rate test, the packet rate is almost doubled,
from 1110 Kpps to 2128 Kpps.
perf top on a stock system shows:
Overhead Shared Object Symbol
34.88% [kernel] [k] page_pool_release_page
8.06% [kernel] [k] free_unref_page
6.42% [mvpp2] [k] mvpp2_rx
6.07% [kernel] [k] eth_type_trans
5.18% [kernel] [k] __netif_receive_skb_core
4.95% [kernel] [k] build_skb
4.88% [kernel] [k] kmem_cache_free
3.97% [kernel] [k] kmem_cache_alloc
3.45% [kernel] [k] dev_gro_receive
2.73% [kernel] [k] page_frag_free
2.07% [kernel] [k] __alloc_pages_bulk
1.99% [kernel] [k] arch_local_irq_save
1.84% [kernel] [k] skb_release_data
1.20% [kernel] [k] netif_receive_skb_list_internal
With packet rate stable at 1100 Kpps:
tx: 0 bps 0 pps rx: 532.7 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.6 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.4 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 532.1 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps
And this is the same output with recycling enabled:
Overhead Shared Object Symbol
12.91% [kernel] [k] eth_type_trans
12.54% [mvpp2] [k] mvpp2_rx
9.67% [kernel] [k] build_skb
9.63% [kernel] [k] __netif_receive_skb_core
8.44% [kernel] [k] page_pool_put_page
8.07% [kernel] [k] kmem_cache_free
7.79% [kernel] [k] kmem_cache_alloc
6.86% [kernel] [k] dev_gro_receive
3.19% [kernel] [k] skb_release_data
2.41% [kernel] [k] netif_receive_skb_list_internal
2.18% [kernel] [k] page_pool_refill_alloc_cache
1.76% [kernel] [k] napi_gro_receive
1.61% [kernel] [k] kfree_skb
1.20% [kernel] [k] dma_sync_single_for_device
1.16% [mvpp2] [k] mvpp2_poll
1.12% [mvpp2] [k] mvpp2_read
With packet rate above 2100 Kpps:
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2127 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2129 Kpps
The major performance increase is explained by the fact that the most CPU
consuming functions (page_pool_release_page, page_frag_free and
free_unref_page) are no longer called on a per packet basis.
The test was done by sending to the macchiatobin 64 byte ethernet frames
with an invalid ethertype, so the packets are dropped early in the RX path.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilias Apalodimas [Mon, 7 Jun 2021 19:02:38 +0000 (21:02 +0200)]
page_pool: Allow drivers to hint on SKB recycling
Up to now several high speed NICs have custom mechanisms of recycling
the allocated memory they use for their payloads.
Our page_pool API already has recycling capabilities that are always
used when we are running in 'XDP mode'. So let's tweak the API and the
kernel network stack slightly and allow the recycling to happen even
during the standard operation.
The API doesn't take into account 'split page' policies used by those
drivers currently, but can be extended once we have users for that.
The idea is to be able to intercept the packet on skb_release_data().
If it's a buffer coming from our page_pool API recycle it back to the
pool for further usage or just release the packet entirely.
To achieve that we introduce a bit in struct sk_buff (pp_recycle:1) and
a field in struct page (page->pp) to store the page_pool pointer.
Storing the information in page->pp allows us to recycle both SKBs and
their fragments.
We could have skipped the skb bit entirely, since identical information
can bederived from struct page. However, in an effort to affect the free path
as less as possible, reading a single bit in the skb which is already
in cache, is better that trying to derive identical information for the
page stored data.
The driver or page_pool has to take care of the sync operations on it's own
during the buffer recycling since the buffer is, after opting-in to the
recycling, never unmapped.
Since the gain on the drivers depends on the architecture, we are not
enabling recycling by default if the page_pool API is used on a driver.
In order to enable recycling the driver must call skb_mark_for_recycle()
to store the information we need for recycling in page->pp and
enabling the recycling bit, or page_pool_store_mem_info() for a fragment.
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Mon, 7 Jun 2021 19:02:37 +0000 (21:02 +0200)]
skbuff: add a parameter to __skb_frag_unref
This is a prerequisite patch, the next one is enabling recycling of
skbs and fragments. Add an extra argument on __skb_frag_unref() to
handle recycling, and update the current users of the function with that.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Mon, 7 Jun 2021 19:02:36 +0000 (21:02 +0200)]
mm: add a signature in struct page
This is needed by the page_pool to avoid recycling a page not allocated
via page_pool.
The page->signature field is aliased to page->lru.next and
page->compound_head, but it can't be set by mistake because the
signature value is a bad pointer, and can't trigger a false positive
in PageTail() because the last bit is 0.
Co-developed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 7 Jun 2021 15:02:59 +0000 (23:02 +0800)]
net: moxa: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code and avoid a null-ptr-deref by checking 'res' in it.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zheng Yongjun [Mon, 7 Jun 2021 15:01:37 +0000 (23:01 +0800)]
l2tp: Fix spelling mistakes
Fix some spelling mistakes in comments:
negociated ==> negotiated
dont ==> don't
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zheng Yongjun [Mon, 7 Jun 2021 15:01:18 +0000 (23:01 +0800)]
net/ncsi: Fix spelling mistakes
Fix some spelling mistakes in comments:
constuct ==> construct
chanels ==> channels
Detination ==> Destination
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zheng Yongjun [Mon, 7 Jun 2021 15:01:09 +0000 (23:01 +0800)]
ipv4: Fix spelling mistakes
Fix some spelling mistakes in comments:
Dont ==> Don't
timout ==> timeout
incomming ==> incoming
necesarry ==> necessary
substract ==> subtract
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zheng Yongjun [Mon, 7 Jun 2021 15:01:00 +0000 (23:01 +0800)]
netlabel: Fix spelling mistakes
Fix some spelling mistakes in comments:
Interate ==> Iterate
sucess ==> success
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 7 Jun 2021 14:55:21 +0000 (22:55 +0800)]
net: micrel: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 7 Jun 2021 14:36:02 +0000 (22:36 +0800)]
net: mvpp2: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 7 Jun 2021 14:21:09 +0000 (22:21 +0800)]
net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 7 Jun 2021 13:57:14 +0000 (21:57 +0800)]
net: enetc: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 7 Jun 2021 13:43:54 +0000 (21:43 +0800)]
net: macb: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 7 Jun 2021 13:38:37 +0000 (21:38 +0800)]
net: bcmgenet: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shaokun Zhang [Sat, 5 Jun 2021 05:42:56 +0000 (13:42 +0800)]
net: tulip: Remove the repeated declaration
Function 'pnic2_lnk_change' is declared twice, so remove the
repeated declaration.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Sat, 5 Jun 2021 02:31:48 +0000 (10:31 +0800)]
net: mscc: ocelot: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Jun 2021 21:00:37 +0000 (14:00 -0700)]
Merge branch 'hns3-error-handling'
Guangbin Huang says:
====================
net: hns3: refactors and decouples the error handling logic
This patchset refactors and decouples the error handling logic from reset
logic, it is the preset patch of the RAS feature. It mainly implements the
function that reset logic remains independent of the error handling logic,
this will ensure that common misellaneous MSI-X interrupt are re-enabled
quickly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Mon, 7 Jun 2021 11:18:12 +0000 (19:18 +0800)]
net: hns3: remove now redundant logic related to HNAE3_UNKNOWN_RESET
Earlier patches have decoupled the MSI-X conveyed error handling
and recovery logic. This earlier concept code is no longer required.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiaran Zhang [Mon, 7 Jun 2021 11:18:11 +0000 (19:18 +0800)]
net: hns3: add scheduling logic for error handling task
Error handling & recovery is done in context of reset task which
gets scheduled from misc interrupt handler in existing code. But
since error handling has been moved to new task, it should get
scheduled instead of the reset task from the interrupt handler.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>