platform/kernel/linux-rpi.git
16 months agosctp: add fair capacity stream scheduler
Xin Long [Tue, 7 Mar 2023 21:23:26 +0000 (16:23 -0500)]
sctp: add fair capacity stream scheduler

As it says in rfc8260#section-3.5 about the fair capacity scheduler:

   A fair capacity distribution between the streams is used.  This
   scheduler considers the lengths of the messages of each stream and
   schedules them in a specific way to maintain an equal capacity for
   all streams.  The details are implementation dependent.  interleaving
   user messages allows for a better realization of the fair capacity
   usage.

This patch adds Fair Capacity Scheduler based on the foundations added
by commit 5bbbbe32a431 ("sctp: introduce stream scheduler foundations"):

A fc_list and a fc_length are added into struct sctp_stream_out_ext and
a fc_list is added into struct sctp_stream. In .enqueue, when there are
chunks enqueued into a stream, this stream will be linked into stream->
fc_list by its fc_list ordered by its fc_length. In .dequeue, it always
picks up the 1st skb from stream->fc_list. In .dequeue_done, fc_length
is increased by chunk's len and update its location in stream->fc_list
according to the its new fc_length.

Note that when the new fc_length overflows in .dequeue_done, instead of
resetting all fc_lengths to 0, we only reduced them by U32_MAX / 4 to
avoid a moment of imbalance in the scheduling, as Marcelo suggested.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agoMerge branch 'various-mtk_eth_soc-cleanups'
Paolo Abeni [Thu, 9 Mar 2023 08:51:34 +0000 (09:51 +0100)]
Merge branch 'various-mtk_eth_soc-cleanups'

Russell King says:

====================
Various mtk_eth_soc cleanups

Here are a number of patches that do a bit of cleanup to mtk_eth_soc.

The first patch cleans up mtk_gmac0_rgmii_adjust(), which is the
troublesome function preventing the driver becoming a post-March2020
phylink driver. It doesn't solve that problem, merely makes the code
easier to follow by getting rid of repeated tenary operators.

The second patch moves the check for DDR2 memory to the initialisation
of phylink's supported_interfaces - if TRGMII is not possible for some
reason, we should not be erroring out in phylink MAC operations when
that can be determined prior to phylink creation.

The third patch removes checks from mtk_mac_config() that are done
when initialising supported_interfaces - phylink will not call
mtk_mac_config() with an interface that was not marked as supported,
so these checks are redundant.

The last patch removes the remaining vestiges of REVMII and RMII
support, which appears to be entirely unused.

These shouldn't conflict with Daniel's patch set, but if they do I
will rework as appropriate.
====================

Link: https://lore.kernel.org/r/ZAdj9qUXcHUsK7Gt@shell.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agonet: mtk_eth_soc: remove support for RMII and REVMII modes
Russell King (Oracle) [Tue, 7 Mar 2023 16:19:41 +0000 (16:19 +0000)]
net: mtk_eth_soc: remove support for RMII and REVMII modes

Since the conversion of mtk_eth_soc to phylink's supported_interfaces
bitmap, these two modes have not been selectable. No one has raised
this as an issue. Checking the in-kernel DT files, none of them use
either of these modes with this hardware.

Daniel Golle concurs:

 A quick grep through the device trees of the more than 650 ramips and
 mediatek boards we support in OpenWrt has revealed that *none* of them
 uses either reduced-MII or reverse-MII PHY modes. I could imaging that
 some more specialized ramips boards may use the RMII 100M PHY mode to
 connect with exotic PHYs for industrial or automotive applications
 (think: for 100BASE-T1 PHY connected via RMII). I have never seen or
 touched such boards, but there are hints that they do exist.

 For reverse-MII there are cases in which the Ralink SoC (Rt305x, for
 example) is used in iNIC mode, ie. connected as a PHY to another SoC,
 and running only a minimal firmware rather than running Linux. Due to
 the lack of external DRAM for the Ralink SoC on this kind of boards,
 the Ralink SoC there will anyway never be able to boot Linux.
 I've seen this e.g. in multimedia devices like early WiFi-connected
 not-yet-so-smart TVs.

Consequently, the conclusion is that no one uses these modes with this
hardware, so we might as well drop support for them.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agonet: mtk_eth_soc: remove unnecessary checks in mtk_mac_config()
Russell King (Oracle) [Tue, 7 Mar 2023 16:19:36 +0000 (16:19 +0000)]
net: mtk_eth_soc: remove unnecessary checks in mtk_mac_config()

mtk_mac_config() checks that the interface mode is permitted for the
capabilities, but we already do these checks in mtk_add_mac() when
initialising phylink's supported_interfaces bitmap. Remove the
unnecessary tests.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agonet: mtk_eth_soc: move trgmii ddr2 check to probe function
Russell King (Oracle) [Tue, 7 Mar 2023 16:19:31 +0000 (16:19 +0000)]
net: mtk_eth_soc: move trgmii ddr2 check to probe function

If TRGMII mode is not permitted when using DDR2 mode, we should handle
that when setting up phylink's ->supported_interfaces so phylink knows
that this is not supported by the hardware. Move this check to
mtk_add_mac().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agonet: mtk_eth_soc: tidy mtk_gmac0_rgmii_adjust()
Russell King (Oracle) [Tue, 7 Mar 2023 16:19:26 +0000 (16:19 +0000)]
net: mtk_eth_soc: tidy mtk_gmac0_rgmii_adjust()

Get rid of the multiple tenary operators in mtk_gmac0_rgmii_adjust()
replacing them with a single if(), thus making the code easier to read.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
16 months agoMerge branch 'pci-aer-remove-redundant-device-control-error-reporting-enable'
Jakub Kicinski [Thu, 9 Mar 2023 07:34:42 +0000 (23:34 -0800)]
Merge branch 'pci-aer-remove-redundant-device-control-error-reporting-enable'

Bjorn Helgaas says:

====================
PCI/AER: Remove redundant Device Control Error Reporting Enable

From: Bjorn Helgaas <bhelgaas@google.com>

Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"),
which appeared in v6.0, the PCI core has enabled PCIe error reporting for
all devices during enumeration.

Remove driver code to do this and remove unnecessary includes of
<linux/aer.h> from several other drivers.

Intel folks, sorry that I missed removing the <linux/aer.h> includes in the
first series.
====================

Link: https://lore.kernel.org/r/20230307181940.868828-1-helgaas@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoixgbe: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:39 +0000 (12:19 -0600)]
ixgbe: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoigc: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:38 +0000 (12:19 -0600)]
igc: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoigb: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:37 +0000 (12:19 -0600)]
igb: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoice: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:36 +0000 (12:19 -0600)]
ice: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoiavf: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:35 +0000 (12:19 -0600)]
iavf: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoi40e: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:34 +0000 (12:19 -0600)]
i40e: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agofm10k: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:33 +0000 (12:19 -0600)]
fm10k: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoe1000e: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:32 +0000 (12:19 -0600)]
e1000e: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: txgbe: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:31 +0000 (12:19 -0600)]
net: txgbe: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jiawen Wu <jiawenwu@trustnetic.com>
Cc: Mengyuan Lou <mengyuanlou@net-swift.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: ngbe: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:30 +0000 (12:19 -0600)]
net: ngbe: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jiawen Wu <jiawenwu@trustnetic.com>
Cc: Mengyuan Lou <mengyuanlou@net-swift.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agosfc_ef100: Drop redundant pci_disable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:29 +0000 (12:19 -0600)]
sfc_ef100: Drop redundant pci_disable_pcie_error_reporting()

51b35a454efd ("sfc: skeleton EF100 PF driver") added a call to
pci_disable_pcie_error_reporting() in ef100_pci_remove().

Remove this call since there's no apparent reason to disable error
reporting when it was not previously enabled.

Note that since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core enables PCIe error reporting for all devices during
enumeration, so the driver doesn't need to do it itself.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agosfc/siena: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:28 +0000 (12:19 -0600)]
sfc/siena: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agosfc: falcon: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:27 +0000 (12:19 -0600)]
sfc: falcon: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agosfc: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:26 +0000 (12:19 -0600)]
sfc: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoqlcnic: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:25 +0000 (12:19 -0600)]
qlcnic: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Shahed Shaikh <shshaikh@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Cc: GR-Linux-NIC-Dev@marvell.com
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoqlcnic: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:24 +0000 (12:19 -0600)]
qlcnic: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Shahed Shaikh <shshaikh@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Cc: GR-Linux-NIC-Dev@marvell.com
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: qede: Remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:23 +0000 (12:19 -0600)]
net: qede: Remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ariel Elior <aelior@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoqed: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:22 +0000 (12:19 -0600)]
qed: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ariel Elior <aelior@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoocteon_ep: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:21 +0000 (12:19 -0600)]
octeon_ep: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Veerasenareddy Burru <vburru@marvell.com>
Cc: Abhijit Ayarekar <aayarekar@marvell.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonetxen_nic: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:20 +0000 (12:19 -0600)]
netxen_nic: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Also note that the driver only called these for NX_IS_REVISION_P3 devices,
so since f26e58bf6f54, error reporting has been enabled for devices other
than NX_IS_REVISION_P3.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Manish Chopra <manishc@marvell.com>
Cc: Rahul Verma <rahulv@marvell.com>
Cc: GR-Linux-NIC-Dev@marvell.com
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: hns3: remove unnecessary aer.h include
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:19 +0000 (12:19 -0600)]
net: hns3: remove unnecessary aer.h include

<linux/aer.h> is unused, so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet/fungible: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:18 +0000 (12:19 -0600)]
net/fungible: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dimitris Michailidis <dmichail@fungible.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agocxgb4: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:17 +0000 (12:19 -0600)]
cxgb4: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Raju Rangoju <rajur@chelsio.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agobnxt: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:16 +0000 (12:19 -0600)]
bnxt: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agobnx2x: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:15 +0000 (12:19 -0600)]
bnx2x: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ariel Elior <aelior@marvell.com>
Cc: Sudarsana Kalluru <skalluru@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agobnx2: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:14 +0000 (12:19 -0600)]
bnx2: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

cd709aa90648 ("bnx2: Add PCI Advanced Error Reporting support.") added
pci_enable_pcie_error_reporting() for all devices, and c239f279e571 ("bnx2:
Enable AER on PCIE devices only") restricted it to BNX2_CHIP_5709 devices
to avoid an error message when it failed on non-PCIe devices.  The PCI core
only enables PCIe error reporting on PCIe devices, which I assume means
BNX2_CHIP_5709.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rasesh Mody <rmody@marvell.com>
Cc: GR-Linux-NIC-Dev@marvell.com
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agobe2net: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:13 +0000 (12:19 -0600)]
be2net: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration,  so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoalx: Drop redundant pci_enable_pcie_error_reporting()
Bjorn Helgaas [Tue, 7 Mar 2023 18:19:12 +0000 (12:19 -0600)]
alx: Drop redundant pci_enable_pcie_error_reporting()

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Chris Snook <chris.snook@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoravb: remove R-Car H3 ES1.* handling
Wolfram Sang [Tue, 7 Mar 2023 16:30:35 +0000 (17:30 +0100)]
ravb: remove R-Car H3 ES1.* handling

R-Car H3 ES1.* was only available to an internal development group and
needed a lot of quirks and workarounds. These become a maintenance
burden now, so our development group decided to remove upstream support
and disable booting for this SoC. Public users only have ES2 onwards.

Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/all/20230307163041.3815-8-wsa+renesas@sang-engineering.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Jakub Kicinski [Wed, 8 Mar 2023 22:34:22 +0000 (14:34 -0800)]
Merge https://git./linux/kernel/git/bpf/bpf-next

Andrii Nakryiko says:

====================
pull-request: bpf-next 2023-03-08

We've added 23 non-merge commits during the last 2 day(s) which contain
a total of 28 files changed, 414 insertions(+), 104 deletions(-).

The main changes are:

1) Add more precise memory usage reporting for all BPF map types,
   from Yafang Shao.

2) Add ARM32 USDT support to libbpf, from Puranjay Mohan.

3) Fix BTF_ID_LIST size causing problems in !CONFIG_DEBUG_INFO_BTF,
   from Nathan Chancellor.

4) IMA selftests fix, from Roberto Sassu.

5) libbpf fix in APK support code, from Daniel Müller.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
  selftests/bpf: Fix IMA test
  libbpf: USDT arm arg parsing support
  libbpf: Refactor parse_usdt_arg() to re-use code
  libbpf: Fix theoretical u32 underflow in find_cd() function
  bpf: enforce all maps having memory usage callback
  bpf: offload map memory usage
  bpf, net: xskmap memory usage
  bpf, net: sock_map memory usage
  bpf, net: bpf_local_storage memory usage
  bpf: local_storage memory usage
  bpf: bpf_struct_ops memory usage
  bpf: queue_stack_maps memory usage
  bpf: devmap memory usage
  bpf: cpumap memory usage
  bpf: bloom_filter memory usage
  bpf: ringbuf memory usage
  bpf: reuseport_array memory usage
  bpf: stackmap memory usage
  bpf: arraymap memory usage
  bpf: hashtab memory usage
  ...
====================

Link: https://lore.kernel.org/r/20230308193533.1671597-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoselftests/bpf: Fix IMA test
Roberto Sassu [Wed, 8 Mar 2023 10:37:13 +0000 (11:37 +0100)]
selftests/bpf: Fix IMA test

Commit 62622dab0a28 ("ima: return IMA digest value only when IMA_COLLECTED
flag is set") caused bpf_ima_inode_hash() to refuse to give non-fresh
digests. IMA test #3 assumed the old behavior, that bpf_ima_inode_hash()
still returned also non-fresh digests.

Correct the test by accepting both cases. If the samples returned are 1,
assume that the commit above is applied and that the returned digest is
fresh. If the samples returned are 2, assume that the commit above is not
applied, and check both the non-fresh and fresh digest.

Fixes: 62622dab0a28 ("ima: return IMA digest value only when IMA_COLLECTED flag is set")
Reported-by: David Vernet <void@manifault.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/bpf/20230308103713.1681200-1-roberto.sassu@huaweicloud.com
16 months agonet: reclaim skb->scm_io_uring bit
Eric Dumazet [Tue, 7 Mar 2023 14:59:59 +0000 (14:59 +0000)]
net: reclaim skb->scm_io_uring bit

Commit 0091bfc81741 ("io_uring/af_unix: defer registered
files gc to io_uring release") added one bit to struct sk_buff.

This structure is critical for networking, and we try very hard
to not add bloat on it, unless absolutely required.

For instance, we can use a specific destructor as a wrapper
around unix_destruct_scm(), to identify skbs that unix_gc()
has to special case.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agoMerge branch 'sparx5-tc-flower-templates'
David S. Miller [Wed, 8 Mar 2023 13:19:44 +0000 (13:19 +0000)]
Merge branch 'sparx5-tc-flower-templates'

Steen Hegelund says:

====================
Add support for TC flower templates in Sparx5

This adds support for the TC template mechanism in the Sparx5 flower filter
implementation.

Templates are as such handled by the TC framework, but when a template is
created (using a chain id) there are by definition no filters on this
chain (an error will be returned if there are any).

If the templates chain id is one that is represented by a VCAP lookup, then
when the template is created, we know that it is safe to use the keys
provided in the template to change the keyset configuration for the (port,
lookup) combination, if this is needed to improve the match on the
template.

The original port keyset configuration is captured in the template state
information which is kept per port, so that when the template is deleted
the port keyset configuration can be restored to its previous setting.

The template also provides the protocol parameter which is the basic
information that is used to find out which port keyset configuration needs
to be changed.

The VCAPs and lookups are slightly different when it comes to which keys,
keysets and protocol are supported and used for selection, so in some
cases a bit of tweaking is needed to find a useful match.  This is done by
e.g. removing a key that prevents the best matching keyset from being
selected.

The debugfs output that is provided for a port allows inspection of the
currently used keyset in each of the VCAPs lookups.  So when a template has
been created the debugfs output allows you to verify if the keyset
configuration has been changed successfully.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonet: microchip: sparx5: Add TC template support
Steen Hegelund [Tue, 7 Mar 2023 13:41:03 +0000 (14:41 +0100)]
net: microchip: sparx5: Add TC template support

This adds support for using the "template add" and "template destroy"
functionality to change the port keyset configuration.

If the VCAP lookup already contains rules, the port keyset is left
unchanged, as a change would make these rules unusable.

When the template is destroyed the port keyset configuration is restored.
The filters using the template chain will automatically be deleted by the
TC framework.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonet: microchip: sparx5: Add port keyset changing functionality
Steen Hegelund [Tue, 7 Mar 2023 13:41:02 +0000 (14:41 +0100)]
net: microchip: sparx5: Add port keyset changing functionality

With this its is now possible for clients (like TC) to change the port
keyset configuration in the Sparx5 VCAPs.

This is typically done per traffic class which is guided with the L3
protocol information.
Before the change the current keyset configuration is collected in a list
that is handed back to the client.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonet: microchip: sparx5: Add TC template list to a port
Steen Hegelund [Tue, 7 Mar 2023 13:41:01 +0000 (14:41 +0100)]
net: microchip: sparx5: Add TC template list to a port

This adds a list that is used to collect the templates that are active on a
port.

This allows the template creation to change the port configuration
and the template destruction to change it back.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonet: microchip: sparx5: Provide rule count, key removal and keyset select
Steen Hegelund [Tue, 7 Mar 2023 13:41:00 +0000 (14:41 +0100)]
net: microchip: sparx5: Provide rule count, key removal and keyset select

This provides these 3 functions in the VCAP API:

- Count the number of rules in a VCAP lookup (chain)
- Remove a key from a VCAP rule
- Find the keyset that gives the smallest rule list from a list of keysets

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonet: microchip: sparx5: Correct the spelling of the keysets in debugfs
Steen Hegelund [Tue, 7 Mar 2023 13:40:59 +0000 (14:40 +0100)]
net: microchip: sparx5: Correct the spelling of the keysets in debugfs

Correct the name used in the debugfs output.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agodt-bindings: net: dsa: mediatek,mt7530: change some descriptions to literal
Arınç ÜNAL [Tue, 7 Mar 2023 09:56:19 +0000 (12:56 +0300)]
dt-bindings: net: dsa: mediatek,mt7530: change some descriptions to literal

The line endings must be preserved on gpio-controller, io-supply, and
reset-gpios properties to look proper when the YAML file is parsed.

Currently it's interpreted as a single line when parsed. Change the style
of the description of these properties to literal style to preserve the
line endings.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agoemulex/benet: clean up some inconsistent indenting
Jiapeng Chong [Tue, 7 Mar 2023 05:41:38 +0000 (13:41 +0800)]
emulex/benet: clean up some inconsistent indenting

No functional modification involved.

drivers/net/ethernet/emulex/benet/be_cmds.c:1120 be_cmd_pmac_add() warn: inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4396
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonet/mlx4_en: Replace fake flex-array with flexible-array member
Gustavo A. R. Silva [Mon, 6 Mar 2023 23:51:52 +0000 (17:51 -0600)]
net/mlx4_en: Replace fake flex-array with flexible-array member

Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Transform zero-length array into flexible-array member in struct
mlx4_en_rx_desc.

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/net/ethernet/mellanox/mlx4/en_rx.c:88:30: warning: array subscript i is outside array bounds of â€˜struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:149:30: warning: array subscript 0 is outside array bounds of â€˜struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:127:30: warning: array subscript i is outside array bounds of â€˜struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:128:30: warning: array subscript i is outside array bounds of â€˜struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:129:30: warning: array subscript i is outside array bounds of â€˜struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:117:30: warning: array subscript i is outside array bounds of â€˜struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:119:30: warning: array subscript i is outside array bounds of â€˜struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/264
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agoMerge branch 'r8169-disable-ASPM-during-NAPI-poll'
David S. Miller [Wed, 8 Mar 2023 09:31:31 +0000 (09:31 +0000)]
Merge branch 'r8169-disable-ASPM-during-NAPI-poll'

Heiner Kallweit says:

====================
r8169: disable ASPM during NAPI poll

This is a rework of ideas from Kai-Heng on how to avoid the known
ASPM issues whilst still allowing for a maximum of ASPM-related power
savings. As a prerequisite some locking is added first.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agor8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll
Heiner Kallweit [Mon, 6 Mar 2023 21:28:06 +0000 (22:28 +0100)]
r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll

Now that  ASPM is disabled during NAPI poll, we can remove all ASPM
restrictions. This allows for higher power savings if the network
isn't fully loaded.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agor8169: disable ASPM during NAPI poll
Heiner Kallweit [Mon, 6 Mar 2023 21:26:47 +0000 (22:26 +0100)]
r8169: disable ASPM during NAPI poll

Several chip versions have problems with ASPM, what may result in
rx_missed errors or tx timeouts. The root cause isn't known but
experience shows that disabling ASPM during NAPI poll can avoid
these problems.

Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agor8169: prepare rtl_hw_aspm_clkreq_enable for usage in atomic context
Heiner Kallweit [Mon, 6 Mar 2023 21:25:49 +0000 (22:25 +0100)]
r8169: prepare rtl_hw_aspm_clkreq_enable for usage in atomic context

Bail out if the function is used with chip versions that don't support
ASPM configuration. In addition remove the delay, it tuned out that
it's not needed, also vendor driver r8125 doesn't have it.

Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agor8169: enable cfg9346 config register access in atomic context
Heiner Kallweit [Mon, 6 Mar 2023 21:24:49 +0000 (22:24 +0100)]
r8169: enable cfg9346 config register access in atomic context

For disabling ASPM during NAPI poll we'll have to unlock access
to the config registers in atomic context. Other code parts
running with config register access unlocked are partially
longer and can sleep. Add a usage counter to enable parallel
execution of code parts requiring unlocked config registers.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agor8169: use spinlock to protect access to registers Config2 and Config5
Heiner Kallweit [Mon, 6 Mar 2023 21:24:00 +0000 (22:24 +0100)]
r8169: use spinlock to protect access to registers Config2 and Config5

For disabling ASPM during NAPI poll we'll have to access both registers
in atomic context. Use a spinlock to protect access.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agor8169: use spinlock to protect mac ocp register access
Heiner Kallweit [Mon, 6 Mar 2023 21:23:15 +0000 (22:23 +0100)]
r8169: use spinlock to protect mac ocp register access

For disabling ASPM during NAPI poll we'll have to access mac ocp
registers in atomic context. This could result in races because
a mac ocp read consists of a write to register OCPDR, followed
by a read from the same register. Therefore add a spinlock to
protect access to mac ocp registers.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonet-timestamp: extend SOF_TIMESTAMPING_OPT_ID to HW timestamps
Vadim Fedorenko [Mon, 6 Mar 2023 16:07:38 +0000 (08:07 -0800)]
net-timestamp: extend SOF_TIMESTAMPING_OPT_ID to HW timestamps

When the feature was added it was enabled for SW timestamps only but
with current hardware the same out-of-order timestamps can be seen.
Let's expand the area for the feature to all types of timestamps.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 months agonetxen_nic: Replace fake flex-array with flexible-array member
Gustavo A. R. Silva [Mon, 6 Mar 2023 23:40:28 +0000 (17:40 -0600)]
netxen_nic: Replace fake flex-array with flexible-array member

Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Transform zero-length array into flexible-array member in struct
nx_cardrsp_rx_ctx_t.

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/net/ethernet/qlogic/netxen/netxen_nic_ctx.c:361:26: warning: array subscript <unknown> is outside array bounds of â€˜char[0]’ [-Warray-bounds=]
drivers/net/ethernet/qlogic/netxen/netxen_nic_ctx.c:372:25: warning: array subscript <unknown> is outside array bounds of â€˜char[0]’ [-Warray-bounds=]

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/265
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZAZ57I6WdQEwWh7v@work
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: phy: smsc: simplify lan95xx_config_aneg_ext
Heiner Kallweit [Mon, 6 Mar 2023 22:10:57 +0000 (23:10 +0100)]
net: phy: smsc: simplify lan95xx_config_aneg_ext

lan95xx_config_aneg_ext() can be simplified by using phy_set_bits().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/3da785c7-3ef8-b5d3-89a0-340f550be3c2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: remove enum skb_free_reason
Eric Dumazet [Mon, 6 Mar 2023 20:43:13 +0000 (20:43 +0000)]
net: remove enum skb_free_reason

enum skb_drop_reason is more generic, we can adopt it instead.

Provide dev_kfree_skb_irq_reason() and dev_kfree_skb_any_reason().

This means drivers can use more precise drop reasons if they want to.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Link: https://lore.kernel.org/r/20230306204313.10492-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agonet: phy: improve phy_read_poll_timeout
Heiner Kallweit [Mon, 6 Mar 2023 21:51:35 +0000 (22:51 +0100)]
net: phy: improve phy_read_poll_timeout

cond sometimes is (val & MASK) what may result in a false positive
if val is a negative errno. We shouldn't evaluate cond if val < 0.
This has no functional impact here, but it's not nice.
Therefore switch order of the checks.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/6d8274ac-4344-23b4-d9a3-cad4c39517d4@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoMerge branch 'libbpf: usdt arm arg parsing support'
Andrii Nakryiko [Tue, 7 Mar 2023 23:31:13 +0000 (15:31 -0800)]
Merge branch 'libbpf: usdt arm arg parsing support'

Puranjay Mohan says:

====================

This series add the support of the ARM architecture to libbpf USDT. This
involves implementing the parse_usdt_arg() function for ARM.

It was seen that the last part of parse_usdt_arg() is repeated for all architectures,
so, the first patch in this series refactors these functions and moved the post
processing to parse_usdt_spec()

Changes in V2[1] to V3:

- Use a tabular approach to find register offsets.
- Add the patch for refactoring parse_usdt_arg()
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
16 months agolibbpf: USDT arm arg parsing support
Puranjay Mohan [Tue, 7 Mar 2023 12:04:40 +0000 (12:04 +0000)]
libbpf: USDT arm arg parsing support

Parsing of USDT arguments is architecture-specific; on arm it is
relatively easy since registers used are r[0-10], fp, ip, sp, lr,
pc. Format is slightly different compared to aarch64; forms are

- "size @ [ reg, #offset ]" for dereferences, for example
  "-8 @ [ sp, #76 ]" ; " -4 @ [ sp ]"
- "size @ reg" for register values; for example
  "-4@r0"
- "size @ #value" for raw values; for example
  "-8@#1"

Add support for parsing USDT arguments for ARM architecture.

To test the above changes QEMU's virt[1] board with cortex-a15
CPU was used. libbpf-bootstrap's usdt example[2] was modified to attach
to a test program with DTRACE_PROBE1/2/3/4... probes to test different
combinations.

[1] https://www.qemu.org/docs/master/system/arm/virt.html
[2] https://github.com/libbpf/libbpf-bootstrap/blob/master/examples/c/usdt.bpf.c

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-3-puranjay12@gmail.com
16 months agolibbpf: Refactor parse_usdt_arg() to re-use code
Puranjay Mohan [Tue, 7 Mar 2023 12:04:39 +0000 (12:04 +0000)]
libbpf: Refactor parse_usdt_arg() to re-use code

The parse_usdt_arg() function is defined differently for each
architecture but the last part of the function is repeated
verbatim for each architecture.

Refactor parse_usdt_arg() to fill the arg_sz and then do the repeated
post-processing in parse_usdt_spec().

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-2-puranjay12@gmail.com
16 months agolibbpf: Fix theoretical u32 underflow in find_cd() function
Daniel Müller [Tue, 7 Mar 2023 21:55:04 +0000 (21:55 +0000)]
libbpf: Fix theoretical u32 underflow in find_cd() function

Coverity reported a potential underflow of the offset variable used in
the find_cd() function. Switch to using a signed 64 bit integer for the
representation of offset to make sure we can never underflow.

Fixes: 1eebcb60633f ("libbpf: Implement basic zip archive parsing support")
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307215504.837321-1-deso@posteo.net
16 months agoMerge branch 'bpf: bpf memory usage'
Alexei Starovoitov [Tue, 7 Mar 2023 17:33:43 +0000 (09:33 -0800)]
Merge branch 'bpf: bpf memory usage'

Yafang Shao says:

====================

Currently we can't get bpf memory usage reliably either from memcg or
from bpftool.

In memcg, there's not a 'bpf' item in memory.stat, but only 'kernel',
'sock', 'vmalloc' and 'percpu' which may related to bpf memory. With
these items we still can't get the bpf memory usage, because bpf memory
usage may far less than the kmem in a memcg, for example, the dentry may
consume lots of kmem.

bpftool now shows the bpf memory footprint, which is difference with bpf
memory usage. The difference can be quite great in some cases, for example,

- non-preallocated bpf map
  The non-preallocated bpf map memory usage is dynamically changed. The
  allocated elements count can be from 0 to the max entries. But the
  memory footprint in bpftool only shows a fixed number.

- bpf metadata consumes more memory than bpf element
  In some corner cases, the bpf metadata can consumes a lot more memory
  than bpf element consumes. For example, it can happen when the element
  size is quite small.

- some maps don't have key, value or max_entries
  For example the key_size and value_size of ringbuf is 0, so its
  memlock is always 0.

We need a way to show the bpf memory usage especially there will be more
and more bpf programs running on the production environment and thus the
bpf memory usage is not trivial.

This patchset introduces a new map ops ->map_mem_usage to calculate the
memory usage. Note that we don't intend to make the memory usage 100%
accurate, while our goal is to make sure there is only a small difference
between what bpftool reports and the real memory. That small difference
can be ignored compared to the total usage.  That is enough to monitor
the bpf memory usage. For example, the user can rely on this value to
monitor the trend of bpf memory usage, compare the difference in bpf
memory usage between different bpf program versions, figure out which
maps consume large memory, and etc.

This patchset implements the bpf memory usage for all maps, and yet there's
still work to do. We don't want to introduce runtime overhead in the
element update and delete path, but we have to do it for some
non-preallocated maps,
- devmap, xskmap
  When we update or delete an element, it will allocate or free memory.
  In order to track this dynamic memory, we have to track the count in
  element update and delete path.

- cpumap
  The element size of each cpumap element is not determinated. If we
  want to track the usage, we have to count the size of all elements in
  the element update and delete path. So I just put it aside currently.

- local_storage, bpf_local_storage
  When we attach or detach a cgroup, it will allocate or free memory. If
  we want to track the dynamic memory, we also need to do something in
  the update and delete path. So I just put it aside currently.

- offload map
  The element update and delete of offload map is via the netdev dev_ops,
  in which it may dynamically allocate or free memory, but this dynamic
  memory isn't counted in offload map memory usage currently.

The result of each map can be found in the individual patch.

We may also need to track per-container bpf memory usage, that will be
addressed by a different patchset.

Changes:
v3->v4: code improvement on ringbuf (Andrii)
        use READ_ONCE() to read lpm_trie (Tao)
        explain why we can't get bpf memory usage from memcg.
v2->v3: check callback at map creation time and avoid warning (Alexei)
        fix build error under CONFIG_BPF=n (lkp@intel.com)
v1->v2: calculate the memory usage within bpf (Alexei)
- [v1] bpf, mm: bpf memory usage
  https://lwn.net/Articles/921991/
- [RFC PATCH v2] mm, bpf: Add BPF into /proc/meminfo
  https://lwn.net/Articles/919848/
- [RFC PATCH v1] mm, bpf: Add BPF into /proc/meminfo
  https://lwn.net/Articles/917647/
- [RFC PATCH] bpf, mm: Add a new item bpf into memory.stat
  https://lore.kernel.org/bpf/20220921170002.29557-1-laoar.shao@gmail].com/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: enforce all maps having memory usage callback
Yafang Shao [Sun, 5 Mar 2023 12:46:15 +0000 (12:46 +0000)]
bpf: enforce all maps having memory usage callback

We have implemented memory usage callback for all maps, and we enforce
any newly added map having a callback as well. We check this callback at
map creation time. If it doesn't have the callback, we will return
EINVAL.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-19-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: offload map memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:14 +0000 (12:46 +0000)]
bpf: offload map memory usage

A new helper is introduced to calculate offload map memory usage. But
currently the memory dynamically allocated in netdev dev_ops, like
nsim_map_update_elem, is not counted. Let's just put it aside now.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-18-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf, net: xskmap memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:13 +0000 (12:46 +0000)]
bpf, net: xskmap memory usage

A new helper is introduced to calculate xskmap memory usage.

The xfsmap memory usage can be dynamically changed when we add or remove
a xsk_map_node. Hence we need to track the count of xsk_map_node to get
its memory usage.

The result as follows,
- before
10: xskmap  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B

- after
10: xskmap  name count_map  flags 0x0 <<< no elements case
        key 4B  value 4B  max_entries 65536  memlock 524608B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-17-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf, net: sock_map memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:12 +0000 (12:46 +0000)]
bpf, net: sock_map memory usage

sockmap and sockhash don't have something in common in allocation, so let's
introduce different helpers to calculate their memory usage.

The reuslt as follows,

- before
28: sockmap  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B
29: sockhash  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B

- after
28: sockmap  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524608B
29: sockhash  name count_map  flags 0x0  <<<< no updated elements
        key 4B  value 4B  max_entries 65536  memlock 1048896B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-16-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf, net: bpf_local_storage memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:11 +0000 (12:46 +0000)]
bpf, net: bpf_local_storage memory usage

A new helper is introduced into bpf_local_storage map to calculate the
memory usage. This helper is also used by other maps like
bpf_cgrp_storage, bpf_inode_storage, bpf_task_storage and etc.

Note that currently the dynamically allocated storage elements are not
counted in the usage, since it will take extra runtime overhead in the
elements update or delete path. So let's put it aside now, and implement
it in the future when someone really need it.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-15-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: local_storage memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:10 +0000 (12:46 +0000)]
bpf: local_storage memory usage

A new helper is introduced to calculate local_storage map memory usage.
Currently the dynamically allocated elements are not counted, since it
will take runtime overhead in the element update or delete path. So
let's put it aside currently, and implement it in the future if the user
really needs it.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-14-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: bpf_struct_ops memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:09 +0000 (12:46 +0000)]
bpf: bpf_struct_ops memory usage

A new helper is introduced to calculate bpf_struct_ops memory usage.

The result as follows,

- before
1: struct_ops  name count_map  flags 0x0
        key 4B  value 256B  max_entries 1  memlock 4096B
        btf_id 73

- after
1: struct_ops  name count_map  flags 0x0
        key 4B  value 256B  max_entries 1  memlock 5016B
        btf_id 73

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-13-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: queue_stack_maps memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:08 +0000 (12:46 +0000)]
bpf: queue_stack_maps memory usage

A new helper is introduced to calculate queue_stack_maps memory usage.

The result as follows,

- before
20: queue  name count_map  flags 0x0
        key 0B  value 4B  max_entries 65536  memlock 266240B
21: stack  name count_map  flags 0x0
        key 0B  value 4B  max_entries 65536  memlock 266240B

- after
20: queue  name count_map  flags 0x0
        key 0B  value 4B  max_entries 65536  memlock 524288B
21: stack  name count_map  flags 0x0
        key 0B  value 4B  max_entries 65536  memlock 524288B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-12-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: devmap memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:07 +0000 (12:46 +0000)]
bpf: devmap memory usage

A new helper is introduced to calculate the memory usage of devmap and
devmap_hash. The number of dynamically allocated elements are recored
for devmap_hash already, but not for devmap. To track the memory size of
dynamically allocated elements, this patch also count the numbers for
devmap.

The result as follows,
- before
40: devmap  name count_map  flags 0x80
        key 4B  value 4B  max_entries 65536  memlock 524288B
41: devmap_hash  name count_map  flags 0x80
        key 4B  value 4B  max_entries 65536  memlock 524288B

- after
40: devmap  name count_map  flags 0x80  <<<< no elements
        key 4B  value 4B  max_entries 65536  memlock 524608B
41: devmap_hash  name count_map  flags 0x80 <<<< no elements
        key 4B  value 4B  max_entries 65536  memlock 524608B

Note that the number of buckets is same with max_entries for devmap_hash
in this case.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-11-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: cpumap memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:06 +0000 (12:46 +0000)]
bpf: cpumap memory usage

A new helper is introduced to calculate cpumap memory usage. The size of
cpu_entries can be dynamically changed when we update or delete a cpumap
element, but this patch doesn't include the memory size of cpu_entry
yet. We can dynamically calculate the memory usage when we alloc or free
a cpu_entry, but it will take extra runtime overhead, so let just put it
aside currently. Note that the size of different cpu_entry may be
different as well.

The result as follows,
- before
48: cpumap  name count_map  flags 0x4
        key 4B  value 4B  max_entries 64  memlock 4096B

- after
48: cpumap  name count_map  flags 0x4
        key 4B  value 4B  max_entries 64  memlock 832B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-10-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: bloom_filter memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:05 +0000 (12:46 +0000)]
bpf: bloom_filter memory usage

Introduce a new helper to calculate the bloom_filter memory usage.

The result as follows,
- before
16: bloom_filter  flags 0x0
        key 0B  value 8B  max_entries 65536  memlock 524288B

- after
16: bloom_filter  flags 0x0
        key 0B  value 8B  max_entries 65536  memlock 65856B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-9-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: ringbuf memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:04 +0000 (12:46 +0000)]
bpf: ringbuf memory usage

A new helper ringbuf_map_mem_usage() is introduced to calculate ringbuf
memory usage.

The result as follows,
- before
15: ringbuf  name count_map  flags 0x0
        key 0B  value 0B  max_entries 65536  memlock 0B

- after
15: ringbuf  name count_map  flags 0x0
        key 0B  value 0B  max_entries 65536  memlock 78424B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230305124615.12358-8-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: reuseport_array memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:03 +0000 (12:46 +0000)]
bpf: reuseport_array memory usage

A new helper is introduced to calculate reuseport_array memory usage.

The result as follows,
- before
14: reuseport_sockarray  name count_map  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1048576B

- after
14: reuseport_sockarray  name count_map  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 524544B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-7-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: stackmap memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:02 +0000 (12:46 +0000)]
bpf: stackmap memory usage

A new helper is introduced to get stackmap memory usage. Some small
memory allocations are ignored as their memory size is quite small
compared to the totol usage.

The result as follows,
- before
16: stack_trace  name count_map  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1048576B

- after
16: stack_trace  name count_map  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 2097472B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-6-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: arraymap memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:01 +0000 (12:46 +0000)]
bpf: arraymap memory usage

Introduce array_map_mem_usage() to calculate arraymap memory usage. In
this helper, some small memory allocations are ignored, like the
allocation of struct bpf_array_aux in prog_array. The inner_map_meta in
array_of_map is also ignored.

The result as follows,

- before
11: array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B
12: percpu_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 8912896B
13: perf_event_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B
14: prog_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B
15: cgroup_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524288B

- after
11: array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524608B
12: percpu_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 17301824B
13: perf_event_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524608B
14: prog_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524608B
15: cgroup_array  name count_map  flags 0x0
        key 4B  value 4B  max_entries 65536  memlock 524608B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-5-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: hashtab memory usage
Yafang Shao [Sun, 5 Mar 2023 12:46:00 +0000 (12:46 +0000)]
bpf: hashtab memory usage

htab_map_mem_usage() is introduced to calculate hashmap memory usage. In
this helper, some small memory allocations are ignore, as their size is
quite small compared with the total size. The inner_map_meta in
hash_of_map is also ignored.

The result for hashtab as follows,

- before this change
1: hash  name count_map  flags 0x1  <<<< no prealloc, fully set
        key 16B  value 24B  max_entries 1048576  memlock 41943040B
2: hash  name count_map  flags 0x1  <<<< no prealloc, none set
        key 16B  value 24B  max_entries 1048576  memlock 41943040B
3: hash  name count_map  flags 0x0  <<<< prealloc
        key 16B  value 24B  max_entries 1048576  memlock 41943040B

The memlock is always a fixed size whatever it is preallocated or
not, and whatever the count of allocated elements is.

- after this change
1: hash  name count_map  flags 0x1    <<<< non prealloc, fully set
        key 16B  value 24B  max_entries 1048576  memlock 117441536B
2: hash  name count_map  flags 0x1    <<<< non prealloc, non set
        key 16B  value 24B  max_entries 1048576  memlock 16778240B
3: hash  name count_map  flags 0x0    <<<< prealloc
        key 16B  value 24B  max_entries 1048576  memlock 109056000B

The memlock now is hashtab actually allocated.

The result for percpu hash map as follows,
- before this change
4: percpu_hash  name count_map  flags 0x0       <<<< prealloc
        key 16B  value 24B  max_entries 1048576  memlock 822083584B
5: percpu_hash  name count_map  flags 0x1       <<<< no prealloc
        key 16B  value 24B  max_entries 1048576  memlock 822083584B

- after this change
4: percpu_hash  name count_map  flags 0x0
        key 16B  value 24B  max_entries 1048576  memlock 897582080B
5: percpu_hash  name count_map  flags 0x1
        key 16B  value 24B  max_entries 1048576  memlock 922748736B

At worst, the difference can be 10x, for example,
- before this change
6: hash  name count_map  flags 0x0
        key 4B  value 4B  max_entries 1048576  memlock 8388608B

- after this change
6: hash  name count_map  flags 0x0
        key 4B  value 4B  max_entries 1048576  memlock 83889408B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20230305124615.12358-4-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: lpm_trie memory usage
Yafang Shao [Sun, 5 Mar 2023 12:45:59 +0000 (12:45 +0000)]
bpf: lpm_trie memory usage

trie_mem_usage() is introduced to calculate the lpm_trie memory usage.
Some small memory allocations are ignored. The inner node is also
ignored.

The result as follows,

- before
10: lpm_trie  flags 0x1
        key 8B  value 8B  max_entries 65536  memlock 1048576B

- after
10: lpm_trie  flags 0x1
        key 8B  value 8B  max_entries 65536  memlock 2291536B

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-3-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: add new map ops ->map_mem_usage
Yafang Shao [Sun, 5 Mar 2023 12:45:58 +0000 (12:45 +0000)]
bpf: add new map ops ->map_mem_usage

Add a new map ops ->map_mem_usage to print the memory usage of a
bpf map.

This is a preparation for the followup change.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230305124615.12358-2-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: Increase size of BTF_ID_LIST without CONFIG_DEBUG_INFO_BTF again
Nathan Chancellor [Tue, 7 Mar 2023 15:14:06 +0000 (08:14 -0700)]
bpf: Increase size of BTF_ID_LIST without CONFIG_DEBUG_INFO_BTF again

After commit 66e3a13e7c2c ("bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr"),
clang builds without CONFIG_DEBUG_INFO_BTF warn:

  kernel/bpf/verifier.c:10298:24: warning: array index 16 is past the end of the array (that has type 'u32[16]' (aka 'unsigned int[16]')) [-Warray-bounds]
                                     meta.func_id == special_kfunc_list[KF_bpf_dynptr_slice_rdwr]) {
                                                     ^                  ~~~~~~~~~~~~~~~~~~~~~~~~
  kernel/bpf/verifier.c:9150:1: note: array 'special_kfunc_list' declared here
  BTF_ID_LIST(special_kfunc_list)
  ^
  include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
  #define BTF_ID_LIST(name) static u32 __maybe_unused name[16];
                            ^
  1 warning generated.

A warning of this nature was previously addressed by
commit beb3d47d1d3d ("bpf: Fix a BTF_ID_LIST bug with CONFIG_DEBUG_INFO_BTF not set")
but there have been new kfuncs added since then.

Quadruple the size of the CONFIG_DEBUG_INFO_BTF=n definition so that
this problem is unlikely to show up for some time.

Link: https://github.com/ClangBuiltLinux/linux/issues/1810
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230307-bpf-kfuncs-warray-bounds-v1-1-00ad3191f3a6@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
Jakub Kicinski [Tue, 7 Mar 2023 04:36:39 +0000 (20:36 -0800)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-03-06

We've added 85 non-merge commits during the last 13 day(s) which contain
a total of 131 files changed, 7102 insertions(+), 1792 deletions(-).

The main changes are:

1) Add skb and XDP typed dynptrs which allow BPF programs for more
   ergonomic and less brittle iteration through data and variable-sized
   accesses, from Joanne Koong.

2) Bigger batch of BPF verifier improvements to prepare for upcoming BPF
   open-coded iterators allowing for less restrictive looping capabilities,
   from Andrii Nakryiko.

3) Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
   programs to NULL-check before passing such pointers into kfunc,
   from Alexei Starovoitov.

4) Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
   local storage maps, from Kumar Kartikeya Dwivedi.

5) Add BPF verifier support for ST instructions in convert_ctx_access()
   which will help new -mcpu=v4 clang flag to start emitting them,
   from Eduard Zingerman.

6) Make uprobe attachment Android APK aware by supporting attachment
   to functions inside ELF objects contained in APKs via function names,
   from Daniel Müller.

7) Add a new flag BPF_F_TIMER_ABS flag for bpf_timer_start() helper
   to start the timer with absolute expiration value instead of relative
   one, from Tero Kristo.

8) Add a new kfunc bpf_cgroup_from_id() to look up cgroups via id,
   from Tejun Heo.

9) Extend libbpf to support users manually attaching kprobes/uprobes
   in the legacy/perf/link mode, from Menglong Dong.

10) Implement workarounds in the mips BPF JIT for DADDI/R4000,
   from Jiaxun Yang.

11) Enable mixing bpf2bpf and tailcalls for the loongarch BPF JIT,
    from Hengqi Chen.

12) Extend BPF instruction set doc with describing the encoding of BPF
    instructions in terms of how bytes are stored under big/little endian,
    from Jose E. Marchesi.

13) Follow-up to enable kfunc support for riscv BPF JIT, from Pu Lehui.

14) Fix bpf_xdp_query() backwards compatibility on old kernels,
    from Yonghong Song.

15) Fix BPF selftest cross compilation with CLANG_CROSS_FLAGS,
    from Florent Revest.

16) Improve bpf_cpumask_ma to only allocate one bpf_mem_cache,
    from Hou Tao.

17) Fix BPF verifier's check_subprogs to not unnecessarily mark
    a subprogram with has_tail_call, from Ilya Leoshkevich.

18) Fix arm syscall regs spec in libbpf's bpf_tracing.h, from Puranjay Mohan.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
  selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode
  selftests/bpf: Split test_attach_probe into multi subtests
  libbpf: Add support to set kprobe/uprobe attach mode
  tools/resolve_btfids: Add /libsubcmd to .gitignore
  bpf: add support for fixed-size memory pointer returns for kfuncs
  bpf: generalize dynptr_get_spi to be usable for iters
  bpf: mark PTR_TO_MEM as non-null register type
  bpf: move kfunc_call_arg_meta higher in the file
  bpf: ensure that r0 is marked scratched after any function call
  bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper
  bpf: clean up visit_insn()'s instruction processing
  selftests/bpf: adjust log_fixup's buffer size for proper truncation
  bpf: honor env->test_state_freq flag in is_state_visited()
  selftests/bpf: enhance align selftest's expected log matching
  bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER}
  bpf: improve stack slot state printing
  selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access()
  selftests/bpf: test if pointer type is tracked for BPF_ST_MEM
  bpf: allow ctx writes using BPF_ST_MEM instruction
  bpf: Use separate RCU callbacks for freeing selem
  ...
====================

Link: https://lore.kernel.org/r/20230307004346.27578-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
16 months agoMerge branch 'libbpf: allow users to set kprobe/uprobe attach mode'
Andrii Nakryiko [Mon, 6 Mar 2023 17:30:21 +0000 (09:30 -0800)]
Merge branch 'libbpf: allow users to set kprobe/uprobe attach mode'

Menglong Dong says:

====================

From: Menglong Dong <imagedong@tencent.com>

By default, libbpf will attach the kprobe/uprobe BPF program in the
latest mode that supported by kernel. In this series, we add the support
to let users manually attach kprobe/uprobe in legacy/perf/link mode in
the 1th patch.

And in the 2th patch, we split the testing 'attach_probe' into multi
subtests, as Andrii suggested.

In the 3th patch, we add the testings for loading kprobe/uprobe in
different mode.

Changes since v3:
- rename eBPF to BPF in the doc
- use OPTS_GET() to get the value of 'force_ioctl_attach'
- error out on attach mode is not supported
- use test_attach_probe_manual__open_and_load() directly

Changes since v2:
- fix the typo in the 2th patch

Changes since v1:
- some small changes in the 1th patch, as Andrii suggested
- split 'attach_probe' into multi subtests
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
16 months agoselftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode
Menglong Dong [Mon, 6 Mar 2023 06:48:33 +0000 (14:48 +0800)]
selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode

Add the testing for kprobe/uprobe attaching in default, legacy, perf and
link mode. And the testing passed:

./test_progs -t attach_probe
$5/1     attach_probe/manual-default:OK
$5/2     attach_probe/manual-legacy:OK
$5/3     attach_probe/manual-perf:OK
$5/4     attach_probe/manual-link:OK
$5/5     attach_probe/auto:OK
$5/6     attach_probe/kprobe-sleepable:OK
$5/7     attach_probe/uprobe-lib:OK
$5/8     attach_probe/uprobe-sleepable:OK
$5/9     attach_probe/uprobe-ref_ctr:OK
$5       attach_probe:OK
Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Biao Jiang <benbjiang@tencent.com>
Link: https://lore.kernel.org/bpf/20230306064833.7932-4-imagedong@tencent.com
16 months agoselftests/bpf: Split test_attach_probe into multi subtests
Menglong Dong [Mon, 6 Mar 2023 06:48:32 +0000 (14:48 +0800)]
selftests/bpf: Split test_attach_probe into multi subtests

In order to adapt to the older kernel, now we split the "attach_probe"
testing into multi subtests:

  manual // manual attach tests for kprobe/uprobe
  auto // auto-attach tests for kprobe and uprobe
  kprobe-sleepable // kprobe sleepable test
  uprobe-lib // uprobe tests for library function by name
  uprobe-sleepable // uprobe sleepable test
  uprobe-ref_ctr // uprobe ref_ctr test

As sleepable kprobe needs to set BPF_F_SLEEPABLE flag before loading,
we need to move it to a stand alone skel file, in case of it is not
supported by kernel and make the whole loading fail.

Therefore, we can only enable part of the subtests for older kernel.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Biao Jiang <benbjiang@tencent.com>
Link: https://lore.kernel.org/bpf/20230306064833.7932-3-imagedong@tencent.com
16 months agolibbpf: Add support to set kprobe/uprobe attach mode
Menglong Dong [Mon, 6 Mar 2023 06:48:31 +0000 (14:48 +0800)]
libbpf: Add support to set kprobe/uprobe attach mode

By default, libbpf will attach the kprobe/uprobe BPF program in the
latest mode that supported by kernel. In this patch, we add the support
to let users manually attach kprobe/uprobe in legacy or perf mode.

There are 3 mode that supported by the kernel to attach kprobe/uprobe:

  LEGACY: create perf event in legacy way and don't use bpf_link
  PERF: create perf event with perf_event_open() and don't use bpf_link

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Biao Jiang <benbjiang@tencent.com>
Link: create perf event with perf_event_open() and use bpf_link
Link: https://lore.kernel.org/bpf/20230113093427.1666466-1-imagedong@tencent.com/
Link: https://lore.kernel.org/bpf/20230306064833.7932-2-imagedong@tencent.com
Users now can manually choose the mode with
bpf_program__attach_uprobe_opts()/bpf_program__attach_kprobe_opts().

16 months agotools/resolve_btfids: Add /libsubcmd to .gitignore
Rong Tao [Sat, 4 Mar 2023 15:17:04 +0000 (23:17 +0800)]
tools/resolve_btfids: Add /libsubcmd to .gitignore

Add libsubcmd to .gitignore, otherwise after compiling the kernel it
would result in the following:

    # bpf-next...bpf-next/master
    ?? tools/bpf/resolve_btfids/libsubcmd/

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/tencent_F13D670D5D7AA9C4BD868D3220921AAC090A@qq.com
16 months agobpf: add support for fixed-size memory pointer returns for kfuncs
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:10 +0000 (15:50 -0800)]
bpf: add support for fixed-size memory pointer returns for kfuncs

Support direct fixed-size (and for now, read-only) memory access when
kfunc's return type is a pointer to non-struct type. Calculate type size
and let BPF program access that many bytes directly. This is crucial for
numbers iterator.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-13-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: generalize dynptr_get_spi to be usable for iters
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:09 +0000 (15:50 -0800)]
bpf: generalize dynptr_get_spi to be usable for iters

Generalize the logic of fetching special stack slot object state using
spi (stack slot index). This will be used by STACK_ITER logic next.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-12-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: mark PTR_TO_MEM as non-null register type
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:08 +0000 (15:50 -0800)]
bpf: mark PTR_TO_MEM as non-null register type

PTR_TO_MEM register without PTR_MAYBE_NULL is indeed non-null. This is
important for BPF verifier to be able to prune guaranteed not to be
taken branches. This is always the case with open-coded iterators.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-11-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: move kfunc_call_arg_meta higher in the file
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:07 +0000 (15:50 -0800)]
bpf: move kfunc_call_arg_meta higher in the file

Move struct bpf_kfunc_call_arg_meta higher in the file and put it next
to struct bpf_call_arg_meta, so it can be used from more functions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: ensure that r0 is marked scratched after any function call
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:06 +0000 (15:50 -0800)]
bpf: ensure that r0 is marked scratched after any function call

r0 is important (unless called function is void-returning, but that's
taken care of by print_verifier_state() anyways) in verifier logs.
Currently for helpers we seem to print it in verifier log, but for
kfuncs we don't.

Instead of figuring out where in the maze of code we accidentally set r0
as scratched for helpers and why we don't do that for kfuncs, just
enforce that after any function call r0 is marked as scratched.

Also, perhaps, we should reconsider "scratched" terminology, as it's
mightily confusing. "Touched" would seem more appropriate. But I left
that for follow ups for now.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:05 +0000 (15:50 -0800)]
bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper

It's not correct to assume that any BPF_CALL instruction is a helper
call. Fix visit_insn()'s detection of bpf_timer_set_callback() helper by
also checking insn->code == 0. For kfuncs insn->code would be set to
BPF_PSEUDO_KFUNC_CALL, and for subprog calls it will be BPF_PSEUDO_CALL.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: clean up visit_insn()'s instruction processing
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:04 +0000 (15:50 -0800)]
bpf: clean up visit_insn()'s instruction processing

Instead of referencing processed instruction repeatedly as insns[t]
throughout entire visit_insn() function, take a local insn pointer and
work with it in a cleaner way.

It makes enhancing this function further a bit easier as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agoselftests/bpf: adjust log_fixup's buffer size for proper truncation
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:03 +0000 (15:50 -0800)]
selftests/bpf: adjust log_fixup's buffer size for proper truncation

Adjust log_fixup's expected buffer length to fix the test. It's pretty
finicky in its length expectation, but it doesn't break often. So just
adjust the length to work on current kernel and with follow up iterator
changes as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agobpf: honor env->test_state_freq flag in is_state_visited()
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:02 +0000 (15:50 -0800)]
bpf: honor env->test_state_freq flag in is_state_visited()

env->test_state_freq flag can be set by user by passing
BPF_F_TEST_STATE_FREQ program flag. This is used in a bunch of selftests
to have predictable state checkpoints at every jump and so on.

Currently, bounded loop handling heuristic ignores this flag if number
of processed jumps and/or number of processed instructions is below some
thresholds, which throws off that reliable state checkpointing.

Honor this flag in all circumstances by disabling heuristic if
env->test_state_freq is set.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
16 months agoselftests/bpf: enhance align selftest's expected log matching
Andrii Nakryiko [Thu, 2 Mar 2023 23:50:01 +0000 (15:50 -0800)]
selftests/bpf: enhance align selftest's expected log matching

Allow to search for expected register state in all the verifier log
output that's related to specified instruction number.

See added comment for an example of possible situation that is happening
due to a simple enhancement done in the next patch, which fixes handling
of env->test_state_freq flag in state checkpointing logic.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>