platform/kernel/linux-rpi.git
Petr Machata [Thu, 9 Mar 2017 08:25:20 +0000 (09:25 +0100)]
mlxsw: spectrum: Add support for flower matches on VLAN ID, PCP

Introduce MLXSW_AFK_ELEMENT_VID, PCP and declare them in afk_element
infos that contain them.  Use the elements when VLAN ID or priority are
used in the flow.

Also add MLXSW_AFK_ELEMENT_VID, PCP to mlxsw_sp_acl_tcam_pattern_ipv4.
Both items are included in mlxsw_sp_afk_element_info_l2_dmac,
resp. _smac, and both MLXSW_AFK_ELEMENT_SMAC and _DMAC are already in
the pattern.
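
For readers unfamiliar with the two fields, the following is a minimal
userspace sketch of how a VLAN ID and PCP are carved out of an 802.1Q
TCI value. It is not the mlxsw code; struct vlan_key and
vlan_key_from_tci() are hypothetical names used only for illustration,
and the mask and shift are defined locally.

```c
#include <stdint.h>

/* 802.1Q Tag Control Information layout: PCP (3 bits) | DEI (1) | VID (12). */
#define VLAN_VID_MASK   0x0fffu
#define VLAN_PRIO_SHIFT 13

/* Hypothetical container for the two fields a flower rule can match on. */
struct vlan_key {
	uint16_t vid;	/* VLAN ID, 0..4095 */
	uint8_t pcp;	/* priority code point, 0..7 */
};

/* Split a host-order TCI value into VLAN ID and priority. */
static struct vlan_key vlan_key_from_tci(uint16_t tci)
{
	struct vlan_key key;

	key.vid = tci & VLAN_VID_MASK;
	key.pcp = (uint8_t)(tci >> VLAN_PRIO_SHIFT);
	return key;
}
```

A TCI of 0xA064, for example, carries priority 5 and VLAN ID 100.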

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Thu, 9 Mar 2017 08:25:19 +0000 (09:25 +0100)]
mlxsw: spectrum: Add support for vlan modify TC action

Add VLAN action offloading. Invoke it from Spectrum flower handler for
"vlan modify" actions.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kodanev [Thu, 9 Mar 2017 10:53:55 +0000 (13:53 +0300)]
tcp: rename *_sequence_number() to *_seq_and_tsoff()

The functions that return the TCP sequence number also set up the
TS offset value, so rename them to better describe their purpose.

No functional changes in this patch.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergey Shcherbakov [Thu, 9 Mar 2017 00:58:14 +0000 (02:58 +0200)]
net: ks8851: Added support for half-duplex SPI

The original driver implemented support for both half- and
full-duplex modes, but it was never enabled. Instead, the
ks8851_rx_1msg method always returned "true", which means
"full-duplex" mode.

This patch replaces the hard-coded behavior with a
flexible solution that supports both SPI modes.

Signed-off-by: Sergey Shcherbakov <shchers@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Mar 2017 01:33:30 +0000 (17:33 -0800)]
Merge branch 'bonding-winter-cleanup'

Mahesh Bandewar says:

====================
bonding: winter cleanup

A few cleanup patches that I have accumulated over time.

(a) The first two patches move the work-queue initialization from every
    ndo_open / bond_open operation to a single point during port
    creation. Initializing the work-queues on every 'ifup' operation is
    unnecessary. However, we have some mode-specific work-queues, and
    the mode can change at any time after port creation, so the second
    patch ensures the correct work-handler is called based on the mode.

(b) The third patch is simple and straightforward: it removes a
    hard-coded value that was added in the initial commit and replaces
    it with the configured default value.

(c) The final patch in the series removes the unimplemented "port-moved" state
    from the LACP state machine. This state is defined but never set, so
    removing it from the state machine logic makes the code a little cleaner.

(d) Reduce scope of some global variables to local.

Note: None of these patches are making any functional changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 8 Mar 2017 18:56:02 +0000 (10:56 -0800)]
bonding: reduce scope of some global variables

Many of the bond param variables are declared global even though it is
not really necessary for them to be global. Move them to the locations
where they are used.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 8 Mar 2017 18:55:59 +0000 (10:55 -0800)]
bonding: remove "port-moved" state that was never implemented

The LACP state-machine defines a "port-moved" state for when the same
ActorSystemID and Port are seen in a LACPDU received on a different
port. The state is never set since it's not implemented. However, the
state-machine attempts to clear that state occasionally. The LACP state
machine is already complicated, and since this state is not implemented,
removing its checks makes the state-machine a little simpler.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 8 Mar 2017 18:55:56 +0000 (10:55 -0800)]
bonding: remove hardcoded value

Eliminate hard-coded value and use the default that is set.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 8 Mar 2017 18:55:54 +0000 (10:55 -0800)]
bonding: initialize work-queues during creation of bond

Initializing work-queues every time an ifup operation is performed is
unnecessary; it can be done just once, when the port is created.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 8 Mar 2017 18:55:51 +0000 (10:55 -0800)]
bonding: restructure arp-monitor

This is in preparation for moving the work-queue initialization from the
current port_open phase to port creation. Initializing work-queues
every time we do 'ifup/ifdown' does not make sense, so move it to the
port creation phase.

ARP monitoring work depends on the bonding mode, which is not tied to
port creation and can change at any time during the life of the port.
This restructuring allows us to move the initialization to creation
without losing the ability to arm the correct work for the mode the
user has selected.
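
The idea can be sketched in plain userspace C: one handler is installed
once at creation time, and it picks the mode-specific routine only when
it runs. This is an illustrative model, not the bonding driver's code;
every identifier below (struct bond, bond_create(), arp_monitor(), the
counters) is hypothetical.

```c
enum bond_mode { MODE_ACTIVE_BACKUP, MODE_LOAD_BALANCE };

/* Hypothetical stand-in for a bond device with one ARP monitor work item. */
struct bond {
	enum bond_mode mode;			/* may change after creation */
	void (*arp_work)(struct bond *);	/* installed once, at creation */
	int ab_runs;				/* active-backup monitor runs */
	int lb_runs;				/* load-balance monitor runs */
};

static void activebackup_arp_mon(struct bond *b) { b->ab_runs++; }
static void loadbalance_arp_mon(struct bond *b)  { b->lb_runs++; }

/* Single handler: the mode is examined when the work runs, not when it
 * is initialized, so a later mode change needs no re-initialization. */
static void arp_monitor(struct bond *b)
{
	if (b->mode == MODE_ACTIVE_BACKUP)
		activebackup_arp_mon(b);
	else
		loadbalance_arp_mon(b);
}

/* One-time setup at creation; nothing here is repeated on ifup. */
static void bond_create(struct bond *b, enum bond_mode mode)
{
	b->mode = mode;
	b->arp_work = arp_monitor;
	b->ab_runs = 0;
	b->lb_runs = 0;
}
```

Because the dispatch happens inside the handler, switching b->mode
between two runs of arp_work is enough to select the other monitor.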

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 10 Mar 2017 00:40:00 +0000 (16:40 -0800)]
Merge branch 'nfp-crc32-rss-hash-port-name-reporting-and-misc-fastpath-cleanups'

Jakub Kicinski says:

====================
nfp: CRC32 RSS hash, port name reporting and misc fastpath cleanups

This series adds support for the CRC32 RSS hash function to the kernel
API, of which the NFP driver immediately makes use.  There is also a
.ndo_get_phys_port_name() implementation conforming to the switchdev
name format.  A small patch takes advantage of napi_complete_done()'s
return code.  Simon provides a fix for potentially trusting values
returned from FW too much.

A handful of unassuming fast path adjustments are also thrown in to make
the upcoming xdp_adjust_head() series easier to review.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Wed, 8 Mar 2017 16:57:08 +0000 (08:57 -0800)]
nfp: prevent theoretical buffer overrun in nfp_eth_read_ports

Prevent theoretical buffer overrun by returning an error if
the number of entries returned by the firmware does not match those
present.

Also use a common error handling path.

Found by inspection.
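
A minimal sketch of this kind of check, assuming a table in which a
non-zero lane count marks an entry as present. The struct layout and the
names fw_eth_entry and eth_count_ports() are hypothetical and only
illustrate the comparison between the count the firmware reports and the
entries actually found.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical firmware table entry; lanes == 0 means "not present". */
struct fw_eth_entry {
	unsigned int lanes;
};

/* Count the entries present in the table and refuse to continue if the
 * firmware-reported count disagrees, so that a later copy loop sized by
 * one number can never be driven by the other. */
static int eth_count_ports(const struct fw_eth_entry *tbl, size_t tbl_len,
			   size_t fw_reported)
{
	size_t present = 0;
	size_t i;

	for (i = 0; i < tbl_len; i++)
		if (tbl[i].lanes)
			present++;

	if (present != fw_reported)
		return -EIO;	/* mismatch: do not trust either value */

	return (int)present;
}
```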

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:07 +0000 (08:57 -0800)]
nfp: add metadata format bit

We only need FW version in the first cache line of adapter struct
because we need to know the metadata format.  To save space add a
metadata format bit.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:06 +0000 (08:57 -0800)]
nfp: avoid rearming the interrupts when in busy poll

Make use of return code from napi_complete_done() to avoid rearming
interrupts when busy polling is on.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:05 +0000 (08:57 -0800)]
nfp: store device pointer for the fastpath

We really only need the device pointer on the fast path, so stash it at
the beginning of the adapter structure and move the pci_dev pointer down.
This saves a few lines of code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:04 +0000 (08:57 -0800)]
nfp: reorder variables in nfp_net_tx()

Reorder variables longest to shortest to comply with netdev coding style.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:03 +0000 (08:57 -0800)]
nfp: move more ring debug info to debugfs

We already print most of ring configuration including descriptors
in debugfs, add the few missing pieces and remove debug prints.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:02 +0000 (08:57 -0800)]
nfp: implement .ndo_get_phys_port_name()

NSP reports port labels to us.  The first id is the id of the physical
port; the other one tells us which logical interface it is within a
split port.  Instead of printing them as a string, keep them in integer
format.  Compute which interfaces are part of a port split.

On netdev side use port labels and split information to provide a
.ndo_get_phys_port_name() implementation.  We follow the name format
of mlxsw which is also suggested in "Port Netdev Naming" section
of Documentation/networking/switchdev.txt.
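
As an illustration of that name format ("pX" for a whole port, "pXsY"
for a logical interface within a split port), here is a small userspace
sketch. The struct and function names are hypothetical and are not
claimed to match the driver; only the naming scheme comes from the text
above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical port description built from the integer labels. */
struct eth_port {
	unsigned int label_port;	/* physical port id */
	unsigned int label_subport;	/* logical interface within a split */
	bool is_split;			/* true if the port is split */
};

/* Format the physical port name; fail if the buffer is too small. */
static int phys_port_name(const struct eth_port *port, char *name, size_t len)
{
	int n;

	if (!port->is_split)
		n = snprintf(name, len, "p%u", port->label_port);
	else
		n = snprintf(name, len, "p%us%u", port->label_port,
			     port->label_subport);

	if (n < 0 || (size_t)n >= len)
		return -EINVAL;	/* output was truncated */

	return 0;
}
```

With these assumptions, port 1 unsplit yields "p1", and subport 3 of
split port 2 yields "p2s3".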

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:01 +0000 (08:57 -0800)]
nfp: add support for reporting CRC32 hash function

Some firmware images may reuse CRC32 hardware to compute RXHASH.
Make sure we report the correct hash function.  Note that we don't
support changing functions at runtime.  That would also require
a few more additions to the way the key is set because different
functions have different key sizes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 Mar 2017 16:57:00 +0000 (08:57 -0800)]
ethtool: add CRC32 as an RSS hash function

CRC32 engines are usually easily available in hardware and generate
OK spread for RSS hash.  Add CRC32 RSS hash function to ethtool API.
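
To give a feel for what such a hash looks like, below is a sketch that
computes the common reflected CRC-32 (polynomial 0xEDB88320) over a
buffer and maps the result to a queue. This is an assumption-laden
illustration: the commit does not say which CRC32 variant or seed a
given NIC uses, real devices normally go through an indirection table
rather than a plain modulo, and crc32_ieee() and rss_pick_queue() are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32 (polynomial 0xEDB88320, init and final XOR
 * of 0xFFFFFFFF). Hardware engines may use a different variant. */
static uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xffffffffu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xedb88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Simplified queue selection: hash modulo the number of RX queues.
 * n_queues must be non-zero. */
static unsigned int rss_pick_queue(uint32_t hash, unsigned int n_queues)
{
	return hash % n_queues;
}
```

In practice the input would be the flow tuple (addresses and ports)
rather than an arbitrary byte string.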

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 9 Mar 2017 12:54:08 +0000 (13:54 +0100)]
net/socket: use per af lockdep classes for sk queues

Currently the sock queue's spin locks get their lockdep
classes from the default spin_lock_init() initializer:
all socket families get - usually, see below - a single
class for rx, another specific class for tx, etc.
This can lead to false positive lockdep splats, as
reported by Andrey.
Moreover, there are two separate initialization points
for the sock queues, one in sk_clone_lock() and one
in sock_init_data(), so that e.g. the rx queue lock
can get one of two possible, different classes, depending
on whether the socket is cloned or not.
This change tries to address the above by explicitly setting
a per address family lockdep class for each queue's
spinlock. Also, move the duplicated initialization code to a
single location.

v1 -> v2:
 - renamed the init helper

rfc -> v1:
 - no changes, tested with several different workloads

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Deng [Wed, 8 Mar 2017 06:06:18 +0000 (14:06 +0800)]
net: dwc-xlgmac: Initial driver for DesignWare Enterprise Ethernet

Synopsys provides a new DesignWare Core Enterprise Ethernet MAC
IP (DWC-XLGMAC) for Ethernet designs. It is compliant with the
IEEE 802.3-2012 specifications, including IEEE 802.3ba and
consortium specifications.

This patch provides the initial 25G/40G/50G/100G Ethernet driver
for Synopsys XLGMAC IP Prototyping Kit.

Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 Mar 2017 21:25:05 +0000 (13:25 -0800)]
Merge branch 'xgene-v2'

Iyappan Subramanian says:

====================
drivers: net: xgene-v2: Add RGMII based 1G driver

This patch set adds support for RGMII based 1GbE hardware which uses a linked
list of DMA descriptor architecture (v2) for APM X-Gene SoCs.

v4: Address review comments from v3
- fixed local variable declarations to reverse christmas tree order

v3: Address review comments from v2
- fixed kbuild warnings (this 'if' clause does not guard)

v2: Address review comments from v1
- moved create_desc_ring and delete_desc_ring to open() and close()
  respectively
- changed to use dma_zalloc APIs
- fixed tx_timeout()
- removed tx completion polling upper bound
- added error checking on rx packets
- added netif_stop_queue() and netif_wake_queue()

v1:
- Initial version
====================

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Wed, 8 Mar 2017 01:08:45 +0000 (17:08 -0800)]
MAINTAINERS: Add entry for APM X-Gene SoC Ethernet (v2) driver

This patch adds a MAINTAINERS entry for the ethernet driver for
the on-chip ethernet interface which uses a linked list of DMA
descriptor architecture (v2) for APM X-Gene SoCs.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Wed, 8 Mar 2017 01:08:44 +0000 (17:08 -0800)]
drivers: net: xgene-v2: Add transmit and receive

This patch adds,
    - Transmit
    - Transmit completion poll
    - Receive poll
    - NAPI handler

and enables the driver.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Wed, 8 Mar 2017 01:08:43 +0000 (17:08 -0800)]
drivers: net: xgene-v2: Add base driver

This patch adds,

     - probe, remove, shutdown
     - open, close and stats
     - create and delete ring
     - request and delete irq

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Wed, 8 Mar 2017 01:08:42 +0000 (17:08 -0800)]
drivers: net: xgene-v2: Add ethernet hardware configuration

This patch adds functions to configure ethernet hardware.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Wed, 8 Mar 2017 01:08:41 +0000 (17:08 -0800)]
drivers: net: xgene-v2: Add mac configuration

This patch adds functions to configure and control mac.  This
patch also adds helper functions to get/set registers.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Wed, 8 Mar 2017 01:08:40 +0000 (17:08 -0800)]
drivers: net: xgene-v2: Add DMA descriptor

This patch adds DMA descriptor setup and interrupt enable/disable
functions.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Tue, 7 Mar 2017 19:40:41 +0000 (11:40 -0800)]
liquidio: add support for XPS

Add support for XPS.

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joao Pinto [Tue, 7 Mar 2017 15:27:36 +0000 (15:27 +0000)]
net: stmicro: replace kzalloc with devm_kzalloc

The axi variable was not being freed upon device removal.
Using devm_kzalloc ensures that it is properly freed.

Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Tue, 7 Mar 2017 15:27:10 +0000 (16:27 +0100)]
net: mediatek: Use eth_hw_addr_random()

Use eth_hw_addr_random() to set a random dev_addr and update
addr_assign_type instead of open-coding it.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 6 Mar 2017 20:56:02 +0000 (12:56 -0800)]
tg3: Add the ability to conditionally build w/ HWMON

Introduce a Kconfig option, CONFIG_TIGON3_HWMON, which allows support
for the thermal sensors reported by Tigon3 NICs to be built in or out.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 Mar 2017 18:12:14 +0000 (10:12 -0800)]
Merge branch 'mvpp2-add-initial-support-for-PPv2.2'

Thomas Petazzoni says:

====================
net: mvpp2: add initial support for PPv2.2

The goal of this patch series is to add basic support for PPv2.2 in
the existing mvpp2 driver. mvpp2 currently supports the PPv2.1
version of the IP, used in the 32-bit Marvell Armada 375 SoC. PPv2.2
is an evolution of this IP block, used in the 64-bit Marvell Armada
7K/8K SoCs.

In order to ease the review, the introduction of PPv2.2 support has
been split into multiple small commits, with the final commit adding
the compatible string that makes the PPv2.2 support actually
usable. The series remains fully bisectable.

People interested in testing the code will find the full series (plus
a few Device Tree patches) at:

  https://github.com/MISL-EBU-System-SW/mainline-public/tree/4.11/mvpp2.2-support-v3

I'd like to thank Stefan Chulski and Marcin Wojtas, who helped me a
lot in the development of this patch series, by reviewing the patches,
and giving lots of useful hints to debug the driver on PPv2.2. Thanks
as well to Russell King for reviewing previous iterations of this
series, and providing suggestions and fixes.

Changes between v2 and v3:

 - Rebased on v4.11-rc1.

 - Add patch "net: mvpp2: fix DMA address calculation in
   mvpp2_txq_inc_put()", to properly take into account the "packet
   offset" field of the TX descriptors. Without this, we were getting
   DMA_API_DEBUG warnings that we are unmapping DMA mappings with a
   non-mapped DMA address.

 - In patch "net: mvpp2: add and use accessors for TX/RX descriptors",
   add a function named mvpp2_txdesc_offset_get(), which is needed for
   the DMA address calculation fix.

 - In patch "net: mvpp2: add and use accessors for TX/RX descriptors",
   fix the calculation of tx_desc physical address and packet offset
   in mvpp2_tx_frag_process(). The offset was assigned into the buffer
   physical address, and the physical address to the packet offset,
   which meant the fragment process was completely broken.

 - In patch "net: mvpp2: adjust the allocation/free of BM pools for
   PPv2.2" fix how MVPP22_BM_ADDR_HIGH_VIRT_RLS_MASK is used. This
   mask is already shifted. So the value should be shifted before
   being masked and not the opposite.

 - Add a new patch "net: mvpp2: set dma mask and coherent dma mask on
   PPv2.2", to set the DMA mask and DMA coherent mask. By setting the
   DMA mask to 40 bits we avoid using bounce buffers when network
   packets are above the 4 GB limit. The coherent mask remains set to
   32 bits, because the BM pools must all have the same high 32 bits
   in their addresses.

 - Use "dma" instead of "phys" where appropriate, as suggested by
   Russell King.

 - Use the "cookie" field of the RX descriptor to store the physical
   address instead of the virtual address, and then use phys_to_virt()
   to get the virtual address. This works around the limitation
   that the "cookie" field only has 40 bits, which is not sufficient
   to store a virtual address on 64-bit platforms. This was suggested
   by Russell King.

   As part of this change, also got rid of all the compile time
   conditionals on CONFIG_ARCH_DMA_ADDR_T_64BIT, to get better
   compile-time coverage.

 - In patch "net: mvpp2: handle misc PPv2.1/PPv2.2 differences":

    * Instead of calling mvpp21_port_power_up(port) only on PPv2.1,
      remove this function, and call its relevant parts directly from
      ->probe(). Only mvpp2_port_fc_adv_enable() is PPv2.1
      specific. Reported by Russell King.

    * Add a mvpp22_port_mii_set() function that properly initializes
      SGMII support on PPv2.2. Code provided by Russell King.

 - In patch "net: mvpp2: handle register mapping and access for PPv2.2":

    * Adjust the code to match the change of the DT binding in terms
      of mapping the second register area on PPv2.2.

    * Rework the register accessors to remove the get_cpu()/put_cpu(),
      and instead use separate accessors for global registers
      vs. per-CPU registers.

 - Add a few new patches removing dead/unused/useless code:

   net: mvpp2: remove support for buffer header
   net: mvpp2: remove unused register definition MVPP2_TXQ_THRESH_REG
   net: mvpp2: remove mvpp2_txq_pend_desc_num_get() function

 - Fix a number of checkpatch warnings.

Changes between v1 and v2:

 - Made a separate series from the set of patches doing preparation
   changes/fixes to the mvpp2 driver.

 - Rebased on top of v4.10-rc1.

 - Update Kconfig text of the mvpp2 driver to mention the support for
   Armada 7K and 8K (PPv2.2).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:20 +0000 (16:53 +0100)]
net: mvpp2: finally add the PPv2.2 compatible string

Now that the mvpp2 driver has been modified to accommodate the support
for PPv2.2, we can finally advertise this support by adding the
appropriate compatible string.

At the same time, we update the Kconfig description of the MVPP2 driver.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:19 +0000 (16:53 +0100)]
net: mvpp2: set dma mask and coherent dma mask on PPv2.2

On PPv2.2, the streaming mappings can be anywhere in the first 40 bits
of the physical address space. However, for the coherent mappings, we
still need them to be in the first 32 bits of the address space,
because all BM pools share a single register to store the high 32 bits
of the BM pool address, which means all BM pools must be allocated in
the same 4GB memory area.
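
The effect of the two masks can be sketched in userspace C. The
DMA_BIT_MASK() macro below is defined locally for the example and
mirrors the usual "n low bits set" construction; dma_range_ok() is a
hypothetical helper, not a kernel API, that simply checks whether a
buffer lies entirely within the addressable range of a mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local definition for this sketch: a mask with the n low bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: true if [addr, addr + size) fits under mask.
 * Written to avoid overflow in addr + size. */
static bool dma_range_ok(uint64_t addr, uint64_t size, uint64_t mask)
{
	return size != 0 && addr <= mask && size - 1 <= mask - addr;
}
```

Under these assumptions a 2 KB packet buffer at 0x1200000000 (above
4 GB) is acceptable for a 40-bit streaming mask but not for the 32-bit
coherent mask used for the BM pools.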

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:18 +0000 (16:53 +0100)]
net: mvpp2: add support for an additional clock needed for PPv2.2

The PPv2.2 variant of the network controller needs an additional
clock, the "MG clock" in order for the IP block to operate
properly. This commit adds support for this additional clock to the
driver, reworking as needed the error handling path.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:17 +0000 (16:53 +0100)]
net: mvpp2: adapt rxq distribution to PPv2.2

In PPv2.1, we have a maximum of 8 RXQs per port, with a default of 4
RXQs per port, and we were assigning RXQs 0->3 to the first port, 4->7
to the second port, 8->11 to the third port, etc.

In PPv2.2, we have a maximum of 32 RXQs per port, and we must allocate
RXQs from the range of 32 RXQs available for each port. So port 0 must
use RXQs in the range 0->31, port 1 in the range 32->63, etc.

This commit adapts the mvpp2 to this difference between PPv2.1 and
PPv2.2:

 - The constant definition MVPP2_MAX_RXQ is replaced by a new field
   'max_port_rxqs' in 'struct mvpp2', which stores the maximum number of
   RXQs per port. This field is initialized during ->probe() depending
   on the IP version.

 - MVPP2_RXQ_TOTAL_NUM is removed, and instead we calculate the total
   number of RXQs by multiplying the number of ports by the maximum
   number of RXQs per port. This was used in only one place anyway.

 - In mvpp2_port_probe(), the calculation of port->first_rxq is adjusted
   to cope with the different allocation strategy between PPv2.1 and
   PPv2.2. Due to this change, the 'next_first_rxq' argument of this
   function is no longer needed and is removed.
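
The two allocation strategies described above can be sketched as
follows. This is a simplified model rather than the driver code: the
struct, the enum values and the helper names are chosen for
illustration, and only the numbers (8 vs. 32 RXQs per port, packed vs.
fixed ranges) come from the description.

```c
/* Illustrative IP version tag. */
enum pp2_version { PPV21, PPV22 };

/* Hypothetical slice of the driver's private data. */
struct pp2 {
	enum pp2_version hw_version;
	unsigned int max_port_rxqs;	/* set once at probe time */
};

/* Probe-time initialization of the per-port RXQ maximum. */
static void pp2_init(struct pp2 *priv, enum pp2_version ver)
{
	priv->hw_version = ver;
	priv->max_port_rxqs = (ver == PPV21) ? 8 : 32;
}

/* First RXQ of a port: PPv2.1 packs the RXQs actually in use back to
 * back, PPv2.2 gives every port a fixed window of max_port_rxqs. */
static unsigned int pp2_first_rxq(const struct pp2 *priv, unsigned int port_id,
				  unsigned int rxqs_in_use)
{
	if (priv->hw_version == PPV21)
		return port_id * rxqs_in_use;

	return port_id * priv->max_port_rxqs;
}

/* Total RXQs, replacing a compile-time constant. */
static unsigned int pp2_total_rxqs(const struct pp2 *priv, unsigned int n_ports)
{
	return n_ports * priv->max_port_rxqs;
}
```

With 4 RXQs in use per port, the third port starts at RXQ 8 on PPv2.1,
while the second port starts at RXQ 32 on PPv2.2.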

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:16 +0000 (16:53 +0100)]
net: mvpp2: rework RXQ interrupt group initialization for PPv2.2

This commit adjusts how the MVPP2_ISR_RXQ_GROUP_REG register is
configured, since it changed between PPv2.1 and PPv2.2.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:15 +0000 (16:53 +0100)]
net: mvpp2: add AXI bridge initialization for PPv2.2

The PPv2.2 unit is connected to an AXI bus on Armada 7K/8K, so this
commit adds the necessary initialization of the AXI bridge.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:14 +0000 (16:53 +0100)]
net: mvpp2: handle misc PPv2.1/PPv2.2 differences

This commit handles a few miscellaneous differences between PPv2.1 and
PPv2.2 in different areas, where code written for PPv2.1 doesn't apply
to PPv2.2 or needs to be adjusted (getting the MAC address, disabling
PHY polling, etc.).

Thanks to Russell King for providing the initial implementation of
mvpp22_port_mii_set().

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:13 +0000 (16:53 +0100)]
net: mvpp2: handle register mapping and access for PPv2.2

This commit adjusts the mvpp2 driver register mapping and access logic
to support PPv2.2, to handle a number of differences.

Due to how the registers are laid out in memory, the Device Tree binding
for the "reg" property is different:

 - On PPv2.1, we had a first area for the packet processor
   registers (common to all ports), and then one area per port.

 - On PPv2.2, we have a first area for the packet processor
   registers (common to all ports), and a second area for numerous other
   registers, including a large number of per-port registers.

In addition, on PPv2.2, the area for the common registers is split into
so-called "address spaces" of 64 KB each. They allow access to per-CPU
registers, where each CPU has its own copy of some registers. A few
other registers, which have a single copy, also need to be accessed from
those per-CPU windows if they are related to a per-CPU register. For
example:

  - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a
    per-CPU register, so it must be accessed from the current CPU's
    register window.

  - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and
    a few others) will affect the TX queue that was selected by the
    write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU
    window as the write to the TXQ_NUM_REG.

Therefore, the ->base member of 'struct mvpp2' is replaced with a
->cpu_base[] array, each entry pointing to a mapping of the per-CPU
area. Since PPv2.1 doesn't have this concept of per-CPU windows, all
entries in ->cpu_base[] point to the same io-remapped area.

The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0];
they are used for registers for which the CPU window doesn't matter.

mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to
access the registers for which the CPU window does matter, which is why
they take a "cpu" as argument.

The driver is then changed to use mvpp2_percpu_read() and
mvpp2_percpu_write() where it matters.
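
The accessor split can be modelled in a few lines of userspace C, with
plain arrays standing in for the io-remapped windows. This is only a
sketch of the indexing scheme: NCPUS, WINDOW_WORDS, TXQ_NUM, the pp2_*
names and the map_v21()/map_v22() helpers are all hypothetical, and the
model does not reproduce how single-copy registers appear in every
hardware window.

```c
#include <stdint.h>

#define NCPUS		4
#define WINDOW_WORDS	16	/* tiny stand-in for a 64 KB address space */
#define TXQ_NUM		1	/* hypothetical per-CPU register index */

struct pp2 {
	uint32_t *cpu_base[NCPUS];	/* one window per CPU */
};

static uint32_t regs_v21[WINDOW_WORDS];		/* single shared area */
static uint32_t regs_v22[NCPUS][WINDOW_WORDS];	/* one area per CPU */

/* PPv2.1: no per-CPU windows, so every entry aliases the same area. */
static void map_v21(struct pp2 *priv)
{
	int cpu;

	for (cpu = 0; cpu < NCPUS; cpu++)
		priv->cpu_base[cpu] = regs_v21;
}

/* PPv2.2: each CPU gets its own window. */
static void map_v22(struct pp2 *priv)
{
	int cpu;

	for (cpu = 0; cpu < NCPUS; cpu++)
		priv->cpu_base[cpu] = regs_v22[cpu];
}

/* Accessors for registers where the CPU window does not matter. */
static void pp2_write(struct pp2 *priv, unsigned int reg, uint32_t val)
{
	priv->cpu_base[0][reg] = val;
}

static uint32_t pp2_read(struct pp2 *priv, unsigned int reg)
{
	return priv->cpu_base[0][reg];
}

/* Accessors for registers that must go through a given CPU's window. */
static void pp2_percpu_write(struct pp2 *priv, int cpu, unsigned int reg,
			     uint32_t val)
{
	priv->cpu_base[cpu][reg] = val;
}

static uint32_t pp2_percpu_read(struct pp2 *priv, int cpu, unsigned int reg)
{
	return priv->cpu_base[cpu][reg];
}
```

In this model, a queue selected through CPU 1's window on PPv2.2 is
invisible through CPU 2's window, whereas on PPv2.1 all CPUs observe the
same value.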

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:12 +0000 (16:53 +0100)]
net: mvpp2: adjust mvpp2_{rxq, txq}_init for PPv2.2

In PPv2.2, the MVPP2_RXQ_DESC_ADDR_REG and MVPP2_TXQ_DESC_ADDR_REG
registers have a slightly different layout, because they need to contain
a 64-bit address for the RX and TX descriptor arrays. This commit
adjusts those functions accordingly.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:11 +0000 (16:53 +0100)]
net: mvpp2: adapt mvpp2_defaults_set() to PPv2.2

This commit modifies the mvpp2_defaults_set() function to not do the
loopback and FIFO threshold initialization, which are not needed for
PPv2.2.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:10 +0000 (16:53 +0100)]
net: mvpp2: adapt the mvpp2_rxq_*_pool_set functions to PPv2.2

The MVPP2_RXQ_CONFIG_REG register has a slightly different layout
between PPv2.1 and PPv2.2, so this commit adapts the functions modifying
this register to accommodate both the PPv2.1 and PPv2.2 cases.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: adjust the allocation/free of BM pools for PPv2.2
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:09 +0000 (16:53 +0100)]
net: mvpp2: adjust the allocation/free of BM pools for PPv2.2

This commit adjusts the allocation and freeing of BM pools to support
PPv2.2. This involves:

 - Checking that the number of buffer pointers is a multiple of 16, as
   required by the hardware.

 - Adjusting the size of the DMA coherent area allocated for buffer
   pointers. PPv2.2 needs space for two 64-bit pointers per buffer,
   as opposed to two 32-bit pointers per buffer in PPv2.1. The size
   in bytes is now stored in a new field of the mvpp2_bm_pool
   structure.

 - On PPv2.2, getting the DMA address and cookie (used for the physical
   address) of each buffer requires reading the
   MVPP22_BM_ADDR_HIGH_ALLOC register to get the high order bits of
   those addresses. A new utility function mvpp2_bm_bufs_get_addrs() is
   introduced to handle this.

 - On PPv2.2, releasing a buffer requires writing the high order 32 bits
   of the DMA address and cookie to MVPP22_BM_PHY_VIRT_HIGH_RLS_REG.
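
The arithmetic behind the points above can be sketched in userspace C.
This is an illustration under assumptions, not the driver code: all
sketch_* names are hypothetical, and the split into a low 32-bit word
plus high-order bits is modelled on plain integers rather than on the
actual register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_hw_version { SKETCH_MVPP21, SKETCH_MVPP22 };

/* The number of buffer pointers must be a multiple of 16. */
static int sketch_bm_size_valid(unsigned int nbufs)
{
	return nbufs % 16 == 0;
}

/*
 * Size in bytes of the pointer area: two pointers per buffer,
 * 32-bit wide on PPv2.1 and 64-bit wide on PPv2.2.
 */
static size_t sketch_bm_area_bytes(enum sketch_hw_version hw,
				   unsigned int nbufs)
{
	size_t ptr_size = (hw == SKETCH_MVPP22) ? sizeof(uint64_t)
						: sizeof(uint32_t);

	return 2 * ptr_size * nbufs;
}

/* Low 32 bits of an address, as written to a 32-bit register. */
static uint32_t sketch_addr_low(uint64_t addr)
{
	return (uint32_t)(addr & 0xffffffffu);
}

/* High-order bits of an address, kept in a separate register on PPv2.2. */
static uint32_t sketch_addr_high(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

/* Rebuild the full address from the two register values. */
static uint64_t sketch_addr_join(uint32_t high, uint32_t low)
{
	return ((uint64_t)high << 32) | low;
}
```

For 16 buffers this gives 128 bytes on PPv2.1 and 256 bytes on PPv2.2,
which is why the size has to be stored per pool rather than derived
from a single constant.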

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:08 +0000 (16:53 +0100)]
net: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors

This commit adds the definition of the PPv2.2 HW descriptors, adjusts
the mvpp2_tx_desc and mvpp2_rx_desc structures accordingly, and adapts
the accessors to work on both PPv2.1 and PPv2.2.
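
A minimal sketch of the union-plus-accessor pattern, assuming heavily
simplified descriptor layouts. The field names, the sketch_* types and
the 40-bit mask placement are assumptions for illustration; the real
descriptors contain many more fields.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_hw_version { SKETCH_MVPP21, SKETCH_MVPP22 };

/* Simplified, hypothetical layouts: only the fields the example needs. */
struct sketch_pp21_rx_desc {
	uint32_t buf_dma_addr;            /* 32-bit DMA address */
	uint32_t buf_cookie;              /* 32-bit driver cookie */
};

struct sketch_pp22_rx_desc {
	uint64_t buf_dma_addr_and_flags;  /* low 40 bits: DMA address */
	uint64_t buf_cookie_and_flags;    /* low 40 bits: driver cookie */
};

/* The rest of the code only ever sees this opaque wrapper. */
struct sketch_rx_desc {
	union {
		struct sketch_pp21_rx_desc pp21;
		struct sketch_pp22_rx_desc pp22;
	} u;
};

struct sketch_port {
	enum sketch_hw_version hw_version;
};

#define SKETCH_ADDR_MASK_40 ((UINT64_C(1) << 40) - 1)

/* Accessor: picks the right union member based on the hardware version. */
static uint64_t sketch_rxdesc_dma_addr_get(const struct sketch_port *port,
					   const struct sketch_rx_desc *desc)
{
	if (port->hw_version == SKETCH_MVPP21)
		return desc->u.pp21.buf_dma_addr;
	return desc->u.pp22.buf_dma_addr_and_flags & SKETCH_ADDR_MASK_40;
}

static void sketch_rxdesc_dma_addr_set(const struct sketch_port *port,
				       struct sketch_rx_desc *desc,
				       uint64_t addr)
{
	if (port->hw_version == SKETCH_MVPP21) {
		desc->u.pp21.buf_dma_addr = (uint32_t)addr;
	} else {
		/* Keep the bits above the 40-bit address field untouched. */
		uint64_t val = desc->u.pp22.buf_dma_addr_and_flags;

		val &= ~SKETCH_ADDR_MASK_40;
		val |= addr & SKETCH_ADDR_MASK_40;
		desc->u.pp22.buf_dma_addr_and_flags = val;
	}
}
```

Because callers go through the accessors, only these functions need to
know which member of the union is live.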

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: introduce an intermediate union for the TX/RX descriptors
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:07 +0000 (16:53 +0100)]
net: mvpp2: introduce an intermediate union for the TX/RX descriptors

Since the format of the HW descriptors is different between PPv2.1 and
PPv2.2, this commit introduces an intermediate union, which for now
contains only the PPv2.1 descriptors. The bulk of the driver code only
manipulates opaque mvpp2_tx_desc and mvpp2_rx_desc pointers, and the
descriptors can only be accessed and modified through the accessor
functions. A follow-up commit will add the descriptor definitions for
PPv2.2.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: add hw_version field in "struct mvpp2"
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:06 +0000 (16:53 +0100)]
net: mvpp2: add hw_version field in "struct mvpp2"

In preparation for the introduction of PPv2.2 support in the mvpp2
driver, this commit adds a hw_version field to struct mvpp2, and uses
the .data field of the DT match table to fill it in.

Having the MVPP21 and MVPP22 definitions available will allow us to
start adding the necessary conditional code to support PPv2.2.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: add and use accessors for TX/RX descriptors
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:05 +0000 (16:53 +0100)]
net: mvpp2: add and use accessors for TX/RX descriptors

The PPv2.2 IP has a different TX and RX descriptor layout compared to
PPv2.1. In order to prepare for the introduction of PPv2.2 support in
mvpp2, this commit adds accessors for the different fields of the TX
and RX descriptors, and changes the code to use them.

For now, the mvpp2_port argument passed to the accessors is not used,
but it will be used in follow-up commits to update the descriptor
according to the version of the IP being used.

Apart from the mechanical changes to use the newly introduced
accessors, a few other changes, needed to use the accessors, are made:

 - The mvpp2_txq_inc_put() function now takes a mvpp2_port as first
   argument, as it is needed to use the accessors.

 - Similarly, the mvpp2_bm_cookie_build() gains a mvpp2_port first
   argument, for the same reason.

 - In mvpp2_rx_error(), instead of accessing the RX descriptor in each
   case of the switch, we introduce a local variable to store the
   packet size.

 - In mvpp2_tx_frag_process() and mvpp2_tx() instead of accessing the
   packet size from the TX descriptor, we use the actual value
   available in the function, which is used to set the TX descriptor
   packet size a few lines before.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: store physical address of buffer in rx_desc->buf_cookie
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:04 +0000 (16:53 +0100)]
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie

The RX descriptors of the PPv2 hardware can store several pieces of
information, among which:

 - the DMA address of the buffer in which the data has been received
 - a "cookie" field, left to the use of the driver, and not used by the
   hardware

In the current implementation, the "cookie" field is used to store the
virtual address of the buffer, so that in the receive completion path,
we can easily get the virtual address of the buffer that corresponds to
a completed RX descriptor.

On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide,
which is enough to store a DMA address in the first field, and a virtual
address in the second field.

On PPv2.2, used on 64-bit platforms, these two fields have been extended
to 40 bits. While 40 bits is enough to store a DMA address (as long as
the DMA mask is 40 bits or lower), it is not enough to store a virtual
address. Therefore, the "cookie" field can no longer be used to store
the virtual address of the buffer.

However, as Russell King pointed out, the RX buffers are always
allocated in the kernel linear mapping, and therefore using
phys_to_virt() on the physical address of the RX buffer is possible and
correct.

Therefore, this commit changes the driver to use the "cookie" field to
store the physical address instead of the virtual
address. phys_to_virt() is used in the receive completion path to
retrieve the virtual address from the physical address.

It is obviously important to realize that the DMA address and physical
address are two different things, which is why we store both in the RX
descriptors. While those addresses may be identical in some situations,
they remain two distinct concepts, and both addresses should be handled
separately.
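
The size argument can be checked with a small userspace sketch. It
assumes a linear mapping modelled as a constant offset; the offset
value and the sketch_* helpers are hypothetical, and the kernel's
phys_to_virt() is architecture-specific rather than this simple.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical base of a 64-bit kernel linear mapping. */
#define SKETCH_LINEAR_OFFSET UINT64_C(0xffff800000000000)

/* Does the value fit in a 40-bit descriptor field? */
static int sketch_fits_in_40_bits(uint64_t val)
{
	return (val >> 40) == 0;
}

/* Model of a linear mapping: virtual = physical + constant offset. */
static uint64_t sketch_phys_to_virt(uint64_t phys)
{
	return phys + SKETCH_LINEAR_OFFSET;
}

static uint64_t sketch_virt_to_phys(uint64_t virt)
{
	return virt - SKETCH_LINEAR_OFFSET;
}
```

A physical address such as 0x80001000 fits in the 40-bit cookie, while
the matching virtual address in this model does not, so only the
physical address is stored and the virtual one is recomputed on
completion.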

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: remove mvpp2_txq_pend_desc_num_get() function
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:03 +0000 (16:53 +0100)]
net: mvpp2: remove mvpp2_txq_pend_desc_num_get() function

The mvpp2_txq_pend_desc_num_get() function only selects a TX queue, and
reads the number of pending descriptors. It is used in only one place,
in mvpp2_txq_clean(), where the TX queue has already been selected by a
write to MVPP2_TXQ_NUM_REG.

Therefore, this function is useless, and the caller can simply read the
value of the MVPP2_TXQ_PENDING_REG register instead.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: remove unused register definition MVPP2_TXQ_THRESH_REG
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:02 +0000 (16:53 +0100)]
net: mvpp2: remove unused register definition MVPP2_TXQ_THRESH_REG

This register is no longer used since commit edc660fa09e2 ("net: mvpp2:
replace TX coalescing interrupts with hrtimer").

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: remove support for buffer header
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:01 +0000 (16:53 +0100)]
net: mvpp2: remove support for buffer header

The "buffer header" functionality is used by the hardware to split an
incoming packet over multiple BM buffers if they are not large
enough. However, the mvpp2 driver guarantees that a pool
of BM buffers has buffers with a size large enough to store MTU-sized
packets. Therefore, this functionality is completely unused, and the
code can be removed, and we should never get a descriptor with bit
MVPP2_RXD_BUF_HDR set.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: use "dma" instead of "phys" where appropriate
Thomas Petazzoni [Tue, 7 Mar 2017 15:53:00 +0000 (16:53 +0100)]
net: mvpp2: use "dma" instead of "phys" where appropriate

As indicated by Russell King, the mvpp2 driver currently uses "phys"
or "phys_addr" in many places to store what really is a DMA address.
This commit clarifies this by using "dma" or "dma_addr" where
appropriate.

This is especially important as we are going to introduce more changes
where the distinction between physical address and DMA address will be
key.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodt-bindings: net: update Marvell PPv2 binding for PPv2.2 support
Thomas Petazzoni [Tue, 7 Mar 2017 15:52:59 +0000 (16:52 +0100)]
dt-bindings: net: update Marvell PPv2 binding for PPv2.2 support

The Marvell PPv2 Device Tree binding was so far only used to describe
the PPv2.1 network controller, used in the Marvell Armada 375.

A new version of this IP block, PPv2.2 is used in the Marvell Armada
7K/8K processor. This commit extends the existing binding so that it can
also be used to describe PPv2.2 hardware.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'mlx4-order-0-allocations-and-page-recycling'
David S. Miller [Thu, 9 Mar 2017 17:54:48 +0000 (09:54 -0800)]
Merge branch 'mlx4-order-0-allocations-and-page-recycling'

Eric Dumazet says:

====================
mlx4: order-0 allocations and page recycling

As mentioned half a year ago, we had better switch the mlx4 driver to
order-0 allocations and page recycling.

This reduces vulnerability surface thanks to better skb->truesize
tracking and provides better performance in most cases.
(33 Gbit for one TCP flow on my lab hosts)

I will provide for linux-4.13 a patch on top of this series,
trying to improve data locality as described in
https://www.spinics.net/lists/netdev/msg422258.html

v2 provides an ethtool -S new counter (rx_alloc_pages) and
code factorization, plus Tariq fix.

v3 includes various fixes based on Tariq tests and feedback
from Saeed and Tariq.

v4 rebased on net-next for inclusion in linux-4.12, as requested
by Tariq.

Worth noting this patch series deletes ~250 lines of code ;)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: remove duplicate code in mlx4_en_process_rx_cq()
Eric Dumazet [Wed, 8 Mar 2017 16:17:18 +0000 (08:17 -0800)]
mlx4: remove duplicate code in mlx4_en_process_rx_cq()

We should keep one way to build skbs, regardless of GRO being on or off.

Note that I made sure to defer as much as possible the point we need to
pull data from the frame, so that future prefetch() we might add
are more effective.

These skb attributes derive from the CQE or ring :
 ip_summed, csum
 hash
 vlan offload
 hwtstamps
 queue_mapping

As a bonus, this patch removes mlx4 dependency on eth_get_headlen()
which is very often broken enough to give us headaches.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: make validate_loopback() more generic
Eric Dumazet [Wed, 8 Mar 2017 16:17:17 +0000 (08:17 -0800)]
mlx4: make validate_loopback() more generic

Testing a boolean in fast path is not worth duplicating
the code allocating packets, when GRO is on or off.

If this proves to be a problem, we might later use a jump label.

Next patch will remove this duplicated code and ease code review.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: factorize page_address() calls
Eric Dumazet [Wed, 8 Mar 2017 16:17:16 +0000 (08:17 -0800)]
mlx4: factorize page_address() calls

We need to compute the frame virtual address at different points.
Do it once.

Following patch will use the new va address for validate_loopback()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: do not access rx_desc from mlx4_en_process_rx_cq()
Eric Dumazet [Wed, 8 Mar 2017 16:17:15 +0000 (08:17 -0800)]
mlx4: do not access rx_desc from mlx4_en_process_rx_cq()

Instead of fetching dma address from rx_desc->data[0].addr,
prefer using frags[0].dma + frags[0].page_offset to avoid
a potential cache line miss.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: add rx_alloc_pages counter in ethtool -S
Eric Dumazet [Wed, 8 Mar 2017 16:17:14 +0000 (08:17 -0800)]
mlx4: add rx_alloc_pages counter in ethtool -S

This new counter tracks number of pages that we allocated for one port.

lpaa24:~# ethtool -S eth0 | egrep 'rx_alloc_pages|rx_packets'
     rx_packets: 306755183
     rx_alloc_pages: 932897

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: add page recycling in receive path
Eric Dumazet [Wed, 8 Mar 2017 16:17:13 +0000 (08:17 -0800)]
mlx4: add page recycling in receive path

Same technique as some Intel drivers, for arches where PAGE_SIZE = 4096

In most cases, pages are reused because they were consumed
before we could loop around the RX ring.

This brings back performance, and is even better:
a single TCP flow reaches 30Gbit on my hosts.

v2: added full memset() in mlx4_en_free_frag(), as Tariq found it was needed
if we switch to large MTU, as priv->log_rx_info can dynamically be changed.
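
The half-page recycling idea can be sketched as follows. This is a
userspace model under assumptions, not the mlx4 code: the sketch_*
types are hypothetical, a plain integer stands in for the page
reference count, and checks a real driver would also need (such as
NUMA locality) are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_HALF_PAGE (SKETCH_PAGE_SIZE / 2)

struct sketch_page {
	int refcount;              /* stand-in for the page reference count */
};

struct sketch_rx_frag {
	struct sketch_page *page;  /* page backing this RX slot */
	uint32_t page_offset;      /* half of the page currently in use */
};

/*
 * Called when the half at frag->page_offset is handed to the stack.
 * Returns 1 if the page stays in the ring (recycled), 0 if the slot
 * needs a freshly allocated page.
 */
static int sketch_rx_complete(struct sketch_rx_frag *frag)
{
	struct sketch_page *page = frag->page;

	if (page->refcount != 1) {
		/*
		 * The other half is still in use by the stack: hand our
		 * reference over with the packet and give up the page.
		 */
		frag->page = NULL;
		return 0;
	}

	/*
	 * Only the driver holds the page, so the other half was already
	 * consumed: take a reference for the stack and flip halves.
	 */
	page->refcount++;
	frag->page_offset ^= SKETCH_HALF_PAGE;
	return 1;
}
```

The page is reused only when the stack has released the other half by
the time the ring loops around, which matches the "consumed before we
could loop around the RX ring" condition above.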

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: use order-0 pages for RX
Eric Dumazet [Wed, 8 Mar 2017 16:17:12 +0000 (08:17 -0800)]
mlx4: use order-0 pages for RX

Use of order-3 pages is problematic in some cases.

This patch might add three kinds of regression :

1) a CPU performance regression, but we will add page recycling
   later and performance should be back.

2) TCP receiver could grow its receive window slightly slower,
   because skb->len/skb->truesize ratio will decrease.
   This is mostly ok, we prefer being conservative to not risk OOM,
   and eventually tune TCP better in the future.
   This is consistent with other drivers using 2048 per ethernet frame.

3) Because we allocate one page per RX slot, we consume more
   memory for the ring buffers. XDP already had this constraint anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: removal of frag_sizes[]
Eric Dumazet [Wed, 8 Mar 2017 16:17:11 +0000 (08:17 -0800)]
mlx4: removal of frag_sizes[]

We will soon use order-0 pages, and frag truesize will more precisely
match real sizes.

In the new model, we prefer to use <= 2048 bytes fragments, so that
we can use page-recycle technique on PAGE_SIZE=4096 arches.

We will still pack as many frames as possible on arches with big
pages, like PowerPC.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: reduce rx ring page_cache size
Eric Dumazet [Wed, 8 Mar 2017 16:17:10 +0000 (08:17 -0800)]
mlx4: reduce rx ring page_cache size

We only need to store the page and dma address.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: rx_headroom is a per port attribute
Eric Dumazet [Wed, 8 Mar 2017 16:17:09 +0000 (08:17 -0800)]
mlx4: rx_headroom is a per port attribute

No need to duplicate it per RX queue / frags.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: get rid of frag_prefix_size
Eric Dumazet [Wed, 8 Mar 2017 16:17:08 +0000 (08:17 -0800)]
mlx4: get rid of frag_prefix_size

Using per frag storage for frag_prefix_size is really silly.

mlx4_en_complete_rx_desc() has all needed info already.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: remove order field from mlx4_en_frag_info
Eric Dumazet [Wed, 8 Mar 2017 16:17:07 +0000 (08:17 -0800)]
mlx4: remove order field from mlx4_en_frag_info

This is really a port attribute, no need to duplicate it per
RX queue and per frag.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: dma_dir is a mlx4_en_priv attribute
Eric Dumazet [Wed, 8 Mar 2017 16:17:06 +0000 (08:17 -0800)]
mlx4: dma_dir is a mlx4_en_priv attribute

No need to duplicate it for all queues and frags.

num_frags & log_rx_info become u8 to save space.
u8 accesses are a bit faster than u16 anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'mlxsw-cosmetics'
David S. Miller [Thu, 9 Mar 2017 07:17:39 +0000 (23:17 -0800)]
Merge branch 'mlxsw-cosmetics'

Jiri Pirko says:

====================
mlxsw: cosmetics

Couple of cosmetic mlxsw patches
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: pci: Remove unused bit
Ido Schimmel [Mon, 6 Mar 2017 20:25:21 +0000 (21:25 +0100)]
mlxsw: pci: Remove unused bit

The overrun ignore bit isn't supported by the device's firmware and was
recently removed from the programmer's reference manual (PRM).

Remove it from the driver as well.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum: Fix helper function and port variable names
Jiri Pirko [Mon, 6 Mar 2017 20:25:20 +0000 (21:25 +0100)]
mlxsw: spectrum: Fix helper function and port variable names

Commit dd82364c3ab9 ("mlxsw: Flip to the new dev walk API") made some
small changes in the mlxsw code, but it did not respect the naming
conventions. So fix this now.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: use proper lockdep annotation in __sk_dst_set()
Eric Dumazet [Mon, 6 Mar 2017 19:23:55 +0000 (11:23 -0800)]
net: use proper lockdep annotation in __sk_dst_set()

__sk_dst_set() must be called while we own the socket.

We can get proper lockdep coverage using lockdep_sock_is_held()
and rcu_dereference_protected()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'flow_dissector-improvements'
David S. Miller [Thu, 9 Mar 2017 07:08:59 +0000 (23:08 -0800)]
Merge branch 'flow_dissector-improvements'

Jiri Pirko says:

====================
flow dissector improvements

This patchset follows up the discussion about future extensions of the
flow dissector and tries to address the mentioned concerns. Some parts
are cut out into sub-functions. Also, the processing of the code (ARP,
MPLS) is made dependent on the user actually requiring the dissected
values. This prepares the code for future extensions to dissect IPv6 ND
messages, TCP flags, etc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoflow_dissector: Move GRE dissection into a separate function
Jiri Pirko [Mon, 6 Mar 2017 15:39:55 +0000 (16:39 +0100)]
flow_dissector: Move GRE dissection into a separate function

Make the main flow_dissect function a bit smaller and move the GRE
dissection into a separate function.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoflow_dissector: rename "proto again" goto label
Jiri Pirko [Mon, 6 Mar 2017 15:39:54 +0000 (16:39 +0100)]
flow_dissector: rename "proto again" goto label

Align with "ip_proto_again" label used in the same function and rename
vague "again" to "proto_again".

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoflow_dissector: Fix GRE header error path
Jiri Pirko [Mon, 6 Mar 2017 15:39:53 +0000 (16:39 +0100)]
flow_dissector: Fix GRE header error path

Currently, when an unexpected element appears in the GRE header, we
break so that the l4 ports are processed. But since the ports are
processed unconditionally, random values will certainly be dissected.
Fix this by just bailing out in such situations.
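
A greatly simplified sketch of the "bail out instead of break" idea.
It assumes host-order GRE flag values following the RFC 1701 bit
layout; the sketch_* names are hypothetical, and the real dissector
goes on to parse the encapsulated protocol rather than reading ports
directly.

```c
#include <assert.h>
#include <stdint.h>

/* Host-order masks for the first 16 bits of the GRE header. */
#define SKETCH_GRE_ROUTING 0x4000u
#define SKETCH_GRE_VERSION 0x0007u

enum sketch_dissect_ret { SKETCH_OUT_GOOD, SKETCH_OUT_BAD };

struct sketch_flow_keys {
	uint16_t sport;
	uint16_t dport;
	int ports_valid;
};

static enum sketch_dissect_ret
sketch_dissect_gre(uint16_t gre_flags, const uint8_t *l4,
		   struct sketch_flow_keys *keys)
{
	/*
	 * Unexpected header element: bail out right away, so that no
	 * port values are read from data we do not understand.
	 */
	if (gre_flags & (SKETCH_GRE_ROUTING | SKETCH_GRE_VERSION))
		return SKETCH_OUT_BAD;

	/* Header understood: the ports can be read safely. */
	keys->sport = (uint16_t)((l4[0] << 8) | l4[1]);
	keys->dport = (uint16_t)((l4[2] << 8) | l4[3]);
	keys->ports_valid = 1;
	return SKETCH_OUT_GOOD;
}
```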

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoflow_dissector: Move MPLS dissection into a separate function
Jiri Pirko [Mon, 6 Mar 2017 15:39:52 +0000 (16:39 +0100)]
flow_dissector: Move MPLS dissection into a separate function

Make the main flow_dissect function a bit smaller and move the MPLS
dissection into a separate function. Along with that, do the MPLS header
processing only in case the flow dissection user requires it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoflow_dissector: Move ARP dissection into a separate function
Jiri Pirko [Mon, 6 Mar 2017 15:39:51 +0000 (16:39 +0100)]
flow_dissector: Move ARP dissection into a separate function

Make the main flow_dissect function a bit smaller and move the ARP
dissection into a separate function. Along with that, do the ARP header
processing only in case the flow dissection user requires it.
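
The "only when the user requires it" check can be pictured as a bitmask
test done before any header parsing. The sketch below is a userspace
model; the key identifiers, the struct layouts and the sketch_* helpers
are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical key identifiers, one bit each in used_keys. */
enum sketch_key_id { SKETCH_KEY_ARP = 0, SKETCH_KEY_MPLS = 1 };

struct sketch_dissector {
	unsigned int used_keys;  /* bitmask of keys the user asked for */
};

struct sketch_arp_key {
	uint8_t op;
};

static int sketch_uses_key(const struct sketch_dissector *d,
			   enum sketch_key_id id)
{
	return (d->used_keys >> id) & 1u;
}

/*
 * Returns 1 if the ARP key was filled in, 0 if the work was skipped
 * because the user did not request it.
 */
static int sketch_dissect_arp(const struct sketch_dissector *d,
			      uint16_t arp_op, struct sketch_arp_key *key)
{
	if (!sketch_uses_key(d, SKETCH_KEY_ARP))
		return 0;

	key->op = (uint8_t)(arp_op & 0xff);
	return 1;
}
```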

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: toshiba: spider_net: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Sun, 5 Mar 2017 22:46:00 +0000 (23:46 +0100)]
net: toshiba: spider_net: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: toshiba: ps3_genic_net: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Sun, 5 Mar 2017 22:21:06 +0000 (23:21 +0100)]
net: toshiba: ps3_genic_net: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sun: sunhme: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Sun, 5 Mar 2017 21:25:39 +0000 (22:25 +0100)]
net: sun: sunhme: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sun: sungem: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Sat, 4 Mar 2017 23:04:18 +0000 (00:04 +0100)]
net: sun: sungem: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sun: niu: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Sat, 4 Mar 2017 16:50:06 +0000 (17:50 +0100)]
net: sun: niu: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sun: cassini: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Sat, 4 Mar 2017 15:16:12 +0000 (16:16 +0100)]
net: sun: cassini: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: smsc: smc91x: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Sat, 4 Mar 2017 11:42:39 +0000 (12:42 +0100)]
net: smsc: smc91x: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: smsc: smc911x: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Tue, 28 Feb 2017 22:49:38 +0000 (23:49 +0100)]
net: smsc: smc911x: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone could test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodecnet: Use TCP nagle macro instead of literal number in decnet
Gao Feng [Sat, 4 Mar 2017 14:10:28 +0000 (22:10 +0800)]
decnet: Use TCP nagle macro instead of literal number in decnet

Use the existing TCP nagle macros TCP_NAGLE_OFF and TCP_NAGLE_CORK
instead of the literal numbers 1 and 2 in the current decnet code.
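
The readability gain can be sketched as below. The SKETCH_* macros
mirror the values 1 and 2 named above, assuming they map to OFF and
CORK in that order; the helper functions are hypothetical and are not
taken from the decnet code.

```c
#include <assert.h>

/* Local copies for the sketch; assumed to match the values named above. */
#define SKETCH_TCP_NAGLE_OFF  1
#define SKETCH_TCP_NAGLE_CORK 2

/* Hypothetical "no delay" option handler: non-zero disables Nagle. */
static int sketch_nodelay_to_nonagle(int val)
{
	return val ? SKETCH_TCP_NAGLE_OFF : 0;
}

/* Hypothetical "cork" option handler: non-zero corks the socket. */
static int sketch_cork_to_nonagle(int val)
{
	return val ? SKETCH_TCP_NAGLE_CORK : 0;
}

/* Reads better than "nonagle == 2". */
static int sketch_is_corked(int nonagle)
{
	return nonagle == SKETCH_TCP_NAGLE_CORK;
}
```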

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: qcom/emac: optimize QDF2400 SGMII RX/TX impedence values
Timur Tabi [Tue, 28 Feb 2017 23:16:02 +0000 (17:16 -0600)]
net: qcom/emac: optimize QDF2400 SGMII RX/TX impedence values

Adjust the impedance values of the RX and TX lanes in the SGMII block
so that they are closer to optimal values.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bgmac-cleanups-PM-support'
David S. Miller [Tue, 7 Mar 2017 01:17:48 +0000 (17:17 -0800)]
Merge branch 'bgmac-cleanups-PM-support'

Jon Mason says:

====================
net: ethernet: bgmac: PM support and clean-ups

Changes in v3:
* Corrected a bug Florian found and added his Reviewed-by

Changes in v2:
* Reworked the PM patch with Florian's suggestions

Add code to support Power Management (only tested on NS2), and add some
code clean-ups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: bgmac: driver power manangement
Joey Zhong [Tue, 28 Feb 2017 18:51:01 +0000 (13:51 -0500)]
net: ethernet: bgmac: driver power manangement

Implement suspend/resume callbacks in the bgmac driver. This makes sure
that we de-initialize and re-initialize the hardware correctly before
entering suspend and when resuming.

Signed-off-by: Joey Zhong <zhongx@broadcom.com>
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: bgmac: unify code of the same family
Jon Mason [Tue, 28 Feb 2017 18:51:00 +0000 (13:51 -0500)]
net: ethernet: bgmac: unify code of the same family

BCM471X and BCM535X are of the same family (from what I can derive from
internal documents).  Group them into the case statement together, which
results in more code reuse.

Also, use existing helper variables to make the code a little more
readable too.

Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: bgmac: use #defines for MAX size
Jon Mason [Tue, 28 Feb 2017 18:50:59 +0000 (13:50 -0500)]
net: ethernet: bgmac: use #defines for MAX size

The maximum frame size is really just the standard ethernet frame size
and FCS.  So use those existing defines to make the code a little more
beautiful.
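
For reference, the arithmetic works out as follows. The SKETCH_*
macros are local copies of the ETH_FRAME_LEN and ETH_FCS_LEN values
from <linux/if_ether.h>, and the name of the combined macro is
hypothetical.

```c
#include <assert.h>

#define SKETCH_ETH_FRAME_LEN 1514  /* max frame length without the FCS */
#define SKETCH_ETH_FCS_LEN   4     /* frame check sequence length */

/* Hypothetical name for the driver's maximum frame size. */
#define SKETCH_MAX_FRAME_SIZE (SKETCH_ETH_FRAME_LEN + SKETCH_ETH_FCS_LEN)

static unsigned int sketch_max_frame_size(void)
{
	return SKETCH_MAX_FRAME_SIZE;
}
```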

Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: axienet: use eth_hw_addr_random()
Tobias Klauser [Tue, 28 Feb 2017 11:21:12 +0000 (12:21 +0100)]
net: axienet: use eth_hw_addr_random()

Use eth_hw_addr_random() to set a random MAC address in order to make
sure ndev->addr_assign_type will be properly set to NET_ADDR_RANDOM.
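
What the helper is expected to do can be modelled in userspace C. This
is a sketch under assumptions: the sketch_* names and the struct are
hypothetical, rand() stands in for the kernel's random source, and the
assign-type constant is a local placeholder.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ETH_ALEN        6
#define SKETCH_NET_ADDR_RANDOM 1  /* placeholder for NET_ADDR_RANDOM */

struct sketch_netdev {
	uint8_t dev_addr[SKETCH_ETH_ALEN];
	int addr_assign_type;
};

/* Fill addr with a random unicast, locally administered MAC address. */
static void sketch_eth_random_addr(uint8_t addr[SKETCH_ETH_ALEN])
{
	for (int i = 0; i < SKETCH_ETH_ALEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);

	addr[0] &= 0xfe;  /* clear the multicast bit */
	addr[0] |= 0x02;  /* set the locally administered bit */
}

/* Set the address and record that it was randomly generated. */
static void sketch_eth_hw_addr_random(struct sketch_netdev *dev)
{
	dev->addr_assign_type = SKETCH_NET_ADDR_RANDOM;
	sketch_eth_random_addr(dev->dev_addr);
}
```

Doing both steps in one helper is the point of the change: a driver
that only fills in random bytes by hand can forget to record how the
address was assigned.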

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'netvsc-NAPI'
David S. Miller [Tue, 7 Mar 2017 01:13:14 +0000 (17:13 -0800)]
Merge branch 'netvsc-NAPI'

Stephen Hemminger says:

====================
NAPI support for Hyper-V

These patches enable NAPI, GRO and napi_alloc_skb for Hyper-V netvsc
driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonetvsc: replace netdev_alloc_skb_ip_align with napi_alloc_skb
stephen hemminger [Mon, 27 Feb 2017 18:26:51 +0000 (10:26 -0800)]
netvsc: replace netdev_alloc_skb_ip_align with napi_alloc_skb

Gives potential performance gain.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonetvsc: enable GRO
stephen hemminger [Mon, 27 Feb 2017 18:26:50 +0000 (10:26 -0800)]
netvsc: enable GRO

Use GRO when receiving packets.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonetvsc: implement NAPI
stephen hemminger [Mon, 27 Feb 2017 18:26:49 +0000 (10:26 -0800)]
netvsc: implement NAPI

Use NAPI (softirq) to handle received packets and send completions.
Previously this was handled by tasklet.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovmbus: introduce in-place packet iterator
stephen hemminger [Mon, 27 Feb 2017 18:26:48 +0000 (10:26 -0800)]
vmbus: introduce in-place packet iterator

This is mostly just a refactoring of previous functions
(get_pkt_next_raw, put_pkt_raw and commit_rd_index) to make it easier
to use for other drivers and NAPI.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>