platform/kernel/linux-rpi.git
6 years agoigb: Delete an error message for a failed memory allocation in igb_enable_sriov()
Markus Elfring [Mon, 1 Jan 2018 19:56:55 +0000 (20:56 +0100)]
igb: Delete an error message for a failed memory allocation in igb_enable_sriov()

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoigb: Free IRQs when device is hotplugged
Lyude Paul [Tue, 12 Dec 2017 19:31:30 +0000 (14:31 -0500)]
igb: Free IRQs when device is hotplugged

Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon
hotplugging my kernel would immediately crash due to igb:

[  680.825801] kernel BUG at drivers/pci/msi.c:352!
[  680.828388] invalid opcode: 0000 [#1] SMP
[  680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb]
[  680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G           O     4.15.0-rc3Lyude-Test+ #6
[  680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017
[  680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0
[  680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286
[  680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c
[  680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178
[  680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00
[  680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298
[  680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000
[  680.836332] FS:  0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000
[  680.836817] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0
[  680.837954] Call Trace:
[  680.838853]  pci_disable_msix+0xce/0xf0
[  680.839616]  igb_reset_interrupt_capability+0x5d/0x60 [igb]
[  680.840278]  igb_remove+0x9d/0x110 [igb]
[  680.840764]  pci_device_remove+0x36/0xb0
[  680.841279]  device_release_driver_internal+0x157/0x220
[  680.841739]  pci_stop_bus_device+0x7d/0xa0
[  680.842255]  pci_stop_bus_device+0x2b/0xa0
[  680.842722]  pci_stop_bus_device+0x3d/0xa0
[  680.843189]  pci_stop_and_remove_bus_device+0xe/0x20
[  680.843627]  trim_stale_devices+0xf3/0x140
[  680.844086]  trim_stale_devices+0x94/0x140
[  680.844532]  trim_stale_devices+0xa6/0x140
[  680.845031]  ? get_slot_status+0x90/0xc0
[  680.845536]  acpiphp_check_bridge.part.5+0xfe/0x140
[  680.846021]  acpiphp_hotplug_notify+0x175/0x200
[  680.846581]  ? free_bridge+0x100/0x100
[  680.847113]  acpi_device_hotplug+0x8a/0x490
[  680.847535]  acpi_hotplug_work_fn+0x1a/0x30
[  680.848076]  process_one_work+0x182/0x3a0
[  680.848543]  worker_thread+0x2e/0x380
[  680.848963]  ? process_one_work+0x3a0/0x3a0
[  680.849373]  kthread+0x111/0x130
[  680.849776]  ? kthread_create_worker_on_cpu+0x50/0x50
[  680.850188]  ret_from_fork+0x1f/0x30
[  680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b
[  680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0

As it turns out, normally the freeing of IRQs that would fix this is called
inside of the scope of __igb_close(). However, since the device is
already gone by the point we try to unregister the netdevice from the
driver due to a hotplug we end up seeing that the netif isn't present
and thus, forget to free any of the device IRQs.

So: make sure that if we're in the process of dismantling the netdev, we
always allow __igb_close() to be called so that IRQs may be freed
normally. Additionally, only allow igb_close() to be called from
__igb_close() if it hasn't already been called for the given adapter.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach")
Cc: Todd Fujinaka <todd.fujinaka@intel.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: stable@vger.kernel.org
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoe1000e: Alert the user that C-states will be disabled by enabling jumbo frames
Matt Turner [Tue, 14 Nov 2017 23:51:33 +0000 (15:51 -0800)]
e1000e: Alert the user that C-states will be disabled by enabling jumbo frames

I personally spent a long time trying to decypher why my CPU would not
reach deeper C-states. Let's just tell the next user what's going on.

Signed-off-by: Matt Turner <matt.turner@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoigb: Clarify idleslope config constraints
Jesus Sanchez-Palencia [Fri, 10 Nov 2017 22:21:50 +0000 (14:21 -0800)]
igb: Clarify idleslope config constraints

By design, the idleslope increments are restricted to 16.384kbps steps.
Add a comment to igb_main.c making that explicit and add one example
that illustrates the impact of that.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoe1000e: Set HTHRESH when PTHRESH is used
Matt Turner [Tue, 7 Nov 2017 22:13:30 +0000 (14:13 -0800)]
e1000e: Set HTHRESH when PTHRESH is used

According to section 12.0.3.4.13 "Receive Descriptor Control - RXDCTL"
of the IntelĀ® 82579 Gigabit Ethernet PHY Datasheet v2.1:

    "HTHRESH should be given a non zero value when ever PTHRESH is
     used."

In RXDCTL(0), PTHRESH lives at bits 5:0, and HTHREST lives at bits 13:8.
Set only bit 8 of HTHREST as is done in e1000_flush_rx_ring(). Found by
inspection.

Signed-off-by: Matt Turner <matt.turner@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoigb: add function to get maximum RSS queues
Zhang Shengju [Tue, 19 Sep 2017 13:40:54 +0000 (21:40 +0800)]
igb: add function to get maximum RSS queues

This patch adds a new function igb_get_max_rss_queues() to get maximum
RSS queues, this will reduce duplicate code and facilitate future
maintenance.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoigb: Allow to remove administratively set MAC on VFs
Corinna Vinschen [Mon, 10 Apr 2017 08:58:14 +0000 (10:58 +0200)]
igb: Allow to remove administratively set MAC on VFs

Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
for use by a virtual machine (either using vfio device assignment or
macvtap passthru mode), it saves the current MAC address and vlan tag
so that it can reset them to their original value when the guest is
done.  Libvirt can't leave the VF MAC set to the value used by the
now-defunct guest since it may be started again later using a
different VF, but it certainly shouldn't just pick any random value,
either. So it saves the state of everything prior to using the VF, and
resets it to that.

The igb driver initializes the MAC addresses of all VFs to
00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
netlink message, also visible in the list of VFs in the output of "ip
link show"). But when libvirt attempts to restore the MAC address back
to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
responds with "Invalid argument".

Forbidding a reset back to the original value leaves the VF MAC at the
value set for the now-defunct virtual machine. Especially on a system
with NetworkManager enabled, this has very bad consequences, since
NetworkManager forces all interfacess to be IFF_UP all the time - if
the same virtual machine is restarted using a different VF (or even on
a different host), there will be multiple interfaces watching for
traffic with the same MAC address.

To allow libvirt to revert to the original state, we need a way to
remove the administrative set MAC on a VF, to allow normal host
operation again, and to reset/overwrite the VF MAC via VF netdev.

This patch implements the outlined scenario by allowing to set the
VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
so it's possible to reset the VF MAC back to the original value via
the VF netdev.

Note: Recent patches to libvirt allow for a workaround if the NIC
isn't capable of resetting the administrative MAC back to all 0, but
in theory the NIC should allow resetting the MAC in the first place.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Aaron Brown <arron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoMerge branch 'pktgen-Behavior-flags-fixes'
David S. Miller [Wed, 24 Jan 2018 20:03:37 +0000 (15:03 -0500)]
Merge branch 'pktgen-Behavior-flags-fixes'

Dmitry Safonov says:

====================
pktgen: Behavior flags fixes

v2:
  o fixed a nitpick from David Miller

There are a bunch of fixes/cleanups/Documentations.
Diffstat says for itself, regardless added docs and missed flag
parameters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agopktgen: Clean read user supplied flag mess
Dmitry Safonov [Thu, 18 Jan 2018 18:31:37 +0000 (18:31 +0000)]
pktgen: Clean read user supplied flag mess

Don't use error-prone-brute-force way.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agopktgen: Remove brute-force printing of flags
Dmitry Safonov [Thu, 18 Jan 2018 18:31:36 +0000 (18:31 +0000)]
pktgen: Remove brute-force printing of flags

Add macro generated pkt_flag_names array, with a little help of which
the flags can be printed by using an index.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agopktgen: Add behaviour flags macro to generate flags/names
Dmitry Safonov [Thu, 18 Jan 2018 18:31:35 +0000 (18:31 +0000)]
pktgen: Add behaviour flags macro to generate flags/names

PKT_FALGS macro will be used to add package behavior names definitions
to simplify the code that prints/reads pkg flags.
Sorted the array in order of printing the flags in pktgen_if_show()
Note: Renamed IPSEC_ON => IPSEC for simplicity.

No visible behavior change expected.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agopktgen: Add missing !flag parameters
Dmitry Safonov [Thu, 18 Jan 2018 18:31:34 +0000 (18:31 +0000)]
pktgen: Add missing !flag parameters

o FLOW_SEQ now can be disabled with pgset "flag !FLOW_SEQ"
o FLOW_SEQ and FLOW_RND are antonyms, as it's shown by pktgen_if_show()
o IPSEC now may be disabled

Note, that IPV6 is enabled with dst6/src6 parameters, not with
a flag parameter.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoDocumentation/pktgen: Clearify how-to use pktgen samples
Dmitry Safonov [Thu, 18 Jan 2018 18:31:33 +0000 (18:31 +0000)]
Documentation/pktgen: Clearify how-to use pktgen samples

o Change process name in ps output: looks like, these days the process
  is named kpktgend_<cpu>, rather than pktgen/<cpu>.
o Use pg_ctrl for start/stop as it can work well with pgset without
  changes to $(PGDEV) variable.
o Clarify a bit needed $(PGDEV) definition for sample scripts and that
  one needs to `source functions.sh`.
o Document how-to unset a behaviour flag, note about history expansion.
o Fix pgset spi parameter value.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'cxgb4-fix-build-error'
David S. Miller [Wed, 24 Jan 2018 15:56:59 +0000 (10:56 -0500)]
Merge branch 'cxgb4-fix-build-error'

Rahul Lakkireddy says:

====================
cxgb4: fix build error

Patch 1 fixes build error with compiling cudbg_zlib.c when
CONFIG_ZLIB_DEFLATE macro is not defined.

Patch 2 fixes following sparse warning:
"Using plain integer as NULL pointer"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: properly initialize variables
Rahul Lakkireddy [Wed, 24 Jan 2018 08:01:05 +0000 (13:31 +0530)]
cxgb4: properly initialize variables

memset variables to 0 to fix sparse warnings:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c:409:42: sparse: Using
plain integer as NULL pointer

drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:43:47: sparse: Using
plain integer as NULL pointer

Fixes: ad75b7d32f25 ("cxgb4: implement ethtool dump data operations")
Fixes: 91c1953de387 ("cxgb4: use zlib deflate to compress firmware dump")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: enable ZLIB_DEFLATE when building cxgb4
Rahul Lakkireddy [Wed, 24 Jan 2018 08:01:04 +0000 (13:31 +0530)]
cxgb4: enable ZLIB_DEFLATE when building cxgb4

Fixes:
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:39:5: error:
redefinition of 'cudbg_compress_buff'
    int cudbg_compress_buff(struct cudbg_init *pdbg_init,
        ^~~~~~~~~~~~~~~~~~~
   In file included from
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:23:0:
   drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.h:45:19: note: previous
definition of 'cudbg_compress_buff' was here
    static inline int cudbg_compress_buff(struct cudbg_init *pdbg_init,
                      ^~~~~~~~~~~~~~~~~~~

Fixes: 91c1953de387 ("cxgb4: use zlib deflate to compress firmware dump")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'net-smc-socket-closing-improvements'
David S. Miller [Wed, 24 Jan 2018 15:52:58 +0000 (10:52 -0500)]
Merge branch 'net-smc-socket-closing-improvements'

Ursula Braun says:

====================
net/smc: socket closing improvements

while the first 2 patches are just small cleanups, the remaing
patches affect socket closing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: continue waiting if peer signals write_shutdown
Ursula Braun [Wed, 24 Jan 2018 09:28:17 +0000 (10:28 +0100)]
net/smc: continue waiting if peer signals write_shutdown

If the peer sends a shutdown WRITE, this should not affect sending
in general, and waiting for send buffer space in particular.
Stop waiting of the local socket for send buffer space only, if peer
signals closing, but not if peer signals just shutdown WRITE.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: improve state change handling after close wait
Ursula Braun [Wed, 24 Jan 2018 09:28:16 +0000 (10:28 +0100)]
net/smc: improve state change handling after close wait

When a socket is closed or shutdown, smc waits for data being transmitted
in certain states. If the state changes during this wait, the close
switch depending on state should be reentered.
In addition, state change is avoided if sending of close or shutdown fails.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: make wait for work request uninterruptible
Ursula Braun [Wed, 24 Jan 2018 09:28:15 +0000 (10:28 +0100)]
net/smc: make wait for work request uninterruptible

Work requests are needed for every ib_post_send(), among them the
ib_post_send() to signal closing. If an smc socket program is cancelled,
the smc connections should be cleaned up, and require sending of closing
signals to the peer. This may fail, if a wait for
a free work request is needed, but is cancelled immediately due to the
cancel interrupt. To guarantee notification of the peer, the wait for
a work request is changed to uninterruptible.

And the area to receive work request completion info with
ib_poll_cq() is cleared first.
And _tx_ variable names are used in the _tx_routines for the
demultiplexing common type in the header.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: get rid of tx_pend waits in socket closing
Ursula Braun [Wed, 24 Jan 2018 09:28:14 +0000 (10:28 +0100)]
net/smc: get rid of tx_pend waits in socket closing

There is no need to wait for confirmation of pending tx requests
for a closing connection, since pending tx slots are dismissed
when finishing a connection.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: simplify function smc_clcsock_accept()
Ursula Braun [Wed, 24 Jan 2018 09:28:13 +0000 (10:28 +0100)]
net/smc: simplify function smc_clcsock_accept()

Cleanup to avoid duplicate code in smc_clcsock_accept().
No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: use local struct sock variables consistently
Ursula Braun [Wed, 24 Jan 2018 09:28:12 +0000 (10:28 +0100)]
net/smc: use local struct sock variables consistently

Cleanup to consistently exploit the local struct sock definitions.
No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4/cxgb4vf: add support for ndo_set_vf_vlan
Ganesh Goudar [Wed, 24 Jan 2018 15:14:07 +0000 (20:44 +0530)]
cxgb4/cxgb4vf: add support for ndo_set_vf_vlan

implement ndo_set_vf_vlan for mgmt netdevice to configure
the PCIe VF.

Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'bpf-and-netdevsim-test-updates'
David S. Miller [Wed, 24 Jan 2018 01:24:32 +0000 (20:24 -0500)]
Merge branch 'bpf-and-netdevsim-test-updates'

Jakub Kicinski says:

====================
bpf and netdevsim test updates

A number of test improvements (delayed by merges).  Quentin provides
patches for checking printing to the verifier log from the drivers
and validating extack messages are propagated.  There is also a test
for replacing TC filters to avoid adding back the bug Daniel recently
fixed in net and stable.
====================

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests/bpf: validate replace of TC filters is working
Jakub Kicinski [Tue, 23 Jan 2018 19:22:56 +0000 (11:22 -0800)]
selftests/bpf: validate replace of TC filters is working

Daniel discovered recently I broke TC filter replace (and fixed
it in commit ad9294dbc227 ("bpf: fix cls_bpf on filter replace")).
Add a test to make sure it never happens again.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests/bpf: check bpf verifier log buffer usage works for HW offload
Quentin Monnet [Tue, 23 Jan 2018 19:22:55 +0000 (11:22 -0800)]
selftests/bpf: check bpf verifier log buffer usage works for HW offload

Make netdevsim print a message to the BPF verifier log buffer when a
program is offloaded.

Then use this message in hardware offload selftests to make sure that
using this buffer actually prints the message to the console for
eBPF hardware offload.

The message is appended after the last instruction is processed with the
verifying function from netdevsim. Output looks like the following:

    $ tc filter add dev foo ingress bpf obj sample_ret0.o \
        sec .text verbose skip_sw

    Prog section '.text' loaded (5)!
     - Type:         3
     - Instructions: 2 (0 over limit)
     - License:

    Verifier analysis:

    0: (b7) r0 = 0
    1: (95) exit
    [netdevsim] Hello from netdevsim!
    processed 2 insns, stack depth 0

"verbose" flag is required to see it in the console since netdevsim does
not throw an error after printing the message.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetdevsim: don't compile BPF code if syscall not enabled
Jakub Kicinski [Tue, 23 Jan 2018 19:22:54 +0000 (11:22 -0800)]
netdevsim: don't compile BPF code if syscall not enabled

We should not compile netdevsim/bpf.c if BPF syscall is not
enabled.  Otherwise bpf core would have to provide wrappers
for all functions offload drivers may call, even though
system will never see a BPF object.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests/bpf: add checks on extack messages for eBPF hw offload tests
Quentin Monnet [Tue, 23 Jan 2018 19:22:53 +0000 (11:22 -0800)]
selftests/bpf: add checks on extack messages for eBPF hw offload tests

Add checks to test that netlink extack messages are correctly displayed
in some expected error cases for eBPF offload to netdevsim with TC and
XDP.

iproute2 may be built without libmnl support, in which case the extack
messages will not be reported.  Try to detect this condition, and when
enountered print a mild warning to the user and skip the extack validation.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonetdevsim: add extack support for TC eBPF offload
Quentin Monnet [Tue, 23 Jan 2018 19:22:52 +0000 (11:22 -0800)]
netdevsim: add extack support for TC eBPF offload

Use the recently added extack support for TC eBPF filters in netdevsim.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 24 Jan 2018 01:22:57 +0000 (20:22 -0500)]
Merge branch '40GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-01-23

This series contains updates to i40e and i40evf only.

Pawel enables FlatNVM support on x722 devices by allowing nvmupdate tool
to configure the preservation flags in the AdminQ command.

Mitch fixes a potential divide by zero error when DCB is enabled and
the firmware fails to configure the VSI, so check for this state.
Fixed a bug where the driver could fail to adhere to ETS bandwidth
allocations if 8 traffic classes were configured on the switch.

Sudheer fixes a potential deadlock by avoiding to call
flush_schedule_work() in i40evf_remove(), since cancel_work_sync()
and cancel_delayed_work_sync() already cleans up necessary work items.
Fixed an issue with the problematic detection and recovery from
hung queues in the PF which was causing lost interrupts.  This is done
by triggering a software interrupt so that interrupts are forced on
and if we are already in napi_poll and an interrupt fires, napi_poll
will not be rescheduled and the interrupt is lost.

Avinash fixes an issue in the VF where is was possible to issue a
reset_task while the device is currently being removed.

Michal fixes an issue occurring while calling i40e_led_set() with
the blink parameter set to true, which was causing the activity LED
instead of the link LED to blink for port identification.

Shiraz changes the client interface to not call client close/open on
netdev down/up events, since this causes a lot of thrash that is
not needed.  Instead, disable the PE TCP-ENA flag during a netdev
down event and re-enable on a netdev up event, since this blocks all
TCP traffic to the RDMA protocol engine.

Alan fixes an issue which was causing a potential transmit hang by
ignoring the PF link up message if the VF state is not yet in the
RUNNING state.

Amritha fixes the channel VSI recreation during the reset flow to
reconfigure the transmit rings and the queue context associated with
the channel VSI.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'act_csum-spinlock-remove'
David S. Miller [Wed, 24 Jan 2018 00:51:46 +0000 (19:51 -0500)]
Merge branch 'act_csum-spinlock-remove'

Davide Caratti says:

====================
net/sched: remove spinlock from 'csum' action

Similarly to what has been done earlier with other actions [1][2], this
series tries to improve the performance of 'csum' tc action, removing a
spinlock in the data path. Patch 1 lets act_csum use per-CPU counters;
patch 2 removes spin_{,un}lock_bh() calls from the act() method.

test procedure (using pktgen from https://github.com/netoptimizer):

 # ip link add name eth1 type dummy
 # ip link set dev eth1 up
 # tc qdisc add dev eth1 root handle 1: prio
 # for a in pass drop; do
 > tc filter del dev eth1 parent 1: pref 10 matchall action csum udp
 > tc filter add dev eth1 parent 1: pref 10 matchall action csum udp $a
 > for n in 2 4; do
 > ./pktgen_bench_xmit_mode_queue_xmit.sh -v -s 64 -t $n -n 1000000 -i eth1
 > done
 > done

test results:

      |    |  before patch   |   after patch
  $a  | $n | avg. pps/thread | avg. pps/thread
 -----+----+-----------------+----------------
 pass |  2 |    1671463 Ā± 4% |    1920789 Ā± 3%
 pass |  4 |     648797 Ā± 1% |     738190 Ā± 1%
 drop |  2 |    3212692 Ā± 2% |    3719811 Ā± 2%
 drop |  4 |    1078824 Ā± 1% |    1328099 Ā± 1%

references:

[1] https://www.spinics.net/lists/netdev/msg334760.html
[2] https://www.spinics.net/lists/netdev/msg465862.html

v3 changes:
 - use rtnl_dereference() in place of rcu_dereference() in tcf_csum_dump()

v2 changes:
 - add 'drop' test, it produces more contentions
 - use RCU-protected struct to store 'action' and 'update_flags', to avoid
   reading the values from subsequent configurations
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: act_csum: don't use spinlock in the fast path
Davide Caratti [Mon, 22 Jan 2018 17:14:32 +0000 (18:14 +0100)]
net/sched: act_csum: don't use spinlock in the fast path

use RCU instead of spin_{,unlock}_bh() to protect concurrent read/write on
act_csum configuration, to reduce the effects of contention in the data
path when multiple readers are present.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: act_csum: use per-core statistics
Davide Caratti [Mon, 22 Jan 2018 17:14:31 +0000 (18:14 +0100)]
net/sched: act_csum: use per-core statistics

use per-CPU counters, like other TC actions do, instead of maintaining one
set of stats across all cores. This allows updating act_csum stats without
the need of protecting them using spin_{,un}lock_bh() invocations.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: link_watch: mark bonding link events urgent
Roopa Prabhu [Mon, 22 Jan 2018 16:07:19 +0000 (08:07 -0800)]
net: link_watch: mark bonding link events urgent

It takes 1sec for bond link down notification to hit user-space
when all slaves of the bond go down. 1sec is too long for
protocol daemons in user-space relying on bond notification
to recover (eg: multichassis lag implementations in user-space).
Since the link event code already marks team device port link events
 as urgent, this patch moves the code to cover all lag ports and master.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 24 Jan 2018 00:29:26 +0000 (19:29 -0500)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-01-23

This series contains updates to ixgbe only.

Shannon Nelson provides an implementation of the ipsec hardware offload
feature for the ixgbe driver for these devices: x540, x550, 82599.

The ixgbe NICs support ipsec offload for 1024 Rx and 1024 Tx Security
Associations (SAs), using up to 128 inbound IP addresses, and using the
rfc4106(gcm(aes)) encryption.  This code does not yet support checksum
offload, or TSO in conjunction with the ipsec offload - those will be
added in the future.

This code shows improvements in both packet throughput and CPU utilization.
For example, here are some quicky numbers that show the magnitude of the
performance gain on a single run of "iperf -c <dest>" with the ipsec
offload on both ends of a point-to-point connection:

9.4 Gbps - normal case
7.6 Gbps - ipsec with offload
343 Mbps - ipsec no offload
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'GEHC-Bx50-Switch-Support'
David S. Miller [Wed, 24 Jan 2018 00:22:39 +0000 (19:22 -0500)]
Merge branch 'GEHC-Bx50-Switch-Support'

Sebastian Reichel says:

====================
GEHC Bx50 Switch Support

This adds support for the internal switch found in GE Healthcare
B450v3, B650v3 and B850v3. All devices use a GPIO bitbanged MDIO
bus to communicate with the switch and a PCIe based network card
for exchanging network data. The cpu network data link requires,
that the switch's internal phy interface is enabled, so support
for that is added by the first patch in this series.

The patch series is based on v4.15-rc8.

Changes since PATCHv4:
 * Introduce dsa_port_link_(un)register_of and mark the fixed
   variant static.
 * Update patch description to describe the phy<->phy connection
   from i210 to the Marvell switch
Changes since PATCHv3:
 * Enable the phy in dsa_port_setup() instead of abusing the
   fixed link setup function
Changes since PATCHv2:
 * Add phy nodes to switch in bx50.dtsi and reference them
   from switch ports
 * Enable cpu-port's phy based on 'phy-handle' instead of 'phy-mode'
Changes since PATCHv1:
 * Use 'marvell,mv88e6085' instead of introducing compatible
   string for mv88e6240.
 * Fix indention of DT nodes
 * Only enable 'cpu' phy, if explicitly set to "internal".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoARM: dts: imx6q-b450v3: Add switch port configuration
Sebastian Reichel [Tue, 23 Jan 2018 15:03:50 +0000 (16:03 +0100)]
ARM: dts: imx6q-b450v3: Add switch port configuration

This adds support for the Marvell switch and names the network
ports according to the labels, that can be found next to the
connectors. The switch is connected to the host system using a
PCI based network card.

The PCI bus configuration has been written using the following
information:

root@b450v3# lspci -tv
-[0000:00]---00.0-[01]----00.0  Intel Corporation I210 Gigabit Network Connection
root@b450v3# lspci -nn
00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoARM: dts: imx6q-b650v3: Add switch port configuration
Sebastian Reichel [Tue, 23 Jan 2018 15:03:49 +0000 (16:03 +0100)]
ARM: dts: imx6q-b650v3: Add switch port configuration

This adds support for the Marvell switch and names the network
ports according to the labels, that can be found next to the
connectors. The switch is connected to the host system using a
PCI based network card.

The PCI bus configuration has been written using the following
information:

root@b650v3# lspci -tv
-[0000:00]---00.0-[01]----00.0  Intel Corporation I210 Gigabit Network Connection
root@b650v3# lspci -nn
00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoARM: dts: imx6q-b850v3: Add switch port configuration
Sebastian Reichel [Tue, 23 Jan 2018 15:03:48 +0000 (16:03 +0100)]
ARM: dts: imx6q-b850v3: Add switch port configuration

This adds support for the Marvell switch and names the network
ports according to the labels, that can be found next to the
connectors ("ID", "IX", "ePort 1", "ePort 2"). The switch is
connected to the host system using a PCI based network card.

The PCI bus configuration has been written using the following
information:

root@b850v3# lspci -tv
-[0000:00]---00.0-[01]----00.0-[02-05]--+-01.0-[03]----00.0  Intel Corporation I210 Gigabit Network Connection
                                        +-02.0-[04]----00.0  Intel Corporation I210 Gigabit Network Connection
                                        \-03.0-[05]--
root@b850v3# lspci -nn
00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
01:00.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
02:01.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
02:02.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
02:03.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
03:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
04:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoARM: dts: imx6q-bx50v3: Add internal switch
Sebastian Reichel [Tue, 23 Jan 2018 15:03:47 +0000 (16:03 +0100)]
ARM: dts: imx6q-bx50v3: Add internal switch

B850v3, B650v3 and B450v3 all have a GPIO bit banged MDIO bus to
communicate with a Marvell switch. On all devices the switch is
connected to a PCI based network card, which needs to be referenced
by DT, so this also adds the common PCI root node.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: Support internal phy on 'cpu' port
Sebastian Reichel [Tue, 23 Jan 2018 15:03:46 +0000 (16:03 +0100)]
net: dsa: Support internal phy on 'cpu' port

This adds support for enabling the internal PHY for a 'cpu' port.
It has been tested on GE B850v3,  B650v3 and B450v3, which have a
built-in MV88E6240 switch hardwired to a PCIe based network card.
On these machines the internal PHY of the i210 network card and
the Marvell switch are connected to each other and must be enabled
for properly using the switch. While the i210 PHY will be enabled
when the network interface is enabled, the switch's port is not
exposed as network interface. Additionally the mv88e6xxx driver
resets the chip during probe, so the PHY is disabled without this
patch.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoi40e: Fix channel addition in reset flow
Amritha Nambiar [Tue, 9 Jan 2018 01:27:20 +0000 (17:27 -0800)]
i40e: Fix channel addition in reset flow

Fix recreating the channel VSIs during the reset flow to reconfigure
the Tx rings and the queue context associated with the channel VSI.
Also update the next_base_queue for the VSI while rebuilding the
channel VSIs after a reset.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40evf: ignore link up if not running
Alan Brady [Fri, 5 Jan 2018 09:55:21 +0000 (04:55 -0500)]
i40evf: ignore link up if not running

If we receive the link status message from PF with link up before queues
are actually enabled, it will trigger a TX hang.  This fixes the issue
by ignoring a link up message if the VF state is not yet in RUNNING
state.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Delete an error message for a failed memory allocation in i40e_init_interrupt_s...
Markus Elfring [Mon, 1 Jan 2018 19:43:35 +0000 (20:43 +0100)]
i40e: Delete an error message for a failed memory allocation in i40e_init_interrupt_scheme()

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Disable iWARP VSI PETCP_ENA flag on netdev down events
Shiraz Saleem [Mon, 18 Dec 2017 10:18:22 +0000 (05:18 -0500)]
i40e: Disable iWARP VSI PETCP_ENA flag on netdev down events

Client close is overloaded to handle both un-registration and
netdev down event. On netdev down, i40iw client close is called
which unregisters the RDMA dev and this is too destructive
since the netdev is still registered.

Do not call client close/open on netdev down/up events. Instead
disable the PE TCP_ENA flag during a netdev down event. This
blocks all TCP traffic to the RDMA Protocol Engine. On netdev up,
re-enable the flag.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: simplify pointer dereferences
Mitch Williams [Mon, 18 Dec 2017 10:18:02 +0000 (05:18 -0500)]
i40e: simplify pointer dereferences

Now that i40e_vsi_config_tc() has the pf and hw variable defined, use
them, instead of dereferencing vsi->back. Much easier to read.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: check for invalid DCB config
Mitch Williams [Mon, 18 Dec 2017 10:17:42 +0000 (05:17 -0500)]
i40e: check for invalid DCB config

The driver (and the entire netdev layer for that matter) assumes
that TC0 will always be present in our DCB configuration.
Unfortunately, this isn't always the case. Rather than fail to
configure the VSI, let's go ahead and try to make it work, even
though DCB will end up being disabled by the kernel.

If the driver fails to configure DCB, the driver queries what's
valid, then writes that back to the hardware, always forcing TC0.

This fixes a bug where the driver could fail to adhere to ETS BW
allocations if 8 TCs were configured on the switch.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e/i40evf: Detect and recover hung queue scenario
Sudheer Mogilappagari [Mon, 18 Dec 2017 10:17:25 +0000 (05:17 -0500)]
i40e/i40evf: Detect and recover hung queue scenario

In VFs, there is a known issue which can cause writebacks
to not occur when interrupts are disabled and there are
less than 4 descriptors resulting in TX timeout. Timeout
can also occur due to lost interrupt.

The current implementation for detecting and recovering
from hung queues in the PF is problematic because it actually
actively encourages lost interrupts.  By triggering a SW
interrupt, interrupts are forced on.  If we are already in
napi_poll and an interrupt fires, napi_poll will not be
rescheduled and the interrupt is effectively lost; thereby
potentially *causing* hung queues.

This patch checks whether packets are being processed between
every watchdog cycle and determine potential hung queue and
fires triggers SW interrupt only for that particular queue.

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Fix for blinking activity instead of link LEDs
Michal Kuchta [Mon, 18 Dec 2017 10:17:03 +0000 (05:17 -0500)]
i40e: Fix for blinking activity instead of link LEDs

This fix solves an issue occurring while calling i40e_led_set function
from the driver with "blink" parameter set as TRUE. This call resulted
in Activity LED blinking instead of Link LED, which may lead to errors
in physically identifying the port, since Activity LED may be blinking
for different reasons as well.

Signed-off-by: Michal Kuchta <michal.kuchta@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40evf: Don't schedule reset_task when device is being removed
Avinash Dayanand [Mon, 18 Dec 2017 10:16:43 +0000 (05:16 -0500)]
i40evf: Don't schedule reset_task when device is being removed

When a host disables and enables a PF device, all the associated
VFs are removed and added back in. It also generates a PFR which in turn
resets all the connected VFs. This behaviour is different from that of
Linux guest on Linux host. Hence we end up in a situation where there's
a PFR and device removal at the same time. And watchdog doesn't have a
clue about this and schedules a reset_task. This patch adds code to send
signal to reset_task that the device is currently being removed.

Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40evf: remove flush_scheduled_work call in i40evf_remove
Sudheer Mogilappagari [Mon, 18 Dec 2017 10:16:23 +0000 (05:16 -0500)]
i40evf: remove flush_scheduled_work call in i40evf_remove

flush_schedule_work blocks until completion of all scheduled
work items in global work-queue. This can cause deadlock in some
cases. i40evf_remove() cleans up necessary work items with
cancel_delayed_work_sync and cancel_work_sync. This fix removes
flush_schedule_work call inside i40evf_remove().

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: avoid divide by zero
Mitch Williams [Mon, 18 Dec 2017 10:15:25 +0000 (05:15 -0500)]
i40e: avoid divide by zero

In some weird circumstances with DCB enabled, the firmware can fail to
configure the VSI, leaving us with zero traffic classes. Check for this
state when we configure RSS to avoid a panic.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e/i40evf: Enable NVMUpdate to retrieve AdminQ and add preservation flags for NVM...
Pawel Jablonski [Mon, 18 Dec 2017 10:14:44 +0000 (05:14 -0500)]
i40e/i40evf: Enable NVMUpdate to retrieve AdminQ and add preservation flags for NVM update

This patch adds new I40E_NVMUPD_GET_AQ_EVENT state to allow
retrieval of AdminQ events as a result of AdminQ commands sent
to firmware.

Add preservation flags support on X722 devices for NVM update
AdminQ function wrapper. Add new parameter and handling to
nvmupdate admin queue function intended to allow nvmupdate tool
to configure the preservation flags in the AdminQ command.

This is required to implement FlatNVM on X722 devices.

Signed-off-by: Pawel Jablonski <pawel.jablonski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Tue, 23 Jan 2018 18:49:06 +0000 (13:49 -0500)]
Merge git://git./linux/kernel/git/davem/net

en_rx_am.c was deleted in 'net-next' but had a bug fixed in it in
'net'.

The esp{4,6}_offload.c conflicts were overlapping changes.
The 'out' label is removed so we just return ERR_PTR(-EINVAL)
directly.

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoixgbe: register ipsec offload with the xfrm subsystem
Shannon Nelson [Wed, 20 Dec 2017 00:00:02 +0000 (16:00 -0800)]
ixgbe: register ipsec offload with the xfrm subsystem

With all the support code in place we can now link in the ipsec
offload operations and set the ESP feature flag for the XFRM
subsystem to see.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: ipsec offload stats
Shannon Nelson [Wed, 20 Dec 2017 00:00:01 +0000 (16:00 -0800)]
ixgbe: ipsec offload stats

Add a simple statistic to count the ipsec offloads.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: process the Tx ipsec offload
Shannon Nelson [Wed, 20 Dec 2017 00:00:00 +0000 (16:00 -0800)]
ixgbe: process the Tx ipsec offload

If the skb has a security association referenced in the skb, then
set up the Tx descriptor with the ipsec offload bits.  While we're
here, we fix an oddly named field in the context descriptor struct.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: process the Rx ipsec offload
Shannon Nelson [Tue, 19 Dec 2017 23:59:59 +0000 (15:59 -0800)]
ixgbe: process the Rx ipsec offload

If the chip sees and decrypts an ipsec offload, set up the skb
sp pointer with the ralated SA info.  Since the chip is rude
enough to keep to itself the table index it used for the
decryption, we have to do our own table lookup, using the
hash for speed.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: restore offloaded SAs after a reset
Shannon Nelson [Tue, 19 Dec 2017 23:59:58 +0000 (15:59 -0800)]
ixgbe: restore offloaded SAs after a reset

On a chip reset most of the table contents are lost, so must be
restored.  This scans the driver's ipsec tables and restores both
the filled and empty table slots to their pre-reset values.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: add ipsec offload add and remove SA
Shannon Nelson [Tue, 19 Dec 2017 23:59:57 +0000 (15:59 -0800)]
ixgbe: add ipsec offload add and remove SA

Add the functions for setting up and removing offloaded SAs (Security
Associations) with the x540 hardware.  We set up the callback structure
but we don't yet set the hardware feature bit to be sure the XFRM service
won't actually try to use us for an offload yet.

The software tables are made up to mimic the hardware tables to make it
easier to track what's in the hardware, and the SA table index is used
for the XFRM offload handle.  However, there is a hashing field in the
Rx SA tracking that will be used to facilitate faster table searches in
the Rx fast path.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: add ipsec data structures
Shannon Nelson [Tue, 19 Dec 2017 23:59:56 +0000 (15:59 -0800)]
ixgbe: add ipsec data structures

Set up the data structures to be used by the ipsec offload.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: add ipsec engine start and stop routines
Shannon Nelson [Tue, 19 Dec 2017 23:59:55 +0000 (15:59 -0800)]
ixgbe: add ipsec engine start and stop routines

Add in the code for running and stopping the hardware ipsec
encryption/decryption engine.  It is good to keep the engine
off when not in use in order to save on the power draw.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: add ipsec register access routines
Shannon Nelson [Tue, 19 Dec 2017 23:59:54 +0000 (15:59 -0800)]
ixgbe: add ipsec register access routines

Add a few routines to make access to the ipsec registers just a little
easier, and throw in the beginnings of an initialization.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Tue, 23 Jan 2018 16:52:55 +0000 (08:52 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix divide by zero in mlx5, from Talut Batheesh.

 2) Guard against invalid GSO packets coming from untrusted guests and
    arriving in qdisc_pkt_len_init(), from Eric Dumazet.

 3) Similarly add such protection to the various protocol GSO handlers.
    From Willem de Bruijn.

 4) Fix regression added to IGMP source address checking for IGMPv3
    reports, from Felix Feitkau.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  tls: Correct length of scatterlist in tls_sw_sendpage
  be2net: restore properly promisc mode after queues reconfiguration
  net: igmp: fix source address check for IGMPv3 reports
  gso: validate gso_type in GSO handlers
  net: qdisc_pkt_len_init() should be more robust
  ibmvnic: Allocate and request vpd in init_resources
  ibmvnic: Revert to previous mtu when unsupported value requested
  ibmvnic: Modify buffer size and number of queues on failover
  rds: tcp: compute m_ack_seq as offset from ->write_seq
  usbnet: silence an unnecessary warning
  cxgb4: fix endianness for vlan value in cxgb4_tc_flower
  cxgb4: set filter type to 1 for ETH_P_IPV6
  net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare

6 years agoixgbe: clean up ipsec defines
Shannon Nelson [Tue, 19 Dec 2017 23:59:53 +0000 (15:59 -0800)]
ixgbe: clean up ipsec defines

Clean up the ipsec/macsec descriptor bit definitions to match the rest
of the defines and file organization.  Also recognise the bit-definition
overlap in the error mask macro.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agosctp: reset ret in again path in sctp_for_each_transport
Xin Long [Tue, 23 Jan 2018 10:22:25 +0000 (18:22 +0800)]
sctp: reset ret in again path in sctp_for_each_transport

Commit 97a6ec4ac021 ("rhashtable: Change rhashtable_walk_start to
return void") only initialized ret for the first time, when going
to again path, the next tsp could be NULL. Without resetting ret,
cb_done would be called with tsp as NULL.

A kernel crash was caused by this when running sctpdiag testcase
in sctp-tests.

Note that this issue doesn't affect net.git yet.

Fixes: 97a6ec4ac021 ("rhashtable: Change rhashtable_walk_start to return void")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobnx2: remove redundant initializations of pointers txr and rxr
Colin Ian King [Tue, 23 Jan 2018 09:44:08 +0000 (09:44 +0000)]
bnx2: remove redundant initializations of pointers txr and rxr

Pointers txr and rxr are being initialized and a few statements later
are being assigned new values without the original values ever being
read. The initialized values are therefore redundant and can be
removed.

Cleans up clang warnings:
drivers/net/ethernet/broadcom/bnx2.c:5821:28: warning: Value stored to
'txr' during its initialization is never read
drivers/net/ethernet/broadcom/bnx2.c:5822:28: warning: Value stored to
'rxr' during its initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoforcedeth: remove duplicate structure member in rx
Zhu Yanjun [Tue, 23 Jan 2018 07:03:37 +0000 (02:03 -0500)]
forcedeth: remove duplicate structure member in rx

Since both first_rx_ctx and rx_skb are the head of rx ctx, it not
necessary to use two structure members to statically indicate
the head of rx ctx. So first_rx_ctx is removed.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'Kernel-doc-fixes-for-networking'
David S. Miller [Tue, 23 Jan 2018 16:06:51 +0000 (11:06 -0500)]
Merge branch 'Kernel-doc-fixes-for-networking'

Florian Fainelli says:

====================
Kernel doc fixes for networking

This patch series fixes kernel doc warnings found while running make htmldocs
pertaining to the networking subsystem. There is a finaly set of warnings due
to PHYLINK which I have not been able to resolve yet.

The last patch could thereoteically be applied to 'net' since the commit
referenced by the Fixes: tag is present in v4.15-rcX.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: core: Fix kernel-doc for netdev_upper_link()
Florian Fainelli [Tue, 23 Jan 2018 03:14:28 +0000 (19:14 -0800)]
net: core: Fix kernel-doc for netdev_upper_link()

Fixes the following warnings:
./net/core/dev.c:6438: warning: No description found for parameter 'extack'
./net/core/dev.c:6461: warning: No description found for parameter 'extack'

Fixes: 42ab19ee9029 ("net: Add extack to upper device linking")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: core: Fix kernel-doc for call_netdevice_notifiers_info()
Florian Fainelli [Tue, 23 Jan 2018 03:14:27 +0000 (19:14 -0800)]
net: core: Fix kernel-doc for call_netdevice_notifiers_info()

Remove the @dev comment, since we do not have a net_device argument, fixes the
following kernel doc warning: /net/core/dev.c:1707: warning: Excess function
parameter 'dev' description in 'call_netdevice_notifiers_info'

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: sfp: Fix kernel doc warning
Florian Fainelli [Tue, 23 Jan 2018 03:14:26 +0000 (19:14 -0800)]
net: phy: sfp: Fix kernel doc warning

We forgot to update the kernel doc header above sfp_register_upstream()

Fixes: c19bb00070dd ("sfp: convert to fwnode")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: core: Fix kernel-doc for carrier_* attributes
Florian Fainelli [Tue, 23 Jan 2018 03:14:25 +0000 (19:14 -0800)]
net: core: Fix kernel-doc for carrier_* attributes

Fix the documentation warning:

include/linux/netdevice.h:1939: warning: Excess struct member 'carrier_changes' description in 'net_device'

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: b2d3bcfa26a7 ("net: core: Expose number of link up/down transitions")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: aquantia: make symbol hw_atl_boards static
Wei Yongjun [Tue, 23 Jan 2018 02:10:38 +0000 (02:10 +0000)]
net: aquantia: make symbol hw_atl_boards static

Fixes the following sparse warning:

drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c:50:34: warning:
 symbol 'hw_atl_boards' was not declared. Should it be static?

Fixes: 4948293ff963 ("net: aquantia: Introduce new AQC devices and capabilities")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: aquantia: Fix error return code in aq_pci_probe()
Wei Yongjun [Tue, 23 Jan 2018 02:10:46 +0000 (02:10 +0000)]
net: aquantia: Fix error return code in aq_pci_probe()

Fix to return error code -ENOMEM from the aq_ndev_alloc() error
handling case instead of 0, as done elsewhere in this function.

Fixes: 23ee07ad3c2f ("net: aquantia: Cleanup pci functions module")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: fix error return code in nfp_pci_probe()
Wei Yongjun [Tue, 23 Jan 2018 02:10:27 +0000 (02:10 +0000)]
nfp: fix error return code in nfp_pci_probe()

Fix to return error code -EINVAL instead of 0 when num_vfs above
limit_vfs, as done elsewhere in this function.

Fixes: 0dc786219186 ("nfp: handle SR-IOV already enabled when driver is probing")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: fix fw dump handling of absolute rtsym size
Carl Heymann [Tue, 23 Jan 2018 01:29:43 +0000 (17:29 -0800)]
nfp: fix fw dump handling of absolute rtsym size

Fix bug that causes _absolute_ rtsym sizes of > 8 bytes (as per symbol
table) to result in incorrect space used during a TLV-based debug dump.

Detail: The size calculation stage calculates the correct size (size of
the rtsym address field == 8), while the dump uses the size in the table
to calculate the TLV size to reserve. Symbols with size <= 8 are handled
OK due to aligning sizes to 8, but including any absolute symbol with
listed size > 8 leads to an ENOSPC error during the dump.

Fixes: da762863edd9 ("nfp: fix absolute rtsym handling in debug dump")
Signed-off-by: Carl Heymann <carl.heymann@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfsd: auth: Fix gid sorting when rootsquash enabled
Ben Hutchings [Mon, 22 Jan 2018 20:11:06 +0000 (20:11 +0000)]
nfsd: auth: Fix gid sorting when rootsquash enabled

Commit bdcf0a423ea1 ("kernel: make groups_sort calling a responsibility
group_info allocators") appears to break nfsd rootsquash in a pretty
major way.

It adds a call to groups_sort() inside the loop that copies/squashes
gids, which means the valid gids are sorted along with the following
garbage.  The net result is that the highest numbered valid gids are
replaced with any lower-valued garbage gids, possibly including 0.

We should sort only once, after filling in all the gids.

Fixes: bdcf0a423ea1 ("kernel: make groups_sort calling a responsibility ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agotun: avoid calling xdp_rxq_info_unreg() twice
Cong Wang [Mon, 22 Jan 2018 21:49:27 +0000 (13:49 -0800)]
tun: avoid calling xdp_rxq_info_unreg() twice

Similarly to tx ring, xdp_rxq_info is only registered
when !tfile->detached, so we need to avoid calling
xdp_rxq_info_unreg() twice too. The helper tun_cleanup_tx_ring()
already checks for this properly, so it is correct to put
xdp_rxq_info_unreg() just inside there.

Reported-by: syzbot+1c788d7ce0f0888f1d7f@syzkaller.appspotmail.com
Fixes: 8565d26bcb2f ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoorangefs: initialize op on loop restart in orangefs_devreq_read
Martin Brandenburg [Mon, 22 Jan 2018 20:44:52 +0000 (15:44 -0500)]
orangefs: initialize op on loop restart in orangefs_devreq_read

In orangefs_devreq_read, there is a loop which picks an op off the list
of pending ops.  If the loop fails to find an op, there is nothing to
read, and it returns EAGAIN.  If the op has been given up on, the loop
is restarted via a goto.  The bug is that the variable which the found
op is written to is not reinitialized, so if there are no more eligible
ops on the list, the code runs again on the already handled op.

This is triggered by interrupting a process while the op is being copied
to the client-core.  It's a fairly small window, but it's there.

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoorangefs: use list_for_each_entry_safe in purge_waiting_ops
Martin Brandenburg [Mon, 22 Jan 2018 20:44:51 +0000 (15:44 -0500)]
orangefs: use list_for_each_entry_safe in purge_waiting_ops

set_op_state_purged can delete the op.

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMerge branch 'net-sched-add-extack-support-for-cls-offloads'
David S. Miller [Mon, 22 Jan 2018 21:30:30 +0000 (16:30 -0500)]
Merge branch 'net-sched-add-extack-support-for-cls-offloads'

Jakub Kicinski says:

====================
net: sched: add extack support for cls offloads

I've dropped the tests from the series because test_offloads.py changes
will conflict with bpf-next patches.  I will send four more patches with
tests once bpf-next is merged back, hopefully still making it into 4.16 :)

v4:
 - rebase on top of Alex's changes.

---

Quentin says:

This series tries to improve user experience when eBPF hardware offload
hits error paths at load time. In particular, it introduces netlink
extended ack support in the nfp driver.

To that aim, transmission of the pointer to the extack object is piped
through the `change()` operation of the existing classifiers (patch 1 to
6). Then it is used for TC offload in the nfp driver (patch 8) and in
netdevsim (patch 9, selftest in patch 10). Patch 7 adds a helper to handle
extack messages in the core when TC offload is disabled on the net device.

For completeness extack is propagated for classifiers other than cls_bpf,
but it's up to the drivers to make use of it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: bpf: use extack support to improve debugging
Quentin Monnet [Sat, 20 Jan 2018 01:44:50 +0000 (17:44 -0800)]
nfp: bpf: use extack support to improve debugging

Use the recently added extack support for eBPF offload in the driver.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: bpf: plumb extack into functions related to XDP offload
Quentin Monnet [Sat, 20 Jan 2018 01:44:49 +0000 (17:44 -0800)]
nfp: bpf: plumb extack into functions related to XDP offload

Pass a pointer to an extack object to nfp_app_xdp_offload() in order to
prepare for extack usage in the nfp driver. Next step will be to forward
this extack pointer to nfp_net_bpf_offload(), once this function is able
to use it for printing error messages.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: create tc_can_offload_extack() wrapper
Quentin Monnet [Sat, 20 Jan 2018 01:44:48 +0000 (17:44 -0800)]
net: sched: create tc_can_offload_extack() wrapper

Create a wrapper around tc_can_offload() that takes an additional
extack pointer argument in order to output an error message if TC
offload is disabled on the device.

In this way, the error message is handled by the core and can be the
same for all drivers.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: add extack support for offload via tc_cls_common_offload
Quentin Monnet [Sat, 20 Jan 2018 01:44:47 +0000 (17:44 -0800)]
net: sched: add extack support for offload via tc_cls_common_offload

Add extack support for hardware offload of classifiers. In order
to achieve this, a pointer to a struct netlink_ext_ack is added to the
struct tc_cls_common_offload that is passed to the callback for setting
up the classifier. Function tc_cls_common_offload_init() is updated to
support initialization of this new attribute.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: cls_bpf: plumb extack support in filter for hardware offload
Quentin Monnet [Sat, 20 Jan 2018 01:44:46 +0000 (17:44 -0800)]
net: sched: cls_bpf: plumb extack support in filter for hardware offload

Pass the extack pointer obtained in the `->change()` filter operation to
cls_bpf_offload() and then to cls_bpf_offload_cmd(). This makes it
possible to use this extack pointer in drivers offloading BPF programs
in a future patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: cls_u32: propagate extack support for filter offload
Quentin Monnet [Sat, 20 Jan 2018 01:44:45 +0000 (17:44 -0800)]
net: sched: cls_u32: propagate extack support for filter offload

Propagate the extack pointer from the `->change()` classifier operation
to the function used for filter replacement in cls_u32. This makes it
possible to use netlink extack messages in the future at replacement
time for this filter, although it is not used at this point.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: cls_matchall: propagate extack support for filter offload
Quentin Monnet [Sat, 20 Jan 2018 01:44:44 +0000 (17:44 -0800)]
net: sched: cls_matchall: propagate extack support for filter offload

Propagate the extack pointer from the `->change()` classifier operation
to the function used for filter replacement in cls_matchall. This makes
it possible to use netlink extack messages in the future at replacement
time for this filter, although it is not used at this point.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: cls_flower: propagate extack support for filter offload
Quentin Monnet [Sat, 20 Jan 2018 01:44:43 +0000 (17:44 -0800)]
net: sched: cls_flower: propagate extack support for filter offload

Propagate the extack pointer from the `->change()` classifier operation
to the function used for filter replacement in cls_flower. This makes it
possible to use netlink extack messages in the future at replacement
time for this filter, although it is not used at this point.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotls: Correct length of scatterlist in tls_sw_sendpage
Dave Watson [Fri, 19 Jan 2018 20:30:13 +0000 (12:30 -0800)]
tls: Correct length of scatterlist in tls_sw_sendpage

The scatterlist is reused by both sendmsg and sendfile.
If a sendmsg of smaller number of pages is followed by a sendfile
of larger number of pages, the scatterlist may be too short, resulting
in a crash in gcm_encrypt.

Add sg_unmark_end to make the list the correct length.

tls_sw_sendmsg already calls sg_unmark_end correctly when it allocates
memory in alloc_sg, or in zerocopy_from_iter.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agohv_netvsc: Use the num_online_cpus() for channel limit
Haiyang Zhang [Fri, 19 Jan 2018 20:26:43 +0000 (13:26 -0700)]
hv_netvsc: Use the num_online_cpus() for channel limit

Since we no longer localize channel/CPU affiliation within one NUMA
node, num_online_cpus() is used as the number of channel cap, instead of
the number of processors in a NUMA node.

This patch allows a bigger range for tuning the number of channels.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobe2net: restore properly promisc mode after queues reconfiguration
Ivan Vecera [Fri, 19 Jan 2018 19:23:50 +0000 (20:23 +0100)]
be2net: restore properly promisc mode after queues reconfiguration

The commit 622190669403 ("be2net: Request RSS capability of Rx interface
depending on number of Rx rings") modified be_update_queues() so the
IFACE (HW representation of the netdevice) is destroyed and then
re-created. This causes a regression because potential promiscuous mode
is not restored properly during be_open() because the driver thinks
that the HW has promiscuous mode already enabled.

Note that Lancer is not affected by this bug because RX-filter flags are
disabled during be_close() for this chipset.

Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Fixes: 622190669403 ("be2net: Request RSS capability of Rx interface depending on number of Rx rings")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: converting spaces into tabs to avoid checkpatch.pl warning
Salil Mehta [Fri, 19 Jan 2018 15:20:53 +0000 (15:20 +0000)]
net: hns3: converting spaces into tabs to avoid checkpatch.pl warning

Spaces were mistakenly used instead of tabs in some of the code related
to reset functionality, which caused checkpatch.pl errors. These were
missed earlier so fixing them now.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: igmp: fix source address check for IGMPv3 reports
Felix Fietkau [Fri, 19 Jan 2018 10:50:46 +0000 (11:50 +0100)]
net: igmp: fix source address check for IGMPv3 reports

Commit "net: igmp: Use correct source address on IGMPv3 reports"
introduced a check to validate the source address of locally generated
IGMPv3 packets.
Instead of checking the local interface address directly, it uses
inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
local subnet (or equal to the point-to-point address if used).

This breaks for point-to-point interfaces, so check against
ifa->ifa_local directly.

Cc: Kevin Cernekee <cernekee@chromium.org>
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb3: assign port id to net_device->dev_port
Arjun Vynipadath [Fri, 19 Jan 2018 09:41:48 +0000 (15:11 +0530)]
cxgb3: assign port id to net_device->dev_port

T3 devices have different ports on same PCI function,
so using dev_port to identify ports.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobridge: return boolean instead of integer in br_multicast_is_router
Gustavo A. R. Silva [Thu, 18 Jan 2018 23:37:45 +0000 (17:37 -0600)]
bridge: return boolean instead of integer in br_multicast_is_router

Return statements in functions returning bool should use
true/false instead of 1/0.

This issue was detected with the help of Coccinelle.

Fixes: 85b352693264 ("bridge: Fix build error when IGMP_SNOOPING is not enabled")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: Fix reception of Broadcom switches tags
Florian Fainelli [Thu, 18 Jan 2018 23:12:21 +0000 (15:12 -0800)]
net: stmmac: Fix reception of Broadcom switches tags

Broadcom tags inserted by Broadcom switches put a 4 byte header after
the MAC SA and before the EtherType, which may look like some sort of 0
length LLC/SNAP packet (tcpdump and wireshark do think that way). With
ACS enabled in stmmac the packets were truncated to 8 bytes on
reception, whereas clearing this bit allowed normal reception to occur.

In order to make that possible, we need to pass a net_device argument to
the different core_init() functions and we are dependent on the Broadcom
tagger padding packets correctly (which it now does). To be as little
invasive as possible, this is only done for gmac1000 when the network
device is DSA-enabled (netdev_uses_dsa() returns true).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'hns3-new-features'
David S. Miller [Mon, 22 Jan 2018 21:05:50 +0000 (16:05 -0500)]
Merge branch 'hns3-new-features'

Peng Li says:

====================
add some features to hns3 driver

This patchset adds some features to hns3 driver, include the support
for ethtool command -d, -p and support for manager table.

[Patch 1/4] adds support for ethtool command -d, its ops is get_regs.
driver will send command to command queue, and get regs number and
regs value from command queue.
[Patch 2/4] adds manager table initialization for hardware.
[Patch 3/4] adds support for ethtool command -p. For fiber ports, driver
sends command to command queue, and IMP will write SGPIO regs to control
leds.
[Patch 4/4] adds support for net status led for fiber ports. Net status
include  port speed, total rx/tx packets and link status. Driver send
the status to command queue, and IMP will write SGPIO to control leds.

---
Change log:
V1 -> V2:
1, fix comments from Andrew Lunn, remove the patch "net: hns3: add
ethtool -p support for phy device".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>