Avinash Dayanand [Wed, 27 Dec 2017 13:22:11 +0000 (08:22 -0500)]
i40e: Fix kdump failure
kdump fails on systems that use the X722/X710 Ethernet driver. This is
mainly because, when resources are constrained (i.e. when only one CPU
is online), we still enable VMDq and iWARP. It doesn't make sense to
enable them with just one CPU and starve kdump of IRQs.
So don't enable VMDq or iWARP when we just have a single CPU.
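A minimal userspace sketch of the rule described above, not the driver code itself. The feature bit names and the select_features() helper are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits; the real driver uses its own flag names. */
#define FEAT_VMDQ  (1u << 0)
#define FEAT_IWARP (1u << 1)

/*
 * Return the subset of hardware-supported features worth enabling.
 * With a single online CPU (as in a kdump kernel), VMDq and iWARP are
 * skipped so that their interrupt vectors stay available.
 */
static uint32_t select_features(uint32_t hw_caps, unsigned int online_cpus)
{
	uint32_t enabled = hw_caps;

	if (online_cpus <= 1)
		enabled &= ~(FEAT_VMDQ | FEAT_IWARP);
	return enabled;
}
```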
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Wed, 27 Dec 2017 13:21:03 +0000 (08:21 -0500)]
i40e: cleanup unnecessary parens
Clean up unnecessary parentheses.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Alan Brady [Wed, 27 Dec 2017 13:19:19 +0000 (08:19 -0500)]
i40e: fix FW_LLDP flag on init
Using ethtool --set-priv-flags disable-fw-lldp <on/off> is persistent
across reboots/reloads so we need some mechanism in the driver to detect
if it's on or off on init so we can set the ethtool private flag
appropriately. Without this, every time the driver is reloaded the flag
will default to off regardless of whether it's on or off in FW.
We detect this by first attempting to program DCB; if the AQ command
fails with I40E_AQ_RC_EPERM, we know that LLDP is disabled in FW.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dave Ertman [Wed, 27 Dec 2017 13:18:21 +0000 (08:18 -0500)]
i40e: Implement an ethtool private flag to stop LLDP in FW
Implement the private flag disable-fw-lldp for ethtool
to disable the processing of LLDP packets by the FW.
This will stop the FW from consuming LLDPDUs and cause
them to be sent up the stack.
The FW is also being configured to apply a default DCB
configuration on link up.
Toggling the value of this flag will also cause a PF reset.
Disabling FW DCB will also disable DCBx.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alice Michael [Wed, 27 Dec 2017 13:17:50 +0000 (08:17 -0500)]
i40e: change flags to use 64 bits
As we have added more flags, we now need more bits and have
overflowed the 32-bit size. So make it 64.
Also change all the existing bit definitions to unsigned long long.
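The overflow can be illustrated with a small standalone sketch. The flag names below are hypothetical, not the driver's. The point is only that a flag above bit 31 is lost in a 32-bit word, and that the bit constants themselves must be unsigned long long, because shifting a plain int by 32 or more is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names for illustration only. */
#define FLAG_LOW   (1ULL << 3)   /* fits in 32 bits */
#define FLAG_HIGH  (1ULL << 33)  /* needs a 64-bit flags word */

/* Storing flags in a 32-bit variable silently drops bits above bit 31. */
static int high_flag_survives_u32(void)
{
	uint32_t flags32 = (uint32_t)(FLAG_LOW | FLAG_HIGH);

	return (flags32 & FLAG_HIGH) != 0;
}

/* A 64-bit variable keeps every flag. */
static int high_flag_survives_u64(void)
{
	uint64_t flags64 = FLAG_LOW | FLAG_HIGH;

	return (flags64 & FLAG_HIGH) != 0;
}
```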
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Upasana Menon [Wed, 27 Dec 2017 13:17:07 +0000 (08:17 -0500)]
i40e: Display LLDP information on vSphere Web Client
This patch enables the driver to display LLDP information on the vSphere Web
Client with Intel adapters (X710, XL710) and Distributed Virtual Switch.
Signed-off-by: Upasana Menon <upasana.menon@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 27 Dec 2017 13:15:51 +0000 (08:15 -0500)]
i40e/i40evf: Use ring pointers to clean up _set_itr_per_queue
This change cleans up the i40e/i40evf_set_itr_per_queue function by
dropping all the unneeded pointer chases. Instead we can just pull out the
pointers for the Tx and Rx rings and use them throughout the function.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Paweł Jabłoński [Wed, 27 Dec 2017 12:32:32 +0000 (07:32 -0500)]
i40evf: Allow turning off offloads when the VF has VLAN set
This patch adds back the capability to turn off offloads when the VF has
a VLAN set. The commit
0a3b4f702fb1 ("i40evf: enable support for VF VLAN
tag stripping control") added the i40evf_set_features function and
changed the 'turn off' flow for offloads. This patch restores that
capability by moving the check of the VF's VLAN option to the
next statement.
Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Patryk Małek [Wed, 27 Dec 2017 12:32:31 +0000 (07:32 -0500)]
i40e: Fix for adding multiple ethtool filters on the same location
This patch reorders i40e_add_del_fdir and i40e_update_ethtool_fdir_entry
calls so that we first remove an already existing filter (inside
i40e_update_ethtool_fdir_entry using i40e_add_del_fdir) and then
we add a new one with i40e_add_del_fdir.
After applying this patch, creating multiple identical filters (with
the same location) one after another doesn't revert their behavior
but behaves correctly.
Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Michal Kosiarz [Wed, 27 Dec 2017 13:14:40 +0000 (08:14 -0500)]
i40e: Add returning AQ critical error to SW
The FW has the ability to return a critical error on every AQ command.
When this critical error occurs then we need to send the correct response
to the caller.
Signed-off-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David Ahern [Thu, 25 Jan 2018 03:37:38 +0000 (19:37 -0800)]
net/ipv4: Allow send to local broadcast from a socket bound to a VRF
Message sends to the local broadcast address (255.255.255.255) require
uc_index or sk_bound_dev_if to be set to an egress device. However,
responses are only received if the socket is bound to the device. This
is overly constraining for processes running in an L3 domain. This
patch allows a socket bound to the VRF device to send to the local
broadcast address by using IP_UNICAST_IF to set the egress interface
with packet receipt handled by the VRF binding.
Similar to IP_MULTICAST_IF, relax the constraint on setting
IP_UNICAST_IF if a socket is bound to an L3 master device. In this
case allow uc_index to be set to an enslaved device if sk_bound_dev_if is
an L3 master device and is the master device for the ifindex.
In udp and raw sendmsg, allow uc_index to override the oif if
uc_index master device is oif (i.e., the oif is an L3 master and the
index is an L3 slave).
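The override rule for the egress interface can be sketched as a small userspace model. This is not the kernel code: the master_of[] table and the pick_egress_ifindex() helper are hypothetical stand-ins for the kernel's L3 master lookup, and only the override rule from the paragraph above is modelled.

```c
#include <assert.h>

/*
 * Hypothetical device table: master_of[i] is the ifindex of the L3
 * master (VRF) device that ifindex i is enslaved to, or 0 if none.
 */
static int pick_egress_ifindex(const int *master_of, int ndev,
			       int oif, int uc_index)
{
	/* uc_index overrides oif only when uc_index is a slave of oif. */
	if (oif != 0 && uc_index > 0 && uc_index < ndev &&
	    master_of[uc_index] == oif)
		return uc_index;

	return oif;
}
```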
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Jan 2018 02:39:43 +0000 (21:39 -0500)]
Merge branch 'net-erspan-add-support-for-openvswitch'
William Tu says:
====================
net: erspan: add support for openvswitch
The first patch refactors the erspan header definitions.
Originally, the erspan fields were defined as a group in a __be16 field,
and accessed with mask and offset. This is more costly due to
calling ntohs/htons, and error-prone. The first patch changes it to use
bitfields. The second patch creates erspan.h in UAPI and moves the definition
of 'struct erspan_metadata' to it for later use by openvswitch. The final patch
introduces the new OVS tunnel key attribute, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
to program both v1 and v2 erspan tunnel for openvswitch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
William Tu [Thu, 25 Jan 2018 21:20:11 +0000 (13:20 -0800)]
openvswitch: add erspan version I and II support
The patch adds support for openvswitch to configure erspan
v1 and v2. The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr is added
to uapi as a binary blob to support all ERSPAN v1 and v2's
fields. Note that the previous commit "openvswitch: Add erspan tunnel
support." was reverted because it was not designed properly.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
William Tu [Thu, 25 Jan 2018 21:20:10 +0000 (13:20 -0800)]
net: erspan: create erspan metadata uapi header
The patch adds a new uapi header file, erspan.h, and moves
the 'struct erspan_metadata' from internal erspan.h to it.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
William Tu [Thu, 25 Jan 2018 21:20:09 +0000 (13:20 -0800)]
net: erspan: use bitfield instead of mask and offset
Originally the erspan fields were defined as a group in a __be16 field,
and accessed with mask and offset. This is more costly due to
calling ntohs/htons. The patch changes it to use bitfields.
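The difference between the two access styles can be sketched on a made-up 16-bit word. The layout below (4-bit version, 12-bit VLAN) and all names are assumptions for illustration and are not the erspan header definition. Bitfield layout in C is implementation-defined, so a real wire-format header also has to handle endianness explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 4-bit version in the top bits, 12-bit VLAN below. */
#define VER_SHIFT 12
#define VER_MASK  0xf000u
#define VLAN_MASK 0x0fffu

/* Mask-and-offset style: every access shifts and masks by hand. */
static uint16_t get_ver_masked(uint16_t word)
{
	return (uint16_t)((word & VER_MASK) >> VER_SHIFT);
}

static uint16_t get_vlan_masked(uint16_t word)
{
	return (uint16_t)(word & VLAN_MASK);
}

static uint16_t set_vlan_masked(uint16_t word, uint16_t vlan)
{
	return (uint16_t)((word & ~VLAN_MASK) | (vlan & VLAN_MASK));
}

/* Bitfield style: the compiler generates the shifts and masks. */
struct hdr_bits {
	unsigned int ver  : 4;
	unsigned int vlan : 12;
};

static unsigned int vlan_roundtrip_bitfield(unsigned int vlan)
{
	struct hdr_bits h = { .ver = 1, .vlan = 0 };

	h.vlan = vlan;	/* values wider than 12 bits are truncated */
	return h.vlan;
}
```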
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Jan 2018 02:23:09 +0000 (21:23 -0500)]
Merge branch 'use-tc_cls_can_offload_and_chain0-throughout-the-drivers'
Jakub Kicinski says:
====================
use tc_cls_can_offload_and_chain0() throughout the drivers
This set makes all drivers use a new tc_cls_can_offload_and_chain0()
helper which will set extack in case TC hw offload flag is disabled.
I chose to keep the new helper which also looks at the chain but
renamed it more appropriately. The rationale being that most drivers
don't accept chains other than 0 and since we have to pass extack
to the helper we can as well pass the entire struct tc_cls_common_offload
and perform the most common checks.
This code makes the assumption that type_data in the callback can
be interpreted as struct tc_cls_common_offload, i.e. the real offload
structure has common part as the first member. This allows us to
make the check once for all classifier types if driver supports
more than one.
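The "common part as the first member" assumption can be sketched in plain C. The struct and function names below are hypothetical, not the kernel's. The sketch relies only on the C rule that a pointer to a structure, suitably converted, points to its first member.

```c
#include <assert.h>

/* Hypothetical common header shared by all classifier offload structs. */
struct common {
	unsigned int chain_index;
};

/* Two different offload structures, each starting with the common part. */
struct flower_offload {
	struct common common;
	int command;
};

struct u32_offload {
	struct common common;
	int handle;
};

/* One check works for every classifier type passed as opaque type_data. */
static unsigned int chain_of(const void *type_data)
{
	const struct common *c = type_data;

	return c->chain_index;
}
```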
v1:
- drop the type validation in nfp and netdevsim.
v2:
- reorder checks in patch 1;
- split other changes from patch 1;
- add the i40e patch in;
- add one more test case - for chain 0 extack.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:53 +0000 (14:00 -0800)]
selftests/bpf: check for chain-non-0 extack message
Make sure netdevsim doesn't allow offload of chains other than 0,
and that it reports the expected extack message.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:52 +0000 (14:00 -0800)]
selftests/bpf: check for spurious extacks from the driver
Drivers should not report errors when offload is not forced.
Check stdout and stderr for familiar messages both with no
skip flags and with skip_hw. Check for add, replace, and
destroy.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:51 +0000 (14:00 -0800)]
mlxsw: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:50 +0000 (14:00 -0800)]
i40e: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:49 +0000 (14:00 -0800)]
ixgbe: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:48 +0000 (14:00 -0800)]
bnxt: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:47 +0000 (14:00 -0800)]
mlx5: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:46 +0000 (14:00 -0800)]
cxgb4: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:45 +0000 (14:00 -0800)]
nfp: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:44 +0000 (14:00 -0800)]
netdevsim: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 25 Jan 2018 22:00:43 +0000 (14:00 -0800)]
pkt_cls: add new tc cls helper to check offload flag and chain index
Very few (mlxsw) upstream drivers seem to allow offload of chains
other than 0. Save driver developers typing and add a helper for
checking both if ethtool's TC offload flag is on and if chain is 0.
This helper will set the extack appropriately in both error cases.
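A userspace model of such a helper might look like the sketch below. The types, field names and message strings are stand-ins chosen for illustration and should not be read as the kernel definitions. Only the two checks and the extack reporting follow the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; names and fields are hypothetical. */
struct ext_ack {
	const char *msg;
};

struct cls_common {
	unsigned int chain_index;
	struct ext_ack *extack;
};

struct netdev {
	bool hw_tc_enabled;	/* models ethtool's TC offload flag */
};

/* Return true only if offload is enabled and the chain is 0. */
static bool can_offload_and_chain0(const struct netdev *dev,
				   struct cls_common *common)
{
	if (!dev->hw_tc_enabled) {
		if (common->extack)
			common->extack->msg = "TC offload is disabled";
		return false;
	}
	if (common->chain_index != 0) {
		if (common->extack)
			common->extack->msg = "only chain 0 is supported";
		return false;
	}
	return true;
}
```

A driver callback built on this model would call the helper once at the top and return an error if it fails, leaving the message in extack for user space.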
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rohit Visavalia [Thu, 25 Jan 2018 10:26:14 +0000 (15:56 +0530)]
qed: code indent should use tabs where possible
Issue found by checkpatch.
Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Acked-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rohit Visavalia [Thu, 25 Jan 2018 12:58:24 +0000 (18:28 +0530)]
be2net: networking block comments don't use an empty /* line
Resolved Warning: networking block comments don't use an empty /* line,
use /* Comment...
Issue found by checkpatch.
Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Jan 2018 21:32:28 +0000 (16:32 -0500)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:
====================
pull request: bluetooth-next 2018-01-25
Here's one last bluetooth-next pull request for the 4.16 kernel:
- Improved support for Intel controllers
- New set_parity method to serdev (agreed with maintainers to be taken
through bluetooth-next)
- Fix error path in hci_bcm (missing call to serdev close)
- New ID for BCM4343A0 UART controller
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Thu, 25 Jan 2018 07:59:43 +0000 (13:29 +0530)]
cxgb4: fix possible deadlock
t4_wr_mbox_meat_timeout() can be called from both softirq
context and process context, hence protect the mbox with
spin_lock_bh() instead of simple spin_lock()
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 25 Jan 2018 03:45:29 +0000 (19:45 -0800)]
net/ipv6: Do not allow route add with a device that is down
IPv6 allows routes to be installed when the device is not up (admin up).
Worse, it does not mark it as LINKDOWN. IPv4 does not allow it and really
there is no reason for IPv6 to allow it, so check the flags and deny if
device is admin down.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Jan 2018 21:10:43 +0000 (16:10 -0500)]
Merge branch 'net-smc-more-socket-closing-improvements'
Ursula Braun says:
====================
net/smc: more socket closing improvements
these patches improve the smc behavior for abnormal socket closing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 25 Jan 2018 10:15:36 +0000 (11:15 +0100)]
net/smc: check for healthy link group resp. connections
If a problem for at least one connection of a link group is detected,
the whole link group and all its connections are terminated.
This patch adds a check for healthy link group when trying to reserve
a work request, and checks for healthy connections before starting
a tx worker.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 25 Jan 2018 10:15:35 +0000 (11:15 +0100)]
net/smc: wake up wr_reg_wait when terminating a link group
If a new connection with a new rmb is added to a link group, its
memory region is registered. If a link group is terminated, a pending
registration requires a wake up.
And consolidate setting of tx_flag peer_conn_abort in smc_lgr_terminate().
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 25 Jan 2018 10:15:34 +0000 (11:15 +0100)]
net/smc: do not reuse a linkgroup with setup problems
Once a linkgroup is created successfully, it stays alive for a
certain time to service more connections potentially created.
If one of the initialization steps for a new linkgroup fails,
the linkgroup should not be reused by other connections following.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 25 Jan 2018 10:15:33 +0000 (11:15 +0100)]
net/smc: terminate link group for ib_post_send problems
If ib_post_send() fails, terminate all connections of this
link group.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 25 Jan 2018 10:15:32 +0000 (11:15 +0100)]
net/smc: handle state SMC_PEERFINCLOSEWAIT correctly
A state transition from closing state SMC_PEERFINCLOSEWAIT to closing
state SMC_APPFINCLOSEWAIT is not allowed. Once a closing indication
from the peer has been received, the socket reaches state SMC_CLOSED.
And receiving a peer_conn_abort just changes the state of the socket
into one of the states SMC_PROCESSABORT or SMC_CLOSED;
sending a peer_conn_abort occurs in smc_close_active() for state
SMC_PROCESSABORT only.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 25 Jan 2018 10:15:31 +0000 (11:15 +0100)]
net/smc: cancel tx worker in case of socket aborts
If an SMC socket is aborted, the tx worker should be cancelled.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Jan 2018 21:05:15 +0000 (16:05 -0500)]
Merge branch 'sfc-support-PTP-on-8000-and-X2000-series-NICs'
Edward Cree says:
====================
sfc: support PTP on 8000 and X2000 series NICs
Starting from the 8000-series (Medford 1), SFC NICs can timestamp TX packets
sent through an ordinary DMA queue, rather than a special control-plane
operation as in the 7000-series. Patches 2-8 implement support for this.
The X2000-series (Medford 2) changes the format of timestamps, from seconds+
(2^27)ths to seconds + quarter nanoseconds, as well as changing the shift
of the frequency adjustment for increased precision. Patches 9-12
implement support for these changes.
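The two timestamp formats can be compared with a short conversion sketch. The function names are hypothetical, and truncating (rather than rounding) the fractional part is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Older format: minor counts units of 1/2^27 of a second. */
static uint64_t ns_from_s27(uint32_t sec, uint32_t minor)
{
	return sec * NSEC_PER_SEC + ((minor * NSEC_PER_SEC) >> 27);
}

/* Newer format: minor counts quarter nanoseconds. */
static uint64_t ns_from_qns(uint32_t sec, uint32_t minor)
{
	return sec * NSEC_PER_SEC + (minor >> 2);
}
```

Both calls below describe the same instant, 1.5 seconds: ns_from_s27(1, 1u << 26) and ns_from_qns(1, 2000000000).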
Patch #1 is an unrelated fix for NAPI budget handling, needed in order for
TX completion changes in the later patches to apply cleanly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Laurence Evans [Thu, 25 Jan 2018 17:28:04 +0000 (17:28 +0000)]
sfc: support Medford2 frequency adjustment format
Support increased precision frequency adjustment format (FP44) used
by Medford2 adapters.
Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Thu, 25 Jan 2018 17:27:40 +0000 (17:27 +0000)]
sfc: support second + quarter ns time format for receive datapath
The time_format that we stash in the PTP data structure is never
referenced, so we can remove it. Instead, store the information needed
to interpret sync event timestamps.
Also rolls in a couple of other related minor PTP fixes.
Based on patches by Bert Kenward <bkenward@solarflare.com> and Laurence
Evans <levans@solarflare.com>.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Laurence Evans [Thu, 25 Jan 2018 17:27:22 +0000 (17:27 +0000)]
sfc: support separate PTP and general timestamping
Support MC_CMD_PTP_OUT_GET_TIMESTAMP_CORRECTIONS_V2. Extract general
timestamp corrections in addition to PTP corrections. Apply receive
timestamp corrections for general datapath receive timestamping, and
correspondingly for transmit.
Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Laurence Evans [Thu, 25 Jan 2018 17:27:02 +0000 (17:27 +0000)]
sfc: simplify RX datapath timestamping
Use timestamp conversion function with correction to avoid duplicate
correction handling.
Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Thu, 25 Jan 2018 17:26:31 +0000 (17:26 +0000)]
sfc: only advertise TX timestamping if we have the license for it
We check the license for TX hardware timestamping capability.
The PTP probe will have enabled PTP sync events from the adapter. If
later, at TX queue init, it turns out we do not have the license, we
don't need the sync events either.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Thu, 25 Jan 2018 17:26:06 +0000 (17:26 +0000)]
sfc: on 8000 series use TX queues for TX timestamps
For this we create and use one or more new TX queues on the PTP channel,
and enable sync events for it.
Based on a patch by Martin Habets <mhabets@solarflare.com>.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Thu, 25 Jan 2018 17:25:50 +0000 (17:25 +0000)]
sfc: MAC TX timestamp handling on the 8000 series
TX timestamps on 8000 series are supplied from the MAC. This timestamp is
only 48 bits long. The high order bits from the last time sync event are
used for the top 16 bits.
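The extension from 48 to 64 bits can be sketched as below. The extend_ts48() helper and the flat 64-bit layout are assumptions for illustration, and the sketch deliberately ignores the case where the low 48 bits wrap between the sync event and the TX completion.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine a 48-bit hardware timestamp with the top 16 bits taken from
 * the most recent time sync event.
 */
static uint64_t extend_ts48(uint64_t ts48, uint64_t last_sync)
{
	const uint64_t low_mask = (1ULL << 48) - 1;

	return (last_sync & ~low_mask) | (ts48 & low_mask);
}
```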
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Thu, 25 Jan 2018 17:25:33 +0000 (17:25 +0000)]
sfc: only enable TX timestamping if the adapter is licensed for it
If we try to enable the feature and do not have the license for it, the
MCPU will refuse and fail our TX queue init.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Thu, 25 Jan 2018 17:25:15 +0000 (17:25 +0000)]
sfc: use main datapath for HW timestamps if available
We can now transmit SKBs in 2 ways:
1. Via the MC (for the 7XXX series and earlier), using
efx_ptp_xmit_skb_mc().
2. Via the TX queues on the dedicated PTP channel (8XXX series and later),
using efx_ptp_xmit_skb_queue().
The PTP worker thread uses the method set up at probe time. It never
checked the return code from the old efx_ptp_xmit_skb(), so it now
returns void.
We increment the TX dropped counter of the device if the transmit fails.
As a result of the probe per channel the remove gets called multiple times.
Clean up efx->ptp_data properly to avoid the 2nd call blowing up.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Thu, 25 Jan 2018 17:24:56 +0000 (17:24 +0000)]
sfc: add function to determine which TX timestamping method to use
Use MC capability MC_CMD_GET_CAPABILITIES_V2_OUT_TX_MAC_TIMESTAMPING to
detect whether the NIC supports timestamping packets sent out the main
datapath.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Thu, 25 Jan 2018 17:24:43 +0000 (17:24 +0000)]
sfc: handle TX timestamps in the normal data path
Before this work, TX timestamping was done by sending each SKB to the MC.
On the 8000 series (Medford1) we have high speed timestamping via the
MAC, which means we can use normal TX queues for this without a
significant drop in bandwidth. On the X2000 series (Medford2) support
for transmitting via the MC is removed, so the new way must be used.
This patch enables timestamping on a TX queue, if requested.
It also enhances TX event handling to process the extra completion events,
and puts the time in the SKB.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bert Kenward [Thu, 25 Jan 2018 17:24:20 +0000 (17:24 +0000)]
sfc: remove tx and MCDI handling from NAPI budget consideration
The NAPI budget is only for RX processing work, not other work such as
TX or MCDI completion handling.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Tkhai [Fri, 19 Jan 2018 16:14:53 +0000 (19:14 +0300)]
net: Move net:netns_ids destruction out of rtnl_lock() and document locking scheme
Currently, we unhash a dying net from netns_ids lists
under rtnl_lock(). It's a leftover from the time when
net::netns_ids was introduced. There was no net::nsid_lock,
and rtnl_lock() was mostly needed to order modifications
of the nsid idr of alive nets, i.e. for:
	for_each_net(tmp) {
		...
		id = __peernet2id(tmp, net);
		idr_remove(&tmp->netns_ids, id);
		...
	}
Since we have net::nsid_lock, the modifications are
protected by this local lock, and now we may introduce
a better scheme of netns_ids destruction.
Let's look at the functions peernet2id_alloc() and
get_net_ns_by_id(). Previous commits taught these
functions to work well with a dying net acquired from
rtnl unlocked lists. And they are the only functions
which can hash a net to netns_ids or obtain one from there.
And, as is easy to check, the other functions operating on
netns_ids work with ids, not with net pointers. So, we do not
need rtnl_lock to synchronize cleanup_net() with any of them.
Another property, which is used in the patch,
is that a net is unhashed from net_namespace_list
in only one place and by only one process. So,
we avoid an excess rcu_read_lock() or rtnl_lock()
when we're iterating over the list in unhash_nsid().
All the above makes it possible to keep rtnl_lock() locked
only for net->list deletion, and completely avoid it
for netns_ids unhashing and destruction. As these two
operations may take a long time (e.g., memory allocation
to send an skb), the patch should improve scalability
and significantly decrease the time for which
rtnl_lock() is held in cleanup_net().
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tedd Ho-Jeong An [Wed, 24 Jan 2018 17:19:21 +0000 (09:19 -0800)]
Bluetooth: btintel: Create common function for firmware download
The firmware download flow for RAM SKU is the same for both USB and UART,
and this patch creates a common function for both drivers.
Signed-off-by: Tedd Ho-Jeong An <tedd.an@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
David S. Miller [Thu, 25 Jan 2018 04:48:11 +0000 (23:48 -0500)]
Merge branch 'rebased-net-ioctl' of git://git./linux/kernel/git/viro/vfs
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Jan 2018 04:44:15 +0000 (23:44 -0500)]
Merge git://git./linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 25 Jan 2018 01:24:30 +0000 (17:24 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Avoid negative netdev refcount in error flow of xfrm state add, from
Aviad Yehezkel.
2) Fix tcpdump decoding of IPSEC decap'd frames by filling in the
ethernet header protocol field in xfrm{4,6}_mode_tunnel_input().
From Yossi Kuperman.
3) Fix a syzbot triggered skb_under_panic in pppoe having to do with
failing to allocate an appropriate amount of headroom. From
Guillaume Nault.
4) Fix memory leak in vmxnet3 driver, from Neil Horman.
5) Cure out-of-bounds packet memory access in em_nbyte EMATCH module,
from Wolfgang Bumiller.
6) Restrict what kinds of sockets can be bound to the KCM multiplexer
and also disallow when another layer has attached to the socket and
made use of sk_user_data. From Tom Herbert.
7) Fix use before init of IOTLB in vhost code, from Jason Wang.
8) Correct STACR register write bit definition in IBM emac driver, from
Ivan Mikhaylov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net/ibm/emac: wrong bit is used for STA control register write
net/ibm/emac: add 8192 rx/tx fifo size
vhost: do not try to access device IOTLB when not initialized
vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()
i40e: flower: check if TC offload is enabled on a netdev
qed: Free reserved MR tid
qed: Remove reserveration of dpi for kernel
kcm: Check if sk_user_data already set in kcm_attach
kcm: Only allow TCP sockets to be attached to a KCM mux
net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr
net: sched: em_nbyte: don't add the data offset twice
mlxsw: spectrum_router: Don't log an error on missing neighbor
vmxnet3: repair memory leak
ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
pppoe: take ->needed_headroom of lower device into account on xmit
xfrm: fix boolean assignment in xfrm_get_type_offload
xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version
xfrm: fix error flow in case of add state fails
xfrm: Add SA to hardware at the end of xfrm_state_construct()
Al Viro [Sat, 1 Jul 2017 22:46:30 +0000 (18:46 -0400)]
kill kernel_sock_ioctl()
no users since 2014
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 5 Oct 2017 16:59:44 +0000 (12:59 -0400)]
dev_ioctl(): move copyin/copyout to callers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 2 Oct 2017 00:27:01 +0000 (20:27 -0400)]
ipconfig: use dev_set_mtu()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 2 Oct 2017 00:13:08 +0000 (20:13 -0400)]
lift handling of SIOCIW... out of dev_ioctl()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 2 Oct 2017 01:12:09 +0000 (21:12 -0400)]
kill dev_ifname32()
same story...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 30 Sep 2017 23:32:17 +0000 (19:32 -0400)]
kill bond_ioctl()
Same story as with dev_ifsioc(), except that the last cases with non-trivial
conversions had been taken out in 2013...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 30 Sep 2017 23:31:15 +0000 (19:31 -0400)]
kill dev_ifsioc()
Once upon a time net/socket.c:dev_ifsioc() used to handle SIOCSHWTSTAMP and
SIOCSIFMAP. These have different native and compat layouts, so format
conversion was needed. In 2009 these two cases were taken out,
turning the rest into a convoluted way of calling sock_do_ioctl(). We copy
the compat structure into a native one, call sock_do_ioctl() on that and copy
the result back for the in/out ioctls. No layout transformation anywhere,
so we might as well just call sock_do_ioctl() and skip all the headache with
copying.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 1 Jul 2017 12:03:10 +0000 (08:03 -0400)]
ip_rt_ioctl(): take copyin to caller
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 1 Jul 2017 11:53:12 +0000 (07:53 -0400)]
devinet_ioctl(): take copyin/copyout to caller
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 26 Jun 2017 17:19:16 +0000 (13:19 -0400)]
net: separate SIOCGIFCONF handling from dev_ioctl()
Only two of dev_ioctl() callers may pass SIOCGIFCONF to it.
Separating that codepath from the rest of dev_ioctl() allows both
to simplify dev_ioctl() itself (all other cases work with struct ifreq *)
*and* seriously simplify the compat side of that beast: all it takes
is passing to inet_gifconf() an extra argument - the size of individual
records (sizeof(struct ifreq) or sizeof(struct compat_ifreq)). With
dev_ifconf() called directly from sock_do_ioctl()/compat_dev_ifconf()
that's easy to arrange.
As a result, the compat side of SIOCGIFCONF doesn't need any
allocations, copy_in_user() back and forth, etc.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Wed, 24 Jan 2018 23:49:02 +0000 (15:49 -0800)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc bugfix from David Miller:
"Sparc Makefile typo fix"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64
Ivan Mikhaylov [Wed, 24 Jan 2018 12:53:25 +0000 (15:53 +0300)]
net/ibm/emac: wrong bit is used for STA control register write
The STA control register has fields for the mode and for operation opcodes.
Bit 18 is used for mode selection, where 0 is the old MIO/MDIO access method
and 1 is indirect access mode. Bits 19-20 are used for setting up the
read/write operation (STA opcodes). Currently 'read' is set in the old
MIO/MDIO mode with bit 19, while 'write' is set with bit 18, which is mode
selection, not a write operation. To make write consistent with read, we set
it with bit 20. All of these bit numbers are MSB 0 based.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
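The MSB 0 numbering used in the STACR commit message above can be easy to misread, so here is a minimal sketch of how such bit numbers map to masks in a 32-bit register. The helper and macro names are hypothetical and are not the driver's own definitions; only the bit positions (18 for mode, 19 for read, 20 for write) come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: mask for bit `n` of a 32-bit register when bits
 * are numbered MSB 0, i.e. bit 0 is the most significant bit.
 */
static inline uint32_t msb0_bit32(unsigned int n)
{
	return UINT32_C(1) << (31u - n);
}

/* Illustrative names only; the driver uses its own macros. */
#define SKETCH_STACR_MODE_INDIRECT msb0_bit32(18) /* mode selection */
#define SKETCH_STACR_OP_READ       msb0_bit32(19) /* read opcode */
#define SKETCH_STACR_OP_WRITE      msb0_bit32(20) /* write opcode after the fix */
```

Under this numbering, bit 18 corresponds to mask 0x00002000, bit 19 to 0x00001000 and bit 20 to 0x00000800, so using the bit 18 mask for "write" would toggle mode selection instead.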
Ivan Mikhaylov [Wed, 24 Jan 2018 12:53:24 +0000 (15:53 +0300)]
net/ibm/emac: add 8192 rx/tx fifo size
emac4syn chips can use 8192 rx/tx fifo buffer sizes.
Currently, if we set 8192 in the dts, for example, we get
only 2048, which may reduce network speed.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Jan 2018 23:02:17 +0000 (18:02 -0500)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-01-24
This series contains updates to fm10k only.
Alex fixes MACVLAN offload for fm10k, where we were not seeing unicast
packets being received because we did not correctly configure the
default VLAN ID for the port and defaulting to 0.
Jake cleans up unnecessary parentheses in a couple of "if" statements.
Fixed the driver to stop adding VLAN 0 into the VLAN table, since it
would cause the VLAN table to be inconsistent between the PF and VF.
Also fixed an issue where we were assuming that VLAN 1 is enabled when
the default VLAN ID is not set, so resolve by not requesting any filters
for the default_vid if it has not yet been assigned.
Ngai fixes an issue which was generating a dmesg message about being unable
to kill a particular VLAN ID for the device. This happens because
ndo_vlan_rx_kill_vid() exits with an error: the handler for this ndo
is fm10k_update_vid(), which exits prematurely under PF VLAN management.
To resolve this, we must check the VLAN update action type before exiting
fm10k_update_vid(), and act appropriately based on the action type.
Also corrected code comment typos.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ngai-Mint Kwan [Wed, 24 Jan 2018 22:23:27 +0000 (14:23 -0800)]
fm10k: clarify action when updating the VLAN table
Clarify the comment for when entering promiscuous mode that we update
the VLAN table. Add a comment distinguishing the case where we're
exiting promiscuous mode and need to clear the entire VLAN table.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@gmail.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ngai-Mint Kwan [Wed, 24 Jan 2018 22:22:18 +0000 (14:22 -0800)]
fm10k: correct typo in fm10k_pf.c
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 24 Jan 2018 22:20:29 +0000 (14:20 -0800)]
fm10k: don't assume VLAN 1 is enabled
Since commit 856dfd69e84f ("fm10k: Fix multicast mode synch issues",
2016-03-03) we've incorrectly assumed that VLAN 1 is enabled when the
default VID is not set.
This occurs because we check the default_vid and if it's zero, start
several loops over the active_vlans bitmask at 1, instead of checking to
ensure that that bit is active.
This happened because of commit d9ff3ee8efe9 ("fm10k: Add support for
VLAN 0 w/o default VLAN", 2014-08-07) which mistakenly assumed that we
should send requests for MAC and VLAN filters with VLAN 0 when the
default_vid isn't set.
However, the switch generally considers this an invalid configuration,
so the only time we'd have a default_vid of 0 is when the switch is
down.
Instead, let's just not request any filters for the default_vid if it's
not yet been assigned.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 24 Jan 2018 22:19:30 +0000 (14:19 -0800)]
fm10k: stop adding VLAN 0 to the VLAN table
Currently, when the driver loads, it sends a request to add VLAN 0 to the
VLAN table. For the PF, this is honored, and VLAN 0 is indeed set. For
the VF, this request is silently converted into a request for the
default VLAN as defined by either the switch vid or the PF vid.
This results in the odd behavior that the VLAN table doesn't appear
consistent between the PF and the VF.
Furthermore, setting a MAC filter with VLAN 0 is generally considered an
invalid configuration by the switch, and since commit 856dfd69e84f
("fm10k: Fix multicast mode synch issues", 2016-03-03) we've had code
which prevents us from ever sending such a request.
Since there's not really a good reason to keep VLAN 0 in the VLAN table,
stop requesting it in fm10k_restore_rx_state().
This might seem to indicate that we would no longer properly configure
the MAC and VLAN tables for the default vid. However, due to the way
that fm10k_find_next_vlan() behaves, it will always return the
default_vid as enabled.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ngai-Mint Kwan [Wed, 24 Jan 2018 22:18:22 +0000 (14:18 -0800)]
fm10k: fix "failed to kill vid" message for VF
When a VF is under PF VLAN assignment:
ip link set <pf> vf <#> vlan <vid>
This will remove all previous entries in the VLAN table including those
generated by VLAN interfaces created on the VF. The issue arises when
the VF is under PF VLAN assignment and one or more of these VLAN
interfaces of the VF are deleted. When deleting these VLAN interfaces,
the following message will be generated in "dmesg":
failed to kill vid 0081/<vid> for device <vf>
This is due to the fact that "ndo_vlan_rx_kill_vid" exits with an error.
The handler for this ndo is "fm10k_update_vid". Any calls to this
function while under PF VLAN management will exit prematurely and, thus,
it will generate the failure message.
Additionally, since "fm10k_update_vid" exits prematurely, no VLAN
update is performed. So, even though the actual VLAN interfaces of
the VF will be deleted, the active_vlans bitmask is not cleared. When
the VF is no longer under PF VLAN assignment, the driver mistakenly
restores the previous entries of the VLAN table based on an
unsynchronized list of active VLANs.
The solution to this issue involves checking the VLAN update action type
before exiting "fm10k_update_vid". If the VLAN update action type is to
"add", this action will not be permitted while the VF is under PF VLAN
assignment and the VLAN update is abandoned like before.
However, if the VLAN update action type is to "kill", then we need to
also clear the active_vlans bitmask. We don't need to actually
queue any messages to the PF, though, because the MAC and VLAN tables have
already been cleared, and the PF would silently ignore these requests
anyway.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
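The add/kill distinction described in the "failed to kill vid" commit above can be sketched roughly as follows. This is a toy model under stated assumptions, not the driver's code: the structure, its fields and the function name are hypothetical, and the bitmask only covers VIDs 0..63.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified VF state; not the driver's structures. */
struct vf_state {
	bool pf_vlan_assigned;  /* VF is under PF VLAN assignment */
	uint64_t active_vlans;  /* toy bitmask covering VIDs 0..63 only */
	unsigned int queued;    /* number of messages queued to the PF */
};

/* Returns 0 on success, -1 if the update is refused. */
static int sketch_update_vid(struct vf_state *vf, unsigned int vid, bool set)
{
	if (vid >= 64)
		return -1;

	if (vf->pf_vlan_assigned) {
		if (set)
			return -1; /* "add" is not permitted in this state */
		/* "kill": keep the bitmask in sync, queue nothing to the PF */
		vf->active_vlans &= ~(UINT64_C(1) << vid);
		return 0;
	}

	if (set)
		vf->active_vlans |= UINT64_C(1) << vid;
	else
		vf->active_vlans &= ~(UINT64_C(1) << vid);
	vf->queued++; /* normal path: a request goes to the PF */
	return 0;
}
```

The point of the sketch is that the "kill" path returns success and clears the bit even while the VF is under PF VLAN assignment, so the list of active VLANs does not go stale.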
Jacob Keller [Wed, 24 Jan 2018 22:17:15 +0000 (14:17 -0800)]
fm10k: cleanup unnecessary parenthesis in fm10k_iov.c
This fixes a few warnings found by checkpatch.pl --strict
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Wei Yongjun [Wed, 24 Jan 2018 02:14:33 +0000 (02:14 +0000)]
cxgb4: make symbol pedits static
Fixes the following sparse warning:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c:46:27: warning:
symbol 'pedits' was not declared. Should it be static?
Fixes: 27ece1f357b7 ("cxgb4: add tc flower support for ETH-DMAC rewrite")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 23 Jan 2018 09:27:26 +0000 (17:27 +0800)]
vhost: do not try to access device IOTLB when not initialized
The code will try to access dev->iotlb when processing
VHOST_IOTLB_INVALIDATE even if it was not initialized, which may lead
to a NULL pointer dereference. Fix this by checking dev->iotlb before
accessing it.
Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 23 Jan 2018 09:27:25 +0000 (17:27 +0800)]
vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()
We used to call mutex_lock() in vhost_dev_lock_vqs(), which tries to
hold the mutexes of all virtqueues. This may confuse lockdep into
reporting a possible deadlock, because we are trying to hold locks that
belong to the same class. Switch to mutex_lock_nested() to avoid the
false positive.
Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
William Tu [Wed, 24 Jan 2018 01:01:29 +0000 (17:01 -0800)]
net: erspan: fix use-after-free
When building the erspan header for either v1 or v2, eth_hdr()
does not point to the inner packet's Ethernet header,
causing KASAN to report use-after-free and slab-out-of-bounds reads.
The patch fixes the following syzkaller issues:
[1] BUG: KASAN: slab-out-of-bounds in erspan_xmit+0x22d4/0x2430 net/ipv4/ip_gre.c:735
[2] BUG: KASAN: slab-out-of-bounds in erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698
[3] BUG: KASAN: use-after-free in erspan_xmit+0x22d4/0x2430 net/ipv4/ip_gre.c:735
[4] BUG: KASAN: use-after-free in erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698
[2] CPU: 0 PID: 3654 Comm: syzkaller377964 Not tainted 4.15.0-rc9+ #185
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x25b/0x340 mm/kasan/report.c:409
__asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:440
erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698
erspan_xmit+0x3b8/0x13b0 net/ipv4/ip_gre.c:740
__netdev_start_xmit include/linux/netdevice.h:4042 [inline]
netdev_start_xmit include/linux/netdevice.h:4051 [inline]
packet_direct_xmit+0x315/0x6b0 net/packet/af_packet.c:266
packet_snd net/packet/af_packet.c:2943 [inline]
packet_sendmsg+0x3aed/0x60b0 net/packet/af_packet.c:2968
sock_sendmsg_nosec net/socket.c:638 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:648
SYSC_sendto+0x361/0x5c0 net/socket.c:1729
SyS_sendto+0x40/0x50 net/socket.c:1697
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129
RIP: 0023:0xf7fcfc79
RSP: 002b:00000000ffc6976c EFLAGS: 00000286 ORIG_RAX: 0000000000000171
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000020011000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020008000
RBP: 000000000000001c R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Fixes: f551c91de262 ("net: erspan: introduce erspan v2 for ip_gre")
Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Reported-by: syzbot+9723f2d288e49b492cf0@syzkaller.appspotmail.com
Reported-by: syzbot+f0ddeb2b032a8e1d9098@syzkaller.appspotmail.com
Reported-by: syzbot+f14b3703cd8d7670203f@syzkaller.appspotmail.com
Reported-by: syzbot+eefa384efad8d7997f20@syzkaller.appspotmail.com
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 23 Jan 2018 08:08:40 +0000 (00:08 -0800)]
i40e: flower: check if TC offload is enabled on a netdev
Since the TC block changes, drivers are required to check for themselves
whether the TC hw offload flag is set on the interface.
Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower")
Fixes: 44ae12a768b7 ("net: sched: move the can_offload check from binding phase to rule insertion phase")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Corentin Labbe [Tue, 23 Jan 2018 14:33:14 +0000 (14:33 +0000)]
sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64
This patch fixes the typo CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64
Fixes: 81658ad0d923 ("sparc64: Add CAMELLIA driver making use of the new camellia opcodes.")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Jan 2018 21:44:21 +0000 (16:44 -0500)]
Merge branch 'qed-rdma-bug-fixes'
Michal Kalderon says:
====================
qed: rdma bug fixes
This series contains two small bug fixes related to RDMA.
Both are related to resource reservations.
====================
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon [Tue, 23 Jan 2018 09:33:47 +0000 (11:33 +0200)]
qed: Free reserved MR tid
A tid was allocated for reserved MR during initialization but
not freed. This led to an annoying output message during
rdma unload flow.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon [Tue, 23 Jan 2018 09:33:46 +0000 (11:33 +0200)]
qed: Remove reserveration of dpi for kernel
Double reservation for kernel dedicated dpi was performed.
Once in the core module and once in qedr.
Remove the reservation from core.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 24 Jan 2018 21:39:44 +0000 (13:39 -0800)]
fm10k: Fix configuration for macvlan offload
The fm10k driver didn't work correctly when macvlan offload was enabled.
Specifically what would occur is that we would see no unicast packets being
received. This was traced down to us not correctly configuring the default
VLAN ID for the port and defaulting to 0.
To correct this we either use the default ID provided by the switch or
simply use 1. With that we are able to pass and receive traffic without any
issues.
In addition we were not repopulating the filter table following a reset. To
correct that I have added a bit of code to fm10k_restore_rx_state that will
repopulate the Rx filter configuration for the macvlan interfaces.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Wang Dongsheng [Tue, 23 Jan 2018 04:25:06 +0000 (20:25 -0800)]
net: qcom/emac: extend DMA mask to 46bits
Bit TPD3[31] is used as a timestamp bit if PTP is enabled, but
it's used as an address bit if PTP is disabled. Since PTP isn't
supported by the driver, we can extend the DMA address to 46 bits.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
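To make the 46-bit figure in the qcom/emac commit above concrete, the following userspace sketch computes an n-bit address mask in the style of the kernel's DMA_BIT_MASK() macro and checks whether an address fits. The macro is re-declared here under a hypothetical name so the example builds outside the kernel, and sketch_addr_fits() is an illustrative helper, not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in modelled on the kernel's DMA_BIT_MASK() macro. */
#define SKETCH_DMA_BIT_MASK(n) \
	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: does `addr` fit in a `bits`-wide DMA address? */
static int sketch_addr_fits(uint64_t addr, unsigned int bits)
{
	return (addr & ~SKETCH_DMA_BIT_MASK(bits)) == 0;
}
```

With a 46-bit mask the highest addressable byte is 0x3FFFFFFFFFFF; an address with bit 46 set would not fit.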
Thomas Winter [Tue, 23 Jan 2018 03:46:24 +0000 (16:46 +1300)]
ip_tunnel: Use mark in skb by default
This allows marks set by connmark in iptables
to be used for route lookups.
Signed-off-by: Thomas Winter <thomas.winter@alliedtelesis.co.nz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Niklas Cassel [Mon, 22 Jan 2018 15:59:50 +0000 (16:59 +0100)]
net: stmmac: do not use a bitwise AND operator with a bool operand
Doing a bitwise AND between a bool and an int is generally not a good idea.
The bool will be promoted to an int with value 0 or 1,
the int is generally regarded as true with a non-zero value,
thus ANDing them has the potential to yield an undesired result.
This commit fixes the following smatch warnings:
drivers/net/ethernet/stmicro/stmmac/enh_desc.c:344 enh_desc_prepare_tx_desc() warn: maybe use && instead of &
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c:337 dwmac4_rd_prepare_tx_desc() warn: maybe use && instead of &
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c:380 dwmac4_rd_prepare_tso_tx_desc() warn: maybe use && instead of &
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
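The hazard described in the stmmac commit above can be reproduced with a small standalone sketch. The function and parameter names below are hypothetical stand-ins and are not taken from the driver; the example only shows how `&` and `&&` diverge once the bool operand is promoted to 0 or 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Bitwise AND: `flag` is promoted to int 0 or 1, then ANDed bit by bit. */
static int sketch_bitwise_and(bool flag, int value)
{
	return flag & value;
}

/* Logical AND: yields 1 when both operands are non-zero, otherwise 0. */
static int sketch_logical_and(bool flag, int value)
{
	return flag && value;
}
```

For a true flag and a value of 2, the bitwise form gives 0 because bit 0 of 2 is clear, while the logical form gives 1. That mismatch is the undesired result that smatch warns about.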
David S. Miller [Wed, 24 Jan 2018 21:09:09 +0000 (16:09 -0500)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2018-01-24
This series contains updates to igb and e1000e only.
Corinna Vinschen implements the ability to set the VF MAC to
00:00:00:00:00:00 via RTM_SETLINK on the PF, to prevent receiving
"invalid argument" when libvirt attempts to restore the MAC address back
to its original state of 00:00:00:00:00:00.
Zhang Shengju adds a new function igb_get_max_rss_queues() to get the
maximum number of RSS queues and to reduce code duplication.
Matt fixes an issue with e1000e where, when setting HTHRESH, we should
only be setting bit 8. Also added a dev_info() message to alert when
C-states have been disabled, to help in debugging.
Jesus adds code comments to clarify the idleslope configuration
constraints.
Lyude Paul fixes a kernel crash on hotplug of igb, by checking to ensure
that we are in the process of dismantling the netdev on hotplug events.
Markus Elfring removes a duplicate message for a memory allocation
failure.
Daniel Hua fixes an issue where transmit timestamps will stop being put
into the socket when link is repeatedly up/down due to TSYNCTXCTL's TXTT
bit (Transmit timestamp valid bit) not clearing in the timeout logic of
ptp_tx_work(), which in turn causes no new timestamp to be loaded to the
TXSTMP register.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Jan 2018 21:01:11 +0000 (16:01 -0500)]
Merge branch 'net-sched-propagate-extack-to-cls-offloads-on-destroy-and-only-with-skip_sw'
Jakub Kicinski says:
====================
net: sched: propagate extack to cls offloads on destroy and only with skip_sw
This series addresses some of Jiri's comments and the fact that today
drivers may produce extack even if there is no skip_sw flag (meaning the
driver failure is not really a problem), in which case warning messages
will only confuse the users.
First patch propagates extack to destroy as requested by Jiri, extack
is then propagated to the driver callback for each classifier. I chose
not to provide the extack on error paths. As a rule of thumb it seems
best to keep the extack of the condition which caused the error. E.g.
	err = this_will_fail(arg, extack);
	if (err) {
		undo_things(arg, NULL /* don't pass extack */);
		return err;
	}
Note that NL_SET_ERR_MSG() will ignore the message if extack is NULL.
I was pondering whether we should make NL_SET_ERR_MSG() refuse to
overwrite the msg, but there seem to be cases in the tree where extack
is set like this:
	err = this_will_fail(arg, extack);
	if (err) {
		undo_things(arg, NULL /* don't pass extack */);
		NL_SET_ERR_MSG(extack, "extack is set after undo call :/");
		return err;
	}
I think not passing extack to undo calls is reasonable.
v2:
- rename the temporary tc_cls_common_offload_init().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
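The rule of thumb from the cover letter above (keep the extack of the condition that caused the error, and pass NULL to undo calls) can be modelled in userspace as below. Everything here is a mock under stated assumptions: struct mock_extack and mock_set_err_msg() are hypothetical stand-ins for the kernel's extack structure and NL_SET_ERR_MSG(), and only mirror the "NULL extack is ignored" behaviour noted in the cover letter.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock; the kernel's extack structure carries more than this. */
struct mock_extack {
	const char *msg;
};

/* Like the behaviour noted above: a NULL extack is silently ignored. */
static void mock_set_err_msg(struct mock_extack *extack, const char *msg)
{
	if (extack)
		extack->msg = msg;
}

static int this_will_fail(struct mock_extack *extack)
{
	mock_set_err_msg(extack, "original failure");
	return -1;
}

static void undo_things(struct mock_extack *extack)
{
	/* The undo path may report too; with NULL its message is dropped. */
	mock_set_err_msg(extack, "undo failure");
}

static int do_thing(struct mock_extack *extack)
{
	int err = this_will_fail(extack);

	if (err) {
		undo_things(NULL); /* don't pass extack */
		return err;
	}
	return 0;
}
```

Because undo_things() receives NULL, the message recorded by the failing call survives, which is the outcome the cover letter argues for.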
Jakub Kicinski [Wed, 24 Jan 2018 20:54:24 +0000 (12:54 -0800)]
net: sched: remove tc_cls_common_offload_init_deprecated()
All users are now converted to tc_cls_common_offload_init().
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Jan 2018 20:54:23 +0000 (12:54 -0800)]
cls_u32: propagate extack to delete callback
Propagate extack on removal of offloaded filter. Don't pass
extack from error paths.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Jan 2018 20:54:22 +0000 (12:54 -0800)]
cls_u32: pass offload flags to tc_cls_common_offload_init()
Pass offload flags to the new implementation of
tc_cls_common_offload_init(). Extack will now only
be set if user requested skip_sw. hnodes need to
hold onto the flags now to be able to reuse them
on filter removal.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Jan 2018 20:54:21 +0000 (12:54 -0800)]
cls_flower: propagate extack to delete callback
Propagate extack on removal of offloaded filter. Don't pass
extack from error paths.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Jan 2018 20:54:20 +0000 (12:54 -0800)]
cls_flower: pass offload flags to tc_cls_common_offload_init()
Pass offload flags to the new implementation of
tc_cls_common_offload_init(). Extack will now only
be set if user requested skip_sw.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Jan 2018 20:54:19 +0000 (12:54 -0800)]
cls_matchall: propagate extack to delete callback
Propagate extack on removal of offloaded filter. Don't pass
extack from error paths.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Jan 2018 20:54:18 +0000 (12:54 -0800)]
cls_matchall: pass offload flags to tc_cls_common_offload_init()
Pass offload flags to the new implementation of
tc_cls_common_offload_init(). Extack will now only
be set if user requested skip_sw.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 24 Jan 2018 20:54:17 +0000 (12:54 -0800)]
cls_bpf: propagate extack to offload delete callback
Propagate extack on removal of offloaded filter.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>