platform/kernel/linux-starfive.git
4 years agoethtool: provide private flags with PRIVFLAGS_GET request
Michal Kubecek [Thu, 12 Mar 2020 20:08:08 +0000 (21:08 +0100)]
ethtool: provide private flags with PRIVFLAGS_GET request

Implement PRIVFLAGS_GET request to get private flags for a network device.
These are traditionally available via ETHTOOL_GPFLAGS ioctl request.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoethtool: add FEATURES_NTF notification
Michal Kubecek [Thu, 12 Mar 2020 20:08:03 +0000 (21:08 +0100)]
ethtool: add FEATURES_NTF notification

Send an ETHTOOL_MSG_FEATURES_NTF notification whenever network device features
are modified using an ETHTOOL_MSG_FEATURES_SET netlink message, an ethtool ioctl
request, or any other way resulting in a call to netdev_update_features() or
netdev_change_features().

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoethtool: set netdev features with FEATURES_SET request
Michal Kubecek [Thu, 12 Mar 2020 20:07:58 +0000 (21:07 +0100)]
ethtool: set netdev features with FEATURES_SET request

Implement FEATURES_SET netlink request to set network device features.
These are traditionally set using ETHTOOL_SFEATURES ioctl request.

Actual change is subject to netdev_change_features() sanity checks so that
it can differ from what was requested. Unlike with most other SET requests,
in addition to error code and optional extack, kernel provides an optional
reply message (ETHTOOL_MSG_FEATURES_SET_REPLY) in the same format but with
different semantics: information about difference between user request and
actual result and difference between old and new state of dev->features.
This reply message can be suppressed by setting ETHTOOL_FLAG_OMIT_REPLY
flag in request header.
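
The two differences carried by the reply can be sketched with plain 64-bit
masks. This is a simplified model, not the kernel code: the struct, the
function name and the use of a single uint64_t instead of a feature bitmap
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two differences the reply describes. */
struct features_set_reply {
	uint64_t wanted_mask;   /* requested bits whose result differs from the request */
	uint64_t wanted_value;  /* requested value of those bits */
	uint64_t active_mask;   /* bits of dev->features that changed */
	uint64_t active_value;  /* new value of the changed bits */
};

/*
 * req_value/req_mask: what userspace asked for (only bits in req_mask count).
 * old_active/new_active: dev->features before and after the update.
 */
static struct features_set_reply
features_set_diff(uint64_t req_value, uint64_t req_mask,
		  uint64_t old_active, uint64_t new_active)
{
	struct features_set_reply r;

	/* Value bits outside the mask would be meaningless. */
	assert((req_value & ~req_mask) == 0);

	r.wanted_mask = (req_value ^ new_active) & req_mask;
	r.wanted_value = req_value & r.wanted_mask;
	r.active_mask = old_active ^ new_active;
	r.active_value = new_active & r.active_mask;
	return r;
}
```

With a request to turn bits 0 and 1 on and bit 2 off, where the device
refuses bit 1, wanted_mask holds only bit 1 while active_mask holds bits 0
and 2.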

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoethtool: add ethnl_parse_bitset() helper
Michal Kubecek [Thu, 12 Mar 2020 20:07:53 +0000 (21:07 +0100)]
ethtool: add ethnl_parse_bitset() helper

Unlike other SET type commands, modifying netdev features is required to
provide a reply telling userspace what was actually changed, compared to
what was requested. For that purpose, the "modified" flag provided by
ethnl_update_bitset() is not sufficient; we need full information about
which bits were requested to change.

Therefore provide ethnl_parse_bitset() returning effective value and mask
bitmaps equivalent to the contents of a bitset nested attribute.
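
As a rough illustration of "value and mask bitmaps", the sketch below turns
a list of requested bits into the two words. It does not parse netlink
attributes and is not ethnl_parse_bitset(); the demo_* names and the
64-bit limit are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NBITS 64

/* One requested bit: index plus desired value (hypothetical layout). */
struct demo_bit_req {
	unsigned int index;
	bool value;
};

/* Non-atomic bit set on a plain word, in the spirit of __set_bit(). */
static void demo_set_bit(unsigned int nr, uint64_t *word)
{
	*word |= UINT64_C(1) << nr;
}

/*
 * Fill *val and *mask from a list of requested bits.
 * Returns 0 on success, -1 if an index is out of range.
 */
static int demo_parse_bitset(const struct demo_bit_req *reqs, size_t n,
			     uint64_t *val, uint64_t *mask)
{
	assert(reqs != NULL || n == 0);

	*val = 0;
	*mask = 0;
	for (size_t i = 0; i < n; i++) {
		if (reqs[i].index >= DEMO_NBITS)
			return -1;
		/* Every listed bit is part of the mask ... */
		demo_set_bit(reqs[i].index, mask);
		/* ... and part of the value only if it is requested on. */
		if (reqs[i].value)
			demo_set_bit(reqs[i].index, val);
	}
	return 0;
}
```

Because the result is built in local words that no other context can see
yet, a non-atomic set is enough here.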

v2: use non-atomic __set_bit() (suggested by David Miller)

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoethtool: provide netdev features with FEATURES_GET request
Michal Kubecek [Thu, 12 Mar 2020 20:07:48 +0000 (21:07 +0100)]
ethtool: provide netdev features with FEATURES_GET request

Implement FEATURES_GET request to get network device features. These are
traditionally available via ETHTOOL_GFEATURES ioctl request.

v2:
  - style cleanup suggested by Jakub Kicinski

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoethtool: update mapping of features to legacy ioctl requests
Michal Kubecek [Thu, 12 Mar 2020 20:07:43 +0000 (21:07 +0100)]
ethtool: update mapping of features to legacy ioctl requests

Legacy ioctl requests like ETHTOOL_GTXCSUM are still used by the ethtool
utility to get values of legacy flags (which rather work as feature groups).
These are calculated from values of actual features, and a request to set
them is implemented as an attempt to set all features mapping to them, but
there are two inconsistencies:

- tx-checksum-fcoe-crc is shown under tx-checksumming but NETIF_F_FCOE_CRC
  is not included in ETHTOOL_GTXCSUM/ETHTOOL_STXCSUM
- tx-scatter-gather-fraglist is shown under scatter-gather but
  NETIF_F_FRAGLIST is not included in ETHTOOL_GSG/ETHTOOL_SSG

As the mapping in ethtool output is more correct from logical point of
view, fix ethtool_get_feature_mask() to match it.
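
A toy version of such a group-to-features mapping might look like the
following. The DEMO_* bit positions and the exact membership of each group
are invented for illustration; only the two additions (the FCoE CRC bit in
the tx-checksumming group and the fraglist bit in the scatter-gather group)
come from the description above. Reporting a group as on when any member is
active is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; the real NETIF_F_* values differ. */
#define DEMO_F_SG        (UINT32_C(1) << 0)
#define DEMO_F_IP_CSUM   (UINT32_C(1) << 1)
#define DEMO_F_HW_CSUM   (UINT32_C(1) << 2)
#define DEMO_F_IPV6_CSUM (UINT32_C(1) << 3)
#define DEMO_F_FRAGLIST  (UINT32_C(1) << 4)
#define DEMO_F_FCOE_CRC  (UINT32_C(1) << 5)

enum demo_legacy_group { DEMO_GROUP_TXCSUM, DEMO_GROUP_SG };

/* Map a legacy group to the set of feature bits it stands for. */
static uint32_t demo_feature_mask(enum demo_legacy_group group)
{
	switch (group) {
	case DEMO_GROUP_TXCSUM:
		return DEMO_F_IP_CSUM | DEMO_F_HW_CSUM | DEMO_F_IPV6_CSUM |
		       DEMO_F_FCOE_CRC;
	case DEMO_GROUP_SG:
		return DEMO_F_SG | DEMO_F_FRAGLIST;
	}
	return 0;
}

/* Legacy "get": report the group as on if any member feature is active. */
static int demo_legacy_get(enum demo_legacy_group group, uint32_t active)
{
	uint32_t mask = demo_feature_mask(group);

	assert(mask != 0);
	return (active & mask) != 0;
}
```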

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoethtool: rename ethnl_parse_header() to ethnl_parse_header_dev_get()
Michal Kubecek [Thu, 12 Mar 2020 20:07:38 +0000 (21:07 +0100)]
ethtool: rename ethnl_parse_header() to ethnl_parse_header_dev_get()

Andrew Lunn pointed out that even though it's documented that
ethnl_parse_header() takes a reference to the network device if it fills it
into the target structure, its name doesn't make that apparent, so the
corresponding dev_put() looks mismatched.

Rename the function to ethnl_parse_header_dev_get() to indicate that it
takes a reference.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'Introduce-connection-tracking-offload'
David S. Miller [Thu, 12 Mar 2020 22:00:39 +0000 (15:00 -0700)]
Merge branch 'Introduce-connection-tracking-offload'

Paul Blakey says:

====================
Introduce connection tracking offload

Background
----------

The connection tracking action provides the ability to associate connection state to a packet.
The connection state may be used for stateful packet processing such as stateful firewalls
and NAT operations.

Connection tracking in TC SW
----------------------------

The CT state may be matched only after the CT action is performed.
As such, CT use cases are commonly implemented using multiple chains.
Consider the following TC filters, as an example:
1. tc filter add dev ens1f0_0 ingress prio 1 chain 0 proto ip flower \
    src_mac 24:8a:07:a5:28:01 ct_state -trk \
    action ct \
    pipe action goto chain 2

2. tc filter add dev ens1f0_0 ingress prio 1 chain 2 proto ip flower \
    ct_state +trk+new \
    action ct commit \
    pipe action tunnel_key set \
        src_ip 0.0.0.0 \
        dst_ip 7.7.7.8 \
        id 98 \
        dst_port 4789 \
    action mirred egress redirect dev vxlan0

3. tc filter add dev ens1f0_0 ingress prio 1 chain 2 proto ip flower \
    ct_state +trk+est \
    action tunnel_key set \
        src_ip 0.0.0.0 \
        dst_ip 7.7.7.8 \
        id 98 \
        dst_port 4789 \
    action mirred egress redirect dev vxlan0

Filter #1 (chain 0) decides, after initial packet classification, to send the packet to the
connection tracking module (ct action).
Once the ct_state is initialized by the CT action the packet processing continues on chain 2.

Chain 2 classifies the packet based on the ct_state.
Filter #2 matches on the +trk+new CT state while filter #3 matches on the +trk+est ct_state.

MLX5 Connection tracking HW offload - MLX5 driver patches
------------------------------

The MLX5 hardware model aligns with the software model by realizing a multi-table
architecture. In SW the TC CT action sets the CT state on the skb. Similarly,
HW sets the CT state on a HW register. Driver gets this CT state while offloading
a tuple with a new ct_metadata action that provides it.

Matches on ct_state are translated to HW register matches.

A TC filter with a CT action is broken into two rules, a pre_ct rule and a post_ct rule.
pre_ct rule:
   Inserted on the corresponding tc chain table, matches on original tc match, with
   actions: any pre ct actions, set fte_id, set zone, and goto the ct table.
   The fte_id is a register mapping uniquely identifying this filter.
post_ct_rule:
   Inserted in a post_ct table, matches on the fte_id register mapping, with
   actions: counter + any post ct actions (this is usually 'goto chain X')

The post_ct table is the table that all tuples inserted into the ct table go to,
so on a tuple hit the packet continues from the ct table to the post_ct table
after being marked with the CT state (mark/label..)

This design ensures that the rule's actions and counters will be executed only after a CT hit.
HW misses will continue processing in SW from the last chain ID that was processed in hardware.

The following illustrates the HW model:

+-------------------+      +--------------------+    +--------------+
+ pre_ct (tc chain) +----->+ CT (nat or no nat) +--->+ post_ct      +----->
+ original match    +   |  + tuple + zone match + |  + fte_id match +  |
+-------------------+   |  +--------------------+ |  +--------------+  |
                        v                         v                    v
                     set chain miss mapping    set mark             original
                     set fte_id                set label            filter
                     set zone                  set established      actions
                     set tunnel_id             do nat (if needed)
                     do decap
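
The three-stage flow in the diagram can be mimicked in software. The sketch
below is a toy model only: the demo_* types, the use of a single hash as a
stand-in for the tuple, and the linear table scan are assumptions, and no
mlx5 structure or API is represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-packet metadata standing in for the HW registers. */
struct demo_pkt {
	uint32_t tuple_hash;   /* stand-in for the 5-tuple */
	uint32_t fte_id;       /* set by pre_ct, matched by post_ct */
	uint16_t zone;         /* set by pre_ct, matched by the CT table */
	uint32_t mark;         /* set on a CT hit */
	bool established;      /* set on a CT hit */
};

struct demo_ct_entry {
	uint32_t tuple_hash;
	uint16_t zone;
	uint32_t mark;
};

enum demo_verdict { DEMO_MISS_TO_SW, DEMO_RUN_FILTER_ACTIONS };

/* pre_ct: the original tc match already hit; tag the packet and go to CT. */
static void demo_pre_ct(struct demo_pkt *p, uint32_t fte_id, uint16_t zone)
{
	p->fte_id = fte_id;
	p->zone = zone;
}

/* CT table: match on tuple + zone; a hit stamps the CT state. */
static bool demo_ct(struct demo_pkt *p, const struct demo_ct_entry *tbl, int n)
{
	for (int i = 0; i < n; i++) {
		if (tbl[i].tuple_hash == p->tuple_hash && tbl[i].zone == p->zone) {
			p->mark = tbl[i].mark;
			p->established = true;
			return true;
		}
	}
	return false;
}

/* Whole pipeline: post_ct (the filter's actions) runs only after a CT hit. */
static enum demo_verdict demo_pipeline(struct demo_pkt *p, uint32_t fte_id,
				       uint16_t zone,
				       const struct demo_ct_entry *tbl, int n)
{
	demo_pre_ct(p, fte_id, zone);
	if (!demo_ct(p, tbl, n))
		return DEMO_MISS_TO_SW;
	assert(p->fte_id == fte_id); /* post_ct matches the id set by pre_ct */
	return DEMO_RUN_FILTER_ACTIONS;
}
```

A packet whose tuple is absent from the table never reaches the filter's
actions or counters, which is the property the two-rule split is meant to
guarantee.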

To fill CT table, driver registers a CB for flow offload events, for each new
flow table that is passed to it from offloading ct actions. Once a flow offload
event is triggered on this CB, offload this flow to the hardware CT table.

Established events offload
--------------------------

Currently, act_ct maintains an FT instance per ct zone. Flow table entries
are created, per ct connection, when connections enter an established
state and deleted otherwise. Once an entry is created, the FT assumes
ownership of the entries, and manages their aging. FT is used for software
offload of conntrack. FT entries associate 5-tuples with an action list.

The act_ct changes in this patchset:
Populate the action list with a (new) ct_metadata action, providing the
connection's ct state (zone,mark and label), and mangle actions if NAT
is configured.

Pass the action's flow table instance as ct action entry parameter,
so when the action is offloaded, the driver may register a callback on
its block to receive FT flow offload add/del/stats events.

Netfilter changes
-----------------
The netfilter changes export the relevant bits, and add the relevant CBs
to support the above.

Applying this patchset
--------------------------

On top of current net-next ("r8169: simplify getting stats by using netdev_stats_to_stats64"),
pull Saeed's ct-offload branch from git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git
and resolve the non-trivial conflict in fs_core.c as follows:

Then apply this patchset.

Changelog:
  v2->v3:
    Added the first two patches needed after rebasing on net-next:
     "net/mlx5: E-Switch, Enable reg c1 loopback when possible"
     "net/mlx5e: en_rep: Create uplink rep root table after eswitch offloads table"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: CT: Support clear action
Paul Blakey [Thu, 12 Mar 2020 10:23:17 +0000 (12:23 +0200)]
net/mlx5e: CT: Support clear action

Clear action, as with software, removes all ct metadata from
the packet.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: CT: Handle misses after executing CT action
Paul Blakey [Thu, 12 Mar 2020 10:23:16 +0000 (12:23 +0200)]
net/mlx5e: CT: Handle misses after executing CT action

Mark packets with a unique tupleid, and on miss use that id to get
the act ct restore_cookie. Using that restore cookie, we ask CT to
restore the relevant info on the SKB.
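
The id-to-cookie translation described here can be pictured as a small
lookup table. This is a sketch under assumptions: the demo_* names, the
fixed-size array and the convention that id 0 means "no id" are made up for
the example and say nothing about how the driver stores the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_TUPLES 16

/* Hypothetical id -> restore cookie table; slot 0 is reserved for "no id". */
struct demo_tuple_map {
	unsigned long cookie[DEMO_MAX_TUPLES];
	uint32_t next_id;
};

static void demo_map_init(struct demo_tuple_map *m)
{
	for (size_t i = 0; i < DEMO_MAX_TUPLES; i++)
		m->cookie[i] = 0;
	m->next_id = 1;
}

/* Offload time: hand out a unique id for the tuple and remember its cookie. */
static uint32_t demo_map_add(struct demo_tuple_map *m, unsigned long cookie)
{
	assert(cookie != 0);
	if (m->next_id >= DEMO_MAX_TUPLES)
		return 0; /* table full */
	m->cookie[m->next_id] = cookie;
	return m->next_id++;
}

/* Miss time: translate the id carried by the packet back to the cookie. */
static unsigned long demo_map_lookup(const struct demo_tuple_map *m, uint32_t id)
{
	if (id == 0 || id >= DEMO_MAX_TUPLES)
		return 0;
	return m->cookie[id];
}
```

On a miss, a non-zero result from the lookup is what would be handed to CT
to restore the state on the SKB; a zero result means the packet carried no
usable id.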

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: CT: Offload established flows
Paul Blakey [Thu, 12 Mar 2020 10:23:15 +0000 (12:23 +0200)]
net/mlx5e: CT: Offload established flows

Register driver callbacks with the nf flow table platform.
FT add/delete events will create/delete FTE in the CT/CT_NAT tables.

Restoring the CT state on miss will be added in the following patch.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: CT: Introduce connection tracking
Paul Blakey [Thu, 12 Mar 2020 10:23:14 +0000 (12:23 +0200)]
net/mlx5e: CT: Introduce connection tracking

Add support for offloading tc ct action and ct matches.
We translate the tc filter with CT action to the following HW model:

+-------------------+      +--------------------+    +--------------+
+ pre_ct (tc chain) +----->+ CT (nat or no nat) +--->+ post_ct      +----->
+ original match    +  |   + tuple + zone match + |  + fte_id match +  |
+-------------------+  |   +--------------------+ |  +--------------+  |
                       v                          v                    v
                      set chain miss mapping  set mark             original
                      set fte_id              set label            filter
                      set zone                set established      actions
                      set tunnel_id           do nat (if needed)
                      do decap

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoflow_offload: Add flow_match_ct to get rule ct match
Paul Blakey [Thu, 12 Mar 2020 10:23:13 +0000 (12:23 +0200)]
flow_offload: Add flow_match_ct to get rule ct match

Add relevant getter for ct info dissector.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5: E-Switch, Support getting chain mapping
Paul Blakey [Thu, 12 Mar 2020 10:23:12 +0000 (12:23 +0200)]
net/mlx5: E-Switch, Support getting chain mapping

Currently, we write chain register mapping on miss from the last
prio of a chain. It is used to restore the chain in software.

To support re-using the chain register mapping from global tables (such
as CT tuple table) misses, export the chain mapping.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5: E-Switch, Add support for offloading rules with no in_port
Paul Blakey [Thu, 12 Mar 2020 10:23:11 +0000 (12:23 +0200)]
net/mlx5: E-Switch, Add support for offloading rules with no in_port

FTEs in global tables may match on packets from multiple in_ports.
Provide the capability to omit the in_port match condition.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5: E-Switch, Introduce global tables
Paul Blakey [Thu, 12 Mar 2020 10:23:10 +0000 (12:23 +0200)]
net/mlx5: E-Switch, Introduce global tables

Currently, flow tables are automatically connected according to their
<chain,prio,level> tuple.

Introduce global tables which are flow tables that are detached from the
eswitch chains processing, and will be connected by explicitly referencing
them from multiple chains.

Add this new table type, and allow connecting them by reference.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/sched: act_ct: Enable hardware offload of flow table entires
Paul Blakey [Thu, 12 Mar 2020 10:23:09 +0000 (12:23 +0200)]
net/sched: act_ct: Enable hardware offload of flow table entires

Pass the zone's flow table instance on the flow action to the drivers.
Thus, allowing drivers to register FT add/del/stats callbacks.

Finally, enable hardware offload on the flow table instance.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/sched: act_ct: Support refreshing the flow table entries
Paul Blakey [Thu, 12 Mar 2020 10:23:08 +0000 (12:23 +0200)]
net/sched: act_ct: Support refreshing the flow table entries

If the driver deleted an FT entry, an FT entry failed to offload, or the
driver registered to the flow table after flows were already added, we still
get packets in software.

For those packets, while restoring the ct state from the flow table
entry, refresh its hardware offload.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/sched: act_ct: Support restoring conntrack info on skbs
Paul Blakey [Thu, 12 Mar 2020 10:23:07 +0000 (12:23 +0200)]
net/sched: act_ct: Support restoring conntrack info on skbs

Provide an API to restore the ct state pointer.

This may be used by drivers to restore the ct state if they
miss in tc chain after they already did the hardware connection
tracking action (ct_metadata action).

For example, consider the following rule on chain 0 that is in_hw,
however chain 1 is not_in_hw:

$ tc filter add dev ... chain 0 ... \
  flower ... action ct pipe action goto chain 1

Packets of a flow offloaded by the driver (via nf flow table offload) that
hit this rule in hardware will be marked by the ct metadata action
(mark, label, zone), which does the equivalent of the software ct action,
and when the packet jumps to chain 1 in hardware, there would be a miss.

CT was already processed in hardware. Therefore, the driver's miss
handling should restore the ct state on the skb, using the provided API,
and continue the packet processing in chain 1.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/sched: act_ct: Instantiate flow table entry actions
Paul Blakey [Thu, 12 Mar 2020 10:23:06 +0000 (12:23 +0200)]
net/sched: act_ct: Instantiate flow table entry actions

The NF flow table API associates a 5-tuple rule with an action list by calling
the flow table type action() CB to fill the rule's actions.

In action CB of act_ct, populate the ct offload entry actions with a new
ct_metadata action. Initialize the ct_metadata with the ct mark, label and
zone information. If ct nat was performed, then also append the relevant
packet mangle actions (e.g. ipv4/ipv6/tcp/udp header rewrites).

Drivers that offload the ft entries may match on the 5-tuple and perform
the action list.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonetfilter: flowtable: Add API for registering to flow table events
Paul Blakey [Thu, 12 Mar 2020 10:23:05 +0000 (12:23 +0200)]
netfilter: flowtable: Add API for registering to flow table events

Let drivers add their cb, allowing them to receive flow offload events
of type TC_SETUP_CLSFLOWER (REPLACE/DEL/STATS) for flows managed by the
flow table.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: en_rep: Create uplink rep root table after eswitch offloads table
Paul Blakey [Thu, 12 Mar 2020 10:23:04 +0000 (12:23 +0200)]
net/mlx5e: en_rep: Create uplink rep root table after eswitch offloads table

The eswitch offloads table, which has the reps (vport) rx miss rules,
was moved from OFFLOADS namespace [0,0] (prio, level), to [1,0], so
the restore table (the new [0,0]) can come before it. The destination
of these miss rules is the rep root ft (ttc for non uplink reps).

Uplink rep root ft is created as OFFLOADS namespace [0,1], and is used
as a hook to next RX prio (either ethtool or ttc), but this fails to
pass fs_core level's check.

Move uplink rep root ft to OFFLOADS prio 1, level 1 ([1,1]), so it
will keep the same relative position after the restore table
change.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5: E-Switch, Enable reg c1 loopback when possible
Paul Blakey [Thu, 12 Mar 2020 10:23:03 +0000 (12:23 +0200)]
net/mlx5: E-Switch, Enable reg c1 loopback when possible

Enable reg c1 loopback if firmware reports it's supported,
as this is needed for restoring packet metadata (e.g. chain).

Also define helper to query if it is enabled.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'ct-offload' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed...
David S. Miller [Thu, 12 Mar 2020 19:34:23 +0000 (12:34 -0700)]
Merge branch 'ct-offload' of git://git./linux/kernel/git/saeed/linux

4 years agoMerge branch 'bind_addr_zero'
David S. Miller [Thu, 12 Mar 2020 19:08:10 +0000 (12:08 -0700)]
Merge branch 'bind_addr_zero'

Kuniyuki Iwashima says:

====================
Improve bind(addr, 0) behaviour.

Currently we fail to bind sockets to ephemeral ports when all of the ports
are exhausted even if all sockets have SO_REUSEADDR enabled. In this case,
we still have a chance to connect to different remote hosts.

These patches add net.ipv4.ip_autobind_reuse option and fix the behaviour
to fully utilize all space of the local (addr, port) tuples.

Changes in v5:
  - Add more description to documents.
  - Fix sysctl option to use proc_dointvec_minmax.
  - Remove the Fixes: tag and squash two commits.

Changes in v4:
  - Add net.ipv4.ip_autobind_reuse option to not change the current behaviour.
  - Modify .gitignore for test.
  https://lore.kernel.org/netdev/20200308181615.90135-1-kuniyu@amazon.co.jp/

Changes in v3:
  - Change the title and write more specific description of the 3rd patch.
  - Add a test in tools/testing/selftests/net/ as the 4th patch.
  https://lore.kernel.org/netdev/20200229113554.78338-1-kuniyu@amazon.co.jp/

Changes in v2:
  - Change the description of the 2nd patch ('localhost' -> 'address').
  - Correct the description and the if statement of the 3rd patch.
  https://lore.kernel.org/netdev/20200226074631.67688-1-kuniyu@amazon.co.jp/

v1 with tests:
  https://lore.kernel.org/netdev/20200220152020.13056-1-kuniyu@amazon.co.jp/
====================

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoselftests: net: Add SO_REUSEADDR test to check if 4-tuples are fully utilized.
Kuniyuki Iwashima [Tue, 10 Mar 2020 08:05:27 +0000 (17:05 +0900)]
selftests: net: Add SO_REUSEADDR test to check if 4-tuples are fully utilized.

This commit adds a test to check if we can fully utilize 4-tuples for
connect() when all ephemeral ports are exhausted.

The test program changes the local port range to use only one port and binds
two sockets with or without SO_REUSEADDR and SO_REUSEPORT, and with the same
EUID or with different EUIDs, then calls listen().

We should be able to bind only one socket having both SO_REUSEADDR and
SO_REUSEPORT per EUID; this restriction prevents unintentional
listen().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotcp: Forbid to bind more than one sockets haveing SO_REUSEADDR and SO_REUSEPORT per...
Kuniyuki Iwashima [Tue, 10 Mar 2020 08:05:26 +0000 (17:05 +0900)]
tcp: Forbid to bind more than one sockets haveing SO_REUSEADDR and SO_REUSEPORT per EUID.

If there is no TCP_LISTEN socket on an ephemeral port, we can bind multiple
sockets having SO_REUSEADDR to the same port. Then if all sockets bound to
the port also have SO_REUSEPORT enabled and have the same EUID, all of them
can call listen(). This is not safe.

Let's say an application has root privilege and binds sockets to an
ephemeral port with both SO_REUSEADDR and SO_REUSEPORT. While none of
the sockets is listening yet, a malicious user can use sudo, exhaust
ephemeral ports, and bind sockets to the same ephemeral port, so he or she
can call listen() and steal the port.

To prevent this issue, we must not bind more than one socket that has the
same EUID and both SO_REUSEADDR and SO_REUSEPORT.

On the other hand, if the sockets have different EUIDs, the issue above does
not occur. After sockets with different EUIDs are bound to the same port and
one of them starts listening, no other socket can listen. This is because the
condition below evaluates to true and listen() for the second socket fails.

	} else if (!reuseport_ok ||
		   !reuseport || !sk2->sk_reuseport ||
		   rcu_access_pointer(sk->sk_reuseport_cb) ||
		   (sk2->sk_state != TCP_TIME_WAIT &&
		    !uid_eq(uid, sock_i_uid(sk2)))) {
		if (inet_rcv_saddr_equal(sk, sk2, true))
			break;
	}

Therefore, on the same port, we cannot do listen() for multiple sockets with
different EUIDs: any further listen() call fails, so the problem does not
happen. In this case, we can still call connect() on the other sockets that
cannot listen, so bind() has to succeed in order to fully
utilize 4-tuples.

Summarizing the above, we should be able to bind only one socket having
SO_REUSEADDR and SO_REUSEPORT per EUID.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted.
Kuniyuki Iwashima [Tue, 10 Mar 2020 08:05:25 +0000 (17:05 +0900)]
tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted.

Commit aacd9289af8b82f5fb01bcdd53d0e3406d1333c7 ("tcp: bind() use stronger
condition for bind_conflict") introduced a restriction that forbids binding
SO_REUSEADDR enabled sockets to the same (addr, port) tuple in order to
assign ports dispersedly so that we can connect to the same remote host.

The change accelerates port depletion, so we fail to bind
sockets to the same local port even if we want to connect to different
remote hosts.

You can reproduce this issue by following the instructions below.

  1. # sysctl -w net.ipv4.ip_local_port_range="32768 32768"
  2. set SO_REUSEADDR to two sockets.
  3. bind two sockets to (localhost, 0) and the latter fails.
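
Steps 2 and 3 can be written as a small helper that is called twice. This
is a minimal sketch: demo_bind_reuseaddr() is a name invented here, and
step 1 (narrowing the port range with sysctl) still has to be done
separately as root.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP socket with SO_REUSEADDR, bind it to (127.0.0.1, 0) and
 * store the fd in *fd_out. Returns the port the kernel picked (host byte
 * order), or -1 on failure.
 */
static int demo_bind_reuseaddr(int *fd_out)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	assert(fd_out != NULL);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(0); /* let the kernel choose an ephemeral port */

	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}
	*fd_out = fd;
	return ntohs(addr.sin_port);
}
```

With the port range narrowed to a single port as in step 1, the second call
is the one expected to return -1; with the default range both calls
normally succeed.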

Therefore, when ephemeral ports are exhausted, bind(0) should fall back to
the legacy behaviour to enable the SO_REUSEADDR option and make it possible
to connect to different remote (addr, port) tuples.

This patch allows us to bind SO_REUSEADDR enabled sockets to the same
(addr, port) only when net.ipv4.ip_autobind_reuse is set to 1 and all
ephemeral ports are exhausted. This also allows connect() and listen() to
share ports in the following way and may break some applications. So
ip_autobind_reuse is 0 by default, which disables the feature.

  1. setsockopt(sk1, SO_REUSEADDR)
  2. setsockopt(sk2, SO_REUSEADDR)
  3. bind(sk1, saddr, 0)
  4. bind(sk2, saddr, 0)
  5. connect(sk1, daddr)
  6. listen(sk2)

If it is set to 1, we can fully utilize the 4-tuples, but we should use
IP_BIND_ADDRESS_NO_PORT for bind()+connect() where possible.

The notable thing is that if all sockets bound to the same port have
both SO_REUSEADDR and SO_REUSEPORT enabled, we can bind sockets to an
ephemeral port and also do listen().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotcp: Remove unnecessary conditions in inet_csk_bind_conflict().
Kuniyuki Iwashima [Tue, 10 Mar 2020 08:05:24 +0000 (17:05 +0900)]
tcp: Remove unnecessary conditions in inet_csk_bind_conflict().

When we get an ephemeral port, relax is false, so the SO_REUSEADDR
conditions may be evaluated twice. We do not need to check the conditions
again.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'ethtool-consolidate-irq-coalescing-part-4'
David S. Miller [Thu, 12 Mar 2020 18:32:36 +0000 (11:32 -0700)]
Merge branch 'ethtool-consolidate-irq-coalescing-part-4'

Jakub Kicinski says:

====================
ethtool: consolidate irq coalescing - part 4

Convert more drivers following the groundwork laid in a recent
patch set [1] and continued in [2], [3]. The aim of the effort
is to consolidate irq coalescing parameter validation in the core.

This set converts 15 drivers in drivers/net/ethernet - remaining
Intel drivers, Freescale/NXP, and others.
2 more conversion sets to come.

[1] https://lore.kernel.org/netdev/20200305051542.991898-1-kuba@kernel.org/
[2] https://lore.kernel.org/netdev/20200306010602.1620354-1-kuba@kernel.org/
[3] https://lore.kernel.org/netdev/20200310021512.1861626-1-kuba@kernel.org/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ixgbevf: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:33:02 +0000 (15:33 -0700)]
net: ixgbevf: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.
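
The idea behind a supported-parameters mask can be sketched as follows.
Everything here is an assumption for illustration: the DEMO_* flags, the
reduced four-field struct, and the rule that a parameter counts as
requested when it is non-zero. It is not the core's validation code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, one per coalescing parameter. */
#define DEMO_COALESCE_RX_USECS      (UINT32_C(1) << 0)
#define DEMO_COALESCE_RX_MAX_FRAMES (UINT32_C(1) << 1)
#define DEMO_COALESCE_TX_USECS      (UINT32_C(1) << 2)
#define DEMO_COALESCE_TX_MAX_FRAMES (UINT32_C(1) << 3)

struct demo_coalesce {
	uint32_t rx_usecs;
	uint32_t rx_max_frames;
	uint32_t tx_usecs;
	uint32_t tx_max_frames;
};

/* Build a bitmask of the parameters the caller set to a non-zero value. */
static uint32_t demo_nonzero_params(const struct demo_coalesce *c)
{
	uint32_t m = 0;

	assert(c != NULL);
	if (c->rx_usecs)
		m |= DEMO_COALESCE_RX_USECS;
	if (c->rx_max_frames)
		m |= DEMO_COALESCE_RX_MAX_FRAMES;
	if (c->tx_usecs)
		m |= DEMO_COALESCE_TX_USECS;
	if (c->tx_max_frames)
		m |= DEMO_COALESCE_TX_MAX_FRAMES;
	return m;
}

/* 0 if every requested parameter is supported, -EOPNOTSUPP otherwise. */
static int demo_check_coalesce(uint32_t supported, const struct demo_coalesce *c)
{
	uint32_t requested = demo_nonzero_params(c);

	return (requested & ~supported) ? -EOPNOTSUPP : 0;
}
```

A driver that declares only the two usecs flags would accept a request
setting rx_usecs alone and reject one that also sets tx_max_frames, without
any per-driver checking code.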

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ixgbe: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:33:01 +0000 (15:33 -0700)]
net: ixgbe: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: igc: let core reject the unsupported coalescing parameters
Jakub Kicinski [Wed, 11 Mar 2020 22:33:00 +0000 (15:33 -0700)]
net: igc: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver was rejecting almost all unsupported
parameters already, it was only missing a check
for tx_max_coalesced_frames_irq.

As a side effect of these changes the error code for
unsupported params changes from ENOTSUPP to EOPNOTSUPP.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: igbvf: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:59 +0000 (15:32 -0700)]
net: igbvf: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: igb: let core reject the unsupported coalescing parameters
Jakub Kicinski [Wed, 11 Mar 2020 22:32:58 +0000 (15:32 -0700)]
net: igb: let core reject the unsupported coalescing parameters

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver was rejecting almost all unsupported
parameters already, it was only missing a check
for tx_max_coalesced_frames_irq.

As a side effect of these changes the error code for
unsupported params changes from ENOTSUPP to EOPNOTSUPP.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: iavf: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:57 +0000 (15:32 -0700)]
net: iavf: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: i40e: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:56 +0000 (15:32 -0700)]
net: i40e: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: fm10k: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:55 +0000 (15:32 -0700)]
net: fm10k: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: e1000: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:54 +0000 (15:32 -0700)]
net: e1000: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:53 +0000 (15:32 -0700)]
net: hns3: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:52 +0000 (15:32 -0700)]
net: hns: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: gianfar: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:51 +0000 (15:32 -0700)]
net: gianfar: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: fec: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:50 +0000 (15:32 -0700)]
net: fec: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dpaa: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:49 +0000 (15:32 -0700)]
net: dpaa: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters
(other than adaptive rx, which will now be rejected by core).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: be2net: reject unsupported coalescing params
Jakub Kicinski [Wed, 11 Mar 2020 22:32:48 +0000 (15:32 -0700)]
net: be2net: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosfc: ethtool: Refactor to remove fallthrough comments in case blocks
Joe Perches [Wed, 11 Mar 2020 02:41:41 +0000 (19:41 -0700)]
sfc: ethtool: Refactor to remove fallthrough comments in case blocks

Converting fallthrough comments to fallthrough; creates warnings
in this code when compiled with gcc.

This code is overly complicated and reads rather better with a
little refactoring and no fallthrough uses at all.

Remove the fallthrough comments and simplify the written source
code while reducing the object code size.

Consolidate duplicated switch/case blocks for IPV4 and IPV6.

defconfig x86-64 with sfc:

$ size drivers/net/ethernet/sfc/ethtool.o*
   text    data     bss     dec     hex filename
  10055      12       0   10067    2753 drivers/net/ethernet/sfc/ethtool.o.new
  10135      12       0   10147    27a3 drivers/net/ethernet/sfc/ethtool.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoRevert "net: sched: make newly activated qdiscs visible"
Julian Wiedmann [Thu, 12 Mar 2020 17:57:54 +0000 (18:57 +0100)]
Revert "net: sched: make newly activated qdiscs visible"

This reverts commit 4cda75275f9f89f9485b0ca4d6950c95258a9bce
from net-next.

Brown bag time.

Michal noticed that this change doesn't work at all when
netif_set_real_num_tx_queues() gets called prior to an initial
dev_activate(), as for instance igb does.

Doing so dies with:

[   40.579142] BUG: kernel NULL pointer dereference, address: 0000000000000400
[   40.586922] #PF: supervisor read access in kernel mode
[   40.592668] #PF: error_code(0x0000) - not-present page
[   40.598405] PGD 0 P4D 0
[   40.601234] Oops: 0000 [#1] PREEMPT SMP PTI
[   40.605909] CPU: 18 PID: 1681 Comm: wickedd Tainted: G            E     5.6.0-rc3-ethnl.50-default #1
[   40.616205] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
[   40.627377] RIP: 0010:qdisc_hash_add.part.22+0x2e/0x90
[   40.633115] Code: 00 55 53 89 f5 48 89 fb e8 2f 9b fb ff 85 c0 74 44 48 8b 43 40 48 8b 08 69 43 38 47 86 c8 61 c1 e8 1c 48 83 e8 80 48 8d 14 c1 <48> 8b 04 c1 48 8d 4b 28 48 89 53 30 48 89 43 28 48 85 c0 48 89 0a
[   40.654080] RSP: 0018:ffffb879864934d8 EFLAGS: 00010203
[   40.659914] RAX: 0000000000000080 RBX: ffffffffb8328d80 RCX: 0000000000000000
[   40.667882] RDX: 0000000000000400 RSI: 0000000000000000 RDI: ffffffffb831faa0
[   40.675849] RBP: 0000000000000000 R08: ffffa0752c8b9088 R09: ffffa0752c8b9208
[   40.683816] R10: 0000000000000006 R11: 0000000000000000 R12: ffffa0752d734000
[   40.691783] R13: 0000000000000008 R14: 0000000000000000 R15: ffffa07113c18000
[   40.699750] FS:  00007f94548e5880(0000) GS:ffffa0752e980000(0000) knlGS:0000000000000000
[   40.708782] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   40.715189] CR2: 0000000000000400 CR3: 000000082b6ae006 CR4: 00000000001606e0
[   40.723156] Call Trace:
[   40.725888]  dev_qdisc_set_real_num_tx_queues+0x61/0x90
[   40.731725]  netif_set_real_num_tx_queues+0x94/0x1d0
[   40.737286]  __igb_open+0x19a/0x5d0 [igb]
[   40.741767]  __dev_open+0xbb/0x150
[   40.745567]  __dev_change_flags+0x157/0x1a0
[   40.750240]  dev_change_flags+0x23/0x60

[...]

Fixes: 4cda75275f9f ("net: sched: make newly activated qdiscs visible")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
CC: Michal Kubecek <mkubecek@suse.cz>
CC: Eric Dumazet <edumazet@google.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Cong Wang <xiyou.wangcong@gmail.com>
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodt-bindings: soc: qcom: fix IPA binding
Alex Elder [Wed, 11 Mar 2020 21:47:00 +0000 (16:47 -0500)]
dt-bindings: soc: qcom: fix IPA binding

The definitions for the "qcom,smem-states" and "qcom,smem-state-names"
properties need to list their "$ref" under an "allOf" keyword.

In addition, fix two problems in the example at the end:
  - Use #include for header files that define needed symbolic values
  - Terminate the line that includes the "ipa-shared" register space
    name with a comma rather than a semicolon

Finally, update some white space in the example for better alignment.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mptcp: don't hang before sending 'MP capable with data'
Davide Caratti [Wed, 11 Mar 2020 18:50:53 +0000 (19:50 +0100)]
net: mptcp: don't hang before sending 'MP capable with data'

The following packetdrill script

  socket(..., SOCK_STREAM, IPPROTO_MPTCP) = 3
  fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
  fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
  connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
  > S 0:0(0) <mss 1460,sackOK,TS val 100 ecr 0,nop,wscale 8,mpcapable v1 flags[flag_h] nokey>
  < S. 0:0(0) ack 1 win 65535 <mss 1460,sackOK,TS val 700 ecr 100,nop,wscale 8,mpcapable v1 flags[flag_h] key[skey=2]>
  > . 1:1(0) ack 1 win 256 <nop, nop, TS val 100 ecr 700,mpcapable v1 flags[flag_h] key[ckey,skey]>
  getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
  fcntl(3, F_SETFL, O_RDWR) = 0
  write(3, ..., 1000) = 1000

doesn't transmit the 1KB data packet after a successful three-way handshake,
using mp_capable with data as required by protocol v1, and write() hangs
forever:

 PID: 973    TASK: ffff97dd399cae80  CPU: 1   COMMAND: "packetdrill"
  #0 [ffffa9b94062fb78] __schedule at ffffffff9c90a000
  #1 [ffffa9b94062fc08] schedule at ffffffff9c90a4a0
  #2 [ffffa9b94062fc18] schedule_timeout at ffffffff9c90e00d
  #3 [ffffa9b94062fc90] wait_woken at ffffffff9c120184
  #4 [ffffa9b94062fcb0] sk_stream_wait_connect at ffffffff9c75b064
  #5 [ffffa9b94062fd20] mptcp_sendmsg at ffffffff9c8e801c
  #6 [ffffa9b94062fdc0] sock_sendmsg at ffffffff9c747324
  #7 [ffffa9b94062fdd8] sock_write_iter at ffffffff9c7473c7
  #8 [ffffa9b94062fe48] new_sync_write at ffffffff9c302976
  #9 [ffffa9b94062fed0] vfs_write at ffffffff9c305685
 #10 [ffffa9b94062ff00] ksys_write at ffffffff9c305985
 #11 [ffffa9b94062ff38] do_syscall_64 at ffffffff9c004475
 #12 [ffffa9b94062ff50] entry_SYSCALL_64_after_hwframe at ffffffff9ca0008c
     RIP: 00007f959407eaf7  RSP: 00007ffe9e95a910  RFLAGS: 00000293
     RAX: ffffffffffffffda  RBX: 0000000000000008  RCX: 00007f959407eaf7
     RDX: 00000000000003e8  RSI: 0000000001785fe0  RDI: 0000000000000008
     RBP: 0000000001785fe0   R8: 0000000000000000   R9: 0000000000000003
     R10: 0000000000000007  R11: 0000000000000293  R12: 00000000000003e8
     R13: 00007ffe9e95ae30  R14: 0000000000000000  R15: 0000000000000000
     ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

Fix it by ensuring that the socket state is TCP_ESTABLISHED on reception of
the third ack.

Fixes: 1954b86016cf ("mptcp: Check connection state before attempting send")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosoc: qcom: ipa: fix spelling mistake "cahces" -> "caches"
Colin Ian King [Wed, 11 Mar 2020 09:16:13 +0000 (09:16 +0000)]
soc: qcom: ipa: fix spelling mistake "cahces" -> "caches"

There is a spelling mistake in a dev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ibm: remove set but not used variables 'err'
Chen Zhou [Wed, 11 Mar 2020 06:54:11 +0000 (14:54 +0800)]
net: ibm: remove set but not used variables 'err'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/ibm/emac/core.c: In function __emac_mdio_write:
drivers/net/ethernet/ibm/emac/core.c:875:9: warning:
variable err set but not used [-Wunused-but-set-variable]

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: Add missing annotation for *netlink_seq_start()
Jules Irenge [Wed, 11 Mar 2020 01:09:06 +0000 (01:09 +0000)]
net: Add missing annotation for *netlink_seq_start()

Sparse reports a warning at netlink_seq_start()

warning: context imbalance in netlink_seq_start() - wrong count at exit

The root cause is the missing annotation at netlink_seq_start().
Add the missing __acquires(RCU) annotation.

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotcp: Add missing annotation for tcp_child_process()
Jules Irenge [Wed, 11 Mar 2020 01:09:03 +0000 (01:09 +0000)]
tcp: Add missing annotation for tcp_child_process()

Sparse reports a warning at tcp_child_process():

warning: context imbalance in tcp_child_process() - unexpected unlock

The root cause is the missing annotation at tcp_child_process().

Add the missing __releases(&((child)->sk_lock.slock)) annotation

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoraw: Add missing annotations to raw_seq_start() and raw_seq_stop()
Jules Irenge [Wed, 11 Mar 2020 01:09:02 +0000 (01:09 +0000)]
raw: Add missing annotations to raw_seq_start() and raw_seq_stop()

Sparse reports warnings at raw_seq_start() and raw_seq_stop()

warning: context imbalance in raw_seq_start() - wrong count at exit
warning: context imbalance in raw_seq_stop() - unexpected unlock

The root cause is the missing annotations at raw_seq_start()
and raw_seq_stop()
Add the missing __acquires(&h->lock) annotation
Add the missing __releases(&h->lock) annotation
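A minimal userspace sketch of what these annotations express, assuming hypothetical DEMO_* macros that stand in for the kernel's __acquires()/__releases() (which, as far as I know, expand to a sparse context attribute under __CHECKER__ and to nothing otherwise). The lock type and function names are invented for illustration, and the sketch has not been run through sparse:

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's sparse annotations. */
#ifdef __CHECKER__
# define DEMO_ACQUIRES(x) __attribute__((context(x, 0, 1)))
# define DEMO_RELEASES(x) __attribute__((context(x, 1, 0)))
# define DEMO_ACQUIRE(x)  __context__(x, 1)
# define DEMO_RELEASE(x)  __context__(x, -1)
#else
# define DEMO_ACQUIRES(x)
# define DEMO_RELEASES(x)
# define DEMO_ACQUIRE(x)  (void)0
# define DEMO_RELEASE(x)  (void)0
#endif

/* Toy lock: only tracks whether it is currently held. */
struct demo_lock { int held; };
static struct demo_lock demo_table_lock;

static void demo_lock_take(struct demo_lock *l)
	DEMO_ACQUIRES(l)
{
	assert(l->held == 0);
	DEMO_ACQUIRE(l);
	l->held = 1;
}

static void demo_lock_drop(struct demo_lock *l)
	DEMO_RELEASES(l)
{
	assert(l->held == 1);
	l->held = 0;
	DEMO_RELEASE(l);
}

/* Returns with the lock held, so the checker is told it acquires it. */
static int *demo_seq_start(int *table)
	DEMO_ACQUIRES(&demo_table_lock)
{
	demo_lock_take(&demo_table_lock);
	return table;
}

/* Entered with the lock held and drops it, so it is marked as releasing. */
static void demo_seq_stop(void)
	DEMO_RELEASES(&demo_table_lock)
{
	demo_lock_drop(&demo_table_lock);
}
```

Without the annotations a checker sees a function that exits with a different lock count than it entered with, which is what the "wrong count at exit" and "unexpected unlock" warnings describe.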

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: sched: make newly activated qdiscs visible
Julian Wiedmann [Tue, 10 Mar 2020 16:53:35 +0000 (17:53 +0100)]
net: sched: make newly activated qdiscs visible

In their .attach callback, mq[prio] only add the qdiscs of the currently
active TX queues to the device's qdisc hash list.
If a user later increases the number of active TX queues, their qdiscs
are not visible via e.g. 'tc qdisc show'.

Add a hook to netif_set_real_num_tx_queues() that walks all active
TX queues and adds those which are missing to the hash list.

CC: Eric Dumazet <edumazet@google.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Cong Wang <xiyou.wangcong@gmail.com>
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: slcan, slip -- no need for goto when if () will do
Pavel Machek [Mon, 9 Mar 2020 22:33:23 +0000 (23:33 +0100)]
net: slcan, slip -- no need for goto when if () will do

No need to play with gotos to jump over a single statement.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: selftests: Fix L3/L4 Filtering test
Jose Abreu [Mon, 9 Mar 2020 13:30:22 +0000 (14:30 +0100)]
net: stmmac: selftests: Fix L3/L4 Filtering test

Since commit 319a1d19471e, stmmac only supports the basic HW stats type for
action. Set this field in the L3/L4 Filtering test so that it correctly
sets up the filter instead of returning EOPNOTSUPP.

Fixes: 319a1d19471e ("flow_offload: check for basic action hw stats type")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocdc_ncm: Implement the 32-bit version of NCM Transfer Block
Alexander Bersenev [Thu, 5 Mar 2020 20:33:16 +0000 (01:33 +0500)]
cdc_ncm: Implement the 32-bit version of NCM Transfer Block

The NCM specification defines two formats of transfer blocks: with 16-bit
fields (NTB-16) and with 32-bit fields (NTB-32). Currently only NTB-16 is
implemented.
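To illustrate the difference between the two formats, here is a rough sketch of the two transfer header layouts as I understand them from the NCM specification. All demo_* names are hypothetical, the structs assume a little-endian host with natural alignment, and the helper only classifies a buffer by its signature; it is not how the driver selects the format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header signatures, read as little-endian 32-bit values. */
#define DEMO_NTH16_SIGN 0x484D434Eu /* bytes 'N' 'C' 'M' 'H' */
#define DEMO_NTH32_SIGN 0x686D636Eu /* bytes 'n' 'c' 'm' 'h' */

/* NTB-16 header: block length and NDP index are 16 bits wide. */
struct demo_nth16 {
	uint32_t signature;
	uint16_t header_length;
	uint16_t sequence;
	uint16_t block_length;
	uint16_t ndp_index;
};

/* NTB-32 header: block length and NDP index grow to 32 bits. */
struct demo_nth32 {
	uint32_t signature;
	uint16_t header_length;
	uint16_t sequence;
	uint32_t block_length;
	uint32_t ndp_index;
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t demo_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 16 or 32 for a recognised header signature, 0 otherwise. */
static int demo_ntb_format(const uint8_t *buf, size_t len)
{
	assert(buf != NULL);
	if (len < 4)
		return 0;
	switch (demo_le32(buf)) {
	case DEMO_NTH16_SIGN:
		return 16;
	case DEMO_NTH32_SIGN:
		return 32;
	default:
		return 0;
	}
}
```

With these layouts the NTB-16 header occupies 12 bytes and the NTB-32 header 16 bytes, so code that parses one format cannot simply be reused for the other.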

This patch adds support for NTB-32. The motivation behind this is that
some devices such as E5785 or E5885 from the current generation of Huawei
LTE routers do not support NTB-16. The previous generations of Huawei
devices also use NTB-32 by default.

This patch also enables NTB-32 by default for Huawei devices.

During 2019, ValdikSS made five attempts to contact Huawei to add
NTB-16 support to their router firmware, but they were unsuccessful.

Signed-off-by: Alexander Bersenev <bay@hackerdom.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobareudp: Fixed bareudp receive handling
Martin Varghese [Thu, 12 Mar 2020 03:03:51 +0000 (08:33 +0530)]
bareudp: Fixed bareudp receive handling

Reverted commit "2baecda bareudp: remove unnecessary udp_encap_enable() in
bareudp_socket_create()"

An explicit call to udp_encap_enable() is needed because setup_udp_tunnel_sock()
does not call udp_encap_enable() if the socket is of type v6.

The bareudp device uses a v6 socket to receive v4 and v6 traffic.

CC: Taehee Yoo <ap420073@gmail.com>
Fixes: 2baecda37f4e ("bareudp: remove unnecessary udp_encap_enable() in bareudp_socket_create()")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoftgmac100: Remove redundant judgement
tangbin [Wed, 11 Mar 2020 02:05:37 +0000 (10:05 +0800)]
ftgmac100: Remove redundant judgement

ftgmac100_probe() can be triggered only
if the platform_device and platform_driver match, so the
check at the beginning of this function is redundant.

Signed-off-by: tangbin <tangbin@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'ethtool-consolidate-irq-coalescing-part-3'
David S. Miller [Tue, 10 Mar 2020 23:28:54 +0000 (16:28 -0700)]
Merge branch 'ethtool-consolidate-irq-coalescing-part-3'

Jakub Kicinski says:

====================
ethtool: consolidate irq coalescing - part 3

Convert more drivers following the groundwork laid in a recent
patch set [1] and continued in [2]. The aim of the effort is to
consolidate irq coalescing parameter validation in the core.

This set converts 15 drivers in drivers/net/ethernet.
3 more conversion sets to come.

None of the drivers here checked all unsupported parameters.

[1] https://lore.kernel.org/netdev/20200305051542.991898-1-kuba@kernel.org/
[2] https://lore.kernel.org/netdev/20200306010602.1620354-1-kuba@kernel.org/
====================
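The idea behind the consolidation can be sketched in plain C: the driver declares a bitmask of the parameters it supports, and one shared check rejects any request that sets something outside that mask. This is only an illustration under assumed names; the DEMO_* flags, struct demo_coalesce and demo_check_coalesce() are hypothetical and are not the kernel's definitions:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter bits, one per coalescing field. */
#define DEMO_COALESCE_RX_USECS          (1u << 0)
#define DEMO_COALESCE_RX_MAX_FRAMES     (1u << 1)
#define DEMO_COALESCE_TX_USECS          (1u << 2)
#define DEMO_COALESCE_TX_MAX_FRAMES_IRQ (1u << 3)

/* Hypothetical, heavily trimmed request structure. */
struct demo_coalesce {
	uint32_t rx_usecs;
	uint32_t rx_max_frames;
	uint32_t tx_usecs;
	uint32_t tx_max_frames_irq;
};

/* Build a mask of the parameters the caller set to a non-zero value. */
static uint32_t demo_requested_params(const struct demo_coalesce *c)
{
	uint32_t mask = 0;

	if (c->rx_usecs)
		mask |= DEMO_COALESCE_RX_USECS;
	if (c->rx_max_frames)
		mask |= DEMO_COALESCE_RX_MAX_FRAMES;
	if (c->tx_usecs)
		mask |= DEMO_COALESCE_TX_USECS;
	if (c->tx_max_frames_irq)
		mask |= DEMO_COALESCE_TX_MAX_FRAMES_IRQ;
	return mask;
}

/*
 * Shared check: succeed only if every requested parameter is in the
 * driver's supported mask, otherwise return -EOPNOTSUPP.
 */
static int demo_check_coalesce(uint32_t supported,
			       const struct demo_coalesce *c)
{
	assert(c != NULL);
	return (demo_requested_params(c) & ~supported) ? -EOPNOTSUPP : 0;
}
```

Keeping the check in one place means a driver only has to list what it supports, instead of each driver open-coding (and often forgetting) the rejection of everything else.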

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: gemini: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:12 +0000 (19:15 -0700)]
net: gemini: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cxgb4vf: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:11 +0000 (19:15 -0700)]
net: cxgb4vf: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cxgb4: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:10 +0000 (19:15 -0700)]
net: cxgb4: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cxgb3: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:09 +0000 (19:15 -0700)]
net: cxgb3: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cxgb2: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:08 +0000 (19:15 -0700)]
net: cxgb2: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mlx4: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:07 +0000 (19:15 -0700)]
net: mlx4: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: liquidio: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:06 +0000 (19:15 -0700)]
net: liquidio: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: bna: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:05 +0000 (19:15 -0700)]
net: bna: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: tg3: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:04 +0000 (19:15 -0700)]
net: tg3: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: bcmgenet: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:03 +0000 (19:15 -0700)]
net: bcmgenet: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject all unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: bnx2x: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:02 +0000 (19:15 -0700)]
net: bnx2x: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: bnx2: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:01 +0000 (19:15 -0700)]
net: bnx2: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: systemport: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:15:00 +0000 (19:15 -0700)]
net: systemport: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver did not previously reject most of the unsupported
parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: aquantia: reject all unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:14:59 +0000 (19:14 -0700)]
net: aquantia: reject all unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

This driver only rejected some of the unsupported parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ena: reject unsupported coalescing params
Jakub Kicinski [Tue, 10 Mar 2020 02:14:58 +0000 (19:14 -0700)]
net: ena: reject unsupported coalescing params

Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agor8169: simplify getting stats by using netdev_stats_to_stats64
Heiner Kallweit [Tue, 10 Mar 2020 22:15:00 +0000 (23:15 +0100)]
r8169: simplify getting stats by using netdev_stats_to_stats64

Let netdev_stats_to_stats64() do the copy work for us.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agor8169: let rtl8169_mark_to_asic clear rx descriptor field opts2
Heiner Kallweit [Tue, 10 Mar 2020 22:14:41 +0000 (23:14 +0100)]
r8169: let rtl8169_mark_to_asic clear rx descriptor field opts2

Clearing opts2 belongs to preparing the descriptor for DMA engine use.
Therefore move it into rtl8169_mark_to_asic().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Tue, 10 Mar 2020 23:20:03 +0000 (16:20 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-03-10

This series contains updates to ice and iavf drivers.

Cleaned up unnecessary parentheses, which were pointed out by Sergei
Shtylyov.

Mitch updates the iavf and ice drivers to expand the limitation on the
number of queues that the driver can support to account for the newer
800-series capabilities.

Brett cleans up the error messages for both SR-IOV and non SR-IOV use
cases.  Fixed the logic when the ice driver is removed and a bare-metal
VF is passing traffic, which was causing a transmit hang on the VF.
Updated the ice driver to display "Link detected" field via ethtool,
when the driver is in safe mode.  Updated ice driver to properly set
VLAN pruning when transmit anti-spoof is off.

Avinash fixed a corner case in DCB, when switching from IEEE to CEE
mode, the DCBX mode does not get properly updated.

Dave updates the logic when switching from software DCB to firmware DCB
to renegotiate DCBX to ensure the firmware agent has up to date
information about the DCB settings of the link partner.

Lukasz increases the PF's mailbox receive queue size to the maximum to
prevent potential bottleneck or slow down occurring from the PF's
mailbox receive queue being full.

Bruce updates the ice driver to use strscpy() instead of strlcpy().
Cleaned up variable names that were not very descriptive with names that
had more meaning.

Anirudh replaces the use of ENOTSUPP with EOPNOTSUPP in the ice driver.

Jake fixed up a function header comment to properly reflect the variable
size and use.

v2: Dropped patch 5 of the original series, where Tony added tunnel
    offload support.  Based on community feedback, the patch needed
    changes, so giving Tony additional time to work on those changes and
    not hold up the remaining changes in the series.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mt7530: fix macro MIRROR_PORT
DENG Qingfang [Tue, 10 Mar 2020 18:20:50 +0000 (02:20 +0800)]
net: dsa: mt7530: fix macro MIRROR_PORT

The inner pair of parentheses should be around the variable x
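Why the placement matters can be shown with a small standalone sketch. The two DEMO_* macros below are hypothetical illustrations of the misplaced and corrected parenthesization, not copies of the driver's header:

```c
#include <assert.h>

/* Parentheses wrap the whole expression twice, but not the argument. */
#define DEMO_MIRROR_PORT_BAD(x)  ((x & 0x7))
/* Parentheses wrap the argument, so it is evaluated before the mask. */
#define DEMO_MIRROR_PORT_GOOD(x) ((x) & 0x7)

/* Expands to (a | b & 0x7): '&' binds tighter than '|', so only b is masked. */
static unsigned int demo_bad(unsigned int a, unsigned int b)
{
	return DEMO_MIRROR_PORT_BAD(a | b);
}

/* Expands to ((a | b) & 0x7): the combined value is masked. */
static unsigned int demo_good(unsigned int a, unsigned int b)
{
	return DEMO_MIRROR_PORT_GOOD(a | b);
}

/* Small self-check showing that the two expansions disagree. */
static void demo_mirror_port_check(void)
{
	assert(demo_bad(0x10, 0x1) == 0x11);
	assert(demo_good(0x10, 0x1) == 0x1);
}
```

The two macros agree whenever the argument is a plain variable or constant, which is why this kind of bug can go unnoticed until a caller passes a compound expression.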

Fixes: 37feab6076aa ("net: dsa: mt7530: add support for port mirroring")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: microchip: use delayed_work instead of timer + work
George McCollister [Tue, 10 Mar 2020 17:58:59 +0000 (12:58 -0500)]
net: dsa: microchip: use delayed_work instead of timer + work

Simplify ksz_common.c by using delayed_work instead of a combination of
timer and work.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'flow_offload-follow-ups-to-HW-stats-type-patchset'
David S. Miller [Tue, 10 Mar 2020 23:04:19 +0000 (16:04 -0700)]
Merge branch 'flow_offload-follow-ups-to-HW-stats-type-patchset'

Jiri Pirko says:

====================
flow_offload: follow-ups to HW stats type patchset

This patchset includes a couple of patches in reaction to the discussions
of the original HW stats patchset. The first patch is a fix;
the other two patches are basically cosmetic.
====================

Acked-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoflow_offload: restrict driver to pass one allowed bit to flow_action_hw_stats_types_c...
Jiri Pirko [Tue, 10 Mar 2020 15:49:09 +0000 (16:49 +0100)]
flow_offload: restrict driver to pass one allowed bit to flow_action_hw_stats_types_check()

The intention of this helper was to allow a driver to specify the one type
that it supports, so that values other than "any" can pass too. So make the
API stricter and allow the driver to pass only the one bit that is going
to be checked.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoflow_offload: turn hw_stats_type into dedicated enum
Jiri Pirko [Tue, 10 Mar 2020 15:49:08 +0000 (16:49 +0100)]
flow_offload: turn hw_stats_type into dedicated enum

Put the values into enum and add an enum to define the bits.

Suggested-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoflow_offload: fix allowed types check
Jiri Pirko [Tue, 10 Mar 2020 15:49:07 +0000 (16:49 +0100)]
flow_offload: fix allowed types check

Change the check to see if the passed allowed type bit is enabled.

Fixes: 319a1d19471e ("flow_offload: check for basic action hw stats type")
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'enetc-Support-extended-BD-rings-at-runtime'
David S. Miller [Tue, 10 Mar 2020 22:48:54 +0000 (15:48 -0700)]
Merge branch 'enetc-Support-extended-BD-rings-at-runtime'

Claudiu Manoil says:

====================
enetc: Support extended BD rings at runtime

First two patches are just misc code cleanup.
The 3rd patch prepares the Rx BD processing code to be extended
to processing both normal and extended BDs.
The last one adds extended Rx BD support for timestamping
without the need of a static config. Finally, the config option
FSL_ENETC_HW_TIMESTAMPING can be dropped.
Care was taken not to impact non-timestamping usecases.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoenetc: Add dynamic allocation of extended Rx BD rings
Claudiu Manoil [Tue, 10 Mar 2020 12:51:24 +0000 (14:51 +0200)]
enetc: Add dynamic allocation of extended Rx BD rings

Hardware timestamping support (PTP) on Rx requires extended
buffer descriptors, double the size of normal Rx descriptors.
On the current controller revision only the timestamping offload
requires extended Rx descriptors.
Since Rx timestamping can be turned on/off at runtime, make Rx ring
allocation configurable at runtime too. As a result, the static
config option FSL_ENETC_HW_TIMESTAMPING can be dropped and the
extended descriptors can be used only when Rx timestamping gets
activated.
The extension has the same size as the base descriptor, making
the descriptor iterators easy to update for the extended case.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
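
Because the extension is the same size as the base descriptor, an iterator
only has to step by two slots instead of one when extended mode is on. The
sketch below illustrates that idea under stated assumptions: the structure
layout, the 16-byte descriptor size, and the names rx_bd, rx_ring and
ring_next are all hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical base descriptor; the size is assumed for illustration. */
struct rx_bd {
	unsigned char raw[16];
};

/* Hypothetical ring state. */
struct rx_ring {
	struct rx_bd *base;	/* first slot of the ring memory */
	int count;		/* number of logical descriptors */
	bool ext_en;		/* true when each descriptor has an extension */
};

/* Number of base-sized slots one logical descriptor occupies. */
static size_t ring_stride(const struct rx_ring *r)
{
	return r->ext_en ? 2 : 1;
}

/*
 * Advance to the next logical descriptor, wrapping at the end of the
 * ring. Callers never touch the stride themselves.
 */
static struct rx_bd *ring_next(const struct rx_ring *r, struct rx_bd *bd,
			       int *index)
{
	if (++*index == r->count) {
		*index = 0;
		return r->base;
	}
	return bd + ring_stride(r);
}
```

A ring of four extended descriptors would need eight base-sized slots of
memory, and ring_next() would visit slots 0, 2, 4, 6 before wrapping to 0.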
4 years agoenetc: Clean up Rx BD iteration
Claudiu Manoil [Tue, 10 Mar 2020 12:51:23 +0000 (14:51 +0200)]
enetc: Clean up Rx BD iteration

Improve maintainability of the code iterating the Rx buffer
descriptors to prepare it to support iterating extended Rx BD
descriptors as well.
Don't explicitly increment the h/w descriptor pointers by one;
provide an iterator that takes care of the h/w details.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoenetc: Clean up of ethtool stats len
Claudiu Manoil [Tue, 10 Mar 2020 12:51:22 +0000 (14:51 +0200)]
enetc: Clean up of ethtool stats len

Refactor the stats len computation code to make it easier
to add new stats counters.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoenetc: Drop redundant device node check
Claudiu Manoil [Tue, 10 Mar 2020 12:51:21 +0000 (14:51 +0200)]
enetc: Drop redundant device node check

The existence of the DT port node is the first thing checked
at probe time, and probing won't reach this point if the node
is missing.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agopktgen: Allow on loopback device
Lukas Wunner [Tue, 10 Mar 2020 10:49:46 +0000 (11:49 +0100)]
pktgen: Allow on loopback device

When pktgen is used to measure the performance of dev_queue_xmit()
packet handling in the core, it is preferable to not hand down
packets to a low-level Ethernet driver as it would distort the
measurements.

Allow using pktgen on the loopback device, thus constraining
measurements to core code.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoflow_offload: use flow_action_for_each in flow_action_mixed_hw_stats_types_check()
Jiri Pirko [Tue, 10 Mar 2020 10:11:57 +0000 (11:11 +0100)]
flow_offload: use flow_action_for_each in flow_action_mixed_hw_stats_types_check()

Instead of manually iterating over entries, use flow_action_for_each
helper. Move the helper and wrap it to fit to 80 cols on the way.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoice: fix incorrect size description of ice_get_nvm_version
Jacob Keller [Thu, 27 Feb 2020 18:15:05 +0000 (10:15 -0800)]
ice: fix incorrect size description of ice_get_nvm_version

The function comment for ice_get_nvm_version indicated that the ver_hi
and ver_lo values were 16 bits. In fact, they are only uint8_t values,
meaning that they have a maximum size of 8 bits. Fix the comment to
match the correct size.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoice: use variable name more descriptive than type
Bruce Allan [Thu, 27 Feb 2020 18:15:04 +0000 (10:15 -0800)]
ice: use variable name more descriptive than type

The variable name 'type' is not very descriptive. Replace instances of
it with a more descriptive variable name, or remove the variable where
it is not needed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoice: Use EOPNOTSUPP instead of ENOTSUPP
Anirudh Venkataramanan [Thu, 27 Feb 2020 18:15:03 +0000 (10:15 -0800)]
ice: Use EOPNOTSUPP instead of ENOTSUPP

Using ENOTSUPP almost always results in some bizarre error message to
be printed in userspace. This is likely because ENOTSUPP was defined for
the NFS protocol (as per a comment in include/linux/errno.h). Use
EOPNOTSUPP instead.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
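
The userspace symptom can be reproduced with a small program. This is a
sketch, not part of the driver: KERNEL_ENOTSUPP and describe_errno() are
names introduced here, and the value 524 is the kernel-internal ENOTSUPP
from include/linux/errno.h. The C library has no message for that number,
so strerror() typically yields something like "Unknown error 524", whereas
EOPNOTSUPP maps to a regular message.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Kernel-internal value; userspace errno.h does not define ENOTSUPP. */
#define KERNEL_ENOTSUPP 524

/* Format "<number>: <libc message>" into buf for the given error code. */
static void describe_errno(int err, char *buf, size_t len)
{
	snprintf(buf, len, "%d: %s", err, strerror(err));
}
```

Calling describe_errno() with KERNEL_ENOTSUPP and then with EOPNOTSUPP
shows the two messages side by side; the exact wording depends on the C
library in use.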
4 years agoice: Fix format specifier
Tony Nguyen [Thu, 27 Feb 2020 18:15:02 +0000 (10:15 -0800)]
ice: Fix format specifier

Commit ed5a3f664c55 ("ice: Removing hung_queue variable to use txqueue
function parameter") began utilizing the txqueue variable over the
hung_queue variable. hung_queue was an int, whereas txqueue is an unsigned
int. Update the format specifiers to reflect the new type.

Fixes: ed5a3f664c55 ("ice: Removing hung_queue variable to use txqueue function parameter")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
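
As a small illustration of the kind of change involved, an unsigned int
argument should be printed with %u rather than %d. The helper name and the
message text below are hypothetical and are not the driver's actual log
strings.

```c
#include <stdio.h>

/* Hypothetical helper: formats a message naming the stalled Tx queue. */
static int format_hung_queue(char *buf, size_t len, unsigned int txqueue)
{
	/* %u matches unsigned int; %d would expect a signed int. */
	return snprintf(buf, len, "tx_timeout: queue %u", txqueue);
}
```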
4 years agoice: fix use of deprecated strlcpy()
Bruce Allan [Thu, 27 Feb 2020 18:15:01 +0000 (10:15 -0800)]
ice: fix use of deprecated strlcpy()

checkpatch complains "CHECK:DEPRECATED_API: Deprecated use of 'strlcpy',
prefer 'stracpy or strscpy' instead"; use strscpy.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
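
strscpy() is a kernel API, so it cannot be called from a userspace program.
The stand-in below, sketch_strscpy(), is a hypothetical reimplementation
written only to show the contract that makes it preferable: the destination
is always NUL-terminated when size is non-zero, the source is never read
past the point needed, and truncation is reported as -E2BIG instead of
being inferred from a returned length.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Userspace sketch of the strscpy() contract. Returns the number of
 * characters copied (excluding the NUL), or -E2BIG if size is zero or
 * the source did not fit.
 */
static ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If the source has more characters left, the copy was truncated. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

Copying "abc" into a 4-byte buffer returns 3; copying "abcdef" into the
same buffer leaves "abc" in it and returns -E2BIG.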
4 years agoice: Increase mailbox receive queue length to maximum
Lukasz Czapnik [Thu, 27 Feb 2020 18:15:00 +0000 (10:15 -0800)]
ice: Increase mailbox receive queue length to maximum

Currently the PF's mailbox receive queue is only 512 entries. This is fine,
but considering that all VFs' mailbox send queues funnel into the PF's
single mailbox receive queue, let's increase it to the maximum size. This
will help prevent any possible bottleneck/slowdown caused by the PF's
mailbox receive queue being full.

Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoice: Correct setting VLAN pruning
Brett Creeley [Thu, 27 Feb 2020 18:14:59 +0000 (10:14 -0800)]
ice: Correct setting VLAN pruning

VLAN pruning is not always being set correctly due to a previous change
that set Tx antispoof off. ice_vsi_is_vlan_pruning_ena() currently checks
for both Tx antispoof and Rx pruning. The expectation for this function is
to only check Rx pruning so fix the check.

Fixes: cd6d6b83316a ("ice: Fix VF spoofchk")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
4 years agoice: renegotiate link after FW DCB on
Dave Ertman [Thu, 27 Feb 2020 18:14:58 +0000 (10:14 -0800)]
ice: renegotiate link after FW DCB on

When switching from SW DCB to FW DCB it is necessary
to renegotiate DCBx so that the FW agent has up-to-date
information about the DCB settings of the link
partner.

Perform an autoneg restart on the link when activating
FW DCB.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>