Julian Wiedmann [Thu, 28 Feb 2019 17:59:37 +0000 (18:59 +0100)]
s390/qeth: enable/disable the HW trap a little earlier
When setting an L2 qeth device online, enable the HW trap as soon as the
control plane is available. This allows us to catch any error that
occurs during the very first commands.
In the same spirit, the offline code should disable the HW trap as the
very first step of its processing.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 28 Feb 2019 17:59:36 +0000 (18:59 +0100)]
s390/qeth: remove RECOVER state
The offline code uses a specific RECOVER state to indicate that the
interface should be brought up when a qeth device is set online again.
Rather than having a specific card-state for this, just put it in an
internal flag bit and set the state to DOWN. When working with the
card's state transitions, this reduces the complexity quite a bit.
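The idea of replacing a dedicated state with a flag bit can be sketched roughly as follows. This is a userspace illustration only: the type, state, and flag names are hypothetical and are not taken from the qeth driver.

```c
#include <stdbool.h>

/* Hypothetical card states: note there is no dedicated RECOVER state. */
enum sketch_state { SKETCH_DOWN, SKETCH_UP };

/* Hypothetical internal flag: reopen the interface on the next online. */
#define SKETCH_FLAG_REOPEN (1u << 0)

struct sketch_card {
	enum sketch_state state;
	unsigned int flags;
};

/* Going offline always ends in DOWN; the wish to reopen is kept in a flag. */
static void sketch_set_offline(struct sketch_card *card, bool reopen_later)
{
	if (reopen_later)
		card->flags |= SKETCH_FLAG_REOPEN;
	card->state = SKETCH_DOWN;
}

/* Going online consumes the flag and reports whether to bring the interface up. */
static bool sketch_set_online(struct sketch_card *card)
{
	bool reopen = (card->flags & SKETCH_FLAG_REOPEN) != 0;

	card->flags &= ~SKETCH_FLAG_REOPEN;
	card->state = SKETCH_UP;
	return reopen;
}
```

With this shape, code that switches on the card state only has to handle the real states, and the reopen request travels separately.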
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 28 Feb 2019 14:10:08 +0000 (15:10 +0100)]
net/smc: allow pnetid-less configuration
Without hardware pnetid support there must currently be a pnet
table configured to determine the IB device port to be used for SMC
RDMA traffic. This patch enables a setup without pnet table, if
the used handshake interface belongs already to a RoCE port.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leslie Monis [Thu, 28 Feb 2019 12:36:54 +0000 (18:06 +0530)]
net: sched: pie: avoid slow division in drop probability decay
As per RFC 8033, it is sufficient for the drop probability
decay factor to have a value of (1 - 1/64) instead of 98%.
This avoids the need to do slow division.
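A minimal sketch of the arithmetic, not the actual sch_pie code: multiplying by (1 - 1/64), which is about 98.4%, needs only a shift and a subtraction, while an exact 98% needs a multiply and a divide. Both helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: exact 98% decay, needs a 64-bit division.
 * Assumes prob is small enough that prob * 98 does not overflow. */
static uint64_t sketch_decay_98_percent(uint64_t prob)
{
	return prob * 98 / 100;
}

/* Hypothetical helper: decay by (1 - 1/64) using a shift and a subtract. */
static uint64_t sketch_decay_shift(uint64_t prob)
{
	return prob - (prob >> 6);
}
```

For example, a probability value of 6400 decays to 6300 with the shift form and to 6272 with the exact 98% form.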
Suggested-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Thu, 28 Feb 2019 10:03:16 +0000 (15:33 +0530)]
cxgb4vf: Enter debugging mode if FW is inaccessible
If we are not able to reach firmware, enter debugging mode that will
help us to get adapter logs.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Thu, 28 Feb 2019 09:39:28 +0000 (15:09 +0530)]
cxgb4: Enable outer UDP checksum offload for T6
T6 adapters support outer UDP checksum offload for
encapsulated packets, hence enabling netdev feature flag
NETIF_F_GSO_UDP_TUNNEL_CSUM.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Thu, 28 Feb 2019 09:36:54 +0000 (15:06 +0530)]
cxgb4/cxgb4vf: Fix up netdev->hw_features
GRO is done by cxgb4/cxgb4vf. Hence set NETIF_F_GRO flag for
both cxgb4/cxgb4vf.
Cleaned up VLAN netdev features in cxgb4vf. Also fixed
NETIF_F_HIGHDMA being set unconditionally for vlan netdev
features.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eli Britstein [Tue, 26 Feb 2019 09:57:34 +0000 (09:57 +0000)]
net: sched: act_csum: Fix csum calc for tagged packets
The csum calculation is different for IPv4/6. For VLAN packets,
tc_skb_protocol returns the VLAN protocol rather than the packet's one
(e.g. IPv4/6), so csum is not calculated. Furthermore, the VLAN tag may
not be stripped, in which case csum is not calculated either. Calculate
the csum for those cases.
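The underlying problem, finding the real L3 protocol behind one or more VLAN tags, can be sketched in userspace as below. This is an illustration under assumptions, not the act_csum code: it works on a raw byte buffer instead of an skb, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_P_8021Q  0x8100 /* 802.1Q VLAN tag */
#define SKETCH_ETH_P_8021AD 0x88A8 /* 802.1ad service tag */

/* Read a big-endian 16-bit value. */
static uint16_t sketch_read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the first non-VLAN EtherType of an Ethernet frame,
 * or 0 if the frame is too short to contain it.
 */
static uint16_t sketch_inner_ethertype(const uint8_t *frame, size_t len)
{
	size_t off = 12; /* outer EtherType follows the two MAC addresses */
	uint16_t proto;

	if (len < off + 2)
		return 0;
	proto = sketch_read_be16(frame + off);

	/* Each VLAN tag is 4 bytes: TCI plus the encapsulated EtherType. */
	while (proto == SKETCH_ETH_P_8021Q || proto == SKETCH_ETH_P_8021AD) {
		off += 4;
		if (len < off + 2)
			return 0;
		proto = sketch_read_be16(frame + off);
	}
	return proto;
}
```

A checksum routine that dispatches on the returned value sees IPv4 or IPv6 even when the tag is still present in the packet data.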
Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 26 Feb 2019 00:27:57 +0000 (18:27 -0600)]
net: hns: use struct_size() in devm_kzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
    int stuff;
    struct boo entry[];
};
instance = devm_kzalloc(dev, sizeof(struct foo) + sizeof(struct boo) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = devm_kzalloc(dev, struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
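What the helper buys over the open-coded sum can be sketched in userspace as follows. The kernel's struct_size() saturates to SIZE_MAX on overflow; the stand-in below mimics that behaviour with plain C, and sketch_flex_size() and sketch_foo_alloc() are hypothetical names, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct boo {
	int value;
};

struct foo {
	int stuff;
	struct boo entry[];
};

/*
 * Userspace stand-in: header size plus count elements,
 * saturating at SIZE_MAX instead of wrapping around.
 */
static size_t sketch_flex_size(size_t header, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * count;
}

/* Allocate a zeroed struct foo with room for count entries. */
static struct foo *sketch_foo_alloc(size_t count)
{
	size_t bytes = sketch_flex_size(sizeof(struct foo),
					sizeof(struct boo), count);

	if (bytes == SIZE_MAX)
		return NULL; /* the size computation overflowed */
	return calloc(1, bytes);
}
```

An absurdly large count then fails the allocation cleanly rather than producing a short buffer from a wrapped multiplication.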
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 28 Feb 2019 05:41:41 +0000 (21:41 -0800)]
Merge branch 'net-phy-marvell10g-Clean-get_features-by-using-C45-helpers'
Maxime Chevallier says:
====================
net: phy: marvell10g: Clean .get_features by using C45 helpers
Recent work on C45 helpers by Heiner made the
genphy_c45_pma_read_abilities function generic enough to use as a
default .get_features implementation.
This series removes the remaining redundant code in
mv3310_get_features(), and makes the 2110 PHY use
genphy_c45_pma_read_abilities() directly, since it doesn't have the
issue with the wrong abilities being reported.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Mon, 25 Feb 2019 16:14:07 +0000 (17:14 +0100)]
net: phy: marvell10g: Use the generic C45 helper to read the 2110 features
Contrary to the 3310, the 2110 PHY correctly reports its 2.5G/5G
abilities. We can therefore use the genphy_c45_pma_read_abilities helper
to build the list of features.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Mon, 25 Feb 2019 16:14:06 +0000 (17:14 +0100)]
net: phy: marvell10g: Let genphy_c45_pma_read_abilities set Aneg bit
The genphy_c45_pma_read_abilities helper now sets the Autoneg ability
in phydev->supported according to what the AN MMD reports.
We therefore don't need to manually do that in mv3310_get_features().
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Mon, 25 Feb 2019 15:30:14 +0000 (17:30 +0200)]
net: sched: act_tunnel_key: fix metadata handling
Tunnel key action params->tcft_enc_metadata is only set when action is
TCA_TUNNEL_KEY_ACT_SET. However, metadata pointer is incorrectly
dereferenced during tunnel key init and release without verifying that
action is of correct type, which causes NULL pointer dereference. Metadata
tunnel dst_cache is also leaked on action overwrite.
Fix metadata handling:
- Verify that metadata pointer is not NULL before dereferencing it in
tunnel_key_init error handling code.
- Move dst_cache destroy code into tunnel_key_release_params() function
that is called in both action overwrite and release cases (fixes resource
leak) and verifies that action has correct type before dereferencing
metadata pointer (fixes NULL pointer dereference).
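The shape of the fix described in the list above can be sketched in userspace like this. It is an illustration only: the struct layout, the enum values, and the function name are hypothetical and merely mirror the roles of the action type, the metadata pointer, and the dst_cache.

```c
#include <stdlib.h>

/* Hypothetical action types; metadata exists only for the SET action. */
enum sketch_action { SKETCH_ACT_SET = 1, SKETCH_ACT_RELEASE = 2 };

struct sketch_metadata {
	int *dst_cache; /* stands in for the per-tunnel dst cache */
};

struct sketch_params {
	enum sketch_action action;
	struct sketch_metadata *metadata; /* NULL unless action is SET */
};

/*
 * Single release path used for both overwrite and final release.
 * Returns 1 if a cache was destroyed, 0 otherwise.
 */
static int sketch_release_params(struct sketch_params *p)
{
	int destroyed = 0;

	if (!p)
		return 0;

	/* Check the action type and the pointer before dereferencing. */
	if (p->action == SKETCH_ACT_SET && p->metadata) {
		free(p->metadata->dst_cache);
		free(p->metadata);
		destroyed = 1;
	}
	free(p);
	return destroyed;
}
```

Because every path that drops the params goes through one function, the cache cannot be leaked on overwrite and the pointer is never touched for actions that do not carry metadata.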
Oops with KASAN enabled during tdc tests execution:
[ 261.080482] ==================================================================
[ 261.088049] BUG: KASAN: null-ptr-deref in dst_cache_destroy+0x21/0xa0
[ 261.094613] Read of size 8 at addr 00000000000000b0 by task tc/2976
[ 261.102524] CPU: 14 PID: 2976 Comm: tc Not tainted 5.0.0-rc7+ #157
[ 261.108844] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 261.116726] Call Trace:
[ 261.119234] dump_stack+0x9a/0xeb
[ 261.122625] ? dst_cache_destroy+0x21/0xa0
[ 261.126818] ? dst_cache_destroy+0x21/0xa0
[ 261.131004] kasan_report+0x176/0x192
[ 261.134752] ? idr_get_next+0xd0/0x120
[ 261.138578] ? dst_cache_destroy+0x21/0xa0
[ 261.142768] dst_cache_destroy+0x21/0xa0
[ 261.146799] tunnel_key_release+0x3a/0x50 [act_tunnel_key]
[ 261.152392] tcf_action_cleanup+0x2c/0xc0
[ 261.156490] tcf_generic_walker+0x4c2/0x5c0
[ 261.160794] ? tcf_action_dump_1+0x390/0x390
[ 261.165163] ? tunnel_key_walker+0x5/0x1a0 [act_tunnel_key]
[ 261.170865] ? tunnel_key_walker+0xe9/0x1a0 [act_tunnel_key]
[ 261.176641] tca_action_gd+0x600/0xa40
[ 261.180482] ? tca_get_fill.constprop.17+0x200/0x200
[ 261.185548] ? __lock_acquire+0x588/0x1d20
[ 261.189741] ? __lock_acquire+0x588/0x1d20
[ 261.193922] ? mark_held_locks+0x90/0x90
[ 261.197944] ? mark_held_locks+0x90/0x90
[ 261.202018] ? __nla_parse+0xfe/0x190
[ 261.205774] tc_ctl_action+0x218/0x230
[ 261.209614] ? tcf_action_add+0x230/0x230
[ 261.213726] rtnetlink_rcv_msg+0x3a5/0x600
[ 261.217910] ? lock_downgrade+0x2d0/0x2d0
[ 261.222006] ? validate_linkmsg+0x400/0x400
[ 261.226278] ? find_held_lock+0x6d/0xd0
[ 261.230200] ? match_held_lock+0x1b/0x210
[ 261.234296] ? validate_linkmsg+0x400/0x400
[ 261.238567] netlink_rcv_skb+0xc7/0x1f0
[ 261.242489] ? netlink_ack+0x470/0x470
[ 261.246319] ? netlink_deliver_tap+0x1f3/0x5a0
[ 261.250874] netlink_unicast+0x2ae/0x350
[ 261.254884] ? netlink_attachskb+0x340/0x340
[ 261.261647] ? _copy_from_iter_full+0xdd/0x380
[ 261.268576] ? __virt_addr_valid+0xb6/0xf0
[ 261.275227] ? __check_object_size+0x159/0x240
[ 261.282184] netlink_sendmsg+0x4d3/0x630
[ 261.288572] ? netlink_unicast+0x350/0x350
[ 261.295132] ? netlink_unicast+0x350/0x350
[ 261.301608] sock_sendmsg+0x6d/0x80
[ 261.307467] ___sys_sendmsg+0x48e/0x540
[ 261.313633] ? copy_msghdr_from_user+0x210/0x210
[ 261.320545] ? save_stack+0x89/0xb0
[ 261.326289] ? __lock_acquire+0x588/0x1d20
[ 261.332605] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 261.340063] ? mark_held_locks+0x90/0x90
[ 261.346162] ? do_filp_open+0x138/0x1d0
[ 261.352108] ? may_open_dev+0x50/0x50
[ 261.357897] ? match_held_lock+0x1b/0x210
[ 261.364016] ? __fget_light+0xa6/0xe0
[ 261.369840] ? __sys_sendmsg+0xd2/0x150
[ 261.375814] __sys_sendmsg+0xd2/0x150
[ 261.381610] ? __ia32_sys_shutdown+0x30/0x30
[ 261.388026] ? lock_downgrade+0x2d0/0x2d0
[ 261.394182] ? mark_held_locks+0x1c/0x90
[ 261.400230] ? do_syscall_64+0x1e/0x280
[ 261.406172] do_syscall_64+0x78/0x280
[ 261.411932] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 261.419103] RIP: 0033:0x7f28e91a8b87
[ 261.424791] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00 00 00 53 48 89 f3 48
[ 261.448226] RSP: 002b:00007ffdc5c4e2d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 261.458183] RAX: ffffffffffffffda RBX: 000000005c73c202 RCX: 00007f28e91a8b87
[ 261.467728] RDX: 0000000000000000 RSI: 00007ffdc5c4e340 RDI: 0000000000000003
[ 261.477342] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000000c
[ 261.486970] R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001
[ 261.496599] R13: 000000000067b4e0 R14: 00007ffdc5c5248c R15: 00007ffdc5c52480
[ 261.506281] ==================================================================
[ 261.516076] Disabling lock debugging due to kernel taint
[ 261.523979] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
[ 261.534413] #PF error: [normal kernel read fault]
[ 261.541730] PGD 8000000317400067 P4D 8000000317400067 PUD 316878067 PMD 0
[ 261.551294] Oops: 0000 [#1] SMP KASAN PTI
[ 261.557985] CPU: 14 PID: 2976 Comm: tc Tainted: G B 5.0.0-rc7+ #157
[ 261.568306] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 261.578874] RIP: 0010:dst_cache_destroy+0x21/0xa0
[ 261.586413] Code: f4 ff ff ff eb f6 0f 1f 00 0f 1f 44 00 00 41 56 41 55 49 c7 c6 60 fe 35 af 41 54 55 49 89 fc 53 bd ff ff ff ff e8 ef 98 73 ff <49> 83 3c 24 00 75 35 eb 6c 4c 63 ed e8 de 98 73 ff 4a 8d 3c ed 40
[ 261.611247] RSP: 0018:ffff888316447160 EFLAGS: 00010282
[ 261.619564] RAX: 0000000000000000 RBX: ffff88835b3e2f00 RCX: ffffffffad1c5071
[ 261.629862] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: 0000000000000297
[ 261.640149] RBP: 00000000ffffffff R08: fffffbfff5dd4e89 R09: fffffbfff5dd4e89
[ 261.650467] R10: 0000000000000001 R11: fffffbfff5dd4e88 R12: 00000000000000b0
[ 261.660785] R13: ffff8883267a10c0 R14: ffffffffaf35fe60 R15: 0000000000000001
[ 261.671110] FS: 00007f28ea3e6400(0000) GS:ffff888364200000(0000) knlGS:0000000000000000
[ 261.682447] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 261.691491] CR2: 00000000000000b0 CR3: 00000003178ae004 CR4: 00000000001606e0
[ 261.701283] Call Trace:
[ 261.706374] tunnel_key_release+0x3a/0x50 [act_tunnel_key]
[ 261.714522] tcf_action_cleanup+0x2c/0xc0
[ 261.721208] tcf_generic_walker+0x4c2/0x5c0
[ 261.728074] ? tcf_action_dump_1+0x390/0x390
[ 261.734996] ? tunnel_key_walker+0x5/0x1a0 [act_tunnel_key]
[ 261.743247] ? tunnel_key_walker+0xe9/0x1a0 [act_tunnel_key]
[ 261.751557] tca_action_gd+0x600/0xa40
[ 261.757991] ? tca_get_fill.constprop.17+0x200/0x200
[ 261.765644] ? __lock_acquire+0x588/0x1d20
[ 261.772461] ? __lock_acquire+0x588/0x1d20
[ 261.779266] ? mark_held_locks+0x90/0x90
[ 261.785880] ? mark_held_locks+0x90/0x90
[ 261.792470] ? __nla_parse+0xfe/0x190
[ 261.798738] tc_ctl_action+0x218/0x230
[ 261.805145] ? tcf_action_add+0x230/0x230
[ 261.811760] rtnetlink_rcv_msg+0x3a5/0x600
[ 261.818564] ? lock_downgrade+0x2d0/0x2d0
[ 261.825433] ? validate_linkmsg+0x400/0x400
[ 261.832256] ? find_held_lock+0x6d/0xd0
[ 261.838624] ? match_held_lock+0x1b/0x210
[ 261.845142] ? validate_linkmsg+0x400/0x400
[ 261.851729] netlink_rcv_skb+0xc7/0x1f0
[ 261.857976] ? netlink_ack+0x470/0x470
[ 261.864132] ? netlink_deliver_tap+0x1f3/0x5a0
[ 261.870969] netlink_unicast+0x2ae/0x350
[ 261.877294] ? netlink_attachskb+0x340/0x340
[ 261.883962] ? _copy_from_iter_full+0xdd/0x380
[ 261.890750] ? __virt_addr_valid+0xb6/0xf0
[ 261.897188] ? __check_object_size+0x159/0x240
[ 261.903928] netlink_sendmsg+0x4d3/0x630
[ 261.910112] ? netlink_unicast+0x350/0x350
[ 261.916410] ? netlink_unicast+0x350/0x350
[ 261.922656] sock_sendmsg+0x6d/0x80
[ 261.928257] ___sys_sendmsg+0x48e/0x540
[ 261.934183] ? copy_msghdr_from_user+0x210/0x210
[ 261.940865] ? save_stack+0x89/0xb0
[ 261.946355] ? __lock_acquire+0x588/0x1d20
[ 261.952358] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 261.959468] ? mark_held_locks+0x90/0x90
[ 261.965248] ? do_filp_open+0x138/0x1d0
[ 261.970910] ? may_open_dev+0x50/0x50
[ 261.976386] ? match_held_lock+0x1b/0x210
[ 261.982210] ? __fget_light+0xa6/0xe0
[ 261.987648] ? __sys_sendmsg+0xd2/0x150
[ 261.993263] __sys_sendmsg+0xd2/0x150
[ 261.998613] ? __ia32_sys_shutdown+0x30/0x30
[ 262.004555] ? lock_downgrade+0x2d0/0x2d0
[ 262.010236] ? mark_held_locks+0x1c/0x90
[ 262.015758] ? do_syscall_64+0x1e/0x280
[ 262.021234] do_syscall_64+0x78/0x280
[ 262.026500] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 262.033207] RIP: 0033:0x7f28e91a8b87
[ 262.038421] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00 00 00 53 48 89 f3 48
[ 262.060708] RSP: 002b:00007ffdc5c4e2d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 262.070112] RAX: ffffffffffffffda RBX: 000000005c73c202 RCX: 00007f28e91a8b87
[ 262.079087] RDX: 0000000000000000 RSI: 00007ffdc5c4e340 RDI: 0000000000000003
[ 262.088122] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000000c
[ 262.097157] R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001
[ 262.106207] R13: 000000000067b4e0 R14: 00007ffdc5c5248c R15: 00007ffdc5c52480
[ 262.115271] Modules linked in: act_tunnel_key act_skbmod act_simple act_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 act_csum libcrc32c act_meta_skbtcindex act_meta_skbprio act_meta_mark act_ife ife act_police act_sample psample act_gact veth nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc intel_rapl sb_edac mlx5_ib x86_pkg_temp_thermal sunrpc intel_powerclamp coretemp ib_uverbs kvm_intel ib_core kvm irqbypass mlx5_core crct10dif_pclmul crc32_pclmul crc32c_intel igb ghash_clmulni_intel intel_cstate mlxfw iTCO_wdt devlink intel_uncore iTCO_vendor_support ipmi_ssif ptp mei_me intel_rapl_perf ioatdma joydev pps_core ses mei i2c_i801 pcspkr enclosure lpc_ich dca wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter pcc_cpufreq ast i2c_algo_bit drm_kms_helper ttm drm mpt3sas raid_class scsi_transport_sas
[ 262.204393] CR2: 00000000000000b0
[ 262.210390] ---[ end trace 2e41d786f2c7901a ]---
[ 262.226790] RIP: 0010:dst_cache_destroy+0x21/0xa0
[ 262.234083] Code: f4 ff ff ff eb f6 0f 1f 00 0f 1f 44 00 00 41 56 41 55 49 c7 c6 60 fe 35 af 41 54 55 49 89 fc 53 bd ff ff ff ff e8 ef 98 73 ff <49> 83 3c 24 00 75 35 eb 6c 4c 63 ed e8 de 98 73 ff 4a 8d 3c ed 40
[ 262.258311] RSP: 0018:ffff888316447160 EFLAGS: 00010282
[ 262.266304] RAX: 0000000000000000 RBX: ffff88835b3e2f00 RCX: ffffffffad1c5071
[ 262.276251] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: 0000000000000297
[ 262.286208] RBP: 00000000ffffffff R08: fffffbfff5dd4e89 R09: fffffbfff5dd4e89
[ 262.296183] R10: 0000000000000001 R11: fffffbfff5dd4e88 R12: 00000000000000b0
[ 262.306157] R13: ffff8883267a10c0 R14: ffffffffaf35fe60 R15: 0000000000000001
[ 262.316139] FS: 00007f28ea3e6400(0000) GS:ffff888364200000(0000) knlGS:0000000000000000
[ 262.327146] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 262.335815] CR2: 00000000000000b0 CR3: 00000003178ae004 CR4: 00000000001606e0
Fixes: 41411e2fd6b8 ("net/sched: act_tunnel_key: Add dst_cache support")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pankaj Bansal [Mon, 25 Feb 2019 06:16:55 +0000 (06:16 +0000)]
drivers: net: phy: mdio-mux: Add support for Generic Mux controls
Add support for Generic Mux controls, when Mdio mux node is a consumer
of mux produced by some other device.
Signed-off-by: Pankaj Bansal <pankaj.bansal@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pankaj Bansal [Mon, 25 Feb 2019 06:16:53 +0000 (06:16 +0000)]
dt-bindings: net: Add bindings for mdio mux consumers
When we use the bindings defined in Documentation/devicetree/bindings/mux
to define mdio mux in producer and consumer terms, it results in two
devices: one is the mux producer and the other is the mux consumer.
Add the bindings needed for Mdio mux consumer devices.
Signed-off-by: Pankaj Bansal <pankaj.bansal@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wenxu [Sun, 24 Feb 2019 03:36:20 +0000 (11:36 +0800)]
route: Add multipath_hash in flowi_common to make user-define hash
The current fib_multipath_hash_policy can hash on L3 or L4 fields,
but it only works on the outer IP header. A specific tunnel therefore
always has the same hash value, even though it may carry many
inner connections.
This patch provides a generic multipath_hash in flowi_common. It can
carry a user-defined hash that can be mixed with the L3 or L4 hash.
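How a caller-supplied hash might be folded into the policy hash can be sketched as below. This is a guess at the general shape only: the mixing function, its constants, and the name sketch_mix_hash() are assumptions for illustration and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical mixer: combine the L3/L4 policy hash with an optional
 * user-defined hash. A user hash of 0 means "not set" in this sketch.
 */
static uint32_t sketch_mix_hash(uint32_t base_hash, uint32_t user_hash)
{
	uint32_t x;

	if (user_hash == 0)
		return base_hash; /* behave exactly as before */

	/* Simple invertible mixing steps; the constants are arbitrary odd values. */
	x = base_hash ^ (user_hash * 0x9E3779B1u);
	x ^= x >> 16;
	x *= 0x85EBCA6Bu;
	x ^= x >> 13;
	return x;
}
```

With something like this, two inner connections inside the same tunnel can land on different next hops as long as the tunnel code supplies different user hashes for them.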
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 27 Feb 2019 20:39:56 +0000 (12:39 -0800)]
Merge branch 'net-Remove-switchdev_ops'
Florian Fainelli says:
====================
net: Remove switchdev_ops
This patch series completes the removal of the switchdev_ops by
converting switchdev_port_attr_set() to use either the blocking
(process) or non-blocking (atomic) notifier since we typically need to
deal with both depending on where in the bridge code we get called from.
This was tested with the forwarding selftests and DSA hardware.
Ido, hopefully this captures your comments done on v1, if not, can you
illustrate with some pseudo-code what you had in mind if that's okay?
Changes in v3:
- added Reviewed-by tags from Ido where relevant
- added missing notifier_to_errno() in net/bridge/br_switchdev.c when
calling the atomic notifier for PRE_BRIDGE_FLAGS
- kept mlxsw_sp_switchdev_init() in mlxsw/
Changes in v2:
- do not check for SWITCHDEV_F_DEFER when calling the blocking notifier
and instead directly call the atomic notifier from the single location
where this is required
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:32 +0000 (11:44 -0800)]
net: Remove switchdev_ops
Now that we have converted all possible callers to using a switchdev
notifier for attributes we do not have a need for implementing
switchdev_ops anymore, and this can be removed from all drivers and the
net_device structure.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:31 +0000 (11:44 -0800)]
net: switchdev: Replace port attr set SDO with a notification
Drop switchdev_ops.switchdev_port_attr_set. Drop the uses of this field
from all clients, which were migrated to use switchdev notification in
the previous patches.
Add a new function switchdev_port_attr_notify() that sends the switchdev
notifications SWITCHDEV_PORT_ATTR_SET and calls the blocking (process)
notifier chain.
We have one odd case within net/bridge/br_switchdev.c with the
SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS attribute identifier that
requires executing from atomic context, we deal with that one
specifically.
Drop __switchdev_port_attr_set() and update switchdev_port_attr_set()
likewise.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:30 +0000 (11:44 -0800)]
staging: fsl-dpaa2: ethsw: Handle SWITCHDEV_PORT_ATTR_SET
Following patches will change the way we communicate setting a port's
attribute and use a blocking notifier to perform those tasks.
Prepare ethsw to support receiving notifier events targeting
SWITCHDEV_PORT_ATTR_SET and simply translate that into the existing
swdev_port_attr_set() call.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:29 +0000 (11:44 -0800)]
net: mscc: ocelot: Handle SWITCHDEV_PORT_ATTR_SET
Following patches will change the way we communicate setting a port's
attribute and use notifiers to perform those tasks.
Ocelot does not currently have an atomic notifier registered for
switchdev events, so we need to register one in order to deal with
atomic context SWITCHDEV_PORT_ATTR_SET events.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:28 +0000 (11:44 -0800)]
mlxsw: spectrum_switchdev: Handle SWITCHDEV_PORT_ATTR_SET
Following patches will change the way we communicate setting a port's
attribute and use a notifier to perform those tasks.
Prepare mlxsw to support receiving notifier events targeting
SWITCHDEV_PORT_ATTR_SET and utilize the switchdev_handle_port_attr_set()
to handle stacking of devices.
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:27 +0000 (11:44 -0800)]
net: dsa: Handle SWITCHDEV_PORT_ATTR_SET
Following patches will change the way we communicate setting a port's
attribute and use notifiers towards that goal.
Prepare DSA to support receiving notifier events targeting
SWITCHDEV_PORT_ATTR_SET from both atomic and process context and use a
small helper to translate the event notifier into something that
dsa_slave_port_attr_set() can process.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:26 +0000 (11:44 -0800)]
rocker: Handle SWITCHDEV_PORT_ATTR_SET
Following patches will change the way we communicate setting a port's
attribute and use notifiers towards that goal.
Prepare rocker to support receiving notifier events targeting
SWITCHDEV_PORT_ATTR_SET from both atomic and process context and use a
small helper to translate the event notifier into something that
rocker_port_attr_set() can process.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 27 Feb 2019 19:44:25 +0000 (11:44 -0800)]
switchdev: Add SWITCHDEV_PORT_ATTR_SET
In preparation for allowing switchdev enabled drivers to veto specific
attribute settings from within the context of the caller, introduce a
new switchdev notifier type for port attributes.
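The veto idea behind a notifier for port attributes can be sketched in userspace as follows. This is not the kernel notifier API: the chain is a plain array of callbacks, and all names here are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PORT_ATTR_SET 1UL /* hypothetical event id */

struct sketch_attr {
	int id;
	int value;
};

typedef int (*sketch_notifier_fn)(unsigned long event,
				  struct sketch_attr *attr);

/*
 * Call each handler in order. The first handler that returns a
 * nonzero (negative errno) value vetoes the setting and stops the walk.
 */
static int sketch_call_chain(const sketch_notifier_fn *chain, size_t n,
			     unsigned long event, struct sketch_attr *attr)
{
	for (size_t i = 0; i < n; i++) {
		int err = chain[i](event, attr);

		if (err)
			return err;
	}
	return 0;
}

/* Example handler: accepts every attribute. */
static int sketch_accept_all(unsigned long event, struct sketch_attr *attr)
{
	(void)event;
	(void)attr;
	return 0;
}

/* Example handler: refuses values it cannot support. */
static int sketch_reject_large(unsigned long event, struct sketch_attr *attr)
{
	if (event == SKETCH_PORT_ATTR_SET && attr->value > 100)
		return -EINVAL;
	return 0;
}
```

The caller gets the error back synchronously, which is what lets a driver refuse a setting from within the caller's context instead of failing later.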
Suggested-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Wed, 27 Feb 2019 13:49:17 +0000 (15:49 +0200)]
Revert "net: sched: fw: don't set arg->stop in fw_walk() when empty"
This reverts commit 31a998487641 ("net: sched: fw: don't set arg->stop in
fw_walk() when empty")
Cls API function tcf_proto_is_empty() was changed in commit 6676d5e416ee
("net: sched: set dedicated tcf_walker flag when tp is empty")
to no longer depend on arg->stop to determine that classifier instance is
empty. Instead, it adds dedicated arg->nonempty field, which makes the fix
in fw classifier no longer necessary.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Wed, 27 Feb 2019 12:47:57 +0000 (20:47 +0800)]
ethtool: Use explicit designated initializers for .cmd
Initialize the .cmd member by using a designated struct
initializer. This fixes warning of missing field initializers,
and makes code a little easier to read.
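The difference can be illustrated with a small stand-in struct. The struct and constant below are hypothetical and only mimic a command structure whose first member is .cmd.

```c
#include <stdint.h>

#define SKETCH_CMD_GET 0x1u /* hypothetical command code */

/* Hypothetical command struct with .cmd as its first member. */
struct sketch_value {
	uint32_t cmd;
	uint32_t data;
};

/*
 * A positional initializer such as { SKETCH_CMD_GET } relies on member
 * order and can trigger -Wmissing-field-initializers. Naming the member
 * states the intent; the remaining members are still zero-initialized.
 */
static struct sketch_value sketch_make_request(void)
{
	struct sketch_value v = { .cmd = SKETCH_CMD_GET };

	return v;
}
```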
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leslie Monis [Wed, 27 Feb 2019 01:00:06 +0000 (06:30 +0530)]
net: sched: pie: fix 64-bit division
Use div_u64() to resolve build failures on 32-bit platforms.
Fixes: 3f7ae5f3dc52 ("net: sched: pie: add more cases to auto-tune alpha and beta")
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Mon, 25 Feb 2019 02:43:06 +0000 (10:43 +0800)]
net: Use RCU_POINTER_INITIALIZER() to init static variable
This pointer is RCU protected, so proper primitives should be used.
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Feb 2019 21:16:03 +0000 (13:16 -0800)]
Merge branch 'tcp-cleanups'
Eric Dumazet says:
====================
tcp: cleanups for linux-5.1
This small patch series cleans up a few things and adds a small
timewait optimization for hosts not using md5.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 26 Feb 2019 17:49:13 +0000 (09:49 -0800)]
tcp: remove tcp_queue argument from tso_fragment()
tso_fragment() is only called for packets still in write queue.
Remove the tcp_queue parameter to make this more obvious,
even if the comment clearly states this.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 26 Feb 2019 17:49:12 +0000 (09:49 -0800)]
tcp: use tcp_md5_needed for timewait sockets
This might speedup tcp_twsk_destructor() a bit,
avoiding a cache line miss.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 26 Feb 2019 17:49:11 +0000 (09:49 -0800)]
tcp: convert tcp_md5_needed to static_branch API
We prefer static_branch_unlikely() over static_key_false() these days.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 26 Feb 2019 17:49:10 +0000 (09:49 -0800)]
tcp: get rid of __tcp_add_write_queue_tail()
This helper is only used from tcp_add_write_queue_tail(), and does
not make the code more readable.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 26 Feb 2019 17:49:09 +0000 (09:49 -0800)]
tcp: get rid of tcp_check_send_head()
This helper is used only once, and its name is no longer relevant.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Tue, 26 Feb 2019 15:37:09 +0000 (17:37 +0200)]
tc-testing: gitignore, ignore local tdc config file
Comment in tdc_config.py recommends putting customizations in
tdc_config_local.py file that wasn't included in gitignore. Add the local
config file to gitignore.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Tue, 26 Feb 2019 15:34:40 +0000 (17:34 +0200)]
net: sched: fix typo in walker_check_empty()
Function walker_check_empty() incorrectly verifies that tp pointer is not
NULL, instead of actual filter pointer. Fix conditional to check the right
pointer. Adjust filter pointer naming accordingly to other cls API
functions.
Fixes: 6676d5e416ee ("net: sched: set dedicated tcf_walker flag when tp is empty")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leslie Monis [Tue, 26 Feb 2019 10:23:31 +0000 (15:53 +0530)]
net: sched: pie: fix mistake in reference link
Fix the incorrect reference link to RFC 8033
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Mon, 25 Feb 2019 02:03:28 +0000 (02:03 +0000)]
mlxsw: spectrum: remove set but not used variable 'autoneg_status'
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function 'mlxsw_sp_port_get_link_ksettings':
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3062:5: warning:
variable 'autoneg_status' set but not used [-Wunused-but-set-variable]
It's not used since commit 475b33cb66c9 ("mlxsw: spectrum: Remove unsupported
eth_proto_lp_advertise field in PTYS")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Feb 2019 16:54:37 +0000 (08:54 -0800)]
Merge branch 'vxlan-create-and-changelink-extack-support'
Roopa Prabhu says:
====================
vxlan: create and changelink extack support
This series adds extack support to changelink paths.
In the process re-factors flag sets to a separate helper.
Also adds some changelink testcases to rtnetlink.sh
(This series was initially part of another series that
tried to support changelink for more attributes.
But after some feedback from Sabrina, I have dropped the
'support changelink for more attributes' part because some
of them cannot be supported today or may require additional
use-case handling code. These can be done separately
as and when we see the need for it.)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Tue, 26 Feb 2019 06:03:02 +0000 (22:03 -0800)]
tools: selftests: rtnetlink: add testcases for vxlan flag sets
This patch extends rtnetlink.sh to cover some vxlan flag
netlink attribute sets.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Tue, 26 Feb 2019 06:03:01 +0000 (22:03 -0800)]
vxlan: add extack support for create and changelink
This patch adds extack coverage in vxlan link
create and changelink paths. Introduces a new helper
vxlan_nl2flags to consolidate flag attribute validation.
Thanks to Johannes Berg for some tips to construct the
generic vxlan flag extack strings.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Feb 2019 16:49:06 +0000 (08:49 -0800)]
Merge branch 'devlink-make-ethtool-compat-reliable'
Jakub Kicinski says:
====================
devlink: make ethtool compat reliable
This is a follow up to the series which added device flash
updates via devlink. I went with the approach of adding a
new NDO in the end. It seems to end up looking cleaner.
First patch removes the option to build devlink as a module.
Users can still decide to not build it, but the module option
ends up not being worth the maintenance cost.
Next two patches add a NDO which can be used to ask the driver
to return a devlink instance associated with a given netdev,
instead of iterating over devlink ports. Drivers which implement
this NDO must take into account the potential impact on the
visibility of the devlink instance.
With the new NDO in place we can remove NFP ethtool flash update
code.
Fifth patch makes sure we hold a reference to dev while
callbacks are active.
Last but not least, the NULL-check of devlink->ops is moved
to instance allocation time.
There is currently no driver registering devlink without ops,
so can just fix this in -next.
v2 (Michal): add netdev_to_devlink() in patch 3.
v3 (Florian):
- add missing checks for devlink->ops;
- move locking/holding into devlink_compat_ functions.
v4 (Jiri):
- hold devlink_mutex around callbacks (patch 2);
- require non-NULL ops (patch 6).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 26 Feb 2019 03:34:07 +0000 (19:34 -0800)]
devlink: require non-NULL ops for devlink instances
Commit
76726ccb7f46 ("devlink: add flash update command") and
commit
2d8dc5bbf4e7 ("devlink: Add support for reload")
access devlink ops without NULL-checking. There is, however, no
driver which would pass in NULL ops, so let's just make that
a requirement. Remove the now unnecessary NULL-checking.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 26 Feb 2019 03:34:06 +0000 (19:34 -0800)]
devlink: hold a reference to the netdevice around ethtool compat
When ethtool is calling into devlink compat code make sure we have
a reference on the netdevice on which the operation was invoked.
v3: move the hold/lock logic into devlink_compat_* functions (Florian)
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 26 Feb 2019 03:34:05 +0000 (19:34 -0800)]
nfp: remove ethtool flashing fallback
Now that devlink fallback will be called reliably, we can remove
the ethtool flashing code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 26 Feb 2019 03:34:04 +0000 (19:34 -0800)]
nfp: add .ndo_get_devlink
Support getting devlink instance from a new NDO.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 26 Feb 2019 03:34:03 +0000 (19:34 -0800)]
devlink: create a special NDO for getting the devlink instance
Instead of iterating over all devlink ports add a NDO which
will return the devlink instance from the driver.
v2: add the netdev_to_devlink() helper (Michal)
v3: check that devlink has ops (Florian)
v4: hold devlink_mutex (Jiri)
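The lookup described above can be sketched in plain userspace C. This is only
an illustration under assumptions: every type and function name ending in
_sketch, and drv_get_devlink(), are hypothetical stand-ins and not the kernel's
definitions. It shows an optional callback that returns the instance directly,
with NULL when the driver does not implement it.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel structures. */
struct devlink_sketch { int id; };
struct netdev_sketch;

struct netdev_ops_sketch {
	/* Optional: drivers without devlink support leave this NULL. */
	struct devlink_sketch *(*ndo_get_devlink)(struct netdev_sketch *dev);
};

struct netdev_sketch {
	const struct netdev_ops_sketch *ops;
	struct devlink_sketch *priv_devlink; /* driver-private instance */
};

/* Ask the driver directly instead of searching all devlink ports. */
static struct devlink_sketch *netdev_sketch_to_devlink(struct netdev_sketch *dev)
{
	if (dev->ops && dev->ops->ndo_get_devlink)
		return dev->ops->ndo_get_devlink(dev);
	return NULL;
}

/* Example driver callback: hand back the instance kept in private data. */
static struct devlink_sketch *drv_get_devlink(struct netdev_sketch *dev)
{
	return dev->priv_devlink;
}
```

A caller only has to handle the NULL result; it never needs to know how the
driver stores its instance.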
Suggested-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 26 Feb 2019 03:34:02 +0000 (19:34 -0800)]
net: devlink: turn devlink into a built-in
Being able to build devlink as a module causes growing pains.
First all drivers had to add a meta dependency to make sure
they are not built in when devlink is built as a module. Now
we are struggling to invoke ethtool compat code reliably.
Make devlink code built-in, users can still not build it at
all but the dynamically loadable module option is removed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Oskolkov [Tue, 26 Feb 2019 01:43:46 +0000 (17:43 -0800)]
net: remove unused struct inet_frag_queue.fragments field
Now that all users of struct inet_frag_queue have been converted
to use 'rb_fragments', remove the unused 'fragments' field.
Build with `make allyesconfig` succeeded. ip_defrag selftest passed.
Signed-off-by: Peter Oskolkov <posk@google.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 25 Feb 2019 15:06:24 +0000 (23:06 +0800)]
net: wan: z85230: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in z8530_tx_done() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 25 Feb 2019 15:05:41 +0000 (23:05 +0800)]
net: wan: cosa: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in cosa_net_tx_done() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 25 Feb 2019 15:03:40 +0000 (23:03 +0800)]
net: wan: sbni: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in send_complete() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 25 Feb 2019 15:02:57 +0000 (23:02 +0800)]
net: wan: ixp4xx_hss: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in hss_hdlc_txdone_irq() when
skb xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 25 Feb 2019 15:01:50 +0000 (23:01 +0800)]
net: wan: wanxl: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in wanxl_tx_intr() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 25 Feb 2019 14:57:40 +0000 (22:57 +0800)]
net: lmc: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in lmc_interrupt() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Delete a redundant comment line in lmc_interrupt().
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Feb 2019 22:21:03 +0000 (14:21 -0800)]
Merge branch 'pie-next'
Leslie Monis says:
====================
net: sched: pie: align PIE implementation with RFC 8033
The current implementation of the PIE queuing discipline is according to the
IETF draft [http://tools.ietf.org/html/draft-pan-aqm-pie-00] and the paper
[PIE: A Lightweight Control Scheme to Address the Bufferbloat Problem].
However, a lot of necessary modifications and enhancements have been proposed
in RFC 8033, which have not yet been incorporated in the source code of Linux.
This patch series helps in achieving the same.
Performance tests carried out using Flent [https://flent.org/]
Changes from v2 to v3:
- Used div_u64() instead of direct division after explicit type casting as
recommended by David
Changes from v1 to v2:
- Excluded the patch setting PIE dynamically active/inactive as the test
results were unsatisfactory
- Fixed a scaling issue when adding more auto-tuning cases which caused
local variables to underflow
- Changed the long if/else chain to a loop as suggested by Stephen
- Changed the position of the accu_prob variable in the pie_vars
structure as recommended by Stephen
====================
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mohit P. Tahiliani [Mon, 25 Feb 2019 19:10:01 +0000 (00:40 +0530)]
net: sched: pie: update references
RFC 8033 replaces the IETF draft for PIE
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mohit P. Tahiliani [Mon, 25 Feb 2019 19:10:00 +0000 (00:40 +0530)]
net: sched: pie: add derandomization mechanism
Random dropping of packets to achieve latency control may
introduce outlier situations where packets are dropped too
close to each other or too far from each other. This can
cause the real drop percentage to temporarily deviate from
the intended drop probability. In certain scenarios, such
as a small number of simultaneous TCP flows, these
deviations can cause significant deviations in link
utilization and queuing latency.
RFC 8033 suggests using a derandomization mechanism to avoid
these deviations.
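A minimal sketch of the derandomization idea, assuming the example thresholds
from RFC 8033 (0.85 and 8.5) and floating-point arithmetic for readability. The
kernel works with scaled integers, and struct pie_sketch and pie_sketch_drop()
are hypothetical names, so this is not the code added by this patch.

```c
#include <stdbool.h>

/* Hypothetical state; only accu_prob mirrors a name used in this series. */
struct pie_sketch {
	double prob;      /* current drop probability, 0.0 .. 1.0 */
	double accu_prob; /* probability accumulated since the last drop */
};

/*
 * Decide whether to drop one packet. 'rnd' is a caller-supplied uniform
 * random number in [0, 1), which keeps the sketch deterministic to test.
 */
static bool pie_sketch_drop(struct pie_sketch *v, double rnd)
{
	bool drop;

	v->accu_prob += v->prob;

	if (v->accu_prob < 0.85)
		drop = false;         /* too soon after the previous drop */
	else if (v->accu_prob >= 8.5)
		drop = true;          /* too long without any drop */
	else
		drop = rnd < v->prob; /* ordinary random decision */

	if (drop)
		v->accu_prob = 0.0;
	return drop;
}
```

The two thresholds bound how close together and how far apart drops can be,
which is what keeps the observed drop rate near the intended probability.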
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mohit P. Tahiliani [Mon, 25 Feb 2019 19:09:59 +0000 (00:39 +0530)]
net: sched: pie: add more cases to auto-tune alpha and beta
The current implementation scales the local alpha and beta
variables in the calculate_probability function by the same
amount for all values of drop probability below 1%.
RFC 8033 suggests using additional cases for auto-tuning
alpha and beta when the drop probability is less than 1%.
In order to add more auto-tuning cases, MAX_PROB must be
scaled by u64 instead of u32 to prevent underflow when
scaling the local alpha and beta variables in the
calculate_probability function.
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mohit P. Tahiliani [Mon, 25 Feb 2019 19:09:58 +0000 (00:39 +0530)]
net: sched: pie: change initial value of pie_vars->burst_time
RFC 8033 suggests an initial value of 150 milliseconds for
the maximum time allowed for a burst of packets.
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mohit P. Tahiliani [Mon, 25 Feb 2019 19:09:57 +0000 (00:39 +0530)]
net: sched: pie: change default value of pie_params->tupdate
RFC 8033 suggests a default value of 15 milliseconds for the
update interval.
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mohit P. Tahiliani [Mon, 25 Feb 2019 19:09:56 +0000 (00:39 +0530)]
net: sched: pie: change default value of pie_params->target
RFC 8033 suggests a default value of 15 milliseconds for the
target queue delay.
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mohit P. Tahiliani [Mon, 25 Feb 2019 19:09:55 +0000 (00:39 +0530)]
net: sched: pie: change value of QUEUE_THRESHOLD
RFC 8033 recommends a value of 16384 bytes for the queue
threshold.
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Mon, 25 Feb 2019 19:01:32 +0000 (13:01 -0600)]
mlxsw: spectrum: acl: Use struct_size() in kzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
    struct foo {
        int stuff;
        struct boo entry[];
    };
    size = sizeof(struct foo) + count * sizeof(struct boo);
    instance = kzalloc(size, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
    instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
Notice that, in this case, variable alloc_size is not necessary, hence
it is removed.
This code was detected with the help of Coccinelle.
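For readers outside the kernel, the size calculation can be sketched in
userspace C. This is an illustration only: foo_size() and foo_alloc() are
hypothetical helpers, struct boo is given a made-up member, and calloc() stands
in for kzalloc(). The point shown is that the helper saturates instead of
wrapping when count is too large.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct boo { int value; }; /* made-up element type */

struct foo {
	int stuff;
	struct boo entry[];
};

/* Header plus 'count' trailing elements; SIZE_MAX if that would overflow. */
static size_t foo_size(size_t count)
{
	if (count > (SIZE_MAX - sizeof(struct foo)) / sizeof(struct boo))
		return SIZE_MAX;
	return sizeof(struct foo) + count * sizeof(struct boo);
}

/* Zeroed allocation, refusing sizes that overflowed. */
static struct foo *foo_alloc(size_t count)
{
	size_t size = foo_size(count);

	if (size == SIZE_MAX)
		return NULL;
	return calloc(1, size);
}
```

An open-coded sizeof(struct foo) + count * sizeof(struct boo) would silently
wrap for a huge count and allocate too little memory.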
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Feb 2019 22:16:22 +0000 (14:16 -0800)]
Merge branch 'aquantia-hwmon'
Heiner Kallweit says:
====================
net: phy: aquantia: add hwmon support
This series adds HWMON support for the temperature sensor and the
related alarms on the 107/108/109 chips.
v2:
- remove struct aqr_priv
- rename header file to aquantia.h
v3:
- add conditional compiling of aquantia_hwmon.c
- improve converting sensor register values to/from long
- add helper aqr_hwmon_test_bit
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 25 Feb 2019 18:56:38 +0000 (19:56 +0100)]
net: phy: aquantia: add hwmon support
This adds HWMON support for the temperature sensor and the related
alarms on the 107/108/109 chips. This patch is based on work from
Nikita and Andrew. I added:
- support for changing alarm thresholds via sysfs
- move HWMON code to a separate source file to improve maintainability
- smaller changes like using IS_REACHABLE instead of ifdef
(avoids problems if PHY driver is built in and HWMON is a module)
v2:
- remove struct aqr_priv
- rename header file to aquantia.h
v3:
- add conditional compiling of aquantia_hwmon.c
- improve converting sensor register values to/from long
- add helper aqr_hwmon_test_bit
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 25 Feb 2019 18:53:04 +0000 (19:53 +0100)]
net: phy: aquantia: rename aquantia.c to aquantia_main.c
Rename aquantia.c to aquantia_main.c to be prepared for adding new
functionality to separate source code files.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Feb 2019 22:14:24 +0000 (14:14 -0800)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-02-22
This series contains updates to the ice driver only.
Bruce adds the __always_unused attribute to a parameter to avoid
compiler warnings when using -Wunused-parameter. Fixed unnecessary
type-casting and the use of sizeof(). Fixed the allocation of structs
that have become stack hogs by allocating them on the heap and fixing
all the associated references. Fixed the "possible" numeric overflow issues
that were caught with static analysis.
Maciej fixes the maximum MTU calculation by taking into account double
VLAN tagging and ensures that the operations are done in the correct
order.
Victor fixes the supported node calculation, where we were not taking
into account that if there is space to add the new VSI or intermediate
node above a given layer, it is not required to continue the calculation.
Added a check for a leaf node presence for a given VSI, which is needed
before removing a VSI.
Jake fixes an issue where the VSI list is shared, so simply removing a
VSI from the list will cause issues for the other users who reference
the list. Since we also free the memory, this could lead to
segmentation faults.
Brett fixes an issue where driver unload could cause a system reboot
when intel_iommu=on parameter is set. The issue is that we are not
clearing the CAUSE_ENA bit for the appropriate control queues register
when freeing the miscellaneous interrupt vector.
Mitch is so kind, he prevented spamming the VF with link messages when
the link status really has not changed. Updates the driver to use the
absolute vector ID and not the per-PF vector ID for the VF MSIx vector
allocation.
Lukasz fixes the ethtool pause parameter for the ice driver, which was
originally based off the link status but is now based off the PHY
configuration. This is to resolve an issue where pause parameters could
be set while link was down.
Jesse updates the string that reports statistics so the string does not
get modified at runtime and cause reports of string truncation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Mon, 25 Feb 2019 15:45:44 +0000 (17:45 +0200)]
net: sched: don't release block->lock when dumping chains
Function tc_dump_chain() obtains and releases block->lock on each iteration
of its inner loop that dumps all chains on block. Outputting chain template
info is a fast operation, so locking/unlocking the mutex multiple times is
an overhead when the lock is highly contended. Modify tc_dump_chain() to only
obtain block->lock once and dump all chains without releasing it.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Mon, 25 Feb 2019 15:38:31 +0000 (17:38 +0200)]
net: sched: set dedicated tcf_walker flag when tp is empty
Using tcf_walker->stop flag to determine when tcf_walker->fn() was called
at least once is unreliable. Some classifiers set 'stop' flag on error
before calling walker callback, other classifiers used to call it with NULL
filter pointer when empty. In order to prevent further regressions, extend
tcf_walker structure with dedicated 'nonempty' flag. Set this flag in
tcf_walker->fn() implementation that is used to check if classifier has
filters configured.
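The difference between the two flags can be sketched as follows. This is a
userspace illustration with hypothetical names (struct walker_sketch,
mark_nonempty(), classifier_is_empty()); it is not the tcf_walker code itself.
It shows why 'stop' alone cannot answer "did the callback see a filter".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical walker; only the 'stop' and 'nonempty' names follow the text. */
struct walker_sketch {
	bool stop;     /* ends the walk early, also set on errors */
	bool nonempty; /* set only when the callback saw a real filter */
	int (*fn)(void *filter, struct walker_sketch *w);
};

/* Callback used for the emptiness check: one real filter is enough. */
static int mark_nonempty(void *filter, struct walker_sketch *w)
{
	if (filter)
		w->nonempty = true;
	return -1; /* ask the walker to stop */
}

static bool classifier_is_empty(void **filters, size_t n)
{
	struct walker_sketch w = { .fn = mark_nonempty };
	size_t i;

	for (i = 0; i < n && !w.stop; i++)
		if (w.fn(filters[i], &w) < 0)
			w.stop = true;

	/* 'stop' may be set for other reasons, so only trust 'nonempty'. */
	return !w.nonempty;
}
```

When the callback is invoked with a NULL filter, 'stop' still ends up set, yet
the classifier is correctly reported as empty.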
Fixes:
8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Mon, 25 Feb 2019 11:39:55 +0000 (12:39 +0100)]
net: dsa: mv88e6xxx: Fix phylink_validate for Topaz family
The Topaz family should have different phylink_validate method from the
Peridot, since on Topaz the port supporting 2500BaseX mode is port 5,
not 9 and 10.
Signed-off-by: Marek Behún <marek.behun@nic.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Mon, 25 Feb 2019 11:39:54 +0000 (12:39 +0100)]
net: dsa: mv88e6xxx: Default CMODE to 1000BaseX only on 6390X
Commit
787799a9d555 sets the SERDES interfaces of 6390 and 6390X to
1000BaseX, but this is only needed on 6390X, since there are SERDES
interfaces which can be used on lower ports on 6390.
This commit fixes this by returning to previous behaviour on 6390.
(Previous behaviour means that CMODE is not set at all if requested mode
is NA).
This is needed on Turris MOX, where the 88e6190 is connected to CPU in
2500BaseX mode.
Fixes:
787799a9d555 ("net: dsa: mv88e6xxx: Default ports 9/10 6390X CMODE to 1000BaseX")
Signed-off-by: Marek Behún <marek.behun@nic.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yafang Shao [Mon, 25 Feb 2019 10:33:48 +0000 (18:33 +0800)]
tcp: clean up SOCK_DEBUG()
Per discussion with Daniel[1] and Eric[2], these SOCK_DEBUG() calls in
TCP are not needed now.
We'd better clean them up.
[1] https://patchwork.ozlabs.org/patch/1035573/
[2] https://patchwork.ozlabs.org/patch/1040533/
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Mon, 25 Feb 2019 09:42:33 +0000 (18:42 +0900)]
tcp: remove unused parameter of tcp_sacktag_bsearch()
The 'state' parameter of tcp_sacktag_bsearch() is not used.
So, it can be removed.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesse Brandeburg [Fri, 8 Feb 2019 20:50:43 +0000 (12:50 -0800)]
ice: fix overlong string, update stats output
A test started warning on a string truncation. This led to an unfortunate
realization that we are likely not accounting for the stats length
correctly before this patch, so fix the issue by putting "port." in front
of all the PF stats, instead of magically prepending it at runtime.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Lukasz Czapnik [Fri, 8 Feb 2019 20:50:42 +0000 (12:50 -0800)]
ice: Fix for FC get rx/tx pause params
Ethtool reported pause params based on the currently negotiated
link settings instead of current PHY config. User was not able
to turn off pause params because ethtool was incorrectly reporting
parameters as off when link was down even though PHY was configured
to support pause frames. Now pause params are taken from PHY config
instead of link status.
Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Fri, 8 Feb 2019 20:50:41 +0000 (12:50 -0800)]
ice: use absolute vector ID for VFs
When the PF driver sets up the VF MSI-X vector allocation, it needs to
use the hardware absolute vector ID, not the per-PF vector ID. Without
this change we see (apparent) TX hangs when using VFs on multiple PFs.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Victor Raj [Fri, 8 Feb 2019 20:50:40 +0000 (12:50 -0800)]
ice: check for a leaf node presence
Check for a leaf node presence for a given VSI. This check is required
before removing a VSI since VSIs can't be removed with enabled queues
(with leaf nodes) from the FW scheduler tree unless it's a reset.
Signed-off-by: Victor Raj <victor.raj@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Victor Raj [Fri, 8 Feb 2019 20:50:39 +0000 (12:50 -0800)]
ice: flush Tx pipe on disable queue timeout
Set the flush Tx pipe flag instead of getting an EAGAIN error when FW
times out in processing the disable Tx queue command.
Signed-off-by: Victor Raj <victor.raj@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Fri, 8 Feb 2019 20:50:38 +0000 (12:50 -0800)]
ice: clear VF ARQLEN register on reset
On older devices like X710 and X722, the VF's ARQLEN register is cleared
on reset, so the VF driver uses that register to detect an unannounced
reset. Unfortunately, on devices controlled by ice, this register is NOT
cleared on reset. This causes the VF to miss resets, and even on
properly-announced resets, the VF driver complains that it didn't see
the reset.
To fix this, we'll do it in software. When we handle a VF reset (whether
triggered by software or VFLR), clear this register after the HW reset
is complete.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Fri, 8 Feb 2019 20:50:37 +0000 (12:50 -0800)]
ice: don't spam VFs with link messages
Don't send a link message to the VFs unless link actually changes state.
This avoids a small timing hole in some VF drivers that can cause an
apparent TX hang if they receive a link status message at the wrong time.
Although we have fixed the timing hole in the current VF driver, there
are still lots of drivers in the field that have this timing hole. Let's
not fall into it if we can avoid it.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Fri, 8 Feb 2019 20:50:36 +0000 (12:50 -0800)]
ice: only use the VF for ICE_VSI_VF in ice_vsi_release
In ice_vsi_release we are always assigning a value to the local VF
variable. Change this to only be assigned if the VSI is a VF VSI.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 8 Feb 2019 20:50:35 +0000 (12:50 -0800)]
ice: fix numeric overflow warning
When compiling and analyzing the driver on newer kernels, a static
analyzer warns about the following "numeric overflow" issues:
"The result of expression: 'budget-1' generates 4-byte type while casting
to a bigger size of 8-byte".
"The result of expression: '*words-words_read' generates 4-byte type
while casting to a bigger size of 8-byte".
Fix them both.
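What this class of warning is about can be shown with a small sketch. It
assumes a platform where int is 32 bits wide, uses hypothetical function names,
and is not the change made in the driver. It only shows that the subtraction is
evaluated in the narrow type unless one operand is widened first.

```c
#include <stdint.h>

/* The subtraction is done in 32 bits; only the result is widened. */
static uint64_t narrow_then_widen(uint32_t a, uint32_t b)
{
	return a - b;
}

/* Widening one operand first makes the whole expression 64-bit. */
static uint64_t widen_then_subtract(uint32_t a, uint32_t b)
{
	return (uint64_t)a - b;
}
```

Both forms agree whenever a >= b. They differ only when the 32-bit result
wraps, which is the case the analyzer cannot rule out on its own.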
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Fri, 8 Feb 2019 20:50:34 +0000 (12:50 -0800)]
ice: fix issue where host reboots on unload when iommu=on
Currently if the kernel has the intel_iommu=on parameter set, on some
platforms removing the driver causes a system reboot. In initialization
we associate the control queue interrupts with the pf->hw_oicr_idx and
enable the interrupts by setting the CAUSE_ENA bit. The problem comes
on teardown because we are not clearing the CAUSE_ENA bit for the
control queues, but the vector at pf->hw_oicr_idx (miscellaneous
interrupt vector) gets disabled.
Fix this by clearing the CAUSE_ENA bit in the appropriate control queue
registers when freeing the miscellaneous interrupt vector. Also,
move the call to ice_free_irq_msix_misc() to after ice_deinit_sw() in
ice_remove() because ice_deinit_sw() makes an AQ call, but
ice_free_irq_msix_misc() disables the miscellaneous vector and its
associated interrupts.
Also, create two small helper functions to enable and disable the
control queue interrupts respectively.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 8 Feb 2019 20:50:33 +0000 (12:50 -0800)]
ice: fix ice_remove_rule_internal vsi_list handling
When adding multiple VLANs to the same VSI, the ice_add_vlan code will
share the VSI list, so as not to create multiple unnecessary VSI lists.
Consider the following flow
ice_add_vlan(hw, <VSI 0 VID 7, VSI 0 VID 8, VSI 0 VID 9>)
Where we add three VLAN filters for VIDs 7, 8, and 9, all for VSI 0.
The ice_add_vlan will create a single vsi_list and share it among all
the filters.
Later, if we try to remove a VLAN,
ice_remove_vlan(hw, <VSI 0 VID 7>)
Then the removal code will update the vsi_list and remove VSI 0 from it.
But, since the vsi_list is shared, this breaks the list for the other
users who reference it. We even free the VSI list memory, which
may result in segmentation faults.
This is due to the way that VLAN rules share VSI lists with reference
counts, and is caused because we call ice_rem_update_vsi_list even when
the ref_cnt is greater than one.
To fix this, handle the case where ref_cnt is greater than one
separately. In this case, we need to remove the associated rule without
modifying the vsi_list, since it is currently being referenced by
another rule. Instead, we just need to decrement the VSI list ref_cnt.
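The fixed removal path can be sketched in userspace C. The structures and the
rule_sketch_remove() helper are hypothetical simplifications; the real code
also updates hardware rules, which is omitted here. Only the ref_cnt decision
described above is shown.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a shared VSI list and a rule that uses it. */
struct vsi_list_sketch {
	unsigned int ref_cnt; /* number of rules sharing this list */
};

struct rule_sketch {
	struct vsi_list_sketch *list;
};

/*
 * Detach a rule from its list. The list is modified and freed only by
 * its last user. Returns true if the list itself was freed.
 */
static bool rule_sketch_remove(struct rule_sketch *rule)
{
	struct vsi_list_sketch *list = rule->list;

	rule->list = NULL;
	if (!list)
		return false;

	if (list->ref_cnt > 1) {
		/* Still shared: leave the list alone, drop one reference. */
		list->ref_cnt--;
		return false;
	}

	free(list);
	return true;
}
```

With three rules for VIDs 7, 8 and 9 sharing one list, removing the first two
only decrements ref_cnt and the list stays valid for the remaining rule.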
The case for handling sharing of VSI lists with multiple VSIs is not
currently supported by this code. No such rules will be created today,
and this code will require changes if/when such code is added.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 8 Feb 2019 20:50:32 +0000 (12:50 -0800)]
ice: fix stack hogs from struct ice_vsi_ctx structures
struct ice_vsi_ctx has gotten large enough that function local declarations
of it on the stack are causing stack hogs. Fix that by allocating the
structs on the heap. Clean up some formatting issues in the code around these
changes and fix incorrect data type uses of returned functions in a couple
places.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 8 Feb 2019 20:50:31 +0000 (12:50 -0800)]
ice: sizeof(<type>) should be avoided
With sizeof(), it is preferable to pass the variable of type <type> rather
than the type itself, i.e. sizeof(var) instead of sizeof(<type>).
There are multiple places where a temporary variable is used to hold a
'size' value which is then used for a subsequent alloc/memset. Get rid
of the temporary variable by calculating size as part of the alloc/memset
statement.
Also remove unnecessary type-cast.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
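The sizeof() convention from the commit above, together with the heap allocation pattern from the preceding stack-hog fix, might look like the following user-space sketch. struct demo_ctx and both helpers are hypothetical names chosen for illustration; calloc() and memset() stand in for the kernel allocators.
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context structure; the real driver structs are much larger. */
struct demo_ctx {
	unsigned int id;
	unsigned char data[64];
};

/* Allocate a zeroed context on the heap; the size follows the pointer. */
static struct demo_ctx *demo_ctx_alloc(unsigned int id)
{
	/* sizeof(*ctx) stays correct even if the type of ctx changes later */
	struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->id = id;
	return ctx;
}

/* Reset a context in place, without a temporary 'size' variable. */
static void demo_ctx_reset(struct demo_ctx *ctx)
{
	assert(ctx != NULL);
	memset(ctx, 0, sizeof(*ctx));
}
```
The size is computed inside the alloc/memset call itself, so there is no separate variable that could drift out of sync with the object's type.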
Victor Raj [Fri, 8 Feb 2019 20:50:30 +0000 (12:50 -0800)]
ice: Fix added in VSI supported nodes calc
VSI supported nodes are calculated in order to add the VSI parent or
intermediate nodes to the scheduler tree. If one of the nodes in the layers
below (from the VSI layer) has space to add the new VSI or intermediate node
above that layer, then it is not required to continue the calculation further
for the layers below.
Signed-off-by: Victor Raj <victor.raj@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Maciej Fijalkowski [Fri, 8 Feb 2019 20:50:29 +0000 (12:50 -0800)]
ice: Fix the calculation of ICE_MAX_MTU
Currently ICE_MAX_MTU subtracts only ETH_HLEN from max frame size and
adds ETH_FCS_LEN and VLAN_HLEN, which is not what was intended.
The ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN expression should be surrounded
with parentheses.
Wrap mentioned expression and take into account VLAN double tagging.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
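The effect of the missing parentheses in the ICE_MAX_MTU commit above can be checked with a little arithmetic. In this sketch ETH_HLEN, ETH_FCS_LEN and VLAN_HLEN carry their usual kernel values, while DEMO_FRAME_SIZE_MAX and both DEMO_MAX_MTU_* macros are assumptions made for illustration and are not taken from the driver.
```c
#include <assert.h>

/* Standard Ethernet constants, with the values used by the kernel headers. */
#define ETH_HLEN	14	/* Ethernet header */
#define ETH_FCS_LEN	4	/* frame check sequence */
#define VLAN_HLEN	4	/* one VLAN tag */

/* Assumed maximum frame size, for illustration only. */
#define DEMO_FRAME_SIZE_MAX	9728

/* Buggy form: only ETH_HLEN is subtracted, the other two are added back. */
#define DEMO_MAX_MTU_BUGGY \
	(DEMO_FRAME_SIZE_MAX - ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN)

/* Fixed form: the whole overhead is subtracted, with room for two VLAN tags. */
#define DEMO_MAX_MTU_FIXED \
	(DEMO_FRAME_SIZE_MAX - (ETH_HLEN + ETH_FCS_LEN + (VLAN_HLEN * 2)))

/* The fixed MTU must be smaller than the one the buggy expression yields. */
static_assert(DEMO_MAX_MTU_FIXED < DEMO_MAX_MTU_BUGGY,
	      "overhead must be subtracted as a whole");
```
With the assumed frame size, the buggy expression evaluates to 9728 - 14 + 4 + 4 = 9722, while the fixed one gives 9728 - 26 = 9702.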
Bruce Allan [Fri, 8 Feb 2019 20:50:28 +0000 (12:50 -0800)]
ice: Mark extack argument as __always_unused
Commit 87b0984ebfab ("net: Add extack argument to ndo_fdb_add()") in
net-next added an extended parameter to the .ndo_fdb_add op and changed
ice_fdb_add() accordingly. Update the function header and add the
__always_unused attribute to the new parameter to avoid -Wunused-parameter
warnings.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
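Marking a parameter as unused, as the extack commit above does, can be sketched in user space as follows. The macro definition mirrors how the kernel defines __always_unused for GCC and Clang; struct demo_extack and demo_fdb_add() are hypothetical stand-ins, and the VLAN ID check is invented only to give the function a body.
```c
#include <assert.h>
#include <stddef.h>

/* Repeated here so the sketch builds outside the kernel tree. */
#ifndef __always_unused
#define __always_unused __attribute__((__unused__))
#endif

struct demo_extack;	/* hypothetical opaque type for the extra argument */

/*
 * The callback signature gained a parameter that this implementation does
 * not use. The attribute keeps -Wunused-parameter quiet.
 */
static int demo_fdb_add(const unsigned char *addr, unsigned short vid,
			struct demo_extack __always_unused *extack)
{
	assert(addr != NULL);
	return vid < 4096 ? 0 : -1;	/* illustrative validity check only */
}
```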
Florian Fainelli [Mon, 25 Feb 2019 02:39:02 +0000 (18:39 -0800)]
switchdev: Complete removal of switchdev_port_attr_get()
We have no more in tree users of switchdev_port_attr_get() after
d0e698d57a94 ("Merge branch 'net-Get-rid-of-switchdev_port_attr_get'")
so completely remove the function signature and body.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 24 Feb 2019 19:44:43 +0000 (20:44 +0100)]
dsa: Remove phydev parameter from disable_port call
No current DSA driver makes use of the phydev parameter passed to the
disable_port call. Remove it.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 24 Feb 2019 17:01:18 +0000 (18:01 +0100)]
net: phy: fix reading fixed phy status
With the switch to phy_resolve_aneg_linkmode() we don't read from the
chip any longer what is advertised but use phydev->advertising directly.
For a fixed phy however this bitmap is empty so far, which results in
no common mode being found. This breaks DSA. Fix this by advertising
everything that is supported. For a normal phy this is done by phy_probe().
Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 24 Feb 2019 16:41:47 +0000 (17:41 +0100)]
net: phy: improve auto-neg emulation in swphy
Auto-neg emulation currently doesn't set bit BMCR_ANENABLE in BMCR;
add this. Users will ignore speed and duplex settings in BMCR because
we're emulating auto-neg, therefore we can remove related code.
See also following discussion [0].
[0] https://marc.info/?t=155041784900002&r=1&w=2
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
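A minimal sketch of the emulated BMCR read described in the swphy commit above follows. MII_BMCR and BMCR_ANENABLE carry their standard MII values; demo_swphy_read_reg() is a hypothetical function, and returning 0xffff for registers not modelled here is an assumption of this sketch.
```c
#include <assert.h>
#include <stdint.h>

#define MII_BMCR	0x00	/* basic mode control register */
#define BMCR_ANENABLE	0x1000	/* auto-negotiation enabled */

/*
 * Emulated register read. Only BMCR is modelled: it reports auto-neg as
 * enabled and carries no speed or duplex bits, since users resolve those
 * from the negotiation result instead.
 */
static uint16_t demo_swphy_read_reg(int reg)
{
	assert(reg >= 0 && reg < 32);

	switch (reg) {
	case MII_BMCR:
		return BMCR_ANENABLE;
	default:
		return 0xffff;	/* assumption: register not modelled */
	}
}
```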
David S. Miller [Mon, 25 Feb 2019 06:27:19 +0000 (22:27 -0800)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:
====================
Here's the main bluetooth-next pull request for the 5.1 kernel.
- Fixes & improvements to mediatek, hci_qca, btrtl, and btmrvl HCI drivers
- Fixes to parsing invalid L2CAP config option sizes
- Locking fix to bt_accept_enqueue()
- Add support for new Marvell sd8977 chipset
- Various other smaller fixes & cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Oskolkov [Sun, 24 Feb 2019 02:25:01 +0000 (18:25 -0800)]
net: fix double-free in bpf_lwt_xmit_reroute
dst_output() frees skb when it fails (see, for example,
ip_finish_output2), so the caller must not free it again in this case.
Fixes: 3bd0b15281af ("bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c")
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
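The ownership rule behind the double-free fix above can be sketched in user-space C. struct demo_buf, demo_output() and demo_reroute() are hypothetical stand-ins for the skb and the kernel functions; the free counter exists only so the behaviour can be checked.
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical packet buffer standing in for an skb. */
struct demo_buf {
	int len;
};

static int demo_frees;	/* counts frees so the behaviour can be verified */

static void demo_buf_free(struct demo_buf *buf)
{
	free(buf);
	demo_frees++;
}

/*
 * Hypothetical output routine that takes ownership of the buffer: it is
 * consumed here on success and freed here on failure.
 */
static int demo_output(struct demo_buf *buf)
{
	int err = buf->len > 0 ? 0 : -1;

	demo_buf_free(buf);
	return err;
}

/* Caller: once the buffer is handed over, never free it again. */
static int demo_reroute(struct demo_buf *buf)
{
	int err;

	assert(buf != NULL);
	err = demo_output(buf);
	if (err)
		return err;	/* buf was already freed by demo_output() */
	return 0;
}
```
Each buffer is freed exactly once on both the success and the failure path, because the error branch in the caller no longer frees it.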
wenxu [Sun, 24 Feb 2019 00:24:45 +0000 (08:24 +0800)]
ip_tunnel: Add ip tunnel tun_info type dst_cache in ip_tunnel_xmit
ip l add dev tun type gretap key 1000
A non-tunnel-dst ip tunnel device can send packets through lwtunnel.
This patch provides tun_info dst cache support for this mode.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Feb 2019 06:21:23 +0000 (22:21 -0800)]
Merge branch 'dsa-mv88e6xxx-lockdep'
Andrew Lunn says:
====================
mv88e6xxx: Avoid false positive Lockdep splats
When acquiring the GPIO interrupt line for the switch, it is possible
to trigger lockdep splats. These are false positives: the mutex is in
a different IRQ descriptor. Fix them anyway, since they could mask
real locking issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 23 Feb 2019 16:43:57 +0000 (17:43 +0100)]
net: dsa: mv88e6xxx: Release lock while requesting IRQ
There is no need to hold the register lock while requesting the GPIO
interrupt. By not holding it we can also avoid a false positive
lockdep splat.
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>