Fujinaka, Todd [Tue, 1 Oct 2013 11:33:55 +0000 (04:33 -0700)]
igb: Add ethtool offline tests for i354
Add the ethtool offline tests for i354 devices.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Tue, 1 Oct 2013 11:33:54 +0000 (04:33 -0700)]
ixgbe: remove marketing names from busy poll code
This patch renames the LL_EXTENDED_STATS and some of the functions required to
implement busy polling in the ixgbe driver, in order to remove the marketing
"low latency" blurb which hides what the code actually does.
This furthers work which was requested by Linus Torvalds when the initial busy
poll code was included in the kernel. The code in the ixgbe driver itself was
never properly renamed to reflect the change to busy polling as the title.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Tue, 1 Oct 2013 11:33:53 +0000 (04:33 -0700)]
ixgbe: Cleanup the use of tabs and spaces
Cleans up the whitespace issues noticed during code review where
a mix of tabs and spaces were used for indentation.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leonardo Potenza [Tue, 1 Oct 2013 11:33:52 +0000 (04:33 -0700)]
ixgbe: ethtool DCB registers dump for 82599 and x540
Added support for DCB registers dump using ethtool -d option both for
82599 and x540 ethernet controllers
Signed-off-by: Leonardo Potenza <leonardo.potenza@intel.com>
Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Tue, 1 Oct 2013 11:33:51 +0000 (04:33 -0700)]
ixgbevf: move API neg to reset path
After this patch the API negotiation will occur in the reset path. So now
the PF will be informed of the API version earlier. This will also require
the mailbox lock to be initialize sooner as well.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Tue, 1 Oct 2013 11:33:50 +0000 (04:33 -0700)]
ixgbevf: add wait for Rx queue disable
New function was added to wait for Rx queues to be disabled before
disabling NAPI. This function also allows us to modify
ixgbevf_rx_desc_queue_enable() to better match ixgbe. I also cleaned up
some msleep calls to usleep_range while I was in this code anyway.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Tue, 1 Oct 2013 11:33:49 +0000 (04:33 -0700)]
ixgbevf: cleanup redundant mailbox read failure check
Since we are already checking for read failure in check_link we don't need
to do it here. Instead just make sure the watchdog task gets scheduled, if
we are up, and it can be done there. This will better follow igbvf method
of handling a mailbox event and message timeout.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Tue, 1 Oct 2013 11:33:48 +0000 (04:33 -0700)]
ixgbevf: do not print registers to dmesg in ixgbevf_get_regs
This patch removes the use of hw_dbg in ixgbevf when the ixgbe_get_regs function
is called from ethtool. This goes along side a patch to ethtool which enables
proper support for ixgbevf pretty-printing of registers.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Tue, 1 Oct 2013 10:30:01 +0000 (16:00 +0530)]
be2net: add a counter for pkts dropped in xmit path
In the xmit path, the driver may drop some pkts due to reasons such as
DMA mapping errors, out of memory conditions or to protect HW from
unrecoverable errors. Add a counter in TX-stats for such drops.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Tue, 1 Oct 2013 10:30:00 +0000 (16:00 +0530)]
be2net: fix adaptive interrupt coalescing
The current EQ delay calculation for AIC is based only on RX packet rate.
This fails to be effective when there's only TX and no RX.
This patch inclues:
- Calculating EQ-delay based on both RX and TX pps.
- Modifying EQ-delay of all EQs via one cmd, instead of issuing a separate
cmd for each EQ.
- A new structure to store interrupt coalescing parameters, in a separate
cache-line from the EQ-obj.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 1 Oct 2013 10:29:59 +0000 (15:59 +0530)]
be2net: call ENABLE_VF cmd for Skyhawk-R too
This cmd needs to be sent to FW when enabling VFs (currently used only
for Lancer.) Also, avoid calling the cmd when driver loads and finds that
VFs are already enabled from a previous load.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 1 Oct 2013 10:29:58 +0000 (15:59 +0530)]
be2net: Create single TXQ on BE3-R 1G ports
On BE3-R 1G ports (identified by port numbers 2 and 3) the FW cannot properly
support multiple TXQs. This also makes the number of RX and TX queues symmetric
as only a single RXQ is available on 1G ports.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 1 Oct 2013 10:29:57 +0000 (15:59 +0530)]
be2net: pass if_id for v1 and V2 versions of TX_CREATE cmd
It is a required field for all TX_CREATE cmd versions > 0.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 1 Oct 2013 10:29:56 +0000 (15:59 +0530)]
be2net: Call be_vf_setup() even when VFs are enbaled from previous load
Re-define the sriov_want() macro to check for number of VFs that need
to be enabled in the current load of the driver or the number of VFs that
still remain enabled from the previous load (attached VFs cannot be disabled.)
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sachin Kamat [Mon, 30 Sep 2013 04:25:14 +0000 (09:55 +0530)]
net: can: c_can_platform: Remove redundant of_match_ptr
The data structure of_match_ptr() protects is always compiled in.
Hence of_match_ptr() is not needed.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-can@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Sachin Kamat [Mon, 30 Sep 2013 04:25:13 +0000 (09:55 +0530)]
net: ethernet: cpsw-phy-sel: Remove redundant of_match_ptr
The data structure of_match_ptr() protects is always compiled in.
Hence of_match_ptr() is not needed.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sachin Kamat [Mon, 30 Sep 2013 04:25:12 +0000 (09:55 +0530)]
net: ethernet: cpsw: Remove redundant of_match_ptr
The data structure of_match_ptr() protects is always compiled in.
Hence of_match_ptr() is not needed.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Sat, 28 Sep 2013 19:18:56 +0000 (21:18 +0200)]
bonding: RCUify bond_set_rx_mode()
Currently we rely on rtnl locking in bond_set_rx_mode(), however it's not
always the case:
RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391)
...
[<
ffffffff81651ca5>] dump_stack+0x54/0x74
[<
ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding]
[<
ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0
[<
ffffffff81557ff8>] __dev_mc_add+0x58/0x70
[<
ffffffff81558020>] dev_mc_add+0x10/0x20
[<
ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0
[<
ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260
[<
ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320
[<
ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260
...
Fix this by using RCU primitives.
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avinash Kumar [Mon, 30 Sep 2013 04:06:44 +0000 (09:36 +0530)]
drivers: net: phy: marvell.c: removed checkpatch.pl warnings
removes following warnings-
drivers/net/phy/marvell.c:37: WARNING: Use #include <linux/io.h> instead of <asm/io.h>
drivers/net/phy/marvell.c:39: WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
Signed-off-by: Avinash Kumar <avi.kp.137@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 29 Sep 2013 08:21:32 +0000 (01:21 -0700)]
net: skb_is_gso_v6() requires skb_is_gso()
bnx2x makes a dangerous use of skb_is_gso_v6().
It should first make sure skb is a gso packet
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 29 Sep 2013 08:12:40 +0000 (01:12 -0700)]
net: add missing sk_max_pacing_rate doc
Warning(include/net/sock.h:411): No description found for parameter
'sk_max_pacing_rate'
Lets please "make htmldocs" and kbuild bot.
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Sun, 29 Sep 2013 11:54:58 +0000 (13:54 +0200)]
bgmac: add support for Byte Queue Limits
This makes it possible to use some more advanced queuing
techniques with this driver.
When multi queue support will be added some changes to Byte Queue
handling is needed.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Sat, 28 Sep 2013 21:22:18 +0000 (23:22 +0200)]
b44: add support for Byte Queue Limits
This makes it possible to use some more advanced queuing
techniques with this driver.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Sat, 28 Sep 2013 21:10:59 +0000 (14:10 -0700)]
net ipv4: Convert ipv4.ip_local_port_range to be per netns v3
- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
of the callers.
- Move the initialization of sysctl_local_ports into
sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c
v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to todays net-next
v3:
- Compile fixes of strange callers of inet_get_local_port_range.
This patch now successfully passes an allmodconfig build.
Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range
Originally-by: Samya <samya@twitter.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Sat, 28 Sep 2013 00:21:27 +0000 (17:21 -0700)]
ethernet: use likely() for common Ethernet encap
Mark code path's likely/unlikely based on most common usage.
* Very few devices use dsa tags.
* Most traffic is Ethernet (not 802.2)
* No sane person uses trailer type or Novell encapsulation
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Sat, 28 Sep 2013 00:19:41 +0000 (17:19 -0700)]
ethernet: cleanup eth_type_trans
Remove old legacy comment and weird if condition.
The comment has outlived it's stay and is throwback to some
early net code (before my time). Maybe Dave remembers what it meant.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 30 Sep 2013 23:14:20 +0000 (19:14 -0400)]
Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Mon, 23 Sep 2013 06:55:59 +0000 (14:55 +0800)]
ipv6: Not need to set fl6.flowi6_flags as zero
setting fl6.flowi6_flags as zero after memset is redundant, Remove it.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Liu [Sun, 22 Sep 2013 18:03:44 +0000 (19:03 +0100)]
xen-netback: improve ring effeciency for guest RX
There was a bug that netback routines netbk/xenvif_skb_count_slots and
netbk/xenvif_gop_frag_copy disagreed with each other, which caused
netback to push wrong number of responses to netfront, which caused
netfront to eventually crash. The bug was fixed in
6e43fc04a
("xen-netback: count number required slots for an skb more carefully").
Commit
6e43fc04a focused on backport-ability. The drawback with the
existing packing scheme is that the ring is not used effeciently, as
stated in
6e43fc04a.
skb->data like:
| 1111|
222222222222|3333 |
is arranged as:
|1111 |
222222222222|3333 |
If we can do this:
|
111122222222|
22223333 |
That would save one ring slot, which improves ring effeciency.
This patch effectively reverts
6e43fc04a. That patch made count_slots
agree with gop_frag_copy, while this patch goes the other way around --
make gop_frag_copy agree with count_slots. The end result is that they
still agree with each other, and the ring is now arranged like:
|
111122222222|
22223333 |
The patch that improves packing was first posted by Xi Xong and Matt
Wilson. I only rebase it on top of net-next and rewrite commit message,
so I retain all their SoBs. For more infomation about the original bug
please refer to email listed below and commit message of
6e43fc04a.
Original patch:
http://lists.xen.org/archives/html/xen-devel/2013-07/msg00760.html
Signed-off-by: Xi Xiong <xixiong@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[ msw: minor code cleanups, rewrote commit message, adjusted code
to count RX slots instead of meta structures ]
Signed-off-by: Matt Wilson <msw@amazon.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
[ liuw: rebased on top of net-next tree, rewrote commit message, coding
style cleanup. ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Fri, 27 Sep 2013 00:42:16 +0000 (17:42 -0700)]
qdisc: basic classifier - remove unnecessary initialization
err is set once, then first code resets it.
err = tcf_exts_validate(...)
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jamal Hadi Salim <hadi@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Fri, 27 Sep 2013 00:40:11 +0000 (17:40 -0700)]
qdisc: meta return ENOMEM on alloc failure
Rather than returning earlier value (EINVAL), return ENOMEM if
kzalloc fails. Found while reviewing to find another EINVAL condition.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Thu, 26 Sep 2013 23:22:01 +0000 (01:22 +0200)]
bonding: trivial: remove forgotten bond_next_vlan()
It's a forgotten function declaration, which was removed some time ago
already.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 30 Sep 2013 19:36:45 +0000 (15:36 -0400)]
Merge branch '20130926_include_linux_networking_externs' of git://repo.or.cz/linux-2.6/trivial-mods
Conflicts:
include/linux/netdevice.h
More extern removals from Joe Perches.
Minor conflict with the dev_notify_flags changes which added a new
argument to __dev_notify_flags().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 30 Sep 2013 19:11:00 +0000 (15:11 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next
Conflicts:
include/net/xfrm.h
Simple conflict between Joe Perches "extern" removal for function
declarations in header files and the changes in Steffen's tree.
Steffen Klassert says:
====================
Two patches that are left from the last development cycle.
Manual merging of include/net/xfrm.h is needed. The conflict
can be solved as it is currently done in linux-next.
1) We announce the creation of temporary acquire state via an asyc event,
so the deletion should be annunced too. From Nicolas Dichtel.
2) The VTI tunnels do not real tunning, they just provide a routable
IPsec tunnel interface. So introduce and use xfrm_tunnel_notifier
instead of xfrm_tunnel for xfrm tunnel mode callback. From Fan Du.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Opdenacker [Wed, 25 Sep 2013 18:17:45 +0000 (20:17 +0200)]
hamradio: baycom: remove deprecated IRQF_DISABLED
This patch proposes to remove the IRQF_DISABLED flag
from drivers/net/hamradio/baycom_*
It's a NOOP since 2.6.35 and it will be removed one day.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Wed, 25 Sep 2013 10:02:45 +0000 (12:02 +0200)]
dev: always advertise rx_flags changes via netlink
When flags IFF_PROMISC and IFF_ALLMULTI are changed, netlink messages are not
consistent. For example, if a multicast daemon is running (flag IFF_ALLMULTI
set in dev->flags but not dev->gflags, ie not exported to userspace) and then a
user sets it via netlink (flag IFF_ALLMULTI set in dev->flags and dev->gflags, ie
exported to userspace), no netlink message is sent.
Same for IFF_PROMISC and because dev->promiscuity is exported via
IFLA_PROMISCUITY, we may send a netlink message after each change of this
counter.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Wed, 25 Sep 2013 10:02:44 +0000 (12:02 +0200)]
dev: update __dev_notify_flags() to send rtnl msg
This patch only prepares the next one, there is no functional change.
Now, __dev_notify_flags() can also be used to notify flags changes via
rtnetlink.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Porcedda [Wed, 25 Sep 2013 09:21:26 +0000 (11:21 +0200)]
net: qmi_wwan: fix checkpatch warnings
Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Porcedda [Wed, 25 Sep 2013 09:21:25 +0000 (11:21 +0200)]
net: qmi_wwan: add Telit LE920 newer firmware support
Newer firmware use a new pid and a different interface.
Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 24 Sep 2013 15:20:52 +0000 (08:20 -0700)]
net: introduce SO_MAX_PACING_RATE
As mentioned in commit
afe4fd062416b ("pkt_sched: fq: Fair Queue packet
scheduler"), this patch adds a new socket option.
SO_MAX_PACING_RATE offers the application the ability to cap the
rate computed by transport layer. Value is in bytes per second.
u32 val = 1000000;
setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));
To be effectively paced, a flow must use FQ packet scheduler.
Note that a packet scheduler takes into account the headers for its
computations. The effective payload rate depends on MSS and retransmits
if any.
I chose to make this pacing rate a SOL_SOCKET option instead of a
TCP one because this can be used by other protocols.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:12:05 +0000 (16:12 +0200)]
bonding: remove bond_next_slave()
There are no users left, so it's safe to remove.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:12:04 +0000 (16:12 +0200)]
bonding: don't use bond_next_slave() in bond_info_seq_next()
We don't need the circular loop there and it's the only current user of
bond_next_slave() - so just use the standard bond_for_each_slave().
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:12:03 +0000 (16:12 +0200)]
bonding: remove unused __get_next_agg()
It has no users, so it's safe to remove it completely.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:12:02 +0000 (16:12 +0200)]
bonding: make bond_3ad_unbind_slave() use bond_for_each_slave()
Convert all instances of
for (agg = __get_first_agg(); agg; agg = __get_next_port)
to the standard bond_for_each_slave().
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:12:01 +0000 (16:12 +0200)]
bonding: make ad_agg_selection_logic() use bond_for_each_slave()
Convert all instances of
for (agg = __get_first_agg(); agg; agg = __get_next_port)
to the standard bond_for_each_slave(). Also, remove the useless checks
before calling bond_3ad_set_carrier() - if we have something NULL - it
would fire long ago, in __get_first/next_port(), per example.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:12:00 +0000 (16:12 +0200)]
bonding: make __get_active_agg() use bond_for_each_slave()
Currently we're relying on suboptimal construct
for (; aggregator; aggregator = __get_next_agg(aggregator)) {
where aggregator is an argument of __get_active_agg() which is _always_ the
first slave's aggregator - judging by all the callers, comments in the
ad_agg_selection_logic() and by logic.
Convert it to use the standard bond_for_each_slave().
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:11:59 +0000 (16:11 +0200)]
bonding: make ad_port_selection_logic() use bond_for_each_slave()
Currently, ad_port_selection_logic() uses
for (aggregator = __get_first_agg(port); aggregator;
aggregator = __get_next_agg(aggregator)) {
construct, however it's suboptimal, difficult to read and understand.
Change it to a standard bond_for_each_slave(), so that we won't need
__get_first/next_agg() and have it more readable.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:11:58 +0000 (16:11 +0200)]
bonding: remove __get_first_port()
Currently we have only one user of it, so it's kind of useless and just
obfusicates things.
Remove it and move the logic to the only user -
bond_3ad_state_machine_handler().
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 14:11:57 +0000 (16:11 +0200)]
bonding: remove __get_next_port()
Currently this function is only used in constructs like
for (port = __get_first_port(bond); port; port = __get_next_port(port))
which is basicly the same as
bond_for_each_slave(bond, slave, iter) {
port = &(SLAVE_AD_INFO(slave).port);
but a more time consuming.
Remove the function and convert the users to bond_for_each_slave().
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 13:10:58 +0000 (15:10 +0200)]
bonding: verify if we still have slaves in bond_3ad_unbind_slave()
After commit
1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
neighbour's private on enslave"), we've moved the unlinking of the slave
to the earliest position possible - so that nobody will see an
half-uninited slave.
However, bond_3ad_unbind_slave() relied that, even while removing the last
slave, it is still accessible - via __get_first_agg() (and, eventually,
bond_first_slave()).
Fix that by verifying if the aggregator return is an actual aggregator, but
not NULL.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Fri, 27 Sep 2013 13:10:57 +0000 (15:10 +0200)]
bonding: correctly verify for the first slave in bond_enslave
After commit
1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
neighbour's private on enslave"), we've moved the actual 'linking' in the
end of the function - so that, once linked, the slave is ready to be used,
and is not still in the process of enslaving.
However, 802.3ad verified if it's the first slave by looking at the
if (bond_first_slave(bond) == new_slave)
which, because we've moved the linking to the end, became broken - on the
first slave bond_first_slave(bond) returns NULL.
Fix this by verifying if the prev_slave, that equals bond_last_slave(), is
actually populated - if it is - then it's not the first slave, and vice
versa.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sat, 28 Sep 2013 05:46:12 +0000 (08:46 +0300)]
bnx2x: use pcie_get_minimum_link()
Use common code for getting the pcie link speed/width for debug printing.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Sat, 28 Sep 2013 05:46:11 +0000 (08:46 +0300)]
bnx2x: Add support for EXTPHY2 LED mode
Add new LED mode for the BCM848xx to support new board type.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Sat, 28 Sep 2013 05:46:10 +0000 (08:46 +0300)]
bnx2x: Change function prototype
Change bnx2x_bsc_read function prototype (more of a cosmetic change).
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Sat, 28 Sep 2013 05:46:09 +0000 (08:46 +0300)]
bnx2x: Don't disable/enable SR-IOV when loading
Current bnx2x implementation controls the number of VFs only by
standard sysfs support, and will reject setting the number of VFs
when the PF is not loaded.
As a result, there is no need to schedule a delayed work to enable
SR-IOV when PF is loaded, as the number of VFs at that point
must be 0.
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sat, 28 Sep 2013 05:46:08 +0000 (08:46 +0300)]
bnx2x: Correct VF driver info
When running ethtool on VF interfaces, returning values should indicate
that the interface does not support self-test or register dump.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sat, 28 Sep 2013 05:46:07 +0000 (08:46 +0300)]
bnx2x: Test nvram when interface is down
Since commit 3fb43eb ("bnx2x: Change to D3hot only on removal") nvram
is accessible whenever the driver is loaded - Thus it is possible to
test it during self-test even if the interface is down
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Francesco Fusco [Tue, 24 Sep 2013 13:43:09 +0000 (15:43 +0200)]
ipv4: processing ancillary IP_TOS or IP_TTL
If IP_TOS or IP_TTL are specified as ancillary data, then sendmsg() sends out
packets with the specified TTL or TOS overriding the socket values specified
with the traditional setsockopt().
The struct inet_cork stores the values of TOS, TTL and priority that are
passed through the struct ipcm_cookie. If there are user-specified TOS
(tos != -1) or TTL (ttl != 0) in the struct ipcm_cookie, these values are
used to override the per-socket values. In case of TOS also the priority
is changed accordingly.
Two helper functions get_rttos and get_rtconn_flags are defined to take
into account the presence of a user specified TOS value when computing
RT_TOS and RT_CONN_FLAGS.
Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Francesco Fusco [Tue, 24 Sep 2013 13:43:08 +0000 (15:43 +0200)]
ipv4: IP_TOS and IP_TTL can be specified as ancillary data
This patch enables the IP_TTL and IP_TOS values passed from userspace to
be stored in the ipcm_cookie struct. Three fields are added to the struct:
- the TTL, expressed as __u8.
The allowed values are in the [1-255].
A value of 0 means that the TTL is not specified.
- the TOS, expressed as __s16.
The allowed values are in the range [0,255].
A value of -1 means that the TOS is not specified.
- the priority, expressed as a char and computed when
handling the ancillary data.
Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nate Levesque [Sat, 21 Sep 2013 18:49:41 +0000 (18:49 +0000)]
lance: Fix hardcoded interrupt name lp->name to use system device value
The lance interrupt handler was using the hard-coded name which would make it difficult to tell where the interrupt came from. Changed to use the device name that made the interrupt.
Signed-off-by: Nate Levesque <thenaterhood@gmail.com>
Reviewed-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mihir Singh [Sat, 21 Sep 2013 18:48:09 +0000 (18:48 +0000)]
hp100: replace hardcoded name in /proc/interrupts with interface name
The /proc/interrupts file displays hp100, which is not the accepted style. Printing eth%d is more helpful.
Signed-off-by: Mihir Singh <me@mihirsingh.com>
Reviewed-By: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Sat, 21 Sep 2013 14:56:10 +0000 (16:56 +0200)]
ipv6: compare sernum when walking fib for /proc/net/ipv6_route as safety net
This patch provides an additional safety net against NULL
pointer dereferences while walking the fib trie for the new
/proc/net/ipv6_route walkers. I never needed it myself and am unsure
if it is needed at all, but the same checks where introduced in
2bec5a369ee79576a3eea2c23863325089785a2c ("ipv6: fib: fix crash when
changing large fib while dumping it") to fix NULL pointer bugs.
This patch is separated from the first patch to make it easier to revert
if we are sure we can drop this logic.
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Sat, 21 Sep 2013 14:55:59 +0000 (16:55 +0200)]
ipv6: avoid high order memory allocations for /proc/net/ipv6_route
Dumping routes on a system with lots rt6_infos in the fibs causes up to
11-order allocations in seq_file (which fail). While we could switch
there to vmalloc we could just implement the streaming interface for
/proc/net/ipv6_route. This patch switches /proc/net/ipv6_route from
single_open_net to seq_open_net.
loff_t *pos tracks dst entries.
Also kill never used struct rt6_proc_arg and now unused function
fib6_clean_all_ro.
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Mack [Sat, 21 Sep 2013 14:53:02 +0000 (16:53 +0200)]
net: phy: at803x: add suspend/resume callbacks
When WOL is enabled, the chip can't be put into power-down (BMCR_PDOWN)
mode, as that will also switch off the MAC, which consequently leads to
a link loss.
Use BMCR_ISOLATE in that case, which will at least save us some
milliamperes in comparison to normal operation mode.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Mack [Sat, 21 Sep 2013 14:53:01 +0000 (16:53 +0200)]
net: phy: at803x: don't pass function pointers with &
Just a cosmetic cleanup.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 27 Sep 2013 21:02:24 +0000 (17:02 -0400)]
Merge branch 'qlge'
Jitendra Kalsaria says:
====================
This patch series enhance the handling of nested vlan tags in Rx path.
V2 changes:
* removed module parameter.
V3 changes:
* Users can enable or disable hardware VLAN acceleration using ethtool
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Fri, 27 Sep 2013 17:17:47 +0000 (13:17 -0400)]
qlge: Update version to 1.00.00.33
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Fri, 27 Sep 2013 17:17:46 +0000 (13:17 -0400)]
qlge: Enhance nested VLAN (Q-in-Q) handling.
o Adapter doesn’t handle packets with nested VLAN tags in
Rx path. User can turn off VLAN tag stripping in the hardware
and let the stack handle stripping of VLAN tags in the Rx path.
o Users can enable or disable hardware VLAN acceleration using ethtool
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 27 Sep 2013 05:42:27 +0000 (01:42 -0400)]
netxen_nic: Update version to 4.0.82
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 27 Sep 2013 05:42:26 +0000 (01:42 -0400)]
netxen_nic: Print ULA information
This patch reads CAMRAM(0x178) where FW writes a key for ULA and non-ULA
adapter and based on the key, driver logs the message.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 26 Sep 2013 21:48:15 +0000 (14:48 -0700)]
[networking]device.h: Remove extern from function prototypes
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: Joe Perches <joe@perches.com>
Joe Perches [Thu, 26 Sep 2013 21:48:15 +0000 (14:48 -0700)]
net.h/skbuff.h: Remove extern from function prototypes
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: Joe Perches <joe@perches.com>
Joe Perches [Thu, 26 Sep 2013 21:48:15 +0000 (14:48 -0700)]
netfilter: Remove extern from function prototypes
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: Joe Perches <joe@perches.com>
David S. Miller [Thu, 26 Sep 2013 20:02:19 +0000 (16:02 -0400)]
Merge branch 'bonding_neighbours'
bonding: use neighbours instead of own lists
Veaceslav Falico says:
====================
This patchset introduces all the needed infrastructure, on top of current
adjacent lists, to be able to remove bond's slave_list/slave->list. The
overhead in memory/CPU is minimal, and after the patchset bonding can rely
on its slave-related functions, given the proper locking. I've done some
netperf benchmarks on a vm, and the delta was about 0.1gbps for 35gbps as a
whole, so no speed fluctuations.
It also automatically creates lower/upper and master symlinks in dev's
sysfs directory.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:32 +0000 (09:20 +0200)]
net: create sysfs symlinks for neighbour devices
Also, remove the same functionality from bonding - it will be already done
for any device that links to its lower/upper neighbour.
The links will be created for dev's kobject, and will look like
lower_eth0 for lower device eth0 and upper_bridge0 for upper device
bridge0.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:31 +0000 (09:20 +0200)]
net: expose the master link to sysfs, and remove it from bond
Currently, we can have only one master upper neighbour, so it would be
useful to create a symlink to it in the sysfs device directory, the way
that bonding now does it, for every device. Lower devices from
bridge/team/etc will automagically get it, so we could rely on it.
Also, remove the same functionality from bonding.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:30 +0000 (09:20 +0200)]
vlan: unlink the upper neighbour before unregistering
On netdev unregister we're removing also all of its sysfs-associated stuff,
including the sysfs symlinks that are controlled by netdev neighbour code.
Also, it's a subtle race condition - cause we can still access it after
unregistering.
Move the unlinking right before the unregistering to fix both.
CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:29 +0000 (09:20 +0200)]
vlan: link the upper neighbour only after registering
Otherwise users might access it without being fully registered, as per
sysfs - it only inits in register_netdevice(), so is unusable till it is
called.
CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:28 +0000 (09:20 +0200)]
bonding: remove slave lists
And all the initialization.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:27 +0000 (09:20 +0200)]
bonding: use neighbours for bond_next_slave()
Use the new function __bond_next_slave().
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:26 +0000 (09:20 +0200)]
bonding: add __bond_next_slave() which uses neighbours
Add a new function, __bond_next_slave(), which uses neighbours to find the
next slave after the slave provided. It will be further used to gradually
go start using neighbour netdev_adjacent infrastructure instead of
bonding's own lists.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:25 +0000 (09:20 +0200)]
bonding: remove bond_prev_slave()
We don't really need it, and it's really hard to RCUify the list->prev.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:24 +0000 (09:20 +0200)]
bonding: convert first/last slave logic to use neighbours
For that, use netdev_adjacent_get_private(list_head) on bond's lower
neighbour list members. Also, add a small macro - bond_slave_list(bond),
which returns the bond list via neighbour list.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:23 +0000 (09:20 +0200)]
net: add a possibility to get private from netdev_adjacent->list
It will be useful to get first/last element.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:22 +0000 (09:20 +0200)]
bonding: convert bond_has_slaves() to use the neighbour list
The same way as it was used for its own slave_list.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:21 +0000 (09:20 +0200)]
bonding: add bond_has_slaves() and use it
Currently we verify if we have slaves by checking if bond->slave_list is
empty. Create a define bond_has_slaves() and use it, a bit more readable
and easier to change in the future.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:20 +0000 (09:20 +0200)]
bonding: remove unused bond_for_each_slave_from()
It has no users, so we can remove it.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:19 +0000 (09:20 +0200)]
bonding: rework bond_ab_arp_probe() to use bond_for_each_slave()
Currently it uses the hard-to-rcuify bond_for_each_slave_from(), and also
it doesn't check every slave for disrepencies between the actual
IS_UP(slave) and the slave->link == BOND_LINK_UP, but only till we find the
next suitable slave.
Fix this by using bond_for_each_slave() and storing the first good slave in
*before till we find the current_arp_slave, after that we store the first good
slave in new_slave. If new_slave is empty - use the slave stored in before,
and if it's also empty - then we didn't find any suitable slave.
Also, in the meanwhile, check for each slave status.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:18 +0000 (09:20 +0200)]
bonding: rework bond_find_best_slave() to use bond_for_each_slave()
bond_find_best_slave() does not have to be balanced - i.e. return the slave
that is *after* some other slave, but rather return the best slave that
suits, except of bond->primary_slave - in which case we just return it if
it's suitable.
After that we just look through all the slaves and return either first up
slave or the slave whose link came back earliest.
We also don't care about curr_active_slave lock cause we use it in
bond_should_change_active() only and there we take it right away - i.e. it
won't go away.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:17 +0000 (09:20 +0200)]
bonding: rework rlb_next_rx_slave() to use bond_for_each_slave()
Currently, we're using bond_for_each_slave_from(), which is really hard to
implement under RCU and/or neighbour list.
Remove it and use bond_for_each_slave() instead, taking care of the last
used slave.
Also, rename next_rx_slave to rx_slave and store the current (last)
rx_slave.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:16 +0000 (09:20 +0200)]
bonding: rework bond_3ad_xmit_xor() to use bond_for_each_slave() only
Currently, there are two loops - first we find the first slave in an
aggregator after the xmit_hash_policy() returned number, and after that we
loop from that slave, over bonding head, and till that slave to find any
suitable slave to send the packet through.
Replace it by just one bond_for_each_slave() loop, which first loops
through the requested number of slaves, saving the first suitable one, and
after that we've hit the requested number of slaves to skip - search for
any up slave to send the packet through. If we don't find such kind of
slave - then just send the packet through the first suitable slave found.
Logic remains unchainged, and we skip two loops. Also, refactor it a bit
for readability.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:15 +0000 (09:20 +0200)]
bonding: use bond_for_each_slave() in bond_uninit()
We're safe agains removal there, cause we use neighbours primitives.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:14 +0000 (09:20 +0200)]
bonding: make bond_for_each_slave() use lower neighbour's private
It needs a list_head *iter, so add it wherever needed. Use both non-rcu and
rcu variants.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:13 +0000 (09:20 +0200)]
bonding: remove bond_for_each_slave_continue_reverse()
We only use it in rollback scenarios and can easily use the standart
bond_for_each_dev() instead.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:12 +0000 (09:20 +0200)]
net: add for_each iterators through neighbour lower link's private
Add a possibility to iterate through netdev_adjacent's private, currently
only for lower neighbours.
Add both RCU and RTNL/other locking variants of iterators, and make the
non-rcu variant to be safe from removal.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:11 +0000 (09:20 +0200)]
bonding: modify bond_get_slave_by_dev() to use neighbours
It should be used under rtnl/bonding lock, so use the non-RCU version.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:10 +0000 (09:20 +0200)]
bonding: populate neighbour's private on enslave
Use the new provided function when attaching the lower slave to populate
its ->private with struct slave *new_slave. Also, move it to the end to
be able to 'find' it only after it was completely initialized, and
deinitialize in the first place on release.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:09 +0000 (09:20 +0200)]
net: add netdev_adjacent->private and allow to use it
Currently, even though we can access any linked device, we can't attach
anything to it, which is vital to properly manage them.
To fix this, add a new void *private to netdev_adjacent and functions
setting/getting it (per link), so that we can save, per example, bonding's
slave structures there, per slave device.
netdev_master_upper_dev_link_private(dev, upper_dev, private) links dev to
upper dev and populates the neighbour link only with private.
netdev_lower_dev_get_private{,_rcu}() returns the private, if found.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:08 +0000 (09:20 +0200)]
net: add RCU variant to search for netdev_adjacent link
Currently we have only the RTNL flavour, however we can traverse it while
holding only RCU, so add the RCU search. Add an RCU variant that uses
list_head * as an argument, so that it can be universally used afterwards.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veaceslav Falico [Wed, 25 Sep 2013 07:20:07 +0000 (09:20 +0200)]
net: add adj_list to save only neighbours
Currently, we distinguish neighbours (first-level linked devices) from
non-neighbours by the neighbour bool in the netdev_adjacent. This could be
quite time-consuming in case we would like to traverse *only* through
neighbours - cause we'd have to traverse through all devices and check for
this flag, and in a (quite common) scenario where we have lots of vlans on
top of bridge, which is on top of a bond - the bonding would have to go
through all those vlans to get its upper neighbour linked devices.
This situation is really unpleasant, cause there are already a lot of cases
when a device with slaves needs to go through them in hot path.
To fix this, introduce a new upper/lower device lists structure -
adj_list, which contains only the neighbours. It works always in
pair with the all_adj_list structure (renamed from upper/lower_dev_list),
i.e. both of them contain the same links, only that all_adj_list contains
also non-neighbour device links. It's really a small change visible,
currently, only for __netdev_adjacent_dev_insert/remove(), and doesn't
change the main linked logic at all.
Also, add some comments a fix a name collision in
netdev_for_each_upper_dev_rcu() and rework the naming by the following
rules:
netdev_(all_)(upper|lower)_*
If "all_" is present, then we work with the whole list of upper/lower
devices, otherwise - only with direct neighbours. Uninline functions - to
get better stack traces.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>