platform/kernel/linux-starfive.git
Mitch A Williams [Fri, 18 Jan 2013 08:57:20 +0000 (08:57 +0000)]
igb: Don't give VFs random MAC addresses

If the user has not assigned a MAC address to a VM, then don't give it a
random one. Instead, just give it zeros and let it figure out what to do
with them.
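
A minimal userspace sketch of the behavior described above, not the igb
driver code. The struct vf_info layout and the helper names are hypothetical;
the only point is that an unassigned VF gets an all-zero address rather than
a randomly generated one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6 /* length of an Ethernet MAC address in bytes */

/* Hypothetical per-VF record; field names are illustrative only. */
struct vf_info {
    uint8_t mac[ETH_ALEN];
    bool mac_set_by_admin;
};

/* Returns true when every byte of the address is zero. */
static bool mac_is_zero(const uint8_t mac[ETH_ALEN])
{
    static const uint8_t zero[ETH_ALEN];

    return memcmp(mac, zero, ETH_ALEN) == 0;
}

/* Keep an administrator-assigned address; otherwise hand the VF all
 * zeros instead of inventing a random address for it. */
static void vf_init_mac(struct vf_info *vf, const uint8_t *admin_mac)
{
    if (admin_mac) {
        memcpy(vf->mac, admin_mac, ETH_ALEN);
        vf->mac_set_by_admin = true;
    } else {
        memset(vf->mac, 0, ETH_ALEN);
        vf->mac_set_by_admin = false;
    }
}
```

The VF side can then test for the all-zero address and decide for itself
what to do.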

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Thu, 24 Jan 2013 04:54:48 +0000 (04:54 +0000)]
ixgbevf: Make sure link status and speed are fetched

A recent change makes it necessary to set get_link_status to ensure that
the driver fetches the correct, refreshed value for link status and speed
when it has changed in the physical function device.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 12 Jan 2013 07:26:53 +0000 (07:26 +0000)]
e1000e: cleanup: remove comments which are no longer applicable

Code was removed but the applicable comments were not.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 12 Jan 2013 07:26:22 +0000 (07:26 +0000)]
e1000e: cleanup hw.h

Remove unnecessary #include, forward prototype of struct e1000_adapter and
an empty comment; fix a comment which mentions "static data for the MAC"
which is not applicable to the following struct; and cleanup some
whitespace issues.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 12 Jan 2013 07:25:52 +0000 (07:25 +0000)]
e1000e: cleanup: remove unused #define

All references to E1000_ERT_2048 have been removed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 12 Jan 2013 07:25:22 +0000 (07:25 +0000)]
e1000e: adjust PM QoS request

It has been found that devices other than 82579 (a.k.a. e1000_pch2lan)
suffer from dropped transactions on platforms with deep C-states when
jumbo frames are enabled.  For example, LOMs on ICH9- and ICH10-based
platforms which recently had early-receive de-featured (for stability
reasons) suffer from this.  To resolve this for all devices, when jumbo
frames are enabled set the PM QoS DMA latency request based on the size
of the receive packet buffer less one full frame.
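
The message does not spell out the formula, so the following is only a
sketch under one assumption: the tolerable DMA latency is the time the link
needs to fill the receive packet buffer minus one full frame. The function
and parameter names are hypothetical.

```c
#include <stdint.h>

/* Microseconds needed to fill (rx_buf_bytes - max_frame_bytes) at the
 * given link speed.  Returns 0 when there is no headroom at all. */
static uint32_t rx_fill_time_us(uint32_t rx_buf_bytes,
                                uint32_t max_frame_bytes,
                                uint32_t link_mbps)
{
    uint64_t bits;

    if (link_mbps == 0 || rx_buf_bytes <= max_frame_bytes)
        return 0;

    bits = (uint64_t)(rx_buf_bytes - max_frame_bytes) * 8;

    /* bits divided by Mbit/s gives microseconds */
    return (uint32_t)(bits / link_mbps);
}
```

For a 26 KB buffer, a 9018-byte frame and a 1000 Mb/s link this gives
(26624 - 9018) * 8 / 1000 = 140 microseconds.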

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 9 Jan 2013 01:20:46 +0000 (01:20 +0000)]
e1000e: correct maximum frame size on 82579

The largest jumbo frame supported by the 82579 hardware is 9018.
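
As a worked example of what 9018 means for the MTU, assuming the frame size
counts the 14-byte Ethernet header and the 4-byte FCS but no VLAN tag (the
macro MAX_JUMBO_FRAME_82579 and the helpers are illustrative names, not the
driver's):

```c
#include <stdbool.h>

#define ETH_HLEN               14   /* Ethernet header bytes */
#define ETH_FCS_LEN            4    /* frame check sequence bytes */
#define MAX_JUMBO_FRAME_82579  9018 /* limit stated in the commit message */

/* Frame size on the wire for a given MTU (no VLAN tag assumed). */
static int frame_size_for_mtu(int mtu)
{
    return mtu + ETH_HLEN + ETH_FCS_LEN;
}

/* True when the MTU results in a frame the hardware can carry. */
static bool mtu_fits(int mtu)
{
    return frame_size_for_mtu(mtu) <= MAX_JUMBO_FRAME_82579;
}
```

Under that assumption the largest usable MTU is 9018 - 14 - 4 = 9000.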

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 23 Jan 2013 06:50:05 +0000 (06:50 +0000)]
e1000e: cleanup: remove e1000e_commit_phy()

Remove the function e1000e_commit_phy() and replace the few calls to it
with the same function pointer that it would call.  The function pointer is
almost always set for the devices that access these code paths so there is
no risk of a NULL pointer dereference; for the few instances where the
function pointer might not be set (i.e. can be called for the few devices
which do not have this function pointer set), check for a valid function
pointer.
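
The pattern described here, calling an optional operation directly through
its function pointer and guarding only where the pointer may be unset, can
be sketched as follows. The struct and function names are hypothetical
stand-ins, not the e1000e definitions.

```c
#include <stddef.h>

/* Hypothetical ops table with one optional operation. */
struct phy_ops {
    int (*commit)(void *hw);
};

/* Call the operation only if the device provides it; a missing
 * operation is treated as success. */
static int phy_commit_if_supported(const struct phy_ops *ops, void *hw)
{
    if (!ops->commit)
        return 0;

    return ops->commit(hw);
}

/* Demo implementation that counts how often it was invoked. */
static int demo_commit(void *hw)
{
    int *calls = hw;

    (*calls)++;
    return 0;
}
```

On code paths where the pointer is known to be set, the check can be
dropped and ops->commit(hw) called directly.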

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 5 Jan 2013 08:06:24 +0000 (08:06 +0000)]
e1000e: cleanup: remove e1000_get_cable_length()

Remove the function e1000_get_cable_length() and replace the two calls
to it with the same function pointer that it would call.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 5 Jan 2013 08:06:19 +0000 (08:06 +0000)]
e1000e: cleanup: remove e1000_get_phy_cfg_done()

Remove the function e1000_get_phy_cfg_done() and replace the single call
to it with the same function pointer that it would call.  The function
pointer is always set so there is no risk of a NULL pointer dereference.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 5 Jan 2013 08:06:14 +0000 (08:06 +0000)]
e1000e: cleanup: rename e1000_get_cfg_done()

In keeping with the e1000e driver function naming convention, the subject
function is renamed to indicate it is generic, i.e. it is applicable to
more than just a single MAC family (e.g. 80003es2lan, 82571, ich8lan).

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 5 Jan 2013 08:06:08 +0000 (08:06 +0000)]
e1000e: cleanup: remove e1000_force_speed_duplex()

Remove the function e1000_force_speed_duplex() and replace the single call
to it with the same function pointer that it would call.  The function
pointer is always set so there is no risk of a NULL pointer dereference.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 5 Jan 2013 08:06:03 +0000 (08:06 +0000)]
e1000e: cleanup: remove e1000_set_d0_lplu_state()

Replace the function e1000_set_d0_lplu_state() with the contents of it
coded in place of the single call to the function.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Joe Perches [Sat, 26 Jan 2013 09:24:19 +0000 (09:24 +0000)]
gro: Fix kcalloc argument order

First number, then size.
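
The same ordering applies to the C library's calloc(), which takes the
element count first and the element size second, just as kcalloc() takes
(n, size, flags). A userspace sketch with a hypothetical element type:

```c
#include <stdlib.h>

/* Hypothetical element type, for illustration only. */
struct item {
    int id;
    void *payload;
};

/* Allocate n zeroed items: count first, then the size of one element. */
static struct item *alloc_items(size_t n)
{
    return calloc(n, sizeof(struct item));
}
```

Swapping the two arguments usually yields the same number of bytes, which
is why this kind of mistake tends to go unnoticed.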

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 24 Jan 2013 20:40:56 +0000 (20:40 +0000)]
irda: buffer overflow in irnet_ctrl_read()

The comment here says /* Max event is 61 char */, but in 2003 we
changed the event format and now the max event size is 75.  The longest
event is:

"Discovered %08x (%s) behind %08x {hints %02X-%02X}\n",
         12345678901    23  456789012    34567890    1    2 3
            +8    +21        +8          +2   +2     +1
         = 75 characters.

There was a check to return -EOVERFLOW if the user gave us a "count"
value that was less than 64.  Raising it to 75 might break backwards
compatibility.  Instead I removed the check and now it returns a
truncated string if "count" is too low.
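
The count above can be checked in userspace, and the truncating copy can be
sketched the same way. This is not the irnet code: the 21-character name
limit is taken from the "+21" annotation, the 75 is read here as 74 printed
characters plus the terminating NUL, and both helper names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define EVENT_FMT "Discovered %08x (%s) behind %08x {hints %02X-%02X}\n"
#define NAME_MAX_CHARS 21 /* assumed from the "+21" in the count above */

/* Buffer size needed for the longest event, including the NUL. */
static int max_event_size(void)
{
    char name[NAME_MAX_CHARS + 1];

    memset(name, 'n', NAME_MAX_CHARS);
    name[NAME_MAX_CHARS] = '\0';

    /* snprintf(NULL, 0, ...) returns the length without the NUL */
    return snprintf(NULL, 0, EVENT_FMT, 0x12345678u, name,
                    0x9abcdef0u, 0xABu, 0xCDu) + 1;
}

/* Copy at most count bytes of the event: a short buffer gets a
 * truncated string instead of an error. */
static size_t copy_event(char *dst, size_t count, const char *event)
{
    size_t len = strlen(event);

    if (len > count)
        len = count;
    memcpy(dst, event, len);
    return len;
}
```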

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 28 Jan 2013 00:06:42 +0000 (19:06 -0500)]
Merge branch 'intel'

Jeff Kirsher says:

====================
This series contains updates to e1000e only.  All the updates come
from Bruce Allan and most of the patches fix or enable features on
i217/i218.  Most notable is patch 03 "e1000e: add support for IEEE-1588
PTP", which is v2 of the patch based on feedback from Stephen Hemminger.

Also patch 04 "e1000e: enable ECC on I217/I218 to catch packet buffer
memory errors" should be queued up for stable (as well as net) trees, but
the patch does not apply cleanly to either of those trees currently.
So I will work with Bruce to provide a version of the patch which will
apply cleanly to net (and stable) and we can queue it up at that point
for stable 3.5 tree.

The remaining patches are general cleanups of the code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Sat, 5 Jan 2013 05:08:37 +0000 (05:08 +0000)]
e1000e: cleanup: do not assign a variable a value when not necessary

Static analysis with cppcheck has shown a few instances of a variable
being reassigned a value before the old one has been used.  None of these
ever require the old value to be used so remove the old values.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 5 Jan 2013 05:08:31 +0000 (05:08 +0000)]
e1000e: do not ignore variables which get set a value

Static analysis with cppcheck has shown a few instances of a variable which
is assigned a value that is never used.  A number of these are the return
status of various driver function calls which should be passed back to the
caller of the current function.
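
A small sketch of the pattern being fixed, with hypothetical function names
and -5 standing in for an error code. The first variant stores the status
and then drops it; the second passes it back to the caller.

```c
/* Hypothetical helper that can fail. */
static int program_step(int fail)
{
    return fail ? -5 : 0;
}

/* Before: the status is assigned but never used, so the caller
 * always sees success. */
static int configure_ignoring_status(int fail)
{
    int ret = program_step(fail);

    (void)ret; /* value is dropped here */
    return 0;
}

/* After: the status is returned to the caller. */
static int configure(int fail)
{
    int ret = program_step(fail);

    if (ret)
        return ret;

    return 0;
}
```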

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 5 Jan 2013 03:06:54 +0000 (03:06 +0000)]
e1000e: cleanup: remove unnecessary function prototypes

...and cleanup some whitespace in other prototypes.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 4 Jan 2013 10:06:03 +0000 (10:06 +0000)]
e1000e: add comment to spinlock_t definition

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 4 Jan 2013 09:54:11 +0000 (09:54 +0000)]
e1000e: remove definition of struct which is no longer used

The e1000e driver has been converted to use extended descriptors instead of
the older legacy descriptor type.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 4 Jan 2013 09:53:19 +0000 (09:53 +0000)]
e1000e: fix PHY init workarounds for i217/i218

Toggling the LANPHYPC Value bit cycles the power on the PHY and sets it
back to power-on defaults.  This includes setting its MAC-PHY messaging
mode to use the PCIe-like interconnect, so the MAC must also be set back
from SMBus mode to PCIe mode otherwise the PHY can be inaccessible.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 4 Jan 2013 09:51:36 +0000 (09:51 +0000)]
e1000e: correct maximum frame size on i217/i218

The largest jumbo frame supported by the i217 and i218 hardware is 9018.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 1 Jan 2013 16:00:01 +0000 (16:00 +0000)]
e1000e: update copyright date

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 29 Dec 2012 09:08:50 +0000 (09:08 +0000)]
e1000e: remove prototype of non-existent function

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 12 Dec 2012 04:45:51 +0000 (04:45 +0000)]
e1000e: prevent hardware from automatically configuring PHY on I217/I218

As done with the previous generation managed 82579, prevent the PHY from
being put into an unknown state by blocking the hardware from automatically
configuring the PHY.  Instead, the driver should configure the PHY with the
contents of the EEPROM image.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 23 Jan 2013 09:00:03 +0000 (09:00 +0000)]
e1000e: enable ECC on I217/I218 to catch packet buffer memory errors

In rare instances, memory errors have been detected in the internal packet
buffer memory on I217/I218 when stressed under certain environmental
conditions.  Enable Error Correcting Code (ECC) in hardware to catch both
correctable and uncorrectable errors.  Correctable errors will be handled
by the hardware.  Uncorrectable errors in the packet buffer will cause the
packet to be received with an error indication in the buffer descriptor
causing the packet to be discarded.  If the uncorrectable error is in the
descriptor itself, the hardware will stop and interrupt the driver
indicating the error.  The driver will then reset the hardware in order to
clear the error and restart.

Both types of errors will be accounted for in statistics counters.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 19 Jan 2013 01:09:58 +0000 (01:09 +0000)]
e1000e: add support for IEEE-1588 PTP

Add PTP IEEE-1588 support and make it accessible via the PHC subsystem.

v2: make e1000e_ptp_clock_info a static const struct per Stephen Hemminger

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 8 Dec 2012 00:35:35 +0000 (00:35 +0000)]
e1000e: fix flow-control thresholds for jumbo frames on 82579/I217/I218

The previous static flow-control thresholds were causing unnecessary pause
packets to be transmitted when jumbo frames are configured, reducing the
throughput.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:25:36 +0000 (06:25 +0000)]
e1000e: fix ethtool offline register test for I217

The SHRAH[9] register on I217 has a different R/W bit-mask than RAR and
SHRAL/H registers.  Set R/W bit-mask appropriately for SHRAH[9] when
testing the R/W ability of the register.  Also, fix the error message log
format so that it does not provide misleading information (i.e. the logged
register address could be incorrect).
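
A masked read/write register test of the kind described can be sketched
against a simulated register. Everything here is illustrative: the fake_reg
type, the helper names, the test patterns and the mask values are
assumptions, not the e1000e ethtool code or the real SHRAH[9] layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated register: only bits set in hw_writable can be changed. */
struct fake_reg {
    uint32_t value;
    uint32_t hw_writable;
};

static void reg_write(struct fake_reg *r, uint32_t v)
{
    r->value = (r->value & ~r->hw_writable) | (v & r->hw_writable);
}

static uint32_t reg_read(const struct fake_reg *r)
{
    return r->value;
}

/* Write each pattern through mask and check that the masked bits
 * read back unchanged. */
static bool reg_pattern_test(struct fake_reg *r, uint32_t mask)
{
    static const uint32_t patterns[] = {
        0x5A5A5A5Au, 0xA5A5A5A5u, 0x00000000u, 0xFFFFFFFFu
    };

    for (size_t i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
        uint32_t want = patterns[i] & mask;

        reg_write(r, want);
        if ((reg_read(r) & mask) != want)
            return false;
    }
    return true;
}
```

Testing a register with a mask wider than its writable bits fails even
though the hardware is fine, which is why each register needs its own mask.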

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Sun, 27 Jan 2013 06:26:27 +0000 (01:26 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next

Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sat, 26 Jan 2013 07:50:54 +0000 (07:50 +0000)]
soreuseport: fix use of uid in tb->fastuid

Fix a reported compilation error where a variable of type kuid_t
was being set to zero.

Eliminate two instances of setting tb->fastuid to zero.  tb->fastuid is
only used if tb->fastreuseport is set, so there should be no problem if
tb->fastuid is not initialized (when tb->fastreuseport is zero).
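
The reasoning can be sketched in userspace. kuid_like_t is a stand-in for a
struct-wrapped uid type (which is why assigning a plain 0 to it does not
compile), and the bucket struct and helpers are hypothetical names rather
than the kernel's.

```c
#include <stdbool.h>

/* Stand-in for a struct-wrapped uid; "x = 0" would not compile. */
typedef struct {
    unsigned int val;
} kuid_like_t;

/* Hypothetical bind bucket with the two fields under discussion. */
struct bucket {
    bool fastreuseport;
    kuid_like_t fastuid; /* only meaningful if fastreuseport is set */
};

static bool uid_eq_like(kuid_like_t a, kuid_like_t b)
{
    return a.val == b.val;
}

/* fastuid is read only after fastreuseport has been checked, so it
 * never needs a placeholder value while fastreuseport is false. */
static bool can_fast_reuse(const struct bucket *tb, kuid_like_t uid)
{
    return tb->fastreuseport && uid_eq_like(tb->fastuid, uid);
}
```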

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Fri, 25 Jan 2013 00:00:58 +0000 (00:00 +0000)]
Remove leftover #endif after introducing SO_REUSEPORT

Commit 055dc21a1d (soreuseport: infrastructure) removed the #if 0
around SO_REUSEPORT without removing the corresponding #endif
thus causing the header guard to close early.

Cc: Tom Herbert <therbert@google.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Fri, 25 Jan 2013 10:20:41 +0000 (10:20 +0000)]
qlcnic: Bump up the version to 5.1.32

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Fri, 25 Jan 2013 10:20:40 +0000 (10:20 +0000)]
qlcnic: add support for FDB netdevice ops.

Provide a communication channel between KVM and the e-Switch so that it
can be informed when the hypervisor configures a MAC address and VLAN.

The qlcnic_mac_learn module param usage changes to:
0 = MAC learning is disabled
1 = Driver learning is enabled
2 = FDB learning is enabled
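
The three parameter values map naturally onto an enum. The sketch below
uses hypothetical identifier names; only the numeric values and their
meanings come from the list above.

```c
#include <stdbool.h>

/* Values of the qlcnic_mac_learn parameter as listed above. */
enum mac_learn_mode {
    MAC_LEARN_DISABLED = 0,
    MAC_LEARN_DRIVER   = 1,
    MAC_LEARN_FDB      = 2
};

static bool mac_learn_is_valid(int v)
{
    return v >= MAC_LEARN_DISABLED && v <= MAC_LEARN_FDB;
}

static const char *mac_learn_describe(int v)
{
    switch (v) {
    case MAC_LEARN_DISABLED:
        return "MAC learning disabled";
    case MAC_LEARN_DRIVER:
        return "driver learning enabled";
    case MAC_LEARN_FDB:
        return "FDB learning enabled";
    default:
        return "invalid value";
    }
}
```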

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Fri, 25 Jan 2013 10:20:39 +0000 (10:20 +0000)]
qlcnic: sleeping function called from invalid context

device eth0 entered promiscuous mode
BUG: sleeping function called from invalid context at mm/slub.c:930
in_atomic(): 1, irqs_disabled(): 0, pid: 5911, name: brctl
INFO: lockdep is turned off.
Pid: 5911, comm: brctl Tainted: GF       W  O 3.6.0-0.rc7.git1.4.fc18.x86_64 #1
Call Trace:
[<ffffffff810a29ca>] __might_sleep+0x18a/0x240
[<ffffffff811b5d77>] __kmalloc+0x67/0x2d0
[<ffffffffa00a61a9>] ? qlcnic_alloc_lb_filters_mem+0x59/0xa0 [qlcnic]
[<ffffffffa00a61a9>] qlcnic_alloc_lb_filters_mem+0x59/0xa0 [qlcnic]
[<ffffffffa009e1c1>] qlcnic_set_multi+0x81/0x100 [qlcnic]
[<ffffffff8159cccf>] __dev_set_rx_mode+0x5f/0xb0
[<ffffffff8159cd4f>] dev_set_rx_mode+0x2f/0x50
[<ffffffff8159d00c>] dev_set_promiscuity+0x3c/0x50
[<ffffffffa05ed728>] br_add_if+0x1e8/0x400 [bridge]
[<ffffffffa05ee2df>] add_del_if+0x5f/0x90 [bridge]
[<ffffffffa05eee0b>] br_dev_ioctl+0x4b/0x90 [bridge]
[<ffffffff8159d613>] dev_ifsioc+0x373/0x3b0
[<ffffffff8159d78f>] dev_ioctl+0x13f/0x860
[<ffffffff812dd6e1>] ? avc_has_perm_flags+0x31/0x2c0
[<ffffffff8157c18d>] sock_do_ioctl+0x5d/0x70
[<ffffffff8157c21d>] sock_ioctl+0x7d/0x2c0
[<ffffffff812df922>] ? inode_has_perm.isra.48.constprop.61+0x62/0xa0
[<ffffffff811e4979>] do_vfs_ioctl+0x99/0x5a0
[<ffffffff812df9f7>] ? file_has_perm+0x97/0xb0
[<ffffffff810d716d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff811e4f19>] sys_ioctl+0x99/0xa0
[<ffffffff816e7369>] system_call_fastpath+0x16/0x1b
br0: port 1(eth0) entered forwarding state
br0: port 1(eth0) entered forwarding state

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Himanshu Madhani [Fri, 25 Jan 2013 10:20:38 +0000 (10:20 +0000)]
qlcnic: Fix LED/Beaconing tests to work on all ports of an adapter.

Provide port number in command payload for LED/Beaconing tests.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Fri, 25 Jan 2013 10:20:37 +0000 (10:20 +0000)]
qlcnic: avoid mixed mode interrupts for some adapter types

o Some adapter types do not support co-existence of Legacy Interrupt with
  MSI-x or MSI among multiple functions. For those adapters, prevent attaching
  to a function during normal load, if MSI-x or MSI vectors are not available.
o Using module parameters use_msi=0 and use_msi_x=0, driver can be loaded in
  legacy mode for all functions in the adapter.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 25 Jan 2013 10:20:36 +0000 (10:20 +0000)]
qlcnic: enable RSS for TCP over IPv6

o Enable RSS for TYPE-C packets so that RSS works for TCP over IPv6.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 25 Jan 2013 10:20:35 +0000 (10:20 +0000)]
qlcnic: enable LRO on IPv6 without dest ip check

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sritej Velaga [Fri, 25 Jan 2013 10:20:34 +0000 (10:20 +0000)]
qlcnic: set driver version in firmware

Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 27 Jan 2013 05:56:10 +0000 (00:56 -0500)]
Merge branch 'master' of git://1984.lsi.us.es/nf-next

Pablo Neira Ayuso says:

====================
This batch contains netfilter updates for your net-next tree, they are:

* The new connlabel extension for x_tables, that allows us to attach
  labels to each conntrack flow. The kernel implementation uses a
  bitmask and there's a file in user-space that maps the bits with the
  corresponding string for each existing label. By now, you can attach
  up to 128 overlapping labels. From Florian Westphal.

* A new round of improvements for the netns support for conntrack.
  Gao feng has moved much of the initialization code of each module
  out of the netns init path. He also did several code refactorings,
  so the code looks cleaner to me now.

* Added documentation for all possible tweaks for nf_conntrack via
  sysctl, from Jiri Pirko.

* Cisco 7941/7945 IP phone support for our SIP conntrack helper,
  from Kevin Cernekee.

* Missing header file in the snmp helper, from Stephen Hemminger.

* Finally, a couple of fixes to resolve minor issues with these
  changes, from myself.
====================
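
The connlabel item above describes a bitmask that holds up to 128 labels
per flow. A userspace sketch of such a bitmask follows; the type and helper
names are hypothetical and this is not the x_tables implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_LABELS 128 /* limit stated in the description above */

/* 128 label bits stored in two 64-bit words. */
struct label_set {
    uint64_t words[MAX_LABELS / 64];
};

/* Attach a label; returns false if the bit number is out of range. */
static bool label_attach(struct label_set *s, unsigned int bit)
{
    if (bit >= MAX_LABELS)
        return false;

    s->words[bit / 64] |= UINT64_C(1) << (bit % 64);
    return true;
}

/* Check whether a label is attached. */
static bool label_test(const struct label_set *s, unsigned int bit)
{
    if (bit >= MAX_LABELS)
        return false;

    return (s->words[bit / 64] >> (bit % 64)) & 1;
}
```

Because each label is one bit, any combination of labels can be attached
to the same flow at once.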

Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Rose [Sat, 19 Jan 2013 06:40:22 +0000 (06:40 +0000)]
ixgbevf: Fix link speed message to support 100Mbps

The X540 can link at 100Mbps - fix the link speed indicator message to
show that value.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Donald Dutile [Tue, 11 Dec 2012 08:26:48 +0000 (08:26 +0000)]
ixgbe: Limit number of reported VFs to device specific value

ixgbe claims it supports 64 VFs in its SRIOV capability
structure, but the driver only supports 63.  Adjust it
so sysfs sriov configuration checking will check with
the proper totalvf value.

Signed-off-by: Donald Dutile <ddutile@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 11 Dec 2012 08:26:43 +0000 (08:26 +0000)]
ixgbe: Implement PCI SR-IOV sysfs callback operation

Implement callbacks in the driver for the new PCI bus driver
interface that allows the user to enable/disable SR-IOV VFs
in a device via the sysfs interface.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
CC: Don Dutile <ddutile@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 11 Dec 2012 08:26:38 +0000 (08:26 +0000)]
ixgbe: Modularize SR-IOV enablement code

In preparation for enable/disable of SR-IOV via the PCI sysfs interface
move some core SR-IOV enablement code that would be common to module
parameter usage or callback from the PCI bus driver to a separate
function so that it can be used by either method.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
CC: Don Dutile <ddutile@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 11 Dec 2012 08:26:33 +0000 (08:26 +0000)]
ixgbe: Make mailbox ops initialization unconditional

Initialization of the mailbox ops does not actually depend on whether
SR-IOV is enabled, and it doesn't hurt to go ahead and initialize the ops
unconditionally.  Move the initialization into the device probe so that
the mailbox ops are initialized at the time we have the board info
necessary to do it.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
CC: Don Dutile <ddutile@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 8 Dec 2012 09:04:25 +0000 (09:04 +0000)]
ixgbe: only compile ixgbe_debugfs.o when enabled

This patch modifies ixgbe_debugfs.c and the Makefile for the ixgbe
driver to only compile the file when the config is enabled. This means
we can remove the #ifdef inside the ixgbe_debugfs.c file.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 5 Dec 2012 06:51:29 +0000 (06:51 +0000)]
ixgbe: Inline Rx PTP descriptor handling

This change is meant to inline the Rx PTP descriptor handling.  The main
motivation is to avoid unnecessary jumps into function calls that we then
immediately exit because we are not performing timestamps.

The net result of this change is that ixgbe_ptp_rx_tstamp drops from 0.5% CPU
utilization in my performance runs to 0%, and the only value tested is the Rx
descriptor which should already be warm in the cache if not stored in a
register.
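
The idea can be sketched as a cheap inline test in front of the real
handler. The bit position, the counter and the function names below are
hypothetical; only the shape (check a descriptor bit inline, call out only
when a timestamp is present) reflects the description.

```c
#include <stdint.h>

#define RX_STAT_TS (1u << 16) /* hypothetical "timestamp present" bit */

static int slow_path_calls; /* counts entries into the slow path */

/* Out-of-line handler: does the actual timestamp work. */
static void rx_tstamp_slow(uint32_t status)
{
    (void)status;
    slow_path_calls++;
}

/* Inline wrapper: packets without a timestamp never leave the caller. */
static inline void rx_tstamp(uint32_t status)
{
    if (!(status & RX_STAT_TS))
        return;

    rx_tstamp_slow(status);
}
```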

Cc: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jacob Keller <Jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 5 Dec 2012 07:53:38 +0000 (07:53 +0000)]
ixgbe: Fix overwriting of rx_mtrl in ixgbe_ptp_hwtstamp_ioctl

This patch corrects a bug introduced by commit f3444d8b. The rxmtrl value for
the UDP port to timestamp on was moved above the switch statement, but was
overwritten to 0 if the ioctl selected one of the V1 filters.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 1 Dec 2012 07:57:17 +0000 (07:57 +0000)]
ixgbe: add warning when scheduling reset

This patch adds warnings when a reset of the adapter is scheduled so that the
user can see a log of why the reset occurred.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 5 Dec 2012 07:24:46 +0000 (07:24 +0000)]
ixgbe: Add ptp work item to poll for the Tx timestamp

This patch copies the igb implementation of Tx timestamps, which uses a work
item to poll for the Tx timestamp. In addition it adds a timeout value of 15
seconds, after which it will stop polling.

This is necessary due to an issue with the descriptor being marked done before
the Tx timestamp event has occurred. These two events don't correlate, so using
the done bit on the descriptor as indication that the timestamp must already
have been taken leads to potentially dropped Tx timestamps (especially under
heavy packet load).
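
One polling step of such a work item could look like the sketch below. It
uses whole seconds instead of jiffies and a simulated hardware state; the
struct, the enum and the function name are hypothetical, and only the
15-second timeout comes from the description.

```c
#include <stdbool.h>
#include <stdint.h>

#define TX_TS_TIMEOUT_S 15 /* stop polling after this many seconds */

enum poll_result { POLL_DONE, POLL_AGAIN, POLL_TIMED_OUT };

/* Simulated state: when polling started and what the hardware holds. */
struct tx_ts_state {
    uint64_t start_s;
    bool hw_ready;
    uint64_t hw_stamp;
};

/* One polling step: deliver the timestamp if the hardware has latched
 * it, give up after the timeout, otherwise ask to be run again. */
static enum poll_result tx_ts_poll(const struct tx_ts_state *st,
                                   uint64_t now_s, uint64_t *stamp)
{
    if (st->hw_ready) {
        *stamp = st->hw_stamp;
        return POLL_DONE;
    }

    if (now_s - st->start_s >= TX_TS_TIMEOUT_S)
        return POLL_TIMED_OUT;

    return POLL_AGAIN;
}
```

The key point is that readiness is taken from the timestamp state itself,
not from the descriptor's done bit.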

Reported-by: Matthew Vick <matthew.vick@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 5 Dec 2012 07:24:41 +0000 (07:24 +0000)]
ixgbe: Use watchdog check in favor of BPF for detecting latched timestamp

This patch removes ixgbe_ptp_match, and the corresponding packet filtering, from
the ixgbe driver. This code was previously causing some issues within the hotpath
of the driver. However, the code also provided a check against a possibly frozen
Rx timestamp due to dropped packets when the Rx ring is full. This patch provides
a replacement solution based on the watchdog.

To this end, whenever a packet consumes the Rx timestamp it stores the jiffy
value in the rx_ring structure. Watchdog updates its own jiffy timer whenever
there is no valid timestamp in the registers.

If watchdog detects a valid timestamp in the registers, (meaning that no Rx
packet has consumed it yet) it will check which time is most recent, the last
time in the watchdog, or any time in the rx_rings. If the most recent "event"
was more than 5 seconds ago, it will flush the Rx timestamp and print a warning
message to the syslog.
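
The decision the watchdog makes can be sketched as follows. The sketch
counts in whole seconds rather than jiffies, assumes time never runs
backwards, and uses hypothetical names throughout; it is not the ixgbe
code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_TS_STALE_S 5 /* threshold from the description above */

/* Most recent of the watchdog's own time and every ring's last use. */
static uint64_t latest_event(uint64_t last_check,
                             const uint64_t *ring_last, size_t n)
{
    uint64_t latest = last_check;

    for (size_t i = 0; i < n; i++)
        if (ring_last[i] > latest)
            latest = ring_last[i];
    return latest;
}

/* Returns true when a latched Rx timestamp should be flushed. */
static bool rx_ts_is_stale(bool valid_in_regs, uint64_t now,
                           uint64_t *last_check,
                           const uint64_t *ring_last, size_t n)
{
    if (!valid_in_regs) {
        /* nothing latched: just refresh the watchdog's own time */
        *last_check = now;
        return false;
    }

    return now - latest_event(*last_check, ring_last, n) > RX_TS_STALE_S;
}
```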

Reported-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 5 Dec 2012 07:24:35 +0000 (07:24 +0000)]
ixgbe: Update ptp_overflow check comment and jiffies

This patch fixes the comment on ptp_overflow_check to match up with what is
currently used as the parameters. Also change the jiffies check to use
time_is_after_jiffies macro.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoixgbe: add missing supported filters to get_ts_info
Jacob Keller [Thu, 15 Nov 2012 01:10:37 +0000 (01:10 +0000)]
ixgbe: add missing supported filters to get_ts_info

This patch updates the filters for ethtool's get_ts_info to return support for
all filters which can be supported by upscaling to ptp_v2_event. The reasoning
behind this change is that we do in fact support these filters
(hwtstamp_ioctl returns success after setting the filter to the upscaled
version). In this way we remain consistent between the filters reported as
supported via the get_ts_info ioctl and the filters that are actually supported
in practice.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoixgbe: ethtool ixgbe_diag_test cleanup
Jacob Keller [Thu, 8 Nov 2012 07:07:08 +0000 (07:07 +0000)]
ixgbe: ethtool ixgbe_diag_test cleanup

This patch cleans up the ethtool diagnostics test by ensuring that the tests
work properly regardless of what state the adapter was in. The SRIOV VF check is
done at the beginning, forgoing the link test. The if_running -> dev_close is
moved before the link test, as well as a call to enable the Tx laser. This
ensures that the link test will return valid results even when the adapter was
previously down. Also, a call to disable the Tx laser is added if the device
was down before the start. This ensures consistent behavior of the Tx laser
before and after the diagnostic checks. The end result is consistent behavior
regardless of device state.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoMerge branch 'testing' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert...
David S. Miller [Wed, 23 Jan 2013 19:00:16 +0000 (14:00 -0500)]
Merge branch 'testing' of git://git./linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
1) Add a statistic counter for invalid output states and
   remove a superfluous state valid check, from Li RongQing.

2) Probe for asynchronous block ciphers instead of synchronous block
   ciphers to make the asynchronous variants available even if no
   synchronous block ciphers are found, from Jussi Kivilinna.

3) Make rfc3686 asynchronous block cipher and make use of
   the new asynchronous variant, from Jussi Kivilinna.

4) Replace some rwlocks by rcu, from Cong Wang.

5) Remove some unused defines.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: SR-IOV version compatibility bugfix
Ariel Elior [Wed, 23 Jan 2013 03:21:54 +0000 (03:21 +0000)]
bnx2x: SR-IOV version compatibility bugfix

When posting a message on the bulletin board, the PF calculates a CRC
over the message and places the result in the message. When the VF
samples the bulletin board, it copies the message aside and validates
this CRC. The length of the message is crucial here and must be the
same for both parties. Since the PF is running in the hypervisor and
the VF is running in a VM, they can be of different versions.
As the bulletin board is designed to grow in future versions,
the length used by the VF must not be the size of its message structure;
instead, it should be a field in the message itself.
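
As a rough userspace illustration of the idea (not the bnx2x code), the sketch
below has a newer sender stamp its own structure size into the message, and an
older receiver validate the CRC over that stated length instead of over the
size of the structure it was compiled with. The structure layouts, field names,
helper names and the choice of a plain CRC-32 are all assumptions made for this
example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical older layout: the only one the receiver knows about. */
struct bulletin_v1 {
    uint32_t crc;      /* CRC over the bytes that follow this field */
    uint32_t length;   /* total message length, written by the sender */
    uint8_t  mac[8];
};

/* Hypothetical newer layout: same prefix, one field appended. */
struct bulletin_v2 {
    uint32_t crc;
    uint32_t length;
    uint8_t  mac[8];
    uint32_t vlan;
};

/* Plain bitwise CRC-32 (reflected, polynomial 0xEDB88320). */
static uint32_t crc32_bytes(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* CRC over everything after the crc field, up to the stated length. */
static uint32_t bulletin_crc(const void *msg, uint32_t length)
{
    return crc32_bytes((const uint8_t *)msg + sizeof(uint32_t),
                       length - sizeof(uint32_t));
}

/* Sender side: record our own structure size, then the CRC. */
static void pf_post_v2(struct bulletin_v2 *b)
{
    b->length = sizeof(*b);
    b->crc = bulletin_crc(b, b->length);
}

/* Receiver side: buf is the sampled copy, buf_size its capacity. */
static bool vf_validate(const void *buf, size_t buf_size)
{
    struct bulletin_v1 hdr;   /* the receiver only understands this prefix */

    if (buf_size < sizeof(hdr))
        return false;
    memcpy(&hdr, buf, sizeof(hdr));

    /* Trust the length in the message, bounded by what was actually copied. */
    if (hdr.length < sizeof(hdr) || hdr.length > buf_size)
        return false;
    return hdr.crc == bulletin_crc(buf, hdr.length);
}
```

Had the receiver used sizeof(struct bulletin_v1) as the length, it would have
computed the CRC over fewer bytes than the sender did.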

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Fix compilation with stop-on-error
Yuval Mintz [Wed, 23 Jan 2013 03:21:53 +0000 (03:21 +0000)]
bnx2x: Fix compilation with stop-on-error

Commit 823e1d9 caused bnx2x to fail once BNX2X_STOP_ON_ERROR is set.
Fixes compilation by moving function declarations between header files.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocnic, bnx2x: Add CNIC_DRV_STATE_HANDLES_IRQ to ethdev->drv_state
Michael Chan [Wed, 23 Jan 2013 03:21:52 +0000 (03:21 +0000)]
cnic, bnx2x: Add CNIC_DRV_STATE_HANDLES_IRQ to ethdev->drv_state

In INTA mode, cnic and bnx2x share the same IRQ.  During chip reset,
for example, cnic will stop servicing IRQs after it has shutdown the
cnic hardware resources.  However, the shared IRQ is still active as
bnx2x needs to finish the reset.  There is a window when bnx2x does
not know that cnic is no longer handling IRQ and things don't always
work properly.

Add a flag to tell bnx2x that cnic is handling IRQ.  The flag is set
before the first cnic IRQ is expected and cleared when no more cnic
IRQs are expected, so there should be no race conditions.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: correct memory release scheme
Yuval Mintz [Wed, 23 Jan 2013 03:21:51 +0000 (03:21 +0000)]
bnx2x: correct memory release scheme

Fix an incorrect SR-IOV memory release which was committed in 1ab4434.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Remove many sparse warnings
Yuval Mintz [Wed, 23 Jan 2013 03:21:50 +0000 (03:21 +0000)]
bnx2x: Remove many sparse warnings

Remove most of the sparse warnings in the bnx2x compilation
(i.e., those reported when compiling with `C=2 CF=-D__CHECK_ENDIAN__').

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Modify unload conditions
Yuval Mintz [Wed, 23 Jan 2013 03:21:49 +0000 (03:21 +0000)]
bnx2x: Modify unload conditions

Don't unload the bnx2x driver if it's in a recovery process, or if
the previous load has failed.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Correct memory preparation and release
Dmitry Kravkov [Wed, 23 Jan 2013 03:21:48 +0000 (03:21 +0000)]
bnx2x: Correct memory preparation and release

Since commit 15192a8cf there has been a memory leak upon rmmod
of the bnx2x driver.

This corrects the memory leak and corrects the zeroing of internal
memories upon driver load.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Add missing VFs reference in macros
Yuval Mintz [Wed, 23 Jan 2013 03:21:47 +0000 (03:21 +0000)]
bnx2x: Add missing VFs reference in macros

Add missing 57712_VF and 57800_VF to CHIP_IS_E2 and CHIP_IS_E3
macros (missing from commit 8395be5).

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Add additional debug information
Yuval Mintz [Wed, 23 Jan 2013 03:21:46 +0000 (03:21 +0000)]
bnx2x: Add additional debug information

Add/Revise several debug prints in the bnx2x driver - on regular flows
as well as error flows.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: correct usleep_range usage
Yuval Mintz [Wed, 23 Jan 2013 03:21:45 +0000 (03:21 +0000)]
bnx2x: correct usleep_range usage

Change the incorrect usage of `usleep_range(1000, 1000)' into
`usleep_range(1000, 2000)'.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: reorganization and beautification
Yuval Mintz [Wed, 23 Jan 2013 03:21:44 +0000 (03:21 +0000)]
bnx2x: reorganization and beautification

Slightly changes the bnx2x code without `true' functional changes.
Changes include:
 1. Gathering macros into a single macro when combination is used multiple
    times.
 2. Exporting parts of functions into their own functions.
 3. Return values after if-else instead of only on the else condition
    (where current flow would simply return same value later in the code)
 4. Removing some unnecessary code (either dead-code or incorrect conditions)

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Semantic renovation
Yuval Mintz [Wed, 23 Jan 2013 03:21:43 +0000 (03:21 +0000)]
bnx2x: Semantic renovation

Mostly corrects white spaces, indentations, and comments.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agogianfar: Restore promisc mode on gfar_init_mac()
Claudiu Manoil [Wed, 23 Jan 2013 00:18:36 +0000 (00:18 +0000)]
gianfar: Restore promisc mode on gfar_init_mac()

Reactivate promiscuous mode in H/W upon gfar_init_mac(), if the
net dev requires it (IFF_PROMISC flag set).
This way the promisc mode is preserved across device reset conditions
such as tx timeout, device restore, and so on.

Signed-off-by: Voncken C Acksys <cedric.voncken@acksys.fr>
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'soreuseport'
David S. Miller [Wed, 23 Jan 2013 18:44:10 +0000 (13:44 -0500)]
Merge branch 'soreuseport'

Tom Herbert says:

====================
This series implements so_reuseport (SO_REUSEPORT socket option) for
TCP and UDP.  For TCP, so_reuseport allows multiple listener sockets
to be bound to the same port.  In the case of UDP, so_reuseport allows
multiple sockets to bind to the same port.  To prevent port hijacking
all sockets bound to the same port using so_reuseport must have the
same uid.  Received packets are distributed to multiple sockets bound
to the same port using a 4-tuple hash.

The motivating case for so_reuseport in TCP would be something like
a web server binding to port 80 running with multiple threads, where
each thread might have its own listener socket.  This could be done
as an alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }), as wakeup does not promote fairness
among the sockets.  We have seen the disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reuseport the distribution is
uniform.

The TCP implementation has a problem in that the request sockets for a
listener are attached to a listener socket.  If a SYN is received, a
listener socket is chosen and request structure is created (SYN-RECV
state).  If the subsequent ack in 3WHS does not match the same port
by so_reuseport, the connection state is not found (reset) and the
request structure is orphaned.  This scenario would occur when the
number of listener sockets bound to a port changes (new ones are
added, or old ones closed).  We are looking for a solution to this,
maybe allow multiple sockets to share the same request table...

The motivating case for so_reuseport in UDP would be something like a
DNS server.  An alternative would be to recv on the same socket from
multiple threads.  As in the case of TCP, the load across these threads
tends to be disproportionate and we also see a lot of contention on
the socket lock.  Note that SO_REUSEADDR already allows multiple UDP
sockets to bind to the same port, however there is no provision to
prevent hijacking and nothing to distribute packets across all the
sockets sharing the same bound port.  This patch does not change the
semantics of SO_REUSEADDR, but provides usable functionality of it
for unicast.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosoreuseport: UDP/IPv6 implementation
Tom Herbert [Tue, 22 Jan 2013 09:50:44 +0000 (09:50 +0000)]
soreuseport: UDP/IPv6 implementation

Motivation for soreuseport would be something like a DNS server.  An
alternative would be to recv on the same socket from multiple threads.
As in the case of TCP, the load across these threads tends to be
disproportionate and we also see a lot of contention on the socket lock.
Note that SO_REUSEADDR already allows multiple UDP sockets to bind to
the same port, however there is no provision to prevent hijacking and
nothing to distribute packets across all the sockets sharing the same
bound port.  This patch does not change the semantics of SO_REUSEADDR,
but provides usable functionality of it for unicast.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosoreuseport: TCP/IPv6 implementation
Tom Herbert [Tue, 22 Jan 2013 09:50:39 +0000 (09:50 +0000)]
soreuseport: TCP/IPv6 implementation

Motivation for soreuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have its own listener socket.  This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }), as wakeup does not promote fairness
among the sockets.  We have seen the disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reuseport the distribution is
uniform.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosoreuseport: UDP/IPv4 implementation
Tom Herbert [Tue, 22 Jan 2013 09:50:32 +0000 (09:50 +0000)]
soreuseport: UDP/IPv4 implementation

Allow multiple UDP sockets to bind to the same port.

Motivation for soreuseport would be something like a DNS server.  An
alternative would be to recv on the same socket from multiple threads.
As in the case of TCP, the load across these threads tends to be
disproportionate and we also see a lot of contention on the socket lock.
Note that SO_REUSEADDR already allows multiple UDP sockets to bind to
the same port, however there is no provision to prevent hijacking and
nothing to distribute packets across all the sockets sharing the same
bound port.  This patch does not change the semantics of SO_REUSEADDR,
but provides usable functionality of it for unicast.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosoreuseport: TCP/IPv4 implementation
Tom Herbert [Tue, 22 Jan 2013 09:50:24 +0000 (09:50 +0000)]
soreuseport: TCP/IPv4 implementation

Allow multiple listener sockets to bind to the same port.

Motivation for soreuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have its own listener socket.  This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }), as wakeup does not promote fairness
among the sockets.  We have seen the disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reuseport the distribution is
uniform.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosoreuseport: infrastructure
Tom Herbert [Tue, 22 Jan 2013 09:49:50 +0000 (09:49 +0000)]
soreuseport: infrastructure

Definitions and macros for implementing soreuseport.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxen-netback: allow changing the MAC address of the interface
Matt Wilson [Tue, 22 Jan 2013 08:08:25 +0000 (08:08 +0000)]
xen-netback: allow changing the MAC address of the interface

Sometimes it is useful to be able to change the MAC address of the
interface for netback devices. For example, when using ebtables it may
be useful to be able to distinguish traffic from different interfaces
without depending on the interface name.

Reported-by: Nikita Borzykh <sample.n@gmail.com>
Reported-by: Paul Harvey <stockingpaul@hotmail.com>
Cc: netdev@vger.kernel.org
Cc: xen-devel@lists.xen.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonetfilter: nf_conntrack: fix compilation if sysctl are disabled
Pablo Neira Ayuso [Wed, 23 Jan 2013 14:12:25 +0000 (15:12 +0100)]
netfilter: nf_conntrack: fix compilation if sysctl are disabled

In (f94161c netfilter: nf_conntrack: move initialization out of pernet
operations), some ifdefs were missing for sysctl dependent code.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_conntrack: refactor l4proto support for netns
Gao feng [Wed, 23 Jan 2013 11:51:10 +0000 (12:51 +0100)]
netfilter: nf_conntrack: refactor l4proto support for netns

Move the code that registers/unregisters l4proto to the
module_init/exit context.

Given that we have to modify some interfaces to accommodate
these changes, it is a good time to use shorter function names
for this using the nf_ct_* prefix instead of nf_conntrack_*,
that is:

nf_ct_l4proto_register
nf_ct_l4proto_pernet_register
nf_ct_l4proto_unregister
nf_ct_l4proto_pernet_unregister

We save many line breaks with it.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_conntrack: refactor l3proto support for netns
Gao feng [Mon, 21 Jan 2013 22:10:33 +0000 (22:10 +0000)]
netfilter: nf_conntrack: refactor l3proto support for netns

Move the code that registers/unregisters l3proto to the
module_init/exit context.

Given that we have to modify some interfaces to accommodate
these changes, it is a good time to use shorter function names
for this using the nf_ct_* prefix instead of nf_conntrack_*,
that is:

nf_ct_l3proto_register
nf_ct_l3proto_pernet_register
nf_ct_l3proto_unregister
nf_ct_l3proto_pernet_unregister

We save many line breaks with it.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_proto: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:32 +0000 (22:10 +0000)]
netfilter: nf_ct_proto: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_labels: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:31 +0000 (22:10 +0000)]
netfilter: nf_ct_labels: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_helper: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:30 +0000 (22:10 +0000)]
netfilter: nf_ct_helper: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_timeout: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:29 +0000 (22:10 +0000)]
netfilter: nf_ct_timeout: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_ecache: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:28 +0000 (22:10 +0000)]
netfilter: nf_ct_ecache: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_tstamp: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:27 +0000 (22:10 +0000)]
netfilter: nf_ct_tstamp: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_acct: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:26 +0000 (22:10 +0000)]
netfilter: nf_ct_acct: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_ct_expect: move initialization out of pernet_operations
Gao feng [Mon, 21 Jan 2013 22:10:25 +0000 (22:10 +0000)]
netfilter: nf_ct_expect: move initialization out of pernet_operations

Move the global initialization code to the module_init/exit context.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetfilter: nf_conntrack: move initialization out of pernet operations
Gao feng [Mon, 21 Jan 2013 22:10:24 +0000 (22:10 +0000)]
netfilter: nf_conntrack: move initialization out of pernet operations

nf_conntrack initialization and cleanup code currently runs in the pernet
operations functions. This task should be done in module_init/exit.
We can't use init_net to identify if it's the right time to initialize
or clean up, since we cannot make assumptions about the order in which
netns are created/destroyed.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 years agonetpoll: fix an uninitialized variable
Cong Wang [Tue, 22 Jan 2013 17:39:11 +0000 (17:39 +0000)]
netpoll: fix an uninitialized variable

Fengguang reported:

   net/core/netpoll.c: In function 'netpoll_setup':
   net/core/netpoll.c:1049:6: warning: 'err' may be used uninitialized in this function [-Wmaybe-uninitialized]

In the !CONFIG_IPV6 case, we may error out without initializing
'err'.
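
The pattern behind this kind of warning can be reproduced in a few lines of
ordinary C. The sketch below is not the netpoll code: setup_sketch() and
SKETCH_HAVE_IPV6 are hypothetical stand-ins, and returning -EINVAL on the
unsupported path is an assumption of this example, not something stated above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a config switch; deliberately left undefined. */
/* #define SKETCH_HAVE_IPV6 1 */

static int setup_sketch(bool want_ipv6)
{
    int err;

    if (want_ipv6) {
#ifdef SKETCH_HAVE_IPV6
        err = 0;        /* IPv6-specific setup would go here */
#else
        /*
         * Without this assignment, the early exit below would return
         * whatever happened to be in 'err', which is what the compiler
         * warning is pointing at.
         */
        err = -EINVAL;
        goto out;
#endif
    } else {
        err = 0;        /* IPv4-specific setup would go here */
    }

out:
    return err;
}
```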

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv6: remove duplicated declaration of ip6_fragment()
Cong Wang [Tue, 22 Jan 2013 17:22:07 +0000 (17:22 +0000)]
ipv6: remove duplicated declaration of ip6_fragment()

It is declared in:
include/net/ip6_route.h:187:int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *));

and net/ip6_route.h is already included.

Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'legacy-isa-delete' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Tue, 22 Jan 2013 19:47:13 +0000 (14:47 -0500)]
Merge branch 'legacy-isa-delete' of git://git./linux/kernel/git/paulg/linux

Paul Gortmaker says:

====================
The Ethernet-HowTo was maintained for roughly 10 years, from 1993 to 2003.
Fortunately sane hardware probing and auto detection (via PCI and ISA/PnP)
largely made the document a relic of the past, hence it being abandoned
a decade ago.

However, there is one last useful thing that we can extract from the
effort made in maintaining that document.  We can use it to guide us
with respect to what rare, experimental and/or super ancient 10Mbit
ISA drivers don't make sense to maintain in-tree anymore.

Nobody will dispute that ISA is obsolete.  Availability went away at about
the time Pentium3 motherboards moved from 500MHz Slot1/SECC processors
to the green 500MHz Socket 370 Pentium3 chips, at the turn of the century.

In theory, it is possible that someone could still be running one of these
12+ year old P3 machines and want 3.9+ bleeding edge kernels (but unlikely).
In light of the above (remote) possibility, we can defer the removal of some
ISA network drivers that were highly popular and well tested.  Typically
that means the stuff more from the mid to late '90s, some with ISA PnP
support, like the 3c509, the wd/SMC 8390 based stuff, PCnet/lance etc.

But a lot of other drivers, typically from the early 1990s were for rare
hardware, and experimental (to the point of requiring a cron job that would
do a test ping, and then ifconfig down/up and/or a rmmod/insmod!).  And
some of these drivers (znet, and lp486e to name two) are physically tied
to platforms with on motherboard ethernet -- of 486 machines that date
from the early 1990s and can only have single digit amounts of memory.

What I'd like to achieve here with this series, is to get rid of those old
drivers that are no longer being used.  In an earlier discussion where
I'd proposed deleting a single driver, Alan suggested we instead dump
all the historical stuff in one go, to make it "...immediately obvious
where the break point is..."[1] and that it was "perfectly reasonable it
(and a pile of other ISA cards) ought to be shown the door"[2].  So that
is the goal here - make a clear line in the sand where the really ancient
stuff finally gets kicked to the curb.

Two old parallel port drivers are considered for removal here as well,
since in early 386/486 ISA machines, the parallel port was typically found
with the UARTS on the multi-I/O ISA controller card.  These drivers also date
from the early 1990's; parallel ports are no longer found on modern boards,
and their performance was not even capable of 10% of 10Mbit bandwidth.

Allow me a preemptive justification against the inevitable comments from
well meaning bystanders who suggest "why not just leave all this alone?".
Dead drivers cost us all if they are left in tree.  If you think that
is false, then please first consider:

-every time you type "git status", you are checking to see if modifications
 have been made by you to all that dead code.

-every time you type "git grep <regex>" you are searching through files
 which contain that dead code that simply does not interest you.

-every time you build an "allyesconfig" and an "allmodconfig" (don't tell
 me you skip this step before submitting your changes to a maintainer),
 you waste CPU cycles building this dead code.

-every time there is a tree wide API change, or cleanup, or file relocation,
 we pay the cost of updating dead code, or moving dead code.

-daily regression tests (take linux-next as the most transparent
 example) spend time building (and possibly running) this dead code.

-hard working people who regularly run auditing tools looking for lurking
 bugs (sparse/coverity/smatch/coccinelle) are wasting time checking for,
 and fixing bugs in this dead code.

This last one is key.  Please take a look at the git history for the
files that are proposed for removal here.  Look at the git history for
any one of them ("git whatchanged --follow drivers/net/.../driver.c")
Mentally sort the changes into two bins -- (1) the robotic tree-wide
changes, and (2) the "look I found a real run-time bug while using this"
category.  You will see that category #2 is essentially empty.

Further to that, realize that drivers don't simply disappear.  We are
not operating in the binary-only distribution space like other OS.  All
these drivers remain in the git history forever.  If a person is an
enthusiast for extreme legacy hardware, they are probably already
customizing their kernel source and building it themselves to support
such systems.  Also keep in mind that they could still build the 3.8
kernel exactly as-is, and run it (or a 3.8.x stable variant of it) for
several more years if they were really determined to cling to these old
experimental ISA drivers for some reason.

In summary, I hope that folks can be pragmatic about this, and not
get swept up in nostalgia.  Ask yourself whether it is realistic to
expect a person would have a genuine use case where they would
need to build a 3.9+ modern kernel and install it on some legacy hardware
that has no option but to absolutely _require_ one of the drivers
that are deleted here.

The following series was created with --irreversible-delete for
ease of review (it skips showing the content of files that are
deleted); however the complete patches can be pulled as per below.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonetfilter: Use IS_ERR_OR_NULL().
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 22 Jan 2013 06:33:09 +0000 (06:33 +0000)]
netfilter: Use IS_ERR_OR_NULL().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv6: Use IS_ERR_OR_NULL().
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 22 Jan 2013 06:32:54 +0000 (06:32 +0000)]
ipv6: Use IS_ERR_OR_NULL().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv4: Use IS_ERR_OR_NULL().
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 22 Jan 2013 06:32:49 +0000 (06:32 +0000)]
ipv4: Use IS_ERR_OR_NULL().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: Use IS_ERR_OR_NULL().
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 22 Jan 2013 06:32:44 +0000 (06:32 +0000)]
net: Use IS_ERR_OR_NULL().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoneigh: Keep neighbour cache entries if number of them is small enough.
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 22 Jan 2013 05:20:05 +0000 (05:20 +0000)]
neigh: Keep neighbour cache entries if number of them is small enough.

Since we have removed NCE (Neighbour Cache Entry) reference from
routing entries, the only refcnt holders of an NCE are its timer
(if running) and its owner table, in usual cases.  As a result,
neigh_periodic_work() purges NCEs over and over again even for
gateways.

It does not make sense to purge entries if the number of them is
very small, so keep them.  The minimum number of entries to keep
is specified by gc_thresh1.
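
The rule can be sketched in userspace C as follows.  This is only an
illustration under stated assumptions: struct toy_neigh_table, its fields
other than gc_thresh1, and toy_periodic_gc() are hypothetical names, and the
"stale" counter stands in for whatever the real periodic pass would decide
to purge.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a neighbour table. */
struct toy_neigh_table {
	int entries;    /* number of cached entries */
	int gc_thresh1; /* minimum number of entries to keep */
	int stale;      /* entries a periodic pass would otherwise purge */
};

/* Run one periodic pass; returns the number of entries purged. */
static int toy_periodic_gc(struct toy_neigh_table *tbl)
{
	int purged;

	/* The table is small enough: keep every entry. */
	if (tbl->entries < tbl->gc_thresh1)
		return 0;

	purged = tbl->stale;
	tbl->entries -= purged;
	tbl->stale = 0;
	return purged;
}
```

With gc_thresh1 set to 128, a table holding three entries is left alone,
while a table holding 200 entries is purged as before.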

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago ipmr: fix sparse warning when testing origin or group
Nicolas Dichtel [Tue, 22 Jan 2013 10:18:03 +0000 (11:18 +0100)]
ipmr: fix sparse warning when testing origin or group

mfc_mcastgrp and mfc_origin are __be32, so INADDR_ANY must be converted
to network byte order before comparing.  Because INADDR_ANY is 0, this
patch only fixes sparse warnings.
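
A minimal userspace sketch of the comparison, assuming the conversion is
done with htonl(); toy_is_any() is a hypothetical helper written for this
illustration and does not appear in ipmr.c.

```c
#include <arpa/inet.h>  /* htonl */
#include <netinet/in.h> /* INADDR_ANY */
#include <stdint.h>

/*
 * addr_be is an IPv4 address in network byte order.  INADDR_ANY is a
 * host-order constant, so it is converted before the comparison.  The
 * result is the same either way because the value is 0; the conversion
 * only keeps the byte-order types consistent.
 */
static int toy_is_any(uint32_t addr_be)
{
	return addr_be == htonl(INADDR_ANY);
}
```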

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago drivers/net: delete old x86 variant of the seeq8005 driver
Paul Gortmaker [Tue, 22 Jan 2013 00:13:26 +0000 (19:13 -0500)]
drivers/net: delete old x86 variant of the seeq8005 driver

The last update to the Ethernet HowTo (over 10 years ago) listed this:

 ------------------------
   SEEQ 8005

   Status: Obsolete, Driver Name: seeq8005

   There is little information about the card included in the driver,
   and hence little information to be put here. If you have a question,
   you are probably best trying to e-mail the driver author as listed
   in the source.

   It was marked obsolete as of the 2.4 series kernels.
 ------------------------

If it was obsolete over a decade ago, the situation cannot have
improved with the passage of time, so let us act on that.  Even with
today's improved search engines, I was unable to locate any real
meaningful information on the ISA implementation of this rare chip.

There are ARM and SGI variants of the driver in tree, but they do
not depend on the original x86 driver source or header file.  We
leave those non-x86 drivers to be deleted by the arch maintainers
when they decide to expire those legacy platforms as a whole.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
11 years ago drivers/net: delete Digital EtherWorks-3 support.
Paul Gortmaker [Sun, 20 Jan 2013 22:14:45 +0000 (17:14 -0500)]
drivers/net: delete Digital EtherWorks-3 support.

This is another one that makes sense to target for obsolescence, since
it (a) appeared pre-1995, (b) was rather rare, and (c) did not
really have any statistically significant active Linux user base.

Removing this ISA 10Mbit driver support is unlikely to be even noticed
by the user base of 3.9+ linux kernels, especially when the documentation
clearly indicates the vintage with this text:

 "...designed to  work with all kernels > 1.1.33"

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>