platform/kernel/linux-stable.git
14 years agoethtool: Change ethtool_op_set_flags to validate flags
Ben Hutchings [Wed, 30 Jun 2010 02:44:32 +0000 (02:44 +0000)]
ethtool: Change ethtool_op_set_flags to validate flags

ethtool_op_set_flags() does not check for unsupported flags, and has
no way of doing so.  This means it is not suitable for use as a
default implementation of ethtool_ops::set_flags.

Add a 'supported' parameter specifying the flags that the driver and
hardware support, validate the requested flags against this, and
change all current callers to pass this parameter.

Change some other trivial implementations of ethtool_ops::set_flags to
call ethtool_op_set_flags().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Use correct shift factor for extracting the SGE DMA Ingress Padding Boundary
Casey Leedom [Tue, 29 Jun 2010 12:54:12 +0000 (12:54 +0000)]
cxgb4vf: Use correct shift factor for extracting the SGE DMA Ingress Padding Boundary

Use correct shift factor for extracting the SGE DMA Ingress Padding
Boundary.  Was accidentally using the register field's shift which was
close enough (4 instead of the propper value of 5) that it actually
sort of worked for various packet sizes ...

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Remove obsolete comment about the lack of a TX Timer Callback
Casey Leedom [Tue, 29 Jun 2010 12:53:39 +0000 (12:53 +0000)]
cxgb4vf: Remove obsolete comment about the lack of a TX Timer Callback

Remove obsolete comment about the lack of a TX Timer Callback -- which
we now _do_ have ...

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agofragment: add fast path for in-order fragments
Changli Gao [Tue, 29 Jun 2010 04:39:37 +0000 (04:39 +0000)]
fragment: add fast path for in-order fragments

add fast path for in-order fragments

As the fragments are sent in order in most of OSes, such as Windows, Darwin and
FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
In the fast path, we check if the skb at the end of the inet_frag_queue is the
prev we expect.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_frag.h |    1 +
 net/ipv4/ip_fragment.c  |   12 ++++++++++++
 net/ipv6/reassembly.c   |   11 +++++++++++
 3 files changed, 24 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosnmp: 64bit ipstats_mib for all arches
Eric Dumazet [Wed, 30 Jun 2010 20:31:19 +0000 (13:31 -0700)]
snmp: 64bit ipstats_mib for all arches

/proc/net/snmp and /proc/net/netstat expose SNMP counters.

Width of these counters is either 32 or 64 bits, depending on the size
of "unsigned long" in kernel.

This means user program parsing these files must already be prepared to
deal with 64bit values, regardless of user program being 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

# netstat -s|egrep "InOctets|OutOctets"
    InOctets: 244068329096
    OutOctets: 244069348848

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobe2net: memory barrier fixes on IBM p7 platform
Sathya Perla [Tue, 29 Jun 2010 00:11:17 +0000 (00:11 +0000)]
be2net: memory barrier fixes on IBM p7 platform

The ibm p7 architecure seems to reorder memory accesses more
aggressively than previous ppc64 architectures. This requires memory
barriers to ensure that rx/tx doorbells are pressed only after
memory to be DMAed is written.

Signed-off-by: Sathya Perla <sathyap@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocpmac: use resource_size()
Dan Carpenter [Wed, 30 Jun 2010 20:12:01 +0000 (13:12 -0700)]
cpmac: use resource_size()

The original code is off by one because we should start counting at
zero.  So the size of the resource is end - start + 1.  I switched it to
use resource_size() to do the calculation.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoact_nat: use stack variable
Changli Gao [Tue, 29 Jun 2010 23:07:09 +0000 (23:07 +0000)]
act_nat: use stack variable

act_nat: use stack variable

structure tc_nat isn't too big for stack, so we can put it in stack.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/sched/act_nat.c |   31 ++++++++++---------------------
 1 file changed, 10 insertions(+), 21 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoact_mirred: combine duplicate code
Changli Gao [Tue, 29 Jun 2010 22:54:58 +0000 (22:54 +0000)]
act_mirred: combine duplicate code

act_mirred: combine duplicate code

tcf_bstats is updated in any way, so we can do it earlier to reduce the size of
the code.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
----
 net/sched/act_mirred.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/neighbour.h: fix typo
Kulikov Vasiliy [Wed, 30 Jun 2010 06:08:15 +0000 (06:08 +0000)]
net/neighbour.h: fix typo

'Shoul' must be 'should'.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agogianfar: Implement workaround for eTSEC-A002 erratum
Anton Vorontsov [Wed, 30 Jun 2010 06:39:15 +0000 (06:39 +0000)]
gianfar: Implement workaround for eTSEC-A002 erratum

MPC8313ECE says:

"If the controller receives a 1- or 2-byte frame (such as an illegal
 runt packet or a packet with RX_ER asserted) before GRS is asserted
 and does not receive any other frames, the controller may fail to set
 GRSC even when the receive logic is completely idle. Any subsequent
 receive frame that is larger than two bytes will reset the state so
 the graceful stop can complete. A MAC receiver (Rx) reset will also
 reset the state."

This patch implements the proposed workaround:

"If IEVENT[GRSC] is still not set after the timeout, read the eTSEC
 register at offset 0xD1C. If bits 7-14 are the same as bits 23-30,
 the eTSEC Rx is assumed to be idle and the Rx can be safely reset.
 If the register fields are not equal, wait for another timeout
 period and check again."

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agogianfar: Implement workaround for eTSEC76 erratum
Anton Vorontsov [Wed, 30 Jun 2010 06:39:13 +0000 (06:39 +0000)]
gianfar: Implement workaround for eTSEC76 erratum

MPC8313ECE says:

"For TOE=1 huge or jumbo frames, the data required to generate the
 checksum may exceed the 2500-byte threshold beyond which the controller
 constrains itself to one memory fetch every 256 eTSEC system clocks.

 This throttling threshold is supposed to trigger only when the
 controller has sufficient data to keep transmit active for the duration
 of the memory fetches. The state machine handling this threshold,
 however, fails to take large TOE frames into account. As a result,
 TOE=1 frames larger than 2500 bytes often see excess delays before start
 of transmission."

This patch implements the workaround as suggested by the errata
document, i.e.:

"Limit TOE=1 frames to less than 2500 bytes to avoid excess delays due to
 memory throttling.
 When using packets larger than 2700 bytes, it is recommended to turn TOE
 off."

To be sure, we limit the TOE frames to 2500 bytes, and do software
checksumming instead.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agogianfar: Implement workaround for eTSEC74 erratum
Anton Vorontsov [Wed, 30 Jun 2010 06:39:12 +0000 (06:39 +0000)]
gianfar: Implement workaround for eTSEC74 erratum

MPC8313ECE says:

"If MACCFG2[Huge Frame]=0 and the Ethernet controller receives frames
 which are larger than MAXFRM, the controller truncates the frames to
 length MAXFRM and marks RxBD[TR]=1 to indicate the error. The controller
 also erroneously marks RxBD[TR]=1 if the received frame length is MAXFRM
 or MAXFRM-1, even though those frames are not truncated.
 No truncation or truncation error occurs if MACCFG2[Huge Frame]=1."

There are two options to workaround the issue:

"1. Set MACCFG2[Huge Frame]=1, so no truncation occurs for invalid large
 frames. Software can determine if a frame is larger than MAXFRM by
 reading RxBD[LG] or RxBD[Data Length].

 2. Set MAXFRM to 1538 (0x602) instead of the default 1536 (0x600), so
 normal-length frames are not marked as truncated. Software can examine
 RxBD[Data Length] to determine if the frame was larger than MAXFRM-2."

This patch implements the first workaround option by setting HUGEFRAME
bit, and gfar_clean_rx_ring() already checks the RxBD[Data Length].

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/core: use ntohs for skb->protocol
Sebastian Andrzej Siewior [Wed, 30 Jun 2010 17:39:19 +0000 (10:39 -0700)]
net/core: use ntohs for skb->protocol

This is only noticed by people that are not doing everything correct in
the first place.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: Use interface max_desync_factor instead of static default
Ben Hutchings [Sat, 26 Jun 2010 11:42:55 +0000 (11:42 +0000)]
ipv6: Use interface max_desync_factor instead of static default

max_desync_factor can be configured per-interface, but nothing is
using the value.

Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: Clamp reported valid_lft to a minimum of 0
Ben Hutchings [Sat, 26 Jun 2010 11:37:47 +0000 (11:37 +0000)]
ipv6: Clamp reported valid_lft to a minimum of 0

Since addresses are only revalidated every 2 minutes, the reported
valid_lft can underflow shortly before the address is deleted.
Clamp it to a minimum of 0, as for prefered_lft.

Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agousb: pegasus: fixed coding style issues
Nicolas Kaiser [Sat, 26 Jun 2010 06:58:54 +0000 (06:58 +0000)]
usb: pegasus: fixed coding style issues

Fixed brace, static initialization, comment, whitespace and spacing
coding style issues.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago3c59x: Use fine-grained locks for MII and windowed register access
Ben Hutchings [Tue, 29 Jun 2010 15:26:56 +0000 (15:26 +0000)]
3c59x: Use fine-grained locks for MII and windowed register access

This avoids scheduling in atomic context and also means that IRQs
will only be deferred for relatively short periods of time.

Previously discussed in:
http://article.gmane.org/gmane.linux.network/155024

Reported-by: Arne Nordmark <nordmark@mech.kth.se>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: disable EEE support by default
Bruce Allan [Tue, 29 Jun 2010 18:13:13 +0000 (18:13 +0000)]
e1000e: disable EEE support by default

Based on community feedback, EEE should be disabled by default until the
IEEE802.3az specification has been finalized.

Cc: bhutchings@solarflare.com
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: remove EEE module parameter
Bruce Allan [Tue, 29 Jun 2010 18:12:52 +0000 (18:12 +0000)]
e1000e: remove EEE module parameter

As requested by Dave Miller.  A follow-on set of patches will allow for
ethtool to enable/disable the feature instead.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: suppress compile warnings on certain archs
Bruce Allan [Tue, 29 Jun 2010 18:12:30 +0000 (18:12 +0000)]
e1000e: suppress compile warnings on certain archs

Commit 84f4ee902ad3ee964b7b3a13d5b7cf9c086e9916 causes compile warnings on
architectures that have unsigned long long's that are not 64-bit, e.g.
ia64.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: don't inadvertently re-set INTX_DISABLE
Dean Nelson [Tue, 29 Jun 2010 18:12:05 +0000 (18:12 +0000)]
e1000e: don't inadvertently re-set INTX_DISABLE

Should e1000_test_msi() fail to see an msi interrupt, it attempts to
fallback to legacy INTx interrupts. But an error in the code may prevent
this from happening correctly.

Before calling e1000_test_msi_interrupt(), e1000_test_msi() disables SERR
by clearing the SERR bit from the just read PCI_COMMAND bits as it writes
them back out.

Upon return from calling e1000_test_msi_interrupt(), it re-enables SERR
by writing out the version of PCI_COMMAND it had previously read.

The problem with this is that e1000_test_msi_interrupt() calls
pci_disable_msi(), which eventually ends up in pci_intx(). And because
pci_intx() was called with enable set to 1, the INTX_DISABLE bit gets
cleared from PCI_COMMAND, which is what we want. But when we get back to
e1000_test_msi(), the INTX_DISABLE bit gets inadvertently re-set because
of the attempt by e1000_test_msi() to re-enable SERR.

The solution is to have e1000_test_msi() re-read the PCI_COMMAND bits as
part of its attempt to re-enable SERR.

During debugging/testing of this issue I found that not all the systems
I ran on had the SERR bit set to begin with. And on some of the systems
the same could be said for the INTX_DISABLE bit. Needless to say these
latter systems didn't have a problem falling back to legacy INTx
interrupts with the code as is.

Signed-off-by: Dean Nelson <dnelson@redhat.com>
CC: stable@kernel.org
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodrivers/net/Makefile: conditionally descend to wireless
Nicolas Kaiser [Sun, 27 Jun 2010 11:44:52 +0000 (11:44 +0000)]
drivers/net/Makefile: conditionally descend to wireless

Don't descend to wireless unless it is actually used.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/Makefile: conditionally descend to wireless and ieee802154
Nicolas Kaiser [Sun, 27 Jun 2010 00:00:25 +0000 (00:00 +0000)]
net/Makefile: conditionally descend to wireless and ieee802154

Don't descend to wireless and ieee802154 unless they are actually used.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: Add support for configuring eswitch and npars
Rajesh K Borundia [Tue, 29 Jun 2010 08:01:20 +0000 (08:01 +0000)]
qlcnic: Add support for configuring eswitch and npars

Following changes are made:
1.Obtain capabilities of Nic partition.
2.Configure tx bandwidth of particular Nic partition.
3.Configure the eswitch for setting port mirroring, enable mac
learning, promiscous mode.

Signed-off-by: Rajesh K Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: Remove obsolete code
Anirban Chakraborty [Tue, 29 Jun 2010 07:52:12 +0000 (07:52 +0000)]
qlcnic: Remove obsolete code

Current driver uses FW API version 2 and thus code corresponding to FW API
version 1 has become obsolete. Clean up this from the driver.

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agomicrel phy driver - updated(1)
Choi, David [Mon, 28 Jun 2010 15:23:41 +0000 (15:23 +0000)]
micrel phy driver - updated(1)

Hello all:

This patch fixes what Ben mentioned, namely duplicated ids.

From: David J. Choi <david.choi@micrel.com>

Body of the explanation: This patch has changes as followings;
 -support the interrupt from phy devices from Micrel Inc.
 -support more phy devices, ks8737, ks8721, ks8041, ks8051 from Micrel.
 -remove vsc8201 because this device was used only internal test at Micrel.

Signed-off-by: David J. Choi <david.choi@micrel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:31:34 +0000 (23:31 +0000)]
qlcnic: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetxen: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:33:29 +0000 (23:33 +0000)]
netxen: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:28:11 +0000 (23:28 +0000)]
bnx2x: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovmxnet3: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:29:42 +0000 (23:29 +0000)]
vmxnet3: fail when try to setup unsupported features

Return EOPNOTSUPP in ethtool_ops->set_flags.

Fix coding style while at it.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: fail when try to setup unsupported features
Stanislaw Gruszka [Sun, 27 Jun 2010 23:26:23 +0000 (23:26 +0000)]
e1000e: fail when try to setup unsupported features

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocaif-driver: Add CAIF-SPI Protocol driver.
Sjur Braendeland [Tue, 29 Jun 2010 07:08:21 +0000 (00:08 -0700)]
caif-driver: Add CAIF-SPI Protocol driver.

This patch introduces the CAIF SPI Protocol Driver for
CAIF Link Layer.

This driver implements a platform driver to accommodate for a
platform specific SPI device. A general platform driver is not
possible as there are no SPI Slave side Kernel API defined.
A sample CAIF SPI Platform device can be found in
.../Documentation/networking/caif/spi_porting.txt

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocaif: Kconfig and Makefile fixes
Sjur Braendeland [Sat, 26 Jun 2010 11:31:28 +0000 (11:31 +0000)]
caif: Kconfig and Makefile fixes

Use "depends on" instead of "if" in Kconfig files.
Fixed CAIF debug flag, and removed unnecessary clean-* options.

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Stitch new T4 PCI-E SR-IOV Virtual Function driver into the build
Casey Leedom [Fri, 25 Jun 2010 12:15:33 +0000 (12:15 +0000)]
cxgb4vf: Stitch new T4 PCI-E SR-IOV Virtual Function driver into the build

Stitch new T4 PCI-E SR-IOV Virtual Function driver into the build.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add new Makefile for T4 PCI-E SR-IOV Virtual Function driver cxgb4vf
Casey Leedom [Fri, 25 Jun 2010 12:14:57 +0000 (12:14 +0000)]
cxgb4vf: Add new Makefile for T4 PCI-E SR-IOV Virtual Function driver cxgb4vf

Add new Makefile for T4 PCI-E SR-IOV Virtual Function driver "cxgb4vf".

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add main T4 PCI-E SR-IOV Virtual Function driver for cxgb4vf
Casey Leedom [Fri, 25 Jun 2010 12:14:15 +0000 (12:14 +0000)]
cxgb4vf: Add main T4 PCI-E SR-IOV Virtual Function driver for cxgb4vf

Add main T4 PCI-E SR-IOV Virtual Function driver for "cxgb4vf".

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add T4 Virtual Function Scatter-Gather Engine DMA code
Casey Leedom [Fri, 25 Jun 2010 12:13:28 +0000 (12:13 +0000)]
cxgb4vf: Add T4 Virtual Function Scatter-Gather Engine DMA code

Add T4 Virtual Function Scatter-Gather Engine DMA code.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device...
Casey Leedom [Fri, 25 Jun 2010 12:12:54 +0000 (12:12 +0000)]
cxgb4vf: Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device communication code

Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device
communication code.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add code to provision T4 PCI-E SR-IOV Virtual Functions with hardware resources
Casey Leedom [Fri, 25 Jun 2010 12:11:46 +0000 (12:11 +0000)]
cxgb4vf: Add code to provision T4 PCI-E SR-IOV Virtual Functions with hardware resources

Add code to provision T4 PCI-E SR-IOV Virtual Functions with hardware
resources.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: Add new macros and definitions for hardware constants
Casey Leedom [Fri, 25 Jun 2010 12:11:05 +0000 (12:11 +0000)]
cxgb4vf: Add new macros and definitions for hardware constants

Add new macros and definitions for hardware constants.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: update to latest T4 firmware API file
Casey Leedom [Fri, 25 Jun 2010 12:10:32 +0000 (12:10 +0000)]
cxgb4vf: update to latest T4 firmware API file

Update to latest T4 firmware API file.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4vf: small changes to message processing structures/macros
Casey Leedom [Fri, 25 Jun 2010 12:09:38 +0000 (12:09 +0000)]
cxgb4vf: small changes to message processing structures/macros

Split cpl_tx_pkt_lso into core message structure and encapsulated message,
make RSPD_LEN macro match other response descriptor macros.

Signed-off-by: Casey Leedom
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetdev: mdio-octeon: Fix section mismatch errors.
David Daney [Thu, 24 Jun 2010 09:14:48 +0000 (09:14 +0000)]
netdev: mdio-octeon: Fix section mismatch errors.

We started getting:

WARNING: vmlinux.o(.data+0x20bd0): Section mismatch in reference from
the variable octeon_mdiobus_driver to the function
.init.text:octeon_mdiobus_probe()

This fixes it.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetdev: octeon_mgmt: Fix section mismatch errors.
David Daney [Thu, 24 Jun 2010 09:14:47 +0000 (09:14 +0000)]
netdev: octeon_mgmt: Fix section mismatch errors.

We started getting:

WARNING: drivers/net/built-in.o(.data+0x10f0): Section mismatch in
reference from the variable octeon_mgmt_driver to the function
.init.text:octeon_mgmt_probe()

This fixes it.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoact_mirred: don't clone skb when skb isn't shared
Changli Gao [Thu, 24 Jun 2010 16:25:12 +0000 (16:25 +0000)]
act_mirred: don't clone skb when skb isn't shared

don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: tso_fragment() might avoid GFP_ATOMIC
Eric Dumazet [Thu, 24 Jun 2010 01:00:22 +0000 (01:00 +0000)]
tcp: tso_fragment() might avoid GFP_ATOMIC

We can pass a gfp argument to tso_fragment() and avoid GFP_ATOMIC
allocations sometimes.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovlan: 64 bit rx counters
Eric Dumazet [Thu, 24 Jun 2010 00:55:06 +0000 (00:55 +0000)]
vlan: 64 bit rx counters

Use u64_stats_sync infrastructure to implement 64bit rx stats.

(tx stats are addressed later)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agomacvlan: 64 bit rx counters
Eric Dumazet [Thu, 24 Jun 2010 00:54:21 +0000 (00:54 +0000)]
macvlan: 64 bit rx counters

Use u64_stats_sync infrastructure to implement 64bit stats.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: u64_stats_fetch_begin_bh() and u64_stats_fetch_retry_bh()
Eric Dumazet [Thu, 24 Jun 2010 00:54:06 +0000 (00:54 +0000)]
net: u64_stats_fetch_begin_bh() and u64_stats_fetch_retry_bh()

- Must disable preemption in case of 32bit UP in u64_stats_fetch_begin()
and u64_stats_fetch_retry()

- Add new u64_stats_fetch_begin_bh() and u64_stats_fetch_retry_bh() for
network usage, disabling BH on 32bit UP only.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: use this_cpu_ptr()
Eric Dumazet [Thu, 24 Jun 2010 00:52:37 +0000 (00:52 +0000)]
net: use this_cpu_ptr()

use this_cpu_ptr(p) instead of per_cpu_ptr(p, smp_processor_id())

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: u64_stats_sync improvements
Eric Dumazet [Thu, 24 Jun 2010 00:04:38 +0000 (00:04 +0000)]
net: u64_stats_sync improvements

- Add a comment about interrupts:

6) If counter might be written by an interrupt, readers should block
interrupts.

- Fix a typo in sample of use.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago3c59x: Specify window explicitly for access to windowed registers
Ben Hutchings [Wed, 23 Jun 2010 13:54:31 +0000 (13:54 +0000)]
3c59x: Specify window explicitly for access to windowed registers

Currently much of the code assumes that a specific window has been
selected, while a few functions save and restore the window.  This
makes it impossible to introduce fine-grained locking.

Make those assumptions explicit by introducing wrapper functions
to set the window and read/write a register.  Use these everywhere
except vortex_interrupt(), vortex_start_xmit() and vortex_rx().
These set the window just once, or not at all in the case of
vortex_rx() as it should always be called from vortex_interrupt().

Cache the current window in struct vortex_private to avoid
unnecessary hardware writes.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Arne Nordmark <nordmark@mech.kth.se> [against 2.6.32]
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agomlx4: add dynamic LRO disable support
Amerigo Wang [Mon, 21 Jun 2010 22:50:17 +0000 (22:50 +0000)]
mlx4: add dynamic LRO disable support

This patch adds dynamic LRO diable support for mlx4 net driver.
It also fixes a bug of mlx4, which checks NETIF_F_LRO flag in rx
path without rtnl lock.

(I don't have mlx4 card, so only did compiling test. Anyone who wants
to test this is more than welcome.)

This is based on Neil's initial work too, and heavily modified based
on Stanislaw's suggestions.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Neil Horman <nhorman@redhat.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agos2io: add dynamic LRO disable support
Jon Mason [Thu, 24 Jun 2010 18:45:10 +0000 (18:45 +0000)]
s2io: add dynamic LRO disable support

This patch adds dynamic LRO disable support for s2io net driver,
enables LRO by default, increases the driver version number, and
corrects the name of the LRO modparm.

This is mostly Wang's patch based on Neil's initial work, heavily
modified based on Ramkrishna's suggestions.  This has been tested on
a Neterion Xframe adapter and verified via adapter LRO statistics.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Neil Horman <nhorman@redhat.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Ramkrishna Vepa <Ramkrishna.Vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosyncookies: add support for ECN
Florian Westphal [Mon, 21 Jun 2010 11:48:45 +0000 (11:48 +0000)]
syncookies: add support for ECN

Allows use of ECN when syncookies are in effect by encoding ecn_ok
into the syn-ack tcp timestamp.

While at it, remove a uneeded #ifdef CONFIG_SYN_COOKIES.
With CONFIG_SYN_COOKIES=nm want_cookie is ifdef'd to 0 and gcc
removes the "if (0)".

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosyncookies: do not store rcv_wscale in tcp timestamp
Florian Westphal [Mon, 21 Jun 2010 11:48:44 +0000 (11:48 +0000)]
syncookies: do not store rcv_wscale in tcp timestamp

As pointed out by Fernando Gont there is no need to encode rcv_wscale
into the cookie.

We did not use the restored rcv_wscale anyway; it is recomputed
via tcp_select_initial_window().

Thus we can save 4 bits in the ts option space by removing rcv_wscale.
In case window scaling was not supported, we set the (invalid) wscale
value 0xf.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: remove ipv6_statistics
Eric Dumazet [Wed, 23 Jun 2010 02:51:25 +0000 (02:51 +0000)]
ipv6: remove ipv6_statistics

commit 9261e5370112 (ipv6: making ip and icmp statistics per/namespace)
forgot to remove ipv6_statistics variable.

commit bc417d99bf27 (ipv6: remove stale MIB definitions) took care of
icmpv6_statistics & icmpv6msg_statistics

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denis V. Lunev <den@openvz.org>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosnmp: add align parameter to snmp_mib_init()
Eric Dumazet [Tue, 22 Jun 2010 20:58:41 +0000 (20:58 +0000)]
snmp: add align parameter to snmp_mib_init()

In preparation for 64bit snmp counters for some mibs,
add an 'align' parameter to snmp_mib_init(), instead
of assuming mibs only contain 'unsigned long' fields.

Callers can use __alignof__(type) to provide correct
alignment.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoloopback: use u64_stats_sync infrastructure
Eric Dumazet [Tue, 22 Jun 2010 12:44:11 +0000 (12:44 +0000)]
loopback: use u64_stats_sync infrastructure

Commit 6b10de38f0ef (loopback: Implement 64bit stats on 32bit arches)
introduced 64bit stats in loopback driver, using a private seqcount and
private helpers.

David suggested to introduce a generic infrastructure, added in (net:
Introduce u64_stats_sync infrastructure)

This patch reimplements loopback 64bit stats using the u64_stats_sync
infrastructure.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoarp: RCU change in arp_solicit()
Eric Dumazet [Tue, 22 Jun 2010 07:43:15 +0000 (07:43 +0000)]
arp: RCU change in arp_solicit()

Avoid two atomic ops in arp_solicit()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodccp: make implementation of Syn-RTT symmetric
Gerrit Renker [Tue, 22 Jun 2010 01:14:35 +0000 (01:14 +0000)]
dccp: make implementation of Syn-RTT symmetric

This patch is thanks to Andre Noll who reported the issue and helped testing.

The Syn-RTT sampled during the initial handshake currently only works for
the client sending the DCCP-Request. TFRC penalizes the absence of an RTT
sample with a very slow initial speed (1 packet per second), which delays
slow-start significantly, resulting in sluggish performance.

This patch mirrors the "Syn RTT" principle by adding a timestamp also onto
the DCCP-Response, producing an RTT sample  when the (Data)Ack completing
the handshake arrives.

Also changed the documentation to 'TFRC' since Syn RTTs are also used by CCID-4.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodccp: remove unused function argument
Gerrit Renker [Tue, 22 Jun 2010 01:14:34 +0000 (01:14 +0000)]
dccp: remove unused function argument

This removes an unused 'sk' argument from several option-inserting functions.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/fec: clean suspend/resume
Eric Benard [Fri, 18 Jun 2010 04:19:54 +0000 (04:19 +0000)]
net/fec: clean suspend/resume

Commit 59d4289b83b11379d867e2f7146904b19cc96404 converted fec to dev_pm_ops but
didn't update the suspend/resume functions thus leading to the following warning :
"initialization from incompatible pointer type" when CONFIG_PM is set.

This patch also fixe a few indentation and style around CONFIG_PM area.

Signed-off-by: Eric Bénard <eric@eukrea.com>
Cc: netdev@vger.kernel.org
Cc: davem@davemloft.net
Cc: amit.kucheria@canonical.com
Cc: s.hauer@pengutronix.de
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb3: request 7.10 firmware
Divy Le Ray [Mon, 21 Jun 2010 15:54:53 +0000 (15:54 +0000)]
cxgb3: request 7.10 firmware

The driver requests FW 7.10
Bump up driver version.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb3: update FW to 7.10
Divy Le Ray [Mon, 21 Jun 2010 15:54:48 +0000 (15:54 +0000)]
cxgb3: update FW to 7.10

Update FW to 7.10

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/core/pktgen.c: Use pr_<level>
Joe Perches [Mon, 21 Jun 2010 12:29:14 +0000 (12:29 +0000)]
net/core/pktgen.c: Use pr_<level>

Add pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove "pktgen: " from formats
Convert printks to pr_<level>
Added func_enter() for debugging
Moved version to end of string at module_init
Coalesced long formats

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: optimize Berkeley Packet Filter (BPF) processing
Hagen Paul Pfeifer [Sat, 19 Jun 2010 17:05:36 +0000 (17:05 +0000)]
net: optimize Berkeley Packet Filter (BPF) processing

Gcc is currenlty not in the ability to optimize the switch statement in
sk_run_filter() because of dense case labels. This patch replace the
OR'd labels with ordered sequenced case labels. The sk_chk_filter()
function is modified to patch/replace the original OPCODES in a
ordered but equivalent form. gcc is now in the ability to transform the
switch statement in sk_run_filter into a jump table of complexity O(1).

Until this patch gcc generates a sequence of conditional branches (O(n) of 567
byte .text segment size (arch x86_64):

7ff: 8b 06                 mov    (%rsi),%eax
801: 66 83 f8 35           cmp    $0x35,%ax
805: 0f 84 d0 02 00 00     je     adb <sk_run_filter+0x31d>
80b: 0f 87 07 01 00 00     ja     918 <sk_run_filter+0x15a>
811: 66 83 f8 15           cmp    $0x15,%ax
815: 0f 84 c5 02 00 00     je     ae0 <sk_run_filter+0x322>
81b: 77 73                 ja     890 <sk_run_filter+0xd2>
81d: 66 83 f8 04           cmp    $0x4,%ax
821: 0f 84 17 02 00 00     je     a3e <sk_run_filter+0x280>
827: 77 29                 ja     852 <sk_run_filter+0x94>
829: 66 83 f8 01           cmp    $0x1,%ax
[...]

With the modification the compiler translate the switch statement into
the following jump table fragment:

7ff: 66 83 3e 2c           cmpw   $0x2c,(%rsi)
803: 0f 87 1f 02 00 00     ja     a28 <sk_run_filter+0x26a>
809: 0f b7 06              movzwl (%rsi),%eax
80c: ff 24 c5 00 00 00 00  jmpq   *0x0(,%rax,8)
813: 44 89 e3              mov    %r12d,%ebx
816: e9 43 03 00 00        jmpq   b5e <sk_run_filter+0x3a0>
81b: 41 89 dc              mov    %ebx,%r12d
81e: e9 3b 03 00 00        jmpq   b5e <sk_run_filter+0x3a0>

Furthermore, I reordered the instructions to reduce cache line misses by
order the most common instruction to the start.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Log clearer error messages for hardware monitor
Ben Hutchings [Fri, 25 Jun 2010 07:06:29 +0000 (07:06 +0000)]
sfc: Log clearer error messages for hardware monitor

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Use Toeplitz IPv4 hash for RSS and hash insertion
Ben Hutchings [Fri, 25 Jun 2010 07:05:56 +0000 (07:05 +0000)]
sfc: Use Toeplitz IPv4 hash for RSS and hash insertion

Insertion of the Falcon hash is unreliable.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Move siena_nic_data::ipv6_rss_key to efx_nic::rx_hash_key
Ben Hutchings [Fri, 25 Jun 2010 07:05:43 +0000 (07:05 +0000)]
sfc: Move siena_nic_data::ipv6_rss_key to efx_nic::rx_hash_key

We will use this hash key for Toeplitz IPv4 hashing too.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Fix reading of inserted hash
Ben Hutchings [Fri, 25 Jun 2010 07:05:33 +0000 (07:05 +0000)]
sfc: Fix reading of inserted hash

The hash appears immediately before the packet data, not at the
beginning of the buffer. This means we can easily use negative offsets
from the start of packet data, so adjust the data and length at the
top of __efx_rx_packet() instead of wherever we consume the hash.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Clean ups
Vasanthy Kolluri [Thu, 24 Jun 2010 10:52:26 +0000 (10:52 +0000)]
enic: Clean ups

1) Update copyright
2) Fix hardware queue descriptor field size CQ_ENET_RQ_DESC_FCOE_SOF_BITS
3) Include rtnetlink.h instead of if_link.h
4) Selectively flush writes to interrupt mask register
5) Use pci_enable_device_mem
6) Remove unused variables and header files
7) Fix size mismatch between memory alloc and free operations of a variable
8) Check for non null arguments to vic_provinfo_alloc

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Bug Fix: Handle surprise hardware removals
Vasanthy Kolluri [Thu, 24 Jun 2010 10:52:08 +0000 (10:52 +0000)]
enic: Bug Fix: Handle surprise hardware removals

Handle surprise hardware removals gracefully during devcmd issue and init,
cleanup of queues.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Feature Add: Add loopback capability to enic devices
Vasanthy Kolluri [Thu, 24 Jun 2010 10:51:59 +0000 (10:51 +0000)]
enic: Feature Add: Add loopback capability to enic devices

Hardware has the loopback capability to queue the packets transmitted from
a device to the receive queue of the same device. enic now supports the
loopback capability.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Use receive queue buffer blocks of 32/64 entries
Vasanthy Kolluri [Thu, 24 Jun 2010 10:51:51 +0000 (10:51 +0000)]
enic: Use receive queue buffer blocks of 32/64 entries

Change the receive queue buffer allocations into blocks of 32 entries when
ring size is less than 64, otherwise use 64 entries per block.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Add new firmware devcmds
Vasanthy Kolluri [Thu, 24 Jun 2010 10:51:43 +0000 (10:51 +0000)]
enic: Add new firmware devcmds

Add new firmware devcmds - CMD_PROXY_BY_BDF, CMD_PACKET_FILTER_ALL,
CMD_ENABLE_WAIT.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Use (netdev|dev|pr)_<level> macro helpers for logging
Vasanthy Kolluri [Thu, 24 Jun 2010 10:50:56 +0000 (10:50 +0000)]
enic: Use (netdev|dev|pr)_<level> macro helpers for logging

Replace all printk routines with the (netdev|dev|pr)_<level> macros that
provide verbose logs.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Clean up: Add wrapper routines for firmware devcmd calls
Vasanthy Kolluri [Thu, 24 Jun 2010 10:50:12 +0000 (10:50 +0000)]
enic: Clean up: Add wrapper routines for firmware devcmd calls

Add wrapper routines that issue devcmds to firmware and ensure that a
devcmd lock is held for each devcmd call.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Use a lighter reset operation for enic devices
Vasanthy Kolluri [Thu, 24 Jun 2010 10:50:00 +0000 (10:50 +0000)]
enic: Use a lighter reset operation for enic devices

The port profile information for a dynamic enic device is set by the upper
layers, that are oblivious to the device reset operation. We do not want a
reset operation erase the network state of a dynamic enic device as there
is no way to set up the port profile information again. Hence a lighter
reset operation called hang reset is used. Hang reset, unlike soft reset
does not reset the network state and resets the host side state only.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Bug Fix: Change hardware ingress vlan rewrite mode
Vasanthy Kolluri [Thu, 24 Jun 2010 10:49:51 +0000 (10:49 +0000)]
enic: Bug Fix: Change hardware ingress vlan rewrite mode

The current ingress vlan rewrite mode setting lets the hardware strip off
the tag control information of a packet received on native vlan. As a
result, the priority bits are also lost. The fix is to change the ingress
vlan rewrite mode setting such that the complete tag control information is
retained for packets that belong to native vlan.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoenic: Feature Add: Replace LRO with GRO
Vasanthy Kolluri [Thu, 24 Jun 2010 10:49:25 +0000 (10:49 +0000)]
enic: Feature Add: Replace LRO with GRO

enic now uses the GRO mechanism instead of LRO to pass skbs to upper
layers.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Update version to 2.1.3.
Michael Chan [Thu, 24 Jun 2010 14:58:42 +0000 (14:58 +0000)]
cnic: Update version to 2.1.3.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Further unify kcq handling code.
Michael Chan [Thu, 24 Jun 2010 14:58:41 +0000 (14:58 +0000)]
cnic: Further unify kcq handling code.

This eliminates some of the duplicate code for the various devices
that require the same basic kcq handling.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Restructure kcq processing.
Michael Chan [Thu, 24 Jun 2010 14:58:40 +0000 (14:58 +0000)]
cnic: Restructure kcq processing.

By doing more work in the common function cnic_get_kcqes(), and
making full use of the kcq_info structure.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Unify kcq allocation for all devices.
Michael Chan [Thu, 24 Jun 2010 14:58:39 +0000 (14:58 +0000)]
cnic: Unify kcq allocation for all devices.

By creating a common data stucture kcq_info for all devices, the kcq
(kernel completion queue) for all devices can be allocated by common
code.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Unify IRQ code for all hardware types.
Michael Chan [Thu, 24 Jun 2010 14:58:38 +0000 (14:58 +0000)]
cnic: Unify IRQ code for all hardware types.

By creating a common cnic_doirq().

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocnic: Fine-tune CID memory space calculation.
Michael Chan [Thu, 24 Jun 2010 14:58:37 +0000 (14:58 +0000)]
cnic: Fine-tune CID memory space calculation.

The current code makes assumptions about the CID (context ID) memory
space and starting CID that may not be always correct when firmware
changes.  In particular, BNX2_ISCSI_START_CID may not always be fixed.
We now calculate cp->max_cid_space and cp->iscsi_start_cid dynamically
instead of using fixed constants.  The unused cp->max_iscsi_conn is also
eliminated.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Record hardware RX hash on each skb where possible
Ben Hutchings [Wed, 23 Jun 2010 11:31:28 +0000 (11:31 +0000)]
sfc: Record hardware RX hash on each skb where possible

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Disable setting feature flags that are not implemented
Ben Hutchings [Wed, 23 Jun 2010 11:30:35 +0000 (11:30 +0000)]
sfc: Disable setting feature flags that are not implemented

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Replace EFX_DRIVER_NAME with KBUILD_MODNAME
Ben Hutchings [Wed, 23 Jun 2010 11:30:26 +0000 (11:30 +0000)]
sfc: Replace EFX_DRIVER_NAME with KBUILD_MODNAME

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Implement message level control
Ben Hutchings [Wed, 23 Jun 2010 11:30:07 +0000 (11:30 +0000)]
sfc: Implement message level control

Replace EFX_ERR() with netif_err(), EFX_INFO() with netif_info(),
EFX_LOG() with netif_dbg() and EFX_TRACE() and EFX_REGDUMP() with
netif_vdbg().

Replace EFX_ERR_RL(), EFX_INFO_RL() and EFX_LOG_RL() using explicit
calls to net_ratelimit().

Implement the ethtool operations to get and set message level flags,
and add a 'debug' module parameter for the initial value.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Log MTD errors using partition name, not just net device name
Ben Hutchings [Wed, 23 Jun 2010 11:29:24 +0000 (11:29 +0000)]
sfc: Log MTD errors using partition name, not just net device name

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosfc: Implement ethtool register dump operation
Ben Hutchings [Mon, 21 Jun 2010 03:06:53 +0000 (03:06 +0000)]
sfc: Implement ethtool register dump operation

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: do not send reset to already closed sockets
Konstantin Khorenko [Fri, 25 Jun 2010 04:54:58 +0000 (21:54 -0700)]
tcp: do not send reset to already closed sockets

i've found that tcp_close() can be called for an already closed
socket, but still sends reset in this case (tcp_send_active_reset())
which seems to be incorrect.  Moreover, a packet with reset is sent
with different source port as original port number has been already
cleared on socket.  Besides that incrementing stat counter for
LINUX_MIB_TCPABORTONCLOSE also does not look correct in this case.

Initially this issue was found on 2.6.18-x RHEL5 kernel, but the same
seems to be true for the current mainstream kernel (checked on
2.6.35-rc3).  Please, correct me if i missed something.

How that happens:

1) the server receives a packet for socket in TCP_CLOSE_WAIT state
   that triggers a tcp_reset():

Call Trace:
 <IRQ>  [<ffffffff8025b9b9>] tcp_reset+0x12f/0x1e8
 [<ffffffff80046125>] tcp_rcv_state_process+0x1c0/0xa08
 [<ffffffff8003eb22>] tcp_v4_do_rcv+0x310/0x37a
 [<ffffffff80028bea>] tcp_v4_rcv+0x74d/0xb43
 [<ffffffff8024ef4c>] ip_local_deliver_finish+0x0/0x259
 [<ffffffff80037131>] ip_local_deliver+0x200/0x2f4
 [<ffffffff8003843c>] ip_rcv+0x64c/0x69f
 [<ffffffff80021d89>] netif_receive_skb+0x4c4/0x4fa
 [<ffffffff80032eca>] process_backlog+0x90/0xec
 [<ffffffff8000cc50>] net_rx_action+0xbb/0x1f1
 [<ffffffff80012d3a>] __do_softirq+0xf5/0x1ce
 [<ffffffff8001147a>] handle_IRQ_event+0x56/0xb0
 [<ffffffff8006334c>] call_softirq+0x1c/0x28
 [<ffffffff80070476>] do_softirq+0x2c/0x85
 [<ffffffff80070441>] do_IRQ+0x149/0x152
 [<ffffffff80062665>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff80008a2e>] __handle_mm_fault+0x6cd/0x1303
 [<ffffffff80008903>] __handle_mm_fault+0x5a2/0x1303
 [<ffffffff80033a9d>] cache_free_debugcheck+0x21f/0x22e
 [<ffffffff8006a263>] do_page_fault+0x49a/0x7dc
 [<ffffffff80066487>] thread_return+0x89/0x174
 [<ffffffff800c5aee>] audit_syscall_exit+0x341/0x35c
 [<ffffffff80062e39>] error_exit+0x0/0x84

tcp_rcv_state_process()
...  // (sk_state == TCP_CLOSE_WAIT here)
...
        /* step 2: check RST bit */
        if(th->rst) {
                tcp_reset(sk);
                goto discard;
        }
...
---------------------------------
tcp_rcv_state_process
 tcp_reset
  tcp_done
   tcp_set_state(sk, TCP_CLOSE);
     inet_put_port
      __inet_put_port
       inet_sk(sk)->num = 0;

   sk->sk_shutdown = SHUTDOWN_MASK;

2) After that the process (socket owner) tries to write something to
   that socket and "inet_autobind" sets a _new_ (which differs from
   the original!) port number for the socket:

 Call Trace:
  [<ffffffff80255a12>] inet_bind_hash+0x33/0x5f
  [<ffffffff80257180>] inet_csk_get_port+0x216/0x268
  [<ffffffff8026bcc9>] inet_autobind+0x22/0x8f
  [<ffffffff80049140>] inet_sendmsg+0x27/0x57
  [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
  [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
  [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
  [<ffffffff8001fb49>] __pollwait+0x0/0xdd
  [<ffffffff8008d533>] default_wake_function+0x0/0xe
  [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
  [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
  [<ffffffff80066538>] thread_return+0x13a/0x174
  [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
  [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
  [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
  [<ffffffff800622dd>] tracesys+0xd5/0xe0

3) sendmsg fails at last with -EPIPE (=> 'write' returns -EPIPE in userspace):

F: tcp_sendmsg1 -EPIPE: sk=ffff81000bda00d0, sport=49847, old_state=7, new_state=7, sk_err=0, sk_shutdown=3

Call Trace:
 [<ffffffff80027557>] tcp_sendmsg+0xcb/0xe87
 [<ffffffff80033300>] release_sock+0x10/0xae
 [<ffffffff8016f20f>] vgacon_cursor+0x0/0x1a7
 [<ffffffff8026bd32>] inet_autobind+0x8b/0x8f
 [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
 [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
 [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff8001fb49>] __pollwait+0x0/0xdd
 [<ffffffff8008d533>] default_wake_function+0x0/0xe
 [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
 [<ffffffff80066538>] thread_return+0x13a/0x174
 [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
 [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
 [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
 [<ffffffff800622dd>] tracesys+0xd5/0xe0

tcp_sendmsg()
...
        /* Wait for a connection to finish. */
        if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
                int old_state = sk->sk_state;
                if ((err = sk_stream_wait_connect(sk, &timeo)) != 0) {
if (f_d && (err == -EPIPE)) {
        printk("F: tcp_sendmsg1 -EPIPE: sk=%p, sport=%u, old_state=%d, new_state=%d, "
                "sk_err=%d, sk_shutdown=%d\n",
                sk, ntohs(inet_sk(sk)->sport), old_state, sk->sk_state,
                sk->sk_err, sk->sk_shutdown);
        dump_stack();
}
                        goto out_err;
                }
        }
...

4) Then the process (socket owner) understands that it's time to close
   that socket and does that (and thus triggers sending reset packet):

Call Trace:
...
 [<ffffffff80032077>] dev_queue_xmit+0x343/0x3d6
 [<ffffffff80034698>] ip_output+0x351/0x384
 [<ffffffff80251ae9>] dst_output+0x0/0xe
 [<ffffffff80036ec6>] ip_queue_xmit+0x567/0x5d2
 [<ffffffff80095700>] vprintk+0x21/0x33
 [<ffffffff800070f0>] check_poison_obj+0x2e/0x206
 [<ffffffff80013587>] poison_obj+0x36/0x45
 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
 [<ffffffff80023481>] dbg_redzone1+0x1c/0x25
 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
 [<ffffffff8000ca94>] cache_alloc_debugcheck_after+0x189/0x1c8
 [<ffffffff80023405>] tcp_transmit_skb+0x764/0x786
 [<ffffffff8025df8a>] tcp_send_active_reset+0xf9/0x14d
 [<ffffffff80258ff1>] tcp_close+0x39a/0x960
 [<ffffffff8026be12>] inet_release+0x69/0x80
 [<ffffffff80059b31>] sock_release+0x4f/0xcf
 [<ffffffff80059d4c>] sock_close+0x2c/0x30
 [<ffffffff800133c9>] __fput+0xac/0x197
 [<ffffffff800252bc>] filp_close+0x59/0x61
 [<ffffffff8001eff6>] sys_close+0x85/0xc7
 [<ffffffff800622dd>] tracesys+0xd5/0xe0

So, in brief:

* a received packet for socket in TCP_CLOSE_WAIT state triggers
  tcp_reset() which clears inet_sk(sk)->num and put socket into
  TCP_CLOSE state

* an attempt to write to that socket forces inet_autobind() to get a
  new port (but the write itself fails with -EPIPE)

* tcp_close() called for socket in TCP_CLOSE state sends an active
  reset via socket with newly allocated port

This adds an additional check in tcp_close() for already closed
sockets. We do not want to send anything to closed sockets.

Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobroadcom: Add 5241 support
Dmitry Baryshkov [Wed, 16 Jun 2010 23:02:24 +0000 (23:02 +0000)]
broadcom: Add 5241 support

This patch adds the 5241 PHY ID to the broadcom module.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobroadcom: move all PHY_ID's to header
Dmitry Baryshkov [Wed, 16 Jun 2010 23:02:23 +0000 (23:02 +0000)]
broadcom: move all PHY_ID's to header

Move all PHY IDs to brcmphy.h header for completeness and unification of code.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: fix "netpoll: Allow netpoll_setup/cleanup recursion"
Andrew Morton [Fri, 25 Jun 2010 03:33:04 +0000 (20:33 -0700)]
net: fix "netpoll: Allow netpoll_setup/cleanup recursion"

Remove rtnl_unlock() which had no corresponding rtnl_lock().

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Thu, 24 Jun 2010 01:26:27 +0000 (18:26 -0700)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:
net/ipv4/ip_output.c

14 years agosky2: enable rx/tx in sky2_phy_reinit()
Brandon Philips [Wed, 16 Jun 2010 16:21:58 +0000 (16:21 +0000)]
sky2: enable rx/tx in sky2_phy_reinit()

sky2_phy_reinit is called by the ethtool helpers sky2_set_settings,
sky2_nway_reset and sky2_set_pauseparam when netif_running.

However, at the end of sky2_phy_init GM_GP_CTRL has GM_GPCR_RX_ENA and
GM_GPCR_TX_ENA cleared. So, doing these commands causes the device to
stop working:

$ ethtool -r eth0
$ ethtool -A eth0 autoneg off

Fix this issue by enabling Rx/Tx after running sky2_phy_init in
sky2_phy_reinit.

Signed-off-by: Brandon Philips <bphilips@suse.de>
Tested-by: Brandon Philips <bphilips@suse.de>
Cc: stable@kernel.org
Tested-by: Mike McCormack <mikem@ring3k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>