platform/kernel/linux-3.10.git
12 years agosctp: fix sparse warning for sctp_init_cause_fixed
Ioan Orghici [Fri, 13 Jul 2012 07:16:37 +0000 (07:16 +0000)]
sctp: fix sparse warning for sctp_init_cause_fixed

Fix the following sparse warning:
* symbol 'sctp_init_cause_fixed' was not declared. Should it be
  static?

Signed-off-by: Ioan Orghici <ioanorghici@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2: Try to recover from PCI block reset
Michael Chan [Mon, 16 Jul 2012 14:25:56 +0000 (14:25 +0000)]
bnx2: Try to recover from PCI block reset

If the PCI block has reset, the memory enable bit will be reset and
the device will not respond to MMIO access.  bnx2_reset_task() currently
will not recover when this happens.  Add code to detect this condition
and restore the PCI state.  This scenario has been reported by some
users.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotg3: Add hwmon support for temperature
Michael Chan [Mon, 16 Jul 2012 16:24:02 +0000 (16:24 +0000)]
tg3: Add hwmon support for temperature

Some tg3 devices have management firmware that can export sensor data.
Export temperature sensor reading via hwmon sysfs.

[hwmon interface suggested by Ben Hutchings <bhutchings@solarflare.com>]

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotg3: Add APE scratchpad read function
Matt Carlson [Mon, 16 Jul 2012 16:24:01 +0000 (16:24 +0000)]
tg3: Add APE scratchpad read function

for retrieving temperature sensor data.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotg3: Add common function tg3_ape_event_lock()
Matt Carlson [Mon, 16 Jul 2012 16:24:00 +0000 (16:24 +0000)]
tg3: Add common function tg3_ape_event_lock()

by refactoring code in tg3_ape_send_event().  The common function will
be used in subsequent patches.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotg3: Fix the setting of the APE_HAS_NCSI flag
Michael Chan [Mon, 16 Jul 2012 16:23:59 +0000 (16:23 +0000)]
tg3: Fix the setting of the APE_HAS_NCSI flag

The driver currently skips setting this flag if the VPD contains the
firmware version string.  We fix this by separating the probing of NCSI
from the reading of the NCSI version string.  The APE_HAS_NCSI flag is
needed to properly read sensor data.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetem: refine early skb orphaning
Eric Dumazet [Sat, 14 Jul 2012 03:16:27 +0000 (03:16 +0000)]
netem: refine early skb orphaning

netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
or any mechanism relying on socket sk_wmem_alloc feedback.

Ideally, we should perform this orphaning after the rate module and
before the delay module, to mimic what happens on a real link :

skb orphaning is indeed normally done at TX completion, before the
transit on the link.

+-------+   +--------+  +---------------+  +-----------------+
+ Qdisc +---> Device +--> TX completion +--> links / hops    +->
+       +   +  xmit  +  + skb orphaning +  + propagation     +
+-------+   +--------+  +---------------+  +-----------------+
      < rate limiting >                  < delay, drops, reorders >

If netem is used without delay feature (drops, reorders, rate
limiting), then we should avoid early skb orphaning, to keep pressure
on sockets as long as packets are still in qdisc queue.

Ideally, netem should be refactored to implement the delay module
as the last stage. The current algorithm merges the two phases
(rate limiting + delay), so it's not correct.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Mark Gordon <msg@google.com>
Cc: Andreas Terzis <aterzis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Tue, 17 Jul 2012 06:04:00 +0000 (23:04 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
This series contains updates to e1000e and ixgbe.
 ...
Alexander Duyck (5):
  ixgbe: Simplify logic for getting traffic class from user priority
  ixgbe: Cleanup unpacking code for DCB
  ixgbe: Populate the prio_tc_map in ixgbe_setup_tc
  ixgbe: Add function for obtaining FCoE TC based on FCoE user priority
  ixgbe: Merge FCoE set_num and cache_ring calls into RSS/DCB config
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: dont pull too much data in skb linear part
Eric Dumazet [Fri, 13 Jul 2012 03:19:41 +0000 (03:19 +0000)]
be2net: dont pull too much data in skb linear part

skb_fill_rx_data() pulls 64 bytes of data into skb->data.

That's too much for TCP (with no options) on IPv4, as the total size of
the headers is 14 + 40 = 54.

This means the TCP stack and splice() are suboptimal, since the TCP payload
is partly in skb->data and partly in an skb frag.
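
The arithmetic above can be sketched in plain userspace C. This is only an illustration, not be2net code: the constant names and the helper payload_bytes_in_linear() are assumptions made for the example, and the header sizes assume no VLAN tag, no IPv4 options and no TCP options.

```c
#include <assert.h>

enum {
	ETH_HDR_LEN  = 14,	/* Ethernet header without VLAN tag */
	IPV4_HDR_LEN = 20,	/* IPv4 header without options */
	TCP_HDR_LEN  = 20,	/* TCP header without options */
	PULL_LEN     = 64,	/* bytes copied into the linear area */
};

/* Payload bytes that land in the linear area behind the headers. */
static int payload_bytes_in_linear(int pull_len)
{
	int hdrs = ETH_HDR_LEN + IPV4_HDR_LEN + TCP_HDR_LEN;

	return pull_len > hdrs ? pull_len - hdrs : 0;
}
```

With a 64-byte pull, payload_bytes_in_linear(PULL_LEN) gives 10 payload bytes in the linear part, with the rest of the payload left in the frag; pulling only the 54 header bytes leaves none.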

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: update driver version
Padmanabh Ratnakar [Fri, 13 Jul 2012 02:46:03 +0000 (02:46 +0000)]
be2net: update driver version

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Add description about various RSS hash types
Padmanabh Ratnakar [Fri, 13 Jul 2012 02:45:51 +0000 (02:45 +0000)]
be2net: Add description about various RSS hash types

Incorporated a review comment from Eric Dumazet. Added a description
of the different RSS hash types that the adapter is capable of.
Support for ETHTOOL_GRXFH and ETHTOOL_SRXFH, as suggested
by Ben Hutchings, will be added in a later patch.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobridge: Fix enforcement of multicast hash_max limit
Thomas Graf [Tue, 10 Jul 2012 22:29:19 +0000 (22:29 +0000)]
bridge: Fix enforcement of multicast hash_max limit

The hash size is doubled when it needs to grow and compared against
hash_max. The >= comparison will limit the hash table size to half
of what is expected i.e. the default 512 hash_max will not allow
the hash table to grow larger than 256.

Also print the hash table limit instead of the desired size when
the limit is reached.
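
The effect of the comparison can be illustrated with a small userspace sketch. The helpers max_size_old() and max_size_fixed() are hypothetical and are not the bridge code itself; they only model a table that doubles until the limit check rejects the next size.

```c
#include <assert.h>

/*
 * Hypothetical helpers: each returns the largest size a table starting
 * at `start` can reach by doubling while the doubled size passes the
 * limit check.
 */

/* Old behaviour: a doubled size equal to hash_max is rejected. */
static unsigned int max_size_old(unsigned int start, unsigned int hash_max)
{
	unsigned int size = start;

	assert(start > 0);
	while (!(size * 2 >= hash_max))
		size *= 2;
	return size;
}

/* Fixed behaviour: only a doubled size above hash_max is rejected. */
static unsigned int max_size_fixed(unsigned int start, unsigned int hash_max)
{
	unsigned int size = start;

	assert(start > 0);
	while (!(size * 2 > hash_max))
		size *= 2;
	return size;
}
```

Starting from 16 with a hash_max of 512, the old check stops at 256 while the fixed check allows the table to reach 512.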

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4_en: dereferencing freed memory
Dan Carpenter [Tue, 10 Jul 2012 20:34:07 +0000 (20:34 +0000)]
net/mlx4_en: dereferencing freed memory

We dereferenced "mclist" after the kfree().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4: off by one in parse_trans_rule()
Dan Carpenter [Tue, 10 Jul 2012 20:33:36 +0000 (20:33 +0000)]
net/mlx4: off by one in parse_trans_rule()

This should be ">=" here instead of ">".  MLX4_NET_TRANS_RULE_NUM is 6.
We use "spec->id" as an array offset into the __rule_hw_sz[] and
__sw_id_hw[] arrays which have 6 elements.
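
A minimal sketch of the bounds check, assuming a six-entry table. RULE_NUM, rule_hw_sz and rule_size() are illustrative stand-ins, and the table values are made up; none of this is taken from the mlx4 driver.

```c
#include <assert.h>

#define RULE_NUM 6	/* stands in for MLX4_NET_TRANS_RULE_NUM */

/* Illustrative table with RULE_NUM entries; valid indices are 0..5. */
static const int rule_hw_sz[RULE_NUM] = { 8, 16, 24, 32, 40, 48 };

/* Returns the table entry for id, or -1 when id is out of range. */
static int rule_size(unsigned int id)
{
	if (id >= RULE_NUM)	/* "id > RULE_NUM" would let id == 6 through */
		return -1;
	return rule_hw_sz[id];
}
```

With ">" in place of ">=", an id of 6 would pass the check and read one element past the end of the array.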

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: fix RTPROT_RA markup of RA routes w/nexthops
Denis Ovsienko [Tue, 10 Jul 2012 04:45:50 +0000 (04:45 +0000)]
ipv6: fix RTPROT_RA markup of RA routes w/nexthops

Userspace implementations of network routing protocols sometimes need to
tell RA-originated IPv6 routes from other kernel routes to make proper
routing decisions. This makes most sense for RA routes with nexthops,
namely, default routes and Route Information routes.

The intended means of preserving RA route origin in a netlink message is
through indicating RTPROT_RA as protocol code. Function rt6_fill_node()
tried to do that for default routes, but its test condition was
wrong. This change is modeled after the original mailing list posting
by Jeff Haran. It fixes the test condition for default route case and
sets the same behaviour for Route Information case (both types use
nexthops). Handling of the 3rd RA route type, Prefix Information, is
left unchanged, as it stands for interface connected routes (without
nexthops).

Signed-off-by: Denis Ovsienko <infrastation@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agohyperv: Add support for setting MAC from within guests
Haiyang Zhang [Tue, 10 Jul 2012 07:19:22 +0000 (07:19 +0000)]
hyperv: Add support for setting MAC from within guests

This adds support for setting synthetic NIC MAC address from within Linux
guests. Before using this feature, the option "spoofing of MAC address"
should be enabled at the Hyper-V manager / Settings of the synthetic
NIC.

Thanks to Kin Cho <kcho@infoblox.com> for the initial implementation and
tests, and thanks to Long Li <longli@microsoft.com> for the debugging
work.

Reported-and-tested-by: Kin Cho <kcho@infoblox.com>
Reported-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: Change byte order when storing/accessing to len field
Tony Cheneau [Wed, 11 Jul 2012 06:51:16 +0000 (06:51 +0000)]
6lowpan: Change byte order when storing/accessing to len field

The length field should be encoded using big-endian byte order, as intended in
the specs. As currently written, the len field would not be decoded properly
by an implementation using the correct byte ordering, which could lead to
interoperability issues.

Also, I rewrote the code so that the iphc0 argument of lowpan_alloc_new_frame
could be removed.

Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: Change byte order when storing/accessing u16 tag
Tony Cheneau [Wed, 11 Jul 2012 06:51:15 +0000 (06:51 +0000)]
6lowpan: Change byte order when storing/accessing u16 tag

The tag field should be stored and accessed using big-endian byte order (as
intended in the specs). Otherwise, when displayed with a traffic analyser such
as Wireshark, the field is not properly displayed (e.g. 0x01 00 instead of
0x00 01, and so on).
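
As an illustration of the byte ordering described above, here is a small userspace sketch. The helpers put_be16() and get_be16() are hypothetical names for this example and are not the 6lowpan code.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 16-bit value most significant byte first (network byte order). */
static void put_be16(uint8_t *buf, uint16_t val)
{
	buf[0] = (uint8_t)(val >> 8);
	buf[1] = (uint8_t)(val & 0xff);
}

/* Read back a 16-bit big-endian value. */
static uint16_t get_be16(const uint8_t *buf)
{
	return (uint16_t)((buf[0] << 8) | buf[1]);
}
```

Storing a tag of 1 this way yields the bytes 0x00 0x01 on the wire regardless of host endianness, which is what a traffic analyser expects.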

Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: Fix null pointer dereference in UDP uncompression function
Tony Cheneau [Wed, 11 Jul 2012 06:51:14 +0000 (06:51 +0000)]
6lowpan: Fix null pointer dereference in UDP uncompression function

When a UDP packet gets fragmented, a crash will occur at reassembly time.
This is because skb->transport_header is not set during the earlier stage of
fragment reassembly. As a consequence, the call to udp_hdr() returns NULL and
uh (which is NULL) gets dereferenced without any check.

Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoarch: Use eth_random_addr
Joe Perches [Fri, 13 Jul 2012 05:33:12 +0000 (22:33 -0700)]
arch: Use eth_random_addr

Convert the existing uses of random_ether_addr to
the new eth_random_addr.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agousb: Use eth_random_addr
Joe Perches [Fri, 13 Jul 2012 05:33:11 +0000 (22:33 -0700)]
usb: Use eth_random_addr

Convert the existing uses of random_ether_addr to
the new eth_random_addr.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agos390: Use eth_random_addr
Joe Perches [Fri, 13 Jul 2012 05:33:10 +0000 (22:33 -0700)]
s390: Use eth_random_addr

Convert the existing uses of random_ether_addr to
the new eth_random_addr.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net: Use eth_random_addr
Joe Perches [Thu, 12 Jul 2012 19:33:09 +0000 (19:33 +0000)]
drivers/net: Use eth_random_addr

Convert the existing uses of random_ether_addr to
the new eth_random_addr.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agowireless: Use eth_random_addr
Joe Perches [Thu, 12 Jul 2012 19:33:08 +0000 (19:33 +0000)]
wireless: Use eth_random_addr

Convert the existing uses of random_ether_addr to
the new eth_random_addr.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: usb: Use eth_random_addr
Joe Perches [Thu, 12 Jul 2012 19:33:07 +0000 (19:33 +0000)]
net: usb: Use eth_random_addr

Convert the existing uses of random_ether_addr to
the new eth_random_addr.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoethernet: Use eth_random_addr
Joe Perches [Thu, 12 Jul 2012 19:33:06 +0000 (19:33 +0000)]
ethernet: Use eth_random_addr

Convert the existing uses of random_ether_addr to
the new eth_random_addr.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoetherdevice: Rename random_ether_addr to eth_random_addr
Joe Perches [Thu, 12 Jul 2012 19:33:05 +0000 (19:33 +0000)]
etherdevice: Rename random_ether_addr to eth_random_addr

Add some API symmetry to eth_broadcast_addr and
add a #define to the old name for backward compatibility.
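
The rename-with-alias pattern can be sketched in userspace as follows. This is an approximation under stated assumptions: rand() stands in for the kernel random source, and the function body reflects the usual rule for a random MAC (unicast, locally administered) rather than quoting etherdevice.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ETH_ALEN 6

/* New name: fill addr with a random unicast, locally administered MAC. */
static void eth_random_addr(uint8_t *addr)
{
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);	/* stand-in for the kernel RNG */
	addr[0] &= 0xfe;	/* clear the multicast bit */
	addr[0] |= 0x02;	/* set the locally administered bit */
}

/* Old name kept as an alias so existing callers still compile. */
#define random_ether_addr(addr) eth_random_addr(addr)
```

Callers that still use random_ether_addr(mac) expand to the new function, so the remaining users can be converted one subsystem at a time.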

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'tipc_net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg...
David S. Miller [Tue, 17 Jul 2012 05:33:32 +0000 (22:33 -0700)]
Merge branch 'tipc_net-next' of git://git./linux/kernel/git/paulg/linux

Paul Gortmaker says:

====================
This is the same eight commits as sent for review last week[1],
with just the incorporation of the pr_fmt change as suggested
by JoeP.  There were no additional change requests, so unless you
can see something else you'd like me to change, please pull.
 ...
Erik Hugne (5):
      tipc: use standard printk shortcut macros (pr_err etc.)
      tipc: remove TIPC packet debugging functions and macros
      tipc: simplify print buffer handling in tipc_printf
      tipc: phase out most of the struct print_buf usage
      tipc: remove print_buf and deprecated log buffer code

Paul Gortmaker (3):
      tipc: factor stats struct out of the larger link struct
      tipc: limit error messages relating to memory leak to one line
      tipc: simplify link_print by divorcing it from using tipc_printf
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: make sock diag per-namespace
Andrey Vagin [Mon, 16 Jul 2012 04:28:49 +0000 (04:28 +0000)]
net: make sock diag per-namespace

Before this patch sock_diag works for init_net only and dumps
information about sockets from all namespaces.

This patch extends sock_diag to all namespaces.
It creates a netlink kernel socket for each netns and filters
data during dumping.

v2: filter according to netns in all places
    remove an unused variable.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agolpc_eth: remove duplicated include
Duan Jiong [Mon, 16 Jul 2012 02:29:05 +0000 (02:29 +0000)]
lpc_eth: remove duplicated include

Remove duplicated #include <linux/delay.h> in
drivers/net/ethernet/nxp/lpc_eth.c

Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: add OFO snmp counters
Eric Dumazet [Mon, 16 Jul 2012 01:41:36 +0000 (01:41 +0000)]
tcp: add OFO snmp counters

Add three SNMP TCP counters, to better track TCP behavior
at global stage (netstat -s), when packets are received
Out Of Order (OFO)

TCPOFOQueue : Number of packets queued in OFO queue

TCPOFODrop  : Number of packets meant to be queued in OFO
              but dropped because socket rcvbuf limit hit.

TCPOFOMerge : Number of packets in OFO that were merged with
              other packets.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoixgbe: Merge FCoE set_num and cache_ring calls into RSS/DCB config
Alexander Duyck [Sat, 30 Jun 2012 00:14:01 +0000 (00:14 +0000)]
ixgbe: Merge FCoE set_num and cache_ring calls into RSS/DCB config

This change merges the ixgbe_cache_ring_fcoe and ixgbe_set_fcoe_queues
logic into the DCB and RSS initialization calls.

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Add function for obtaining FCoE TC based on FCoE user priority
Alexander Duyck [Sat, 2 Jun 2012 00:11:02 +0000 (00:11 +0000)]
ixgbe: Add function for obtaining FCoE TC based on FCoE user priority

In upcoming patches it will become increasingly common to need to determine
the FCoE traffic class in order to determine the correct queues for FCoE.
In order to make this easier I am adding a function for obtaining the FCoE
traffic class based on the user priority.

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Populate the prio_tc_map in ixgbe_setup_tc
Alexander Duyck [Fri, 18 May 2012 06:33:31 +0000 (06:33 +0000)]
ixgbe: Populate the prio_tc_map in ixgbe_setup_tc

There were cases where the prio_tc_map was not populated when we were
calling open.  This will result in us incorrectly configuring the traffic
classes when DCB is enabled.  In order to correct this I have updated the
code so that we now populate the values prior to allocating the q_vectors
and calling ixgbe_open.

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Cleanup unpacking code for DCB
Alexander Duyck [Thu, 17 May 2012 05:14:39 +0000 (05:14 +0000)]
ixgbe: Cleanup unpacking code for DCB

This is meant to be a generic clean-up of the remaining functions for
unpacking data from the DCB structures. The only real changes are:
replaced the variable i with tc for functions that were looping through the
traffic classes, and added a pointer for tc_class instead of path since
that way we only need to pull the pointer once instead of once per loop.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Simplify logic for getting traffic class from user priority
Alexander Duyck [Thu, 17 May 2012 05:14:34 +0000 (05:14 +0000)]
ixgbe: Simplify logic for getting traffic class from user priority

This patch is meant to help simplify the logic for getting traffic classes
from user priorities. To do this I am adding a function named
ixgbe_dcb_get_tc_from_up that will go through the traffic classes in
reverse order in order to determine which traffic class contains a bit for
a given user priority.
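
The reverse walk described above might look roughly like this in standalone C. MAX_TC, the bitmap array and get_tc_from_up() are simplified assumptions for illustration; the real function operates on the driver's DCB configuration structures.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TC 8	/* illustrative number of traffic classes */

/*
 * up_to_tc_bitmap[tc] has bit `up` set when user priority `up` belongs
 * to traffic class `tc`. Walk from the highest class down and return
 * the first class that claims the priority; fall back to class 0.
 */
static unsigned int get_tc_from_up(const uint8_t up_to_tc_bitmap[MAX_TC],
				   unsigned int up)
{
	unsigned int tc;

	assert(up < 8);
	for (tc = MAX_TC - 1; tc > 0; tc--) {
		if (up_to_tc_bitmap[tc] & (1u << up))
			break;
	}
	return tc;
}
```

Because the loop stops at class 0 without testing it, a priority that no higher class claims maps to traffic class 0 by default.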

A declaration for this new function is added to the header so that
we have a centralized means for sorting out traffic classes belonging to
features such as FCoE.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: Program the correct register for ITR when using MSI-X.
Matthew Vick [Thu, 12 Jul 2012 00:02:42 +0000 (00:02 +0000)]
e1000e: Program the correct register for ITR when using MSI-X.

When configuring interrupt throttling on 82574 in MSI-X mode, we need to
be programming the EITR registers instead of the ITR register.

-rc2: Renamed e1000_write_itr() to e1000e_write_itr(), fixed whitespace
      issues, and removed unnecessary !! operation.
-rc3: Reduced the scope of the loop variable in e1000e_write_itr().

Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: Cleanup code logic in e1000_check_for_serdes_link_82571()
Tushar Dave [Thu, 12 Jul 2012 08:00:15 +0000 (08:00 +0000)]
e1000e: Cleanup code logic in e1000_check_for_serdes_link_82571()

Clean up the code to make it cleaner and more readable.

Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoxfrm: Initialize the struct xfrm_dst behind the dst_enty field
Steffen Klassert [Thu, 5 Jul 2012 23:39:34 +0000 (23:39 +0000)]
xfrm: Initialize the struct xfrm_dst behind the dst_enty field

We start initializing the struct xfrm_dst at the first field
behind the struct dst_entry. This is error prone because it
might leave a new field uninitialized. So start initializing
the struct xfrm_dst right behind the dst_entry.
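
A rough userspace sketch of the idea, assuming the base structure is the first member of the derived one. struct base_entry and struct derived_entry are illustrative stand-ins for struct dst_entry and struct xfrm_dst, and their fields are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for struct dst_entry; fields are illustrative. */
struct base_entry {
	int refcnt;
	long expires;
};

/* Stand-in for struct xfrm_dst; fields are illustrative. */
struct derived_entry {
	struct base_entry base;	/* must stay the first member */
	int new_field;		/* a field added later is still covered */
	void *route;
	unsigned int genid;
};

/* Zero everything that follows the embedded base, leaving base untouched. */
static void init_behind_base(struct derived_entry *d)
{
	struct base_entry *b = &d->base;

	memset(b + 1, 0, sizeof(*d) - sizeof(*b));
}
```

Starting the memset at the end of the base rather than at a named field means a member inserted right after the base later on is zeroed without anyone having to update the initialisation.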

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Initialize the struct rt6_info behind the dst_enty field
Steffen Klassert [Thu, 5 Jul 2012 23:37:09 +0000 (23:37 +0000)]
ipv6: Initialize the struct rt6_info behind the dst_enty field

We start initializing the struct rt6_info at the first field
behind the struct dst_entry. This is error prone because it
might leave a new field uninitialized. So start initializing
the struct rt6_info right behind the dst_entry.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Sat, 14 Jul 2012 06:02:28 +0000 (23:02 -0700)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next

John Linville says:

====================
Several drivers see updates: mwifiex, ath9k, iwlwifi, brcmsmac,
wlcore/wl12xx/wl18xx, and a handful of others.  The bcma bus got a
lot of attention from Hauke Mehrtens.  The cfg80211 component gets
a flurry of patches for multi-channel support, and the mac80211
component gets the first few VHT (11ac) and 60GHz (11ad) patches.
This also includes the removal of the iwmc3200 drivers, since the
hardware never became available to normal people.

Additionally, the NFC subsystem gets a series of updates.  According to
Samuel, "Here are the interesting bits:

- A better error management for the HCI stack.
- An LLCP "late" binding implementation for a better NFC SAP usage. SAPs are
  now reserved only when there's a client for it.
- Support for Sony RC-S360 (a.k.a. PaSoRi) pn533 based dongle. We can read and
  write NFC tags and also establish a p2p link with this dongle now.
- A few LLCP fixes."

Finally, this includes another pull of the fixes from the wireless
tree in order to resolve some merge issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotipc: remove print_buf and deprecated log buffer code
Erik Hugne [Fri, 29 Jun 2012 04:50:24 +0000 (00:50 -0400)]
tipc: remove print_buf and deprecated log buffer code

The internal log buffer handling functions can now safely be
removed since there is no code using it anymore.  Requests to
interact with the internal tipc log buffer over netlink (in
config.c) will report 'obsolete command'.

This represents the final removal of any references to a
struct print_buf, and the removal of the struct itself.
We also get rid of a TIPC specific Kconfig in the process.

Finally, log.h is removed since it is not needed anymore.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: phase out most of the struct print_buf usage
Erik Hugne [Fri, 29 Jun 2012 04:50:23 +0000 (00:50 -0400)]
tipc: phase out most of the struct print_buf usage

The tipc_printf is renamed to tipc_snprintf, as the new name
better describes what the function actually does.  It is also
changed to take a buffer and length parameter and return
number of characters written to the buffer.  All callers of
this function that used to pass a print_buf are updated.
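
A function with this shape could be sketched in userspace as below. buf_snprintf() is a hypothetical stand-in, not the TIPC implementation; it clamps the return value to the characters actually stored, which is an assumption modelled on how the kernel's vscnprintf behaves.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical stand-in: format into buf (at most len bytes including
 * the terminating NUL) and return the number of characters actually
 * stored, so callers can append by advancing an offset.
 */
static int buf_snprintf(char *buf, int len, const char *fmt, ...)
{
	va_list args;
	int n;

	if (len <= 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, (size_t)len, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;
	/* vsnprintf reports the untruncated length; clamp to what fits. */
	return n >= len ? len - 1 : n;
}
```

A caller can keep a running offset, for example used += buf_snprintf(out + used, size - used, ...), and the offset never moves past the end of the buffer even when output is truncated.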

Final removal of the struct print_buf itself will be done
synchronously with the pending removal of the deprecated
logging code that also was using it.

Functions that build up a response message with a list of
ports, nametable contents etc. are changed to return the number
of characters written to the output buffer. This information
was previously hidden in a field of the print_buf struct, and
the number of chars written was fetched with a call to
tipc_printbuf_validate.  This function is removed since it
is no longer referenced nor needed.

A generic max size ULTRA_STRING_MAX_LEN is defined, named
in keeping with the existing TIPC_TLV_ULTRA_STRING, and the
various definitions in port, link and nametable code that
largely duplicated this information are removed.  This means
that the amount of link statistics that can be returned is now
increased from 2k to 32k.

The buffer overflow check is now done just before the reply
message is passed over netlink or TIPC to a remote node and
the message indicating a truncated buffer is changed to a less
dramatic one (less CAPS), placed at the end of the message.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: simplify print buffer handling in tipc_printf
Erik Hugne [Fri, 29 Jun 2012 04:50:22 +0000 (00:50 -0400)]
tipc: simplify print buffer handling in tipc_printf

tipc_printf was previously used both to construct debug traces
and to append data to buffers that should be sent over netlink
to the tipc-config application.  A global print_buffer was
used to format the string before it was copied to the actual
output buffer.  This could lead to concurrent access of the
global print_buffer, which then had to be lock protected.
This is simplified by changing tipc_printf to append data
directly to the output buffer using vscnprintf.

With the new implementation of tipc_printf, there is no longer
any risk of concurrent access to the internal log buffer, so
the lock (and the comments describing it) are no longer
strictly necessary.  However, there are still a few functions
that do grab this lock before resizing/dumping the log
buffer.  We leave the lock, and these functions untouched since
they will be removed with a subsequent commit that drops the
deprecated log buffer handling code.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: simplify link_print by divorcing it from using tipc_printf
Paul Gortmaker [Wed, 11 Jul 2012 23:27:56 +0000 (19:27 -0400)]
tipc: simplify link_print by divorcing it from using tipc_printf

To pave the way for a pending cleanup of tipc_printf, and
removal of struct print_buf entirely, we make that task simpler
by converting link_print to issue its messages with standard
printk infrastructure.  [Original idea separated from a larger
patch from Erik Hugne <erik.hugne@ericsson.com>]

Cc: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: remove TIPC packet debugging functions and macros
Erik Hugne [Fri, 29 Jun 2012 04:50:21 +0000 (00:50 -0400)]
tipc: remove TIPC packet debugging functions and macros

The link queue traces and packet level debug functions served
a purpose during early development, but are now redundant
since there are other, more capable tools available for
debugging at the packet level.

The TIPC_DEBUG Kconfig option is removed since it does not
provide any extra debugging features anymore.

This gets rid of a lot of tipc_printf usages, which will
make the pending cleanup work of that function easier.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: use standard printk shortcut macros (pr_err etc.)
Erik Hugne [Fri, 29 Jun 2012 04:16:37 +0000 (00:16 -0400)]
tipc: use standard printk shortcut macros (pr_err etc.)

All messages should go directly to the kernel log.  The TIPC
specific error, warning, info and debug trace macros are
removed and all references replaced with pr_err, pr_warn,
pr_info and pr_debug.

Commonly used sub-strings are explicitly declared as a const
char to reduce .text size.

Note that this means the debug messages (changed to pr_debug),
are now enabled through dynamic debugging, instead of a TIPC
specific Kconfig option (TIPC_DEBUG).  The latter will be
phased out completely.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: use pr_fmt as suggested by Joe Perches <joe@perches.com>]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agoipv4: Don't store a rule pointer in fib_result.
David S. Miller [Fri, 13 Jul 2012 15:21:29 +0000 (08:21 -0700)]
ipv4: Don't store a rule pointer in fib_result.

We only use it to fetch the rule's tclassid, so just store the
tclassid there instead.

This also decreases the size of fib_result by a full 8 bytes on
64-bit.  On 32-bit it's a wash.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: add LAST_ACK as a valid state for TSQ
Eric Dumazet [Thu, 12 Jul 2012 22:46:09 +0000 (22:46 +0000)]
tcp: add LAST_ACK as a valid state for TSQ

Socket state LAST_ACK should allow TSQ to send additional frames,
or else we rely on incoming ACKs or timers to send them.

Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotg3: add device id of Apple Thunderbolt Ethernet device
Greg KH [Thu, 12 Jul 2012 15:39:44 +0000 (15:39 +0000)]
tg3: add device id of Apple Thunderbolt Ethernet device

The Apple Thunderbolt ethernet device is already listed in the driver,
but not hooked up in the MODULE_DEVICE_TABLE().  This fixes that and
allows it to work properly.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Update alloc frag to reduce get/put page usage and recycle pages
Alexander Duyck [Thu, 12 Jul 2012 14:23:50 +0000 (14:23 +0000)]
net: Update alloc frag to reduce get/put page usage and recycle pages

This patch is meant to help improve performance by reducing the number of
locked operations required to allocate a frag on x86 and other platforms.
This is accomplished by using atomic_set operations on the page count
instead of calling get_page and put_page.  It is based on work originally
provided by Eric Dumazet.

In addition it also helps to reduce memory overhead when using TCP.  This
is done by recycling the page if the only holder of the frame is the
netdev_alloc_frag call itself.  This can occur when skb heads are stolen by
either GRO or TCP and the driver providing the packets is using paged frags
to store all of the data for the packets.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Thu, 12 Jul 2012 17:44:50 +0000 (13:44 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem

12 years agoipv4: Remove tb_peers from fib_table.
David S. Miller [Thu, 12 Jul 2012 16:39:28 +0000 (09:39 -0700)]
ipv4: Remove tb_peers from fib_table.

No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next
David S. Miller [Thu, 12 Jul 2012 15:18:45 +0000 (08:18 -0700)]
Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next

12 years agobe2net: Enable RSS UDP hashing for Lancer and Skyhawk
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:47 +0000 (03:57 +0000)]
be2net: Enable RSS UDP hashing for Lancer and Skyhawk

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Fix port name in message during driver load
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:35 +0000 (03:57 +0000)]
be2net: Fix port name in message during driver load

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Fix cleanup path when EQ creation fails
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:21 +0000 (03:57 +0000)]
be2net: Fix cleanup path when EQ creation fails

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Activate new FW after FW download for Lancer
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:09 +0000 (03:57 +0000)]
be2net: Activate new FW after FW download for Lancer

After FW download, activate new FW by invoking FW reset.
Recreate rings once new FW is operational.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Fix initialization sequence for Lancer
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:58 +0000 (03:56 +0000)]
be2net: Fix initialization sequence for Lancer

Invoke only required initialization routines for Lancer.
Remove invocation of unnecessary routines.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Fix die temperature stat for Lancer
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:46 +0000 (03:56 +0000)]
be2net: Fix die temperature stat for Lancer

Query die temperature stat for Lancer to report it correctly
in ethtool.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Fix error while toggling autoneg of pause parameters
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:11 +0000 (03:56 +0000)]
be2net: Fix error while toggling autoneg of pause parameters

Autonegotiation of pause parameters is possible only on some PHYs.
The adapter reports whether autoneg of pause parameters is supported.
The driver cannot change autoneg of pause parameters.
Fix the driver to return an error when the user toggles the autoneg mode.
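
The behaviour described above could look roughly like the following
sketch. The struct, the function name and the choice of -EINVAL are
assumptions for illustration; the real driver works on ethtool
structures.

```c
#include <errno.h>

/* Hypothetical request, loosely modelled on pause parameters. */
struct pause_request {
	int autoneg;
	int rx_pause;
	int tx_pause;
};

/*
 * adapter_autoneg is the autoneg setting reported by the adapter.
 * A request that tries to change it is rejected; otherwise the
 * rx/tx pause settings are applied.
 */
int set_pauseparam_sketch(int adapter_autoneg,
			  const struct pause_request *req,
			  int *rx_pause, int *tx_pause)
{
	if (req->autoneg != adapter_autoneg)
		return -EINVAL;

	*rx_pause = req->rx_pause;
	*tx_pause = req->tx_pause;
	return 0;
}
```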

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: make team_port_enabled() and team_port_txable() static inline
Jiri Pirko [Wed, 11 Jul 2012 05:34:04 +0000 (05:34 +0000)]
team: make team_port_enabled() and team_port_txable() static inline

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: add broadcast mode
Jiri Pirko [Wed, 11 Jul 2012 05:34:03 +0000 (05:34 +0000)]
team: add broadcast mode

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: use function team_port_txable() for determining enabled and up port
Jiri Pirko [Wed, 11 Jul 2012 05:34:02 +0000 (05:34 +0000)]
team: use function team_port_txable() for determining enabled and up port

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Put proper checks into icmp_socket_deliver().
David S. Miller [Thu, 12 Jul 2012 15:06:04 +0000 (08:06 -0700)]
ipv4: Put proper checks into icmp_socket_deliver().

All handler->err() routines expect that we've done a pskb_may_pull()
test to make sure that IP header length + 8 bytes can be safely
pulled.
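
A userspace sketch of the length check implied above: before an error
handler looks at the embedded packet, the inner IP header plus 8 bytes
of payload must be present. The function name and the flat-buffer model
are assumptions; the kernel performs this with pskb_may_pull() on an skb.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * inner points at the IP header quoted inside the ICMP message and
 * avail is the number of bytes available from there.
 * Returns 1 if header length + 8 bytes can be read safely, else 0.
 */
int inner_header_ok(const uint8_t *inner, size_t avail)
{
	size_t ihl_bytes;

	if (avail < 20)			/* minimum IPv4 header */
		return 0;

	ihl_bytes = (size_t)(inner[0] & 0x0f) * 4;
	if (ihl_bytes < 20)		/* malformed header length */
		return 0;

	return avail >= ihl_bytes + 8;
}
```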

Reported-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Thu, 12 Jul 2012 15:00:56 +0000 (08:00 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

12 years agonet: sched: add ipset ematch
Florian Westphal [Wed, 11 Jul 2012 10:56:57 +0000 (10:56 +0000)]
net: sched: add ipset ematch

Can be used to match packets against netfilter ip sets created via ipset(8).
skb->sk_iif is used as 'incoming interface', skb->dev is 'outgoing interface'.

Since ipset is usually called from netfilter, the ematch
initializes a fake xt_action_param, pulls the ip header into the
linear area and also sets skb->data to the IP header (otherwise
matching Layer 4 set types doesn't work).

Tested-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetxen: fix link notification order
Flavio Leitner [Wed, 11 Jul 2012 08:56:55 +0000 (08:56 +0000)]
netxen: fix link notification order

First update the adapter variables with the current speed and
mode before firing the notification. Otherwise, get_settings()
may return stale values.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: rework fragment-deleting routine
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:48 +0000 (21:22 +0000)]
6lowpan: rework fragment-deleting routine

The 6lowpan module starts collecting incoming frames and fragments
right after lowpan_module_init(), therefore it is better to
clean up unfinished fragments in the lowpan_cleanup_module() function
instead of doing it when the link goes down.

Changed the spinlock type to prevent a deadlock with an expired timer
event and removed an unused one.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: fix tag variable size
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:47 +0000 (21:22 +0000)]
6lowpan: fix tag variable size

Function lowpan_alloc_new_frame() takes u8 tag as an argument. However,
its only caller, lowpan_process_data() passes down a u16. Hence,
the tag value can get corrupted. This prevents 6lowpan fragment reassembly of a
message when the fragment tag value exceeds 255.
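
The truncation can be reproduced in a few lines of userspace C. The
function names are hypothetical; only the u8 parameter fed from a u16
caller comes from the description above.

```c
#include <stdint.h>

/* Old behaviour: the tag is stored through a u8 parameter. */
static uint8_t remember_tag_old(uint8_t tag)
{
	return tag;
}

/* Fixed behaviour: the parameter is as wide as the caller's value. */
static uint16_t remember_tag_fixed(uint16_t tag)
{
	return tag;
}

/* Returns 1 if a later fragment with the same tag would still match. */
int reassembly_matches_old(uint16_t wire_tag)
{
	/* The u16 argument is silently narrowed to u8 at the call. */
	return remember_tag_old(wire_tag) == wire_tag;
}

int reassembly_matches_fixed(uint16_t wire_tag)
{
	return remember_tag_fixed(wire_tag) == wire_tag;
}
```

Tags up to 255 survive the narrowing. Anything larger keeps only its low
8 bits, so the stored tag no longer equals the tag carried by the
following fragments.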

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomac802154: sparse warnings: make symbols static
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:46 +0000 (21:22 +0000)]
mac802154: sparse warnings: make symbols static

Make symbols static to avoid the following warning reported
by sparse:

    warning: symbol ... was not declared. Should it be static?

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: get extra headroom in allocated frame
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:45 +0000 (21:22 +0000)]
6lowpan: get extra headroom in allocated frame

Use netdev_alloc_skb_ip_align() instead of alloc_skb() to get some
extra headroom in case we need to forward this frame in a tunnel or
something else.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomac802154: add get short address method
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:44 +0000 (21:22 +0000)]
mac802154: add get short address method

Add a method to get the device's short 802.15.4 address. This call
is needed by the ieee802154 layer to satisfy an 'iz list' request
from user space.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/ieee802154/at86rf230: rework irq handler
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:43 +0000 (21:22 +0000)]
drivers/ieee802154/at86rf230: rework irq handler

Fix LOCKDEP bug message for the irq handler spinlock.
Make the irq processing code more explicit and stable.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: revert: add missing spin_lock_init()
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:42 +0000 (21:22 +0000)]
6lowpan: revert: add missing spin_lock_init()

Revert commit 768f7c7c121e80f458a9d013b2e8b169e5dfb1e5 to initialize
the spinlock in a more preferable way and make it static to avoid a
sparse warning.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosmsc95xx: signedness bug in get_regs()
Dan Carpenter [Tue, 10 Jul 2012 20:32:51 +0000 (20:32 +0000)]
smsc95xx: signedness bug in get_regs()

"retval" has to be a signed integer for the error handling to work.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
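
The signedness bug in get_regs() described above can be illustrated with
a small userspace sketch. The stub and both function names are
hypothetical, and -5 merely stands in for a negative error code.

```c
/* Stub for a register read that can fail with a negative error code. */
static int read_reg_stub(int fail)
{
	return fail ? -5 : 0;
}

/* Buggy variant: an unsigned retval can never compare below zero. */
int get_regs_unsigned(int fail)
{
	unsigned int retval = (unsigned int)read_reg_stub(fail);

	if (retval < 0)		/* always false, the error is lost */
		return -1;
	return 0;
}

/* Fixed variant: a signed retval lets the error check work. */
int get_regs_signed(int fail)
{
	int retval = read_reg_stub(fail);

	if (retval < 0)
		return retval;
	return 0;
}
```

Some compilers flag the comparison in the buggy variant only when extra
warnings are enabled, which is one reason this class of bug slips
through.
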
12 years agonet: add support for NS8390 based eth controllers on some ColdFire CPU boards
Greg Ungerer [Wed, 4 Jul 2012 13:50:00 +0000 (13:50 +0000)]
net: add support for NS8390 based eth controllers on some ColdFire CPU boards

A number of older ColdFire CPU based boards use NS8390 based network
controllers. Most use the Davicom 9008F or the UMC 9008F. This driver
provides the support code to get these devices working on these platforms.

Generally the NS8390 based eth device is directly connected via the general
purpose bus of the ColdFire CPU. So its addressing and interrupt setup is
fixed on each of the different platforms (classic platform setup).

This driver is based on the other drivers/net/ethernet/8390 drivers, and
includes the lib8390.c code. It uses the existing definitions of the
board NS8390 device addresses, interrupts and access types from the
arch/m68k/include/asm/mcf8390.h, but moves the IO access functions into
the driver code and out of that header.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agom68knommu: move the badly named mcfne.h to a better mcf8390.h
Greg Ungerer [Wed, 4 Jul 2012 13:49:59 +0000 (13:49 +0000)]
m68knommu: move the badly named mcfne.h to a better mcf8390.h

The mcfne.h include contains definitions to support NS8390 eth based hardware
on ColdFire based CPU boards. So change its name to reflect that better.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Fix warnings in ip_do_redirect() for some configurations.
David S. Miller [Thu, 12 Jul 2012 14:40:05 +0000 (07:40 -0700)]
ipv4: Fix warnings in ip_do_redirect() for some configurations.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotipc: limit error messages relating to memory leak to one line
Paul Gortmaker [Wed, 11 Jul 2012 21:35:01 +0000 (17:35 -0400)]
tipc: limit error messages relating to memory leak to one line

With the default name table size of 1024, it is possible that
the sanity check in tipc_nametbl_stop could spam out 1024
essentially identical error messages if memory was corrupted
or similar.  Limit it to issuing no more than a single message.
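
A sketch of the "report once" idea, assuming a table of 1024 chains
where a non-zero entry means the chain is unexpectedly still in use. The
function name and message text are hypothetical.

```c
#include <stdio.h>

#define NAMETBL_SIZE_SKETCH 1024

/*
 * Scans all chains and emits at most one message, however many chains
 * are still populated. Returns the number of messages printed (0 or 1).
 */
int nametbl_stop_check(const int chain_in_use[NAMETBL_SIZE_SKETCH])
{
	int i;

	for (i = 0; i < NAMETBL_SIZE_SKETCH; i++) {
		if (!chain_in_use[i])
			continue;
		/* The chain index is deliberately not printed. */
		fprintf(stderr, "nametbl_stop(): orphaned hash chain detected\n");
		return 1;
	}
	return 0;
}
```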

The actual chain number (i.e. 0 --> 1023) wouldn't provide any
useful insight if/when such an instance happened, so don't
bother printing out that value.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: factor stats struct out of the larger link struct
Paul Gortmaker [Wed, 11 Jul 2012 13:40:43 +0000 (09:40 -0400)]
tipc: factor stats struct out of the larger link struct

This is done to improve readability, and so that we can give
the struct a name that will allow us to declare a local
pointer to it in code, instead of having to always redirect
through the link struct to get to it.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agoMerge branch 'redirect_via_sock'
David S. Miller [Thu, 12 Jul 2012 10:49:19 +0000 (03:49 -0700)]
Merge branch 'redirect_via_sock'

As described in my patch series from the other day, we need to
rearrange redirect handling so that the local initiators of packets
(sockets, tunnels, xfrms, etc.) that implement the protocols compute
the route and pass this down into the ipv4/ipv6 routing code.

These changes here do so by implementing a new dst_ops->redirect
method.

No more do we have this funny code that tries several different sets
of routing keys to try and figure out which route the redirect should
actually be applied to.

No more do we have the problem wherein TOS rewriting causes problems
for us.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Remove checks for dst_ops->redirect being NULL.
David S. Miller [Thu, 12 Jul 2012 07:41:25 +0000 (00:41 -0700)]
net: Remove checks for dst_ops->redirect being NULL.

No longer necessary.
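
The pattern behind this can be sketched as follows: once every ops table
supplies a redirect method, even a do-nothing one, callers can invoke it
unconditionally. All type and function names here are hypothetical
stand-ins, not the kernel's dst_ops definitions.

```c
struct route_sketch;

struct route_ops_sketch {
	void (*redirect)(struct route_sketch *rt);
};

struct route_sketch {
	const struct route_ops_sketch *ops;
	int redirects_seen;
};

/* A route type that actually handles redirects. */
static void real_redirect(struct route_sketch *rt)
{
	rt->redirects_seen++;
}

/* Dummy method for route types that ignore redirects. */
static void dummy_redirect(struct route_sketch *rt)
{
	(void)rt;
}

const struct route_ops_sketch routed_ops = { .redirect = real_redirect };
const struct route_ops_sketch blackhole_ops = { .redirect = dummy_redirect };

/* No NULL check needed: every ops table provides the method. */
void deliver_redirect(struct route_sketch *rt)
{
	rt->ops->redirect(rt);
}
```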

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Add dummy dst_ops->redirect method where needed.
David S. Miller [Thu, 12 Jul 2012 07:39:24 +0000 (00:39 -0700)]
net: Add dummy dst_ops->redirect method where needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().
David S. Miller [Thu, 12 Jul 2012 07:33:37 +0000 (00:33 -0700)]
ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().

And delete rt6_redirect(), since it is no longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Add redirect support to all protocol icmp error handlers.
David S. Miller [Thu, 12 Jul 2012 07:25:15 +0000 (00:25 -0700)]
ipv6: Add redirect support to all protocol icmp error handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.
David S. Miller [Thu, 12 Jul 2012 07:08:07 +0000 (00:08 -0700)]
ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().
David S. Miller [Thu, 12 Jul 2012 07:05:02 +0000 (00:05 -0700)]
ipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().

Hook it into dst_ops->redirect as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Move bulk of redirect handling into rt6_redirect().
David S. Miller [Thu, 12 Jul 2012 06:43:53 +0000 (23:43 -0700)]
ipv6: Move bulk of redirect handling into rt6_redirect().

This sets things up so that we can have the protocol error handlers
call down into the ipv6 route code for redirects just as ipv4 already
does.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Export ndisc option parsing from ndisc.c
David S. Miller [Thu, 12 Jul 2012 06:26:46 +0000 (23:26 -0700)]
ipv6: Export ndisc option parsing from ndisc.c

This is going to be used internally by the rt6 redirect code.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Kill ip_rt_redirect().
David S. Miller [Thu, 12 Jul 2012 04:30:08 +0000 (21:30 -0700)]
ipv4: Kill ip_rt_redirect().

No longer needed, as the protocol handlers now all properly
propagate the redirect back into the routing code.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Add redirect support to all protocol icmp error handlers.
David S. Miller [Thu, 12 Jul 2012 04:27:49 +0000 (21:27 -0700)]
ipv4: Add redirect support to all protocol icmp error handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.
David S. Miller [Thu, 12 Jul 2012 04:25:45 +0000 (21:25 -0700)]
ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.
David S. Miller [Thu, 12 Jul 2012 03:55:47 +0000 (20:55 -0700)]
ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.

All of the redirect acceptance policy is now contained within.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Rearrange arguments to ip_rt_redirect()
David S. Miller [Thu, 12 Jul 2012 03:38:08 +0000 (20:38 -0700)]
ipv4: Rearrange arguments to ip_rt_redirect()

Pass in the SKB rather than just the IP addresses, so that policy
and other aspects can reside in ip_rt_redirect() rather than
icmp_redirect().

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Pull redirect instantiation out into a helper function.
David S. Miller [Thu, 12 Jul 2012 03:27:54 +0000 (20:27 -0700)]
ipv4: Pull redirect instantiation out into a helper function.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Deliver ICMP redirects to sockets too.
David S. Miller [Thu, 12 Jul 2012 01:35:12 +0000 (18:35 -0700)]
ipv4: Deliver ICMP redirects to sockets too.

And thus, we can remove the ping_err() hack.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Pull icmp socket delivery out into a helper function.
David S. Miller [Thu, 12 Jul 2012 01:32:17 +0000 (18:32 -0700)]
ipv4: Pull icmp socket delivery out into a helper function.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: TCP Small Queues
Eric Dumazet [Wed, 11 Jul 2012 05:50:31 +0000 (05:50 +0000)]
tcp: TCP Small Queues

This introduces TSQ (TCP Small Queues).

The goal of TSQ is to reduce the number of TCP packets in xmit queues (qdisc &
device queues), to reduce RTT and cwnd bias, part of the bufferbloat
problem.

sk->sk_wmem_alloc is not allowed to grow above a given limit,
allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
given time.

TSO packets are sized/capped to half the limit, so that we have two
TSO packets in flight, allowing better bandwidth use.

As a side effect, setting the limit to 40000 automatically reduces the
standard gso max limit (65536) to 40000/2. Having smaller TSO packets
can help to reduce latencies of high prio packets.
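
The sizing rule in the two paragraphs above amounts to the following
arithmetic. The function and macro names are hypothetical; the 65536
ceiling and the halving come from the text.

```c
/* Standard gso max limit quoted above. */
#define GSO_MAX_SKETCH 65536u

/*
 * TSO packets are capped to half the per-socket output limit, and never
 * exceed the standard gso maximum.
 */
unsigned int tso_size_cap(unsigned int limit_output_bytes)
{
	unsigned int half = limit_output_bytes / 2;

	return half < GSO_MAX_SKETCH ? half : GSO_MAX_SKETCH;
}
```

With a limit of 40000 the cap becomes 20000, which matches the 40000/2
figure above.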

This means we divert sock_wfree() to a tcp_wfree() handler, to
queue/send following frames when skb_orphan() [2] is called for the
already queued skbs.

Results on my dev machines (tg3/ixgbe nics) are really impressive,
using standard pfifo_fast, and with or without TSO/GSO.

Without reduction of nominal bandwidth, we have reduction of buffering
per bulk sender :
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)

I no longer have 4 MBytes backlogged in qdisc by a single netperf
session, and socket autotuning on both sides no longer uses 4 MBytes.

As the skb destructor cannot restart xmit itself (as the qdisc lock might
be taken at this point), we delegate the work to a tasklet. We use one
tasklet per cpu for performance reasons.

If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag.
This flag is tested in a new protocol method called from release_sock(),
to eventually send new segments.

[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time,
  but some drivers call it in their start_xmit() handler.
  These drivers should at least use BQL, or else a single TCP
  session can still fill the whole NIC TX ring, since TSQ will
  have no effect.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: Fix out of bounds access to tcpm_vals
Alexander Duyck [Thu, 12 Jul 2012 00:18:04 +0000 (17:18 -0700)]
tcp: Fix out of bounds access to tcpm_vals

The recent patch "tcp: Maintain dynamic metrics in local cache." introduced
an out of bounds access due to what appears to be a typo.   I believe this
change should resolve the issue by replacing the access to RTAX_CWND with
TCP_METRIC_CWND.
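
A sketch of why the wrong constant overruns the array. The enum values
below are my recollection of the two index spaces at the time and should
be treated as assumptions; the names carry an SK_ prefix to mark them as
stand-ins for the kernel constants.

```c
/* Assumed value of the rtnetlink metric index for cwnd. */
enum { SK_RTAX_CWND = 7 };

/* Assumed layout of the TCP metrics cache index space. */
enum {
	SK_TCP_METRIC_RTT,
	SK_TCP_METRIC_RTTVAR,
	SK_TCP_METRIC_SSTHRESH,
	SK_TCP_METRIC_CWND,
	SK_TCP_METRIC_REORDERING,
	SK_TCP_METRIC_MAX = SK_TCP_METRIC_REORDERING
};

/* Number of slots in a vals array sized by the second enum. */
#define SK_TCPM_VALS_LEN (SK_TCP_METRIC_MAX + 1)

/* Returns 1 if idx is a valid index into such an array. */
int tcpm_index_in_bounds(int idx)
{
	return idx >= 0 && idx < SK_TCPM_VALS_LEN;
}
```

Under these assumed values the array has five slots, so indexing it with
the rtnetlink constant reads past the end, while the TCP metric constant
stays inside.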

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>