platform/upstream/kernel-adaptation-pc.git
13 years ago caif: Use RCU instead of spin-lock in caif_dev.c
sjur.brandeland@stericsson.com [Fri, 13 May 2011 02:44:00 +0000 (02:44 +0000)]
caif: Use RCU instead of spin-lock in caif_dev.c

RCU read_lock and a refcount are used to protect in-flight packets.

Use RCU and counters to manage freeing lower part of the CAIF stack if
CAIF-link layer is removed. Old solution based on delaying removal of
device is removed.

When CAIF link layer goes down the use of CAIF link layer is disabled
(by calling caif_set_phy_state()), but removal and freeing of the
lower part of the CAIF stack is done when Link layer is unregistered.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago caif: Use rcu_read_lock in CAIF mux layer.
sjur.brandeland@stericsson.com [Fri, 13 May 2011 02:43:59 +0000 (02:43 +0000)]
caif: Use rcu_read_lock in CAIF mux layer.

Replace spin_lock with rcu_read_lock when accessing lists to layers
and cache. While packets are in flight rcu_read_lock should not be held,
instead ref-counters are used in combination with RCU.
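
The get/put pattern described above can be sketched in user-space C. This is
only an illustration under stated assumptions: struct layer_sketch,
layer_table, read_section_enter() and read_section_exit() are hypothetical
names, the two section functions are empty placeholders marking where
rcu_read_lock()/rcu_read_unlock() would sit, and C11 atomics stand in for the
kernel's reference counting.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_LAYERS_SKETCH 4

/* Hypothetical stand-in for a CAIF layer; not the kernel's data structure. */
struct layer_sketch {
    atomic_int refcnt;   /* number of in-flight users */
    int delivered;       /* packets handed to this layer */
};

/* Lookup table that a real implementation would protect with RCU. */
struct layer_sketch *layer_table[MAX_LAYERS_SKETCH];

/* Placeholders marking where rcu_read_lock()/rcu_read_unlock() would sit. */
void read_section_enter(void) { }
void read_section_exit(void) { }

/* Look the layer up inside the read-side section and pin it with a reference. */
struct layer_sketch *layer_get(int id)
{
    struct layer_sketch *layer = NULL;

    read_section_enter();
    if (id >= 0 && id < MAX_LAYERS_SKETCH)
        layer = layer_table[id];
    if (layer)
        atomic_fetch_add(&layer->refcnt, 1);
    read_section_exit();
    return layer;
}

/* Drop the reference taken by layer_get(). */
void layer_put(struct layer_sketch *layer)
{
    atomic_fetch_sub(&layer->refcnt, 1);
}

/* The packet is processed outside the read-side section, holding only the reference. */
int deliver_packet(int id)
{
    struct layer_sketch *layer = layer_get(id);

    if (!layer)
        return -1;
    layer->delivered++;
    layer_put(layer);
    return 0;
}
```

The read-side section covers only the lookup; the reference keeps the layer
alive for as long as the packet is in flight.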

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: ping: small changes
Eric Dumazet [Fri, 13 May 2011 22:59:19 +0000 (22:59 +0000)]
net: ping: small changes

ping_table is not __read_mostly, since it contains one rwlock,
and is static to ping.c

ping_port_rover & ping_v4_lookup are static

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Sun, 15 May 2011 05:08:23 +0000 (01:08 -0400)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6

13 years ago Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-merge
David S. Miller [Sun, 15 May 2011 02:47:51 +0000 (22:47 -0400)]
Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-merge

13 years ago ixgbe: Add support for new 82599 adapter
Don Skidmore [Sat, 14 May 2011 06:36:35 +0000 (06:36 +0000)]
ixgbe: Add support for new 82599 adapter

This patch adds support for a new adapter in the 82599 family.  Included
in that support is a new media_type ixgbe_media_type_fiber_lco.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: fix sparse warning
Emil Tantilov [Sat, 7 May 2011 06:49:18 +0000 (06:49 +0000)]
ixgbe: fix sparse warning

error: bad constant expression

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: cleanup some minor issues in ixgbe_down()
Alexander Duyck [Fri, 22 Apr 2011 04:08:14 +0000 (04:08 +0000)]
ixgbe: cleanup some minor issues in ixgbe_down()

This patch cleans up two minor issues in ixgbe_down.  Specifically it
addresses the fact that the VFs should not be pinged until after interrupts
are disabled otherwise they might still get a response.  It also drops the
use of the txdctl temporary variable since the only bit we should be
writing to the TXDCTL registers during a shutdown is the flush bit.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: Merge over-temp task into service task
Alexander Duyck [Fri, 22 Apr 2011 04:08:09 +0000 (04:08 +0000)]
ixgbe: Merge over-temp task into service task

This change merges the over-temp task into the service task.  As a result
all tasklets are finally combined into one single tasklet for easier
management.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: Merge ATR reinit into the service task
Alexander Duyck [Wed, 27 Apr 2011 09:25:34 +0000 (09:25 +0000)]
ixgbe: Merge ATR reinit into the service task

This change merges the ATR table reinitialization into the service task.
This is yet another opportunity to avoid any race conditions as we don't
want to be attempting to reinitialize the table during a possible reset.

In addition this change adds a counter for table reinitialization so that
it can be tracked as part of the regular statistics.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: merge reset task into service task
Alexander Duyck [Wed, 27 Apr 2011 09:21:16 +0000 (09:21 +0000)]
ixgbe: merge reset task into service task

This change is meant to further help to reduce possible configuration
collisions between the various tasklets.  This change combines the device
reset with the service task.  As a result it is now not possible to be
updating the link on the device while also resetting the part.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: Merge watchdog functionality into service task
Alexander Duyck [Fri, 22 Apr 2011 04:07:54 +0000 (04:07 +0000)]
ixgbe: Merge watchdog functionality into service task

This patch is meant to merge the functionality of the ixgbe watchdog task
into the service task.  By doing this all link state functionality will be
controlled by a single task.  As a result the reliability of the interface
will be improved as the likelihood of any race conditions is further
reduced.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: Combine SFP and multi-speed fiber task into single service task
Alexander Duyck [Wed, 27 Apr 2011 09:13:56 +0000 (09:13 +0000)]
ixgbe: Combine SFP and multi-speed fiber task into single service task

This change is meant to address several race conditions with multi-speed
fiber SFP+ modules in 82599 adapters.  Specifically issues have been seen
in which both the SFP configuration and the multi-speed fiber configuration
are running simultaneously which will result in the device getting into an
erroneous link down state.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: move flags and state into the same cacheline
Alexander Duyck [Fri, 22 Apr 2011 04:07:43 +0000 (04:07 +0000)]
ixgbe: move flags and state into the same cacheline

This change moves flags and state into the same cacheline.  The reason for
this change is because both are frequently read around the same time and
infrequently written.  Keeping them in the same cacheline should help to
reduce the number of cachelines touched when both are read.
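
As a rough illustration of the layout idea, the following sketch checks
whether two members fall into the same cacheline. The struct layouts, the
64-byte line size and the assumption that the struct starts on a cacheline
boundary are all hypothetical; this is not the real ixgbe adapter structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines and a struct that starts on a line boundary. */
#define CACHELINE_BYTES_SKETCH 64

/* Hypothetical layout where unrelated members separate state and flags. */
struct adapter_split {
    unsigned long state;
    char other[128];        /* pushes flags onto a different cacheline */
    unsigned int flags;
};

/* Hypothetical layout where the two members read together are adjacent. */
struct adapter_grouped {
    unsigned long state;
    unsigned int flags;
    char other[128];
};

/* True when both byte offsets land in the same cacheline. */
bool same_cacheline(size_t off_a, size_t off_b)
{
    return off_a / CACHELINE_BYTES_SKETCH == off_b / CACHELINE_BYTES_SKETCH;
}
```

Calling same_cacheline() with offsetof() of state and flags gives true for
adapter_grouped and false for adapter_split under the assumptions above.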

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: force unlock on timeout
Emil Tantilov [Fri, 8 Apr 2011 01:23:59 +0000 (01:23 +0000)]
ixgbe: force unlock on timeout

The semaphore can be in locked state upon driver load, particularly
on 82598 if a machine is rebooted due to panic and the semaphore was
acquired just prior to the panic.

This patch unlocks the semaphore if it times out.
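
A minimal sketch of the "force unlock on timeout" idea, assuming a simulated
semaphore bit. struct hw_sem_sketch and all function names are hypothetical
and do not reflect the driver's actual register access; a real driver would
also delay between polls.

```c
#include <stdbool.h>

/* Hypothetical model of a hardware semaphore bit; not the ixgbe register layout. */
struct hw_sem_sketch {
    bool locked;            /* true while some owner holds the semaphore */
    int forced_unlocks;     /* how often the timeout path had to release it */
};

/* Take the semaphore if it is free. */
bool sem_try_acquire(struct hw_sem_sketch *hw)
{
    if (hw->locked)
        return false;
    hw->locked = true;
    return true;
}

void sem_release(struct hw_sem_sketch *hw)
{
    hw->locked = false;
}

/* Poll up to max_tries times; on timeout assume a stale lock, release it, try once more. */
int sem_acquire_or_force(struct hw_sem_sketch *hw, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (sem_try_acquire(hw))
            return 0;
    }
    sem_release(hw);
    hw->forced_unlocks++;
    return sem_try_acquire(hw) ? 0 : -1;
}
```

With a semaphore left locked by a previous boot, the polling loop times out,
the stale lock is released, and the final attempt succeeds.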

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: Add macvlan support for VF
Greg Rose [Fri, 13 May 2011 01:33:48 +0000 (01:33 +0000)]
ixgbe: Add macvlan support for VF

Add infrastructure in the PF driver to support macvlan in the VF driver.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbevf: Add macvlan support in the set rx mode op
Greg Rose [Fri, 13 May 2011 01:33:42 +0000 (01:33 +0000)]
ixgbevf: Add macvlan support in the set rx mode op

Implement setup of unicast address list in the VF driver's set_rx_mode
netdev op.  Unicast addresses are sent to the PF via a mailbox message
and the PF will check if it has room in the RAR table and if so set the
filter for the VF.
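
The PF-side check described above might look roughly like the following
sketch. The table size, struct names and return convention are assumptions
made for illustration; the mailbox transport between VF and PF is left out.

```c
#include <stdint.h>
#include <string.h>

#define RAR_ENTRIES_SKETCH 4     /* hypothetical table size */

/* Hypothetical receive address table entry. */
struct rar_entry_sketch {
    uint8_t addr[6];
    int vf;
    int in_use;
};

struct pf_sketch {
    struct rar_entry_sketch rar[RAR_ENTRIES_SKETCH];
};

/* Handle a VF's unicast address request: use a free slot, or report that there is no room. */
int pf_set_vf_macvlan(struct pf_sketch *pf, int vf, const uint8_t addr[6])
{
    for (int i = 0; i < RAR_ENTRIES_SKETCH; i++) {
        if (!pf->rar[i].in_use) {
            memcpy(pf->rar[i].addr, addr, 6);
            pf->rar[i].vf = vf;
            pf->rar[i].in_use = 1;
            return i;           /* slot index that now holds the filter */
        }
    }
    return -1;                  /* table full, request refused */
}
```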

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago e1000e: minor comment cleanups
Bruce Allan [Fri, 13 May 2011 07:19:58 +0000 (07:19 +0000)]
e1000e: minor comment cleanups

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago batman-adv: reset broadcast flood protection on error
Marek Lindner [Sat, 14 May 2011 18:01:22 +0000 (20:01 +0200)]
batman-adv: reset broadcast flood protection on error

The broadcast flood protection should be reset to its original value
if the primary interface could not be retrieved.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
13 years ago batman-adv: Add missing hardif_free_ref in forw_packet_free
Sven Eckelmann [Wed, 11 May 2011 18:59:06 +0000 (20:59 +0200)]
batman-adv: Add missing hardif_free_ref in forw_packet_free

add_bcast_packet_to_list increases the refcount for if_incoming but the
reference count is never decreased. The reference count must be
increased for all kinds of forwarded packets which have the primary
interface stored and forw_packet_free must decrease them. Also
purge_outstanding_packets has to invoke forw_packet_free when a work
item was really cancelled.

This regression was introduced in
32ae9b221e788413ce68feaae2ca39e406211a0a.

Reported-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
13 years ago ipv4: Remove rt->rt_dst reference from ip_forward_options().
David S. Miller [Fri, 13 May 2011 21:31:02 +0000 (17:31 -0400)]
ipv4: Remove rt->rt_dst reference from ip_forward_options().

At this point iph->daddr equals what rt->rt_dst would hold.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Remove route key identity dependencies in ip_rt_get_source().
David S. Miller [Fri, 13 May 2011 21:29:41 +0000 (17:29 -0400)]
ipv4: Remove route key identity dependencies in ip_rt_get_source().

Pass in the sk_buff so that we can fetch the necessary keys from
the packet header when working with input routes.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Always call ip_options_build() after rest of IP header is filled in.
David S. Miller [Fri, 13 May 2011 21:21:27 +0000 (17:21 -0400)]
ipv4: Always call ip_options_build() after rest of IP header is filled in.

This will allow ip_options_build() to reliably look at the values of
iph->{daddr,saddr}

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Kill spurious write to iph->daddr in ip_forward_options().
David S. Miller [Fri, 13 May 2011 21:15:50 +0000 (17:15 -0400)]
ipv4: Kill spurious write to iph->daddr in ip_forward_options().

This code block executes when opt->srr_is_hit is set.  It will be
set only by ip_options_rcv_srr().

ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR
option addresses, and when it matches one 1) looks up the route for
that nexthop and 2) on route lookup success it writes that nexthop
value into iph->daddr.

ip_forward_options() runs later, and again walks the SRR option
addresses looking for the option matching the destination of the route
stored in skb_rtable().  This route will be the same exact one looked
up for the nexthop by ip_options_rcv_srr().

Therefore "rt->rt_dst == iph->daddr" must be true.

All it really needs to do is record the route's source address in the
matching SRR option address.  It need not write iph->daddr again,
since that has already been done by ip_options_rcv_srr() as detailed
above.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago olympic: convert to seq_file
Alexey Dobriyan [Fri, 13 May 2011 20:50:49 +0000 (16:50 -0400)]
olympic: convert to seq_file

->read_proc interface is going away, switch to seq_file.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net:set valid name before calling ndo_init()
Peter Pan(潘卫平) [Thu, 12 May 2011 15:46:56 +0000 (15:46 +0000)]
net:set valid name before calling ndo_init()

Commit 1c5cae815d19 (net: call dev_alloc_name from register_netdevice)
introduced a bug in bonding; see examples 1 and 2.

In register_netdevice(), the name of the net_device is not valid until
dev_get_valid_name() is called. But dev->netdev_ops->ndo_init() (that is,
bond_init()) is called before dev_get_valid_name(),
and it uses the invalid name of the net_device.

I think register_netdevice() should make sure that the name of net_device is
valid before calling ndo_init().
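
The ordering problem can be sketched in user-space C as follows. Everything
here is hypothetical (struct netdev_sketch, resolve_name(), init_hook(),
register_sketch()); it only mirrors the idea that the "bond%d" template must
be expanded before the init hook derives other names from it, and it takes
the index as a parameter instead of searching for a free one.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical device record; not the kernel's struct net_device. */
struct netdev_sketch {
    char name[16];          /* may still hold a template such as "bond%d" */
    char proc_entry[32];    /* derived from name by the init hook */
};

/* Replace a "%d" template with a concrete index; leave concrete names alone. */
int resolve_name(struct netdev_sketch *dev, int index)
{
    char resolved[sizeof dev->name];
    char *p = strstr(dev->name, "%d");
    int n;

    if (!p)
        return 0;
    n = snprintf(resolved, sizeof resolved, "%.*s%d",
                 (int)(p - dev->name), dev->name, index);
    if (n < 0 || (size_t)n >= sizeof resolved)
        return -1;
    strcpy(dev->name, resolved);
    return 0;
}

/* Stands in for ndo_init(): derives a proc entry name from the device name. */
void init_hook(struct netdev_sketch *dev)
{
    snprintf(dev->proc_entry, sizeof dev->proc_entry, "bonding/%s", dev->name);
}

/* Fixed ordering: make the name valid first, then run the init hook. */
int register_sketch(struct netdev_sketch *dev, int index)
{
    if (resolve_name(dev, index) < 0)
        return -1;
    init_hook(dev);
    return 0;
}
```

If init_hook() ran before resolve_name(), proc_entry would be built from the
unexpanded "bond%d" template, which is the symptom shown in the examples
below.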

example 1:
modprobe bonding
ls  /proc/net/bonding/bond%d

ps -eLf
root      3398     2  3398  0    1 21:34 ?        00:00:00 [bond%d]

example 2:
modprobe bonding max_bonds=3

[  170.100292] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[  170.101090] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
[  170.102469] ------------[ cut here ]------------
[  170.103150] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
[  170.104075] Hardware name: VirtualBox
[  170.105065] proc_dir_entry 'bonding/bond%d' already registered
[  170.105613] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
[  170.108397] Pid: 3457, comm: modprobe Not tainted 2.6.39-rc2+ #14
[  170.108935] Call Trace:
[  170.109382]  [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
[  170.109911]  [<c051a42a>] ? proc_register+0x126/0x157
[  170.110329]  [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
[  170.110846]  [<c051a42a>] proc_register+0x126/0x157
[  170.111870]  [<c051a4dd>] proc_create_data+0x82/0x98
[  170.112335]  [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
[  170.112905]  [<f94dd806>] bond_init+0x77/0xa5 [bonding]
[  170.113319]  [<c0721ac6>] register_netdevice+0x8c/0x1d3
[  170.113848]  [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
[  170.114322]  [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
[  170.114879]  [<c0401240>] do_one_initcall+0x76/0x122
[  170.115317]  [<f94f4000>] ? 0xf94f3fff
[  170.115799]  [<c0463f1e>] sys_init_module+0x1286/0x140d
[  170.116879]  [<c07c6d9f>] sysenter_do_call+0x12/0x28
[  170.117404] ---[ end trace 64e4fac3ae5fff1a ]---
[  170.117924] bond%d: Warning: failed to register to debugfs
[  170.128728] ------------[ cut here ]------------
[  170.129360] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
[  170.130323] Hardware name: VirtualBox
[  170.130797] proc_dir_entry 'bonding/bond%d' already registered
[  170.131315] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
[  170.133731] Pid: 3457, comm: modprobe Tainted: G        W   2.6.39-rc2+ #14
[  170.134308] Call Trace:
[  170.134743]  [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
[  170.135305]  [<c051a42a>] ? proc_register+0x126/0x157
[  170.135820]  [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
[  170.137168]  [<c051a42a>] proc_register+0x126/0x157
[  170.137700]  [<c051a4dd>] proc_create_data+0x82/0x98
[  170.138174]  [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
[  170.138745]  [<f94dd806>] bond_init+0x77/0xa5 [bonding]
[  170.139278]  [<c0721ac6>] register_netdevice+0x8c/0x1d3
[  170.139828]  [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
[  170.140361]  [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
[  170.140927]  [<c0401240>] do_one_initcall+0x76/0x122
[  170.141494]  [<f94f4000>] ? 0xf94f3fff
[  170.141975]  [<c0463f1e>] sys_init_module+0x1286/0x140d
[  170.142463]  [<c07c6d9f>] sysenter_do_call+0x12/0x28
[  170.142974] ---[ end trace 64e4fac3ae5fff1b ]---
[  170.144949] bond%d: Warning: failed to register to debugfs

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago stmmac: fix autoneg in set_pauseparam
Giuseppe CAVALLARO [Thu, 12 May 2011 20:28:05 +0000 (20:28 +0000)]
stmmac: fix autoneg in set_pauseparam

This patch fixes a bug in the set_pauseparam
function, which did not manage the ANE field
correctly and returned broken values when
ethtool -A|-a was used.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago stmmac: don't go through ethtool to start auto-negotiation
David Decotigny [Thu, 12 May 2011 20:28:04 +0000 (20:28 +0000)]
stmmac: don't go through ethtool to start auto-negotiation

The driver used to call the PHY's ethtool configuration routine to start
auto-negotiation. This change has it call the PHY's routine to start
auto-negotiation directly.

The initial version was hiding phy_start_aneg() return value,
this patch returns it (<0 upon error).

Tested: module compiles, tested on STM HDK7108 STB.

Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago drivers/isdn/hisax: Drop unused list
Julia Lawall [Fri, 13 May 2011 04:15:39 +0000 (04:15 +0000)]
drivers/isdn/hisax: Drop unused list

The file st5481_init.c locally defines and initializes the adapter_list
variable, but does not use it for anything.  Removing the list makes it
possible to remove the list field from the st5481_adapter data structure.
In the function probe_st5481, it also makes it possible to free the locally
allocated adapter value on an error exit.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: ipv4: add IPPROTO_ICMP socket kind
Vasiliy Kulikov [Fri, 13 May 2011 10:01:00 +0000 (10:01 +0000)]
net: ipv4: add IPPROTO_ICMP socket kind

This patch adds IPPROTO_ICMP socket kind.  It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges.  In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping.  In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).

Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/

A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, IPPROTO_ICMP)

Message identifiers (octets 4-5 of ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes; port 0 is reserved for the API ("let
the kernel pick a free number"). There is no notion of remote ports; remote
port numbers provided by the user (e.g. in connect()) are ignored.

Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport header values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.

ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.
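
A user-space sketch of what a caller might hand to send() on such a socket.
The helper names (open_ping_socket(), inet_checksum(), build_echo_request())
and the _SKETCH constants are made up for this example. Per the description
above the kernel overwrites the id and recomputes the checksum, so the values
filled in here are illustrative only.

```c
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define ICMP_ECHO_TYPE_SKETCH 8   /* value of ICMP_ECHO */
#define ICMP_HDR_LEN_SKETCH   8

/* Returns a ping socket, or -1 (for example when the caller's group is not permitted). */
int open_ping_socket(void)
{
    return socket(PF_INET, SOCK_DGRAM, IPPROTO_ICMP);
}

/* Internet checksum (RFC 1071) computed over big-endian 16-bit words. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;       /* odd trailing byte, zero padded */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16); /* fold carries back in */
    return (uint16_t)(~sum & 0xffff);
}

/* Writes type, code, checksum, id, sequence and payload; returns the length, or 0 if it does not fit. */
size_t build_echo_request(uint8_t *pkt, size_t cap, uint16_t id, uint16_t seq,
                          const void *payload, size_t payload_len)
{
    size_t total = ICMP_HDR_LEN_SKETCH + payload_len;
    uint16_t csum;

    if (total > cap)
        return 0;
    pkt[0] = ICMP_ECHO_TYPE_SKETCH;         /* type must be ICMP_ECHO */
    pkt[1] = 0;                             /* code must be zero */
    pkt[2] = 0;                             /* checksum field is zero while summing */
    pkt[3] = 0;
    pkt[4] = (uint8_t)(id >> 8);            /* octets 4-5: identifier */
    pkt[5] = (uint8_t)(id & 0xff);
    pkt[6] = (uint8_t)(seq >> 8);
    pkt[7] = (uint8_t)(seq & 0xff);
    memcpy(pkt + ICMP_HDR_LEN_SKETCH, payload, payload_len);
    csum = inet_checksum(pkt, total);
    pkt[2] = (uint8_t)(csum >> 8);
    pkt[3] = (uint8_t)(csum & 0xff);
    return total;
}
```

A packet built this way checksums to zero when inet_checksum() is run over it
again, which is the usual receiver-side validity check.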

ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).

socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range".  It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets.  Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100
4294967295" would enable it for the users, but not daemons.
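
The range semantics can be sketched as below. This is a simplification under
stated assumptions: the type and function names are hypothetical, and a
single gid is checked against the range, whereas the real check concerns the
groups of the calling process.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the two numbers in ping_group_range. */
struct gid_range_sketch {
    uint32_t low;
    uint32_t high;
};

/* Parse text such as "100 100" into a range; returns false on malformed input. */
bool parse_gid_range(const char *text, struct gid_range_sketch *out)
{
    return sscanf(text, "%" SCNu32 " %" SCNu32, &out->low, &out->high) == 2;
}

/* A gid is allowed when it lies inside the inclusive range. */
bool gid_allowed(const struct gid_range_sketch *r, uint32_t gid)
{
    return r->low <= gid && gid <= r->high;
}
```

With the default "1 0" the lower bound exceeds the upper bound, so no gid can
satisfy both comparisons and nobody is allowed.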

The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).

Userspace ping util & patch for it:
http://openwall.info/wiki/people/segoon/ping

For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro.  A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/

Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.

All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I) are tested with
the patch.

PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago convert old cpumask API into new one
KOSAKI Motohiro [Thu, 12 May 2011 18:45:09 +0000 (18:45 +0000)]
convert old cpumask API into new one

Adapt new API.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago af_iucv: get rid of compile warning
Ursula Braun [Thu, 12 May 2011 18:45:08 +0000 (18:45 +0000)]
af_iucv: get rid of compile warning

-Wunused-but-set-variable generates compile warnings. The affected
variables are removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago iucv: get rid of compile warning
Ursula Braun [Thu, 12 May 2011 18:45:07 +0000 (18:45 +0000)]
iucv: get rid of compile warning

-Wunused-but-set-variable generates a compile warning. The affected
variable is removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ctcm: get rid of compile warning
Ursula Braun [Thu, 12 May 2011 18:45:06 +0000 (18:45 +0000)]
ctcm: get rid of compile warning

-Wunused-but-set-variable generates compile warnings. The affected
variables are removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago lcs: get rid of compile warning
Heiko Carstens [Thu, 12 May 2011 18:45:05 +0000 (18:45 +0000)]
lcs: get rid of compile warning

-Wunused-but-set-variable generates a compile warning for lcs' tasklet
function. The invoked functions already contain error handling; thus
additional return code checking is not needed here.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago claw: remove unused return code handling
Heiko Carstens [Thu, 12 May 2011 18:45:04 +0000 (18:45 +0000)]
claw: remove unused return code handling

Remove unused return code handling. The claw driver is mostly dead, so
just make sure it keeps compiling without warnings.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago qeth: add owner to ccw driver
Sebastian Ott [Thu, 12 May 2011 18:45:03 +0000 (18:45 +0000)]
qeth: add owner to ccw driver

Fill in the owner of qeth's ccw device driver.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago qeth: add OSA concurrent hardware trap
Frank Blaschka [Thu, 12 May 2011 18:45:02 +0000 (18:45 +0000)]
qeth: add OSA concurrent hardware trap

This patch improves FFDC (first failure data capture) by requesting
a hardware trace in case the device driver, the hardware or a user
detects an error.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago qeth: convert to hw_features part 2
Frank Blaschka [Thu, 12 May 2011 18:45:01 +0000 (18:45 +0000)]
qeth: convert to hw_features part 2

Set rx csum default to hw checksumming again.
Remove sysfs interface for rx csum (checksumming) and TSO (large_send).
With the new hw_features it does not work to keep the old sysfs
interface in parallel. Convert options.checksum_type to new hw_features.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago qlcnic: Bumped up version number to 5.0.18
Anirban Chakraborty [Thu, 12 May 2011 12:48:35 +0000 (12:48 +0000)]
qlcnic: Bumped up version number to 5.0.18

Update driver version number

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago qlcnic: Take FW dump via ethtool
Anirban Chakraborty [Thu, 12 May 2011 12:48:34 +0000 (12:48 +0000)]
qlcnic: Take FW dump via ethtool

Driver checks if the previous dump has been cleared before taking the dump.
It doesn't take the dump if it is not cleared.

Changes from v2:
Added lock to protect dump data structures from being mangled while
dumping or setting them via ethtool.

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago qlcnic: FW dump support
Anirban Chakraborty [Thu, 12 May 2011 12:48:33 +0000 (12:48 +0000)]
qlcnic: FW dump support

Added code to take FW dump.
o Driver queries FW at the init time and gets the dump template
o It takes FW dump as per the dump template
o Level of FW dump (and its size) is configured via dump flag

Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago be2net: fix mbox polling for signal reception
Sathya Perla [Thu, 12 May 2011 19:32:16 +0000 (19:32 +0000)]
be2net: fix mbox polling for signal reception

Sending mbox cmds requires multiple steps of writing to the DB register and polling
for an ack. Getting interrupted in the middle by a signal breaks the mbox protocol.
Use msleep() to not get interrupted.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago be2net: handle signal reception while waiting for POST
Sathya Perla [Thu, 12 May 2011 19:32:15 +0000 (19:32 +0000)]
be2net: handle signal reception while waiting for POST

If waiting on POST returns prematurely (due to a signal), abort polling and return an error.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ethtool: Added support for FW dump
Anirban Chakraborty [Thu, 12 May 2011 12:48:32 +0000 (12:48 +0000)]
ethtool: Added support for FW dump

Added code to take FW dump via ethtool. Dump level can be controlled via setting the
dump flag. A get function is provided to query the current setting of the dump flag.
Dump data is obtained from the driver via a separate get function.

Changes from v3:
Fixed buffer length issue in ethtool_get_dump_data function.
Updated kernel doc for ethtool_dump struct and get_dump_flag function.

Changes from v2:
Provided separate commands for get flag and data.
Check for the minimum of the two buffer lengths obtained via ethtool and the driver and
use that for the dump buffer.
Pass the driver's error return codes up to the caller.
Added kernel doc comments.
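
The "minimum of the two buffer lengths" rule can be sketched like this.
copy_dump_sketch() is a hypothetical helper written for illustration, not the
ethtool code itself.

```c
#include <stddef.h>
#include <string.h>

/* Copy at most the smaller of the requested length and the available dump length. */
size_t copy_dump_sketch(void *dst, size_t requested_len,
                        const void *dump, size_t dump_len)
{
    size_t n = requested_len < dump_len ? requested_len : dump_len;

    memcpy(dst, dump, n);
    return n;       /* number of bytes actually copied */
}
```

Clamping to the smaller length avoids both overrunning the destination and
reading past the end of the dump.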

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago netdevice.h: Align struct net_device members
Joe Perches [Mon, 9 May 2011 17:42:46 +0000 (17:42 +0000)]
netdevice.h: Align struct net_device members

Save a bit of space.
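
How member ordering can save space, shown on two made-up structs (they have
nothing to do with the actual members of struct net_device). On a typical
LP64 ABI the first layout would be expected to take 24 bytes and the second
16, but the exact sizes are ABI-dependent.

```c
#include <stddef.h>

/* A small member on each side of a long leaves padding around it. */
struct members_loose {
    char a;
    long b;
    char c;
};

/* Same members, with the small ones grouped after the long. */
struct members_tight {
    long b;
    char a;
    char c;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
size_t bytes_saved_sketch(void)
{
    return sizeof(struct members_loose) - sizeof(struct members_tight);
}
```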

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Fix 'iph' use before set.
David S. Miller [Fri, 13 May 2011 03:03:46 +0000 (23:03 -0400)]
ipv4: Fix 'iph' use before set.

I swear none of my compilers warned about this, yet it is so
obvious.

> net/ipv4/ip_forward.c: In function 'ip_forward':
> net/ipv4/ip_forward.c:87: warning: 'iph' may be used uninitialized in this function

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-next-2.6
David S. Miller [Fri, 13 May 2011 03:01:55 +0000 (23:01 -0400)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-next-2.6

13 years ago ipv4: Elide use of rt->rt_dst in ip_forward()
David S. Miller [Thu, 12 May 2011 23:34:30 +0000 (19:34 -0400)]
ipv4: Elide use of rt->rt_dst in ip_forward()

No matter what kind of header mangling occurs due to IP options
processing, rt->rt_dst will always equal iph->daddr in the packet.

So we can safely use iph->daddr instead of rt->rt_dst here.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Simplify iph->daddr overwrite in ip_options_rcv_srr().
David S. Miller [Thu, 12 May 2011 23:30:58 +0000 (19:30 -0400)]
ipv4: Simplify iph->daddr overwrite in ip_options_rcv_srr().

We already copy the 4-byte nexthop from the options block into
local variable "nexthop" for the route lookup.

Re-use that variable instead of memcpy()'ing again when assigning
to iph->daddr after the route lookup succeeds.
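
The simplification can be sketched as follows, with hypothetical names
throughout (struct iphdr_sketch, route_lookup_fn, lookup_ok_sketch(),
srr_apply_nexthop()); the route lookup is reduced to a callback that returns
0 on success.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, minimal stand-in for the IP header. */
struct iphdr_sketch {
    uint32_t daddr;
};

/* Stand-in for the route lookup; returns 0 on success. */
typedef int (*route_lookup_fn)(uint32_t nexthop);

/* Hypothetical lookup stub that always succeeds. */
int lookup_ok_sketch(uint32_t nexthop)
{
    (void)nexthop;
    return 0;
}

int srr_apply_nexthop(struct iphdr_sketch *iph, const uint8_t *opt_addr,
                      route_lookup_fn lookup)
{
    uint32_t nexthop;

    /* The 4-byte address is copied out of the options block once. */
    memcpy(&nexthop, opt_addr, sizeof(nexthop));
    if (lookup(nexthop) != 0)
        return -1;
    /* The same local is reused here instead of a second memcpy(). */
    iph->daddr = nexthop;
    return 0;
}
```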

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Kill spurious opt->srr check in ip_options_rcv_srr().
David S. Miller [Thu, 12 May 2011 23:26:57 +0000 (19:26 -0400)]
ipv4: Kill spurious opt->srr check in ip_options_rcv_srr().

All call sites conditionalize the call to ip_options_rcv_srr()
with a check of opt->srr, so no need to check it again there.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago bonding: convert to ndo_fix_features
Michał Mirosław [Sat, 7 May 2011 03:22:17 +0000 (03:22 +0000)]
bonding: convert to ndo_fix_features

This should also fix updating of vlan_features and propagating changes to
VLAN devices on the bond.

Side effect: it allows user to force-disable some offloads on the bond
interface.

Note: NETIF_F_VLAN_CHALLENGED is managed by bond_fix_features() now.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: introduce netdev_change_features()
Michał Mirosław [Sat, 7 May 2011 03:22:17 +0000 (03:22 +0000)]
net: introduce netdev_change_features()

It will be needed by bonding and other drivers changing vlan_features
after ndo_init callback.

As a bonus, this includes kernel-doc for netdev_update_features().

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoCDC NCM: Add missing short packet in cdc_ncm driver
Alexey Orishko [Fri, 6 May 2011 03:01:30 +0000 (03:01 +0000)]
CDC NCM: Add missing short packet in cdc_ncm driver

Changes:
- while making NTB, driver shall check if device dwNtbOutMaxSize is higher than
 host value and shall add a short packet if this is the case
- previous temporary patch for this issue is replaced by this one

Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipvs: Remove all remaining references to rt->rt_{src,dst}
Julian Anastasov [Tue, 10 May 2011 12:46:05 +0000 (12:46 +0000)]
ipvs: Remove all remaining references to rt->rt_{src,dst}

Remove all remaining references to rt->rt_{src,dst}
by using dest->dst_saddr to cache saddr (used for TUN mode).
For ICMP in FORWARD hook just restrict the rt_mode for NAT
to disable LOCALNODE. All other modes do not allow
IP_VS_RT_MODE_RDR, so we should be safe with the ICMP
forwarding. Using cp->daddr as replacement for rt_dst
is safe for all modes except BYPASS, even when cp->dest is
NULL because it is cp->daddr that is used to assign cp->dest
for sync-ed connections.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipvs: Eliminate rt->rt_dst usage in __ip_vs_get_out_rt().
David S. Miller [Mon, 9 May 2011 21:38:06 +0000 (14:38 -0700)]
ipvs: Eliminate rt->rt_dst usage in __ip_vs_get_out_rt().

We can simply track what destination address is used based upon which
code block is taken at the top of the function.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipvs: Use IP_VS_RT_MODE_* instead of magic constants.
David S. Miller [Thu, 12 May 2011 22:22:34 +0000 (18:22 -0400)]
ipvs: Use IP_VS_RT_MODE_* instead of magic constants.

[ Add some cases I missed, from Julian Anastasov ]

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Allow ethtool to enable/disable loopback.
Mahesh Bandewar [Sun, 8 May 2011 06:51:48 +0000 (06:51 +0000)]
tg3: Allow ethtool to enable/disable loopback.

This patch adds tg3_set_features() to handle loopback mode. Currently the
capability is added for the devices which support internal MAC loopback mode.
So when enabled, it enables internal-MAC loopback.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/irda/ircomm_tty.c: Use flip buffers to deliver data
Amit Virdi [Thu, 12 May 2011 01:04:40 +0000 (01:04 +0000)]
net/irda/ircomm_tty.c: Use flip buffers to deliver data

Use tty_insert_flip_string() and tty_flip_buffer_push() to deliver incoming data
packets from the IrDA device instead of delivering the packets directly to the
line discipline. The latter approach resulted in the warning "Sleeping function
called from invalid context".

Signed-off-by: Amit Virdi <amit.virdi@st.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: group FCoE related feature flags
Yi Zou [Mon, 9 May 2011 11:53:27 +0000 (11:53 +0000)]
net: group FCoE related feature flags

Michał Mirosław's patch (http://patchwork.ozlabs.org/patch/94421/) fixes the
issue (http://patchwork.ozlabs.org/patch/94188/) about not populating FCoE related
flags correctly on vlan devices. However, only NETIF_F_FCOE_CRC is part of
NETIF_F_ALL_TX_OFFLOADS right now, whereas we need NETIF_F_FCOE_MTU and NETIF_F_FSO
as well.

Therefore, add NETIF_F_ALL_FCOE to group the feature flags used by FCoE TX offloads
(NETIF_F_FCOE_CRC, NETIF_F_FCOE_MTU, and NETIF_F_FSO) and make that group part of
NETIF_F_ALL_TX_OFFLOADS. This ensures that all flags needed by FCoE are populated
properly on vlan devices.
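
Illustration only, not part of the patch: a minimal userspace sketch of
grouping related flag bits under one mask and folding that group into a
larger mask. The EX_* names and bit positions are hypothetical, and the
masking in ex_inherited_features() is an assumed model of how a stacked
device picks up features, not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; the real values live in the kernel headers. */
#define EX_F_FCOE_CRC (1u << 0)
#define EX_F_FSO      (1u << 1)
#define EX_F_FCOE_MTU (1u << 2)
#define EX_F_SG       (1u << 3)  /* an unrelated TX offload, for contrast */
#define EX_F_RXCSUM   (1u << 4)  /* deliberately outside the TX offload mask */

/* Group the FCoE bits once, then make the group part of the TX offload mask. */
#define EX_F_ALL_FCOE        (EX_F_FCOE_CRC | EX_F_FCOE_MTU | EX_F_FSO)
#define EX_F_ALL_TX_OFFLOADS (EX_F_SG | EX_F_ALL_FCOE)

/* Assumed model: a stacked device inherits only bits inside the TX offload mask. */
static uint32_t ex_inherited_features(uint32_t lower_features)
{
    return lower_features & EX_F_ALL_TX_OFFLOADS;
}
```

With the group folded into the larger mask, a flag such as EX_F_FCOE_MTU
survives the masking step, while a bit outside the mask is dropped.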

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: Fix vlan_features propagation
Michał Mirosław [Fri, 6 May 2011 07:56:29 +0000 (07:56 +0000)]
net: Fix vlan_features propagation

Fix VLAN features propagation for devices which change vlan_features.
For this to work, the driver needs to make sure netdev_features_changed()
gets called after the change (as happens, e.g., after ndo_set_features()).

Side effect is that a user might request features that will never
be enabled on a VLAN device.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoethtool: bring back missing comma in netdev features strings
Franco Fichtner [Thu, 12 May 2011 06:42:04 +0000 (06:42 +0000)]
ethtool: bring back missing comma in netdev features strings

The issue was introduced in commit eed2a12f1ed9aabf.

Signed-off-by: Franco Fichtner <franco@lastsummer.de>
Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agogarp: remove last synchronize_rcu() call
Eric Dumazet [Thu, 12 May 2011 21:46:56 +0000 (17:46 -0400)]
garp: remove last synchronize_rcu() call

When removing last vlan from a device, garp_uninit_applicant() calls
synchronize_rcu() to make sure no user can still manipulate struct
garp_applicant before we free it.

Use call_rcu() instead, as a step to further net_device dismantle
optimizations.

Add the temporary garp_cleanup_module() function to make sure no pending
call_rcu() are left at module unload time [ this will be removed when
kfree_rcu() is available ]

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agovmxnet3: Use single tx queue when CONFIG_PCI_MSI not defined
Shreyas Bhatewara [Tue, 10 May 2011 06:13:56 +0000 (06:13 +0000)]
vmxnet3: Use single tx queue when CONFIG_PCI_MSI not defined

Resending this patch with few changes.

Avoid multiple queues when MSI or MSI-X not available

Limit number of Tx queues to 1 if MSI/MSI-X support is not configured in
the kernel. This will make number of tx and rx queues equal when MSI/X
is not configured thus providing better performance.

Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: sctp_sendmsg: Don't test known non-null sinfo
Joe Perches [Thu, 12 May 2011 09:19:10 +0000 (09:19 +0000)]
sctp: sctp_sendmsg: Don't test known non-null sinfo

It's already known non-null above.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: sctp_sendmsg: Don't initialize default_sinfo
Joe Perches [Thu, 12 May 2011 11:27:20 +0000 (11:27 +0000)]
sctp: sctp_sendmsg: Don't initialize default_sinfo

This variable only needs initialization when cmsgs.info
is NULL.

Use memset to ensure padding is also zeroed so
kernel doesn't leak any data.
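
Illustration only, not part of the patch: a minimal userspace sketch of
why memset() is used here. struct ex_info and its helper are hypothetical;
the assumption is that the ABI inserts padding between the two members,
which is common but not guaranteed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct: most ABIs insert padding between 'tag' and 'value'. */
struct ex_info {
    char tag;
    int  value;
};

/* Zero the whole object, padding included, before filling in the fields. */
static void ex_info_init(struct ex_info *info, char tag, int value)
{
    memset(info, 0, sizeof(*info));
    info->tag = tag;
    info->value = value;
}
```

Member-by-member assignment alone would leave the padding bytes with
whatever was previously on the stack, and those bytes would be exposed
if the whole struct were later copied out.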

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agol2tp: fix potential rcu race
Eric Dumazet [Wed, 11 May 2011 18:22:36 +0000 (18:22 +0000)]
l2tp: fix potential rcu race

While trying to remove useless synchronize_rcu() calls, I found that l2tp
does use two such calls incorrectly, and also bumps the tunnel
refcount after list insertion.

The tunnel refcount must be incremented before the tunnel is made publicly
visible to RCU readers.
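
Illustration only, not part of the patch: a minimal userspace sketch of
the ordering rule, using C11 atomics in place of RCU. struct ex_tunnel,
ex_published and both helpers are hypothetical; a single published
pointer stands in for the tunnel list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct ex_tunnel {
    atomic_int refcount;
    int id;
};

/* Stand-in for the RCU-protected list: one published pointer. */
static _Atomic(struct ex_tunnel *) ex_published;

static void ex_tunnel_publish(struct ex_tunnel *t, int id)
{
    t->id = id;
    /* Take the list's reference first ... */
    atomic_init(&t->refcount, 1);
    /* ... then publish. The release store orders the refcount before the pointer. */
    atomic_store_explicit(&ex_published, t, memory_order_release);
}

/* Reader side: anyone who can see the pointer also sees a non-zero refcount. */
static struct ex_tunnel *ex_tunnel_get(void)
{
    struct ex_tunnel *t =
        atomic_load_explicit(&ex_published, memory_order_acquire);

    if (t)
        atomic_fetch_add(&t->refcount, 1);
    return t;
}
```

Publishing first and bumping the count afterwards would open a window
in which a reader could find the object and drop what looks like the
last reference.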

This fix can be applied to 2.6.35+ and might need a backport for older
kernels, since things were shuffled in commit fd558d186df2c
(l2tp: Split pppol2tp patch into separate l2tp and ppp parts)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: James Chapman <jchapman@katalix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Fix to prevent flooding of TX queue
Padmanabh Ratnakar [Tue, 10 May 2011 05:13:57 +0000 (05:13 +0000)]
be2net: Fix to prevent flooding of TX queue

Starting and stopping of the TX queue is controlled by the TX queue "used"
counter. It is incremented when WRBs are posted to the TX queue and
decremented when TX completions are received. This counter was
being decremented before the HW was informed that the TX completions
had been processed. Once the counter was decremented, the transmit function
posted new WRBs and caused a completion-queue-full condition in the HW.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Use NTWK_RX_FILTER command for promiscuous mode
Padmanabh Ratnakar [Tue, 10 May 2011 05:13:26 +0000 (05:13 +0000)]
be2net: Use NTWK_RX_FILTER command for promiscuous mode

Use the OPCODE_COMMON_NTWK_RX_FILTER command for promiscuous mode, as the
OPCODE_ETH_PROMISCUOUS command is being deprecated.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: In case of UE, do not dump registers for Lancer
Padmanabh Ratnakar [Tue, 10 May 2011 05:13:01 +0000 (05:13 +0000)]
be2net: In case of UE, do not dump registers for Lancer

In case of UE, do not dump registers for Lancer as they are not
supported.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Disable coalesce water mark mode of CQ for Lancer
Padmanabh Ratnakar [Tue, 10 May 2011 05:12:37 +0000 (05:12 +0000)]
be2net: Disable coalesce water mark mode of CQ for Lancer

Disable coalesce water mark mode of CQ for Lancer

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Handle error completion in Lancer
Padmanabh Ratnakar [Tue, 10 May 2011 05:12:17 +0000 (05:12 +0000)]
be2net: Handle error completion in Lancer

In Lancer if a frame is DMAed partially due to lack of RX buffers,
an error completion is sent with packet size as zero and num_recvd
indicating number of used buffers. These buffers need to be freed
and packet dropped.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Wed, 11 May 2011 18:26:15 +0000 (14:26 -0400)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/benet/be_main.c

13 years agoMerge branch 'tipc-May10-2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg...
David S. Miller [Wed, 11 May 2011 16:41:28 +0000 (12:41 -0400)]
Merge branch 'tipc-May10-2011' of git://git./linux/kernel/git/paulg/net-next-2.6

13 years agoMerge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6
David S. Miller [Tue, 10 May 2011 22:04:35 +0000 (15:04 -0700)]
Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6

13 years agoslcan: fix ldisc->open retval
Oliver Hartkopp [Tue, 10 May 2011 20:12:30 +0000 (13:12 -0700)]
slcan: fix ldisc->open retval

TTY layer expects 0 if the ldisc->open operation succeeded.

Reported-by: Matvejchikov Ilya <matvejchikov@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/usb: mark LG VL600 LTE modem ethernet interface as WWAN
Dan Williams [Mon, 9 May 2011 07:43:20 +0000 (07:43 +0000)]
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN

Like other mobile broadband device ethernet interfaces, mark the LG
VL600 with the 'wwan' devtype so userspace knows it needs additional
configuration via the AT port before the interface can be used.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoxfrm: Don't allow esn with disabled anti replay detection
Steffen Klassert [Mon, 9 May 2011 19:43:05 +0000 (19:43 +0000)]
xfrm: Don't allow esn with disabled anti replay detection

Unlike the standard case, disabled anti replay detection needs some
nontrivial extra treatment on ESN. RFC 4303 states:

Note: If a receiver chooses to not enable anti-replay for an SA, then
the receiver SHOULD NOT negotiate ESN in an SA management protocol.
Use of ESN creates a need for the receiver to manage the anti-replay
window (in order to determine the correct value for the high-order
bits of the ESN, which are employed in the ICV computation), which is
generally contrary to the notion of disabling anti-replay for an SA.

So return an error if an ESN state with disabled anti replay detection
is inserted for now and add the extra treatment later if we need it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoxfrm: Assign the inner mode output function to the dst entry
Steffen Klassert [Mon, 9 May 2011 19:36:38 +0000 (19:36 +0000)]
xfrm: Assign the inner mode output function to the dst entry

As it is, we assign the outer mode's output function to the dst entry
when we create the xfrm bundle. This leads to two problems in interfamily
scenarios. We might insert ipv4 packets into ip6_fragment when called
from xfrm6_output. The system crashes if we try to fragment an ipv4
packet with ip6_fragment. This issue was introduced with git commit
ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
as needed). The second issue is that we might insert ipv4 packets into
netfilter6, and vice versa, in interfamily scenarios.

With this patch we assign the inner mode's output function to the dst entry
when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
mode is used and the right fragmentation and netfilter functions are called.
We then switch to the outer mode with the output_finish functions.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: dev_close() should check IFF_UP
Eric Dumazet [Tue, 10 May 2011 19:26:06 +0000 (12:26 -0700)]
net: dev_close() should check IFF_UP

Commit 443457242beb (factorize sync-rcu call in
unregister_netdevice_many) mistakenly removed one test from dev_close()

The following actions trigger a BUG:

modprobe bonding
modprobe dummy
ifconfig bond0 up
ifenslave bond0 dummy0
rmmod dummy

dev_close() must not close a non IFF_UP device.
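
Illustration only, not part of the patch: a minimal userspace sketch of
the guard described above. struct ex_dev, EX_IFF_UP and ex_dev_close()
are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>

#define EX_IFF_UP 0x1u  /* hypothetical flag bit standing in for IFF_UP */

struct ex_dev {
    unsigned int flags;
    int stop_calls;      /* counts how often the stop path actually ran */
};

/* Close the device only if it is up; closing a down device is a no-op. */
static int ex_dev_close(struct ex_dev *dev)
{
    if (!(dev->flags & EX_IFF_UP))
        return 0;

    dev->stop_calls++;            /* stands in for the real stop work */
    dev->flags &= ~EX_IFF_UP;
    return 0;
}
```

Calling ex_dev_close() twice in a row runs the stop path once, so a
second close is harmless.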

With help from Frank Blaschka and Einar EL Lueck

Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agovlan: fix GVRP at dismantle time
Eric Dumazet [Tue, 10 May 2011 19:22:54 +0000 (12:22 -0700)]
vlan: fix GVRP at dismantle time

ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link set eth2.103 up
rmmod tg3    # driver providing eth2

 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 PGD 11d251067 PUD 11b9e0067 PMD 0
 Oops: 0000 [#1] SMP
 last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
 CPU 0
 Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]

 Pid: 11494, comm: rmmod Tainted: G        W   2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
 RIP: 0010:[<ffffffffa0030c9e>]  [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 RSP: 0018:ffff88007a19bae8  EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
 RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
 RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
 R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
 R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
 CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
 Stack:
  ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
  ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
  ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
 Call Trace:
  [<ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
  [<ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
  [<ffffffff8137e427>] __dev_close_many+0x87/0xe0
  [<ffffffff8137e507>] dev_close_many+0x87/0x110
  [<ffffffff8137e630>] rollback_registered_many+0xa0/0x240
  [<ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
  [<ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
  [<ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
  [<ffffffff81479d03>] notifier_call_chain+0x53/0x80
  [<ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
  [<ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
  [<ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
  [<ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
  [<ffffffff8137e85f>] rollback_registered+0x2f/0x40
  [<ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
  [<ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
  [<ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]

We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
is called right after unregister_netdevice_queue(). In batch mode,
unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: fix two lockdep splats
Eric Dumazet [Tue, 10 May 2011 03:55:03 +0000 (20:55 -0700)]
net: fix two lockdep splats

Commit e67f88dd12f6 (net: dont hold rtnl mutex during netlink dump
callbacks) switched rtnl protection to RCU, but we forgot to adjust two
rcu_dereference() lockdep annotations :

inet_get_link_af_size() or inet_fill_link_af() might be called with
rcu_read_lock or rtnl held, so use rcu_dereference_rtnl()
instead of rtnl_dereference()

Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: xfrm: Eliminate ->rt_src reference in policy code.
David S. Miller [Mon, 9 May 2011 22:13:28 +0000 (15:13 -0700)]
ipv4: xfrm: Eliminate ->rt_src reference in policy code.

Rearrange xfrm4_dst_lookup() so that it works by calling a helper
function __xfrm_dst_lookup() that takes an explicit flow key storage
area as an argument.

Use this new helper in xfrm4_get_saddr() so we can fetch the selected
source address from the flow instead of from rt->rt_src

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoinfiniband: Remove rt->rt_src usage in addr4_resolve()
David S. Miller [Mon, 9 May 2011 21:52:02 +0000 (14:52 -0700)]
infiniband: Remove rt->rt_src usage in addr4_resolve()

Use an explicit flow key and fetch it from there.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: Remove rt->rt_src usage in sctp_v4_get_saddr()
David S. Miller [Mon, 9 May 2011 21:49:13 +0000 (14:49 -0700)]
sctp: Remove rt->rt_src usage in sctp_v4_get_saddr()

Flow key is available, so fetch it from there.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: udp: Eliminate remaining uses of rt->rt_src
David S. Miller [Mon, 9 May 2011 20:31:04 +0000 (13:31 -0700)]
ipv4: udp: Eliminate remaining uses of rt->rt_src

We already track and pass around the correct flow key,
so simply use it in udp_send_skb().

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: icmp: Eliminate remaining uses of rt->rt_src
David S. Miller [Mon, 9 May 2011 20:28:22 +0000 (13:28 -0700)]
ipv4: icmp: Eliminate remaining uses of rt->rt_src

On input packets, rt->rt_src always equals ip_hdr(skb)->saddr

Anything that mangles or otherwise changes the IP header must
relookup the route found at skb_rtable().  Therefore this
invariant must always hold true.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Pass explicit daddr arg to ip_send_reply().
David S. Miller [Mon, 9 May 2011 20:22:43 +0000 (13:22 -0700)]
ipv4: Pass explicit daddr arg to ip_send_reply().

This eliminates an access to rt->rt_src.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotipc: Revise timings used when sending link request messages
Allan Stephens [Fri, 22 Apr 2011 01:34:03 +0000 (20:34 -0500)]
tipc: Revise timings used when sending link request messages

Revises the algorithm governing the sending of link request messages
to take into account the number of nodes each bearer is currently in
contact with, and to ensure more rapid rediscovery of neighboring nodes
if a bearer fails and then recovers.

The discovery object now sends requests at least once a second if it
is not in contact with any other nodes, and at least once a minute if
it has at least one neighbor; if contact with the only neighbor is
lost, the object immediately reverts to its initial rapid-fire search
timing to accelerate the rediscovery process.

In addition, the discovery object now stops issuing link request
messages if it is in contact with the only neighboring node it is
configured to communicate with, since further searching is unnecessary.
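
Illustration only, not part of the patch: a simplified sketch of the
timing rules described above. All names are hypothetical, and the
function returns only the upper bounds quoted in the text; the real
code may vary the interval within those bounds.

```c
#include <assert.h>

#define EX_FAST_MS 1000u   /* no neighbours: at least once a second */
#define EX_SLOW_MS 60000u  /* at least one neighbour: at least once a minute */
#define EX_STOPPED 0u      /* no further requests are scheduled */

/*
 * Upper bound on the delay before the next link request.
 * single_node_domain is non-zero when the bearer is configured to talk
 * to exactly one neighbouring node.
 */
static unsigned int ex_next_request_ms(unsigned int num_nodes,
                                       int single_node_domain)
{
    if (num_nodes == 0)
        return EX_FAST_MS;   /* (re)discovery: search rapidly */
    if (single_node_domain)
        return EX_STOPPED;   /* the only possible neighbour is already found */
    return EX_SLOW_MS;
}
```

Losing the last neighbour takes num_nodes back to zero, so the sketch
falls back to the fast interval without extra state.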

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Add monitoring of number of nodes discovered by bearer
Allan Stephens [Fri, 22 Apr 2011 00:05:25 +0000 (19:05 -0500)]
tipc: Add monitoring of number of nodes discovered by bearer

Augments TIPC's discovery object to track the number of neighboring nodes
having an active link to the associated bearer.

This means tipc_disc_update_link_req() becomes either one of:

       tipc_disc_add_dest()
or:
       tipc_disc_remove_dest()

depending on the code flow direction of things.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Enhance sending of discovery object link request messages
Allan Stephens [Thu, 21 Apr 2011 21:28:02 +0000 (16:28 -0500)]
tipc: Enhance sending of discovery object link request messages

Augments TIPC's discovery object to send its initial neighbor discovery
request message as soon as the associated bearer is created, rather than
waiting for its first periodic timeout to occur, thereby speeding up the
discovery process. Also adds a check to suppress the initial request or
subsequent requests if the bearer is blocked at the time the request is
scheduled for transmission.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Enhance handling of discovery object creation failures
Allan Stephens [Thu, 21 Apr 2011 18:58:26 +0000 (13:58 -0500)]
tipc: Enhance handling of discovery object creation failures

Modifies bearer creation and deletion code to improve handling of
scenarios when a neighbor discovery object cannot be created. The
creation routine now aborts the creation of a bearer if its discovery
object cannot be created, and deletes the newly created bearer, rather
than failing quietly and leaving an unusable bearer hanging around.

Since the exit via the goto label really isn't a definitive failure
in all cases, relabel it appropriately.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Introduce routine to enqueue a chain of messages on link tx queue
Allan Stephens [Thu, 21 Apr 2011 15:50:42 +0000 (11:50 -0400)]
tipc: Introduce routine to enqueue a chain of messages on link tx queue

Create a helper routine to enqueue a chain of sk_buffs to a link's
transmit queue.  It improves readability and the new function is
anticipated to be used more than just once in the future as well.
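
Illustration only, not part of the patch: a minimal userspace sketch of
appending a chain of buffers to a transmit queue. struct ex_buf,
struct ex_queue and ex_queue_add_chain() are hypothetical and much
simpler than sk_buffs and the TIPC link structure.

```c
#include <assert.h>
#include <stddef.h>

struct ex_buf {
    struct ex_buf *next;
};

struct ex_queue {
    struct ex_buf *first;
    struct ex_buf *last;
    unsigned int len;
};

/* Append every buffer of a NULL-terminated chain to the tail of the queue. */
static void ex_queue_add_chain(struct ex_queue *q, struct ex_buf *chain)
{
    while (chain) {
        struct ex_buf *next = chain->next;

        chain->next = NULL;
        if (q->last)
            q->last->next = chain;
        else
            q->first = chain;
        q->last = chain;
        q->len++;
        chain = next;
    }
}
```

Keeping the tail pointer and the length inside one helper means callers
cannot update one without the other.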

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Avoid recomputation of outgoing message length
Allan Stephens [Thu, 21 Apr 2011 15:42:07 +0000 (10:42 -0500)]
tipc: Avoid recomputation of outgoing message length

Rework TIPC's message sending routines to take advantage of the total
amount of data value passed to it by the kernel socket infrastructure.
This change eliminates the need for TIPC to compute the size of outgoing
messages itself, as well as the check for an oversize message in
tipc_msg_build().  In addition, this change warrants an explanation:

   -     res = send_packet(NULL, sock, &my_msg, 0);
   +     res = send_packet(NULL, sock, &my_msg, bytes_to_send);

Previously, the final argument to send_packet() was ignored (since the
amount of data being sent was recalculated by a lower-level routine)
and we could just pass in a dummy value (0). Now that the
recalculation is being eliminated, the argument value being passed to
send_packet() is significant and we have to supply the actual amount
of data we want to send.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Abort excessive send requests as early as possible
Allan Stephens [Tue, 20 Apr 2010 21:58:24 +0000 (17:58 -0400)]
tipc: Abort excessive send requests as early as possible

Adds checks to TIPC's socket send routines to promptly detect and
abort attempts to send more than 66,000 bytes in a single TIPC
message or more than 2**31-1 bytes in a single TIPC byte stream request.
In addition, this ensures that the number of iovecs in a send request
does not exceed the limits of a standard integer variable.
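
Illustration only, not part of the patch: a minimal userspace sketch of
the two length limits quoted above. The names and the choice of
-EMSGSIZE as the error value are assumptions; the iovec count check is
not shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_MSG_SIZE 66000u  /* per-message limit quoted in the text */

/*
 * Reject an oversized request before any work is done.
 * is_stream selects the byte-stream limit (2**31-1) over the message limit.
 */
static int ex_check_send_len(size_t total_len, int is_stream)
{
    if (is_stream)
        return total_len > (size_t)INT32_MAX ? -EMSGSIZE : 0;
    return total_len > EX_MAX_MSG_SIZE ? -EMSGSIZE : 0;
}
```

Checking up front lets the caller fail before any buffer is allocated
or copied.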

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Strengthen checks for neighboring node discovery
Allan Stephens [Wed, 20 Apr 2011 21:24:07 +0000 (16:24 -0500)]
tipc: Strengthen checks for neighboring node discovery

Enhances existing checks on the discovery domain associated with a TIPC
bearer. A bearer can no longer be configured to accept links from itself
only (which would be pointless), or to nodes outside its own cluster
(since multi-cluster support has now been removed from TIPC). Also, the
neighbor discovery routine now validates link setup requests against the
configured discovery domain for the bearer, rather than simply ensuring
the requesting node belongs to the node's own cluster.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: make zone/cluster mask constants a define
Paul Gortmaker [Tue, 19 Apr 2011 17:11:23 +0000 (13:11 -0400)]
tipc: make zone/cluster mask constants a define

This allows them to be available for easy re-use in other places
and avoids trivial mistakes caused by "count the f's and 0's".
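
Illustration only, not part of the patch: a minimal userspace sketch of
named masks for a <zone.cluster.node> address. The EX_* names and
helpers are hypothetical, and the 8/12/12-bit layout is stated here as
an assumption about the address format.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 8-bit zone, 12-bit cluster, 12-bit node. */
#define EX_ZONE_MASK    0xff000000u
#define EX_CLUSTER_MASK 0xfffff000u  /* zone and cluster bits together */

/* Build an address from its three components. */
static uint32_t ex_addr(uint32_t zone, uint32_t cluster, uint32_t node)
{
    return (zone << 24) | (cluster << 12) | node;
}

/* True when both addresses share the same zone and cluster. */
static int ex_in_own_cluster(uint32_t own, uint32_t addr)
{
    return ((own ^ addr) & EX_CLUSTER_MASK) == 0;
}
```

A named mask is written once; every comparison that uses it then gets
the same number of f's and 0's.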

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Fix sk_buff leaks when link congestion is detected
Allan Stephens [Tue, 19 Apr 2011 14:17:58 +0000 (10:17 -0400)]
tipc: Fix sk_buff leaks when link congestion is detected

Modifies a TIPC send routine that did not discard the outgoing sk_buff
if it was not transmitted because of link congestion; this eliminates
the potential for buffer leakage in the many callers who did not clean up
the unsent buffer. (The two routines that previously did discard the unsent
buffer have been updated to eliminate their now-redundant clean up.)
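
Illustration only, not part of the patch: a minimal userspace sketch of
the ownership rule, in which the send routine consumes the buffer on
every path. struct ex_link, ex_send_buf() and the -1 return value are
hypothetical; free() stands in for releasing an sk_buff.

```c
#include <assert.h>
#include <stdlib.h>

struct ex_link {
    int congested;
    unsigned int sent;
};

static unsigned int ex_discarded;  /* buffers dropped because of congestion */

/* Takes ownership of buf in every case; callers never free it afterwards. */
static int ex_send_buf(struct ex_link *link, void *buf)
{
    if (link->congested) {
        free(buf);          /* discard here so no caller can leak it */
        ex_discarded++;
        return -1;          /* hypothetical "not sent" indication */
    }

    link->sent++;
    free(buf);              /* stands in for handing the buffer to the device */
    return 0;
}
```

With a single owner on every path, callers need no cleanup code, and a
caller that frees the buffer itself would introduce a double free.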

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Update destination node field on incoming multicast messages
Allan Stephens [Mon, 18 Apr 2011 14:14:26 +0000 (10:14 -0400)]
tipc: Update destination node field on incoming multicast messages

Sets the destination node field of an incoming multicast message
to the receiving node's network address before handing off the message
to each receiving port. This ensures that, in the event the destination
port returns the message to the sender, the sender can identify which
node the destination port belonged to.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Fix problem with bundled multicast message
Allan Stephens [Mon, 18 Apr 2011 14:08:22 +0000 (10:08 -0400)]
tipc: Fix problem with bundled multicast message

Set the destination node and destination port fields of an outgoing
multicast message header to zero; this is necessary to ensure that
the receiving node can route the message properly if it was packed
into a bundle due to link congestion. (Previously, there was a chance
that the receiving node would send the unbundled message to a random
node & port, rather than processing the message itself.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>