platform/kernel/kernel-mfld-blackbay.git
13 years ago tipc: Convert node object array to a hash table
Allan Stephens [Fri, 25 Feb 2011 23:42:52 +0000 (18:42 -0500)]
tipc: Convert node object array to a hash table

Replaces the dynamically allocated array of pointers to the cluster's
node objects with a static hash table. Hash collisions are resolved
using chaining, with a typical hash chain having only a single node,
to avoid degrading performance during processing of incoming packets.
The conversion to a hash table reduces the memory requirements for
TIPC's node table to approximately the same size it had prior to
the previous commit.

In addition to the hash table itself, TIPC now also maintains a
linked list for the node objects, sorted by ascending network address.
This list allows TIPC to continue sending responses to user space
applications that request node and link information in sorted order.
The list also improves performance when name table update messages are
sent by making it easier to identify the nodes that must be notified.
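
As a rough illustration of the layout described above, the following user-space sketch keeps each node on two links at once: a hash chain for fast lookup and a list sorted by ascending address. The table size, the hash function, and all identifiers here are assumptions made for the example, not the names used in the TIPC sources.

```c
#include <stdint.h>
#include <stdlib.h>

#define NODE_HTABLE_SIZE 512u           /* assumed size; must be a power of two */

struct node {
    uint32_t addr;                      /* network address of the node */
    struct node *hash_next;             /* next node in the same hash chain */
    struct node *list_next;             /* next node in ascending address order */
};

static struct node *node_htable[NODE_HTABLE_SIZE]; /* static table of chain heads */
static struct node *node_list;                     /* head of the sorted list */

/* Assumed hash function: the low-order bits of the address. */
static unsigned int node_hash(uint32_t addr)
{
    return addr & (NODE_HTABLE_SIZE - 1);
}

/* Walk a single hash chain; chains are expected to be very short. */
static struct node *node_find(uint32_t addr)
{
    struct node *n;

    for (n = node_htable[node_hash(addr)]; n; n = n->hash_next)
        if (n->addr == addr)
            return n;
    return NULL;
}

/* Create a node (or return the existing one) and link it into both structures. */
static struct node *node_create(uint32_t addr)
{
    struct node *n = node_find(addr);
    struct node **pp;

    if (n)
        return n;
    n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    n->addr = addr;

    /* Collisions are resolved by chaining: push onto the head of the chain. */
    n->hash_next = node_htable[node_hash(addr)];
    node_htable[node_hash(addr)] = n;

    /* Keep the secondary list sorted by ascending network address. */
    for (pp = &node_list; *pp && (*pp)->addr < addr; pp = &(*pp)->list_next)
        ;
    n->list_next = *pp;
    *pp = n;
    return n;
}
```

Lookups on the packet receive path touch only the hash chain, while code that needs sorted output walks node_list from its head.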

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Eliminate configuration for maximum number of cluster nodes
Allan Stephens [Fri, 25 Feb 2011 19:22:11 +0000 (14:22 -0500)]
tipc: Eliminate configuration for maximum number of cluster nodes

Gets rid of the need for users to specify the maximum number of
cluster nodes supported by TIPC. TIPC now automatically provides
support for all 4K nodes allowed by its addressing scheme.

Note: This change sets TIPC's memory usage to the amount used by
a maximum size node table with 4K entries.  An upcoming patch that
converts the node table from a linear array to a hash table will
compact the node table to a more efficient design, but for clarity
it is nice to have all the Kconfig infrastructure go away separately.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Split up unified structure of network-related variables
Allan Stephens [Fri, 25 Feb 2011 15:01:58 +0000 (10:01 -0500)]
tipc: Split up unified structure of network-related variables

Converts the fields of the global "tipc_net" structure into individual
variables.  Since the struct was never referenced as a complete unit,
its existence was pointless.  This will facilitate upcoming changes to
TIPC's node table and simplify upcoming relocation of the variables so
they are only visible to the files that actually use them.

This change is essentially cosmetic in nature, and doesn't affect the
operation of TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Fix problem with missing link in "tipc-config -l" output
Allan Stephens [Thu, 24 Feb 2011 18:20:20 +0000 (13:20 -0500)]
tipc: Fix problem with missing link in "tipc-config -l" output

Removes a race condition that could cause TIPC's internal counter
of the number of links it has to neighboring nodes to have the
incorrect value if two independent threads of control simultaneously
create new link endpoints connecting to two different nodes using two
different bearers. Such undercounting would result in TIPC failing to
list the final link(s) in its response to a configuration request to
list all of the node's links. The counter is now updated atomically
to ensure that simultaneous increments do not interfere with each
other.
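
The idea can be sketched in user space with C11 atomics. The kernel has its own atomic counter type and helpers, so this is only an analogue under that assumption, and the identifiers are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical counter of links to neighboring nodes. */
static atomic_int link_count;

/* Each update is a single atomic read-modify-write, so two threads that
 * create links at the same moment cannot lose one of the increments. */
static void link_created(void)
{
    atomic_fetch_add(&link_count, 1);
}

static void link_deleted(void)
{
    atomic_fetch_sub(&link_count, 1);
}

static int links_total(void)
{
    return atomic_load(&link_count);
}
```

A plain "count++" is a separate load, add, and store, which is what allows two simultaneous increments to collapse into one.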

Thanks go to Peter Butler <pbutler@pt.com> for his assistance in
diagnosing and fixing this problem.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Add support for SO_RCVTIMEO socket option
Allan Stephens [Wed, 23 Feb 2011 19:52:14 +0000 (14:52 -0500)]
tipc: Add support for SO_RCVTIMEO socket option

Adds support for the SO_RCVTIMEO socket option to TIPC's socket
receive routines.
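
From an application's point of view the option is used as on any other socket. The sketch below uses only standard socket calls; the helper names are made up for the example, and nothing in it is specific to TIPC.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Ask the kernel to give up on a blocking receive after the given time. */
static int set_recv_timeout(int fd, long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Returns 1 if the receive timed out, 0 if it completed, -1 on another error. */
static int recv_timed_out(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);

    if (n >= 0)
        return 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```

A receive that expires fails with EAGAIN (or EWOULDBLOCK), which lets the caller tell a timeout apart from a real error.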

Thanks go out to Raj Hegde <rajenhegde@yahoo.ca> for his contribution
to the development and testing of this enhancement.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Cosmetic changes to node subscription code
Allan Stephens [Wed, 23 Feb 2011 19:13:41 +0000 (14:13 -0500)]
tipc: Cosmetic changes to node subscription code

Relocates the code that notifies users of node subscriptions so that
it is adjacent to the rest of the routines that implement TIPC's node
subscription capability. Renames the name table routine that is
invoked by a node subscription to better reflect its purpose and to
be consistent with other, similar name table routines.

These changes are cosmetic in nature, and do not alter the behavior
of TIPC.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Prevent null pointer error when removing a node subscription
Allan Stephens [Wed, 23 Feb 2011 18:51:15 +0000 (13:51 -0500)]
tipc: Prevent null pointer error when removing a node subscription

Prevents a null pointer dereference from occurring if a node subscription
is triggered at the same time that the subscribing port or publication is
terminating the subscription. The problem arises if the triggering routine
asynchronously activates and deregisters the node subscription while
deregistration is already underway -- the deregistration routine may find
that the pointer it has just verified to be non-NULL is now NULL.
To avoid this race condition the triggering routine now simply marks the
node subscription as defunct (to prevent it from re-activating)
instead of deregistering it. The subscription is now both deregistered
and destroyed only when the subscribing port or publication code terminates
the node subscription.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Add network address mask helper routines
Allan Stephens [Wed, 23 Feb 2011 16:44:49 +0000 (11:44 -0500)]
tipc: Add network address mask helper routines

Introduces a pair of helper routines that convert the network address
for a TIPC node into the network address for its cluster or zone.

This is a cosmetic change designed to avoid future errors caused by
the incorrect use of address bitmasks, and does not alter the existing
operation of TIPC.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Correct broadcast link peer info when displaying links
Allan Stephens [Mon, 21 Feb 2011 14:45:31 +0000 (09:45 -0500)]
tipc: Correct broadcast link peer info when displaying links

Fixes a typo in the calculation of the network address of a node's own
cluster when generating a response to the configuration command that
lists all of the node's links. The correct mask value for a <Z.C.N>
network address uses 1's for the 8-bit zone and 12-bit cluster parts
and 0's for the 12-bit node part.
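
For illustration, the <Z.C.N> layout described above (8-bit zone, 12-bit cluster, 12-bit node) gives the masks below. The macro and function names are chosen for this sketch and are not necessarily those used in the TIPC sources.

```c
#include <stdint.h>

#define ZONE_MASK    0xff000000u    /* 1's for the 8-bit zone only */
#define CLUSTER_MASK 0xfffff000u    /* 1's for the zone and the 12-bit cluster */

/* Build a <Z.C.N> network address from its three parts. */
static uint32_t make_addr(unsigned int z, unsigned int c, unsigned int n)
{
    return ((uint32_t)(z & 0xffu) << 24) |
           ((uint32_t)(c & 0xfffu) << 12) |
           (uint32_t)(n & 0xfffu);
}

/* Address of the cluster that a node belongs to: <Z.C.0> */
static uint32_t cluster_addr(uint32_t addr)
{
    return addr & CLUSTER_MASK;
}

/* Address of the zone that a node belongs to: <Z.0.0> */
static uint32_t zone_addr(uint32_t addr)
{
    return addr & ZONE_MASK;
}
```

For node <1.1.10> this yields <1.1.0> as the cluster address and <1.0.0> as the zone address.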

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago tipc: Allow receiving into iovec containing multiple entries
Allan Stephens [Mon, 21 Feb 2011 14:45:40 +0000 (09:45 -0500)]
tipc: Allow receiving into iovec containing multiple entries

Enhances TIPC's socket receive routines to support iovec structures
containing more than a single entry. This change leverages existing
sk_buff routines to do most of the work; the only significant change
to TIPC itself is that an sk_buff now records how much data has been
already consumed as a numeric offset, rather than as a pointer to
the first unread data byte.
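
A user-space sketch of the bookkeeping: the buffer records how many bytes have been consumed, and each receive call scatters the unread remainder across the iovec entries. The structure and function names are hypothetical; the kernel relies on its existing sk_buff routines for the actual copy.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for a received message buffer. */
struct rx_buf {
    const char *data;       /* start of the message data */
    size_t len;             /* total length of the data */
    size_t consumed;        /* numeric offset of the first unread byte */
};

/* Copy unread data into the iovec entries in order; returns bytes copied. */
static size_t rx_copy_to_iovec(struct rx_buf *b, const struct iovec *iov,
                               int iovcnt)
{
    size_t total = 0;
    int i;

    for (i = 0; i < iovcnt && b->consumed < b->len; i++) {
        size_t left = b->len - b->consumed;
        size_t n = iov[i].iov_len < left ? iov[i].iov_len : left;

        memcpy(iov[i].iov_base, b->data + b->consumed, n);
        b->consumed += n;   /* advance the offset, not a data pointer */
        total += n;
    }
    return total;
}
```

Because progress is kept as an offset, a later call simply resumes from b->consumed, whatever the shape of the next iovec.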

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years ago decnet: Convert to use flowidn where applicable.
David S. Miller [Sat, 12 Mar 2011 22:17:10 +0000 (17:17 -0500)]
decnet: Convert to use flowidn where applicable.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Put fl6_* macros to struct flowi6 and use them again.
David S. Miller [Sat, 12 Mar 2011 21:36:19 +0000 (16:36 -0500)]
net: Put fl6_* macros to struct flowi6 and use them again.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv6: Convert to use flowi6 where applicable.
David S. Miller [Sat, 12 Mar 2011 21:22:43 +0000 (16:22 -0500)]
ipv6: Convert to use flowi6 where applicable.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Put fl4_* macros to struct flowi4 and use them again.
David S. Miller [Sat, 12 Mar 2011 08:00:33 +0000 (03:00 -0500)]
net: Put fl4_* macros to struct flowi4 and use them again.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Kill fib_semantic_match declaration from fib_lookup.h
David S. Miller [Sat, 12 Mar 2011 07:44:16 +0000 (02:44 -0500)]
ipv4: Kill fib_semantic_match declaration from fib_lookup.h

This function no longer exists.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Use flowi4 and flowi6 in xfrm layer.
David S. Miller [Sat, 12 Mar 2011 07:42:11 +0000 (02:42 -0500)]
net: Use flowi4 and flowi6 in xfrm layer.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Add flowi6_* member helper macros.
David S. Miller [Sat, 12 Mar 2011 07:30:50 +0000 (02:30 -0500)]
net: Add flowi6_* member helper macros.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago netfilter: Use flowi4 and flowi6 in xt_TCPMSS
David S. Miller [Sat, 12 Mar 2011 07:16:48 +0000 (02:16 -0500)]
netfilter: Use flowi4 and flowi6 in xt_TCPMSS

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago netfilter: Use flowi4 and flowi6 in nf_conntrack_h323_main
David S. Miller [Sat, 12 Mar 2011 07:14:05 +0000 (02:14 -0500)]
netfilter: Use flowi4 and flowi6 in nf_conntrack_h323_main

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Use flowi4 in UDP
David S. Miller [Sat, 12 Mar 2011 07:09:18 +0000 (02:09 -0500)]
ipv4: Use flowi4 in UDP

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago netfilter: Use flowi4 in nf_nat_standalone.c
David S. Miller [Sat, 12 Mar 2011 07:06:33 +0000 (02:06 -0500)]
netfilter: Use flowi4 in nf_nat_standalone.c

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Use flowi4 in ipmr code.
David S. Miller [Sat, 12 Mar 2011 07:04:50 +0000 (02:04 -0500)]
ipv4: Use flowi4 in ipmr code.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Use flowi4 in FIB layer.
David S. Miller [Sat, 12 Mar 2011 07:02:42 +0000 (02:02 -0500)]
ipv4: Use flowi4 in FIB layer.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Use flowi4 in public route lookup interfaces.
David S. Miller [Sat, 12 Mar 2011 06:12:47 +0000 (01:12 -0500)]
ipv4: Use flowi4 in public route lookup interfaces.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Use struct flowi4 internally in routing lookups.
David S. Miller [Sat, 12 Mar 2011 01:07:33 +0000 (20:07 -0500)]
ipv4: Use struct flowi4 internally in routing lookups.

We will change the externally visible APIs next.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Pass ipv4 flow objects into fib_lookup() paths.
David S. Miller [Sat, 12 Mar 2011 00:54:08 +0000 (19:54 -0500)]
ipv4: Pass ipv4 flow objects into fib_lookup() paths.

To start doing these conversions, we need to add some temporary
flow4_* macros which will eventually go away when all the protocol
code paths are changed to work on AF specific flowi objects.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Add flowiX_to_flowi() shorthands.
David S. Miller [Sat, 12 Mar 2011 00:23:02 +0000 (19:23 -0500)]
net: Add flowiX_to_flowi() shorthands.

This is just a shorthand which will help in passing around AF
specific flow structures as generic ones.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Break struct flowi out into AF specific instances.
David S. Miller [Sat, 12 Mar 2011 05:44:35 +0000 (00:44 -0500)]
net: Break struct flowi out into AF specific instances.

Now we have struct flowi4, flowi6, and flowidn for each address
family.  And struct flowi is just a union of them all.
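
The shape of the result can be sketched as follows. The member lists are heavily trimmed and the field names are placeholders, so this only illustrates the "common part first, union of AF specific structs" pattern, not the real definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* AF independent members, placed first in every variant. */
struct flowi_common {
    int     flowic_oif;
    uint8_t flowic_proto;
};

/* IPv4 specific flow key (fields trimmed for the sketch). */
struct flowi4 {
    struct flowi_common common;     /* must stay the first member */
    uint32_t daddr;
    uint32_t saddr;
};

/* IPv6 specific flow key (fields trimmed for the sketch). */
struct flowi6 {
    struct flowi_common common;     /* must stay the first member */
    uint8_t daddr[16];
    uint8_t saddr[16];
};

/* The generic flow is a union of all AF specific variants. */
struct flowi {
    union {
        struct flowi_common common;
        struct flowi4       ip4;
        struct flowi6       ip6;
    } u;
};

/* Shorthand for handing an AF specific flow to code that takes a generic
 * one. Only valid when fl4 really is the ip4 member of a struct flowi. */
static struct flowi *flowi4_to_flowi(struct flowi4 *fl4)
{
    return (struct flowi *)((char *)fl4 - offsetof(struct flowi, u.ip4));
}
```

Since every variant begins with the same common structure, generic code can read the AF independent members without knowing which address family the flow belongs to.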

It might have been troublesome to convert flow_cache_uli_match() but
as it turns out this function is completely unused and therefore can
be simply removed.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Make flowi ports AF dependent.
David S. Miller [Sat, 12 Mar 2011 05:43:55 +0000 (00:43 -0500)]
net: Make flowi ports AF dependent.

Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*

This will let us create AF optimal flow instances.

It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Create union flowi_uli
David S. Miller [Fri, 11 Mar 2011 23:36:42 +0000 (18:36 -0500)]
net: Create union flowi_uli

This will be used when we have separate flowi types.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Create struct flowi_common
David S. Miller [Fri, 11 Mar 2011 23:22:00 +0000 (18:22 -0500)]
net: Create struct flowi_common

Pull out the AF independent members of struct flowi into a
new struct flowi_common

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Put flowi_* prefix on AF independent members of struct flowi
David S. Miller [Sat, 12 Mar 2011 05:29:39 +0000 (00:29 -0500)]
net: Put flowi_* prefix on AF independent members of struct flowi

I intend to turn struct flowi into a union of AF specific flowi
structs.  There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago xfrm: Eliminate "fl" and "pol" args to xfrm_bundle_ok().
David S. Miller [Fri, 11 Mar 2011 20:59:31 +0000 (15:59 -0500)]
xfrm: Eliminate "fl" and "pol" args to xfrm_bundle_ok().

There is only one caller of xfrm_bundle_ok(), and that always passes these
parameters as NULL.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago net: Remove unnecessary padding in struct flowi
David S. Miller [Fri, 11 Mar 2011 20:55:37 +0000 (15:55 -0500)]
net: Remove unnecessary padding in struct flowi

Move tos, scope, proto, and flags to the beginning of
the structure.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago ipv4: Create and use route lookup helpers.
David S. Miller [Sat, 12 Mar 2011 05:00:52 +0000 (00:00 -0500)]
ipv4: Create and use route lookup helpers.

The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Sat, 12 Mar 2011 22:41:02 +0000 (14:41 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6

13 years ago Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Sat, 12 Mar 2011 19:06:59 +0000 (11:06 -0800)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6

13 years ago ixgbe: DCB, PFC not cleared until reset occurs
John Fastabend [Thu, 10 Mar 2011 12:06:12 +0000 (12:06 +0000)]
ixgbe: DCB, PFC not cleared until reset occurs

The PFC configuration is not cleared until the device is reset. This
has not been a problem because setting DCB attributes forced a
hardware reset. Now that we no longer require this reset to occur,
PFC remains configured even after being disabled, until the
device is reset.

This removes a goto in the PFC hardware set routines for 82598 and
82599 devices that was short circuiting the clear.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: add support for VF Transmit rate limit using iproute2
Lior Levy [Fri, 11 Mar 2011 02:03:07 +0000 (02:03 +0000)]
ixgbe: add support for VF Transmit rate limit using iproute2

Implemented ixgbe_ndo_set_vf_bw function which is being used by iproute2
tool. In addition, updated ixgbe_ndo_get_vf_config function to show the
actual rate limit to the user.

The rate limitation can be configured only when the link is up and the
link speed is 10Gb.
The rate limit value can be 0 or range from 11 to the actual link
speed, measured in Mbps. A value of '0' disables the rate limit for
this specific VF.
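
The constraints above can be written down as a small check. This is a sketch of the rules as stated in this message, with a made-up function name and signature; it is not the driver's actual code.

```c
#include <stdbool.h>

#define LINK_SPEED_10G_MBPS 10000

/* Returns true when a requested VF rate limit (in Mbps) is acceptable. */
static bool vf_tx_rate_valid(bool link_up, int link_speed_mbps, int tx_rate)
{
    /* The limit can be configured only with the link up at 10Gb. */
    if (!link_up || link_speed_mbps != LINK_SPEED_10G_MBPS)
        return false;

    /* 0 disables rate limiting for the VF. */
    if (tx_rate == 0)
        return true;

    /* Otherwise the rate must lie between 11 and the link speed. */
    return tx_rate >= 11 && tx_rate <= link_speed_mbps;
}
```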

iproute2 usage will be 'ip link set ethX vf Y rate Z'.
After the command is issued, the rate will be changed instantly.
To view the current rate limit, use 'ip link show ethX'.

The rates will be zeroed only upon driver reload or a link speed change.

This feature is being supported by 82599 and X540 devices.

Signed-off-by: Lior Levy <lior.levy@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB, set minimum bandwidth per traffic class
John Fastabend [Wed, 9 Mar 2011 04:46:16 +0000 (04:46 +0000)]
ixgbe: DCB, set minimum bandwidth per traffic class

DCB provides a guaranteed bandwidth per traffic class. When a
traffic class is given 0% bandwidth, no bandwidth is guaranteed.
However, the traffic class should still be able to transmit traffic.
For this to work the traffic class must be given the
minimum credits required to send a frame.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: correct typo in define name
Emil Tantilov [Sat, 5 Mar 2011 08:02:18 +0000 (08:02 +0000)]
ixgbe: correct typo in define name

VF Free Running Timer register name missing an F.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: update PHY code to support 100Mbps as well as 1G/10G
Emil Tantilov [Sat, 5 Mar 2011 01:28:07 +0000 (01:28 +0000)]
ixgbe: update PHY code to support 100Mbps as well as 1G/10G

This change updates the PHY setup code to support 100Mbps capable PHYs
as well as 10G and 1Gbps.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: remove timer reset to 0 on timeout
Emil Tantilov [Fri, 4 Mar 2011 03:20:59 +0000 (03:20 +0000)]
ixgbe: remove timer reset to 0 on timeout

The VF mailbox polling for acks and messages would reset the timer to zero
on a timeout. Under heavy load a timeout may actually occur without being
the result of an error and when this occurs it is not practical to perform
a full VF driver reset on every message timeout. Instead, just return an
error (which is already done) and the VF driver will have an opportunity
to retry the operation.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB during ifup use correct CEE or IEEE mode
John Fastabend [Wed, 23 Feb 2011 05:58:25 +0000 (05:58 +0000)]
ixgbe: DCB during ifup use correct CEE or IEEE mode

DCB settings are cleared in the hardware across link events, so
during ifup ixgbe reprograms the hardware for DCB if it is
enabled. Now that we have two modes, CEE and IEEE, we need to
use the correct set of configuration data.

This patch checks the dcbx_cap bits and then enables the
device in the correct mode.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: IEEE 802.1Qaz, implement priority assignment table
John Fastabend [Wed, 23 Feb 2011 05:58:19 +0000 (05:58 +0000)]
ixgbe: IEEE 802.1Qaz, implement priority assignment table

This patch adds support to use the priority assignment
table in the ieee_ets structure to map priorities to
traffic classes. Previously ixgbe only supported a
1:1 mapping. Now we can enable and disable hardware
DCB support when multiple traffic classes are actually
being used. This allows the default case, with all priorities
mapped to traffic class 0, to work in normal hardware
mode and utilize the full packet buffer.
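
One way to picture the decision is sketched below: derive the number of traffic classes in use from the priority assignment table, and turn on hardware DCB only when there is more than one. The function names are hypothetical, and the sketch assumes the table simply holds one traffic class index per priority.

```c
#include <stdint.h>

#define MAX_PRIO 8      /* 802.1Q defines eight priorities, 0 to 7 */

/* Number of traffic classes referenced by the priority assignment table. */
static unsigned int tcs_in_use(const uint8_t prio_tc[MAX_PRIO])
{
    unsigned int i, max_tc = 0;

    for (i = 0; i < MAX_PRIO; i++)
        if (prio_tc[i] > max_tc)
            max_tc = prio_tc[i];
    return max_tc + 1;
}

/* Hardware DCB is only worth enabling when more than one TC is used. */
static int dcb_hw_needed(const uint8_t prio_tc[MAX_PRIO])
{
    return tcs_in_use(prio_tc) > 1;
}
```

With every priority mapped to traffic class 0 the second function returns 0, which corresponds to the default case described above.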

This patch does not address putting the hardware in
4TC mode so packet buffer space may be underutilized
in this case. A follow up patch can address this
optimization. But at least we have the hooks to do
this now.

Also CEE will behave as it always has and map priorities
1:1 with traffic classes.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB, missed translation from 8021Qaz TSA to CEE link strict
John Fastabend [Wed, 23 Feb 2011 05:58:14 +0000 (05:58 +0000)]
ixgbe: DCB, missed translation from 8021Qaz TSA to CEE link strict

The patch below allowed IEEE 802.1Qaz and CEE DCB hardware
configurations to use common hardware set routines:

commit 88eb696cc6a7af8f9272266965b1a4dd7d6a931b
Author: John Fastabend <john.r.fastabend@intel.com>
Date:   Thu Feb 10 03:02:11 2011 -0800

    ixgbe: DCB, abstract out dcb_config from DCB hardware configuration

However, the case when CEE link strict and group strict
are set was missed, and these are currently being mapped
incorrectly in some configurations.

This patch resolves this.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB: enable RSS to be used with DCB
John Fastabend [Wed, 23 Feb 2011 05:58:08 +0000 (05:58 +0000)]
ixgbe: DCB: enable RSS to be used with DCB

RSS had previously been disabled when DCB was enabled because
DCB was single queued per traffic class. Now that DCB implements
multiple Tx/Rx rings per traffic class, enable RSS.

Here RSS hashes across the queues in the traffic class.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: enable ndo_tc_setup
John Fastabend [Wed, 23 Feb 2011 05:58:03 +0000 (05:58 +0000)]
ixgbe: enable ndo_tc_setup

This patch adds the ndo_tc_setup to ixgbe. By default we set
the device to use strict priority.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB, use multiple Tx rings per traffic class
John Fastabend [Tue, 8 Mar 2011 03:44:52 +0000 (03:44 +0000)]
ixgbe: DCB, use multiple Tx rings per traffic class

This enables multiple {Tx|Rx} rings per traffic class while in DCB
mode. In order to get this working as expected the tc_to_tx net
device mapping is configured as well as the prio_tc_map.

skb priorities are mapped across a range of queue pairs to get
a distribution per traffic class. The maximum number of
queue pairs used while in DCB mode is capped at 64. The hardware
max is actually 128 queues but 64 is sufficient for now and
allocating more seemed a bit excessive. It is easy enough to
increase the cap later if need be.

To get the 802.1Q priority tags inserted correctly ixgbe was
previously using the skb queue_mapping field to directly set
the 802.1Q priority. This no longer works because we have removed
the 1:1 mapping between queues and traffic class. Each ring
is aligned with an 802.1Qaz traffic class so here we add an
extra field to the ring struct to identify the 802.1Q traffic
class. This uses an extra byte of the ixgbe_ring struct;
fortunately there was a 2-byte hole:

struct ixgbe_ring {
        void *                     desc;                 /*     0     8 */
        struct device *            dev;                  /*     8     8 */
        struct net_device *        netdev;               /*    16     8 */
        union {
                struct ixgbe_tx_buffer * tx_buffer_info; /*           8 */
                struct ixgbe_rx_buffer * rx_buffer_info; /*           8 */
        };                                               /*    24     8 */
        long unsigned int          state;                /*    32     8 */
        u8                         atr_sample_rate;      /*    40     1 */
        u8                         atr_count;            /*    41     1 */
        u16                        count;                /*    42     2 */
        u16                        rx_buf_len;           /*    44     2 */
        u16                        next_to_use;          /*    46     2 */
        u16                        next_to_clean;        /*    48     2 */
        u8                         queue_index;          /*    50     1 */
        u8                         reg_idx;              /*    51     1 */
        u16                        work_limit;           /*    52     2 */

        /* XXX 2 bytes hole, try to pack */

        u8 *                       tail;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */

Now we can set the VLAN priority directly and it will be
correct. User space can indicate the 802.1Qaz priority
using the SO_PRIORITY setsockopt() option and the QoS layer will
steer the skb to the correct rings. Additionally using
the multiq qdisc with a queue_mapping action works as
well.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB remove ixgbe_fcoe_getapp routine
John Fastabend [Wed, 23 Feb 2011 05:57:52 +0000 (05:57 +0000)]
ixgbe: DCB remove ixgbe_fcoe_getapp routine

Remove ixgbe_fcoe_getapp() and use the generic kernel
routine instead. Also add application priority to the
kernel maintained list on setapp so applications and
stacks can query the value.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB, implement ieee_setapp dcbnl ops
John Fastabend [Wed, 23 Feb 2011 05:57:47 +0000 (05:57 +0000)]
ixgbe: DCB, implement ieee_setapp dcbnl ops

Implement ieee_setapp dcbnl ops in ixgbe. This is required
to set up FCoE, which requires dedicated resources. If the
app data is not for FCoE then no action is taken in ixgbe
except to add it to the dcb_app_list.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbe: DCB, implement capabilities flags
John Fastabend [Tue, 1 Mar 2011 05:25:35 +0000 (05:25 +0000)]
ixgbe: DCB, implement capabilities flags

This implements dcbnl get and set capabilities ops. The
devices supported by ixgbe can be configured to run in
IEEE or CEE modes but not both.

With the DCBX set capabilities bit we add an explicit
signal that must be used to toggle between these modes.
This patch adds logic to fail the CEE command set_hw_all()
which programs the device with a CEE configuration if
the CEE caps bit is not set. Similarly, IEEE set
commands will fail if the IEEE caps bit is not set. We
allow most CEE config set commands to occur because they
do not touch the hardware until set_hw_all() is called.

The one exception to the above is the {set|get}app routines.
These must always be protected by caps bits to ensure
side effects do not corrupt the current configured mode.

By requiring the caps bit to be set correctly we can
maintain a consistent configuration in the hardware
for CEE or IEEE modes and prevent partial hardware
configurations that may occur if user space does
not send a complete IEEE or CEE configuration.

It is expected that user space will signal a DCBX mode
before programming the device.
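
The gating logic can be sketched as below. The flag values, the adapter structure, the function names, and the error code are all assumptions made for the illustration; the real driver and the kernel's dcbnl definitions may differ.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative capability bits (assumed values for this sketch). */
#define CAP_DCBX_VER_CEE  0x04u
#define CAP_DCBX_VER_IEEE 0x08u

/* Hypothetical per-device state holding the selected DCBX mode. */
struct adapter {
    uint8_t dcbx_cap;
};

/* Explicit mode switch: the device runs in CEE or IEEE mode, never both. */
static int set_dcbx_mode(struct adapter *a, uint8_t mode)
{
    if ((mode & CAP_DCBX_VER_CEE) && (mode & CAP_DCBX_VER_IEEE))
        return -EINVAL;
    a->dcbx_cap = mode;
    return 0;
}

/* Programming a CEE configuration fails unless the CEE bit is set. */
static int cee_set_hw_all(const struct adapter *a)
{
    if (!(a->dcbx_cap & CAP_DCBX_VER_CEE))
        return -EINVAL;
    /* ... the CEE configuration would be written to hardware here ... */
    return 0;
}

/* IEEE set commands fail unless the IEEE bit is set. */
static int ieee_set_ets(const struct adapter *a)
{
    if (!(a->dcbx_cap & CAP_DCBX_VER_IEEE))
        return -EINVAL;
    /* ... the IEEE configuration would be written to hardware here ... */
    return 0;
}
```

Checking the bit at the point where hardware is touched is what keeps a half-applied CEE setup from leaking into IEEE mode, or the reverse.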

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago igb: Bump version to 3.0.6
Carolyn Wyborny [Sat, 12 Mar 2011 04:58:19 +0000 (20:58 -0800)]
igb: Bump version to 3.0.6

This patch updates igb version to 3.0.6.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago igb: Add DMA Coalescing feature to driver
Carolyn Wyborny [Sat, 12 Mar 2011 04:43:54 +0000 (20:43 -0800)]
igb: Add DMA Coalescing feature to driver

This patch adds DMA Coalescing, a power-saving feature that
coalesces DMA writes in order to stay in a low-power state as much
as possible.  The feature is disabled by default.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago igb: Update NVM functions to work with i350 devices
Carolyn Wyborny [Sat, 12 Mar 2011 04:43:18 +0000 (20:43 -0800)]
igb: Update NVM functions to work with i350 devices

This patch adds functions and function pointers to accommodate
differences between NVM interfaces and options for i350 devices,
82580 devices and the rest.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago igb: Add Energy Efficient Ethernet (EEE) for i350 devices.
Carolyn Wyborny [Sat, 12 Mar 2011 04:42:13 +0000 (20:42 -0800)]
igb: Add Energy Efficient Ethernet (EEE) for i350 devices.

This patch adds the EEE feature for i350 devices, enabled by default.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbevf: Fix Driver String
Greg Rose [Sat, 12 Mar 2011 02:01:29 +0000 (02:01 +0000)]
ixgbevf: Fix Driver String

Change the driver string to match the PF driver string format.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago ixgbevf: Fix Version String
Greg Rose [Wed, 2 Mar 2011 06:52:30 +0000 (06:52 +0000)]
ixgbevf: Fix Version String

The kernel version string is off by a major version number since
new silicon was just introduced and also uses the wrong format for
the version postfix.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Fri, 11 Mar 2011 19:11:11 +0000 (14:11 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem

13 years ago e1000e: bump version number
Bruce Allan [Sat, 26 Feb 2011 03:12:19 +0000 (03:12 +0000)]
e1000e: bump version number

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years ago e1000e: do not suggest the driver supports Wake-on-ARP
Bruce Allan [Fri, 4 Mar 2011 09:07:01 +0000 (09:07 +0000)]
e1000e: do not suggest the driver supports Wake-on-ARP

The driver doesn't support Wake-on-ARP, so don't advertise through ethtool
that it does.

Clean up some coding style issues in the same functions.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoe1000e: disable jumbo frames on 82579 when MACsec enabled in EEPROM
Bruce Allan [Fri, 25 Feb 2011 07:44:51 +0000 (07:44 +0000)]
e1000e: disable jumbo frames on 82579 when MACsec enabled in EEPROM

If/when an OEM enables MACsec in the 82579 EEPROM, disable jumbo frames
support in the driver due to an interoperability issue in hardware that
prevents jumbo packets from being transmitted or received.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoe1000e: do not toggle LANPHYPC value bit when PHY reset is blocked
Bruce Allan [Fri, 25 Feb 2011 06:25:18 +0000 (06:25 +0000)]
e1000e: do not toggle LANPHYPC value bit when PHY reset is blocked

When PHY reset is intentionally blocked on 82577/8/9, do not toggle the
LANPHYPC value bit (essentially performing a hard power reset of the
device) otherwise the PHY can be put into an unknown state.

Clean up whitespace in the same function.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoe1000e: extend EEE LPI timer to prevent dropped link
Bruce Allan [Fri, 25 Feb 2011 06:58:03 +0000 (06:58 +0000)]
e1000e: extend EEE LPI timer to prevent dropped link

The link can be unexpectedly dropped when the timer for entering EEE low-
power-idle quiet state expires too soon.  The timer needs to be extended
from 196usec to 200usec after every LCD (PHY) reset to prevent this from
happening.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoe1000e: extend timeout for ethtool link test diagnostic
Bruce Allan [Fri, 25 Feb 2011 06:36:25 +0000 (06:36 +0000)]
e1000e: extend timeout for ethtool link test diagnostic

With some PHYs supported by this driver, link establishment can take a
little longer when connected to certain switches.  Extend the timeout to
reduce the number of false diagnostic failures, and clean up a code style
issue in the same function.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoe1000e: magic number cleanup - ETH_ALEN
Bruce Allan [Fri, 25 Feb 2011 07:09:37 +0000 (07:09 +0000)]
e1000e: magic number cleanup - ETH_ALEN

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoe1000e: use dev_kfree_skb_irq() instead of dev_kfree_skb()
Bruce Allan [Thu, 10 Feb 2011 08:17:21 +0000 (08:17 +0000)]
e1000e: use dev_kfree_skb_irq() instead of dev_kfree_skb()

Based on a report and patch originally submitted by Prasanna Panchamukhi.

Use dev_kfree_skb_irq() in e1000_clean_jumbo_rx_irq() since this latter
function is called only in interrupt context.  This avoids "Warning:
kfree_skb on hard IRQ" messages.

Cc: "Prasanna S. Panchamukhi" <prasanna.panchamukhi@riverbed.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbevf: remove Tx hang detection
Lior Levy [Wed, 2 Mar 2011 06:42:37 +0000 (06:42 +0000)]
ixgbevf: remove Tx hang detection

Remove the Tx hang detection mechanism from ixgbevf.
This mechanism has no effect and can cause false alarm messages in some
cases, especially when the VF Tx rate limit is turned on.

The same mechanism was removed recently from igbvf.

Signed-off-by: Lior Levy <lior.levy@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agonet: add proper documentation for previously added net_device_ops for FCoE
Yi Zou [Wed, 9 Mar 2011 08:48:03 +0000 (08:48 +0000)]
net: add proper documentation for previously added net_device_ops for FCoE

Add proper documentation for the previously added net_device_ops for FCoE.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgb: convert to new VLAN model
Emil Tantilov [Thu, 27 Jan 2011 09:14:18 +0000 (09:14 +0000)]
ixgb: convert to new VLAN model

Based on a patch from Jesse Gross <jesse@nicira.com>

This switches the ixgb driver to use the new VLAN interfaces.
In doing this, it completes the work begun in
ae54496f9e8d40c89e5668205c181dccfa9ecda1 allowing the use of
hardware VLAN insertion without having a VLAN group configured.

CC: Jesse Gross <jesse@nicira.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoipv4: Kill flowi arg to fib_select_multipath()
David S. Miller [Fri, 11 Mar 2011 01:01:16 +0000 (17:01 -0800)]
ipv4: Kill flowi arg to fib_select_multipath()

Completely unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Remove unnecessary test from ip_mkroute_input()
David S. Miller [Fri, 11 Mar 2011 00:23:24 +0000 (16:23 -0800)]
ipv4: Remove unnecessary test from ip_mkroute_input()

fl->oif will always be zero on the input path, so there is no reason
to test for that.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Remove redundant RCU locking in ip_check_mc().
David S. Miller [Fri, 11 Mar 2011 00:34:38 +0000 (16:34 -0800)]
ipv4: Remove redundant RCU locking in ip_check_mc().

All callers are under rcu_read_lock() protection already.

Rename to ip_check_mc_rcu() to make it even more clear.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Thu, 10 Mar 2011 22:26:00 +0000 (14:26 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/bnx2x/bnx2x_cmn.c

13 years agoip6ip6: autoload ip6 tunnel
stephen hemminger [Thu, 10 Mar 2011 11:43:19 +0000 (11:43 +0000)]
ip6ip6: autoload ip6 tunnel

Add necessary alias to autoload ip6ip6 tunnel module.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of /home/davem/src/GIT/linux-2.6/
David S. Miller [Thu, 10 Mar 2011 22:00:44 +0000 (14:00 -0800)]
Merge branch 'master' of /home/davem/src/GIT/linux-2.6/

13 years agonet: bridge builtin vs. ipv6 modular
Randy Dunlap [Thu, 10 Mar 2011 21:45:57 +0000 (13:45 -0800)]
net: bridge builtin vs. ipv6 modular

When configs BRIDGE=y and IPV6=m, this build error occurs:

br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr'

BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding
depends on IPV6 || IPV6=n
to BRIDGE_IGMP_SNOOPING would be a good fix.  As it is currently,
making BRIDGE depend on the IPV6 config works.

Reported-by: Patrick Schaaf <netdev@bof.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Thu, 10 Mar 2011 21:22:10 +0000 (13:22 -0800)]
Merge branch 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6

* 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  [media] mantis_pci: remove asm/pgtable.h include
  [media] tda829x: fix regression in probe functions
  [media] mceusb: don't claim multifunction device non-IR parts
  [media] nuvoton-cir: fix wake from suspend
  [media] cx18: Add support for Hauppauge HVR-1600 models with s5h1411
  [media] ivtv: Fix corrective action taken upon DMA ERR interrupt to avoid hang
  [media] cx25840: fix probing of cx2583x chips
  [media] cx23885: Remove unused 'err:' labels to quiet compiler warning
  [media] cx23885: Revert "Check for slave nack on all transactions"
  [media] DiB7000M: add pid filtering
  [media] Fix sysfs rc protocol lookup for rc-5-sz
  [media] au0828: fix VBI handling when in V4L2 streaming mode
  [media] ir-raw: Properly initialize the IR event (BZ#27202)
  [media] s2255drv: firmware re-loading changes
  [media] Fix double free of video_device in mem2mem_testdev
  [media] DM04/QQBOX memcpy to const char fix

13 years agoipmi: Fix IPMI errors due to timing problems
Doe, YiCheng [Thu, 10 Mar 2011 20:00:21 +0000 (14:00 -0600)]
ipmi: Fix IPMI errors due to timing problems

This patch fixes an issue in the OpenIPMI module where an ABORT command is
sometimes sent after an IPMI request to the BMC, causing the request to fail.

Signed-off-by: YiCheng Doe <yicheng.doe@hp.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Tom Mingarelli <thomas.mingarelli@hp.com>
Tested-by: Andy Cress <andy.cress@us.kontron.com>
Tested-by: Mika Lansirine <Mika.Lansirinne@stonesoft.com>
Tested-by: Brian De Wolf <bldewolf@csupomona.edu>
Cc: Jean Michel Audet <Jean-Michel.Audet@ca.Kontron.com>
Cc: Jozef Sudelsky <jozef.sudolsky@elbiahosting.sk>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Thu, 10 Mar 2011 21:16:01 +0000 (13:16 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  fs/dcache: allow d_obtain_alias() to return unhashed dentries
  Check for immutable/append flag in fallocate path
  sysctl: the include of rcupdate.h is only needed in the kernel
  fat: fix d_revalidate oopsen on NFS exports
  jfs: fix d_revalidate oopsen on NFS exports
  ocfs2: fix d_revalidate oopsen on NFS exports
  gfs2: fix d_revalidate oopsen on NFS exports
  fuse: fix d_revalidate oopsen on NFS exports
  ceph: fix d_revalidate oopsen on NFS exports
  reiserfs xattr ->d_revalidate() shouldn't care about RCU
  /proc/self is never going to be invalidated...

13 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 10 Mar 2011 21:09:26 +0000 (13:09 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV: Initialize the broadcast assist unit base destination node id properly
  x86, numa: Fix numa_emulation code with memory-less node0
  x86, build: Make sure mkpiggy fails on read error

13 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 10 Mar 2011 21:08:59 +0000 (13:08 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix sched rt group scheduling when hierachy is enabled

13 years agoMerge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux...
Linus Torvalds [Thu, 10 Mar 2011 21:07:38 +0000 (13:07 -0800)]
Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf symbols: Avoid resolving [kernel.kallsyms] to real path for buildid cache
  perf symbols: Fix vmlinux path when not using --symfs

13 years agodrm/i915: Revive combination mode for backlight control
Takashi Iwai [Thu, 10 Mar 2011 13:02:12 +0000 (14:02 +0100)]
drm/i915: Revive combination mode for backlight control

This reverts commit 951f3512dba5bd44cda3e5ee22b4b522e4bb09fb

    drm/i915: Do not handle backlight combination mode specially

since this commit introduced other regressions due to the untouched LBPC
register, e.g. the backlight dimmed after resume.

In addition to the revert, this patch includes a fix for the original
issue (weird backlight levels) by removing the wrong bit shift for
computing the current backlight level.
It also includes typo fixes (lpbc -> lbpc).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34524
Acked-by: Indan Zupancic <indan@nul.nu>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agofs/dcache: allow d_obtain_alias() to return unhashed dentries
J. Bruce Fields [Tue, 18 Jan 2011 20:45:09 +0000 (15:45 -0500)]
fs/dcache: allow d_obtain_alias() to return unhashed dentries

Without this patch, inodes are not promptly freed on last close of an
unlinked file by an nfs client:

client$ mount -tnfs4 server:/export/ /mnt/
client$ tail -f /mnt/FOO
...
server$ df -i /export
server$ rm /export/FOO
(^C the tail -f)
server$ df -i /export
server$ echo 2 >/proc/sys/vm/drop_caches
server$ df -i /export

The df outputs will show that the inode is not freed on the filesystem
until the last step, when it could have been freed after killing the
client's tail -f. On-disk data won't be deallocated either, leading to
possible spurious ENOSPC.

This occurs because when the client does the close, it arrives in a
compound with a putfh and a close, processed like:

- putfh: look up the filehandle.  The only alias found for the
  inode will be the DCACHE_UNHASHED alias referenced by the
  filp, so it creates a new DCACHE_DISCONNECTED dentry and
  returns that instead.
- close: closes the existing filp, which is destroyed
  immediately by dput() since it's DCACHE_UNHASHED.
- end of the compound: release the reference
  to the current filehandle, and dput() the new
  DCACHE_DISCONNECTED dentry, which gets put on the
  unused list instead of being destroyed immediately.

Nick Piggin suggested fixing this by allowing d_obtain_alias to return
the unhashed dentry that is referenced by the filp, instead of making it
create a new dentry.

Leave __d_find_alias() alone to avoid changing behavior of other
callers.

Also nfsd doesn't need all the checks of __d_find_alias(); any dentry,
hashed or unhashed, disconnected or not, should work.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agotg3: Remove 5750 PCI code
Matt Carlson [Wed, 9 Mar 2011 16:58:25 +0000 (16:58 +0000)]
tg3: Remove 5750 PCI code

The 5750 ASIC rev was never released as a PCI device.  It only exists as
a PCIe device.  This patch removes the code that supports the former
configuration.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Move tg3_init_link_config to tg3_phy_probe
Matt Carlson [Wed, 9 Mar 2011 16:58:24 +0000 (16:58 +0000)]
tg3: Move tg3_init_link_config to tg3_phy_probe

This patch moves the function that initializes the link configuration
closer to the place where the rest of the phy code is initialized.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Refine VAux decision process
Matt Carlson [Wed, 9 Mar 2011 16:58:23 +0000 (16:58 +0000)]
tg3: Refine VAux decision process

In the near future, the VAux switching decision process is going to get
more complicated.  This patch refines and consolidates the existing
algorithm in anticipation of the new scheme.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: cleanup pci device table vars
Matt Carlson [Wed, 9 Mar 2011 16:58:22 +0000 (16:58 +0000)]
tg3: cleanup pci device table vars

Commit 895950c2a6565d9eefda4a38b00fa28537e39fcb, entitled
"tg3: Use DEFINE_PCI_DEVICE_TABLE" moved two pci device tables into the
global address space, but didn't declare them static and didn't prefix
them with "tg3_".  This patch fixes those problems.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Add code to verify RODATA checksum of VPD
Matt Carlson [Wed, 9 Mar 2011 16:58:21 +0000 (16:58 +0000)]
tg3: Add code to verify RODATA checksum of VPD

This patch adds code to verify the checksum stored in the "RV" info
keyword of the RODATA VPD section.
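
As an illustration only (this is not the tg3 code), the check can be
sketched as below.  It assumes the usual PCI VPD convention that the
"RV" checksum byte is chosen so that all bytes from the start of VPD up
to and including it sum to zero modulo 256.  The function name and the
rv_csum_off parameter are hypothetical; locating the "RV" keyword is
left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if bytes vpd[0] .. vpd[rv_csum_off] sum to zero modulo 256,
 * 0 otherwise.  rv_csum_off is the offset of the checksum byte, i.e.
 * the first data byte of the "RV" keyword (hypothetical parameter).
 */
int vpd_rv_checksum_ok(const uint8_t *vpd, size_t rv_csum_off)
{
	uint8_t sum = 0;
	size_t i;

	/* uint8_t arithmetic wraps, giving the modulo-256 sum directly. */
	for (i = 0; i <= rv_csum_off; i++)
		sum += vpd[i];

	return sum == 0;
}
```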

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Fix NVRAM selftest
Matt Carlson [Wed, 9 Mar 2011 16:58:20 +0000 (16:58 +0000)]
tg3: Fix NVRAM selftest

The tg3 NVRAM selftest actually fails when validating the checksum of
the legacy NVRAM format.  However, the test still reported success
because the last update of the return code was a success from the NVRAM
reads.  This patch fixes the code so that the error return code defaults
to a failure status.  Then the patch fixes the reason why the checksum
validation failed.
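
The "default to failure" pattern described above can be sketched as
follows.  This is a user-space illustration under stated assumptions,
not the tg3 selftest: read_word(), nvram_selftest() and the toy rule
"last word equals the sum of the preceding words" are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an NVRAM read; it always succeeds here. */
int read_word(const uint32_t *nvram, size_t idx, uint32_t *out)
{
	*out = nvram[idx];
	return 0;
}

/*
 * Hypothetical selftest: the last word must equal the sum of all
 * preceding words.  Returns 0 on success, negative errno on failure.
 */
int nvram_selftest(const uint32_t *nvram, size_t nwords)
{
	uint32_t sum = 0, word = 0;
	size_t i;
	int err;

	if (nwords == 0)
		return -EINVAL;

	for (i = 0; i < nwords; i++) {
		err = read_word(nvram, i, &word);
		if (err)
			return err;
		if (i < nwords - 1)
			sum += word;
	}

	/*
	 * err is 0 here because the last read succeeded.  Reset it to a
	 * failure code so a checksum mismatch cannot be reported as
	 * success by accident.
	 */
	err = -EIO;
	if (sum != word)
		goto out;

	err = 0;
out:
	return err;
}
```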

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Add missed 5719 workaround change
Matt Carlson [Wed, 9 Mar 2011 16:58:19 +0000 (16:58 +0000)]
tg3: Add missed 5719 workaround change

Commit 2866d956fe0ad8fc8d8a7c54104ccc879b49406d, entitled
"tg3: Expand 5719 workaround" extended a 5719 A0 workaround to all
revisions of the chip.  There was a change that should have been a
part of that patch that was missed.  This patch adds the missing
piece.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoCheck for immutable/append flag in fallocate path
Marco Stornelli [Sat, 5 Mar 2011 10:10:19 +0000 (11:10 +0100)]
Check for immutable/append flag in fallocate path

In the fallocate path the kernel doesn't check for the immutable/append
flag. A race is possible in this scenario: an application opens a file
read/write and works on it; meanwhile root sets the immutable flag on
the file, and the application can still call fallocate successfully.
In addition, on an append-only file we allow only the reserve operation,
not any unreserve operation.
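
A minimal user-space model of the two checks, for illustration only.
The MODEL_* flag names and model_fallocate_check() are hypothetical and
merely stand in for the kernel's inode flags and fallocate mode bits;
the actual patch may order or express the checks differently.

```c
#include <errno.h>

/* Hypothetical stand-ins for inode flags and an fallocate mode bit. */
#define MODEL_INODE_IMMUTABLE	0x1u
#define MODEL_INODE_APPEND	0x2u
#define MODEL_FL_UNRESERVE	0x1

/*
 * Return 0 if the request may proceed, -EPERM if it must be refused.
 */
int model_fallocate_check(unsigned int inode_flags, int mode)
{
	/* An immutable file accepts no allocation changes at all. */
	if (inode_flags & MODEL_INODE_IMMUTABLE)
		return -EPERM;

	/* An append-only file may reserve space but not unreserve it. */
	if ((mode & MODEL_FL_UNRESERVE) && (inode_flags & MODEL_INODE_APPEND))
		return -EPERM;

	return 0;
}
```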

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agosysctl: the include of rcupdate.h is only needed in the kernel
Stephen Rothwell [Thu, 10 Mar 2011 00:25:43 +0000 (11:25 +1100)]
sysctl: the include of rcupdate.h is only needed in the kernel

Fixes this build error:

include/linux/sysctl.h:28: included file 'linux/rcupdate.h' is not exported

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofat: fix d_revalidate oopsen on NFS exports
Al Viro [Thu, 10 Mar 2011 08:45:49 +0000 (03:45 -0500)]
fat: fix d_revalidate oopsen on NFS exports

can't blindly check nd->flags in ->d_revalidate()
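
A sketch of the kind of guard this implies, assuming the oops comes
from ->d_revalidate() being called with a NULL nd on the NFS export
path (an assumption; the message above does not spell this out).  The
struct, flag and function names are hypothetical models, not the
kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the lookup flag and nameidata structure. */
#define MODEL_LOOKUP_RCU	0x1u

struct model_nameidata {
	unsigned int flags;
};

/*
 * Model of a d_revalidate callback: test nd for NULL before reading
 * nd->flags.  Returns -ECHILD to refuse RCU-walk mode, 1 if the dentry
 * is considered valid.
 */
int model_d_revalidate(const struct model_nameidata *nd)
{
	if (nd && (nd->flags & MODEL_LOOKUP_RCU))
		return -ECHILD;

	return 1;
}
```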

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agojfs: fix d_revalidate oopsen on NFS exports
Al Viro [Thu, 10 Mar 2011 08:45:28 +0000 (03:45 -0500)]
jfs: fix d_revalidate oopsen on NFS exports

can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoocfs2: fix d_revalidate oopsen on NFS exports
Al Viro [Thu, 10 Mar 2011 08:45:07 +0000 (03:45 -0500)]
ocfs2: fix d_revalidate oopsen on NFS exports

can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agogfs2: fix d_revalidate oopsen on NFS exports
Al Viro [Thu, 10 Mar 2011 08:44:48 +0000 (03:44 -0500)]
gfs2: fix d_revalidate oopsen on NFS exports

can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofuse: fix d_revalidate oopsen on NFS exports
Al Viro [Thu, 10 Mar 2011 08:44:31 +0000 (03:44 -0500)]
fuse: fix d_revalidate oopsen on NFS exports

can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoceph: fix d_revalidate oopsen on NFS exports
Al Viro [Thu, 10 Mar 2011 08:44:05 +0000 (03:44 -0500)]
ceph: fix d_revalidate oopsen on NFS exports

can't blindly check nd->flags in ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>