kernel/kernel-generic.git
12 years agoipv6: Report TCP timetstamp info in cacheinfo just like ipv4 does.
David S. Miller [Thu, 29 Dec 2011 20:22:33 +0000 (15:22 -0500)]
ipv6: Report TCP timetstamp info in cacheinfo just like ipv4 does.

I missed this while adding ipv6 support to inet_peer.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomlx4_core: limiting VF port options
Yevgeny Petrilin [Thu, 29 Dec 2011 07:42:39 +0000 (07:42 +0000)]
mlx4_core: limiting VF port options

At the moment VFs can only operate in Eth mode.
In addition we don't want the VF to attempt link sensing,
so we block this option as well.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomlx4_core: using array index for sense_allowed
Yevgeny Petrilin [Thu, 29 Dec 2011 07:42:34 +0000 (07:42 +0000)]
mlx4_core: using array index for sense_allowed

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosch_tbf: report backlog information
Eric Dumazet [Wed, 28 Dec 2011 23:27:44 +0000 (23:27 +0000)]
sch_tbf: report backlog information

Provide child qdisc backlog (byte count) information so that "tc -s
qdisc" can report it to user.

qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms  10.0ms
 Sent 948517 bytes 898 pkt (dropped 0, overlimits 0 requeues 1)
 rate 175056bit 16pps backlog 114b 1p requeues 1
qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us
 Sent 948517 bytes 898 pkt (dropped 15, overlimits 611 requeues 0)
 backlog 18168b 12p requeues 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetfilter: Kconfig: fix unmet xt_nfacct dependencies
Pablo Neira Ayuso [Wed, 28 Dec 2011 15:03:30 +0000 (15:03 +0000)]
netfilter: Kconfig: fix unmet xt_nfacct dependencies

warning: (NETFILTER_XT_MATCH_NFACCT) selects NETFILTER_NETLINK_ACCT which has
unmet direct dependencies (NET && INET && NETFILTER && NETFILTER_ADVANCED)

and then

ERROR: "nfnetlink_subsys_unregister" [net/netfilter/nfnetlink_acct.ko] undefined!
ERROR: "nfnetlink_subsys_register" [net/netfilter/nfnetlink_acct.ko] undefined!

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Kill rt6i_dev and rt6i_expires defines.
David S. Miller [Thu, 29 Dec 2011 01:19:20 +0000 (20:19 -0500)]
ipv6: Kill rt6i_dev and rt6i_expires defines.

It just obscures that the netdevice pointer and the expires value are
implemented in the dst_entry sub-object of the ipv6 route.

And it makes grepping for dst_entry member uses much harder too.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Create fast inline ipv6 neigh lookup just like ipv4.
David S. Miller [Wed, 28 Dec 2011 20:41:23 +0000 (15:41 -0500)]
ipv6: Create fast inline ipv6 neigh lookup just like ipv4.

Also, create and use an rt6_bind_neighbour() in net/ipv6/route.c to
consolidate some common logic.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Use universal hash for NDISC.
David S. Miller [Wed, 28 Dec 2011 20:06:58 +0000 (15:06 -0500)]
ipv6: Use universal hash for NDISC.

In order to perform a proper universal hash on a vector of integers,
we have to use different universal hashes on each vector element.

Which means we need 4 different hash randoms for ipv6.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetrom: avoid overflows in nr_setsockopt()
Xi Wang [Tue, 27 Dec 2011 09:44:53 +0000 (09:44 +0000)]
netrom: avoid overflows in nr_setsockopt()

Check setsockopt arguments to avoid overflows and return -EINVAL for
too large arguments.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoax25: avoid overflows in ax25_setsockopt()
Xi Wang [Tue, 27 Dec 2011 09:43:19 +0000 (09:43 +0000)]
ax25: avoid overflows in ax25_setsockopt()

Commit be639ac6 ("NET: AX.25: Check ioctl arguments to avoid overflows
further down the road") rejects very large arguments, but doesn't
completely fix overflows on 64-bit systems.  Consider the AX25_T2 case.

int opt;
...
if (opt < 1 || opt > ULONG_MAX / HZ) {
res = -EINVAL;
break;
}
ax25->t2 = opt * HZ;

The 32-bit multiplication opt * HZ would overflow before being assigned
to 64-bit ax25->t2.  This patch changes "opt" to unsigned long.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agogenetlink: add auto module loading
Stephen Hemminger [Wed, 28 Dec 2011 18:48:55 +0000 (13:48 -0500)]
genetlink: add auto module loading

When testing L2TP support, I discovered that the l2tp module is not autoloaded
as are other netlink interfaces. There is because of lack of hook in genetlink to call
request_module and load the module.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Remove optimistic DAD flag test in ipv6_add_addr()
David Miller [Tue, 27 Dec 2011 09:53:05 +0000 (09:53 +0000)]
ipv6: Remove optimistic DAD flag test in ipv6_add_addr()

The route we have here is for the address being added to the interface,
ie. for input packet processing.

Therefore using that route to determine whether an output nexthop gateway
is known and resolved doesn't make any sense.

So, simply remove this test, it never triggered anyways.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-By: Neil Horman <nhorman@tuxdriver.com>
12 years agoMerge branch 'nf-next' of git://1984.lsi.us.es/net-next
David S. Miller [Wed, 28 Dec 2011 18:16:59 +0000 (13:16 -0500)]
Merge branch 'nf-next' of git://1984.lsi.us.es/net-next

12 years agonet: fec: Adjust ENET MDIO timeouts
Rogerio Pimentel [Tue, 27 Dec 2011 19:07:37 +0000 (14:07 -0500)]
net: fec: Adjust ENET MDIO timeouts

On extensive NFS boots on a mx6qsabrelite board it was noted that "FEC: MDIO read timeout" were occuring,
which caused failure on loading the FEC driver.

The original FEC_MII_TIMEOUT was set to 1 ms, which is too low when passed to the usecs_to_jiffies macro.

On ARM one jiffy is 10ms, so use a timeout of 30ms, which corresponds to 3 jiffies.

After running extensive NFS boots, the MDIO timeouts do not occur anymore with this change.

Signed-off-by: Rogerio Pimentel <rogerio.pimentel@freescale.com>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetfilter: provide config option to disable ancient procfs parts
Jan Engelhardt [Thu, 21 Apr 2011 07:32:45 +0000 (09:32 +0200)]
netfilter: provide config option to disable ancient procfs parts

Using /proc/net/nf_conntrack has been deprecated in favour of the
conntrack(8) tool.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: xtables: collapse conditions in xt_ecn
Jan Engelhardt [Thu, 9 Jun 2011 20:16:50 +0000 (22:16 +0200)]
netfilter: xtables: collapse conditions in xt_ecn

One simplification of an if clause.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: xtables: add an IPv6 capable version of the ECN match
Patrick McHardy [Thu, 9 Jun 2011 13:20:27 +0000 (15:20 +0200)]
netfilter: xtables: add an IPv6 capable version of the ECN match

References: http://www.spinics.net/lists/netfilter-devel/msg18875.html

Augment xt_ecn by facilities to match on IPv6 packets' DSCP/TOS field
similar to how it is already done for the IPv4 packet field.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: xtables: give xt_ecn its own name
Jan Engelhardt [Thu, 9 Jun 2011 19:15:37 +0000 (21:15 +0200)]
netfilter: xtables: give xt_ecn its own name

Use the new macro and struct names in xt_ecn.h, and put the old
definitions into a definition-forwarding ipt_ecn.h.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: xtables: move ipt_ecn to xt_ecn
Jan Engelhardt [Thu, 9 Jun 2011 19:03:07 +0000 (21:03 +0200)]
netfilter: xtables: move ipt_ecn to xt_ecn

Prepare the ECN match for augmentation by an IPv6 counterpart. Since
no symbol dependencies to ipv6.ko are added, having a single ecn match
module is the more so welcome.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonet: meth: Add set_rx_mode hook to fix ICMPv6 neighbor discovery
Joshua Kinard [Mon, 26 Dec 2011 19:06:15 +0000 (19:06 +0000)]
net: meth: Add set_rx_mode hook to fix ICMPv6 neighbor discovery

SGI IP32 (O2)'s ethernet driver (meth) lacks a set_rx_mode function, which
prevents IPv6 from working completely because any ICMPv6 neighbor
solicitation requests aren't picked up by the driver.  So the machine can
ping out and connect to other systems, but other systems will have a very
hard time connecting to the O2.

Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: calxeda xgmac ethernet driver add missing HAS_IOMEM dependency
Heiko Carstens [Tue, 27 Dec 2011 04:07:05 +0000 (04:07 +0000)]
net: calxeda xgmac ethernet driver add missing HAS_IOMEM dependency

Fix allyesconfig build on architectures without IOMEM:

drivers/net/ethernet/calxeda/xgmac.c:1800:2:
  error: implicit declaration of function 'iounmap' [-Werror=implicit-function-declaration]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotipc: Allow use of buf_seqno() helper routine by unicast links
Allan Stephens [Mon, 24 Oct 2011 20:03:12 +0000 (16:03 -0400)]
tipc: Allow use of buf_seqno() helper routine by unicast links

Migrates the buf_seqno() helper routine from broadcast link level to
unicast link level so that it can be used both types of TIPC links.
This is a cosmetic change only, and does not affect the operation of TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Ignore broadcast acknowledgements that are out-of-range
Allan Stephens [Mon, 24 Oct 2011 19:26:24 +0000 (15:26 -0400)]
tipc: Ignore broadcast acknowledgements that are out-of-range

Adds checks to TIPC's broadcast link so that it ignores any
acknowledgement message containing a sequence number that does not
correspond to an unacknowledged message currently in the broadcast
link's transmit queue.

This change prevents the broadcast link from becoming stalled if a
newly booted node receives stale broadcast link acknowledgement
information from another node that has not yet fully synchronized
its end of the broadcast link to reflect the current state of the
new node's end.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Flush unsent broadcast messages when contact with last node is lost
Allan Stephens [Mon, 24 Oct 2011 18:59:20 +0000 (14:59 -0400)]
tipc: Flush unsent broadcast messages when contact with last node is lost

Adds code to release any unsent broadcast messages in the broadcast link
transmit queue if TIPC loses contact with its only neighboring node.
Previously, a broadcast link that was in the congested state would hold
on to the unsent messages, even though the messages were now undeliverable.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Minor optimization of broadcast link transmit queue statistic
Allan Stephens [Mon, 24 Oct 2011 17:27:31 +0000 (13:27 -0400)]
tipc: Minor optimization of broadcast link transmit queue statistic

The two broadcast link statistics fields that are used to derive the
average length of that link's transmit queue are now updated only after
a successful attempt to send a broadcast message, since there is no need
to update these values when an unsuccessful send attempt leaves the
queue unchanged.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Handle broadcast attempt when no neighboring nodes exist
Allan Stephens [Mon, 24 Oct 2011 17:05:55 +0000 (13:05 -0400)]
tipc: Handle broadcast attempt when no neighboring nodes exist

Adds a check to detect when an attempt is made to send a message
via the broadcast link and no neighboring nodes are currently available
to receive it. Rather than wasting effort passing the message to the
broadcast link and broadcast bearer, who will only throw it away,
TIPC now frees the message immediately and reports success (i.e. the
message has been delivered to all available destinations).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Ensure broadcast link spinlock is held when updating node map
Allan Stephens [Mon, 24 Oct 2011 15:18:12 +0000 (11:18 -0400)]
tipc: Ensure broadcast link spinlock is held when updating node map

Fixes oversight that allowed broadcast link node map to be updated without
first taking the broadcast link spinlock that protects the map. As part
of this fix the node map has been incorporated into the broadcast link
structure to make the need for such protection more evident.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Eliminate dynamic allocation of broadcast link data structures
Allan Stephens [Mon, 24 Oct 2011 14:29:26 +0000 (10:29 -0400)]
tipc: Eliminate dynamic allocation of broadcast link data structures

Creates global variables to hold the broadcast link's pseudo-bearer and
pseudo-link structures, rather than allocating them dynamically. There
is only a single instance of each structure, and changing over to static
allocation allows elimination of code to handle the cases where dynamic
allocation was unsuccessful.

The memset in the teardown code may look like they aren't used, but
the same teardown code is run when there is a non-fatal error at
init-time, so that stale data isn't present when the user fixes the
cause of the soft error.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Eliminate useless check when network address is assigned
Allan Stephens [Fri, 14 Oct 2011 18:42:25 +0000 (14:42 -0400)]
tipc: Eliminate useless check when network address is assigned

Gets rid of an unnecessary check in the routine that updates the port id
of a node's name publications when the node is assigned a network address,
since the routine is only invoked if the new address is different from
the existing one.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Minor correction to TIPC module unloading
Allan Stephens [Thu, 20 Oct 2011 13:48:05 +0000 (09:48 -0400)]
tipc: Minor correction to TIPC module unloading

Modifies TIPC's module unloading logic to switch itself into "single
node" mode before starting to terminate networking support. This helps
to ensure that no operations that require TIPC to be in "networking"
mode can initiate once unloading starts.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Eliminate useless memset operations in Ethernet media support
Allan Stephens [Wed, 19 Oct 2011 19:39:21 +0000 (15:39 -0400)]
tipc: Eliminate useless memset operations in Ethernet media support

Gets rid of two pointless operations that zero out the array used to
record information about TIPC's Ethernet bearers. There is no need to
initialize the array on start up since it is a global variable that is
already zero'd out, and there is no need to zero it out on exit because
the array is never referenced again.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Do timely cleanup of disabled Ethernet bearer resources
Allan Stephens [Wed, 19 Oct 2011 19:18:11 +0000 (15:18 -0400)]
tipc: Do timely cleanup of disabled Ethernet bearer resources

Modifies Ethernet bearer disable logic to break the association between
the bearer and its device driver at the time the bearer is disabled,
rather than when the TIPC module is unloaded. This allows the array
entry used by the disabled bearer to be re-used if the same bearer (or
a different one) is subsequently enabled.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Minor optimization to deactivation of Ethernet media suppot
Allan Stephens [Wed, 19 Oct 2011 18:58:29 +0000 (14:58 -0400)]
tipc: Minor optimization to deactivation of Ethernet media suppot

Change TIPC's shutdown code to deactivate generic networking support
before terminating Ethernet media support. The deactivation of generic
networking support causes all existing bearers to be destroyed, meaning
the Ethernet media termination routine no longer has to bother marking
them as unavailable.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Revise comment justifying release of configuration spinlock
Allan Stephens [Tue, 18 Oct 2011 18:47:02 +0000 (14:47 -0400)]
tipc: Revise comment justifying release of configuration spinlock

Comment-only change to better explain why TIPC's configuration lock is
temporarily released while activating support for network interfaces,
and why the existing activation code doesn't require rework.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Allow run-time alteration of default link settings
Allan Stephens [Tue, 18 Oct 2011 15:34:29 +0000 (11:34 -0400)]
tipc: Allow run-time alteration of default link settings

Permits run-time alteration of default link settings on a per-media
and per-bearer basis, in addition to the existing per-link basis.
The following syntax can now be used:

    tipc-config -lt=<link-name|bearer-name|media-name>/<tolerance>
    tipc-config -lp=<link-name|bearer-name|media-name>/<priority>
    tipc-config -lw=<link-name|bearer-name|media-name>/<window>

Note that changes to the default settings for a given media type has
no effect on the default settings used by existing bearers. Similarly,
changes to default bearer settings has no effect on existing link
endpoints that utilize that interface.

Thanks to Florian Westphal <fw@strlen.de> for his contributions to
the development of this enhancement.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Ignore neighbor discovery messages containing invalid address
Allan Stephens [Fri, 7 Oct 2011 19:48:41 +0000 (15:48 -0400)]
tipc: Ignore neighbor discovery messages containing invalid address

Adds a check to ensure that TIPC ignores an incoming neighbor discovery
message that specifies an invalid media address as its source. The check
ensures that the source address is a valid, non-broadcast address that
could legally be used by a neighboring link endpoint.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Hide media-specific addressing details from generic bearer code
Allan Stephens [Fri, 7 Oct 2011 19:19:11 +0000 (15:19 -0400)]
tipc: Hide media-specific addressing details from generic bearer code

Reworks TIPC's media address data structure and associated processing
routines to transfer all media-specific details of address conversion
to the associated TIPC media adaptation code. TIPC's generic bearer code
now only needs to know which media type an address is associated with
and whether or not it is a broadcast address, and totally ignores the
"value" field that contains the actual media-specific addressing info.

These changes eliminate the need for a number of endianness conversion
operations and will make it easier for TIPC to support new media types
in the future.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Add new address conversion routines for Ethernet media
Allan Stephens [Fri, 7 Oct 2011 17:37:34 +0000 (13:37 -0400)]
tipc: Add new address conversion routines for Ethernet media

Enhances TIPC's Ethernet media support to provide 3 new address conversion
routines, which allow TIPC to interpret an address that is in string form
and to convert an address to and from the 20 byte format used in TIPC's
neighbor discovery messages.

These routines are pre-requisites to a follow on commit that hides all
media-specific addressing details from TIPC's generic bearer code.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Improve handling of media address printing errors
Allan Stephens [Fri, 7 Oct 2011 15:31:49 +0000 (11:31 -0400)]
tipc: Improve handling of media address printing errors

Enhances conversion of a media address to printable form so that an
unconvertable address will be displayed as a string of hex digits,
rather than not being displayed at all. (Also removes a pointless check
for the existence of the media-specific address conversion routine,
since the routine is not optional.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Streamline media registration error checking
Allan Stephens [Fri, 7 Oct 2011 13:54:44 +0000 (09:54 -0400)]
tipc: Streamline media registration error checking

Simplifies error handling performed during media registration, since
TIPC no longer supports the dynamic addition of new media types that
are potentially error-prone. These simplifications include the following:

1) No longer check for premature registration of a new media type.
2) No longer check for negative link priority values (which was pointless
   since such values are unsigned, and could cause a compiler warning).
3) No longer generate a warning describing the exact cause of any
   registration failure (just warns that overall registration failed).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Eliminate duplication of media structures
Allan Stephens [Fri, 7 Oct 2011 13:25:12 +0000 (09:25 -0400)]
tipc: Eliminate duplication of media structures

Changes TIPC's list of registered media types from an array of media
structures to an array of pointers to media structures. This eliminates
the need to copy of the contents of the structure passed in during media
registration.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Optimize detection of duplicate media registration
Allan Stephens [Thu, 6 Oct 2011 20:40:55 +0000 (16:40 -0400)]
tipc: Optimize detection of duplicate media registration

Streamlines the detection of an attempt to register a TIPC media structure
using an already registered name or type identifier. The revised logic now
reuses an existing routine to detect an existing name and no longer
unnecessarily manipulates the media type counter during an unsuccessful
registration attempt.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Register new media using pre-compiled structure
Allan Stephens [Thu, 6 Oct 2011 19:28:44 +0000 (15:28 -0400)]
tipc: Register new media using pre-compiled structure

Speeds up the registration of TIPC media types by passing in a structure
containing the required information, rather than by passing in the various
fields describing the media type individually.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Enable use by containers having their own network namespace
Allan Stephens [Thu, 6 Oct 2011 17:57:51 +0000 (13:57 -0400)]
tipc: Enable use by containers having their own network namespace

Permits a Linux container to use TIPC sockets even when it has its own
network namespace defined by removing the check that prohibits such use.
This makes it possible for users who wish to isolate their container
network traffic from normal network traffic to utilize TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agobonding: document undocumented active_slave sysfs entry.
Nicolas de PesloĂ¼an [Mon, 26 Dec 2011 13:35:24 +0000 (13:35 +0000)]
bonding: document undocumented active_slave sysfs entry.

v2, based on Jay's review.

I kept the 'link must be up' part, because this is enforced in the code.

Signed-off-by: Nicolas de PesloĂ¼an <nicolas.2p.debian@free.fr>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Kill useless route tracing bits in net/ipv6/route.c
David S. Miller [Mon, 26 Dec 2011 20:24:36 +0000 (15:24 -0500)]
ipv6: Kill useless route tracing bits in net/ipv6/route.c

RDBG() wasn't even used, and the messages printed by RT6_DEBUG() were
far from useful.  Just get rid of all this stuff, we can replace it
with something more suitable if we want.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomlx4: Add missing include of linux/slab.h
Axel Lin [Sun, 25 Dec 2011 23:35:34 +0000 (23:35 +0000)]
mlx4: Add missing include of linux/slab.h

Include linux/slab.h to fix below build error:

  CC      drivers/net/ethernet/mellanox/mlx4/resource_tracker.o
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_init_resource_tracker':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:233: error: implicit declaration of function 'kzalloc'
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:234: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_free_resource_tracker':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:264: error: implicit declaration of function 'kfree'
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_qp_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:370: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_mtt_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:386: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_mpt_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:402: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_eq_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:417: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_cq_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:431: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_srq_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:446: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_counter_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:461: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'add_res_range':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:521: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mac_add_to_slave':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:1193: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'add_mcg_res':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:2521: warning: assignment makes pointer from integer without a cast
make[5]: *** [drivers/net/ethernet/mellanox/mlx4/resource_tracker.o] Error 1
make[4]: *** [drivers/net/ethernet/mellanox/mlx4] Error 2
make[3]: *** [drivers/net/ethernet/mellanox] Error 2
make[2]: *** [drivers/net/ethernet] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agounix: If we happen to find peer NULL when diag dumping, write zero.
David S. Miller [Mon, 26 Dec 2011 19:41:55 +0000 (14:41 -0500)]
unix: If we happen to find peer NULL when diag dumping, write zero.

Otherwise we leave uninitialized kernel memory in there.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agounix_diag: Fix incoming connections nla length
Pavel Emelyanov [Mon, 26 Dec 2011 19:08:47 +0000 (14:08 -0500)]
unix_diag: Fix incoming connections nla length

The NLA_PUT macro should accept the actual attribute length, not
the amount of elements in array :(

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'nf-next' of git://1984.lsi.us.es/net-next
David S. Miller [Sun, 25 Dec 2011 07:21:45 +0000 (02:21 -0500)]
Merge branch 'nf-next' of git://1984.lsi.us.es/net-next

12 years agonetfilter: xtables: add nfacct match to support extended accounting
Pablo Neira Ayuso [Fri, 23 Dec 2011 13:28:59 +0000 (14:28 +0100)]
netfilter: xtables: add nfacct match to support extended accounting

This patch adds the match that allows to perform extended
accounting. It requires the new nfnetlink_acct infrastructure.

 # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
 # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: add extended accounting infrastructure over nfnetlink
Pablo Neira Ayuso [Fri, 23 Dec 2011 13:19:50 +0000 (14:19 +0100)]
netfilter: add extended accounting infrastructure over nfnetlink

We currently have two ways to account traffic in netfilter:

- iptables chain and rule counters:

 # iptables -L -n -v
Chain INPUT (policy DROP 3 packets, 867 bytes)
 pkts bytes target     prot opt in     out     source               destination
    8  1104 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0

- use flow-based accounting provided by ctnetlink:

 # conntrack -L
tcp      6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1

While trying to display real-time accounting statistics, we require
to pool the kernel periodically to obtain this information. This is
OK if the number of flows is relatively low. However, in case that
the number of flows is huge, we can spend a considerable amount of
cycles to iterate over the list of flows that have been obtained.

Moreover, if we want to obtain the sum of the flow accounting results
that match some criteria, we have to iterate over the whole list of
existing flows, look for matchings and update the counters.

This patch adds the extended accounting infrastructure for
nfnetlink which aims to allow displaying real-time traffic accounting
without the need of complicated and resource-consuming implementation
in user-space. Basically, this new infrastructure allows you to create
accounting objects. One accounting object is composed of packet and
byte counters.

In order to manipulate create accounting objects, you require the
new libnetfilter_acct library. It contains several examples of use:

libnetfilter_acct/examples# ./nfacct-add http-traffic
libnetfilter_acct/examples# ./nfacct-get
http-traffic = { pkts = 000000000000,   bytes = 000000000000 };

Then, you can use one of this accounting objects in several iptables
rules using the new nfacct match (which comes in a follow-up patch):

 # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
 # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

The idea is simple: if one packet matches the rule, the nfacct match
updates the counters.

Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and
providing feedback for this contribution.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agorfs: better sizing of dev_flow_table
Eric Dumazet [Sat, 24 Dec 2011 06:56:49 +0000 (06:56 +0000)]
rfs: better sizing of dev_flow_table

Aim of this patch is to provide full range of rps_flow_cnt on 64bit arches.

Theorical limit on number of flows is 2^32

Fix some buggy RPS/RFS macros as well.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
CC: Xi Wang <xi.wang@gmail.com>
CC: Laurent Chavey <chavey@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetfilter: ctnetlink: get and zero operations must be atomic
Pablo Neira Ayuso [Sat, 24 Dec 2011 13:11:39 +0000 (14:11 +0100)]
netfilter: ctnetlink: get and zero operations must be atomic

The get and zero operations have to be done in an atomic context,
otherwise counters added between them will be lost.

This problem was spotted by Changli Gao while discussing the
nfacct infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetlink: Undo const marker in netlink_is_kernel().
David S. Miller [Fri, 23 Dec 2011 22:33:03 +0000 (17:33 -0500)]
netlink: Undo const marker in netlink_is_kernel().

We can't do this without propagating the const to nlk_sk()
too, otherwise:

net/netlink/af_netlink.c: In function â€˜netlink_is_kernel’:
net/netlink/af_netlink.c:103:2: warning: passing argument 1 of â€˜nlk_sk’ discards â€˜const’ qualifier from pointer target type [enabled by default]
net/netlink/af_netlink.c:96:36: note: expected â€˜struct sock *’ but argument is of type â€˜const struct sock *’

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Fri, 23 Dec 2011 22:13:56 +0000 (17:13 -0500)]
Merge git://git./linux/kernel/git/davem/net

Conflicts:
net/bluetooth/l2cap_core.c

Just two overlapping changes, one added an initialization of
a local variable, and another change added a new local variable.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetem: loss model API sizes
stephen hemminger [Fri, 23 Dec 2011 09:16:30 +0000 (09:16 +0000)]
netem: loss model API sizes

The new netem loss model is configured with nested netlink messages.
This code is being overly strict about sizes, and is easily confused
by padding (or possible future expansion). Also message
for gemodel is incorrect.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosch_hfsc: report backlog information
Eric Dumazet [Fri, 23 Dec 2011 05:19:20 +0000 (05:19 +0000)]
sch_hfsc: report backlog information

Add backlog (byte count) information in hfsc classes and qdisc, so that
"tc -s" can report it to user, instead of 0 values :

qdisc hfsc 1: root refcnt 6 default 20
 Sent 45141660 bytes 30545 pkt (dropped 0, overlimits 91751 requeues 0)
 rate 1492Kbit 126pps backlog 103226b 74p requeues 0
...
class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit
 Sent 49534912 bytes 33519 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 81822b 56p requeues 0
 period 23 work 49451576 bytes rtwork 13277552 bytes level 0
...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: John A. Sullivan III <jsullivan@opensourcedevel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agousb: pegasus: cleanup a couple conditions
Dan Carpenter [Fri, 23 Dec 2011 00:44:36 +0000 (00:44 +0000)]
usb: pegasus: cleanup a couple conditions

We recently made loopback a bool type instead of an int, so the bitwise
AND is redundent.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: only use a single page of slop in MAX_SKB_FRAGS
Ian Campbell [Thu, 22 Dec 2011 23:39:14 +0000 (23:39 +0000)]
net: only use a single page of slop in MAX_SKB_FRAGS

In order to accommodate a 64K buffer we need 64K/PAGE_SIZE plus one more page
in order to allow for a buffer which does not start on a page boundary.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net/usb/asix: fixed asix_get_wol reported wrong wol status issue
allan [Thu, 22 Dec 2011 20:38:51 +0000 (20:38 +0000)]
drivers/net/usb/asix: fixed asix_get_wol reported wrong wol status issue

Fixed the asix_get_wol() routine reported wrong wol status issue.

Signed-off-by: Allan Chou <allan@asix.com.tw>
Tested-by: Eugene <elubarsky@gmail.com>; Allan Chou <allan@asix.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopacket: fix typo in packet_mmap.txt
Wei Yongjun [Thu, 22 Dec 2011 17:47:54 +0000 (17:47 +0000)]
packet: fix typo in packet_mmap.txt

Just fixed typo of sample code in packet_mmap.txt

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobna: Add debugfs interface.
Krishna Gudipati [Thu, 22 Dec 2011 13:30:19 +0000 (13:30 +0000)]
bna: Add debugfs interface.

Change details:
- Add debugfs support to obtain firmware trace, saved firmware trace on
  an IOC crash, driver info and read/write to registers.

- debugfs hierarchy:
  bna/pci_dev:<pci_name>
  where the pci_name corresponds to the one under /sys/bus/pci/drivers/bna

- Following are the new debugfs entries added:
  fwtrc: collect current firmware trace.
  fwsave: collect last saved fw trace as a result of firmware crash.
  regwr: write one word to chip register
  regrd: read one or more words from chip register.
  drvinfo: collect the driver information.

Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobna: Added flash sub-module and ethtool eeprom entry points.
Krishna Gudipati [Thu, 22 Dec 2011 13:29:45 +0000 (13:29 +0000)]
bna: Added flash sub-module and ethtool eeprom entry points.

Change details:
- The patch adds flash sub-module to the bna driver.
- Added ethtool set_eeprom() and get_eeprom() entry points to
  support flash partition read/write operations.

Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'nf' of git://1984.lsi.us.es/net
David S. Miller [Fri, 23 Dec 2011 19:29:20 +0000 (14:29 -0500)]
Merge branch 'nf' of git://1984.lsi.us.es/net

12 years agostmmac: fix missing module license in the main.
Giuseppe Cavallaro [Fri, 23 Dec 2011 19:21:20 +0000 (14:21 -0500)]
stmmac: fix missing module license in the main.

This patch fixes the following warning raised
when compile:

WARNING: modpost: missing MODULE_LICENSE()
in drivers/net/ethernet/stmicro/stmmac/stmmac.o

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetfilter: xt_connbytes: handle negation correctly
Florian Westphal [Fri, 16 Dec 2011 17:35:15 +0000 (18:35 +0100)]
netfilter: xt_connbytes: handle negation correctly

"! --connbytes 23:42" should match if the packet/byte count is not in range.

As there is no explict "invert match" toggle in the match structure,
userspace swaps the from and to arguments
(i.e., as if "--connbytes 42:23" were given).

However, "what <= 23 && what >= 42" will always be false.

Change things so we use "||" in case "from" is larger than "to".

This change may look like it breaks backwards compatibility when "to" is 0.
However, older iptables binaries will refuse "connbytes 42:0",
and current releases treat it to mean "! --connbytes 0:42",
so we should be fine.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: ctnetlink: remove dead NAT code
Patrick McHardy [Fri, 23 Dec 2011 13:01:36 +0000 (14:01 +0100)]
netfilter: ctnetlink: remove dead NAT code

The NAT range to nlattr conversation callbacks and helpers are entirely
dead code and are also useless since there are no NAT ranges in conntrack
context, they are only used for initially selecting a tuple. The final NAT
information is contained in the selected tuples of the conntrack entry.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_nat: remove obsolete check in nf_nat_mangle_udp_packet()
Patrick McHardy [Fri, 23 Dec 2011 13:01:26 +0000 (14:01 +0100)]
netfilter: nf_nat: remove obsolete check in nf_nat_mangle_udp_packet()

The packet size check originates from a time when UDP helpers could
accidentally mangle incorrect packets (NEWNAT) and is unnecessary
nowadays since the conntrack helpers invoke the NAT helpers for the
proper packet directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_nat: remove obsolete code from nf_nat_icmp_reply_translation()
Patrick McHardy [Fri, 23 Dec 2011 13:01:03 +0000 (14:01 +0100)]
netfilter: nf_nat: remove obsolete code from nf_nat_icmp_reply_translation()

The inner tuple that is extracted from the packet is unused. The code also
doesn't have any useful side-effects like verifying the packet does contain
enough data to extract the inner tuple since conntrack already does the
same, so remove it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nat: remove module reference counting from NAT protocols
Patrick McHardy [Fri, 23 Dec 2011 13:00:49 +0000 (14:00 +0100)]
netfilter: nat: remove module reference counting from NAT protocols

The only remaining user of NAT protocol module reference counting is NAT
ctnetlink support. Since this is a fairly short sequence of code, convert
over to use RCU and remove module reference counting.

Module unregistration is already protected by RCU using synchronize_rcu(),
so no further changes are necessary.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_nat: add missing nla_policy entry for CTA_NAT_PROTO attribute
Patrick McHardy [Fri, 23 Dec 2011 13:00:30 +0000 (14:00 +0100)]
netfilter: nf_nat: add missing nla_policy entry for CTA_NAT_PROTO attribute

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_nat: use hash random for bysource hash
Patrick McHardy [Fri, 23 Dec 2011 13:00:13 +0000 (14:00 +0100)]
netfilter: nf_nat: use hash random for bysource hash

Use nf_conntrack_hash_rnd in NAT bysource hash to avoid hash chain attacks.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_nat: export NAT definitions to userspace
Patrick McHardy [Fri, 23 Dec 2011 12:59:49 +0000 (13:59 +0100)]
netfilter: nf_nat: export NAT definitions to userspace

Export the NAT definitions to userspace. So far userspace (specifically,
iptables) has been copying the headers files from include/net. Also
rename some structures and definitions in preparation for IPv6 NAT.
Since these have never been officially exported, this doesn't affect
existing userspace code.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: rework user-space expectation helper support
Pablo Neira Ayuso [Sun, 18 Dec 2011 00:55:54 +0000 (01:55 +0100)]
netfilter: rework user-space expectation helper support

This partially reworks bc01befdcf3e40979eb518085a075cbf0aacede0
which added userspace expectation support.

This patch removes the nf_ct_userspace_expect_list since now we
force to use the new iptables CT target feature to add the helper
extension for conntracks that have attached expectations from
userspace.

A new version of the proof-of-concept code to implement userspace
helpers from userspace is available at:

http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-POC.tar.bz2

This patch also modifies the CT target to allow to set the
conntrack's userspace helper status flags. This flag is used
to tell the conntrack system to explicitly allocate the helper
extension.

This helper extension is useful to link the userspace expectations
with the master conntrack that is being tracked from one userspace
helper.

This feature fixes a problem in the current approach of the
userspace helper support. Basically, if the master conntrack that
has got a userspace expectation vanishes, the expectations point to
one invalid memory address. Thus, triggering an oops in the
expectation deletion event path.

I decided not to add a new revision of the CT target because
I only needed to add a new flag for it. I'll document in this
issue in the iptables manpage. I have also changed the return
value from EINVAL to EOPNOTSUPP if one flag not supported is
specified. Thus, in the future adding new features that only
require a new flag can be added without a new revision.

There is no official code using this in userspace (apart from
the proof-of-concept) that uses this infrastructure but there
will be some by beginning 2012.

Reported-by: Sam Roberts <vieuxtech@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonet: relax rcvbuf limits
Eric Dumazet [Wed, 21 Dec 2011 07:11:44 +0000 (07:11 +0000)]
net: relax rcvbuf limits

skb->truesize might be big even for a small packet.

Its even bigger after commit 87fb4b7b533 (net: more accurate skb
truesize) and big MTU.

We should allow queueing at least one packet per receiver, even with a
low RCVBUF setting.

Reported-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetlink: wake up netlink listeners sooner (v2)
stephen hemminger [Thu, 22 Dec 2011 08:52:03 +0000 (08:52 +0000)]
netlink: wake up netlink listeners sooner (v2)

This patch changes it to yield sooner at halfway instead. Still not a cure-all
for listener overrun if listner is slow, but works much reliably.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetlink: af_netlink cleanup (v2)
stephen hemminger [Thu, 22 Dec 2011 08:52:02 +0000 (08:52 +0000)]
netlink: af_netlink cleanup (v2)

Don't inline functions that cover several lines, and do inline
the trivial ones. Also make some arguments const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoPartial revert "Basic kernel memory functionality for the Memory Controller"
Glauber Costa [Thu, 22 Dec 2011 01:02:27 +0000 (01:02 +0000)]
Partial revert "Basic kernel memory functionality for the Memory Controller"

This reverts commit e5671dfae59b165e2adfd4dfbdeab11ac8db5bda.

After a follow up discussion with Michal, it was agreed it would
be better to leave the kmem controller with just the tcp files,
deferring the behavior of the other general memory.kmem.* files
for a later time, when more caches are controlled. This is because
generic kmem files are not used by tcp accounting and it is
not clear how other slab caches would fit into the scheme.

We are reverting the original commit so we can track the reference.
Part of the patch is kept, because it was used by the later tcp
code. Conflicts are shown in the bottom. init/Kconfig is removed from
the revert entirely.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
CC: Kirill A. Shutemov <kirill@shutemov.name>
CC: Paul Menage <paul@paulmenage.org>
CC: Greg Thelen <gthelen@google.com>
CC: Johannes Weiner <jweiner@redhat.com>
CC: David S. Miller <davem@davemloft.net>
Conflicts:

Documentation/cgroups/memory.txt
mm/memcontrol.c
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agorps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()
Xi Wang [Thu, 22 Dec 2011 13:35:22 +0000 (13:35 +0000)]
rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()

Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will
cause a kernel oops due to insufficient bounds checking.

if (count > 1<<30) {
/* Enforce a limit to prevent overflow */
return -EINVAL;
}
count = roundup_pow_of_two(count);
table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));

Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as:

... + (count * sizeof(struct rps_dev_flow))

where sizeof(struct rps_dev_flow) is 8.  (1 << 30) * 8 will overflow
32 bits.

This patch replaces the magic number (1 << 30) with a symbolic bound.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: introduce DST_NOPEER dst flag
Eric Dumazet [Thu, 22 Dec 2011 04:15:53 +0000 (04:15 +0000)]
net: introduce DST_NOPEER dst flag

Chris Boot reported crashes occurring in ipv6_select_ident().

[  461.457562] RIP: 0010:[<ffffffff812dde61>]  [<ffffffff812dde61>]
ipv6_select_ident+0x31/0xa7

[  461.578229] Call Trace:
[  461.580742] <IRQ>
[  461.582870]  [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2
[  461.589054]  [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155
[  461.595140]  [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b
[  461.601198]  [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e
[nf_conntrack_ipv6]
[  461.608786]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[  461.614227]  [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543
[  461.620659]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[  461.626440]  [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a
[bridge]
[  461.633581]  [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459
[  461.639577]  [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76
[bridge]
[  461.646887]  [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f
[bridge]
[  461.653997]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[  461.659473]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[  461.665485]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[  461.671234]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[  461.677299]  [<ffffffffa0379215>] ?
nf_bridge_update_protocol+0x20/0x20 [bridge]
[  461.684891]  [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
[  461.691520]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[  461.697572]  [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56
[bridge]
[  461.704616]  [<ffffffffa0379031>] ?
nf_bridge_push_encap_header+0x1c/0x26 [bridge]
[  461.712329]  [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95
[bridge]
[  461.719490]  [<ffffffffa037900a>] ?
nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
[  461.727223]  [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
[  461.734292]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[  461.739758]  [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
[  461.746203]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[  461.751950]  [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
[  461.758378]  [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56
[bridge]

This is caused by bridge netfilter special dst_entry (fake_rtable), a
special shared entry, where attaching an inetpeer makes no sense.

Problem is present since commit 87c48fa3b46 (ipv6: make fragment
identifications less predictable)

Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
__ip_select_ident() fallback to the 'no peer attached' handling.

Reported-by: Chris Boot <bootc@bootc.net>
Tested-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomqprio: Avoid panic if no options are provided
Thomas Graf [Thu, 22 Dec 2011 02:05:07 +0000 (02:05 +0000)]
mqprio: Avoid panic if no options are provided

Userspace may not provide TCA_OPTIONS, in fact tc currently does
so not do so if no arguments are specified on the command line.
Return EINVAL instead of panicing.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobridge: provide a mtu() method for fake_dst_ops
Eric Dumazet [Wed, 21 Dec 2011 20:00:32 +0000 (20:00 +0000)]
bridge: provide a mtu() method for fake_dst_ops

Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol
depended handlers) forgot the bridge netfilter case, adding a NULL
dereference in ip_fragment().

Reported-by: Chris Boot <bootc@bootc.net>
CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Thu, 22 Dec 2011 20:59:47 +0000 (12:59 -0800)]
Merge branch 'usb-linus' of git://git./linux/kernel/git/gregkh/usb

* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: Fix usb/isp1760 build on sparc
  usb: gadget: epautoconf: do not change number of streams
  usb: dwc3: core: fix cached revision on our structure
  usb: musb: fix reset issue with full speed device

12 years agoMerge branch 'upstream-linus' of git://github.com/jgarzik/libata-dev
Linus Torvalds [Thu, 22 Dec 2011 20:53:32 +0000 (12:53 -0800)]
Merge branch 'upstream-linus' of git://github.com/jgarzik/libata-dev

* 'upstream-linus' of git://github.com/jgarzik/libata-dev:
  pata_of_platform: Add missing CONFIG_OF_IRQ dependency.

12 years agopata_of_platform: Add missing CONFIG_OF_IRQ dependency.
David Miller [Wed, 21 Dec 2011 22:38:10 +0000 (17:38 -0500)]
pata_of_platform: Add missing CONFIG_OF_IRQ dependency.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
12 years agoipv4: using prefetch requires including prefetch.h
Stephen Rothwell [Thu, 22 Dec 2011 06:03:29 +0000 (17:03 +1100)]
ipv4: using prefetch requires including prefetch.h

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 22 Dec 2011 02:29:26 +0000 (18:29 -0800)]
Merge git://git./linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: Add a flow_cache_flush_deferred function
  ipv4: reintroduce route cache garbage collector
  net: have ipconfig not wait if no dev is available
  sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
  asix: new device id
  davinci-cpdma: fix locking issue in cpdma_chan_stop
  sctp: fix incorrect overflow check on autoclose
  r8169: fix Config2 MSIEnable bit setting.
  llc: llc_cmsg_rcv was getting called after sk_eat_skb.
  net: bpf_jit: fix an off-one bug in x86_64 cond jump target
  iwlwifi: update SCD BC table for all SCD queues
  Revert "Bluetooth: Revert: Fix L2CAP connection establishment"
  Bluetooth: Clear RFCOMM session timer when disconnecting last channel
  Bluetooth: Prevent uninitialized data access in L2CAP configuration
  iwlwifi: allow to switch to HT40 if not associated
  iwlwifi: tx_sync only on PAN context
  mwifiex: avoid double list_del in command cancel path
  ath9k: fix max phy rate at rate control init
  nfc: signedness bug in __nci_request()
  iwlwifi: do not set the sequence control bit is not needed

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Thu, 22 Dec 2011 02:29:05 +0000 (18:29 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: atmel/ac97c: using software reset instead hardware reset if not available

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
Linus Torvalds [Thu, 22 Dec 2011 02:28:52 +0000 (18:28 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sameo/mfd-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Include linux/io.h to jz4740-adc
  mfd: Use request_threaded_irq for twl4030-irq instead of irq_set_chained_handler
  mfd: Base interrupt for twl4030-irq must be one-shot
  mfd: Handle tps65910 clear-mask correctly
  mfd: add #ifdef CONFIG_DEBUG_FS guard for ab8500_debug_resources
  mfd: Fix twl-core oops while calling twl_i2c_* for unbound driver
  mfd: include linux/module.h for ab5500-debugfs
  mfd: Update wm8994 active device checks for WM1811
  mfd: Set tps6586x bits if new value is different from the old one
  mfd: Set da903x bits if new value is different from the old one
  mfd: Set adp5520 bits if new value is different from the old one
  mfd: Add missed free_irq in da903x_remove

12 years agovfs: __read_cache_page should use gfp argument rather than GFP_KERNEL
Dave Kleikamp [Wed, 21 Dec 2011 17:05:48 +0000 (11:05 -0600)]
vfs: __read_cache_page should use gfp argument rather than GFP_KERNEL

lockdep reports a deadlock in jfs because a special inode's rw semaphore
is taken recursively.  The mapping's gfp mask is GFP_NOFS, but is not
used when __read_cache_page() calls add_to_page_cache_lru().

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'for-greg' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb...
Greg Kroah-Hartman [Wed, 21 Dec 2011 22:42:17 +0000 (14:42 -0800)]
Merge branch 'for-greg' of git://git./linux/kernel/git/balbi/usb into usb-linus

* 'for-greg' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb:
  usb: gadget: epautoconf: do not change number of streams
  usb: dwc3: core: fix cached revision on our structure
  usb: musb: fix reset issue with full speed device

12 years agoUSB: Fix usb/isp1760 build on sparc
David Miller [Wed, 21 Dec 2011 22:31:54 +0000 (17:31 -0500)]
USB: Fix usb/isp1760 build on sparc

This commit:

commit 8f5d621543cb064d2989fc223d3c2bc61a43981e
Author: Joachim Foerster <joachim.foerster@missinglinkelectronics.com>
Date:   Mon Oct 10 18:06:54 2011 +0200

    usb/isp1760: Let OF bindings depend on general CONFIG_OF instead of PPC_OF .

    To be able to use the driver on other OF-aware architectures, too.
    And add necessary OF related #includes to fix compilation error.

Signed-off-by: Joachim Foerster <joachim.foerster@missinglinkelectronics.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
enabled the build on all CONFIG_OF architectures, but it cannot do
this.

This driver depends upon CONFIG_OF_IRQ but not all CONFIG_OF platforms
support that infrastructure, in particular Sparc does not so the
build fails.

Please push a patch like the following to Linus so that this code only
gets built where it actually should.

--------------------
usb/isp1760: Add missing CONFIG_OF_IRQ dependency on OF code.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
12 years agonet: Add a flow_cache_flush_deferred function
Steffen Klassert [Wed, 21 Dec 2011 21:48:08 +0000 (16:48 -0500)]
net: Add a flow_cache_flush_deferred function

flow_cach_flush() might sleep but can be called from
atomic context via the xfrm garbage collector. So add
a flow_cache_flush_deferred() function and use this if
the xfrm garbage colector is invoked from within the
packet path.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: reintroduce route cache garbage collector
Eric Dumazet [Wed, 21 Dec 2011 20:47:16 +0000 (15:47 -0500)]
ipv4: reintroduce route cache garbage collector

Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
removed IP route cache garbage collector a bit too soon, as this gc was
responsible for expired routes cleanup, releasing their neighbour
reference.

As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
their neighbour cache.

Reintroduce the garbage collection, since we'll have to wait our
neighbour lookups become refcount-less to not depend on this stuff.

Reported-by: Robert Gladewitz <gladewitz@gmx.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoirda: use msecs_to_jiffies() rather than manual calculation
Xi Wang [Wed, 21 Dec 2011 02:57:16 +0000 (02:57 +0000)]
irda: use msecs_to_jiffies() rather than manual calculation

Also use mod_timer() instead of direct assignment to "expires".

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agostmmac: update the driver's documentation (Dec-2011)
Giuseppe CAVALLARO [Wed, 21 Dec 2011 03:58:20 +0000 (03:58 +0000)]
stmmac: update the driver's documentation (Dec-2011)

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agostmmac: add the experimental PCI support
Giuseppe CAVALLARO [Wed, 21 Dec 2011 03:58:19 +0000 (03:58 +0000)]
stmmac: add the experimental PCI support

This patch adds the PCI support (as EXPERIMENTAL)
this has been also tested on XLINX XC2V3000 FF1152AMT0221
D1215994A VIRTEX FPGA board.
To support the PCI bus the main part has been reworked
and both the platform and the PCI specific parts have
been moved into different files.

Signed-off-by: Rayagond Kokatanur <rayagond@vayavyalabs.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosch_sfq: rehash queues in perturb timer
Eric Dumazet [Wed, 21 Dec 2011 03:30:11 +0000 (03:30 +0000)]
sch_sfq: rehash queues in perturb timer

A known Out Of Order (OOO) problem hurts SFQ when timer changes
perturbation value, since all new packets delivered to SFQ enqueue might
end on different slots than previous in-flight packets.

With round robin delivery, we can thus deliver packets in a different
order.

Since SFQ is limited to small amount of in-flight packets, we can rehash
packets so that this OOO problem is fixed.

This rehashing is performed only if internal flow classifier is in use.

We now store in skb->cb[] the "struct flow_keys" so that we dont call
skb_flow_dissect() again while rehashing.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: ethernet: xilinx: Don't use NO_IRQ in xilinx
Michal Simek [Wed, 21 Dec 2011 20:42:50 +0000 (15:42 -0500)]
net: ethernet: xilinx: Don't use NO_IRQ in xilinx

Fix ll_temac and emaclite drivers. Only Microblaze and Xilinx PPC
use then and both use NO_IRQ as 0. It will be removed in near future.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>