Roel Kluin [Fri, 1 Feb 2008 01:08:47 +0000 (17:08 -0800)]
[PKT_SCHED] sch_teql.c: Duplicate IFF_BROADCAST in FMASK, remove 2nd.
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 1 Feb 2008 01:07:21 +0000 (17:07 -0800)]
[BNX2]: Fix ASYM PAUSE advertisement for remote PHY.
We were checking for the ASYM_PAUSE bit for 1000Base-X twice instead
checking for both the 1000Base-X bit and the 10/100/1000Base-T bit.
The purpose of the logic is to tell the firmware that ASYM_PAUSE is
set on either the Serdes or Copper interface.
Problem was discovered by Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 1 Feb 2008 01:05:09 +0000 (17:05 -0800)]
[IPV4] route cache: Introduce rt_genid for smooth cache invalidation
Current ip route cache implementation is not suited to large caches.
We can consume a lot of CPU when cache must be invalidated, since we
currently need to evict all cache entries, and this eviction is
sometimes asynchronous. min_delay & max_delay can somewhat control this
asynchronism behavior, but whole thing is a kludge, regularly triggering
infamous soft lockup messages. When entries are still in use, this also
consumes a lot of ram, filling dst_garbage.list.
A better scheme is to use a generation identifier on each entry,
so that cache invalidation can be performed by changing the table
identifier, without having to scan all entries.
No more delayed flushing, no more stalling when secret_interval expires.
Invalidated entries will then be freed at GC time (controled by
ip_rt_gc_timeout or stress), or when an invalidated entry is found
in a chain when an insert is done.
Thus we keep a normal equilibrium.
This patch :
- renames rt_hash_rnd to rt_genid (and makes it an atomic_t)
- Adds a new rt_genid field to 'struct rtable' (filling a hole on 64bit)
- Checks entry->rt_genid at appropriate places :
Jesse Brandeburg [Fri, 1 Feb 2008 00:59:47 +0000 (16:59 -0800)]
[PKTGEN]: pktgen should not print info that it is spinning
when using pktgen to send delay packets the module prints repeatedly
to the kernel log:
sleeping for X
sleeping for X
...
This is probably just a debugging item left in and should not be
enabled for regular use of the module.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Fri, 1 Feb 2008 00:57:15 +0000 (16:57 -0800)]
[NET_SCHED]: sch_ingress: remove netfilter support
Since the old policer code is gone, TC actions are needed for policing.
The ingress qdisc can get packets directly from netif_receive_skb()
in case TC actions are enabled or through netfilter otherwise, but
since without TC actions there is no policer the only thing it actually
does is count packets.
Remove the netfilter support and always require TC actions.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Fri, 1 Feb 2008 00:56:03 +0000 (16:56 -0800)]
[MACVLAN]: Setting macvlan_handle_frame_hook to NULL when rtnl_link_register() fails.
In drivers/net/macvlan.c, when rtnl_link_register() fails in
macvlan_init_module(), there is no point to set it (second time in
this method) to macvlan_handle_frame; macvlan_init_module() will
return a negative number, so instead this patch sets
macvlan_handle_frame_hook to NULL.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Leech [Fri, 1 Feb 2008 00:53:23 +0000 (16:53 -0800)]
[VLAN]: set_rx_mode support for unicast address list
Reuse the existing logic for multicast list synchronization for the
unicast address list. The core of dev_mc_sync/unsync are split out as
__dev_addr_sync/unsync and moved from dev_mcast.c to dev.c. These are
then used to implement dev_unicast_sync/unsync as well.
I'm working on cleaning up Intel's FCoE stack, which generates new MAC
addresses from the fibre channel device id assigned by the fabric as
per the current draft specification in T11. When using such a
protocol in a VLAN environment it would be nice to not always be
forced into promiscuous mode, assuming the underlying Ethernet driver
supports multiple unicast addresses as well.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Shan Wei [Fri, 1 Feb 2008 00:47:27 +0000 (16:47 -0800)]
[TCP]: Fix a bug in strategy_allowed_congestion_control
In strategy_allowed_congestion_control of the 2.6.24 kernel, when
sysctl_string return 1 on success,it should call
tcp_set_allowed_congestion_control to set the allowed congestion
control.But, it don't. the sysctl_string return 1 on success,
otherwise return negative, never return 0.The patch fix the problem.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Fri, 1 Feb 2008 00:45:47 +0000 (16:45 -0800)]
[IPV4] fib_trie: rescan if key is lost during dump
Normally during a dump the key of the last dumped entry is used for
continuation, but since lock is dropped it might be lost. In that case
fallback to the old counter based N^2 behaviour. This means the dump
will end up skipping some routes which matches what FIB_HASH does.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Fri, 1 Feb 2008 00:42:23 +0000 (16:42 -0800)]
[PKTGEN]: Remove an unused definition in pktgen.c.
- Remove an unused definition (LAT_BUCKETS_MAX) in net/core/pktgen.c.
- Remove the corresponding comment.
- The LAT_BUCKETS_MAX seems to have to do with a patch from a long
time ago which was not applied (Ben Greear), which dealt with latency
counters.
See, for example : http://oss.sgi.com/archives/netdev/2002-09/msg00184.html
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jim Paris [Fri, 1 Feb 2008 00:36:25 +0000 (16:36 -0800)]
[IPV6]: Update MSS even if MTU is unchanged.
This is needed because in ndisc.c, we have:
static void ndisc_router_discovery(struct sk_buff *skb)
{
// ...
if (ndopts.nd_opts_mtu) {
// ...
if (rt)
rt->u.dst.metrics[RTAX_MTU-1] = mtu;
rt6_mtu_change(skb->dev, mtu);
// ...
}
Since the mtu is set directly here, rt6_mtu_change_route thinks that
it is unchanged, and so it fails to update the MSS accordingly. This
patch lets rt6_mtu_change_route still update MSS if old_mtu == new_mtu.
Signed-off-by: Jim Paris <jim@jtan.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 31 Jan 2008 13:07:57 +0000 (05:07 -0800)]
[NETNS]: Udp sockets per-net lookup.
Add the net parameter to udp_get_port family of calls and
udp_lookup one and use it to filter sockets.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 31 Jan 2008 13:07:21 +0000 (05:07 -0800)]
[NETNS]: Tcp-v6 sockets per-net lookup.
Add a net argument to inet6_lookup and propagate it further.
Actually, this is tcp-v6 implementation of what was done for
tcp-v4 sockets in a previous patch.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 31 Jan 2008 13:06:40 +0000 (05:06 -0800)]
[NETNS]: Tcp-v4 sockets per-net lookup.
Add a net argument to inet_lookup and propagate it further
into lookup calls. Plus tune the __inet_check_established.
The dccp and inet_diag, which use that lookup functions
pass the init_net into them.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 31 Jan 2008 13:05:50 +0000 (05:05 -0800)]
[NETNS]: Make bind buckets live in net namespaces.
This tags the inet_bind_bucket struct with net pointer,
initializes it during creation and makes a filtering
during lookup.
A better hashfn, that takes the net into account is to
be done in the future, but currently all bind buckets
with similar port will be in one hash chain.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 31 Jan 2008 13:04:45 +0000 (05:04 -0800)]
[INET]: Consolidate inet(6)_hash_connect.
These two functions are the same except for what they call
to "check_established" and "hash" for a socket.
This saves half-a-kilo for ipv4 and ipv6.
add/remove: 1/0 grow/shrink: 1/4 up/down: 582/-1128 (-546)
function old new delta
__inet_hash_connect - 577 +577
arp_ignore 108 113 +5
static.hint 8 4 -4
rt_worker_func 376 372 -4
inet6_hash_connect 584 25 -559
inet_hash_connect 586 25 -561
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 31 Jan 2008 13:03:27 +0000 (05:03 -0800)]
[IPV6]: Introduce the INET6_TW_MATCH macro.
We have INET_MATCH, INET_TW_MATCH and INET6_MATCH to test sockets and
twbuckets for matching, but ipv6 twbuckets are tested manually.
Here's the INET6_TW_MATCH to help with it.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:55:29 +0000 (04:55 -0800)]
[NETFILTER]: xt_iprange: fix sparse warnings
CHECK net/netfilter/xt_iprange.c
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:55:07 +0000 (04:55 -0800)]
[NETFILTER]: nf_nat: fix sparse warning
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:54:45 +0000 (04:54 -0800)]
[NETFILTER]: nf_conntrack: fix sparse warning
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:54:18 +0000 (04:54 -0800)]
[NETFILTER]: {ip,ip6}_queue: fix build error
Reported by Ingo Molnar:
net/built-in.o: In function `ip_queue_init':
ip_queue.c:(.init.text+0x322c): undefined reference to `net_ipv4_ctl_path'
Fix the build error and also handle CONFIG_PROC_FS=n properly.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:53:24 +0000 (04:53 -0800)]
[NETFILTER]: nf_conntrack: annotate l3protos with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:53:05 +0000 (04:53 -0800)]
[NETFILTER]: nf_{conntrack,nat}_icmp: constify and annotate
Constify a few data tables use const qualifiers on variables where
possible in the nf_conntrack_icmp* sources.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:52:46 +0000 (04:52 -0800)]
[NETFILTER]: nf_{conntrack,nat}_proto_gre: annotate with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:52:29 +0000 (04:52 -0800)]
[NETFILTER]: nf_{conntrack,nat}_proto_udp{,lite}: annotate with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:52:07 +0000 (04:52 -0800)]
[NETFILTER]: nf_{conntrack,nat}_proto_tcp: constify and annotate TCP modules
Constify a few data tables use const qualifiers on variables where
possible in the nf_*_proto_tcp sources.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:51:45 +0000 (04:51 -0800)]
[NETFILTER]: nf_conntrack_sane: annotate SANE helper with const
Annotate nf_conntrack_sane variables with const qualifier and remove
a few casts.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:51:23 +0000 (04:51 -0800)]
[NETFILTER]: nf_{conntrack,nat}_pptp: annotate PPtP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:50:51 +0000 (04:50 -0800)]
[NETFILTER]: nf_{conntrack,nat}_tftp: annotate TFTP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:50:25 +0000 (04:50 -0800)]
[NETFILTER]: nf_{conntrack,nat}_sip: annotate SIP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:50:05 +0000 (04:50 -0800)]
[NETFILTER]: nf_conntrack_h323: constify and annotate H.323 helper
Constify data tables (predominantly in nf_conntrack_h323_types.c, but
also a few in nf_conntrack_h323_asn1.c) and use const qualifiers on
variables where possible in the h323 sources.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:49:35 +0000 (04:49 -0800)]
[NETFILTER]: x_tables: create per-netns /proc/net/*_tables_*
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:49:16 +0000 (04:49 -0800)]
[NETFILTER]: x_tables: netns propagation for /proc/net/*_tables_names
Propagate netns together with AF down to ->start/->next/->stop
iterators. Choose table based on netns and AF for showing.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:48:54 +0000 (04:48 -0800)]
[NETFILTER]: x_tables: semi-rewrite of /proc/net/foo_tables_*
There are many small but still wrong things with /proc/net/*_tables_*
so I decided to do overhaul simultaneously making it more suitable for
per-netns /proc/net/*_tables_* implementation.
Fix
a) xt_get_idx() duplicating now standard seq_list_start/seq_list_next
iterators
b) tables/matches/targets list was chosen again and again on every ->next
c) multiple useless "af >= NPROTO" checks -- we simple don't supply invalid
AFs there and registration function should BUG_ON instead.
Regardless, the one in ->next() is the most useless -- ->next doesn't
run at all if ->start fails.
d) Don't use mutex_lock_interruptible() -- it can fail and ->stop is
executed even if ->start failed, so unlock without lock is possible.
As side effect, streamline code by splitting xt_tgt_ops into xt_target_ops,
xt_matches_ops, xt_tables_ops.
xt_tables_ops hooks will be changed by per-netns code. Code of
xt_matches_ops, xt_target_ops is identical except the list chosen for
iterating, but I think consolidating code for two files not worth it
given "<< 16" hacks needed for it.
[Patrick: removed unused enum in x_tables.c]
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:48:13 +0000 (04:48 -0800)]
[NETFILTER]: xt_hashlimit match, revision 1
Introduces the xt_hashlimit match revision 1. It adds support for
kernel-level inversion and grouping source and/or destination IP
addresses, allowing to limit on a per-subnet basis. While this would
technically obsolete xt_limit, xt_hashlimit is a more expensive due
to the hashbucketing.
Kernel-level inversion: Previously you had to do user-level inversion:
iptables -N foo
iptables -A foo -m hashlimit --hashlimit(-upto) 5/s -j RETURN
iptables -A foo -j DROP
iptables -A INPUT -j foo
now it is simpler:
iptables -A INPUT -m hashlimit --hashlimit-over 5/s -j DROP
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 31 Jan 2008 12:47:35 +0000 (04:47 -0800)]
[NETFILTER]: nf_conntrack: kill unused static inline (do_iter)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 31 Jan 2008 12:46:02 +0000 (04:46 -0800)]
[NETFILTER]: ipt_CLUSTERIP: kill clusterip_config_entry_get
It's unused static inline.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Leblond [Thu, 31 Jan 2008 12:44:27 +0000 (04:44 -0800)]
[NETFILTER]: nf_conntrack_netlink: transmit mark during all events
The following feature was submitted some months ago. It forces the dump
of mark during the connection destruction event. The induced load is
quiet small and the patch is usefull to provide an easy way to filter
event on user side without having to keep an hash in userspace.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:43:53 +0000 (04:43 -0800)]
[NETFILTER]: nf_conntrack_h323: clean up code a bit
-total: 81 errors, 3 warnings, 876 lines checked
+total: 44 errors, 3 warnings, 876 lines checked
There is still work to be done, but that's for another patch.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:43:06 +0000 (04:43 -0800)]
[NETFILTER]: nf_nat: switch rwlock to spinlock
Since we're using RCU, all users of nf_nat_lock take a write_lock.
Switch it to a spinlock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:42:37 +0000 (04:42 -0800)]
[NETFILTER]: nf_nat: use RCU for bysource hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:42:11 +0000 (04:42 -0800)]
[NETFILTER]: nf_conntrack: naming unification
Rename all "conntrack" variables to "ct" for more consistency and
avoiding some overly long lines.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:41:44 +0000 (04:41 -0800)]
[NETFILTER]: nf_conntrack: don't inline early_drop()
early_drop() is only called *very* rarely, unfortunately gcc inlines it
into the hotpath because there is only a single caller. Explicitly mark
it noinline.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:41:14 +0000 (04:41 -0800)]
[NETFILTER]: nf_conntrack: reorder struct nf_conntrack_l4proto
Reorder struct nf_conntrack_l4proto so all members used during packet
processing are in the same cacheline.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:40:52 +0000 (04:40 -0800)]
[NETFILTER]: nf_conntrack: optimize hash_conntrack()
Avoid calling jhash three times and hash the entire tuple in one go.
__hash_conntrack | -485 # 760 -> 275, # inlines: 3 -> 1, size inlines: 717 -> 252
1 function changed, 485 bytes removed
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:40:04 +0000 (04:40 -0800)]
[NETFILTER]: nf_conntrack: avoid duplicate protocol comparison in nf_ct_tuple_equal()
nf_ct_tuple_src_equal() and nf_ct_tuple_dst_equal() both compare the protocol
numbers. Unfortunately gcc doesn't optimize out the second comparison, so
remove it and prefix both functions with __ to indicate that they should not
be used directly.
Saves another 16 byte of text in __nf_conntrack_find() on x86_64:
nf_conntrack_tuple_taken | -20 # 320 -> 300, size inlines: 181 -> 161
__nf_conntrack_find | -16 # 267 -> 251, size inlines: 127 -> 115
__nf_conntrack_confirm | -40 # 875 -> 835, size inlines: 570 -> 537
3 functions changed, 76 bytes removed
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:39:23 +0000 (04:39 -0800)]
[NETFILTER]: nf_conntrack: optimize __nf_conntrack_find()
Ignoring specific entries in __nf_conntrack_find() is only needed by NAT
for nf_conntrack_tuple_taken(). Remove it from __nf_conntrack_find()
and make nf_conntrack_tuple_taken() search the hash itself.
Saves 54 bytes of text in the hotpath on x86_64:
__nf_conntrack_find | -54 # 321 -> 267, # inlines: 3 -> 2, size inlines: 181 -> 127
nf_conntrack_tuple_taken | +305 # 15 -> 320, lexblocks: 0 -> 3, # inlines: 0 -> 3, size inlines: 0 -> 181
nf_conntrack_find_get | -2 # 90 -> 88
3 functions changed, 305 bytes added, 56 bytes removed, diff: +249
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:38:58 +0000 (04:38 -0800)]
[NETFILTER]: nf_conntrack: switch rwlock to spinlock
With the RCU conversion only write_lock usages of nf_conntrack_lock are
left (except one read_lock that should actually use write_lock in the
H.323 helper). Switch to a spinlock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:38:38 +0000 (04:38 -0800)]
[NETFILTER]: nf_conntrack: use RCU for conntrack hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:38:19 +0000 (04:38 -0800)]
[NETFILTER]: nf_conntrack_expect: use RCU for expectation hash
Use RCU for expectation hash. This doesn't buy much for conntrack
runtime performance, but allows to reduce the use of nf_conntrack_lock
for /proc and nf_netlink_conntrack.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:37:36 +0000 (04:37 -0800)]
[NETFILTER]: nf_conntrack_core: avoid taking nf_conntrack_lock in nf_conntrack_alter_reply
The conntrack is unconfirmed, so we have an exclusive reference, which
means that the write_lock is definitely unneeded. A read_lock used to
be needed for the helper lookup, but since we're using RCU for helpers
now rcu_read_lock is enough.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:36:54 +0000 (04:36 -0800)]
[NETFILTER]: nf_conntrack: use RCU for conntrack helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:36:31 +0000 (04:36 -0800)]
[NETFILTER]: nf_conntrack: fix accounting with fixed timeouts
Don't skip accounting for conntracks with the FIXED_TIMEOUT bit.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:35:57 +0000 (04:35 -0800)]
[NETFILTER]: nf_conntrack_netlink: fix unbalanced locking
Properly drop nf_conntrack_lock on tuple parsing error.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:10:40 +0000 (04:10 -0800)]
[NETFILTER]: nf_conntrack_ipv6: fix sparse warnings
CHECK net/ipv6/netfilter/nf_conntrack_reasm.c
net/ipv6/netfilter/nf_conntrack_reasm.c:77:18: warning: symbol 'nf_ct_ipv6_sysctl_table' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:586:16: warning: symbol 'nf_ct_frag6_gather' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:662:6: warning: symbol 'nf_ct_frag6_output' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:683:5: warning: symbol 'nf_ct_frag6_kfree_frags' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:698:5: warning: symbol 'nf_ct_frag6_init' was not declared. Should it be static?
net/ipv6/netfilter/nf_conntrack_reasm.c:717:6: warning: symbol 'nf_ct_frag6_cleanup' was not declared. Should it be static?
Based on patch by Stephen Hemminger with suggestions by Yasuyuki KOZAKAI.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:10:18 +0000 (04:10 -0800)]
[NETFILTER]: {ip,arp,ip6}_tables: fix sparse warnings in compat code
CHECK net/ipv4/netfilter/ip_tables.c
net/ipv4/netfilter/ip_tables.c:1453:8: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1453:8: expected int *size
net/ipv4/netfilter/ip_tables.c:1453:8: got unsigned int [usertype] *size
net/ipv4/netfilter/ip_tables.c:1458:44: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1458:44: expected int *size
net/ipv4/netfilter/ip_tables.c:1458:44: got unsigned int [usertype] *size
net/ipv4/netfilter/ip_tables.c:1603:2: warning: incorrect type in argument 2 (different signedness)
net/ipv4/netfilter/ip_tables.c:1603:2: expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1603:2: got int *<noident>
net/ipv4/netfilter/ip_tables.c:1627:8: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1627:8: expected int *size
net/ipv4/netfilter/ip_tables.c:1627:8: got unsigned int *size
net/ipv4/netfilter/ip_tables.c:1634:40: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1634:40: expected int *size
net/ipv4/netfilter/ip_tables.c:1634:40: got unsigned int *size
net/ipv4/netfilter/ip_tables.c:1653:8: warning: incorrect type in argument 5 (different signedness)
net/ipv4/netfilter/ip_tables.c:1653:8: expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1653:8: got int *<noident>
net/ipv4/netfilter/ip_tables.c:1666:2: warning: incorrect type in argument 2 (different signedness)
net/ipv4/netfilter/ip_tables.c:1666:2: expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1666:2: got int *<noident>
CHECK net/ipv4/netfilter/arp_tables.c
net/ipv4/netfilter/arp_tables.c:1285:40: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/arp_tables.c:1285:40: expected int *size
net/ipv4/netfilter/arp_tables.c:1285:40: got unsigned int *size
net/ipv4/netfilter/arp_tables.c:1543:44: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/arp_tables.c:1543:44: expected int *size
net/ipv4/netfilter/arp_tables.c:1543:44: got unsigned int [usertype] *size
CHECK net/ipv6/netfilter/ip6_tables.c
net/ipv6/netfilter/ip6_tables.c:1481:8: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1481:8: expected int *size
net/ipv6/netfilter/ip6_tables.c:1481:8: got unsigned int [usertype] *size
net/ipv6/netfilter/ip6_tables.c:1486:44: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1486:44: expected int *size
net/ipv6/netfilter/ip6_tables.c:1486:44: got unsigned int [usertype] *size
net/ipv6/netfilter/ip6_tables.c:1631:2: warning: incorrect type in argument 2 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1631:2: expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1631:2: got int *<noident>
net/ipv6/netfilter/ip6_tables.c:1655:8: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1655:8: expected int *size
net/ipv6/netfilter/ip6_tables.c:1655:8: got unsigned int *size
net/ipv6/netfilter/ip6_tables.c:1662:40: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1662:40: expected int *size
net/ipv6/netfilter/ip6_tables.c:1662:40: got unsigned int *size
net/ipv6/netfilter/ip6_tables.c:1680:8: warning: incorrect type in argument 5 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1680:8: expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1680:8: got int *<noident>
net/ipv6/netfilter/ip6_tables.c:1693:2: warning: incorrect type in argument 2 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1693:2: expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1693:2: got int *<noident>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 12:09:46 +0000 (04:09 -0800)]
[NETFILTER]: ipt_recent: fix sparse warnings
net/ipv4/netfilter/ipt_recent.c:215:17: warning: symbol 't' shadows an earlier one
net/ipv4/netfilter/ipt_recent.c:179:22: originally declared here
net/ipv4/netfilter/ipt_recent.c:322:13: warning: context imbalance in 'recent_seq_start' - wrong count at exit
net/ipv4/netfilter/ipt_recent.c:354:13: warning: context imbalance in 'recent_seq_stop' - unexpected unlock
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 31 Jan 2008 12:09:00 +0000 (04:09 -0800)]
[NETFILTER]: nf_conntrack_h3223: sparse fixes
Sparse complains when a function is not really static. Putting static
on the function prototype is not enough.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 31 Jan 2008 12:08:39 +0000 (04:08 -0800)]
[NETFILTER]: more sparse fixes
Some lock annotations, and make initializers static.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 31 Jan 2008 12:08:10 +0000 (04:08 -0800)]
[NETFILTER]: conntrack: get rid of sparse warnings
Teach sparse about locking here, and fix signed/unsigned warnings.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 31 Jan 2008 12:07:51 +0000 (04:07 -0800)]
[NETFILTER]: nfnetlink_log: sparse warning fixes
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 31 Jan 2008 12:07:29 +0000 (04:07 -0800)]
[NETFILTER]: nf_conntrack: sparse warnings
The hashtable size is really unsigned so sparse complains when you pass
a signed integer. Change all uses to make it consistent.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Thu, 31 Jan 2008 12:07:08 +0000 (04:07 -0800)]
[NETFILTER]: nf_nat_snmp: sparse warning
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:06:38 +0000 (04:06 -0800)]
[NETFILTER]: xt_owner: allow matching UID/GID ranges
Add support for ranges to the new revision. This doesn't affect
compatibility since the new revision was not released yet.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:06:10 +0000 (04:06 -0800)]
[NETFILTER]: xt_TCPMSS: consider reverse route's MTU in clamp-to-pmtu
The TCPMSS target in Xtables should consider the MTU of the reverse
route on forwarded packets as part of the path MTU.
Point in case: IN=ppp0, OUT=eth0. MSS set to 1460 in spite of MTU of
ppp0 being 1392.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:05:34 +0000 (04:05 -0800)]
[NETFILTER]: netns: put table module on netns stop
When number of entries exceeds number of initial entries, foo-tables code
will pin table module. But during table unregister on netns stop,
that additional pin was forgotten.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:05:09 +0000 (04:05 -0800)]
[NETFILTER]: arp_tables: per-netns arp_tables FILTER
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:04:32 +0000 (04:04 -0800)]
[NETFILTER]: arp_tables: netns preparation
* Propagate netns from userspace.
* arpt_register_table() registers table in supplied netns.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:04:13 +0000 (04:04 -0800)]
[NETFILTER]: ip6_tables: per-netns IPv6 FILTER, MANGLE, RAW
Now it's possible to list and manipulate per-netns ip6tables rules.
Filtering decisions are based on init_net's table so far.
P.S.: remove init_net check in inet6_create() to see the effect
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:03:45 +0000 (04:03 -0800)]
[NETFILTER]: ip6_tables: netns preparation
* Propagate netns from userspace down to xt_find_table_lock()
* Register ip6 tables in netns (modules still use init_net)
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:03:23 +0000 (04:03 -0800)]
[NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW
Now, iptables show and configure different set of rules in different
netnss'. Filtering decisions are still made by consulting only
init_net's set.
Changes are identical except naming so no splitting.
P.S.: one need to remove init_net checks in nf_sockopt.c and inet_create()
to see the effect.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:03:03 +0000 (04:03 -0800)]
[NETFILTER]: ip_tables: propagate netns from userspace
.. all the way down to table searching functions.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:02:44 +0000 (04:02 -0800)]
[NETFILTER]: x_tables: return new table from {arp,ip,ip6}t_register_table()
Typical table module registers xt_table structure (i.e. packet_filter)
and link it to list during it. We can't use one template for it because
corresponding list_head will become corrupted. We also can't unregister
with template because it wasn't changed at all and thus doesn't know in
which list it is.
So, we duplicate template at the very first step of table registration.
Table modules will save it for use during unregistration time and actual
filtering.
Do it at once to not screw bisection.
P.S.: renaming i.e. packet_filter => __packet_filter is temporary until
full netnsization of table modules is done.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:02:13 +0000 (04:02 -0800)]
[NETFILTER]: x_tables: per-netns xt_tables
In fact all we want is per-netns set of rules, however doing that will
unnecessary complicate routines such as ipt_hook()/ipt_do_table, so
make full xt_table array per-netns.
Every user stubbed with init_net for a while.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 31 Jan 2008 12:01:49 +0000 (04:01 -0800)]
[NETFILTER]: x_tables: change xt_table_register() return value convention
Switch from 0/-E to ptr/PTR_ERR convention.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:00:59 +0000 (04:00 -0800)]
[NETFILTER]: ebtables: mark matches, targets and watchers __read_mostly
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 12:00:30 +0000 (04:00 -0800)]
[NETFILTER]: ebtables: Update modules' descriptions
Update the MODULES_DESCRIPTION() tags for all Ebtables modules.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 11:59:24 +0000 (03:59 -0800)]
[NETFILTER]: ebtables: remove casts, use consts
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Helge Deller [Thu, 31 Jan 2008 11:58:56 +0000 (03:58 -0800)]
[NETFILTER]: nf_log: add netfilter gcc printf format checking
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 11:58:24 +0000 (03:58 -0800)]
[NETFILTER]: xt_conntrack: add port and direction matching
Extend the xt_conntrack match revision 1 by port matching (all four
{orig,repl}{src,dst}) and by packet direction matching.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 11:58:01 +0000 (03:58 -0800)]
[NETFILTER]: nfnetlink_log: fix typo
It should use htonl for the GID, not htons.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 11:57:36 +0000 (03:57 -0800)]
linux/types.h: Use __u64 for aligned_u64
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 11:56:35 +0000 (03:56 -0800)]
[NETFILTER]: bridge netfilter: remove nf_bridge_info read-only netoutdev member
Before the removal of the deferred output hooks, netoutdev was used in
case of VLANs on top of a bridge to store the VLAN device, so the
deferred hooks would see the correct output device. This isn't
necessary anymore since we're calling the output hooks for the correct
device directly in the IP stack.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 31 Jan 2008 11:56:04 +0000 (03:56 -0800)]
[NETFILTER]: nf_nat: remove double bysource hash initialization
The hash table is already initialized by nf_ct_alloc_hashtable().
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 11:54:47 +0000 (03:54 -0800)]
[NETFILTER]: Use const in struct xt_match, xt_target, xt_table
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 31 Jan 2008 11:53:27 +0000 (03:53 -0800)]
[NETFILTER]: Supress some sparse warnings
CHECK net/netfilter/nf_conntrack_expect.c
net/netfilter/nf_conntrack_expect.c:429:13: warning: context imbalance in 'exp_seq_start' - wrong count at exit
net/netfilter/nf_conntrack_expect.c:441:13: warning: context imbalance in 'exp_seq_stop' - unexpected unlock
CHECK net/netfilter/nf_log.c
net/netfilter/nf_log.c:105:13: warning: context imbalance in 'seq_start' - wrong count at exit
net/netfilter/nf_log.c:125:13: warning: context imbalance in 'seq_stop' - unexpected unlock
CHECK net/netfilter/nfnetlink_queue.c
net/netfilter/nfnetlink_queue.c:363:7: warning: symbol 'size' shadows an earlier one
net/netfilter/nfnetlink_queue.c:217:9: originally declared here
net/netfilter/nfnetlink_queue.c:847:13: warning: context imbalance in 'seq_start' - wrong count at exit
net/netfilter/nfnetlink_queue.c:859:13: warning: context imbalance in 'seq_stop' - unexpected unlock
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Thu, 31 Jan 2008 11:48:55 +0000 (03:48 -0800)]
[RAW]: Wrong content of the /proc/net/raw6.
The address of IPv6 raw sockets was shown in the wrong format, from
IPv4 ones. The problem has been introduced by the commit
42a73808ed4f30b739eb52bcbb33a02fe62ceef5 ("[RAW]: Consolidate proc
interface.")
Thanks to Adrian Bunk who originally noticed the problem.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Thu, 31 Jan 2008 11:46:43 +0000 (03:46 -0800)]
[RAW]: Cleanup IPv4 raw_seq_show.
There is no need to use 128 bytes on the stack at all. Clean the code
in the IPv6 style.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Thu, 31 Jan 2008 11:46:12 +0000 (03:46 -0800)]
[RAW]: Family check in the /proc/net/raw[6] is extra.
Different hashtables are used for IPv6 and IPv4 raw sockets, so no
need to check the socket family in the iterator over hashtables. Clean
this out.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 31 Jan 2008 05:48:24 +0000 (21:48 -0800)]
[IPCOMP]: Fix reception of incompressible packets
I made a silly typo by entering IPPROTO_IP (== 0) instead of
IPPROTO_IPIP (== 4). This broke the reception of incompressible
packets.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 31 Jan 2008 04:07:45 +0000 (20:07 -0800)]
[NET]: should explicitely initialize atomic_t field in struct dst_ops
All but one struct dst_ops static initializations miss explicit
initialization of entries field.
As this field is atomic_t, we should use ATOMIC_INIT(0), and not
rely on atomic_t implementation.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 31 Jan 2008 04:06:02 +0000 (20:06 -0800)]
[TCP]: NewReno must count every skb while marking losses
NewReno should add cnt per skb (as with FACK) instead of depending on
SACKED_ACKED bits which won't be set with it at all. Effectively,
NewReno should always exists after the first iteration anyway (or
immediately if there's already head in lost_out.
This was fixed earlier in net-2.6.25 but got reverted among other
stuff and I didn't notice that this is still necessary (actually
wasn't even considering this case while trying to figure out the
reports because I lived with different kind of code than it in reality
was).
This should solve the WARN_ONs in TCP code that as a result of this
triggered multiple times in every place we check for this invariant.
Special thanks to Dave Young <hidave.darkstar@gmail.com> and Krishna
Kumar2 <krkumar2@in.ibm.com> for trying with my debug patches.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Krishna Kumar2 <krkumar2@in.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 31 Jan 2008 03:31:06 +0000 (19:31 -0800)]
[NETNS]: Fix race between put_net() and netlink_kernel_create().
The comment about "race free view of the set of network
namespaces" was a bit hasty. Look (there even can be only
one CPU, as discovered by Alexey Dobriyan and Denis Lunev):
put_net()
if (atomic_dec_and_test(&net->refcnt))
/* true */
__put_net(net);
queue_work(...);
/*
* note: the net now has refcnt 0, but still in
* the global list of net namespaces
*/
== re-schedule ==
register_pernet_subsys(&some_ops);
register_pernet_operations(&some_ops);
(*some_ops)->init(net);
/*
* we call netlink_kernel_create() here
* in some places
*/
netlink_kernel_create();
sk_alloc();
get_net(net); /* refcnt = 1 */
/*
* now we drop the net refcount not to
* block the net namespace exit in the
* future (or this can be done on the
* error path)
*/
put_net(sk->sk_net);
if (atomic_dec_and_test(&...))
/*
* true. BOOOM! The net is
* scheduled for release twice
*/
When thinking on this problem, I decided, that getting and
putting the net in init callback is wrong. If some init
callback needs to have a refcount-less reference on the struct
net, _it_ has to be careful himself, rather than relying on
the infrastructure to handle this correctly.
In case of netlink_kernel_create(), the problem is that the
sk_alloc() gets the given namespace, but passing the info
that we don't want to get it inside this call is too heavy.
Instead, I propose to crate the socket inside an init_net
namespace and then re-attach it to the desired one right
after the socket is created.
After doing this, we also have to be careful on error paths
not to drop the reference on the namespace, we didn't get
the one on.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 31 Jan 2008 03:11:50 +0000 (19:11 -0800)]
[XFRM]: constify 'struct xfrm_type'
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Thery [Thu, 31 Jan 2008 03:09:35 +0000 (19:09 -0800)]
[NETNS]: Add missing initialization of nl_info.nl_net in rtm_to_fib6_config()
Add missing initialization of the new nl_info.nl_net field in
rtm_to_fib6_config(). This will be needed the store network namespace
associated to the fib6_config struct.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Laszlo Attila Toth [Thu, 31 Jan 2008 03:08:16 +0000 (19:08 -0800)]
[NET]: Introducing socket mark socket option.
A userspace program may wish to set the mark for each packets its send
without using the netfilter MARK target. Changing the mark can be used
for mark based routing without netfilter or for packet filtering.
It requires CAP_NET_ADMIN capability.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Thu, 31 Jan 2008 02:55:45 +0000 (18:55 -0800)]
[AF_RXRPC]: constify function pointer tables
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 30 Jan 2008 05:38:52 +0000 (21:38 -0800)]
[BNX2]: Update version to 1.7.3.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 30 Jan 2008 05:38:06 +0000 (21:38 -0800)]
[BNX2]: Update firmware.
Update firmware to support programmable flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 30 Jan 2008 05:37:17 +0000 (21:37 -0800)]
[BNX2]: Fine-tune flow control on 5709.
Make use of the programmable high/low water marks in 5709 for
802.3 flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>