David S. Miller [Wed, 8 Oct 2008 21:26:36 +0000 (14:26 -0700)]
Merge branch 'lvs-next-2.6' of git://git./linux/kernel/git/horms/lvs-2.6
Conflicts:
net/netfilter/Kconfig
Vlad Yasevich [Wed, 8 Oct 2008 21:19:01 +0000 (14:19 -0700)]
sctp: shrink sctp_tsnmap some more by removing gabs array
The gabs array in the sctp_tsnmap structure is only used
in one place, sctp_make_sack(). As such, carrying the
array around in the sctp_tsnmap and thus directly in
the sctp_association is rather pointless since most
of the time it's just taking up space. Now, let
sctp_make_sack create and populate it and then throw
it away when it's done.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Wed, 8 Oct 2008 21:18:39 +0000 (14:18 -0700)]
sctp: Rework the tsn map to use generic bitmap.
The tsn map currently use is 4K large and is stuck inside
the sctp_association structure making memory references REALLY
expensive. What we really need is at most 4K worth of bits
so the biggest map we would have is 512 bytes. Also, the
map is only really usefull when we have gaps to store and
report. As such, starting with minimal map of say 32 TSNs (bits)
should be enough for normal low-loss operations. We can grow
the map by some multiple of 32 along with some extra room any
time we receive the TSN which would put us outside of the map
boundry. As we close gaps, we can shift the map to rebase
it on the latest TSN we've seen. This saves 4088 bytes per
association just in the map alone along savings from the now
unnecessary structure members.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 8 Oct 2008 21:18:04 +0000 (14:18 -0700)]
inet: cleanup of local_port_range
I noticed sysctl_local_port_range[] and its associated seqlock
sysctl_local_port_range_lock were on separate cache lines.
Moreover, sysctl_local_port_range[] was close to unrelated
variables, highly modified, leading to cache misses.
Moving these two variables in a structure can help data
locality and moving this structure to read_mostly section
helps sharing of this data among cpus.
Cleanup of extern declarations (moved in include file where
they belong), and use of inet_get_local_port_range()
accessor instead of direct access to ports values.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 8 Oct 2008 18:44:17 +0000 (11:44 -0700)]
udp: Improve port randomization
Current UDP port allocation is suboptimal.
We select the shortest chain to chose a port (out of 512)
that will hash in this shortest chain.
First, it can lead to give not so ramdom ports and ease
give attackers more opportunities to break the system.
Second, it can consume a lot of CPU to scan all table
in order to find the shortest chain.
Third, in some pathological cases we can fail to find
a free port even if they are plenty of them.
This patch zap the search for a short chain and only
use one random seed. Problem of getting long chains
should be addressed in another way, since we can
obtain long chains with non random ports.
Based on a report and patch from Vitaly Mayatskikh
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarek Poplawski [Wed, 8 Oct 2008 18:36:22 +0000 (11:36 -0700)]
pkt_sched: Update qdisc requeue stats in dev_requeue_skb()
After the last change of requeuing there is no info about such
incidents in tc stats. This patch updates the counter, but we should
consider this should differ from previous stats because of additional
checks preventing to repeat this. On the other hand, previous stats
didn't include requeuing of gso_segmented skbs.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Wed, 8 Oct 2008 18:34:06 +0000 (11:34 -0700)]
tcp: fix length used for checksum in a reset
While looking for some common code I came across difference
in checksum calculation between tcp_v6_send_(reset|ack) I
couldn't explain. I checked both v4 and v6 and found out that
both seem to have the same "feature". I couldn't find anything
in rfc nor anywhere else which would state that md5 option
should be ignored like it was in case of reset so I came to
a conclusion that this is probably a genuine bug. I suspect
that addition of md5 just was fooled by the excessive
copy-paste code in those functions and the reset part was
never tested well enough to find out the problem.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:36:24 +0000 (10:36 -0700)]
ipv6: remove unused not init_ipv6_mibs/cleanup_ipv6_mibs
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:36:03 +0000 (10:36 -0700)]
ipv6: making ip and icmp statistics per/namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:35:11 +0000 (10:35 -0700)]
ipv6: added net argument to _DEVINC/_DEVADD
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:34:54 +0000 (10:34 -0700)]
ipv6: added net argument to ICMP6MSGIN_INC_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:34:35 +0000 (10:34 -0700)]
ipv6: ICMP6MSGIN_INC_STATS is not used
Removed.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:34:14 +0000 (10:34 -0700)]
ipv6: added net argument to ICMP6MSGOUT_INC_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:33:50 +0000 (10:33 -0700)]
ipv6: added net argument to ICMP6MSGOUT_INC_STATS
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:33:26 +0000 (10:33 -0700)]
ipv6: added net argument to ICMP6_INC_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:33:06 +0000 (10:33 -0700)]
ipv6: added net argument to ICMP6_INC_STATS
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:32:43 +0000 (10:32 -0700)]
ipv6: added net argument to IP6_ADD_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 18:09:27 +0000 (11:09 -0700)]
ipv6: added net argument to IP6_INC_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:54:51 +0000 (10:54 -0700)]
netns: add net parameter to IP6_INC_STATS
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:31:44 +0000 (10:31 -0700)]
ipv6: consolidate error paths in ipv6_frag_rcv
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Wed, 8 Oct 2008 17:31:18 +0000 (10:31 -0700)]
ipv6: local dev is actually unused in ip6_fragment
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 8 Oct 2008 16:50:38 +0000 (09:50 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/kaber/nf-next-2.6
Jan Engelhardt [Wed, 8 Oct 2008 09:35:20 +0000 (11:35 +0200)]
netfilter: xtables: remove bogus mangle table dependency of connmark
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:20 +0000 (11:35 +0200)]
netfilter: xtables: use NFPROTO_UNSPEC in more extensions
Lots of extensions are completely family-independent, so squash some code.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:20 +0000 (11:35 +0200)]
netfilter: xtables: cut down on static data for family-independent extensions
Using ->family in struct xt_*_param, multiple struct xt_{match,target}
can be squashed together.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:20 +0000 (11:35 +0200)]
netfilter: xtables: provide invoked family value to extensions
By passing in the family through which extensions were invoked, a bit
of data space can be reclaimed. The "family" member will be added to
the parameter structures and the check functions be adjusted.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:19 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (6/6)
This patch does this for target extensions' destroy functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:19 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (5/6)
This patch does this for target extensions' checkentry functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:19 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (4/6)
This patch does this for target extensions' target functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:19 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (3/6)
This patch does this for match extensions' destroy functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:18 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (2/6)
This patch does this for match extensions' checkentry functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:18 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (1/6)
The function signatures for Xtables extensions have grown over time.
It involves a lot of typing/replication, and also a bit of stack space
even if they are not used. Realize an NFWS2008 idea and pack them into
structs. The skb remains outside of the struct so gcc can continue to
apply its optimizations.
This patch does this for match extensions' match functions.
A few ambiguities have also been addressed. The "offset" parameter for
example has been renamed to "fragoff" (there are so many different
offsets already) and "protoff" to "thoff" (there is more than just one
protocol here, so clarify).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:18 +0000 (11:35 +0200)]
netfilter: xtables: use "if" blocks in Kconfig
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: xtables: sort extensions alphabetically in Kconfig
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: ebtables: make BRIDGE_NF_EBTABLES a menuconfig option
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: ip6tables: fix Kconfig entry dependency for ip6t_LOG
ip6t_LOG does certainly not depend on the filter table.
(Also, move it so that menuconfig still displays it correctly.)
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: ip6tables: fix name of hopbyhop in Kconfig
The module is called hbh, not hopbyhop.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: xtables: do centralized checkentry call (1/2)
It used to be that {ip,ip6,etc}_tables called extension->checkentry
themselves, but this can be moved into the xtables core.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: ebtables: fix one wrong return value
Usually -EINVAL is used when checkentry fails (see *_tables).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: remove redundant casts from Ebtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: remove unused Ebtables functions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: implement hotdrop for Ebtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: ebtables: use generic table checking
Ebtables ORs (1 << NF_BR_NUMHOOKS) into the hook mask to indicate that
the extension was called from a base chain. So this also needs to be
present in the extensions' ->hooks.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: x_tables: output bad hook mask in hexadecimal
It is a mask, and masks are most useful in hex.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: move Ebtables to use Xtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: change Ebtables function signatures to match Xtables's
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:14 +0000 (11:35 +0200)]
netfilter: ebt_among: obtain match size through different means
The function signatures will be changed to match those of Xtables, and
the datalen argument will be gone. ebt_among unfortunately relies on
it, so we need to obtain it somehow.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:14 +0000 (11:35 +0200)]
netfilter: add dummy members to Ebtables code to ease transition to Xtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: Change return types of targets/watchers for Ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: change return types of match functions for ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: change return types of check functions for Ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: ebtables: do centralized size checking
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: Add documentation for tproxy
Add basic usage instructions to Documentation/networking.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: iptables TPROXY target
The TPROXY target implements redirection of non-local TCP/UDP traffic to local
sockets. Additionally, it's possible to manipulate the packet mark if and only
if a socket has been found. (We need this because we cannot use multiple
targets in the same iptables rule.)
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: iptables socket match
Add iptables 'socket' match, which matches packets for which a TCP/UDP
socket lookup succeeds.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: iptables tproxy core
The iptables tproxy core is a module that contains the common routines used by
various tproxy related modules (TPROXY target and socket match)
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: split netfilter IPv4 defragmentation into a separate module
Netfilter connection tracking requires all IPv4 packets to be defragmented.
Both the socket match and the TPROXY target depend on this functionality, so
this patch separates the Netfilter IPv4 defrag hooks into a separate module.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: enable netfilter in netns
From kernel perspective, allow entrance in nf_hook_slow().
Stuff which uses nf_register_hook/nf_register_hooks, but otherwise not netns-ready:
DECnet netfilter
ipt_CLUSTERIP
nf_nat_standalone.c together with XFRM (?)
IPVS
several individual match modules (like hashlimit)
ctnetlink
NOTRACK
all sorts of queueing and reporting to userspace
L3 and L4 protocol sysctls, bridge sysctls
probably something else
Anyway critical mass has been achieved, there is no reason to hide netfilter any longer.
From userspace perspective, allow to manipulate all sorts of
iptables/ip6tables/arptables rules.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: netns nat: PPTP NAT in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: fixup DNAT in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: netns nat: per-netns bysource hash
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nat: per-netns NAT table
Same story as with iptable_filter, iptables_raw tables.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nat: fix ipt_MASQUERADE in netns
First, allow entry in notifier hook.
Second, start conntrack cleanup in netns to which netdevice belongs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: PPTP conntracking in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: GRE conntracking in netns
* make keymap list per-netns
* per-netns keymal lock (not strictly necessary)
* flush keymap at netns stop and module unload.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: H323 conntracking in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: SIP conntracking in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: final netns tweaks
Add init_net checks to not remove kmem_caches twice and so on.
Refactor functions to split code which should be executed only for
init_net into one place.
ip_ct_attach and ip_ct_destroy assignments remain separate, because
they're separate stages in setup and teardown.
NOTE: NOTRACK code is in for-every-net part. It will be made per-netns
after we decidce how to do it correctly.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns conntrack accounting
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_log_invalid sysctl
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_checksum sysctl
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_count sysctl
Note, sysctl table is always duplicated, this is simpler and less
special-cased.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/stat/nf_conntrack, /proc/net/stat/ip_conntrack
Show correct conntrack count, while I'm at it.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns statistics
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns event cache
Heh, last minute proof-reading of this patch made me think,
that this is actually unneeded, simply because "ct" pointers will be
different for different conntracks in different netns, just like they
are different in one netns.
Not so sure anymore.
[Patrick: pointers will be different, flushing can only be done while
inactive though and thus it needs to be per netns]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: pass conntrack to nf_conntrack_event_cache() not skb
This is cleaner, we already know conntrack to which event is relevant.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: cleanup after L3 and L4 proto unregister in every netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: unregister helper in every netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netns: export netns list
Conntrack code will use it for
a) removing expectations and helpers when corresponding module is removed, and
b) removing conntracks when L3 protocol conntrack module is removed.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/ip_conntrack, /proc/net/stat/ip_conntrack, /proc/net/ip_conntrack_expect
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/nf_conntrack_expect
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:05 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/nf_conntrack, /proc/net/stat/nf_conntrack
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:05 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: pass netns pointer to L4 protocol's ->error hook
Again, it's deducible from skb, but we're going to use it for
nf_conntrack_checksum and statistics, so just pass it from upper layer.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:04 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: pass netns pointer to nf_conntrack_in()
It's deducible from skb->dev or skb->dst->dev, but we know netns at
the moment of call, so pass it down and use for finding and creating
conntracks.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:04 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns unconfirmed list
What is confirmed connection in one netns can very well be unconfirmed
in another one.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns expectations
Make per-netns a) expectation hash and b) expectations count.
Expectations always belongs to netns to which it's master conntrack belong.
This is natural and doesn't bloat expectation.
Proc files and leaf users are stubbed to init_net, this is temporary.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns: fix {ip,6}_route_me_harder() in netns
Take netns from skb->dst->dev. It should be safe because, they are called
from LOCAL_OUT hook where dst is valid (though, I'm not exactly sure about
IPVS and queueing packets to userspace).
[Patrick: its safe everywhere since they already expect skb->dst to be set]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns conntrack hash
* make per-netns conntrack hash
Other solution is to add ->ct_net pointer to tuplehashes and still has one
hash, I tried that it's ugly and requires more code deep down in protocol
modules et al.
* propagate netns pointer to where needed, e. g. to conntrack iterators.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns conntrack count
Sysctls and proc files are stubbed to init_net's one. This is temporary.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: add ->ct_net -- pointer from conntrack to netns
Conntrack (struct nf_conn) gets pointer to netns: ->ct_net -- netns in which
it was created. It comes from netdevice.
->ct_net is write-once field.
Every conntrack in system has ->ct_net initialized, no exceptions.
->ct_net doesn't pin netns: conntracks are recycled after timeouts and
pinning background traffic will prevent netns from even starting shutdown
sequence.
Right now every conntrack is created in init_net.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: add netns boilerplate
One comment: #ifdefs around #include is necessary to overcome amazing compile
breakages in NOTRACK-in-netns patch (see below).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns: ip6t_REJECT in netns for real
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns: ip6table_mangle in netns for real
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: netns: ip6table_raw in netns for real
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: netns: remove nf_*_net() wrappers
Now that dev_net() exists, the usefullness of them is even less. Also they're
a big problem in resolving circular header dependencies necessary for
NOTRACK-in-netns patch. See below.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: implement NFPROTO_UNSPEC as a wildcard for extensions
When a match or target is looked up using xt_find_{match,target},
Xtables will also search the NFPROTO_UNSPEC module list. This allows
for protocol-independent extensions (like xt_time) to be reused from
other components (e.g. arptables, ebtables).
Extensions that take different codepaths depending on match->family
or target->family of course cannot use NFPROTO_UNSPEC within the
registration structure (e.g. xt_pkttype).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: x_tables: use NFPROTO_* in extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:00 +0000 (11:35 +0200)]
netfilter: Introduce NFPROTO_* constants
The netfilter subsystem only supports a handful of protocols (much
less than PF_*) and even non-PF protocols like ARP and
pseudo-protocols like PF_BRIDGE. By creating NFPROTO_*, we can earn a
few memory savings on arrays that previously were always PF_MAX-sized
and keep the pseudo-protocols to ourselves.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:00 +0000 (11:35 +0200)]
netfilter: xt_recent: IPv6 support
This updates xt_recent to support the IPv6 address family.
The new /proc/net/xt_recent directory must be used for this.
The old proc interface can also be configured out.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:00 +0000 (11:35 +0200)]
netfilter: rename ipt_recent to xt_recent
Like with other modules (such as ipt_state), ipt_recent.h is changed
to forward definitions to (IOW include) xt_recent.h, and xt_recent.c
is changed to use the new constant names.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>