platform/upstream/kernel-adaptation-pc.git
14 years agoipv4: Suppress lockdep-RCU false positive in FIB trie (3)
Jarek Poplawski [Tue, 7 Sep 2010 07:51:17 +0000 (07:51 +0000)]
ipv4: Suppress lockdep-RCU false positive in FIB trie (3)

Hi,
Here is one more of these warnings and a patch below:

Sep  5 23:52:33 del kernel: [46044.244833] ===================================================
Sep  5 23:52:33 del kernel: [46044.269681] [ INFO: suspicious rcu_dereference_check() usage. ]
Sep  5 23:52:33 del kernel: [46044.277000] ---------------------------------------------------
Sep  5 23:52:33 del kernel: [46044.285185] net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!
Sep  5 23:52:33 del kernel: [46044.293627]
Sep  5 23:52:33 del kernel: [46044.293632] other info that might help us debug this:
Sep  5 23:52:33 del kernel: [46044.293634]
Sep  5 23:52:33 del kernel: [46044.325333]
Sep  5 23:52:33 del kernel: [46044.325335] rcu_scheduler_active = 1, debug_locks = 0
Sep  5 23:52:33 del kernel: [46044.348013] 1 lock held by pppd/1717:
Sep  5 23:52:33 del kernel: [46044.357548]  #0:  (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20
Sep  5 23:52:33 del kernel: [46044.367647]
Sep  5 23:52:33 del kernel: [46044.367652] stack backtrace:
Sep  5 23:52:33 del kernel: [46044.387429] Pid: 1717, comm: pppd Not tainted 2.6.35.4.4a #3
Sep  5 23:52:33 del kernel: [46044.398764] Call Trace:
Sep  5 23:52:33 del kernel: [46044.409596]  [<c12f9aba>] ? printk+0x18/0x1e
Sep  5 23:52:33 del kernel: [46044.420761]  [<c1053969>] lockdep_rcu_dereference+0xa9/0xb0
Sep  5 23:52:33 del kernel: [46044.432229]  [<c12b7235>] trie_firstleaf+0x65/0x70
Sep  5 23:52:33 del kernel: [46044.443941]  [<c12b74d4>] fib_table_flush+0x14/0x170
Sep  5 23:52:33 del kernel: [46044.455823]  [<c1033e92>] ? local_bh_enable_ip+0x62/0xd0
Sep  5 23:52:33 del kernel: [46044.467995]  [<c12fc39f>] ? _raw_spin_unlock_bh+0x2f/0x40
Sep  5 23:52:33 del kernel: [46044.480404]  [<c12b24d0>] ? fib_sync_down_dev+0x120/0x180
Sep  5 23:52:33 del kernel: [46044.493025]  [<c12b069d>] fib_flush+0x2d/0x60
Sep  5 23:52:33 del kernel: [46044.505796]  [<c12b06f5>] fib_disable_ip+0x25/0x50
Sep  5 23:52:33 del kernel: [46044.518772]  [<c12b10d3>] fib_netdev_event+0x73/0xd0
Sep  5 23:52:33 del kernel: [46044.531918]  [<c1048dfd>] notifier_call_chain+0x2d/0x70
Sep  5 23:52:33 del kernel: [46044.545358]  [<c1048f0a>] raw_notifier_call_chain+0x1a/0x20
Sep  5 23:52:33 del kernel: [46044.559092]  [<c124f687>] call_netdevice_notifiers+0x27/0x60
Sep  5 23:52:33 del kernel: [46044.573037]  [<c124faec>] __dev_notify_flags+0x5c/0x80
Sep  5 23:52:33 del kernel: [46044.586489]  [<c124fb47>] dev_change_flags+0x37/0x60
Sep  5 23:52:33 del kernel: [46044.599394]  [<c12a8a8d>] devinet_ioctl+0x54d/0x630
Sep  5 23:52:33 del kernel: [46044.612277]  [<c12aabb7>] inet_ioctl+0x97/0xc0
Sep  5 23:52:34 del kernel: [46044.625208]  [<c123f6af>] sock_ioctl+0x6f/0x270
Sep  5 23:52:34 del kernel: [46044.638046]  [<c109d2b0>] ? handle_mm_fault+0x420/0x6c0
Sep  5 23:52:34 del kernel: [46044.650968]  [<c123f640>] ? sock_ioctl+0x0/0x270
Sep  5 23:52:34 del kernel: [46044.663865]  [<c10c3188>] vfs_ioctl+0x28/0xa0
Sep  5 23:52:34 del kernel: [46044.676556]  [<c10c38fa>] do_vfs_ioctl+0x6a/0x5c0
Sep  5 23:52:34 del kernel: [46044.688989]  [<c1048676>] ? up_read+0x16/0x30
Sep  5 23:52:34 del kernel: [46044.701411]  [<c1021376>] ? do_page_fault+0x1d6/0x3a0
Sep  5 23:52:34 del kernel: [46044.714223]  [<c10b6588>] ? fget_light+0xf8/0x2f0
Sep  5 23:52:34 del kernel: [46044.726601]  [<c1241f98>] ? sys_socketcall+0x208/0x2c0
Sep  5 23:52:34 del kernel: [46044.739140]  [<c10c3eb3>] sys_ioctl+0x63/0x70
Sep  5 23:52:34 del kernel: [46044.751967]  [<c12fca3d>] syscall_call+0x7/0xb
Sep  5 23:52:34 del kernel: [46044.764734]  [<c12f0000>] ? cookie_v6_check+0x3d0/0x630

-------------->

This patch fixes the warning:
 ===================================================
 [ INFO: suspicious rcu_dereference_check() usage. ]
 ---------------------------------------------------
 net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!

 other info that might help us debug this:

 rcu_scheduler_active = 1, debug_locks = 0
 1 lock held by pppd/1717:
  #0:  (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20

 stack backtrace:
 Pid: 1717, comm: pppd Not tainted 2.6.35.4a #3
 Call Trace:
  [<c12f9aba>] ? printk+0x18/0x1e
  [<c1053969>] lockdep_rcu_dereference+0xa9/0xb0
  [<c12b7235>] trie_firstleaf+0x65/0x70
  [<c12b74d4>] fib_table_flush+0x14/0x170
  ...

Allow trie_firstleaf() to be called either under rcu_read_lock()
protection or with RTNL held. The same annotation is added to
node_parent_rcu() to prevent a similar warning a bit later.

Followup of commits 634a4b20 and 4eaa0e3c.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoniu: Fix kernel buffer overflow for ETHTOOL_GRXCLSRLALL
Ben Hutchings [Tue, 7 Sep 2010 04:35:19 +0000 (04:35 +0000)]
niu: Fix kernel buffer overflow for ETHTOOL_GRXCLSRLALL

niu_get_ethtool_tcam_all() assumes that its output buffer is the right
size, and warns before returning if it is not.  However, the output
buffer size is under user control and ETHTOOL_GRXCLSRLALL is an
unprivileged ethtool command.  Therefore this is at least a local
denial-of-service vulnerability.

Change it to check before writing each entry and to return an error if
the buffer is already full.

Compile-tested only.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipvs: fix active FTP
Julian Anastasov [Sun, 5 Sep 2010 18:02:29 +0000 (18:02 +0000)]
ipvs: fix active FTP

- Do not create expectation when forwarding the PORT
  command to avoid blocking the connection. The problem is that
  nf_conntrack_ftp.c:help() tries to create the same expectation later in
  POST_ROUTING and drops the packet with "dropping packet" message after
  failure in nf_ct_expect_related.

- Change ip_vs_update_conntrack to alter the conntrack
  for related connections from real server. If we do not alter the reply in
  this direction the next packet from client sent to vport 20 comes as NEW
  connection. We alter it but may be some collision happens for both
  conntracks and the second conntrack gets destroyed immediately. The
  connection stucks too.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agogro: Re-fix different skb headrooms
Jarek Poplawski [Sat, 4 Sep 2010 10:34:29 +0000 (10:34 +0000)]
gro: Re-fix different skb headrooms

The patch: "gro: fix different skb headrooms" in its part:
"2) allocate a minimal skb for head of frag_list" is buggy. The copied
skb has p->data set at the ip header at the moment, and skb_gro_offset
is the length of ip + tcp headers. So, after the change the length of
mac header is skipped. Later skb_set_mac_header() sets it into the
NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
original skb was wrongly allocated, so let's copy it as it was.

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57

Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovia-velocity: Turn scatter-gather support back off.
David S. Miller [Tue, 7 Sep 2010 20:49:44 +0000 (13:49 -0700)]
via-velocity: Turn scatter-gather support back off.

It causes all kinds of DMA API debugging assertions and
all straight-forward attempts to fix it have failed.

So turn off SG, and we'll tackle making this work
properly in net-next-2.6

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv4: Fix reverse path filtering with multipath routing.
David S. Miller [Tue, 7 Sep 2010 05:36:19 +0000 (22:36 -0700)]
ipv4: Fix reverse path filtering with multipath routing.

Actually iterate over the next-hops to make sure we have
a device match.  Otherwise RP filtering is always elided
when the route matched has multiple next-hops.

Reported-by: Igor M Podlesny <for.poige@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoUNIX: Do not loop forever at unix_autobind().
Tetsuo Handa [Sat, 4 Sep 2010 01:34:28 +0000 (01:34 +0000)]
UNIX: Do not loop forever at unix_autobind().

We assumed that unix_autobind() never fails if kzalloc() succeeded.
But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is
larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can
consume all names using fork()/socket()/bind().

If all names are in use, those who call bind() with addr_len == sizeof(short)
or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue

  while (1)
        yield();

loop at unix_autobind() till a name becomes available.
This patch adds a loop counter in order to give up after 1048576 attempts.

Calling yield() for once per 256 attempts may not be sufficient when many names
are already in use, for __unix_find_socket_byname() can take long time under
such circumstance. Therefore, this patch also adds cond_resched() call.

Note that currently a local user can consume 2GB of kernel memory if the user
is allowed to create and autobind 1048576 UNIX domain sockets. We should
consider adding some restriction for autobind operation.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoPATCH: b44 Handle RX FIFO overflow better (simplified)
Mark Lord [Sat, 4 Sep 2010 14:17:59 +0000 (14:17 +0000)]
PATCH: b44 Handle RX FIFO overflow better (simplified)

This patch is a simplified version of the original patch from James Courtier-Dutton.

>From: James Courtier-Dutton
>Subject: [PATCH] Fix b44 RX FIFO overflow recovery.
>Date: Wednesday, June 30, 2010 - 1:11 pm
>
>This patch improves the recovery after a RX FIFO overflow on the b44
>Ethernet NIC.
>Before it would do a complete chip reset, resulting is loss of link
>for a few seconds.
>This patch improves this to do recovery in about 20ms without loss of link.
>
>Signed off by: James@superbug.co.uk

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoirda: off by one
Dan Carpenter [Sat, 4 Sep 2010 03:14:35 +0000 (03:14 +0000)]
irda: off by one

This is an off by one.  We would go past the end when we NUL terminate
the "value" string at end of the function.  The "value" buffer is
allocated in irlan_client_parse_response() or
irlan_provider_parse_command().

CC: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago3c59x: Fix deadlock in vortex_error()
Ben Hutchings [Tue, 7 Sep 2010 01:28:56 +0000 (18:28 -0700)]
3c59x: Fix deadlock in vortex_error()

This fixes a bug introduced in commit
de847272149365363a6043a963a6f42fb91566e2
"3c59x: Use fine-grained locks for MII and windowed register access".

vortex_interrupt() holds vp->window_lock over multiple register
accesses to reduce locking overhead.  However it also needs to call
vortex_error() sometimes, and that uses the regular functions for
access to windowed registers, which will try to acquire window_lock
again.

Therefore, drop window_lock around the call to vortex_error() and set
the window afterward reacquiring the lock.  Since vortex_error() may
call vortex_rx(), which *does* require its caller to hold window_lock,
lift that call up into vortex_interrupt().  This also removes the
potential for calling vortex_rx() on a later-generation NIC.

Reported-and-tested-by: Jens Schüßler <jgs@trash.net> [in Debian's 2.6.32]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetfilter: discard overlapping IPv6 fragment
Nicolas Dichtel [Fri, 3 Sep 2010 05:13:07 +0000 (05:13 +0000)]
netfilter: discard overlapping IPv6 fragment

RFC5722 prohibits reassembling IPv6 fragments when some data overlaps.

Bug spotted by Zhang Zuotao <zuotao.zhang@6wind.com>.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: discard overlapping fragment
Nicolas Dichtel [Fri, 3 Sep 2010 05:13:05 +0000 (05:13 +0000)]
ipv6: discard overlapping fragment

RFC5722 prohibits reassembling fragments when some data overlaps.

Bug spotted by Zhang Zuotao <zuotao.zhang@6wind.com>.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: fix tx queue selection for bridged devices implementing select_queue
Helmut Schaa [Fri, 3 Sep 2010 02:39:56 +0000 (02:39 +0000)]
net: fix tx queue selection for bridged devices implementing select_queue

When a net device is implementing the select_queue callback and is part of
a bridge, frames coming from the bridge already have a tx queue associated
to the socket (introduced in commit a4ee3ce3293dc931fab19beb472a8bde1295aebe,
"net: Use sk_tx_queue_mapping for connected sockets"). The call to
sk_tx_queue_get will then return the tx queue used by the bridge instead
of calling the select_queue callback.

In case of mac80211 this broke QoS which is implemented by using the
select_queue callback. Furthermore it introduced problems with rt2x00
because frames with the same TID and RA sometimes appeared on different
tx queues which the hw cannot handle correctly.

Fix this by always calling select_queue first if it is available and only
afterwards use the socket tx queue mapping.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobonding: Fix jiffies overflow problems (again)
Jiri Bohac [Thu, 2 Sep 2010 05:45:54 +0000 (05:45 +0000)]
bonding: Fix jiffies overflow problems (again)

The time_before_eq()/time_after_eq() functions operate on unsigned
long and only work if the difference between the two compared values
is smaller than half the range of unsigned long (31 bits on i386).

Some of the variables (slave->jiffies, dev->trans_start, dev->last_rx)
used by bonding store a copy of jiffies and may not be updated for a
long time. With HZ=1000, time_before_eq()/time_after_eq() will start
giving bad results after ~25 days.

jiffies will never be before slave->jiffies, dev->trans_start,
dev->last_rx by more than possibly a couple ticks caused by preemption
of this code. This allows us to detect/prevent these overflows by
replacing time_before_eq()/time_after_eq() with time_in_range().

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agostmmac: fix sleep inside atomic
Giuseppe Cavallaro [Mon, 6 Sep 2010 03:02:11 +0000 (05:02 +0200)]
stmmac: fix sleep inside atomic

We cannot use spinlock when kmalloc is invoked with
GFP_KERNEL flag because it can sleep.
So this patch reviews the usage of spinlock within the
stmmac_resume function avoing this bug.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocls_cgroup: Fix rcu lockdep warning
Li Zefan [Thu, 2 Sep 2010 15:42:43 +0000 (15:42 +0000)]
cls_cgroup: Fix rcu lockdep warning

Dave reported an rcu lockdep warning on 2.6.35.4 kernel

task->cgroups and task->cgroups->subsys[i] are protected by RCU.
So we avoid accessing invalid pointers here. This might happen,
for example, when you are deref-ing those pointers while someone
move @task from one cgroup to another.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobe2net: remove a BUG_ON in be_cmds.c
Ajit Khaparde [Fri, 3 Sep 2010 06:24:13 +0000 (06:24 +0000)]
be2net: remove a BUG_ON in be_cmds.c

Async notifications other than link status are possible in certain
configurations. Remove the BUG_ON in the mcc completion processing path.

Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobe2net: fix a bug in UE detection logic
Ajit Khaparde [Fri, 3 Sep 2010 06:23:30 +0000 (06:23 +0000)]
be2net: fix a bug in UE detection logic

The ONLINE registers can return 0xFFFFFFFF on more than one
occassion. On systems that care, reading these registers could
lead to problems.

So the new code decides that the ASIC has encountered and error
by reading the UE_STATUS_LOW/HIGH registers. AND them with
the mask values and a non-zero result indicates an error.

Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobe2net: fix net-snmp error because of wrong packet stats
Ajit Khaparde [Fri, 3 Sep 2010 06:17:10 +0000 (06:17 +0000)]
be2net: fix net-snmp error because of wrong packet stats

Wrong packet statistics for multicast Rx was causing net-snmp error messages
every 15 seconds. Instead of picking the multicast stats from hardware,
now maintain it in the driver itself.

Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopkt_sched: Fix lockdep warning on est_tree_lock in gen_estimator
Jarek Poplawski [Thu, 2 Sep 2010 20:22:11 +0000 (13:22 -0700)]
pkt_sched: Fix lockdep warning on est_tree_lock in gen_estimator

This patch fixes a lockdep warning:

[  516.287584] =========================================================
[  516.288386] [ INFO: possible irq lock inversion dependency detected ]
[  516.288386] 2.6.35b #7
[  516.288386] ---------------------------------------------------------
[  516.288386] swapper/0 just changed the state of lock:
[  516.288386]  (&qdisc_tx_lock){+.-...}, at: [<c12eacda>] est_timer+0x62/0x1b4
[  516.288386] but this lock took another, SOFTIRQ-unsafe lock in the past:
[  516.288386]  (est_tree_lock){+.+...}
[  516.288386]
[  516.288386] and interrupts could create inverse lock ordering between them.
...

So, est_tree_lock needs BH protection because it's taken by
qdisc_tx_lock, which is used both in BH and process contexts.
(Full warning with this patch at netdev, 02 Sep 2010.)

Fixes commit: ae638c47dc040b8def16d05dc6acdd527628f231
("pkt_sched: gen_estimator: add a new lock")

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipvs: avoid oops for passive FTP
Julian Anastasov [Wed, 1 Sep 2010 23:07:10 +0000 (23:07 +0000)]
ipvs: avoid oops for passive FTP

Fix Passive FTP problem in ip_vs_ftp:

- Do not oops in nf_nat_set_seq_adjust (adjust_tcp_sequence) when
  iptable_nat module is not loaded

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoRevert "sky2: don't do GRO on second port"
David S. Miller [Thu, 2 Sep 2010 16:39:09 +0000 (09:39 -0700)]
Revert "sky2: don't do GRO on second port"

This reverts commit de6be6c1f77798c4da38301693d33aff1cd76e84.

After some discussion with Jarek Poplawski and Eric Dumazet, we've
decided that this change is incorrect.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agogro: fix different skb headrooms
Eric Dumazet [Wed, 1 Sep 2010 00:50:51 +0000 (00:50 +0000)]
gro: fix different skb headrooms

Packets entering GRO might have different headrooms, even for a given
flow (because of implementation details in drivers, like copybreak).
We cant force drivers to deliver packets with a fixed headroom.

1) fix skb_segment()

skb_segment() makes the false assumption headrooms of fragments are same
than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
errors, and crash later in skb_copy_and_csum_dev()

2) allocate a minimal skb for head of frag_list

skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
allocate a fresh skb. This adds NET_SKB_PAD to a padding already
provided by netdevice, depending on various things, like copybreak.

Use alloc_skb() to allocate an exact padding, to reduce cache line
needs:
NET_SKB_PAD + NET_IP_ALIGN

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626

Many thanks to Plamen Petrov, testing many debugging patches !
With help of Jarek Poplawski.

Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge: Clear INET control block of SKBs passed into ip_fragment().
David S. Miller [Thu, 2 Sep 2010 01:06:39 +0000 (18:06 -0700)]
bridge: Clear INET control block of SKBs passed into ip_fragment().

In a similar vain to commit 17762060c25590bfddd68cc1131f28ec720f405f
("bridge: Clear IPCB before possible entry into IP stack")

Any time we call into the IP stack we have to make sure the state
there is as expected by the ipv4 code.

With help from Eric Dumazet and Herbert Xu.

Reported-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago3c59x: Remove incorrect locking; correct documented lock hierarchy
Ben Hutchings [Mon, 30 Aug 2010 14:15:33 +0000 (14:15 +0000)]
3c59x: Remove incorrect locking; correct documented lock hierarchy

vortex_ioctl() was grabbing vortex_private::lock around its call to
generic_mii_ioctl().  This is no longer necessary since there are more
specific locks which the mdio_{read,write}() functions will obtain.
Worse, those functions do not save and restore IRQ flags when locking
the MII state, so interrupts will be enabled when generic_mii_ioctl()
returns.

Since there is currently no need for any function to call
mdio_{read,write}() while holding another spinlock, do not change them
to save and restore IRQ flags but remove the specification of ordering
between vortex_private::lock and vortex_private::mii_lock.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosky2: don't do GRO on second port
stephen hemminger [Mon, 30 Aug 2010 07:51:17 +0000 (07:51 +0000)]
sky2: don't do GRO on second port

There's something very important I forgot to tell you.
 What?

 Don't cross the GRO streams.
 Why?

 It would be bad.
 I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?

 Try to imagine all the Internet as you know it stopping instantaneously
  and every bit in every packet swapping at the speed of light.
 Total packet reordering.
 Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert

The simplest way to stop this is just avoid doing GRO on the second port.
Very few Marvell boards support two ports per ring, and GRO is just
an optimization.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv4: minor fix about RPF in help of Kconfig
Nicolas Dichtel [Tue, 31 Aug 2010 05:50:43 +0000 (05:50 +0000)]
ipv4: minor fix about RPF in help of Kconfig

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoxfrm_user: avoid a warning with some compiler
Nicolas Dichtel [Tue, 31 Aug 2010 05:54:00 +0000 (05:54 +0000)]
xfrm_user: avoid a warning with some compiler

Attached is a small patch to remove a warning ("warning: ISO C90 forbids
mixed declarations and code" with gcc 4.3.2).

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/sched/sch_hfsc.c: initialize parent's cl_cfmin properly in init_vf()
Michal Soltys [Mon, 30 Aug 2010 11:34:10 +0000 (11:34 +0000)]
net/sched/sch_hfsc.c: initialize parent's cl_cfmin properly in init_vf()

This patch fixes init_vf() function, so on each new backlog period parent's
cl_cfmin is properly updated (including further propgation towards the root),
even if the activated leaf has no upperlimit curve defined.

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopxa168_eth: fix a mdiobus leak
Denis Kirjanov [Sun, 29 Aug 2010 21:21:38 +0000 (21:21 +0000)]
pxa168_eth: fix a mdiobus leak

mdiobus resources must be released on exit

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Acked-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet sched: fix kernel leak in act_police
Jeff Mahoney [Tue, 31 Aug 2010 13:21:42 +0000 (13:21 +0000)]
net sched: fix kernel leak in act_police

While reviewing commit 1c40be12f7d8ca1d387510d39787b12e512a7ce8, I
 audited other users of tc_action_ops->dump for information leaks.

 That commit covered almost all of them but act_police still had a leak.

 opt.limit and opt.capab aren't zeroed out before the structure is
 passed out.

 This patch uses the C99 initializers to zero everything unused out.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovhost: stop worker only if created
Eric Dumazet [Tue, 31 Aug 2010 02:05:57 +0000 (02:05 +0000)]
vhost: stop worker only if created

Its currently illegal to call kthread_stop(NULL)

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMAINTAINERS: Add ehea driver as Supported
Breno Leitao [Wed, 1 Sep 2010 20:10:53 +0000 (13:10 -0700)]
MAINTAINERS: Add ehea driver as Supported

This change just add the IBM eHEA 10Gb network drivers as supported.

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Wed, 1 Sep 2010 19:01:05 +0000 (12:01 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6

14 years agoath9k_hw: fix parsing of HT40 5 GHz CTLs
Luis R. Rodriguez [Mon, 30 Aug 2010 23:26:33 +0000 (19:26 -0400)]
ath9k_hw: fix parsing of HT40 5 GHz CTLs

The 5 GHz CTL indexes were not being read for all hardware
devices due to the masking out through the CTL_MODE_M mask
being one bit too short. Without this the calibrated regulatory
maximum values were not being picked up when devices operate
on 5 GHz in HT40 mode. The final output power used for Atheros
devices is the minimum between the calibrated CTL values and
what CRDA provides.

Cc: stable@kernel.org [2.6.27+]
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agoath9k_hw: Fix EEPROM uncompress block reading on AR9003
Luis R. Rodriguez [Mon, 30 Aug 2010 23:26:32 +0000 (19:26 -0400)]
ath9k_hw: Fix EEPROM uncompress block reading on AR9003

The EEPROM is compressed on AR9003, upon decompression
the wrong upper limit was being used for the block which
prevented the 5 GHz CTL indexes from being used, which are
stored towards the end of the EEPROM block. This fix allows
the actual intended regulatory limits to be used on AR9003
hardware.

Cc: stable@kernel.org [2.6.36+]
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agowireless: register wiphy rfkill w/o holding cfg80211_mutex
John W. Linville [Mon, 30 Aug 2010 21:36:40 +0000 (17:36 -0400)]
wireless: register wiphy rfkill w/o holding cfg80211_mutex

Otherwise lockdep complains...

https://bugzilla.kernel.org/show_bug.cgi?id=17311

[ INFO: possible circular locking dependency detected ]
2.6.36-rc2-git4 #12
-------------------------------------------------------
kworker/0:3/3630 is trying to acquire lock:
 (rtnl_mutex){+.+.+.}, at: [<ffffffff813396c7>] rtnl_lock+0x12/0x14

but task is already holding lock:
 (rfkill_global_mutex){+.+.+.}, at: [<ffffffffa014b129>]
rfkill_switch_all+0x24/0x49 [rfkill]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (rfkill_global_mutex){+.+.+.}:
       [<ffffffff81079ad7>] lock_acquire+0x120/0x15b
       [<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
       [<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
       [<ffffffffa014b4ab>] rfkill_register+0x2b/0x29c [rfkill]
       [<ffffffffa0185ba0>] wiphy_register+0x1ae/0x270 [cfg80211]
       [<ffffffffa0206f01>] ieee80211_register_hw+0x1b4/0x3cf [mac80211]
       [<ffffffffa0292e98>] iwl_ucode_callback+0x9e9/0xae3 [iwlagn]
       [<ffffffff812d3e9d>] request_firmware_work_func+0x54/0x6f
       [<ffffffff81065d15>] kthread+0x8c/0x94
       [<ffffffff8100ac24>] kernel_thread_helper+0x4/0x10

-> #1 (cfg80211_mutex){+.+.+.}:
       [<ffffffff81079ad7>] lock_acquire+0x120/0x15b
       [<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
       [<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
       [<ffffffffa018605e>] cfg80211_get_dev_from_ifindex+0x1b/0x7c [cfg80211]
       [<ffffffffa0189f36>] cfg80211_wext_giwscan+0x58/0x990 [cfg80211]
       [<ffffffff8139a3ce>] ioctl_standard_iw_point+0x1a8/0x272
       [<ffffffff8139a529>] ioctl_standard_call+0x91/0xa7
       [<ffffffff8139a687>] T.723+0xbd/0x12c
       [<ffffffff8139a727>] wext_handle_ioctl+0x31/0x6d
       [<ffffffff8133014e>] dev_ioctl+0x63d/0x67a
       [<ffffffff8131afd9>] sock_ioctl+0x48/0x21d
       [<ffffffff81102abd>] do_vfs_ioctl+0x4ba/0x509
       [<ffffffff81102b5d>] sys_ioctl+0x51/0x74
       [<ffffffff81009e02>] system_call_fastpath+0x16/0x1b

-> #0 (rtnl_mutex){+.+.+.}:
       [<ffffffff810796b0>] __lock_acquire+0xa93/0xd9a
       [<ffffffff81079ad7>] lock_acquire+0x120/0x15b
       [<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
       [<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
       [<ffffffff813396c7>] rtnl_lock+0x12/0x14
       [<ffffffffa0185cb5>] cfg80211_rfkill_set_block+0x1a/0x7b [cfg80211]
       [<ffffffffa014aed0>] rfkill_set_block+0x80/0xd5 [rfkill]
       [<ffffffffa014b07e>] __rfkill_switch_all+0x3f/0x6f [rfkill]
       [<ffffffffa014b13d>] rfkill_switch_all+0x38/0x49 [rfkill]
       [<ffffffffa014b821>] rfkill_op_handler+0x105/0x136 [rfkill]
       [<ffffffff81060708>] process_one_work+0x248/0x403
       [<ffffffff81062620>] worker_thread+0x139/0x214
       [<ffffffff81065d15>] kthread+0x8c/0x94
       [<ffffffff8100ac24>] kernel_thread_helper+0x4/0x10

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
14 years agonetlink: Make NETLINK_USERSOCK work again.
David S. Miller [Tue, 31 Aug 2010 02:08:01 +0000 (19:08 -0700)]
netlink: Make NETLINK_USERSOCK work again.

Once we started enforcing the a nl_table[] entry exist for
a protocol, NETLINK_USERSOCK stopped working.  Add a dummy
table entry so that it works again.

Reported-by: Thomas Voegtle <tv@lio96.de>
Tested-by: Thomas Voegtle <tv@lio96.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoirda: Correctly clean up self->ias_obj on irda_bind() failure.
David S. Miller [Tue, 31 Aug 2010 01:35:24 +0000 (18:35 -0700)]
irda: Correctly clean up self->ias_obj on irda_bind() failure.

If irda_open_tsap() fails, the irda_bind() code tries to destroy
the ->ias_obj object by hand, but does so wrongly.

In particular, it fails to a) release the hashbin attached to the
object and b) reset the self->ias_obj pointer to NULL.

Fix both problems by using irias_delete_object() and explicitly
setting self->ias_obj to NULL, just as irda_release() does.

Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agowireless extensions: fix kernel heap content leak
Johannes Berg [Mon, 30 Aug 2010 10:24:54 +0000 (12:24 +0200)]
wireless extensions: fix kernel heap content leak

Wireless extensions have an unfortunate, undocumented
requirement which requires drivers to always fill
iwp->length when returning a successful status. When
a driver doesn't do this, it leads to a kernel heap
content leak when userspace offers a larger buffer
than would have been necessary.

Arguably, this is a driver bug, as it should, if it
returns 0, fill iwp->length, even if it separately
indicated that the buffer contents was not valid.

However, we can also at least avoid the memory content
leak if the driver doesn't do this by setting the iwp
length to max_tokens, which then reflects how big the
buffer is that the driver may fill, regardless of how
big the userspace buffer is.

To illustrate the point, this patch also fixes a
corresponding cfg80211 bug (since this requirement
isn't documented nor was ever pointed out by anyone
during code review, I don't trust all drivers nor
all cfg80211 handlers to implement it correctly).

Cc: stable@kernel.org [all the way back]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agoMAINTAINERS: change broken url for prism54
John W. Linville [Fri, 27 Aug 2010 13:44:12 +0000 (09:44 -0400)]
MAINTAINERS: change broken url for prism54

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agomac80211: delete work timer
Johannes Berg [Wed, 25 Aug 2010 12:47:38 +0000 (14:47 +0200)]
mac80211: delete work timer

The new workqueue changes helped me find this bug
that's been lingering since the changes to the work
processing in mac80211 -- the work timer is never
deleted properly. Do that to avoid having it fire
after all data structures have been freed. It can't
be re-armed because all it will do, if running, is
schedule the work, but that gets flushed later and
won't have anything to do since all work items are
gone by now (by way of interface removal).

Cc: stable@kernel.org [2.6.34+]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agop54: fix tx feedback status flag check
Christian Lamparter [Tue, 24 Aug 2010 20:54:05 +0000 (22:54 +0200)]
p54: fix tx feedback status flag check

Michael reported that p54* never really entered power
save mode, even tough it was enabled.

It turned out that upon a power save mode change the
firmware will set a special flag onto the last outgoing
frame tx status (which in this case is almost always the
designated PSM nullfunc frame). This flag confused the
driver; It erroneously reported transmission failures
to the stack, which then generated the next nullfunc.
and so on...

Cc: <stable@kernel.org>
Reported-by: Michael Buesch <mb@bu3sch.de>
Tested-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agoath5k: check return value of ieee80211_get_tx_rate
John W. Linville [Tue, 24 Aug 2010 19:27:34 +0000 (15:27 -0400)]
ath5k: check return value of ieee80211_get_tx_rate

This avoids a NULL pointer dereference as reported here:

https://bugzilla.redhat.com/show_bug.cgi?id=625889

When the WARN condition is hit in ieee80211_get_tx_rate, it will return
NULL.  So, we need to check the return value and avoid dereferencing it
in that case.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: stable@kernel.org
Acked-by: Bob Copeland <me@bobcopeland.com>
14 years agopcnet_cs: add new_id
Ken Kawasaki [Sat, 28 Aug 2010 12:45:01 +0000 (12:45 +0000)]
pcnet_cs: add new_id

pcnet_cs:
    add new_id: "KENTRONICS KEP-230" 10Base-T PCMCIA card.

Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/ipv4: Eliminate kstrdup memory leak
Julia Lawall [Sat, 28 Aug 2010 02:31:56 +0000 (19:31 -0700)]
net/ipv4: Eliminate kstrdup memory leak

The string clone is only used as a temporary copy of the argument val
within the while loop, and so it should be freed before leaving the
function.  The call to strsep, however, modifies clone, so a pointer to the
front of the string is kept in saved_clone, to make it possible to free it.

The sematic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E;
identifier l;
statement S;
@@

*x= \(kasprintf\|kstrdup\)(...);
...
if (x == NULL) S
... when != kfree(x)
    when != E = x
if (...) {
  <... when != kfree(x)
* goto l;
  ...>
* return ...;
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agolibertas: if_sdio: fix buffer alignment in struct if_sdio_card
Mike Rapoport [Sun, 22 Aug 2010 11:22:40 +0000 (14:22 +0300)]
libertas: if_sdio: fix buffer alignment in struct if_sdio_card

The commit 886275ce41a9751117367fb387ed171049eb6148 (param: lock
if_sdio's lbs_helper_name and lbs_fw_name against sysfs changes)
introduced new fields into the if_sdio_card structure. It caused
missalignment of the if_sdio_card.buffer field and failure at driver
load time:

  ~# modprobe libertas_sdio
  [   62.315124] libertas_sdio: Libertas SDIO driver
  [   62.319976] libertas_sdio: Copyright Pierre Ossman
  [   63.020629] DMA misaligned error with device 48
  [   63.025207] mmci-omap-hs mmci-omap-hs.1: unexpected dma status 800
  [   66.005035] libertas: command 0x0003 timed out
  [   66.009826] libertas: Timeout submitting command 0x0003
  [   66.016296] libertas: PREP_CMD: command 0x0003 failed: -110

Adding explicit alignment attribute for the if_sdio_card.buffer field
fixes this problem.

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Marek Vasut <marek.vasut@gmail.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agonet/caif/cfrfml.c: use asm/unaligned.h
Jeff Mahoney [Thu, 26 Aug 2010 23:11:08 +0000 (16:11 -0700)]
net/caif/cfrfml.c: use asm/unaligned.h

caif does not build on ia64 starting with 2.6.32-rc1.  Using
asm/unaligned.h instead of linux/unaligned/le_byteshift.h fixes the issue.

include/linux/unaligned/le_byteshift.h:40:50: error: redefinition of 'get_unaligned_le16'
include/linux/unaligned/le_byteshift.h:45:50: error: redefinition of 'get_unaligned_le32'
include/linux/unaligned/le_byteshift.h:50:50: error: redefinition of 'get_unaligned_le64'
include/linux/unaligned/le_byteshift.h:55:51: error: redefinition of 'put_unaligned_le16'
include/linux/unaligned/le_byteshift.h:60:51: error: redefinition of 'put_unaligned_le32'
include/linux/unaligned/le_byteshift.h:65:51: error: redefinition of 'put_unaligned_le64'
include/linux/unaligned/le_struct.h:31:51: note: previous definition of 'put_unaligned_le64' was here

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoax25: missplaced sock_put(sk)
Bernard Pidoux F6BVP [Thu, 26 Aug 2010 11:40:00 +0000 (11:40 +0000)]
ax25: missplaced sock_put(sk)

This patch moves a missplaced sock_put(sk) after
bh_unlock_sock(sk)
like in other parts of AX25 driver.

Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlge: reset the chip before freeing the buffers
Breno Leitao [Thu, 26 Aug 2010 08:27:58 +0000 (08:27 +0000)]
qlge: reset the chip before freeing the buffers

Qlge is freeing the buffers before stopping the card DMA, and
this can cause some severe error, as a EEH event on PPC.

This patch just stop the card and then free the resources.

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agol2tp: test for ethernet header in l2tp_eth_dev_recv()
Eric Dumazet [Wed, 25 Aug 2010 23:44:35 +0000 (23:44 +0000)]
l2tp: test for ethernet header in l2tp_eth_dev_recv()

close https://bugzilla.kernel.org/show_bug.cgi?id=16529

Before calling dev_forward_skb(), we should make sure skb head contains
at least an ethernet header, even if length included in upper layer said
so. Use pskb_may_pull() to make sure this ethernet header is present in
skb head.

Reported-by: Thomas Heil <heil@terminal-consulting.de>
Reported-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: select(writefds) don't hang up when a peer close connection
KOSAKI Motohiro [Tue, 24 Aug 2010 16:05:48 +0000 (16:05 +0000)]
tcp: select(writefds) don't hang up when a peer close connection

This issue come from ruby language community. Below test program
hang up when only run on Linux.

% uname -mrsv
Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
% ruby -rsocket -ve '
BasicSocket.do_not_reverse_lookup = true
serv = TCPServer.open("127.0.0.1", 0)
s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
s2 = serv.accept
s2.close
s1.write("a") rescue p $!
s1.write("a") rescue p $!
Thread.new {
  s1.write("a")
}.join'
ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
#<Errno::EPIPE: Broken pipe>
[Hang Here]

FreeBSD, Solaris, Mac doesn't. because Ruby's write() method call
select() internally. and tcp_poll has a bug.

SUS defined 'ready for writing' of select() as following.

|  A descriptor shall be considered ready for writing when a call to an output
|  function with O_NONBLOCK clear would not block, whether or not the function
|  would transfer data successfully.

That said, EPIPE situation is clearly one of 'ready for writing'.

We don't have read-side issue because tcp_poll() already has read side
shutdown care.

|        if (sk->sk_shutdown & RCV_SHUTDOWN)
|                mask |= POLLIN | POLLRDNORM | POLLRDHUP;

So, Let's insert same logic in write side.

- reference url
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31065
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31068

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: fix three tcp sysctls tuning
Eric Dumazet [Thu, 26 Aug 2010 06:02:17 +0000 (23:02 -0700)]
tcp: fix three tcp sysctls tuning

As discovered by Anton Blanchard, current code to autotune
tcp_death_row.sysctl_max_tw_buckets, sysctl_tcp_max_orphans and
sysctl_max_syn_backlog makes little sense.

The bigger a page is, the less tcp_max_orphans is : 4096 on a 512GB
machine in Anton's case.

(tcp_hashinfo.bhash_size * sizeof(struct inet_bind_hashbucket))
is much bigger if spinlock debugging is on. Its wrong to select bigger
limits in this case (where kernel structures are also bigger)

bhash_size max is 65536, and we get this value even for small machines.

A better ground is to use size of ehash table, this also makes code
shorter and more obvious.

Based on a patch from Anton, and another from David.

Reported-and-tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: Combat per-cpu skew in orphan tests.
David S. Miller [Wed, 25 Aug 2010 09:27:49 +0000 (02:27 -0700)]
tcp: Combat per-cpu skew in orphan tests.

As reported by Anton Blanchard when we use
percpu_counter_read_positive() to make our orphan socket limit checks,
the check can be off by up to num_cpus_online() * batch (which is 32
by default) which on a 128 cpu machine can be as large as the default
orphan limit itself.

Fix this by doing the full expensive sum check if the optimized check
triggers.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
14 years agopxa168_eth: silence gcc warnings
Dan Carpenter [Tue, 24 Aug 2010 06:55:05 +0000 (06:55 +0000)]
pxa168_eth: silence gcc warnings

Casting "pep->tx_desc_dma" to to a struct tx_desc pointer makes gcc
complain:

drivers/net/pxa168_eth.c:657: warning:
cast to pointer from integer of different size

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopxa168_eth: update call to phy_mii_ioctl()
Dan Carpenter [Tue, 24 Aug 2010 06:54:20 +0000 (06:54 +0000)]
pxa168_eth: update call to phy_mii_ioctl()

The phy_mii_ioctl() function changed recently.  It now takes a struct
ifreq pointer directly.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopxa168_eth: fix error handling in prope
Dan Carpenter [Tue, 24 Aug 2010 06:53:33 +0000 (06:53 +0000)]
pxa168_eth: fix error handling in prope

A couple issues here:
* Some resources weren't released.
* If alloc_etherdev() failed it would have caused a NULL dereference
  because "pep" would be null when we checked "if (pep->clk)".
* Also it's better to propagate the error codes from mdiobus_register()
  instead of just returning -ENOMEM.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopxa168_eth: remove unneeded null check
Dan Carpenter [Tue, 24 Aug 2010 06:52:46 +0000 (06:52 +0000)]
pxa168_eth: remove unneeded null check

"pep->pd" isn't checked consistently in this function.  For example it's
dereferenced unconditionally on the next line after the end of the if
condition.  This function is only called from pxa168_eth_probe() and
pep->pd is always non-NULL so I removed the check.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agophylib: Fix race between returning phydev and calling adjust_link
Anton Vorontsov [Tue, 24 Aug 2010 21:46:12 +0000 (14:46 -0700)]
phylib: Fix race between returning phydev and calling adjust_link

It is possible that phylib will call adjust_link before returning
from {,of_}phy_connect(), which may cause the following [very rare,
though] oops upon reopening the device:

  Unable to handle kernel paging request for data at address 0x0000024c
  Oops: Kernel access of bad area, sig: 11 [#1]
  PREEMPT SMP NR_CPUS=2 LTT NESTING LEVEL : 0
  P1021 RDB
  Modules linked in:
  NIP: c0345dac LR: c0345dac CTR: c0345d84
  TASK = dffab6b0[30] 'events/0' THREAD: c0d24000 CPU: 0
  [...]
  NIP [c0345dac] adjust_link+0x28/0x19c
  LR [c0345dac] adjust_link+0x28/0x19c
  Call Trace:
  [c0d25f00] [000045e1] 0x45e1 (unreliable)
  [c0d25f30] [c036c158] phy_state_machine+0x3ac/0x554
  [...]

Here is why. Drivers store phydev in their private structures, e.g.
gianfar driver:

static int init_phy(struct net_device *dev)
{
...
priv->phydev = of_phy_connect(...);
...
}

So that adjust_link could retrieve it back:

static void adjust_link(struct net_device *dev)
{
...
struct phy_device *phydev = priv->phydev;
...
}

If the device has been opened before, then phydev->state is set to
PHY_HALTED (or undefined if the driver didn't call phy_stop()).

Now, phy_connect starts the PHY state machine before returning phydev to
the driver:

phy_start_machine(phydev, NULL);

if (phydev->irq > 0)
phy_start_interrupts(phydev);

return phydev;

The time between 'phy_start_machine()' and 'return phydev' is undefined.
The start machine routine delays execution for 1 second, which is enough
for most cases. But under heavy load, or if you're unlucky, it is quite
possible that PHY state machine will execute before phy_connect()
returns, and so adjust_link callback will try to dereference phydev,
which is not yet ready.

To fix the issue, simply initialize the PHY's state to PHY_READY during
phy_attach(). This will ensure that phylib won't call adjust_link before
phy_start().

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocaif-driver: add HAS_DMA dependency
Heiko Carstens [Tue, 24 Aug 2010 19:21:13 +0000 (12:21 -0700)]
caif-driver: add HAS_DMA dependency

Fix this error on an s390 allyesconfig build:

linux-2.6/drivers/net/caif/caif_spi.c:98:
    undefined reference to `dma_free_coherent'

Cc: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago3c59x: Fix deadlock between boomerang_interrupt and boomerang_start_tx
Neil Horman [Wed, 11 Aug 2010 05:12:57 +0000 (05:12 +0000)]
3c59x: Fix deadlock between boomerang_interrupt and boomerang_start_tx

If netconsole is in use, there is a possibility for deadlock in 3c59x between
boomerang_interrupt and boomerang_start_xmit.  Both routines take the vp->lock,
and if netconsole is in use, a pr_* call from the boomerang_interrupt routine
will result in the netconsole code attempting to trnasmit an skb, which can try
to take the same spin lock, resulting in deadlock.

The fix is pretty straightforward.  This patch allocats a bit in the 3c59x
private structure to indicate that its handling an interrupt.  If we get into
the transmit routine and that bit is set, we can be sure that we have recursed
and will deadlock if we continue, so instead we just return NETDEV_TX_BUSY, so
the stack requeues the skb to try again later.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: fix poll implementation
Yinglin Luan [Sun, 22 Aug 2010 21:57:56 +0000 (21:57 +0000)]
qlcnic: fix poll implementation

Function qlcnic_intr has pointer to qlcnic_host_sds_ring
as second parameter not pointer to qlcnic_adapter.

Signed-off-by: Yinglin Luan <synmyth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetxen: fix poll implementation
Yinglin Luan [Sun, 22 Aug 2010 21:56:19 +0000 (21:56 +0000)]
netxen: fix poll implementation

Function netxen_intr has pointer to nx_host_sds_ring
as second parameter not pointer to netxen_adapter.

Signed-off-by: Yinglin Luan <synmyth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge: netfilter: fix a memory leak
Changli Gao [Sun, 22 Aug 2010 19:03:26 +0000 (19:03 +0000)]
bridge: netfilter: fix a memory leak

nf_bridge_alloc() always reset the skb->nf_bridge, so we should always
put the old one.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetfilter: fix CONFIG_COMPAT support
Florian Westphal [Mon, 23 Aug 2010 21:41:22 +0000 (14:41 -0700)]
netfilter: fix CONFIG_COMPAT support

commit f3c5c1bfd430858d3a05436f82c51e53104feb6b
(netfilter: xtables: make ip_tables reentrant) forgot to
also compute the jumpstack size in the compat handlers.

Result is that "iptables -I INPUT -j userchain" turns into -j DROP.

Reported by Sebastian Roesner on #netfilter, closes
http://bugzilla.netfilter.org/show_bug.cgi?id=669.

Note: arptables change is compile-tested only.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoisdn/avm: fix build when PCMCIA is not enabled
Randy Dunlap [Thu, 19 Aug 2010 07:07:23 +0000 (07:07 +0000)]
isdn/avm: fix build when PCMCIA is not enabled

Why wouldn't kconfig symbol ISDN_DRV_AVMB1_B1PCMCIA also depend on
PCMCIA?

Fix build for PCMCIA not enabled:

ERROR: "b1_free_card" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1ctl_proc_fops" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_reset_ctr" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_load_firmware" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_send_message" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_release_appl" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_register_appl" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_getrevision" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_detect" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_interrupt" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!
ERROR: "b1_alloc_card" [drivers/isdn/hardware/avm/b1pcmcia.ko] undefined!

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Carsten Paeth <calle@calle.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoheader: fix broken headers for user space
Changli Gao [Sun, 22 Aug 2010 17:25:05 +0000 (17:25 +0000)]
header: fix broken headers for user space

__packed is only defined in kernel space, so we should use
__attribute__((packed)) for the code shared between kernel and user space.

Two __attribute() annotations are replaced with __attribute__() too.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Thu, 19 Aug 2010 23:54:13 +0000 (16:54 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6

14 years agoe1000e: don't check for alternate MAC addr on parts that don't support it
Bruce Allan [Thu, 19 Aug 2010 22:48:52 +0000 (15:48 -0700)]
e1000e: don't check for alternate MAC addr on parts that don't support it

From: Bruce Allan <bruce.w.allan@intel.com>

The alternate MAC address feature is only supported by 80003ES2LAN and
82571 LOMs as well as a couple 82571 mezzanine cards.  Checking for an
alternate MAC address on other parts can fail leading to the driver not
able to load.  This patch limits the check for an alternate MAC address
to be done only for parts that support the feature.

This issue has been around since support for the feature was introduced
to the e1000e driver in 2.6.34.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Reported-by: Fabio Varesano <fax8@users.sourceforge.net>
Cc: stable@kernel.org
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: disable ASPM L1 on 82573
Bruce Allan [Thu, 19 Aug 2010 22:48:30 +0000 (15:48 -0700)]
e1000e: disable ASPM L1 on 82573

On the e1000-devel mailing list, Nils Faerber reported latency issues with
the 82573 LOM on a ThinkPad X60.  It was found to be caused by ASPM L1;
disabling it resolves the latency.  The issue is present in kernels back
to 2.6.34 and possibly 2.6.33.

Reported-by: Nils Faerber <nils.faerber@kernelconcepts.de>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoll_temac: Fix poll implementation
Michal Simek [Wed, 18 Aug 2010 00:26:34 +0000 (00:26 +0000)]
ll_temac: Fix poll implementation

Functions ll_temac_rx_irq and ll_temac_tx_irq
have pointer to net_device as second parameter not
pointer to temac_local.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetxen: fix a race in netxen_nic_get_stats()
Eric Dumazet [Wed, 18 Aug 2010 02:29:30 +0000 (02:29 +0000)]
netxen: fix a race in netxen_nic_get_stats()

Dont clear netdev->stats, it might give transient wrong values to
concurrent stat readers.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlnic: fix a race in qlcnic_get_stats()
Eric Dumazet [Wed, 18 Aug 2010 00:42:48 +0000 (00:42 +0000)]
qlnic: fix a race in qlcnic_get_stats()

Dont clear netdev->stats, it might give transient wrong values to
concurrent stat readers.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoirda: fix a race in irlan_eth_xmit()
Eric Dumazet [Wed, 18 Aug 2010 00:24:43 +0000 (00:24 +0000)]
irda: fix a race in irlan_eth_xmit()

After skb is queued, its illegal to dereference it.

Cache skb->len into a temporary variable.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: sh_eth: remove unused variable
Kuninori Morimoto [Thu, 19 Aug 2010 07:39:45 +0000 (00:39 -0700)]
net: sh_eth: remove unused variable

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetxen: update version 4.0.74
Amit Kumar Salecha [Tue, 17 Aug 2010 20:51:52 +0000 (20:51 +0000)]
netxen: update version 4.0.74

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetxen: fix inconsistent lock state
Amit Kumar Salecha [Tue, 17 Aug 2010 20:51:51 +0000 (20:51 +0000)]
netxen: fix inconsistent lock state

Spin lock rds_ring->lock is used in poll routine, so other users should
use spin_lock_bh(). While posting rx buffers from netxen_nic_attach,
rds_ring->lock is not required, so cleaning it instead of fixing it by
spin_lock_bh().

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovlan: Match underlying dev carrier on vlan add
Phil Oester [Tue, 17 Aug 2010 18:45:08 +0000 (18:45 +0000)]
vlan: Match underlying dev carrier on vlan add

When adding a new vlan, if the underlying interface has no carrier,
then the newly added vlan interface should also have no carrier.
At present, this is not true - the newly added vlan is added with
carrier up.  Fix by checking state of real device.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoibmveth: Fix opps during MTU change on an active device
Robert Jennings [Tue, 17 Aug 2010 09:15:45 +0000 (09:15 +0000)]
ibmveth: Fix opps during MTU change on an active device

This fixes the following opps which can occur when trying to deallocate
receive buffer pools when changing the MTU of an active ibmveth device.

Oops: Kernel access of bad area, sig: 11 [#1]
NIP: d000000004db00e8 LR: d000000004db00ac CTR: 0000000000591038
REGS: c00000007fff39d0 TRAP: 0300   Not tainted  (2.6.36-rc1)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 22248244  XER: 00000002
DAR: 0000000000000488, DSISR: 0000000042000000
TASK = c00000007c463790[6531] 'netserver' THREAD: c00000007a154000 CPU: 0
GPR00: 0000000000000000 c00000007fff3c50 d000000004dbd360 0000000000000001
GPR04: 0000000000000001 1fffffffffffffff 000000000000043c c00000007a8e9f60
GPR08: c00000007a8e9e20 0000000000000245 0000000000000488 0000000000000000
GPR12: 00000000000000c0 c000000006d70000 c00000007bfec098 c00000007bfebc2c
GPR16: c00000007a157c78 0000000000000000 0000000000000001 0000000000000000
GPR20: 0000000000000001 0000000000000010 c000000000b51180 c00000007a8e9d90
GPR24: c00000007a8e9da0 c00000007a8e9580 00000000000005ea 00000000000002ff
GPR28: 0000000000000004 0000000000000080 c000000000a946f8 c00000007a8e9d80
NIP [d000000004db00e8] .ibmveth_remove_buffer_from_pool+0xe8/0x130 [ibmveth]
LR [d000000004db00ac] .ibmveth_remove_buffer_from_pool+0xac/0x130 [ibmveth]
Call Trace:
[c00000007fff3c50] [d000000004db00ac] .ibmveth_remove_buffer_from_pool+0xac/0x130 [ibmveth] (unreliable)
[c00000007fff3cf0] [d000000004db31dc] .ibmveth_poll+0x30c/0x460 [ibmveth]
[c00000007fff3dd0] [c00000000042c4b8] .net_rx_action+0x178/0x278
[c00000007fff3eb0] [c000000000093cf0] .__do_softirq+0x118/0x1f8
[c00000007fff3f90] [c00000000002ab3c] .call_do_softirq+0x14/0x24
[c00000007a157600] [c00000000000e3e4] .do_softirq+0xec/0x110
[c00000007a1576a0] [c000000000093394] .local_bh_enable_ip+0xb4/0xe0
[c00000007a157720] [c0000000004f0bac] ._raw_spin_unlock_bh+0x3c/0x50
[c00000007a157790] [c0000000004186e0] .release_sock+0x158/0x188
[c00000007a157840] [c000000000479660] .tcp_recvmsg+0x560/0x9b8
[c00000007a157970] [c0000000004a0d78] .inet_recvmsg+0x80/0xd8
[c00000007a157a00] [c000000000413e28] .sock_recvmsg+0x128/0x178
[c00000007a157bf0] [c0000000004164ac] .SyS_recvfrom+0xb4/0x148
[c00000007a157d70] [c000000000411f3c] .SyS_socketcall+0x274/0x360
[c00000007a157e30] [c0000000000085b4] syscall_exit+0x0/0x40

Reported-by: Rafael Camarda Silva Folco <rfolco@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoehea: Fix synchronization between HW and SW send queue
Andre Detsch [Tue, 17 Aug 2010 05:49:12 +0000 (05:49 +0000)]
ehea: Fix synchronization between HW and SW send queue

ehea: Fix synchronization between HW and SW send queue

When memory is added to / removed from a partition via the Memory DLPAR
mechanism, the eHEA driver has to do a couple of things to reflect the
memory change in its own IO address translation tables. This involves
stopping and restarting the HW queues.
During this operation, it is possible that HW and SW pointer into these
queues get out of sync. This results in a situation where packets that
are attached to a send queue are not transmitted immediately, but
delayed until further X packets have been put on the queue.

This patch detects such loss of synchronization, and resets the ehea
port when needed.

Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Update bnx2x version to 1.52.53-4
Yaniv Rosner [Mon, 16 Aug 2010 06:34:07 +0000 (06:34 +0000)]
bnx2x: Update bnx2x version to 1.52.53-4

Update bnx2x version to 1.52.53-4

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2x: Fix PHY locking problem
Yaniv Rosner [Mon, 16 Aug 2010 06:34:06 +0000 (06:34 +0000)]
bnx2x: Fix PHY locking problem

PHY locking is required between two ports for some external PHYs. Since
initialization was done in the common init function (called only on the
first port initialization) rather than in the port init function, there
was in fact no PHY locking between the ports.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agords: fix a leak of kernel memory
Eric Dumazet [Mon, 16 Aug 2010 03:25:00 +0000 (03:25 +0000)]
rds: fix a leak of kernel memory

struct rds_rdma_notify contains a 32 bits hole on 64bit arches,
make sure it is zeroed before copying it to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetlink: fix compat recvmsg
Johannes Berg [Sun, 15 Aug 2010 21:20:44 +0000 (21:20 +0000)]
netlink: fix compat recvmsg

Since
commit 1dacc76d0014a034b8aca14237c127d7c19d7726
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Wed Jul 1 11:26:02 2009 +0000

    net/compat/wext: send different messages to compat tasks

we had a race condition when setting and then
restoring frag_list. Eric attempted to fix it,
but the fix created even worse problems.

However, the original motivation I had when I
added the code that turned out to be racy is
no longer clear to me, since we only copy up
to skb->len to userspace, which doesn't include
the frag_list length. As a result, not doing
any frag_list clearing and restoring avoids
the race condition, while not introducing any
other problems.

Additionally, while preparing this patch I found
that since none of the remaining netlink code is
really aware of the frag_list, we need to use the
original skb's information for packet information
and credentials. This fixes, for example, the
group information received by compat tasks.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org [2.6.31+, for 2.6.35 revert 1235f504aa]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetfilter: fix userspace header warning
Sam Ravnborg [Sun, 15 Aug 2010 10:03:57 +0000 (10:03 +0000)]
netfilter: fix userspace header warning

"make headers_check" issued the following warning:

  CHECK   include/linux/netfilter (64 files)
usr/include/linux/netfilter/xt_ipvs.h:19: found __[us]{8,16,32,64} type without #include <linux/types.h>

Fix this by as suggested including linux/types.h.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: add Fast Ethernet driver for PXA168.
Sachin Sanap [Fri, 13 Aug 2010 21:22:49 +0000 (21:22 +0000)]
net: add Fast Ethernet driver for PXA168.

Signed-off-by: Sachin Sanap <ssanap@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoiwlwifi: use long monitor timer for 5300 series
Wey-Yi Guy [Wed, 18 Aug 2010 19:53:28 +0000 (12:53 -0700)]
iwlwifi: use long monitor timer for 5300 series

For 5000 series of devices, use long monitor timer to check
stuck tx queues.

This modification apply to all the 5000 series including 5300 and others.

Cc: stable@kernel.org [2.6.35]
Reported-by: drago01 <drago01@gmail.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agodrivers/net/wireless: Restore upper case words in wiphy_<level> messages
Joe Perches [Thu, 12 Aug 2010 02:11:19 +0000 (19:11 -0700)]
drivers/net/wireless: Restore upper case words in wiphy_<level> messages

Commit c96c31e499b70964cfc88744046c998bb710e4b8
"(drivers/net/wireless: Use wiphy_<level>)"
inadvertently changed some upper case words to
lower case.  Restore the original case.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agonet: Fix a memmove bug in dev_gro_receive()
Jarek Poplawski [Wed, 11 Aug 2010 02:02:10 +0000 (02:02 +0000)]
net: Fix a memmove bug in dev_gro_receive()

>Xin Xiaohui wrote:
> I looked into the code dev_gro_receive(), found the code here:
> if the frags[0] is pulled to 0, then the page will be released,
> and memmove() frags left.
> Is that right? I'm not sure if memmove do right or not, but
> frags[0].size is never set after memove at least. what I think
> a simple way is not to do anything if we found frags[0].size == 0.
> The patch is as followed.
...

This version of the patch fixes the bug directly in memmove.

Reported-by: "Xin, Xiaohui" <xiaohui.xin@intel.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet sched: fix some kernel memory leaks
Eric Dumazet [Mon, 16 Aug 2010 20:04:22 +0000 (20:04 +0000)]
net sched: fix some kernel memory leaks

We leak at least 32bits of kernel memory to user land in tc dump,
because we dont init all fields (capab ?) of the dumped structure.

Use C99 initializers so that holes and non explicit fields are zeroed.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
Eric Dumazet [Mon, 16 Aug 2010 10:22:10 +0000 (10:22 +0000)]
netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive

After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
bottom half more than necessary), lockdep can raise a warning
because we attempt to lock a spinlock with BH enabled, while
the same lock is usually locked by another cpu in a softirq context.

Disable again BH to avoid these lockdep warnings.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Diagnosed-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoiwlwifi: fix 3945 filter flags
Johannes Berg [Tue, 17 Aug 2010 09:24:01 +0000 (11:24 +0200)]
iwlwifi: fix 3945 filter flags

Applying the filter flags directly as done since

commit 3474ad635db371b0d8d0ee40086f15d223d5b6a4
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Thu Apr 29 04:43:05 2010 -0700

    iwlwifi: apply filter flags directly

broke 3945 under some unknown circumstances, as
reported by Alex.

Since I want to keep the direct application of
filter flags on iwlagn, duplicate the code into
both 3945 and agn and remove committing the
RXON that broke things from the 3945 version.

Cc: stable@kernel.org [2.6.35]
Reported-by: Alex Romosan <romosan@sycorax.lbl.gov>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agoipw2100: don't sync status queue entries
John W. Linville [Fri, 13 Aug 2010 22:47:33 +0000 (18:47 -0400)]
ipw2100: don't sync status queue entries

These are allocated with pci_alloc_consistent, so calling
pci_dma_sync_single_for_cpu is incorrect usage of the API.  Remove this
misuse and consequently avoid the following backtrace:

WARNING: at lib/dma-debug.c:902 check_sync+0xce/0x43a()
Hardware name: 2373HU6
ipw2100 0000:02:02.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x0000000034e88008] [size=8 bytes]
Modules linked in: microcode ipw2100(+) snd_seq_device ppdev libipw nsc_ircc snd_pcm lib80211 video output irda parport_pc cfg80211 parport thinkpad_acpi e1000 iTCO_wdt crc_ccitt snd_timer iTCO_vendor_support snd i2c_i801 pcspkr rfkill soundcore joydev snd_page_alloc yenta_socket radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Tainted: G        W   2.6.35-wl+ #8
Call Trace:
 [<c043aa42>] warn_slowpath_common+0x6a/0x7f
 [<c05d252a>] ? check_sync+0xce/0x43a
 [<c043aaca>] warn_slowpath_fmt+0x2b/0x2f
 [<c05d252a>] check_sync+0xce/0x43a
 [<c046189a>] ? print_lock_contention_bug+0x11/0xb2
 [<c05d2b6f>] debug_dma_sync_single_for_cpu+0x47/0x49
 [<c06cbd3c>] ? ehci_irq+0x31/0x331
 [<f82a224a>] ? ipw2100_irq_tasklet+0x24/0x5e9 [ipw2100]
 [<f82a224a>] ? ipw2100_irq_tasklet+0x24/0x5e9 [ipw2100]
 [<f82a221d>] pci_dma_sync_single_for_cpu.clone.1+0x42/0x4b [ipw2100]
 [<f82a23a2>] ipw2100_irq_tasklet+0x17c/0x5e9 [ipw2100]
 [<c043fd87>] tasklet_action+0x78/0xcb
 [<c0440293>] __do_softirq+0xc4/0x183
 [<c044038d>] do_softirq+0x3b/0x5f
 [<c04404d0>] irq_exit+0x3a/0x6d
 [<c0404423>] do_IRQ+0x8b/0x9f
 [<c04038b5>] common_interrupt+0x35/0x3c
 [<c062ecfa>] ? acpi_idle_enter_simple+0xfe/0x13c
 [<c045007b>] ? exit_itimers+0x2d/0x73
 [<c062ecfc>] ? acpi_idle_enter_simple+0x100/0x13c
 [<c070bf10>] cpuidle_idle_call+0x78/0xdc
 [<c040251c>] cpu_idle+0x9b/0xb7
 [<c07b1dd2>] rest_init+0xa6/0xab
 [<c0a4b96d>] start_kernel+0x389/0x38e
 [<c0a4b0c9>] i386_start_kernel+0xc9/0xd0

Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Mon, 16 Aug 2010 20:56:01 +0000 (13:56 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6

14 years agoRevert "netlink: netlink_recvmsg() fix"
David S. Miller [Mon, 16 Aug 2010 06:21:50 +0000 (23:21 -0700)]
Revert "netlink: netlink_recvmsg() fix"

This reverts commit 1235f504aaba2ebeabc863fdb3ceac764a317d47.

It causes regressions worse than the problem it was trying
to fix.  Eric will try to solve the problem another way.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: remove sysctl jiffies conversion on gc_elasticity and min_adv_mss
Min Zhang [Sun, 15 Aug 2010 05:42:51 +0000 (22:42 -0700)]
ipv6: remove sysctl jiffies conversion on gc_elasticity and min_adv_mss

sysctl output ipv6 gc_elasticity and min_adv_mss as values divided by
HZ. However, they are not in unit of jiffies, since ip6_rt_min_advmss
refers to packet size and ip6_rt_fc_elasticity is used as scaler as in
expire>>ip6_rt_gc_elasticity, so replace the jiffies conversion
handler will regular handler for them.

This has impact on scripts that are currently working assuming the
divide by HZ, will yield different results with this patch in place.

Signed-off-by: Min Zhang <mzhang@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoxfrm: Use GFP_ATOMIC in xfrm_compile_policy
Herbert Xu [Sun, 15 Aug 2010 05:38:09 +0000 (22:38 -0700)]
xfrm: Use GFP_ATOMIC in xfrm_compile_policy

As xfrm_compile_policy runs within a read_lock, we cannot use
GFP_KERNEL for memory allocations.

Reported-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoiwlwifi: use long monitor timer to avoid un-necessary reload
Wey-Yi Guy [Fri, 23 Jul 2010 20:19:40 +0000 (13:19 -0700)]
iwlwifi: use long monitor timer to avoid un-necessary reload

For 5000 and 6000g2b series of devices, use long monitor timer to check
stuck tx queues.

.6000g2b series device, it is WiFi/BT combo device, there are some cases,
tx queues are not move for a period of time because the WiFi/BT coex.

.5000 series device, it is being reported firmware got reload more
often than necessary, so extend the timer to avoid un-necessary reload.

Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
14 years agoiwlwifi: long monitor timer
Wey-Yi Guy [Fri, 23 Jul 2010 20:19:39 +0000 (13:19 -0700)]
iwlwifi: long monitor timer

Change the name for monitor timer, also adding define for long monitor
timer; long monitor timer can be used for the type of devices require longer
time to determine the uCode is stuck on tx and needed reload.

Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
14 years agoath5k: disable ASPM L0s for all cards
Maxim Levitsky [Fri, 13 Aug 2010 15:27:28 +0000 (11:27 -0400)]
ath5k: disable ASPM L0s for all cards

Atheros PCIe wireless cards handled by ath5k do require L0s disabled.
For distributions shipping with CONFIG_PCIEASPM (this will be enabled
by default in the future in 2.6.36) this will also mean both L1 and L0s
will be disabled when a pre 1.1 PCIe device is detected. We do know L1
works correctly even for all ath5k pre 1.1 PCIe devices though but cannot
currently undue the effect of a blacklist, for details you can read
pcie_aspm_sanity_check() and see how it adjusts the device link
capability.

It may be possible in the future to implement some PCI API to allow
drivers to override blacklists for pre 1.1 PCIe but for now it is
best to accept that both L0s and L1 will be disabled completely for
distributions shipping with CONFIG_PCIEASPM rather than having this
issue present. Motivation for adding this new API will be to help
with power consumption for some of these devices.

Example of issues you'd see:

  - On the Acer Aspire One (AOA150, Atheros Communications Inc. AR5001
    Wireless Network Adapter [168c:001c] (rev 01)) doesn't work well
    with ASPM enabled, the card will eventually stall on heavy traffic
    with often 'unsupported jumbo' warnings appearing. Disabling
    ASPM L0s in ath5k fixes these problems.

  - On the same card you would see a storm of RXORN interrupts
    even though medium is idle.

Credit for root causing and fixing the bug goes to Jussi Kivilinna.

Cc: David Quan <David.Quan@atheros.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Tim Gardner <tim.gardner@canonical.com>
Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Cc: stable@kernel.org
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>