kernel/kernel-generic.git
18 years ago[DECNET]: Only use local routers
Patrick Caulfield [Tue, 3 Jan 2006 22:24:02 +0000 (14:24 -0800)]
[DECNET]: Only use local routers

The attached patch makes DECnet routing only use routers from the same
area - rather than the highest rated router seen.

In theory there should not be an out-of-area router on a local network
but some networks are bridged rather than properly routed. VMS seems
to behave similarly: if I bring up a VMS node with no router then it
can't see anything else on the global network.

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPVS]: Cleanup IP_VS_DBG statements.
Roberto Nibali [Tue, 3 Jan 2006 22:22:59 +0000 (14:22 -0800)]
[IPVS]: Cleanup IP_VS_DBG statements.

From: Roberto Nibali <ratz@drugphish.ch>

The attached patch (against current -GIT) is a cleanup patch which does
the following:

o lookup debug messages shifted back to 9
o added more informational value to flags and refcnt since those
entries can be in multiple referenced structures
o cleanup 80 char violation

It's the prepatch to the session pool implementation and helps very much
to debug and monitor important variables and structures regarding the
threshold limitation and persistency without the thousands of lookup
messages which no one is interested in.

Signed-off-by: Horms <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TG3]: fixup tot_len calculation
Alexey Dobriyan [Tue, 3 Jan 2006 22:19:25 +0000 (14:19 -0800)]
[TG3]: fixup tot_len calculation

Turning struct iphdr::tot_len into __be16 added sparse warning.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Add a dev_ioctl() fallback to sock_ioctl()
Christoph Hellwig [Tue, 3 Jan 2006 22:18:33 +0000 (14:18 -0800)]
[NET]: Add a dev_ioctl() fallback to sock_ioctl()

Currently all network protocols need to call dev_ioctl as the default
fallback in their ioctl implementations.  This patch adds a fallback
to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
This way all the protocol ioctl handlers can be simplified and we don't
need to export dev_ioctl.
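
The fallback described above can be sketched in user space as follows.
This is only an illustration, not the kernel code: every sketch_* name
and both command numbers are hypothetical, and SKETCH_ENOIOCTLCMD simply
mirrors the kernel-internal ENOIOCTLCMD value (515).

```c
#include <assert.h>

/* Mirrors the kernel-internal ENOIOCTLCMD; the SKETCH_ names are assumptions. */
#define SKETCH_ENOIOCTLCMD 515
#define SKETCH_EINVAL      22

typedef int (*sketch_ioctl_fn)(unsigned int cmd, unsigned long arg);

/* Hypothetical protocol handler: it only knows one command and no longer
 * calls the device layer itself. */
static int sketch_proto_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)arg;
    if (cmd == 0x1001)
        return 0;
    return -SKETCH_ENOIOCTLCMD;   /* "not mine, let the caller fall back" */
}

/* Hypothetical stand-in for the device-level handler. */
static int sketch_dev_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)arg;
    if (cmd == 0x2001)
        return 0;
    return -SKETCH_EINVAL;
}

/* Generic socket layer: try the protocol first, then fall back once. */
static int sketch_sock_ioctl(sketch_ioctl_fn proto, unsigned int cmd,
                             unsigned long arg)
{
    int err = proto(cmd, arg);

    if (err == -SKETCH_ENOIOCTLCMD)
        err = sketch_dev_ioctl(cmd, arg);
    return err;
}
```

With this shape the fallback lives in one place, so each protocol
handler only has to return the "unknown command" code.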

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETROM]: Remove unessecary lock_sock calls in netrom_ioctl()
Christoph Hellwig [Tue, 3 Jan 2006 22:14:46 +0000 (14:14 -0800)]
[NETROM]: Remove unessecary lock_sock calls in netrom_ioctl()

lock_sock is needed only in very few cases, so do it there instead of
around the switch statement.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETLINK] genetlink: fix cmd type in genl_ops to be consistent to u8
Per Liden [Tue, 3 Jan 2006 22:13:29 +0000 (14:13 -0800)]
[NETLINK] genetlink: fix cmd type in genl_ops to be consistent to u8

Signed-off-by: Per Liden <per.liden@ericsson.com>
ACKed-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_UNIX]: Convert to use a spinlock instead of rwlock
Benjamin LaHaise [Tue, 3 Jan 2006 22:10:46 +0000 (14:10 -0800)]
[AF_UNIX]: Convert to use a spinlock instead of rwlock

From: Benjamin LaHaise <bcrl@kvack.org>

In af_unix, a rwlock is used to protect internal state.  At least on my
P4 with HT it is faster to use a spinlock due to the simpler memory
barrier used to unlock.  This patch raises bw_unix to ~690K/s.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Speed up __alloc_skb()
Benjamin LaHaise [Tue, 3 Jan 2006 22:06:50 +0000 (14:06 -0800)]
[NET]: Speed up __alloc_skb()

From: Benjamin LaHaise <bcrl@kvack.org>

In __alloc_skb(), the use of skb_shinfo() which casts a u8 * to the
shared info structure results in gcc being forced to do a reload of the
pointer since it has no information on possible aliasing.  Fix this by
using a pointer to refer to skb_shared_info.

By initializing skb_shared_info sequentially, the write combining buffers
can reduce the number of memory transactions to a single write.  Reorder
the initialization in __alloc_skb() to match the structure definition.
There is also an alignment issue on 64 bit systems with skb_shared_info;
by converting nr_frags to a short, everything packs up nicely.

Also, pass the slab cache pointer according to the fclone flag instead
of using two almost identical function calls.

This raises bw_unix performance up to a peak of 707KB/s when combined
with the spinlock patch.  It should help other networking protocols, too.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PPPOX]: Fix assignment into const proto_ops.
David S. Miller [Wed, 28 Dec 2005 04:57:40 +0000 (20:57 -0800)]
[PPPOX]: Fix assignment into const proto_ops.

And actually, with this, the whole pppox layer can basically
be removed and subsumed into pppoe.c, no other pppox sub-protocol
implementation exists and we've had this thing for at least 4
years.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Don't use __constant_htonl for a non const arg
Arnaldo Carvalho de Melo [Tue, 27 Dec 2005 17:17:57 +0000 (15:17 -0200)]
[TCP]: Don't use __constant_htonl for a non const arg

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h
Arnaldo Carvalho de Melo [Tue, 27 Dec 2005 04:43:12 +0000 (02:43 -0200)]
[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h

To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h include twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SOCK]: Introduce sk_receive_skb
Arnaldo Carvalho de Melo [Tue, 27 Dec 2005 04:42:22 +0000 (02:42 -0200)]
[SOCK]: Introduce sk_receive_skb

It's common enough to justify that; TCP still can't use it as it has the
prequeueing stuff, still to be made generic in the not so distant future :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: restructure sock_aio_{read,write} / sock_{readv,writev}
Christoph Hellwig [Fri, 23 Dec 2005 05:08:46 +0000 (21:08 -0800)]
[NET]: restructure sock_aio_{read,write} / sock_{readv,writev}

Mid-term I plan to restructure the file_operations so that we don't need
to have all these duplicate aio and vectored versions.  This patch is
a small step in that direction but also a worthwhile cleanup on its own:

(1) introduce an alloc_sock_iocb helper that encapsulates allocating a
    proper sock_iocb
(2) add do_sock_read and do_sock_write helpers for common read/write
    code

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Fix sock_init() return value.
David S. Miller [Thu, 22 Dec 2005 20:58:55 +0000 (12:58 -0800)]
[NET]: Fix sock_init() return value.

It needs to return zero now that it is an initcall.

Also, net/nonet.c no longer needs a dummy sock_init().

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PKTGEN]: Deinitialise static variables.
Jaco Kroon [Thu, 22 Dec 2005 20:51:46 +0000 (12:51 -0800)]
[PKTGEN]: Deinitialise static variables.

static variables should not be explicitly initialised to 0.  This causes
them to be placed in .data instead of .bss.  This patch de-initialises 3
static variables in net/core/pktgen.c.
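
As a small illustration of the change (the variable names are
hypothetical and not taken from pktgen.c), both declarations below yield
a zero-valued variable; only the section placement may differ. Whether
an explicit zero initialiser really lands in .data depends on the
compiler version and flags, so treat the section comments as an
assumption to check with a tool such as size or objdump.

```c
#include <assert.h>

/* Explicitly initialised to 0: older compilers may emit this into .data,
 * which takes up space in the object file. */
static int sketch_explicit_zero = 0;

/* No initialiser: C guarantees static storage starts at zero, and the
 * variable can live in .bss, which occupies no space in the image. */
static int sketch_implicit_zero;

/* Both variables read as zero at run time. */
static int sketch_static_sum(void)
{
    return sketch_explicit_zero + sketch_implicit_zero;
}
```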

There are approximately 800 more such variables in the source tree
(2.6.15rc5).  If there is more interest I'd be willing to track down the
rest of these and de-initialise them as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: move struct proto_ops to const
Eric Dumazet [Thu, 22 Dec 2005 20:49:22 +0000 (12:49 -0800)]
[NET]: move struct proto_ops to const

I noticed that some of 'struct proto_ops' used in the kernel may share
a cache line used by locks or other heavily modified data. (default
linker alignment is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
least)

This patch makes sure a 'struct proto_ops' can be declared as const,
so that all cpus can share all parts of it without false sharing.

This is not mandatory : a driver can still use a read/write structure
if it needs to (and eventually a __read_mostly)

I made a global substitution to change all existing occurrences to make
them const.

This should reduce the possibility of false sharing on SMP, and
speedup some socket system calls.
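
A minimal sketch of the pattern, using hypothetical sketch_* types
rather than the real struct proto_ops: the operations table is declared
const and sockets hold a pointer-to-const, so the table can sit in
read-only data away from frequently written fields.

```c
#include <assert.h>

/* Hypothetical, heavily reduced stand-in for an operations table. */
struct sketch_proto_ops {
    int family;
    int (*release)(void);
};

static int sketch_release(void)
{
    return 0;
}

/* const: the table is never written after build time, so it can be
 * placed in read-only data and shared by all CPUs. */
static const struct sketch_proto_ops sketch_ops = {
    .family  = 1,
    .release = sketch_release,
};

/* The socket only needs a pointer-to-const to call through the table. */
struct sketch_socket {
    const struct sketch_proto_ops *ops;
};

static int sketch_ops_demo(void)
{
    struct sketch_socket sock = { .ops = &sketch_ops };

    return sock.ops->release();
}
```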

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Small cleanup to socket initialization
Andi Kleen [Thu, 22 Dec 2005 20:43:42 +0000 (12:43 -0800)]
[NET]: Small cleanup to socket initialization

sock_init can be done as a core_initcall instead of calling
it directly in init/main.c

Also I removed an out of date #ifdef.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SCTP]: Add support for SCTP_DELAYED_ACK_TIME socket option.
Frank Filz [Thu, 22 Dec 2005 19:37:30 +0000 (11:37 -0800)]
[SCTP]: Add support for SCTP_DELAYED_ACK_TIME socket option.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SCTP]: Update SCTP_PEER_ADDR_PARAMS socket option to the latest api draft.
Frank Filz [Thu, 22 Dec 2005 19:36:46 +0000 (11:36 -0800)]
[SCTP]: Update SCTP_PEER_ADDR_PARAMS socket option to the latest api draft.

This patch adds support to set/get heartbeat interval, maximum number of
retransmissions, pathmtu, sackdelay time for a particular transport/
association/socket as per the latest SCTP sockets api draft11.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4] fib_trie: Add credits.
Robert Olsson [Thu, 22 Dec 2005 19:25:10 +0000 (11:25 -0800)]
[IPV4] fib_trie: Add credits.

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] cubic: use Newton-Raphson
Stephen Hemminger [Thu, 22 Dec 2005 03:32:36 +0000 (19:32 -0800)]
[TCP] cubic: use Newton-Raphson

Replace cube root algorithm with a faster version using Newton-Raphson.
Surprisingly, doing the scaled div64_64 is faster than a true 64 bit
division on 64 bit CPUs.
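
For readers unfamiliar with the technique, here is a plain integer
Newton-Raphson cube root. It is a sketch only: it is not the code from
this patch, it does not use the scaled div64_64 arithmetic mentioned
above, and the name sketch_cbrt is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns floor(cbrt(a)) using the Newton-Raphson step
 *     x' = (2*x + a / x^2) / 3
 * in integer arithmetic. */
static uint64_t sketch_cbrt(uint64_t a)
{
    uint64_t x, y, t;
    int bits = 0;

    if (a < 2)
        return a;

    /* Bit length of a, so that a < 2^bits. */
    for (t = a; t != 0; t >>= 1)
        bits++;

    /* Start from a power of two that is above the true root. */
    x = (uint64_t)1 << ((bits + 2) / 3);

    /* The sequence decreases until it reaches the floor of the root. */
    for (;;) {
        y = (2 * x + a / (x * x)) / 3;
        if (y >= x)
            return x;
        x = y;
    }
}
```

Starting above the root keeps every iterate at or above
floor(cbrt(a)), so the loop can stop as soon as the estimate no longer
decreases.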

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] cubic: precompute constants
Stephen Hemminger [Thu, 22 Dec 2005 03:32:08 +0000 (19:32 -0800)]
[TCP] cubic: precompute constants

Revised version of patch to pre-compute values for TCP cubic.
  * d32,d64 replaced with descriptive names
  * cube_factor replaces
      srtt[scaled by count] / HZ * ((1 << (10+2*BICTCP_HZ)) / bic_scale)
  * beta_scale replaces
      8*(BICTCP_BETA_SCALE+beta)/3/(BICTCP_BETA_SCALE-beta);

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[FLS64]: x86_64 version
Stephen Hemminger [Thu, 22 Dec 2005 03:31:36 +0000 (19:31 -0800)]
[FLS64]: x86_64 version

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[FLS64]: generic version
Stephen Hemminger [Thu, 22 Dec 2005 03:30:53 +0000 (19:30 -0800)]
[FLS64]: generic version
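
A generic fls64() ("find last set", 1-based, returning 0 for an input of
0) can be built from a 32-bit fls(). The sketch below shows the idea
with hypothetical sketch_* names and a simple loop in place of an
optimised 32-bit fls(); it is an illustration, not the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the most significant set bit, counting from 1; 0 if x == 0. */
static int sketch_fls(uint32_t x)
{
    int r = 0;

    while (x != 0) {
        r++;
        x >>= 1;
    }
    return r;
}

/* 64-bit variant: look at the high word first, else fall back to the low. */
static int sketch_fls64(uint64_t x)
{
    uint32_t high = (uint32_t)(x >> 32);

    if (high != 0)
        return sketch_fls(high) + 32;
    return sketch_fls((uint32_t)x);
}
```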

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PKT_SCHED] netem: packet corruption option
Stephen Hemminger [Thu, 22 Dec 2005 03:03:44 +0000 (19:03 -0800)]
[PKT_SCHED] netem: packet corruption option

Here is a new feature for netem in 2.6.16. It adds the ability to
randomly corrupt packets with netem. A version was done by
Hagen Paul Pfeifer, but I redid it to handle the cases of backwards
compatibility with netlink interface and presence of hardware checksum
offload. It is useful for testing hardware offload in devices.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: add version number
Stephen Hemminger [Thu, 22 Dec 2005 03:01:30 +0000 (19:01 -0800)]
[BRIDGE]: add version number

Add version info to bridge module.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: limited ethtool support
Stephen Hemminger [Thu, 22 Dec 2005 03:00:58 +0000 (19:00 -0800)]
[BRIDGE]: limited ethtool support

Add limited ethtool support to bridge to allow disabling
features.

Note: if underlying device does not support a feature (like checksum
offload), then the bridge device won't inherit it.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: filter packets in learning state
Stephen Hemminger [Thu, 22 Dec 2005 03:00:18 +0000 (19:00 -0800)]
[BRIDGE]: filter packets in learning state

While in the learning state, run filters but drop the result.
This prevents us from acquiring bad fdb entries in learning state.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: handle speed detection after carrier changes
Stephen Hemminger [Tue, 20 Dec 2005 23:19:51 +0000 (15:19 -0800)]
[BRIDGE]: handle speed detection after carrier changes

Speed of an interface may not be available until carrier
is detected in the case of autonegotiation. To get the correct value
we need to recheck speed after carrier event.  But the check needs to
be done in a context that is similar to normal ethtool interface (can sleep).

Also, delay check for 1ms to try to avoid any carrier bounce transitions.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: allow setting hardware address of bridge pseudo-dev
Stephen Hemminger [Thu, 22 Dec 2005 02:51:49 +0000 (18:51 -0800)]
[BRIDGE]: allow setting hardware address of bridge pseudo-dev

Some people are using bridging to hide multiple machines from an ISP
that restricts by MAC address. So in that case allow the bridge mac
address to be set to any of the existing interfaces.  I don't want to
allow any arbitrary value and confuse STP.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_UNIX]: Use spinlock for unix_table_lock
David S. Miller [Wed, 14 Dec 2005 07:26:29 +0000 (23:26 -0800)]
[AF_UNIX]: Use spinlock for unix_table_lock

This lock is actually taken mostly as a writer,
so using a rwlock actually just makes performance
worse especially on chips like the Intel P4.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IP_SOCKGLUE]: Remove most of the tcp specific calls
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:26:10 +0000 (23:26 -0800)]
[IP_SOCKGLUE]: Remove most of the tcp specific calls

As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
DCCP) to see if a struct sock instance is a inet_connection_sock for places
like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
sk_type was SOCK_STREAM, that is insufficient because we now use the same code
for DCCP, that has sk_type SOCK_DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Move the TCPF_ enum to tcp_states.h
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:25:56 +0000 (23:25 -0800)]
[TCP]: Move the TCPF_ enum to tcp_states.h

Upcoming patches will make, for instance, ip_sockglue.c need just this enum
and not all of tcp.h.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[INET6]: Generalise tcp_v6_hash_connect
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:25:44 +0000 (23:25 -0800)]
[INET6]: Generalise tcp_v6_hash_connect

Renaming it to inet6_hash_connect, making it possible to ditch
dccp_v6_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[INET]: Generalise tcp_v4_hash_connect
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:25:31 +0000 (23:25 -0800)]
[INET]: Generalise tcp_v4_hash_connect

Renaming it to inet_hash_connect, making it possible to ditch
dccp_v4_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TWSK]: Introduce struct timewait_sock_ops
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:25:19 +0000 (23:25 -0800)]
[TWSK]: Introduce struct timewait_sock_ops

So that we can share several timewait sockets related functions and
make the timewait mini sockets infrastructure closer to the request
mini sockets one.

Next changesets will take advantage of this, moving more code out of
TCP and DCCP v4 and v6 to common infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Use reqsk_free in dccp_v4_conn_request
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:25:06 +0000 (23:25 -0800)]
[DCCP]: Use reqsk_free in dccp_v4_conn_request

Now we have the destructor (dccp_v4_reqsk_destructor) in our
request_sock_ops vtable.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Introduce DCCPv6
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:24:53 +0000 (23:24 -0800)]
[DCCP]: Introduce DCCPv6

Still needs mucho polishing, specially in the checksum code, but works
just fine, inet_diag/iproute2 and all 8)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Export ipv6_opt_accepted
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:24:28 +0000 (23:24 -0800)]
[IPV6]: Export ipv6_opt_accepted

It was already non-TCP specific, will be used by DCCPv6.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Prepare the AF agnostic core for the introduction of DCCPv6
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:24:16 +0000 (23:24 -0800)]
[DCCP]: Prepare the AF agnostic core for the introduction of DCCPv6

Basically exports a similar set of functions as the one exported by
the non-AF specific TCP code.

In the process moved some non-AF specific code from dccp_v4_connect to
dccp_connect_init and moved the checksum verification from
dccp_invalid_packet to dccp_v4_rcv, so as to use it in dccp_v6_rcv
too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Just rename dccp_v4_prot to dccp_prot
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:23:32 +0000 (23:23 -0800)]
[DCCP]: Just rename dccp_v4_prot to dccp_prot

To match TCP equivalent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Export some symbols for DCCPv6
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:23:20 +0000 (23:23 -0800)]
[IPV6]: Export some symbols for DCCPv6

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Introduce inet6_timewait_sock
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:23:09 +0000 (23:23 -0800)]
[IPV6]: Introduce inet6_timewait_sock

Out of tcp6_timewait_sock, that now is just an aggregation of
inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct
inet_timewait_sock, that is common to the IPv6 transport protocols that use
timewait sockets, like DCCP and TCP.

tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic
code to find the IPv6 area in a timewait sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Generalise some functions
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:22:54 +0000 (23:22 -0800)]
[IPV6]: Generalise some functions

Using sk->sk_protocol instead of IPPROTO_TCP.

Will be used by DCCPv6 in the next changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_UNIX]: Remove superfluous reference counting in unix_stream_sendmsg
Benjamin LaHaise [Wed, 14 Dec 2005 07:22:32 +0000 (23:22 -0800)]
[AF_UNIX]: Remove superfluous reference counting in unix_stream_sendmsg

AF_UNIX stream socket performance on P4 CPUs tends to suffer due to a
lot of pipeline flushes from atomic operations.  The patch below
removes the sock_hold() and sock_put() in unix_stream_sendmsg().  This
should be safe as the socket still holds a reference to its peer which
is only released after the file descriptor's final user invokes
unix_release_sock().  The only consideration is that we must add a
memory barrier before setting the peer initially.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Avoid atomic xchg() for non-error case
Benjamin LaHaise [Wed, 14 Dec 2005 07:22:19 +0000 (23:22 -0800)]
[NET]: Avoid atomic xchg() for non-error case

It also looks like there were 2 places where the test on sk_err was
missing from the event wait logic (in sk_stream_wait_connect and
sk_stream_wait_memory), while the rest of the sock_error() users look
to be doing the right thing.  This version of the patch fixes those,
and cleans up a few places that were testing ->sk_err directly.
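
The idea in the subject line, skipping the atomic exchange when no error
is pending, can be sketched with C11 atomics. The sketch_* names are
hypothetical and the memory ordering shown is an assumption made for
this illustration, not a statement about the kernel's barriers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the socket's pending-error field. */
struct sketch_sock {
    atomic_int sk_err;
};

/* Returns 0 if no error is pending, otherwise clears the error and
 * returns it negated. */
static int sketch_sock_error(struct sketch_sock *sk)
{
    int err;

    /* Fast path: a plain load is much cheaper than an atomic exchange. */
    if (atomic_load_explicit(&sk->sk_err, memory_order_relaxed) == 0)
        return 0;

    /* Slow path: fetch and clear the error in one atomic step. */
    err = atomic_exchange(&sk->sk_err, 0);
    return -err;
}

/* Small self-check: an error is reported once, then the field is clear. */
static int sketch_sock_error_demo(void)
{
    struct sketch_sock sk;
    int first, second;

    atomic_init(&sk.sk_err, 0);
    if (sketch_sock_error(&sk) != 0)
        return 0;

    atomic_store(&sk.sk_err, 104);
    first = sketch_sock_error(&sk);
    second = sketch_sock_error(&sk);
    return first == -104 && second == 0;
}
```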

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPVS]: remove dead code
Roberto Nibali [Wed, 14 Dec 2005 07:17:20 +0000 (23:17 -0800)]
[IPVS]: remove dead code

This patch removes dead code. I don't see the reason to keep this cruft
around, besides cluttering the nice and functionally working code.

Signed-off-by: Roberto Nibali <ratz@drugphish.ch>
Signed-off-by: Horms <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[UDP]: udp_checksum_init return value
Stephen Hemminger [Wed, 14 Dec 2005 07:17:02 +0000 (23:17 -0800)]
[UDP]: udp_checksum_init return value

Since udp_checksum_init always returns 0 there is no point in
having it return a value.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IP]: Simplify and consolidate MSG_PEEK error handling
Herbert Xu [Wed, 14 Dec 2005 07:16:37 +0000 (23:16 -0800)]
[IP]: Simplify and consolidate MSG_PEEK error handling

When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
it is left on the socket receive queue.  This means that when we detect
a checksum error we have to be careful when trying to free the packet
as someone could have dequeued it in the time being.

Currently this delicate logic is duplicated three times between UDPv4,
UDPv6 and RAWv6.  This patch moves them into a one place and simplifies
the code somewhat.

This is based on a suggestion by Eric Dumazet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Introduce dccp_ipv4_af_ops
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:16:16 +0000 (23:16 -0800)]
[DCCP]: Introduce dccp_ipv4_af_ops

And make the core DCCP code AF agnostic, just like TCP. Now it's time
to work on net/dccp/ipv6.c, we are close to the end!

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[ICSK]: Move v4_addr2sockaddr from TCP to icsk
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:16:04 +0000 (23:16 -0800)]
[ICSK]: Move v4_addr2sockaddr from TCP to icsk

Renaming it to inet_csk_addr2sockaddr.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:15:52 +0000 (23:15 -0800)]
[ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops

And move it to struct inet_connection_sock. DCCP will use it in the
upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Introduce inet6_rsk()
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:15:40 +0000 (23:15 -0800)]
[IPV6]: Introduce inet6_rsk()

And inet6_rsk_offset in inet_request_sock, for the same reasons as
inet_sock's pinfo6 member.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:15:24 +0000 (23:15 -0800)]
[IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add

More work is needed tho to introduce inet6_request_sock from
tcp6_request_sock, in the same layout considerations as ipv6_pinfo in
inet_sock, next changeset will do that.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[ICSK]: make inet_csk_reqsk_queue_hash_add timeout arg unsigned long
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:15:12 +0000 (23:15 -0800)]
[ICSK]: make inet_csk_reqsk_queue_hash_add timeout arg unsigned long

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Generalise __tcp_v6_hash, renaming it to __inet6_hash
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:15:01 +0000 (23:15 -0800)]
[IPV6]: Generalise __tcp_v6_hash, renaming it to __inet6_hash

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port
Arnaldo Carvalho de Melo [Wed, 14 Dec 2005 07:14:47 +0000 (23:14 -0800)]
[IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: Safer reassembly
Herbert Xu [Wed, 14 Dec 2005 07:14:27 +0000 (23:14 -0800)]
[IPV4]: Safer reassembly

Another spin of Herbert Xu's "safer ip reassembly" patch
for 2.6.16.

(The original patch is here:
http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2
and my only contribution is to have tested it.)

This patch (optionally) does additional checks before accepting IP
fragments, which can greatly reduce the possibility of reassembling
fragments which originated from different IP datagrams.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER] ebtables: Support nf_log API from ebt_log and ebt_ulog
Bart De Schuymer [Wed, 14 Dec 2005 07:14:08 +0000 (23:14 -0800)]
[NETFILTER] ebtables: Support nf_log API from ebt_log and ebt_ulog

This makes ebt_log and ebt_ulog use the new nf_log api.  This enables
the bridging packet filter to log packets e.g. via nfnetlink_log.

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER] ip_tables: NUMA-aware allocation
Eric Dumazet [Wed, 14 Dec 2005 07:13:48 +0000 (23:13 -0800)]
[NETFILTER] ip_tables: NUMA-aware allocation

Part of a performance problem with ip_tables is that memory allocation
is not NUMA aware, but 'only' SMP aware (i.e. each CPU normally touches
separate cache lines)

Even with small iptables rules, the cost of this misplacement can be
high on common workloads.  Instead of using one vmalloc() area
(located in the node of the iptables process), we now allocate an area
for each possible CPU, using vmalloc_node() so that memory should be
allocated in the CPU's node if possible.

Port to arp_tables and ip6_tables by Harald Welte.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] BIC: CUBIC window growth (2.0)
Stephen Hemminger [Wed, 14 Dec 2005 07:13:28 +0000 (23:13 -0800)]
[TCP] BIC: CUBIC window growth (2.0)

Replace existing BIC version 1.1 with new version 2.0.
The main change is to replace the window growth function
with a cubic function as described in:
  http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/cubic-paper.pdf
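
For orientation, the cubic growth function is commonly written as
below. The symbols are not defined in this commit message and are taken
here as assumptions: W_max is the window size just before the last loss,
C is a scaling constant, t is the time since that loss, and beta is the
decrease factor under the convention that the window drops to
(1 - beta) W_max at t = 0. Other write-ups use a different convention
for beta, so check the paper linked above before relying on this form.

```latex
W(t) = C\,(t - K)^3 + W_{\max},
\qquad
K = \sqrt[3]{\frac{W_{\max}\,\beta}{C}}
```

With this choice of K, W(0) = (1 - beta) W_max and W(K) = W_max, so the
window grows quickly at first, flattens near W_max, and then probes
beyond it.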

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] BIC: spelling and whitespace
Stephen Hemminger [Wed, 14 Dec 2005 07:13:13 +0000 (23:13 -0800)]
[TCP] BIC: spelling and whitespace

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] BIC: remove low utilization code.
Stephen Hemminger [Wed, 14 Dec 2005 07:13:00 +0000 (23:13 -0800)]
[TCP] BIC: remove low utilization code.

The latest BICTCP patch at:
http://www.csc.ncsu.edu:8080/faculty/rhee/export/bitcp/index_files/Page546.htm

disables the low_utilization feature of BICTCP because it doesn't work
in some cases. This patch removes it.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LSM-IPSec]: Per-packet access control.
Trent Jaeger [Wed, 14 Dec 2005 07:12:40 +0000 (23:12 -0800)]
[LSM-IPSec]: Per-packet access control.

This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets.  Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the SELinux LSM to
create, deallocate, and use security contexts for policies
(xfrm_policy) and security associations (xfrm_state) that enable
control of a socket's ability to send and receive packets.

Patch purpose:

The patch is designed to enable the SELinux LSM to implement access
control on individual packets based on the strongly authenticated
IPSec security association.  Such access controls augment the existing
ones in SELinux based on network interface and IP address.  The former
are very coarse-grained, and the latter can be spoofed.  By using
IPSec, SELinux can control access to remote hosts based on
cryptographic keys generated using the IPSec mechanism.  This enables
access control on a per-machine basis or per-application if the remote
machine is running the same mechanism and trusted to enforce the
access control policy.

Patch design approach:

The patch's main function is to authorize a socket's access to an IPSec
policy based on their security contexts.  Since the communication is
implemented by a security association, the patch ensures that the
security associations negotiated and used have the same security
context.  The patch enables allocation and deallocation of such
security contexts for policies and security associations.  It also
enables copying of the security context when policies are cloned.
Lastly, the patch ensures that packets that are sent without using an
IPSec security association with a security context are allowed to be
sent in that manner.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

The function which authorizes a socket to perform a requested
operation (send/receive) on an IPSec policy (xfrm_policy) is
selinux_xfrm_policy_lookup.  The Netfilter and rcv_skb hooks ensure
that if an IPSec SA with a security association has not been used,
then the socket is allowed to send or receive the packet,
respectively.

The patch implements SELinux functions for allocating security contexts
when policies (xfrm_policy) are created via the pfkey or xfrm_user
interfaces via selinux_xfrm_policy_alloc.  When a security association
is built, SELinux allocates the security context designated by the
XFRM subsystem which is based on that of the authorized policy via
selinux_xfrm_state_alloc.

When a xfrm_policy is cloned, the security context of that policy, if
any, is copied to the clone via selinux_xfrm_policy_clone.

When a xfrm_policy or xfrm_state is freed, its security context, if
any, is also freed at selinux_xfrm_policy_free or
selinux_xfrm_state_free.

Testing:

The SELinux authorization function is tested using ipsec-tools.  We
created policies and security associations with particular security
contexts and added SELinux access control policy entries to verify the
authorization decision.  We also made sure that packets for which no
security context was supplied (which either did or did not use
security associations) were authorized using an unlabelled context.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LSM-IPSec]: Security association restriction.
Trent Jaeger [Wed, 14 Dec 2005 07:12:27 +0000 (23:12 -0800)]
[LSM-IPSec]: Security association restriction.

This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets.  Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the XFRM subsystem,
pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
socket to use only authorized security associations (or no security
association) to send/receive network packets.

Patch purpose:

The patch is designed to enable per-packet access control based on
the strongly authenticated IPSec security association.  Such access
controls augment the existing ones based on network interface and IP
address.  The former are very coarse-grained, and the latter can be
spoofed.  By using IPSec, the system can control access to remote
hosts based on cryptographic keys generated using the IPSec mechanism.
This enables access control on a per-machine basis or per-application
if the remote machine is running the same mechanism and trusted to
enforce the access control policy.

Patch design approach:

The overall approach is that policy (xfrm_policy) entries set by
user-level programs (e.g., setkey for ipsec-tools) are extended with a
security context that is used at policy selection time in the XFRM
subsystem to restrict the sockets that can send/receive packets via
security associations (xfrm_states) that are built from those
policies.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

On output, the policy retrieved (via xfrm_policy_lookup or
xfrm_sk_policy_lookup) must be authorized for the security context of
the socket and the same security context is required for resultant
security association (retrieved or negotiated via racoon in
ipsec-tools).  This is enforced in xfrm_state_find.

On input, the policy retrieved must also be authorized for the socket
(at __xfrm_policy_check), and the security context of the policy must
also match the security association being used.

The patch has virtually no impact on packets that do not use IPSec.
The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
before.

Also, if IPSec is used without security contexts, the impact is
minimal.  The LSM must allow such policies to be selected for the
combination of socket and remote machine, but subsequent IPSec
processing proceeds as in the original case.

Testing:

The pfkey interface is tested using ipsec-tools.  ipsec-tools has
been modified (a separate ipsec-tools patch is available for version
0.5) to support assignment of xfrm_policy entries and security
associations with security contexts via setkey and the negotiation
using the security contexts via racoon.

The xfrm_user interface is tested via ad hoc programs that set
security contexts.  These programs are also available from me, and
contain programs for setting, getting, and deleting policy for testing
this interface.  Testing of sa functions was done by tracing kernel
behavior.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years agoLinux v2.6.15
Linus Torvalds [Tue, 3 Jan 2006 03:21:10 +0000 (19:21 -0800)]
Linux v2.6.15

Hey, it's fifteen years today since I bought the machine that got Linux
started.  January 2nd is a good date.

18 years ago[PATCH] Make sure interleave masks have at least one node set
Andi Kleen [Mon, 2 Jan 2006 23:07:28 +0000 (00:07 +0100)]
[PATCH] Make sure interleave masks have at least one node set

Otherwise a bad mem policy system call can confuse the interleaving
code into referencing undefined nodes.
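
As a rough illustration of why an empty mask is a problem, here is a
minimal user-space sketch (not the kernel's mempolicy code; the 64-bit
mask and both function names are assumptions made for this example).
A round-robin walk over the mask has no valid answer when no bit is
set, so the mask is rejected up front.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: an interleave mask must name at least one node. */
static bool interleave_mask_ok(uint64_t nodes)
{
    return nodes != 0;
}

/*
 * Hypothetical round-robin step: return the next node set in the mask
 * after prev (prev is -1 or 0..63), wrapping around, or -1 if the mask
 * is empty.
 */
static int next_interleave_node(uint64_t nodes, int prev)
{
    for (int i = 1; i <= 64; i++) {
        int n = (prev + i) % 64;

        if ((nodes >> n) & 1)
            return n;
    }
    return -1; /* empty mask: there is no node to pick */
}
```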

Originally reported by Doug Chapman

I was told it's CVE-2005-3358
(one has to love these security people - they make everything sound important)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Avoid namespace pollution in <asm/param.h>
Dag-Erling Smørgrav [Mon, 2 Jan 2006 14:57:06 +0000 (15:57 +0100)]
[PATCH] Avoid namespace pollution in <asm/param.h>

In commit 59121003721a8fad11ee72e646fd9d3076b5679c, the x86 and x86-64
<asm/param.h> was changed to include <linux/config.h> for the
configurable timer frequency.

However, asm/param.h is sometimes used in userland (it is included
indirectly from <sys/param.h>), so your commit pollutes the userland
namespace with tons of CONFIG_FOO macros.  This greatly confuses
software packages (such as BusyBox) which use CONFIG_FOO macros
themselves to control the inclusion of optional features.

After a short exchange, Christoph approved this patch

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] powerpc: more g5 overtemp problem fix
Benjamin Herrenschmidt [Mon, 2 Jan 2006 02:04:44 +0000 (13:04 +1100)]
[PATCH] powerpc: more g5 overtemp problem fix

Some G5s still occasionally experience shutdowns due to overtemp
conditions despite the recent fix. After analyzing logs from such
machines, it appears that the overtemp code is a bit too quick at
shutting the machine down when reaching the critical temperature (tmax +
8) and doesn't leave the fan enough time to actually cool it down. This
happens if the temperature of a CPU suddenly rises too high in a very
short period of time, or occasionally on boot (that is the CPUs are
already overtemp by the time the driver loads).

This patch makes the code a bit more relaxed, leaving a few seconds to
the fans to do their job before kicking the machine shutdown.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] x86: teach dump_task_regs() about the -8 offset.
Stas Sergeev [Sun, 1 Jan 2006 01:18:52 +0000 (04:18 +0300)]
[PATCH] x86: teach dump_task_regs() about the -8 offset.

This should fix multi-threaded core-files

Signed-off-by: stsp@aknet.ru
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agosysctl: make sure to terminate strings with a NUL
Linus Torvalds [Sun, 1 Jan 2006 01:00:29 +0000 (17:00 -0800)]
sysctl: make sure to terminate strings with a NUL

This is a slightly more complete fix for the previous minimal sysctl
string fix.  It always terminates the returned string with a NUL, even
if the full result wouldn't fit in the user-supplied buffer.

The returned length is the full untruncated length, so that you can
tell when truncation has occurred.
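
The behaviour described above can be sketched in user space as follows.
This is only an illustration under assumptions, not the kernel's sysctl
code: the helper name and its signature are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into a buffer of buflen bytes, always
 * NUL-terminating inside the buffer when buflen > 0, and return the
 * full untruncated length of src.
 */
static size_t copy_string_terminated(char *buf, size_t buflen, const char *src)
{
    size_t len = strlen(src);

    if (buflen > 0) {
        /* Reserve the last byte of the buffer for the terminator. */
        size_t n = len < buflen - 1 ? len : buflen - 1;

        memcpy(buf, src, n);
        buf[n] = '\0';
    }
    return len;
}
```

A caller can detect truncation by comparing the returned length with
the buffer size: a result of buflen or more means the string did not
fit.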

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge master.kernel.org:/home/rmk/linux-2.6-serial
Linus Torvalds [Sat, 31 Dec 2005 21:49:26 +0000 (13:49 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-serial

18 years ago[PATCH] Fix false old value return of sysctl
Yi Yang [Fri, 30 Dec 2005 08:37:10 +0000 (16:37 +0800)]
[PATCH] Fix false old value return of sysctl

For the sysctl syscall, if the user wants to get the old value of a
sysctl entry and set a new value for it in the same syscall, the old
value is always overwritten by the new value if the sysctl entry is of
string type and if the user sets its strategy to sysctl_string.  This
issue arises because the strategy is run twice if the strategy is set
to sysctl_string: the general strategy sysctl_string always returns 0
on success.

Strategy routines such as sysctl_jiffies and sysctl_jiffies_ms return 1
because they read and write the sysctl entry.

The strategy routine sysctl_string returns 0 although it actually reads
and writes the sysctl entry.

According to my analysis, if a strategy routine reads and writes, it
should return 1; if it just does some necessary checks but does not
read and write, it should return 0, for example sysctl_intvec.

Signed-off-by: Yi Yang <yang.y.yi@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agosysctl: don't overflow the user-supplied buffer with '\0'
Linus Torvalds [Sat, 31 Dec 2005 01:18:53 +0000 (17:18 -0800)]
sysctl: don't overflow the user-supplied buffer with '\0'

If the string was too long to fit in the user-supplied buffer,
the sysctl layer would zero-terminate it by writing past the
end of the buffer. Don't do that.

Noticed by Yi Yang <yang.y.yi@gmail.com>

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoInsanity avoidance in /proc
Linus Torvalds [Fri, 30 Dec 2005 16:39:10 +0000 (08:39 -0800)]
Insanity avoidance in /proc

The old /proc interfaces were never updated to use loff_t, and are just
generally broken.  Now, we should be using the seq_file interface for
all of the proc files, but converting the legacy functions is more work
than most people care for and has little upside.

But at least we can make the non-LFS rules explicit, rather than just
insanely wrapping the offset or something.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Input: wacom - fix X axis setup
Denny Priebe [Fri, 30 Dec 2005 03:19:09 +0000 (22:19 -0500)]
[PATCH] Input: wacom - fix X axis setup

This patch fixes a typo introduced by conversion to dynamic input_dev
allocation.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Input: warrior - fix HAT0Y axis setup
Dmitry Torokhov [Fri, 30 Dec 2005 03:19:08 +0000 (22:19 -0500)]
[PATCH] Input: warrior - fix HAT0Y axis setup

This patch fixes a typo introduced by conversion to dynamic input_dev
allocation.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Input: kbtab - fix Y axis setup
Dmitry Torokhov [Fri, 30 Dec 2005 03:19:07 +0000 (22:19 -0500)]
[PATCH] Input: kbtab - fix Y axis setup

This patch fixes a typo introduced by conversion to dynamic input_dev
allocation.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[ARM] 3216/1: indent and typo in drivers/serial/pxa.c
Erik Hovland [Fri, 30 Dec 2005 15:57:35 +0000 (15:57 +0000)]
[ARM] 3216/1: indent and typo in drivers/serial/pxa.c

Patch from Erik Hovland

This patch provides two changes. An indent is supplied for an if/else clause so that it is more readable. An acronym is incorrectly typed as UER when it should be IER.

Signed-off-by: Erik Hovland <erik@hovland.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
18 years ago[PATCH] Simplify the VIDEO_SAA7134_OSS Kconfig dependency line
Jean Delvare [Thu, 29 Dec 2005 21:07:30 +0000 (22:07 +0100)]
[PATCH] Simplify the VIDEO_SAA7134_OSS Kconfig dependency line

Thanks to Roman Zippel for the suggestion.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
[ Short explanation: Kconfig uses ternary math: n/m/y, and !m is m ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
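
The ternary logic mentioned in the short explanation above can be
sketched in C.  This is an illustration only, not Kconfig source code:
the enum and function names are assumptions.  It models n/m/y as 0/1/2,
with negation as 2 - x, AND as the minimum and OR as the maximum, which
is why !m evaluates to m.

```c
/* Hypothetical model of Kconfig tristate values. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Negation: !n is y, !y is n, and !m stays m. */
static enum tristate tri_not(enum tristate a)
{
    return (enum tristate)(TRI_Y - a);
}

/* AND is the minimum of the two operands. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
    return a < b ? a : b;
}

/* OR is the maximum of the two operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
    return a > b ? a : b;
}
```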
18 years agoRevert radeon AGP aperture offset changes
Linus Torvalds [Thu, 29 Dec 2005 21:01:54 +0000 (13:01 -0800)]
Revert radeon AGP aperture offset changes

This reverts the series of commits

67dbb4ea33731415fe09c62149a34f472719ac1d
281ab031a8c9e5b593142eb4ec59a87faae8676a
47807ce381acc34a7ffee2b42e35e96c0f322e52

that changed the GART VM start offset.  It fixed some machines, but
seems to continually interact badly with some X versions.

Quoth Ben Herrenschmidt:

  "So I think at this point, the best is that we keep the old bogus code
   that at least is consistent with the bug in the server. I'm working on a
   big patch to X that reworks the memory map stuff completely and fixes
   those issues on the server side, I'll do a DRM patch matching this X fix
   as well so that the memory map is only ever set in one place and with
   what I hope is a correct algorithm..."

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge master.kernel.org:/home/rmk/linux-2.6-mmc
Linus Torvalds [Thu, 29 Dec 2005 18:27:28 +0000 (10:27 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-mmc

18 years agoMerge master.kernel.org:/home/rmk/linux-2.6-serial
Linus Torvalds [Thu, 29 Dec 2005 18:27:07 +0000 (10:27 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-serial

18 years ago[PATCH] Fix recursive config dependency for SAA7134
Jean Delvare [Wed, 28 Dec 2005 20:02:57 +0000 (21:02 +0100)]
[PATCH] Fix recursive config dependency for SAA7134

Fix the cyclic dependency issue between CONFIG_SAA7134_ALSA and
CONFIG_SAA7134_OSS (credits to Mauro Carvalho Chehab.)

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Mauro Carvalho Chehab <mchehab@brturbo.com.br>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] ppc64: htab_initialize_secondary cannot be marked __init
Anton Blanchard [Wed, 28 Dec 2005 23:46:29 +0000 (10:46 +1100)]
[PATCH] ppc64: htab_initialize_secondary cannot be marked __init

Sonny has noticed hotplug CPU on ppc64 is broken in 2.6.15-*. One of the
problems is that htab_initialize_secondary is called when a cpu is being
brought up, but it is marked __init.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] x86_64: Fix incorrect node_present_pages on NUMA
Ravikiran G Thirumalai [Thu, 29 Dec 2005 12:06:11 +0000 (13:06 +0100)]
[PATCH] x86_64: Fix incorrect node_present_pages on NUMA

Currently, we do not pass the correct start_pfn to e820_hole_size, to
calculate holes.  Following patch fixes that.

The bug results in incorrect number of node_present_pages for each pgdat
and causes ugly output in /sys and probably VM imbalances.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Input: aiptek - fix Y axis setup
Riccardo Magliocchetti [Thu, 29 Dec 2005 01:44:48 +0000 (20:44 -0500)]
[PATCH] Input: aiptek - fix Y axis setup

This patch fixes a typo introduced by conversion to dynamic input_dev
allocation.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix ia64 compile failure with gcc4.1
Dave Jones [Thu, 29 Dec 2005 01:01:04 +0000 (20:01 -0500)]
[PATCH] fix ia64 compile failure with gcc4.1

__get_unaligned creates a typeof the var it's passed, and writes to it,
which on gcc4.1, spits out the following error:

drivers/char/vc_screen.c: In function 'vcs_write':
drivers/char/vc_screen.c:422: error: assignment of read-only variable 'val'

Signed-off-by: Dave Jones <davej@redhat.com>
[ The "right" fix would be to try to fix <asm-generic/unaligned.h>
  but that's hard to do with the tools gcc gives us. So this
  simpler patch is preferable -- Linus ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: fix compilation with CONFIG_MODE_TT disabled
Paolo 'Blaisorblade' Giarrusso [Thu, 29 Dec 2005 16:40:02 +0000 (17:40 +0100)]
[PATCH] uml: fix compilation with CONFIG_MODE_TT disabled

Fix UML compilation when TT mode is disabled. Indeed, we were compiling
TT-only object files, which failed due to some TT-only headers being
excluded from the search path.

Thanks to the bug report from Pekka J Enberg.

Acked-by: Pekka J Enberg <penberg (at) cs ! helsinki ! fi>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Hostfs: update for new glibc - add missing symbol exports
Paolo 'Blaisorblade' Giarrusso [Thu, 29 Dec 2005 16:39:59 +0000 (17:39 +0100)]
[PATCH] Hostfs: update for new glibc - add missing symbol exports

Today, when compiling UML, I got warnings for two used unexported symbols:
readdir64 and truncate64. Indeed, my glibc headers are aliasing readdir to
readdir64 and truncate to truncate64 (and so on).

I'm then adding additional exports. Since I've no idea if the symbols were
always provided in the supported glibc versions, I've added weak definitions too.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: hostfs - fix possible PAGE_CACHE_SHIFT overflows
Paolo 'Blaisorblade' Giarrusso [Thu, 29 Dec 2005 16:39:57 +0000 (17:39 +0100)]
[PATCH] uml: hostfs - fix possible PAGE_CACHE_SHIFT overflows

Prevent page->index << PAGE_CACHE_SHIFT from overflowing.

There is a cast there, but it was added without care, so it's in the wrong
place. Note the extra parens around the shift - "+" has higher precedence than
"<<", leading to a GCC warning which saved us all.
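
To illustrate why the position of the cast matters, here is a small
stand-alone sketch.  It is not the hostfs code: the function names, the
shift value of 12 and the use of a 32-bit index are assumptions chosen
to mimic a 32-bit page index.

```c
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12 /* assumed value, for illustration only */

/* Cast applied after the shift: the shift is done in 32 bits and wraps. */
static int64_t offset_cast_late(uint32_t index)
{
    return (int64_t)(index << PAGE_SHIFT_ASSUMED);
}

/* Cast applied before the shift: the shift is done in 64 bits. */
static int64_t offset_cast_early(uint32_t index)
{
    return (int64_t)index << PAGE_SHIFT_ASSUMED;
}
```

With an index of 0x100000 the late cast yields 0, because the high bit
is shifted out of the 32-bit value before widening, while the early
cast yields the intended 4 GiB offset.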

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Hostfs: remove unused var
Paolo 'Blaisorblade' Giarrusso [Thu, 29 Dec 2005 16:39:54 +0000 (17:39 +0100)]
[PATCH] Hostfs: remove unused var

Trivial removal of unused variable from this file - doesn't even change the
generated assembly code, in fact (gcc should trigger a warning for unused value
here).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: fix random segfaults at bootup
Paolo 'Blaisorblade' Giarrusso [Thu, 29 Dec 2005 16:39:51 +0000 (17:39 +0100)]
[PATCH] uml: fix random segfaults at bootup

Don't use printk() where "current_thread_info()" is crap.

Until we switch to running on init_stack, current_thread_info() evaluates
to crap. Printk uses "current" at times (in detail, &current is evaluated with
CONFIG_DEBUG_SPINLOCK to check the spinlock owner task).

And this leads to random segmentation faults.

Exactly, what happens is that &current = *(current_thread_info()), i.e. round
down $esp and dereference the value. I.e. access the stack below $esp, which
causes SIGSEGV on a VM_GROWSDOWN vma (see arch/i386/mm/fault.c).
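
The "round down $esp" step can be sketched as follows.  This is an
illustration, not the i386 or UML implementation: the function name
and the 8 KiB stack size are assumptions.

```c
#include <stdint.h>

#define THREAD_SIZE_ASSUMED 8192u /* assumed kernel stack size */

/*
 * Hypothetical equivalent of locating thread_info: mask off the low
 * bits of the stack pointer to reach the base of the stack area.
 */
static uintptr_t thread_info_base(uintptr_t sp)
{
    return sp & ~((uintptr_t)THREAD_SIZE_ASSUMED - 1);
}
```

The result is always at or below the stack pointer, so when the stack
pointer is not inside a real kernel stack, dereferencing it touches
memory below $esp, which is the access described above.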

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/tg3-2.6
Linus Torvalds [Wed, 28 Dec 2005 21:45:19 +0000 (13:45 -0800)]
Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/tg3-2.6

18 years ago[SERMOUSE]: Sun mice speak 5-byte protocol too.
David S. Miller [Wed, 28 Dec 2005 21:27:04 +0000 (13:27 -0800)]
[SERMOUSE]: Sun mice speak 5-byte protocol too.

Noticed by Christophe Zimmerman, this explains the slow mouse movement
with 2.6.x kernels.

And checking the 2.4.x drivers/sbus/char/sunmouse.c driver shows we
always used a 5-byte protocol with Sun mice in the past.  I have no
idea how the 3-byte thing got into the 2.6.x driver, but it's surely
wrong.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SPARC]: Use STABS_DEBUG and DWARF_DEBUG macros in vmlinux.lds.S
David S. Miller [Wed, 28 Dec 2005 21:22:54 +0000 (13:22 -0800)]
[SPARC]: Use STABS_DEBUG and DWARF_DEBUG macros in vmlinux.lds.S

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TG3]: Update driver version and reldate.
David S. Miller [Wed, 28 Dec 2005 21:05:41 +0000 (13:05 -0800)]
[TG3]: Update driver version and reldate.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TG3]: ethtool -d hangs PCIe systems
Chris Elmquist [Tue, 20 Dec 2005 21:25:19 +0000 (13:25 -0800)]
[TG3]: ethtool -d hangs PCIe systems

Resubmitting after recommendation to use GET_REG32_1() instead of
GET_REG32_LOOP(..., 1).  Retested.  Problem remains fixed.

Prevent tg3_get_regs() from reading reserved and undocumented registers
at RX_CPU_BASE and TX_CPU_BASE offsets which caused hostile behavior
on PCIe platforms.

Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PATCH] Fix more radeon GART start calculation cases
Benjamin Herrenschmidt [Tue, 27 Dec 2005 01:49:33 +0000 (12:49 +1100)]
[PATCH] Fix more radeon GART start calculation cases

As reported by Jules Villard <jvillard@ens-lyon.fr> and some others, the
recent GART aperture start reconfiguration causes problems on some
setups.

What I _think_ might be happening is that the X server is also trying to
muck around with the card memory map and is forcing it back into a wrong
setting that also happens to no longer match what the DRM wants to do
and blows up.  There are bugs all over the place in that code (and still
some bugs in the DRM as well anyway).

This patch attempts to avoid that by using the largest of the 2 values,
which I think will cause it to behave as it used to for you and will
still fix the problem with machines that have an aperture size smaller
than the video memory.

Acked-by: Jules Villard <jvillard@ens-lyon.fr>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[IPV6] mcast: Fix multiple issues in MLDv2 reports.
David L Stevens [Tue, 27 Dec 2005 22:03:00 +0000 (14:03 -0800)]
[IPV6] mcast: Fix multiple issues in MLDv2 reports.

The below "jumbo" patch fixes the following problems in MLDv2.

1) Add necessary "ntohs" to recent "pskb_may_pull" check [breaks
        all nonzero source queries on little-endian (!)]

2) Add locking to source filter list [resend of prior patch]

3) fix "mld_marksources()" to
        a) send nothing when all queried sources are excluded
        b) send full exclude report when some queried sources are
                not excluded
        c) don't schedule a timer when there's nothing to report

NOTE: RFC 3810 specifies the source list should be saved and each
  source reported individually as an IS_IN. This is an obvious DOS
  path, requiring the host to store and then multicast as many sources
  as are queried (e.g., millions...). This alternative sends a full,
  relevant report that's limited to number of sources present on the
  machine.

4) fix "add_grec()" to send empty-source records when it should
        The original check doesn't account for a non-empty source
        list with all sources inactive; the new code keeps that
        short-circuit case, and also generates the group header
        with an empty list if needed.

5) fix mca_crcount decrement to be after add_grec(), which needs
        its original value

These issues (other than item #1 ;-) ) were all found by Yan Zheng,
much thanks!
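
As an illustration of item 1, here is a minimal user-space sketch of
why the byte-order conversion matters when a length is derived from a
16-bit on-wire count.  It is not the mcast.c code: the function name
and the constant are assumptions.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

#define SRC_ADDR_LEN_ASSUMED 16 /* size of one IPv6 source address */

/*
 * Hypothetical helper: number of bytes of source addresses that follow
 * a query header, given the source count in network byte order.
 * Without ntohs() a count of 2 would be read as 512 on little-endian.
 */
static size_t query_sources_len(uint16_t nsrcs_net)
{
    return (size_t)ntohs(nsrcs_net) * SRC_ADDR_LEN_ASSUMED;
}
```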

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>