platform/kernel/linux-3.10.git
12 years agoaf_packet: packet_getsockopt() cleanup
Eric Dumazet [Thu, 19 Apr 2012 21:56:11 +0000 (21:56 +0000)]
af_packet: packet_getsockopt() cleanup

Factorize code, since most fetched values are int type.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
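
The factorization described above can be sketched in userspace as follows. This is a minimal model under stated assumptions, not the kernel code: demo_getsockopt(), struct demo_sock and the DEMO_* option names are hypothetical. It only shows the shape of the cleanup, where each case stores an int and the length clamp and copy-out are written once.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option names and socket state, for illustration only. */
enum { DEMO_AUXDATA = 1, DEMO_ORIGDEV = 2, DEMO_VERSION = 3 };

struct demo_sock {
	int auxdata;
	int origdev;
	int version;
};

/*
 * Each case only stores its result in the shared int; the clamp and the
 * copy to the caller's buffer are written once, after the switch.
 * Returns 0 on success, -1 on an unknown option or a negative length.
 */
static int demo_getsockopt(const struct demo_sock *sk, int optname,
			   void *optval, int *optlen)
{
	int val;
	int len = *optlen;

	if (len < 0)
		return -1;

	switch (optname) {
	case DEMO_AUXDATA:
		val = sk->auxdata;
		break;
	case DEMO_ORIGDEV:
		val = sk->origdev;
		break;
	case DEMO_VERSION:
		val = sk->version;
		break;
	default:
		return -1;
	}

	/* Common tail shared by every int-valued option. */
	if (len > (int)sizeof(val))
		len = (int)sizeof(val);
	memcpy(optval, &val, (size_t)len);
	*optlen = len;
	return 0;
}
```

Options whose value is not an int would still need their own copy-out path in such a design.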
12 years agotcp: move duplicate code from tcp_v4_init_sock()/tcp_v6_init_sock()
Neal Cardwell [Thu, 19 Apr 2012 09:55:21 +0000 (09:55 +0000)]
tcp: move duplicate code from tcp_v4_init_sock()/tcp_v6_init_sock()

This commit moves the (substantial) common code shared between
tcp_v4_init_sock() and tcp_v6_init_sock() to a new address-family
independent function, tcp_init_sock().

Centralizing this functionality should help avoid drift issues,
e.g. where the IPv4 side is updated without a corresponding update to
IPv6. There was already some drift: IPv4 initialized snd_cwnd to
TCP_INIT_CWND, while the IPv6 side was still initializing snd_cwnd to
2 (in this case it should not matter, since snd_cwnd is also
initialized in tcp_init_metrics(), but the general risks and
maintenance overhead remain).

When diffing the old and new code, note that new tcp_init_sock()
function uses the order of steps from the tcp_v4_init_sock()
implementation (the order is slightly different in
tcp_v6_init_sock()).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
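
The refactoring pattern described above can be sketched as a small userspace model. The struct, the demo_* function names and the constant values are hypothetical stand-ins chosen for illustration; only the idea (one address-family independent initializer called by both per-family wrappers) comes from the commit message.

```c
/* Hypothetical stand-ins; the values are illustrative only. */
#define DEMO_INIT_CWND   10
#define DEMO_MSS_DEFAULT 536

struct demo_tcp_sock {
	unsigned int snd_cwnd;
	unsigned int mss_cache;
	int family;	/* 4 or 6 */
};

/* Shared, address-family independent part: written exactly once. */
static void demo_tcp_init_sock(struct demo_tcp_sock *tp)
{
	tp->snd_cwnd = DEMO_INIT_CWND;
	tp->mss_cache = DEMO_MSS_DEFAULT;
}

/* The per-family wrappers keep only what really differs. */
static void demo_tcp_v4_init_sock(struct demo_tcp_sock *tp)
{
	demo_tcp_init_sock(tp);
	tp->family = 4;
}

static void demo_tcp_v6_init_sock(struct demo_tcp_sock *tp)
{
	demo_tcp_init_sock(tp);
	tp->family = 6;
}
```

With this shape, a change to the initial snd_cwnd cannot reach one family and miss the other.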
12 years agonet: allow better page reuse in splice(sock -> pipe)
Eric Dumazet [Thu, 19 Apr 2012 09:38:17 +0000 (09:38 +0000)]
net: allow better page reuse in splice(sock -> pipe)

splice() from socket to pipe needs the linear_to_page() helper to transfer
the skb header to part of a page.

We can reset the offset in the current sk->sk_sndmsg_page if we are the
last user of the page.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: add per-port option for enabling/disabling ports
Jiri Pirko [Fri, 20 Apr 2012 04:42:06 +0000 (04:42 +0000)]
team: add per-port option for enabling/disabling ports

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: allow to enable/disable ports
Jiri Pirko [Fri, 20 Apr 2012 04:42:05 +0000 (04:42 +0000)]
team: allow to enable/disable ports

This patch changes the content of the hashlist (used to get a port struct by
computed index (0...en_port_count-1)). Now the hash list contains only
enabled ports, so userspace will be able to say which ports can be used
for tx/rx. This becomes handy when userspace needs to disable ports
which do not belong to the active aggregator. By default, a newly added port
is enabled.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
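
One way to keep a list of enabled ports densely indexed as 0...en_port_count-1 is sketched below. This is a userspace model, not the team driver's data structure: struct demo_team and the demo_* helpers are hypothetical, and the swap-with-last removal is just one possible way to keep the indexes dense.

```c
#define DEMO_MAX_PORTS 8

/* Hypothetical model: only enabled ports are stored, indexed 0..count-1. */
struct demo_team {
	int en_ports[DEMO_MAX_PORTS];
	int en_port_count;
};

static int demo_port_enable(struct demo_team *t, int port_id)
{
	int i;

	for (i = 0; i < t->en_port_count; i++)
		if (t->en_ports[i] == port_id)
			return 0;	/* already enabled */
	if (t->en_port_count >= DEMO_MAX_PORTS)
		return -1;
	t->en_ports[t->en_port_count++] = port_id;
	return 0;
}

static void demo_port_disable(struct demo_team *t, int port_id)
{
	int i;

	for (i = 0; i < t->en_port_count; i++) {
		if (t->en_ports[i] != port_id)
			continue;
		/* Move the last enabled port into the hole to stay dense. */
		t->en_ports[i] = t->en_ports[--t->en_port_count];
		return;
	}
}

/* Pick a tx port from the enabled set only; -1 if none is enabled. */
static int demo_select_tx_port(const struct demo_team *t, unsigned int hash)
{
	if (t->en_port_count == 0)
		return -1;
	return t->en_ports[hash % (unsigned int)t->en_port_count];
}
```

Because disabled ports never appear in the array, the tx path needs no per-port "is enabled" test.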
12 years agoteam: lb: let userspace care about port macs
Jiri Pirko [Fri, 20 Apr 2012 04:42:04 +0000 (04:42 +0000)]
team: lb: let userspace care about port macs

Better to leave this for userspace

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: change big iov allocations
Eric Dumazet [Fri, 20 Apr 2012 18:04:01 +0000 (20:04 +0200)]
net: change big iov allocations

iovs of more than 8 entries are allocated in sendmsg()/recvmsg() through
sock_kmalloc().

As these allocations are temporary only and small enough, it makes sense
to use plain kmalloc() and avoid sk_omem_alloc atomic overhead.

Slightly changed fast path to be even faster.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
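
The allocation strategy described above (a small on-stack array for the common case, a plain temporary heap allocation for larger vectors) can be sketched in userspace. The DEMO_* limits and helper names are hypothetical, and malloc()/free() stand in for kmalloc()/kfree(); this is a sketch of the idea, not the kernel's sendmsg() path.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

#define DEMO_FASTIOV 8		/* entries that fit in the on-stack array */
#define DEMO_MAXIOV  1024	/* assumed upper bound on vector length */

/*
 * Return the array to use for nr entries: the caller's stack array in the
 * fast path, otherwise a temporary heap allocation stored in *heap.
 * Returns NULL if nr is too large or the allocation fails.
 */
static struct iovec *demo_get_iov(struct iovec *stack_iov, size_t nr,
				  struct iovec **heap)
{
	*heap = NULL;
	if (nr > DEMO_MAXIOV)
		return NULL;
	if (nr <= DEMO_FASTIOV)
		return stack_iov;
	*heap = malloc(nr * sizeof(struct iovec));
	return *heap;
}

/* Copy the caller's vector and add up the segment lengths. */
static size_t demo_total_len(const struct iovec *user_iov, size_t nr)
{
	struct iovec fast[DEMO_FASTIOV];
	struct iovec *heap, *iov;
	size_t i, total = 0;

	iov = demo_get_iov(fast, nr, &heap);
	if (!iov)
		return 0;
	memcpy(iov, user_iov, nr * sizeof(*iov));
	for (i = 0; i < nr; i++)
		total += iov[i].iov_len;
	free(heap);	/* free(NULL) is a no-op in the fast path */
	return total;
}
```

The temporary buffer lives only for the duration of one call, which is why no per-socket accounting is modelled here.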
12 years agotcp: Repair connection-time negotiated parameters
Pavel Emelyanov [Thu, 19 Apr 2012 03:41:57 +0000 (03:41 +0000)]
tcp: Repair connection-time negotiated parameters

There are options which are set up on a socket while performing the
TCP handshake. They need to be resurrected on a socket while repairing.
A new sockoption accepts a buffer and parses it. The buffer should
be a CODE:VALUE sequence of bytes, where CODE is the standard option
code and VALUE is the respective value.

Only 4 options should be handled on repaired socket.

To read 3 out of 4 of these options the TCP_INFO sockoption can be
used. An ability to get the last one (the mss_clamp) was added by
the previous patch.

Now the restore. Three of these options -- timestamp_ok, mss_clamp
and snd_wscale -- are just restored on a socket.

The sack_ok flag has 2 issues. First, whether or not to do sacks
at all. This flag is just read and set back. No other sack info is
saved or restored, since according to the standard and the code
dropping all sack-ed segments is OK: the sender will resubmit them,
so after the repair we will probably experience a pause in the
connection. Next, the fack bit. It's just set back on a socket if
the respective sysctl is set. No collected stats about packet flow
are preserved. As far as I see (please correct me if I'm wrong) the
fack-based congestion algorithm survives dropping all of the stats
and repairs itself eventually, probably losing performance for
that period.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
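
Parsing a CODE:VALUE buffer for the four options named above might look like the sketch below. The option kinds are the standard TCP ones (MSS 2, window scale 3, SACK-permitted 4, timestamps 8). Everything else is an assumption made for illustration: the struct layouts, the 32-bit width of each field, and the demo_* names do not describe the kernel's actual buffer format.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard TCP option kinds. */
#define DEMO_TCPOPT_MSS        2
#define DEMO_TCPOPT_WINDOW     3
#define DEMO_TCPOPT_SACK_PERM  4
#define DEMO_TCPOPT_TIMESTAMP  8

/* Hypothetical CODE:VALUE pair; the field widths are an assumption. */
struct demo_repair_opt {
	uint32_t code;
	uint32_t value;
};

/* Hypothetical subset of the negotiated per-connection state. */
struct demo_tcp_opts {
	uint16_t mss_clamp;
	uint8_t snd_wscale;
	uint8_t sack_ok;
	uint8_t tstamp_ok;
};

/* Restore the negotiated options; -1 on an unknown code or a bad value. */
static int demo_restore_opts(struct demo_tcp_opts *o,
			     const struct demo_repair_opt *opts, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		switch (opts[i].code) {
		case DEMO_TCPOPT_MSS:
			o->mss_clamp = (uint16_t)opts[i].value;
			break;
		case DEMO_TCPOPT_WINDOW:
			if (opts[i].value > 14)	/* largest valid shift count */
				return -1;
			o->snd_wscale = (uint8_t)opts[i].value;
			break;
		case DEMO_TCPOPT_SACK_PERM:
			o->sack_ok = 1;		/* presence alone enables it */
			break;
		case DEMO_TCPOPT_TIMESTAMP:
			o->tstamp_ok = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```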
12 years agotcp: Report mss_clamp with TCP_MAXSEG option in repair mode
Pavel Emelyanov [Thu, 19 Apr 2012 03:41:32 +0000 (03:41 +0000)]
tcp: Report mss_clamp with TCP_MAXSEG option in repair mode

The mss_clamp is the only connection-time negotiated option which
cannot be obtained from the user space. Make the TCP_MAXSEG sockopt
report one in the repair mode.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: Repair socket queues
Pavel Emelyanov [Thu, 19 Apr 2012 03:41:01 +0000 (03:41 +0000)]
tcp: Repair socket queues

Reading queues under repair mode is done with recvmsg call.
The queue-under-repair set by TCP_REPAIR_QUEUE option is used
to determine which queue should be read. Thus both send and
receive queue can be read with this.

Caller must pass the MSG_PEEK flag.

Writing to queues is done with sendmsg call and yet again --
the repair-queue option can be used to push data into the
receive queue.

When putting an skb into the receive queue, a zeroed tcp header is
prepended to it so that the tcp_hdr(skb)->syn and ->fin checks made
by tcp_recvmsg (after repair) see both flags set to zero.

The fin cannot be met in the queue while reading the source
socket, since the repair only works for closed/established
sockets and queueing fin packet always changes its state.

The syn in the queue denotes that the respective skb's seq
is "off-by-one" as compared to the actual payload length. Thus,
at the rcv queue refill we can just drop this flag and set the
skb's sequences to precise values.

When the repair mode is turned off, the write queue seqs are
updated so that the whole queue is considered to be 'already sent,
waiting for ACKs' (write_seq = snd_nxt <= snd_una). From the
protocol POV the send queue looks like it was sent, but the data
between the write_seq and snd_nxt is lost in the network.

This helps to avoid another sockoption for setting the snd_nxt
sequence. Leaving the whole queue in a 'not yet sent' state (as
it will be after sendmsg-s) will not allow to receive any acks
from the peer since the ack_seq will be after the snd_nxt. Thus
even the ack for the window probe will be dropped and the
connection will be 'locked' with the zero peer window.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
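
The argument about dropped acks can be illustrated with wraparound-safe sequence comparisons. The demo_before()/demo_after() helpers below use the usual signed-difference technique for 32-bit sequence numbers; demo_ack_acceptable() is a simplified, hypothetical model of the acceptance window (snd_una <= ack <= snd_nxt), not the kernel's ack processing.

```c
#include <stdint.h>

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
static int demo_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static int demo_after(uint32_t a, uint32_t b)
{
	return demo_before(b, a);
}

/*
 * Simplified model: an ack is only acceptable if it covers data that
 * has been sent, i.e. snd_una <= ack <= snd_nxt (modulo 2^32).
 */
static int demo_ack_acceptable(uint32_t snd_una, uint32_t snd_nxt,
			       uint32_t ack)
{
	return !demo_before(ack, snd_una) && !demo_after(ack, snd_nxt);
}
```

With a repaired queue of 500 bytes starting at sequence 1000, leaving snd_nxt at 1000 makes an ack of 1500 fall outside the window, while advancing snd_nxt to 1500 makes it acceptable.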
12 years agotcp: Initial repair mode
Pavel Emelyanov [Thu, 19 Apr 2012 03:40:39 +0000 (03:40 +0000)]
tcp: Initial repair mode

This includes (according to the previous description):

* TCP_REPAIR sockoption

This one just puts the socket in/out of the repair mode.
Allowed for CAP_NET_ADMIN and for closed/established sockets only.
When repair mode is turned off and the socket happens to be in
the established state the window probe is sent to the peer to
'unlock' the connection.

* TCP_REPAIR_QUEUE sockoption

This one sets the queue which we're about to repair. The
'no-queue' is set by default.

* TCP_QUEUE_SEQ sockoption

Sets the write_seq/rcv_nxt of a selected repaired queue.
Allowed for TCP_CLOSE-d sockets only. When the socket changes
its state the other seq-s are changed by the kernel according
to the protocol rules (most of the existing code is actually
reused).

* Ability to forcibly bind a socket to a port

The sk->sk_reuse is set to SK_FORCE_REUSE.

* Immediate connect modification

The connect syscall initializes the connection, then directly jumps
to the code which finalizes it.

* Silent close modification

The close just aborts the connection (similar to SO_LINGER with 0
time) but without sending any FIN/RST-s to peer.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: Move code around
Pavel Emelyanov [Thu, 19 Apr 2012 03:40:01 +0000 (03:40 +0000)]
tcp: Move code around

This is just a preparation patch, which makes the code needed for
TCP repair ready for use.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosock: Introduce named constants for sk_reuse
Pavel Emelyanov [Thu, 19 Apr 2012 03:39:36 +0000 (03:39 +0000)]
sock: Introduce named constants for sk_reuse

Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
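
The three reuse values can be modelled as below. The numeric values (0, 1, 2) are from the description above; the DEMO_* names and demo_bind_allowed() are hypothetical, and the conflict rule is heavily simplified (it ignores addresses and bound devices), so treat it as a sketch of the intent only.

```c
#define DEMO_SK_NO_REUSE    0
#define DEMO_SK_CAN_REUSE   1
#define DEMO_SK_FORCE_REUSE 2

/*
 * Simplified model: may a new socket bind to a port that another
 * socket already owns? Returns 1 if allowed, 0 on conflict.
 */
static int demo_bind_allowed(int new_reuse, int owner_reuse,
			     int owner_listening)
{
	/* Value 2 forcibly reuses the port, whatever the owner wants. */
	if (new_reuse == DEMO_SK_FORCE_REUSE)
		return 1;
	/* Classic reuse: both sides agree and the owner is not listening. */
	if (new_reuse == DEMO_SK_CAN_REUSE &&
	    owner_reuse != DEMO_SK_NO_REUSE && !owner_listening)
		return 1;
	return 0;
}
```

Keeping 0 and 1 unchanged means existing boolean-style tests of the reuse field keep working.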
12 years agodrivers/net: decouple ISA and ISA_DMA_API
Arnd Bergmann [Fri, 20 Apr 2012 10:56:16 +0000 (10:56 +0000)]
drivers/net: decouple ISA and ISA_DMA_API

The two options are separate, and some platforms (e.g. arm pxa)
have ISA slots but no ISA dma controller, so they cannot build
drivers using the DMA API functions.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosungem: use mdelay instead of udelay where necessary
Arnd Bergmann [Fri, 20 Apr 2012 10:56:15 +0000 (10:56 +0000)]
sungem: use mdelay instead of udelay where necessary

Some architectures like ARM cannot handle large numbers as
arguments to udelay, so the drivers should use mdelay when
delaying for multiple milliseconds.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodonauboe: replace excessive udelay with msleep
Arnd Bergmann [Fri, 20 Apr 2012 10:56:14 +0000 (10:56 +0000)]
donauboe: replace excessive udelay with msleep

No driver should spin the CPU for 10ms, so better use
an msleep, which is allowed in the ->suspend function.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago8390: select CRC32 support
Arnd Bergmann [Fri, 20 Apr 2012 10:56:13 +0000 (10:56 +0000)]
8390: select CRC32 support

The ax88796 driver uses the CRC32 functions, so make sure that
they are actually enabled.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net: iwmc3200 depends on EXPERIMENTAL
Arnd Bergmann [Fri, 20 Apr 2012 10:56:12 +0000 (10:56 +0000)]
drivers/net: iwmc3200 depends on EXPERIMENTAL

The iwmc3200 driver selects other code in Kconfig that depends on
EXPERIMENTAL. Kconfig warns about this when CONFIG_EXPERIMENTAL
is not already set, so logically, these options should also
be marked experimental or promoted to stable.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agocaif: include linux/io.h
Arnd Bergmann [Fri, 20 Apr 2012 10:56:11 +0000 (10:56 +0000)]
caif: include linux/io.h

The caif_shmcore requires io.h in order to use ioremap, so include that
explicitly to compile in all configurations.

Also add a note about the use of ioremap(), which is not a proper way
to map a DMA buffer into kernel space. It's not completely clear what
the intention is for using ioremap, but it is clear that the result
of ioremap must not simply be accessed using kernel pointers but
should use readl/writel or memcpy_{to,from}io. Assigning the result
of ioremap to a regular pointer that can also be set to something
else is not ok.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net: add missing __devexit_p() annotations
Arnd Bergmann [Fri, 20 Apr 2012 10:56:10 +0000 (10:56 +0000)]
drivers/net: add missing __devexit_p() annotations

Drivers that refer to a __devexit function in an operations
structure need to annotate that pointer with __devexit_p so that
it is replaced with a NULL pointer when the section gets discarded.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
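
The effect of such an annotation can be modelled with an ordinary macro. In this userspace sketch, demo_devexit_p(), DEMO_HOTPLUG and struct demo_driver are hypothetical names; the point is only that the pointer stored in the operations structure becomes NULL when the exit code would be discarded, so nothing refers to a missing function.

```c
#include <stddef.h>

/*
 * Build with -DDEMO_HOTPLUG to model a configuration that keeps the
 * exit code; without it the pointer is replaced by NULL.
 */
#ifdef DEMO_HOTPLUG
#define demo_devexit_p(fn) (fn)
#else
#define demo_devexit_p(fn) NULL
#endif

struct demo_driver {
	int (*probe)(void);
	void (*remove)(void);
};

static int demo_probe(void)
{
	return 0;
}

/* Stands in for a function that may live in a discarded section. */
void demo_remove(void);
void demo_remove(void)
{
}

static const struct demo_driver demo_drv = {
	.probe  = demo_probe,
	.remove = demo_devexit_p(demo_remove),
};
```

Callers of such a structure must check the remove pointer for NULL before calling it.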
12 years agodavinci_cpdma: export symbols used by other drivers
Arnd Bergmann [Fri, 20 Apr 2012 10:56:09 +0000 (10:56 +0000)]
davinci_cpdma: export symbols used by other drivers

The davinci_emac driver can be a module, so the symbols
it needs from the cpdma driver must be exported.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: remove suspicious comment
Richard Cochran [Fri, 20 Apr 2012 18:50:36 +0000 (18:50 +0000)]
pch_gbe: remove suspicious comment

The time stamping code in this driver appears to have been copied from
the ixp4xx_eth.c driver, including this timing comment. I had actually
measured the time stamp delay on an IXP425, but I really doubt that this
value also applies here.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: run the ptp bpf just once per packet
Richard Cochran [Fri, 20 Apr 2012 18:50:35 +0000 (18:50 +0000)]
pch_gbe: run the ptp bpf just once per packet

This patch fixes code which needlessly ran the BPF twice per
packet. Instead, we just run the classifier once and test
whether the packet is any kind of PTP event message.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: correct receive time stamp filtering
Takahiro Shimizu [Fri, 20 Apr 2012 18:50:34 +0000 (18:50 +0000)]
pch_gbe: correct receive time stamp filtering

This patch fixes the driver so that multicast PTP event messages can
be recognized by the hardware time stamping unit. The station address
register must be set according to the desired transport type.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: do not set the channel control register
Takahiro Shimizu [Fri, 20 Apr 2012 18:50:33 +0000 (18:50 +0000)]
pch_gbe: do not set the channel control register

We will let the pch_gbe code do that according to the receive time stamp
filter.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: improve coding style
Takahiro Shimizu [Fri, 20 Apr 2012 18:50:32 +0000 (18:50 +0000)]
pch_gbe: improve coding style

This patch clears up a few coding style issues:

- Makes two function definitions a bit nicer looking.
- Remove unneeded parentheses.
- Simplify macros for register bits.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: export a method to set the receive match address
Takahiro Shimizu [Fri, 20 Apr 2012 18:50:31 +0000 (18:50 +0000)]
pch_gbe: export a method to set the receive match address

The code in pch_gbe_main will need to call this method in order to set the
station address register according to the receive time stamping filter.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: reprogram multicast address register on reset
Takahiro Shimizu [Fri, 20 Apr 2012 18:50:30 +0000 (18:50 +0000)]
pch_gbe: reprogram multicast address register on reset

The reset logic after a Rx FIFO overrun will clear the programmed
multicast addresses. This patch fixes the issue by reprogramming the
registers after the reset.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: simplify transmit time stamping flag test
Takahiro Shimizu [Fri, 20 Apr 2012 18:50:29 +0000 (18:50 +0000)]
pch_gbe: simplify transmit time stamping flag test

This patch makes logic surrounding the test of the
transmit time stamping flag more readable.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopch_gbe: scale time stamps to nanoseconds
Takahiro Shimizu [Fri, 20 Apr 2012 18:50:28 +0000 (18:50 +0000)]
pch_gbe: scale time stamps to nanoseconds

This patch fixes the helper functions that give the transmit and
receive time stamps to return nanoseconds, instead of arbitrary clock
ticks.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
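
Converting a raw hardware counter to nanoseconds can be sketched as below. The commit message does not state the tick period, so the shift of 5 (one tick = 32 ns) is purely an assumption for illustration, as are the demo_* names and the split of the counter into two 32-bit register halves.

```c
#include <stdint.h>

/* Assumed scale: one clock tick = 32 ns (not stated in the source). */
#define DEMO_TICKS_NS_SHIFT 5

/* Join the high and low register halves, then scale ticks to ns. */
static uint64_t demo_ticks_to_ns(uint32_t hi, uint32_t lo)
{
	uint64_t ticks = ((uint64_t)hi << 32) | lo;

	return ticks << DEMO_TICKS_NS_SHIFT;
}
```

Doing the conversion inside the helper means every caller sees the same unit.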
12 years agonet: Remove register_net_sysctl_table
Eric W. Biederman [Thu, 19 Apr 2012 13:46:06 +0000 (13:46 +0000)]
net: Remove register_net_sysctl_table

All of the users have been converted to use register_net_sysctl so we
no longer need register_net_sysctl_table.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Delete all remaining instances of ctl_path
Eric W. Biederman [Thu, 19 Apr 2012 13:45:29 +0000 (13:45 +0000)]
net: Delete all remaining instances of ctl_path

We don't use struct ctl_path anymore so delete the exported constants.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Convert all sysctl registrations to register_net_sysctl
Eric W. Biederman [Thu, 19 Apr 2012 13:44:49 +0000 (13:44 +0000)]
net: Convert all sysctl registrations to register_net_sysctl

This results in code with less boiler plate that is a bit easier
to read.

Additionally, this stops us from using compatibility code in the sysctl
core, hastening the day when the compatibility code can be removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Convert nf_conntrack_proto to use register_net_sysctl
Eric W. Biederman [Thu, 19 Apr 2012 13:43:55 +0000 (13:43 +0000)]
net: Convert nf_conntrack_proto to use register_net_sysctl

There isn't much advantage here except that string paths are a bit
easier to read, and converting everything to them allows me to kill off
ctl_path.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet ipv4: Convert devinet to use register_net_sysctl
Eric W. Biederman [Thu, 19 Apr 2012 13:42:09 +0000 (13:42 +0000)]
net ipv4: Convert devinet to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

We no longer need to malloc dev_name to keep it alive for the lifetime of
our sysctl registration; instead we can use a small temporary buffer on
the stack.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
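
Building the ascii path in a small stack buffer can be sketched in userspace like this. demo_devinet_path() and DEMO_IFNAMSIZ are hypothetical names; the sketch only shows how a per-device path such as net/ipv4/conf/eth0 fits in a fixed-size temporary buffer, and does not call any sysctl registration function.

```c
#include <stdio.h>

#define DEMO_IFNAMSIZ 16	/* assumed maximum interface name length */

/*
 * Format the per-device sysctl path into buf.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int demo_devinet_path(char *buf, size_t len, const char *dev_name)
{
	int n = snprintf(buf, len, "net/ipv4/conf/%s", dev_name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would declare the buffer as char path[sizeof("net/ipv4/conf/") + DEMO_IFNAMSIZ]; and pass it to the registration call, after which the buffer can go out of scope.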
12 years agonet ipv6: Convert addrconf to use register_net_sysctl
Eric W. Biederman [Thu, 19 Apr 2012 13:41:24 +0000 (13:41 +0000)]
net ipv6: Convert addrconf to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

We no longer need to malloc dev_name to keep it alive for the lifetime of
our sysctl registration; instead we can use a small temporary buffer on
the stack.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet decnet: Convert to use register_net_sysctl
Eric W. Biederman [Thu, 19 Apr 2012 13:40:37 +0000 (13:40 +0000)]
net decnet: Convert to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet neighbour: Convert to use register_net_sysctl
Eric W. Biederman [Thu, 19 Apr 2012 13:38:03 +0000 (13:38 +0000)]
net neighbour: Convert to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

We no longer need to malloc dev_name to keep it alive for the lifetime of
our sysctl registration; instead we can use a small temporary buffer on
the stack.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet ipv6: Don't use sysctl tables with .child entries.
Eric W. Biederman [Thu, 19 Apr 2012 13:37:09 +0000 (13:37 +0000)]
net ipv6: Don't use sysctl tables with .child entries.

The sysctl core no longer natively understands sysctl tables
with .child entries.

Split the ipv6_table to remove the .child entries.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet llc: Don't use sysctl tables with .child entries.
Eric W. Biederman [Thu, 19 Apr 2012 13:35:39 +0000 (13:35 +0000)]
net llc: Don't use sysctl tables with .child entries.

The sysctl core no longer natively understands sysctl tables with .child
entries.

Kill the intermediate tables and use register_net_sysctl directly to
remove the need for compatibility code.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet ax25: Simplify and cleanup the ax25 sysctl handling.
Eric W. Biederman [Thu, 19 Apr 2012 13:34:18 +0000 (13:34 +0000)]
net ax25: Simplify and cleanup the ax25 sysctl handling.

Don't register/unregister every ax25 table in a batch.  Instead register
and unregister per device ax25 sysctls as ax25 devices come and go.

This moves ax25 to be a completely modern sysctl user.  Registering the
sysctls in just the initial network namespace, removing the use of
.child entries that are no longer natively supported by the sysctl core
and taking advantage of the fact that there are no longer any ordering
constraints between registering and unregistering different sysctl
tables.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet ipv4: Remove the unneeded registration of an empty net/ipv4/neigh
Eric W. Biederman [Thu, 19 Apr 2012 13:32:39 +0000 (13:32 +0000)]
net ipv4: Remove the unneeded registration of an empty net/ipv4/neigh

sysctl no longer requires explicit creation of directories.  The neigh
directory is always populated with at least a default entry so this
won't cause any user visible changes.

Delete the ipv4_path and the ipv4_skeleton; these are no longer needed.

Directly register the ipv4_route_table.

And since I am an idiot remove the header definitions that I should
have removed in the previous patch.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet ipv6: Remove unneded registration of an empty net/ipv6/neigh
Eric W. Biederman [Thu, 19 Apr 2012 13:26:19 +0000 (13:26 +0000)]
net ipv6: Remove unneded registration of an empty net/ipv6/neigh

sysctl no longer requires explicit creation of directories.  The neigh
directory is always populated with at least a default entry so this
should cause no user visible changes.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet core: Remove unneded creation of an empty net/core sysctl directory
Eric W. Biederman [Thu, 19 Apr 2012 13:25:13 +0000 (13:25 +0000)]
net core: Remove unneded creation of an empty net/core sysctl directory

On the next line we register the net_core_table in net/core which
creates the directory and ensures it exists.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Move all of the network sysctls without a namespace into init_net.
Eric W. Biederman [Thu, 19 Apr 2012 13:24:33 +0000 (13:24 +0000)]
net: Move all of the network sysctls without a namespace into init_net.

This makes it clearer which sysctls are relative to your current network
namespace.

This makes it a little less error prone by not exposing sysctls for the
initial network namespace in other namespaces.

This is the same way we handle all of our other network interfaces to
userspace and I can't honestly remember why we didn't do this for
sysctls right from the start.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Kill register_sysctl_rotable
Eric W. Biederman [Thu, 19 Apr 2012 13:22:55 +0000 (13:22 +0000)]
net: Kill register_sysctl_rotable

register_sysctl_rotable never caught on as an interesting way to
register sysctls.  My take on the situation is that what we want are
sysctls that we can only see in the initial network namespace.  What we
have implemented with register_sysctl_rotable are sysctls that we can
see in all of the network namespaces and can only change in the initial
network namespace.

That is a very silly way to go.  Just register the network sysctls
in the initial network namespace and we don't have any weird special
cases to deal with.

The sysctls affected are:
/proc/sys/net/ipv4/ipfrag_secret_interval
/proc/sys/net/ipv4/ipfrag_max_dist
/proc/sys/net/ipv6/ip6frag_secret_interval
/proc/sys/net/ipv6/mld_max_msf

I really don't expect anyone will miss them if they can't read them in a
child user namespace.

CC: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet sysctl: Initialize the network sysctls sooner to avoid problems.
Eric W. Biederman [Thu, 19 Apr 2012 13:20:32 +0000 (13:20 +0000)]
net sysctl: Initialize the network sysctls sooner to avoid problems.

If the netfilter code is modified to use register_net_sysctl_table the
kernel fails to boot because the per net sysctl infrastructure is not set
up soon enough.  So to avoid races call net_sysctl_init from sock_init().

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet sysctl: Register an empty /proc/sys/net
Eric W. Biederman [Thu, 19 Apr 2012 13:19:46 +0000 (13:19 +0000)]
net sysctl: Register an empty /proc/sys/net

Implementation limitations of the sysctl core won't let /proc/sys/net
reside in a network namespace.  /proc/sys/net at least must be registered
as a normal sysctl.  So register /proc/sys/net early as an empty directory
to guarantee we don't violate this constraint and hit bugs in the sysctl
implementation.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Implement register_net_sysctl.
Eric W. Biederman [Thu, 19 Apr 2012 13:18:47 +0000 (13:18 +0000)]
net: Implement register_net_sysctl.

Right now all of the networking sysctl registrations are running in a
compatibility mode.  The native sysctl registration api takes a cstring
for a path and a simple ctl_table.  Implement register_net_sysctl so
that we can register network sysctls without needing to use
compatibility code in the sysctl core.

Switching from a ctl_path to a cstring results in less boiler plate
and denser code that is a little easier to read.

I would simply have changed the arguments to register_net_sysctl_table
instead of keeping two functions in parallel, but gcc will allow a
ctl_path pointer to be passed as a char * pointer while issuing only a
warning, so completely incorrect code could be built.  Since I
have to change the function name I am taking advantage of the situation
to let both register_net_sysctl and register_net_sysctl_table live for a
short time in parallel, which makes clean conversion patches a bit easier
to read and write.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'tipc_net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg...
David S. Miller [Sat, 21 Apr 2012 00:40:31 +0000 (20:40 -0400)]
Merge branch 'tipc_net-next' of git://git./linux/kernel/git/paulg/linux

12 years agoatl1c: remove MDIO_REG_ADDR_MASK in atl1c_mdio_read/write
Huang, Xiong [Wed, 18 Apr 2012 22:01:31 +0000 (22:01 +0000)]
atl1c: remove MDIO_REG_ADDR_MASK in atl1c_mdio_read/write

MDIO_REG_ADDR_MASK is already applied in function
atl1c_write_phy_reg and atl1c_read_phy_reg

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: fix WoL(magic) issue for l2cb 1.1
Huang, Xiong [Wed, 18 Apr 2012 22:01:30 +0000 (22:01 +0000)]
atl1c: fix WoL(magic) issue for l2cb 1.1

l2cb 1.1 hardware has a bug in magic wakeup;
the workaround is to set pattern enable as well.
WoL-related registers are refined as well.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: refine atl1c_pcie_patch
Huang, Xiong [Wed, 18 Apr 2012 22:01:29 +0000 (22:01 +0000)]
atl1c: refine atl1c_pcie_patch

The PCIE_PHYMISC_FORCE_RCV_DET bit is only for l1c and l2c, to fix a WoL
issue. Other chips set bit 5 of REG_MASTER_CTRL instead, which saves
more power than the former, and the bit should be kept set at all times.
l2cb 1.x has a special setting for L0S/L1.
l2cb 1.x and l1d 1.x should clear Vendor Message on some platforms,
otherwise the root complex hangs.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: refine/update ASPM configuration
Huang, Xiong [Wed, 18 Apr 2012 22:01:28 +0000 (22:01 +0000)]
atl1c: refine/update ASPM configuration

Some platforms (BIOS or OS) may change the ASPM configuration in the
PCI Express Link Control Register directly and dynamically,
regardless of whether the device driver is installed.
Checking for ASPM support during the driver init phase by reading the
PCI Express Link Control Register therefore doesn't make sense.
This update assumes L0S/L1 is enabled by default when hw->ctrl_flags
is initialized. atl1c_set_aspm will write the real configuration, based
on chip capability, to the hardware register.
atl1c_disable_l0s_l1 and the register definition of REG_PM_CTRL are
refined as well.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: clear bit MASTER_CTRL_CLK_SEL_DIS in atl1c_pcie_patch
Huang, Xiong [Wed, 18 Apr 2012 22:01:27 +0000 (22:01 +0000)]
atl1c: clear bit MASTER_CTRL_CLK_SEL_DIS in atl1c_pcie_patch

The MASTER_CTRL_CLK_SEL_DIS bit could be set before entering suspend.
Clear it after resume so that pclk (the PCIE clock) can switch to a
low frequency (25M) in some circumstances to save power.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: refine reg definition of REG_MASTER_CTRL
Huang, Xiong [Wed, 18 Apr 2012 22:01:26 +0000 (22:01 +0000)]
atl1c: refine reg definition of REG_MASTER_CTRL

Refine/update the REG_MASTER_CTRL register definition according to the
hardware spec.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: clear PCIE error status in atl1c_reset_pcie
Huang, Xiong [Wed, 18 Apr 2012 22:01:25 +0000 (22:01 +0000)]
atl1c: clear PCIE error status in atl1c_reset_pcie

Clear the PCIE error status (the error log is write-1-clear).
REG_PCIE_UC_SEVERITY is removed as it's a standard PCIe register;
the kernel API is used to access it instead.
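
For readers unfamiliar with write-1-clear semantics, here is a minimal
userspace model. The register type and helper names are hypothetical and
this is not the driver's register access code; it only shows why writing
a status value back to itself clears every logged bit.

```c
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear status register. */
struct demo_w1c_reg {
	uint32_t bits;
};

static uint32_t demo_w1c_read(const struct demo_w1c_reg *reg)
{
	return reg->bits;
}

/* Writing 1 to a bit clears it; writing 0 leaves the bit untouched. */
static void demo_w1c_write(struct demo_w1c_reg *reg, uint32_t val)
{
	reg->bits &= ~val;
}

/* Clear whatever is currently logged by writing the read value back. */
static void demo_clear_errors(struct demo_w1c_reg *reg)
{
	demo_w1c_write(reg, demo_w1c_read(reg));
}
```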

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: remove dmar_dly_cnt and dmaw_dly_cnt
Huang, Xiong [Wed, 18 Apr 2012 22:01:24 +0000 (22:01 +0000)]
atl1c: remove dmar_dly_cnt and dmaw_dly_cnt

dmar_dly_cnt and dmaw_dly_cnt aren't used by hardware/driver any more.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: update right threshold for TSO
Huang, Xiong [Wed, 18 Apr 2012 22:01:23 +0000 (22:01 +0000)]
atl1c: update right threshold for TSO

atl1c_configure_tx used a wrong value, MAX_TX_OFFLOAD_THRESH (9KB),
for the TSO threshold.
The right value is 7KB.
The Fast Ethernet controller doesn't support Jumbo frames.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: add module parameter for l1c_wait_until_idle
Huang, Xiong [Wed, 18 Apr 2012 22:01:22 +0000 (22:01 +0000)]
atl1c: add module parameter for l1c_wait_until_idle

l1c_wait_until_idle is called for several modules (TXQ/RXQ/TXMAC/RXMAC).
Each module has its own idle/busy status in reg REG_IDLE_STATUS.
The previous code wrongly returned only if all modules were in idle
status, regardless of which individual module the 'stop' action was
applied to.
Refine the REG_IDLE_STATUS definition as well.
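
A minimal sketch of the per-module idea, assuming one busy bit per
module. The bit positions and the demo_ names are made up for
illustration; the real REG_IDLE_STATUS layout is not given in this
message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical busy bits, one per module. */
#define DEMO_IDLE_RXMAC_BUSY (1u << 0)
#define DEMO_IDLE_TXMAC_BUSY (1u << 1)
#define DEMO_IDLE_RXQ_BUSY   (1u << 2)
#define DEMO_IDLE_TXQ_BUSY   (1u << 3)

/* Poll the status word until every busy bit selected by module_mask is
 * clear. Bits of other modules are ignored. Returns true if the
 * selected modules went idle within max_polls reads. */
static bool demo_wait_until_idle(const volatile uint32_t *status_reg,
				 uint32_t module_mask,
				 unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if ((*status_reg & module_mask) == 0)
			return true;
	}
	return false;
}
```

Passing only the mask of the module that was just stopped means a busy
TXQ no longer delays a caller that is waiting for the RXQ.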

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: threshold for ASPM is changed based on chip capability
Huang, Xiong [Wed, 18 Apr 2012 22:01:21 +0000 (22:01 +0000)]
atl1c: threshold for ASPM is changed based on chip capability

The threshold settings that control ASPM differ between chips.
Currently, all gigabit-capable chips have limited ASPM under
100M throughput.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agostmmac: do not fail when probe and there is no csr clk defined
Giuseppe CAVALLARO [Wed, 18 Apr 2012 19:48:22 +0000 (19:48 +0000)]
stmmac: do not fail when probe and there is no csr clk defined

On some platforms, for example where we are doing the bring-up,
the csr clock is not passed from the framework and the Ethernet
device driver fails even though it can work without any issues
using the default values. So this patch just warns when the csr
clock cannot be acquired, without failing the probe
step. I have just tested it on ST STiH415 SoC (ARM).

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agostmmac: verify the dma_cfg platform fields
Giuseppe CAVALLARO [Wed, 18 Apr 2012 19:48:21 +0000 (19:48 +0000)]
stmmac: verify the dma_cfg platform fields

Recently the dma parameters that can be passed from the platform
have been moved from the plat_stmmacenet_data to the stmmac_dma_cfg.

If this new structure is not properly allocated, the driver can
fail. This is an example of how this field is managed on ST platforms:

static struct stmmac_dma_cfg gmac_dma_setting = {
        .pbl = 32,
};

static struct plat_stmmacenet_data stih415_ethernet_platform_data[] = {
        {
                .dma_cfg = &gmac_dma_setting,
                .has_gmac = 1,
[snip]

So this patch verifies the dma_cfg passed from the platform.
If it is NULL, there is no reason for the driver to fail,
and some default values can be used instead. These are fine for all
the Synopsys chips and could only affect performance.
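
The fallback can be pictured with the following userspace sketch. The
demo_ names are hypothetical, only the pbl field from the example above
is modelled, and the default of 8 is an assumption made for
illustration, not a value stated in this message.

```c
#include <stddef.h>

/* Cut-down stand-in for the platform DMA settings; only .pbl is shown. */
struct demo_dma_cfg {
	int pbl;
};

/* Assumed default burst length, used only for this illustration. */
#define DEMO_DEFAULT_PBL 8

/* Return the settings to program: the platform's if it supplied any,
 * otherwise the defaults. A NULL pointer is not treated as an error. */
static struct demo_dma_cfg demo_resolve_dma_cfg(const struct demo_dma_cfg *plat)
{
	struct demo_dma_cfg cfg = { .pbl = DEMO_DEFAULT_PBL };

	if (plat)
		cfg = *plat;
	return cfg;
}
```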

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
cc: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agostmmac: Move the mdio_register/_unregister in probe/remove
Francesco Virlinzi [Wed, 18 Apr 2012 19:48:20 +0000 (19:48 +0000)]
stmmac: Move the mdio_register/_unregister in probe/remove

This patch moves the mdio_register/_unregister calls into the
probe/remove functions; this is also required when hibernation to
disk is done.

Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agostmmac: use custom init/exit functions in pm ops
Francesco Virlinzi [Wed, 18 Apr 2012 19:48:19 +0000 (19:48 +0000)]
stmmac: use custom init/exit functions in pm ops

Freeze and restore can call the custom init/exit functions.
The patch also adds a custom data field that can be used
for storing platform data useful for restoring the embedded
setup (e.g. GPIO, SYSCFG).

Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoiwlwifi: Remove inconsistent and redundant declaration
David Spinadel [Thu, 19 Apr 2012 20:46:38 +0000 (13:46 -0700)]
iwlwifi: Remove inconsistent and redundant declaration

Remove the declaration of iwl_alloc_traffic_mem from iwl-agn.h,
from the methods that were exposed to support MVM.

MVM doesn't have to use this declaration.

CC: netdev@vger.kernel.org
Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotipc: Ensure network address change doesn't impact configuration service
Allan Stephens [Wed, 18 Apr 2012 13:42:56 +0000 (09:42 -0400)]
tipc: Ensure network address change doesn't impact configuration service

Enhances command validation done by TIPC's configuration service so
that it works properly even if the node's network address is changed in
mid-operation. The default node address of <0.0.0> is now recognized as an
alias for "this node" even after a new network address has been assigned.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Ensure network address change doesn't impact rejected message
Allan Stephens [Wed, 18 Apr 2012 13:42:29 +0000 (09:42 -0400)]
tipc: Ensure network address change doesn't impact rejected message

Revises handling of a rejected message to ensure that a locally
originated message is returned properly even if the node's network
address is changed in mid-operation. The routine now treats the
default node address of <0.0.0> as an alias for "this node" when
determining where to send a returned message.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: handle <0.0.0> as an alias for this node on outgoing msgs
Allan Stephens [Wed, 18 Apr 2012 13:27:22 +0000 (09:27 -0400)]
tipc: handle <0.0.0> as an alias for this node on outgoing msgs

Revises handling of send routines for payload messages to ensure that
they are processed properly even if the node's network address is
changed in mid-operation. The routines now treat the default node
address of <0.0.0> as an alias for "this node" when determining where
to send an outgoing message.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: properly handle off-node send requests with invalid addr
Allan Stephens [Wed, 18 Apr 2012 13:22:56 +0000 (09:22 -0400)]
tipc: properly handle off-node send requests with invalid addr

There are two send routines that might conceivably be asked by an
application to send a message off-node when the node is still using
the default network address.  These now have an added check that
detects this and rejects the message gracefully.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: take lock while updating node network address
Allan Stephens [Wed, 18 Apr 2012 13:12:09 +0000 (09:12 -0400)]
tipc: take lock while updating node network address

The routine that changes the node's network address now takes TIPC's
network lock in write mode while the main address variable and associated
data structures are being changed; this is needed to ensure that the
link subsystem won't attempt to send a message off-node until the sending
port's message header template has been updated with the node's new
network address.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Ensure network address change doesn't impact local connections
Allan Stephens [Tue, 17 Apr 2012 22:42:28 +0000 (18:42 -0400)]
tipc: Ensure network address change doesn't impact local connections

Revises routines that deal with connections between two ports on
the same node to ensure the connection is not impacted if the node's
network address is changed in mid-operation. The routines now treat
the default node address of <0.0.0> as an alias for "this node" in
the following situations:

1) Incoming messages destined to a connected port now handle the alias
properly when validating that the message was sent by the expected
peer port, ensuring that the message will be accepted regardless of
whether it specifies the node's old network address or its current one.

2) The code which completes connection establishment now handles the
alias properly when determining if the peer port is on the same node
as the connected port.

An added benefit of addressing issue 1) is that some peer port
validation code has been relocated to TIPC's socket subsystem, which
means that validation is no longer done twice when a message is
sent to a non-socket port (such as TIPC's configuration service or
network topology service).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: delete duplicate peerport/peernode helper functions
Allan Stephens [Tue, 17 Apr 2012 22:36:42 +0000 (18:36 -0400)]
tipc: delete duplicate peerport/peernode helper functions

Prior to commit 23dd4cce387124ec3ea06ca30d17854ae4d9b772

    "tipc: Combine port structure with tipc_port structure"

there was a need for the two sets of helper functions.  But
now they are just duplicates.  Remove the globally visible
ones, and mark the remaining ones as inline.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Ensure network address change doesn't impact new port
Allan Stephens [Tue, 17 Apr 2012 22:22:49 +0000 (18:22 -0400)]
tipc: Ensure network address change doesn't impact new port

Re-orders port creation logic so that the initialization of a new
port's message header template occurs while the port list lock is
held. This ensures that a change to the node's network address that
occurs at the same time as the port is being created does not result
in the template identifying the sender using the former network
address. The new approach guarantees that the new port's template is
using the current network address or that it will be updated when
the address changes.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Optimize re-initialization of port message header templates
Allan Stephens [Tue, 17 Apr 2012 22:17:35 +0000 (18:17 -0400)]
tipc: Optimize re-initialization of port message header templates

Removes an unnecessary check in the logic that updates the message
header template for existing ports when a node's network address is
first assigned. There is no longer any need to check to see if the
node's network address has actually changed since the calling routine
has already verified that this is so.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Ensure network address change doesn't impact name table updates
Allan Stephens [Tue, 17 Apr 2012 22:16:34 +0000 (18:16 -0400)]
tipc: Ensure network address change doesn't impact name table updates

Revises routines that add and remove an entry from a node's name table
so that the publication scope lists are updated properly even if the
node's network address is changed in mid-operation. The routines now
recognize the default node address of <0.0.0> as an alias for "this node"
even after a new network address has been assigned.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Add routines for safe checking of node's network address
Allan Stephens [Tue, 17 Apr 2012 22:02:01 +0000 (18:02 -0400)]
tipc: Add routines for safe checking of node's network address

Introduces routines that test whether a given network address is
equal to a node's own network address or if it lies within the node's
own network cluster, and which work properly regardless of whether
the node is using the default network address <0.0.0> or a non-zero
network address that is assigned later on. In essence, these routines
ensure that address <0.0.0> is treated as an alias for "this node",
regardless of which network address the node is actually using.

Old users of the pre-existing more strict match in_own_cluster()
have been accordingly redirected to what is now called
in_own_cluster_exact() --- which does not extend matching to <0.0.0>.
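
A userspace sketch of what such checks could look like, assuming the
usual packing of a <Z.C.N> address into 32 bits (8-bit zone, 12-bit
cluster, 12-bit node). The demo_ names are stand-ins and this is not a
copy of the TIPC code.

```c
#include <stdbool.h>
#include <stdint.h>

/* This node's address; 0 is the default <0.0.0>. */
static uint32_t demo_own_addr;

/* Pack <Z.C.N> into one word (assumed 8/12/12-bit layout). */
static uint32_t demo_tipc_addr(unsigned int z, unsigned int c, unsigned int n)
{
	return ((uint32_t)z << 24) | ((uint32_t)c << 12) | (uint32_t)n;
}

/* Strict match: zone and cluster must equal our own. */
static bool demo_in_own_cluster_exact(uint32_t addr)
{
	return ((addr ^ demo_own_addr) >> 12) == 0;
}

/* Relaxed match: <0.0.0> also counts as "our cluster". */
static bool demo_in_own_cluster(uint32_t addr)
{
	return demo_in_own_cluster_exact(addr) || addr == 0;
}

/* <0.0.0> is an alias for "this node" whatever our real address is. */
static bool demo_in_own_node(uint32_t addr)
{
	return addr == demo_own_addr || addr == 0;
}
```

Because the alias test does not depend on demo_own_addr, a caller that
captured <0.0.0> before the address was assigned still matches after it
changes.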

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Don't record failed publication attempt as a success
Allan Stephens [Wed, 9 Nov 2011 19:22:52 +0000 (14:22 -0500)]
tipc: Don't record failed publication attempt as a success

No longer increments counter of number of publications by a node
if an attempt to add a new publication fails. This prevents TIPC from
incorrectly blocking future publications because the configured maximum
number of publications has been reached.
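
The accounting rule can be sketched as follows. The table structure, the
insert_ok flag standing in for the outcome of the real insertion, and
all demo_ names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-node bookkeeping. */
struct demo_name_table {
	unsigned int local_publ_count;
	unsigned int max_publications;
};

/* Try to add one publication. insert_ok stands in for the result of the
 * real insertion step. The counter is bumped only after that step has
 * succeeded, so failed attempts do not eat into the limit. */
static bool demo_publish(struct demo_name_table *nt, bool insert_ok)
{
	if (nt->local_publ_count >= nt->max_publications)
		return false;
	if (!insert_ok)
		return false;
	nt->local_publ_count++;
	return true;
}
```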

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Update node-scope publications when network address is assigned
Allan Stephens [Tue, 17 Apr 2012 21:57:52 +0000 (17:57 -0400)]
tipc: Update node-scope publications when network address is assigned

Ensures that node-scope name publications that exist prior to the
configuration of a node's network address are properly re-initialized
with that address when it is assigned. TIPC's node-scope publications
are now tracked using a publications list like the lists used for
cluster-scope and zone-scope publications so they can be easily updated
when required.

The inclusion of node scope name publications in a conventional publication
list means that they must now also be withdrawn, just like cluster and zone
scope publications are currently withdrawn.  So some conditional tests on
scope ==/!= TIPC_NODE_SCOPE are inserted/removed accordingly.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agotipc: Separate cluster-scope and zone-scope names into distinct lists
Allan Stephens [Tue, 17 Apr 2012 21:57:52 +0000 (17:57 -0400)]
tipc: Separate cluster-scope and zone-scope names into distinct lists

Utilizes distinct lists to track zone-scope and cluster-scope names
published by a node. For now, TIPC continues to process the entries
in both lists in the same way; however, an upcoming patch will utilize
the existence of the lists to prevent the sending of cluster-scope names
to nodes that are not part of the local cluster.

To achieve this, an array of publication lists is introduced, so
that they can be iterated over and accessed via publ->scope as
an index where convenient.
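
One way to picture the array-of-lists layout is the sketch below. It
uses a plain singly linked list instead of the kernel's list_head, and
the demo_ names and scope values are assumptions made for illustration.

```c
#include <stddef.h>

/* Assumed scope values; slot 0 of the array is left unused. */
enum {
	DEMO_ZONE_SCOPE = 1,
	DEMO_CLUSTER_SCOPE = 2,
	DEMO_SCOPE_SLOTS = 3
};

struct demo_publ {
	unsigned int scope;
	struct demo_publ *next;
};

/* One list head per scope, indexed directly by publ->scope. */
static struct demo_publ *demo_publ_lists[DEMO_SCOPE_SLOTS];

/* Link a publication onto the list for its scope; -1 on a bad scope. */
static int demo_publ_add(struct demo_publ *p)
{
	if (p->scope == 0 || p->scope >= DEMO_SCOPE_SLOTS)
		return -1;
	p->next = demo_publ_lists[p->scope];
	demo_publ_lists[p->scope] = p;
	return 0;
}

/* Count the entries tracked for one scope. */
static unsigned int demo_publ_count(unsigned int scope)
{
	const struct demo_publ *p;
	unsigned int n = 0;

	if (scope == 0 || scope >= DEMO_SCOPE_SLOTS)
		return 0;
	for (p = demo_publ_lists[scope]; p; p = p->next)
		n++;
	return n;
}
```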

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
12 years agobonding: start slaves with link down for ARP monitor
Michal Kubeček [Tue, 17 Apr 2012 02:02:06 +0000 (02:02 +0000)]
bonding: start slaves with link down for ARP monitor

Initialize slave device link state as down if ARP monitor is
active and netif_carrier_ok() returns zero. Also shift initial
value of its last_arp_tx so that it doesn't immediately cause
fake detection of "up" state.

When ARP monitoring is used, initializing the slave device with
up link state can cause ARP monitor to detect link failure
before the device is really up (with igb driver, this can take
more than two seconds).

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonf_bridge: remove holes in struct nf_bridge_info
Eric Dumazet [Wed, 18 Apr 2012 23:19:25 +0000 (23:19 +0000)]
nf_bridge: remove holes in struct nf_bridge_info

Put use & mask next to each other to avoid two holes on 64bit arches
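
The effect of such a reordering can be checked in userspace with a
sketch like this one. The two structs are loosely modelled on the fields
named above and are not the real struct nf_bridge_info; the exact sizes
depend on the ABI.

```c
#include <stddef.h>

/* A 4-byte member followed by a pointer leaves a padding hole wherever
 * pointers need 8-byte alignment. Here that happens twice. */
struct demo_with_holes {
	int use;
	void *physindev;
	void *physoutdev;
	unsigned int mask;
	unsigned long data;
};

/* Same members, with the two 4-byte fields placed back to back. */
struct demo_reordered {
	int use;
	unsigned int mask;
	void *physindev;
	void *physoutdev;
	unsigned long data;
};

/* Bytes saved by the reordering on the current ABI (may be zero where
 * pointers are 4 bytes wide). */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_with_holes) - sizeof(struct demo_reordered);
}
```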

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: dont drop packet in defrag but consume it
Eric Dumazet [Thu, 19 Apr 2012 06:10:26 +0000 (06:10 +0000)]
ipv4: dont drop packet in defrag but consume it

When defragmentation is finalized, we clone a packet and kfree_skb() it.

Call consume_skb() to not confuse dropwatch, since it's not a drop.
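
The distinction can be modelled in userspace as below. This is a
simplified stand-in without reference counting, not the kernel's skb
code; the drop counter plays the role of what a drop monitor would
observe, and all demo_ names are hypothetical.

```c
#include <stdlib.h>

/* Minimal stand-in for a packet buffer. */
struct demo_skb {
	unsigned int len;
};

/* Number of frees that a drop monitor would report as drops. */
static unsigned int demo_drop_events;

static struct demo_skb *demo_alloc_skb(void)
{
	return calloc(1, sizeof(struct demo_skb));
}

/* Models kfree_skb(): release the buffer and report it as a drop. */
static void demo_kfree_skb(struct demo_skb *skb)
{
	if (!skb)
		return;
	demo_drop_events++;
	free(skb);
}

/* Models consume_skb(): release the buffer after normal use, with
 * nothing reported to the drop monitor. */
static void demo_consume_skb(struct demo_skb *skb)
{
	free(skb);
}
```

Both helpers release the memory; they differ only in what the monitoring
side gets to see.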

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: gro: GRO_MERGED_FREE consumes packets
Eric Dumazet [Thu, 19 Apr 2012 07:07:40 +0000 (07:07 +0000)]
net: gro: GRO_MERGED_FREE consumes packets

As part of GRO processing, merged skbs should be consumed, not freed, to
not confuse dropwatch/drop_monitor.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: dont drop packet but consume it
Eric Dumazet [Thu, 19 Apr 2012 02:24:53 +0000 (02:24 +0000)]
net: dont drop packet but consume it

When we need to clone skb, we don't drop a packet.
Call consume_skb() to not confuse dropwatch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: dccp: dont drop packet but consume it
Eric Dumazet [Thu, 19 Apr 2012 02:24:48 +0000 (02:24 +0000)]
ipv6: dccp: dont drop packet but consume it

When we need to clone skb, we don't drop a packet.
Call consume_skb() to not confuse dropwatch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopacket: dont drop packet but consume it
Eric Dumazet [Thu, 19 Apr 2012 02:24:42 +0000 (02:24 +0000)]
packet: dont drop packet but consume it

When we need to clone skb, we don't drop a packet.
Call consume_skb() to not confuse dropwatch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: tcp: dont drop packet but consume it
Eric Dumazet [Thu, 19 Apr 2012 02:24:36 +0000 (02:24 +0000)]
ipv6: tcp: dont drop packet but consume it

When we need to clone skb, we don't drop a packet.
Call consume_skb() to not confuse dropwatch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetlink: dont drop packet but consume it
Eric Dumazet [Thu, 19 Apr 2012 02:24:28 +0000 (02:24 +0000)]
netlink: dont drop packet but consume it

When we need to clone skb, we don't drop a packet.
Call consume_skb() to not confuse dropwatch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoip6_tunnel: dont drop packet but consume it
Eric Dumazet [Thu, 19 Apr 2012 02:24:17 +0000 (02:24 +0000)]
ip6_tunnel: dont drop packet but consume it

When we need to reallocate skb, we don't drop a packet.
Call consume_skb() to not confuse dropwatch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: fix compile error of leaking kmemleak.h header
Shan Wei [Wed, 18 Apr 2012 18:05:46 +0000 (18:05 +0000)]
net: fix compile error of leaking kmemleak.h header

net/core/sysctl_net_core.c: In function ‘sysctl_core_init’:
net/core/sysctl_net_core.c:259: error: implicit declaration of function ‘kmemleak_not_leak’

with same error in net/ipv4/route.c

Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: restore max-read-request-size in Device Control Register
Huang, Xiong [Tue, 17 Apr 2012 19:32:36 +0000 (19:32 +0000)]
atl1c: restore max-read-request-size in Device Control Register

On some platforms, we found the max-read-request-size in the Device
Control Register is set to 0 (by the BIOS?) during bootup, which makes
performance (throughput) very bad.
Restore it to a minimum value.
The register definition of REG_DEVICE_CTRL is removed; the kernel API is
used to access it as it's a standard PCIe register.
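
For reference, the Max_Read_Request_Size field occupies bits 14:12 of
the PCIe Device Control register and encodes 128 << n bytes, so a field
value of 0 means 128-byte reads. The sketch below decodes and updates
that field in userspace; the demo_ names are hypothetical, and the
minimum the driver actually restores is not stated in this message.

```c
#include <stdint.h>

/* Max_Read_Request_Size field of the PCIe Device Control register. */
#define DEMO_DEVCTL_READRQ_MASK  0x7000u
#define DEMO_DEVCTL_READRQ_SHIFT 12

/* Decode the field into a size in bytes (0 -> 128, 1 -> 256, ...). */
static unsigned int demo_readrq_bytes(uint16_t devctl)
{
	return 128u << ((devctl & DEMO_DEVCTL_READRQ_MASK) >>
			DEMO_DEVCTL_READRQ_SHIFT);
}

/* Return devctl with the field set for 'bytes'. Sizes that are not a
 * power of two in the range 128..4096 leave devctl unchanged. */
static uint16_t demo_set_readrq(uint16_t devctl, unsigned int bytes)
{
	unsigned int enc = 0;

	if (bytes < 128 || bytes > 4096 || (bytes & (bytes - 1)))
		return devctl;
	while ((128u << enc) < bytes)
		enc++;
	return (uint16_t)((devctl & ~DEMO_DEVCTL_READRQ_MASK) |
			  (enc << DEMO_DEVCTL_READRQ_SHIFT));
}
```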

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: using fixed TXQ configuration for l2cb and l1c
Huang, Xiong [Tue, 17 Apr 2012 19:32:35 +0000 (19:32 +0000)]
atl1c: using fixed TXQ configuration for l2cb and l1c

Use a fixed TXQ config for l2cb and l1c, regardless of dmar_block,
to make tx-DMA more stable.
Register REG_TXQ_CTRL is refined as well.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: remove dmaw_block
Huang, Xiong [Tue, 17 Apr 2012 19:32:34 +0000 (19:32 +0000)]
atl1c: remove dmaw_block

dmaw_block is never used in the driver, remove it.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: correct wrong definition of REG_DMA_CTRL
Huang, Xiong [Tue, 17 Apr 2012 19:32:33 +0000 (19:32 +0000)]
atl1c: correct wrong definition of REG_DMA_CTRL

Some fields of REG_DMA_CTRL (15C0) are wrong; replace them with the
newest ones. The hardware uses a fixed dma-write-block size, so remove
the dmaw_block related code in function atl1c_configure_dma.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: wrong register used to stop TXQ
Huang, Xiong [Tue, 17 Apr 2012 19:32:32 +0000 (19:32 +0000)]
atl1c: wrong register used to stop TXQ

Function atl1c_stop_mac uses the wrong register, REG_TWSI_CTRL,
to stop the mac; replace it with REG_TXQ_CTRL.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: remove code related to rxq 1/2/3
Huang, Xiong [Tue, 17 Apr 2012 19:32:31 +0000 (19:32 +0000)]
atl1c: remove code related to rxq 1/2/3

Remove code related to rxq 1/2/3 since multi-queue is not supported.
Refine the REG_RXQ_CTRL definition as well.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: split 2 32bit registers of TPD to 4 16bit registers
Huang, Xiong [Tue, 17 Apr 2012 19:32:30 +0000 (19:32 +0000)]
atl1c: split 2 32bit registers of TPD to 4 16bit registers

The TPD producer/consumer index is 16 bits wide.
16-bit reads/writes reduce the dependency between the 2 tpd rings (hi
and lo). Rename reg(157C/1580) to keep naming consistent.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: remove SMB/CMB DMA related code
Huang, Xiong [Tue, 17 Apr 2012 19:32:29 +0000 (19:32 +0000)]
atl1c: remove SMB/CMB DMA related code

l1c & later chips don't support DMA for SMB.
CMB is removed from hardware.
reg(15C8) is used to trigger an interrupt by tpd threshold.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatl1c: remove VPD register
Huang, Xiong [Tue, 17 Apr 2012 19:32:28 +0000 (19:32 +0000)]
atl1c: remove VPD register

VPD register is only used for L1(devid=PCI_DEVICE_ID_ATTANSIC_L1) to
access external NV-memory.
l1c & later chips don't use it any more.
PHY 0/1 registers occupy the last 2 slots of the dump table.

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>