review.tizen.org Git - platform/kernel/linux-rpi.git/log

projects / platform / kernel / linux-rpi.git / log

Hans Schillstrom [Mon, 30 Apr 2012 06:13:50 +0000 (08:13 +0200)]

net: export sysctl_[r|w]mem_max symbols needed by ip_vs_sync

To build ip_vs as a module sysctl_rmem_max and sysctl_wmem_max
needs to be exported.

The dependency was added by "ipvs: wakeup master thread" patch.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

commit | commitdiff | tree

H Hartley Sweeten [Thu, 26 Apr 2012 18:26:15 +0000 (11:26 -0700)]

ipvs: ip_vs_proto: local functions should not be exposed globally

Functions not referenced outside of a source file should be marked
static to prevent it from being exposed globally.

This quiets the sparse warnings:

warning: symbol '__ipvs_proto_data_get' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

H Hartley Sweeten [Thu, 26 Apr 2012 18:17:28 +0000 (11:17 -0700)]

ipvs: ip_vs_ftp: local functions should not be exposed globally

Functions not referenced outside of a source file should be marked
static to prevent it from being exposed globally.

This quiets the sparse warnings:

warning: symbol 'ip_vs_ftp_init' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Pablo Neira Ayuso [Tue, 8 May 2012 07:28:19 +0000 (09:28 +0200)]

ipvs: optimize the use of flags in ip_vs_bind_dest

cp->flags is marked volatile but ip_vs_bind_dest
can safely modify the flags, so save some CPU cycles by
using temp variable.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Pablo Neira Ayuso [Tue, 8 May 2012 17:40:30 +0000 (19:40 +0200)]

ipvs: add support for sync threads

Allow master and backup servers to use many threads
for sync traffic. Add sysctl var "sync_ports" to define the
number of threads. Every thread will use single UDP port,
thread 0 will use the default port 8848 while last thread
will use port 8848+sync_ports-1.

The sync traffic for connections is scheduled to many
master threads based on the cp address but one connection is
always assigned to same thread to avoid reordering of the
sync messages.

Remove ip_vs_sync_switch_mode because this check
for sync mode change is still risky. Instead, check for mode
change under sync_buff_lock.

Make sure the backup socks do not block on reading.

Special thanks to Aleksey Chudov for helping in all tests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Tue, 24 Apr 2012 20:46:40 +0000 (23:46 +0300)]

ipvs: reduce sync rate with time thresholds

Add two new sysctl vars to control the sync rate with the
main idea to reduce the rate for connection templates because
currently it depends on the packet rate for controlled connections.
This mechanism should be useful also for normal connections
with high traffic.

sync_refresh_period: in seconds, difference in reported connection
timer that triggers new sync message. It can be used to
avoid sync messages for the specified period (or half of
the connection timeout if it is lower) if connection state
is not changed from last sync.

sync_retries: integer, 0..3, defines sync retries with period of
sync_refresh_period/8. Useful to protect against loss of
sync messages.

Allow sysctl_sync_threshold to be used with
sysctl_sync_period=0, so that only single sync message is sent
if sync_refresh_period is also 0.

Add new field "sync_endtime" in connection structure to
hold the reported time when connection expires. The 2 lowest
bits will represent the retry count.

As the sysctl_sync_period now can be 0 use ACCESS_ONCE to
avoid division by zero.

Special thanks to Aleksey Chudov for being patient with me,
for his extensive reports and helping in all tests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Pablo Neira Ayuso [Tue, 8 May 2012 17:39:49 +0000 (19:39 +0200)]

ipvs: wakeup master thread

High rate of sync messages in master can lead to
overflowing the socket buffer and dropping the messages.
Fixed sleep of 1 second without wakeup events is not suitable
for loaded masters,

Use delayed_work to schedule sending for queued messages
and limit the delay to IPVS_SYNC_SEND_DELAY (20ms). This will
reduce the rate of wakeups but to avoid sending long bursts we
wakeup the master thread after IPVS_SYNC_WAKEUP_RATE (8) messages.

Add hard limit for the queued messages before sending
by using "sync_qlen_max" sysctl var. It defaults to 1/32 of
the memory pages but actually represents number of messages.
It will protect us from allocating large parts of memory
when the sending rate is lower than the queuing rate.

As suggested by Pablo, add new sysctl var
"sync_sock_size" to configure the SNDBUF (master) or
RCVBUF (slave) socket limit. Default value is 0 (preserve
system defaults).

Change the master thread to detect and block on
SNDBUF overflow, so that we do not drop messages when
the socket limit is low but the sync_qlen_max limit is
not reached. On ENOBUFS or other errors just drop the
messages.

Change master thread to enter TASK_INTERRUPTIBLE
state early, so that we do not miss wakeups due to messages or
kthread_should_stop event.

Thanks to Pablo Neira Ayuso for his valuable feedback!

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Tue, 24 Apr 2012 20:46:38 +0000 (23:46 +0300)]

ipvs: always update some of the flags bits in backup

As the goal is to mirror the inactconns/activeconns
counters in the backup server, make sure the cp->flags are
updated even if cp is still not bound to dest. If cp->flags
are not updated ip_vs_bind_dest will rely only on the initial
flags when updating the counters. To avoid mistakes and
complicated checks for protocol state rely only on the
IP_VS_CONN_F_INACTIVE bit when updating the counters.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Tue, 24 Apr 2012 20:46:37 +0000 (23:46 +0300)]

ipvs: fix ip_vs_try_bind_dest to rebind app and transmitter

Initially, when the synced connection is created we
use the forwarding method provided by master but once we
bind to destination it can be changed. As result, we must
update the application and the transmitter.

As ip_vs_try_bind_dest is called always for connections
that require dest binding, there is no need to validate the
cp and dest pointers.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Tue, 24 Apr 2012 20:46:36 +0000 (23:46 +0300)]

ipvs: remove check for IP_VS_CONN_F_SYNC from ip_vs_bind_dest

As the IP_VS_CONN_F_INACTIVE bit is properly set
in cp->flags for all kind of connections we do not need to
add special checks for synced connections when updating
the activeconns/inactconns counters for first time. Now
logic will look just like in ip_vs_unbind_dest.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Tue, 24 Apr 2012 20:46:35 +0000 (23:46 +0300)]

ipvs: ignore IP_VS_CONN_F_NOOUTPUT in backup server

As IP_VS_CONN_F_NOOUTPUT is derived from the
forwarding method we should get it from conn_flags just
like we do it for IP_VS_CONN_F_FWD_MASK bits when binding
to real server.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Sasha Levin [Sat, 14 Apr 2012 16:37:47 +0000 (12:37 -0400)]

ipvs: use GFP_KERNEL allocation where possible

Use GFP_KERNEL instead of GFP_ATOMIC when registering an ipvs protocol.

This is safe since it will always run from a process context.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

commit | commitdiff | tree

Julian Anastasov [Fri, 13 Apr 2012 13:49:38 +0000 (16:49 +0300)]

ipvs: SH scheduler does not need GFP_ATOMIC allocation

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Fri, 13 Apr 2012 13:49:41 +0000 (16:49 +0300)]

ipvs: LBLCR scheduler does not need GFP_ATOMIC allocation on init

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Fri, 13 Apr 2012 13:49:42 +0000 (16:49 +0300)]

ipvs: WRR scheduler does not need GFP_ATOMIC allocation

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Fri, 13 Apr 2012 13:49:39 +0000 (16:49 +0300)]

ipvs: DH scheduler does not need GFP_ATOMIC allocation

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Fri, 13 Apr 2012 13:49:40 +0000 (16:49 +0300)]

ipvs: LBLC scheduler does not need GFP_ATOMIC allocation on init

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Julian Anastasov [Fri, 13 Apr 2012 13:49:37 +0000 (16:49 +0300)]

ipvs: timeout tables do not need GFP_ATOMIC allocation

They are called only on initialization.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

commit | commitdiff | tree

Pablo Neira Ayuso [Tue, 8 May 2012 17:36:44 +0000 (19:36 +0200)]

netfilter: bridge: optionally set indev to vlan

if net.bridge.bridge-nf-filter-vlan-tagged sysctl is enabled, bridge
netfilter removes the vlan header temporarily and then feeds the packet
to ip(6)tables.

When the new "bridge-nf-pass-vlan-input-device" sysctl is on
(default off), then bridge netfilter will also set the
in-interface to the vlan interface; if such an interface exists.

This is needed to make iptables REDIRECT target work with
"vlan-on-top-of-bridge" setups and to allow use of "iptables -i" to
match the vlan device name.

Also update Documentation with current brnf default settings.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

commit | commitdiff | tree

Eric Dumazet [Wed, 18 Apr 2012 04:36:40 +0000 (06:36 +0200)]

netfilter: nf_conntrack: use this_cpu_inc()

this_cpu_inc() is IRQ safe and faster than
local_bh_disable()/__this_cpu_inc()/local_bh_enable(), at least on x86.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

commit | commitdiff | tree

Eric Leblond [Wed, 18 Apr 2012 09:20:41 +0000 (11:20 +0200)]

netfilter: nf_ct_helper: allow to disable automatic helper assignment

This patch allows you to disable automatic conntrack helper
lookup based on TCP/UDP ports, eg.

echo 0 > /proc/sys/net/netfilter/nf_conntrack_helper

[ Note: flows that already got a helper will keep using it even
if automatic helper assignment has been disabled ]

Once this behaviour has been disabled, you have to explicitly
use the iptables CT target to attach helper to flows.

There are good reasons to stop supporting automatic helper
assignment, for further information, please read:

http://www.netfilter.org/news.html#2012-04-03

This patch also adds one message to inform that automatic helper
assignment is deprecated and it will be removed soon (this is
spotted only once, with the first flow that gets a helper attached
to make it as less annoying as possible).

Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

commit | commitdiff | tree

Tony Zelenoff [Thu, 8 Mar 2012 23:35:39 +0000 (23:35 +0000)]

netfilter: nf_ct_ecache: refactor notifier registration

* ret variable initialization removed as useless
* similar code strings concatenated and functions code
flow became more plain

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

commit | commitdiff | tree

Steve Glendinning [Fri, 4 May 2012 00:57:13 +0000 (00:57 +0000)]

smsc75xx: let EEPROM determine GPIO/LED settings

This patch allows the GPIO/LED settings to be configured by the
EEPROM if present, and only sets the default values (LED outputs
for link/activity) when an EEPROM is not detected.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Steve Glendinning [Fri, 4 May 2012 00:57:12 +0000 (00:57 +0000)]

smsc75xx: eliminate unnecessary phy register read

Only a write is necessary to clear the interrupt status, and we
don't use the value from the preceding read operation. This
patch eliminates the unnecessary read.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Steve Glendinning [Fri, 4 May 2012 00:57:11 +0000 (00:57 +0000)]

smsc75xx: replace 0xffff with PHY_INT_SRC_CLEAR_ALL

This patch defines PHY_INT_SRC_CLEAR_ALL to replace the value 0xffff
in order to be more self-documenting.

This patch should make no functional change, it is purely cosmetic.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 8 May 2012 03:35:40 +0000 (23:35 -0400)]

Merge git://git./linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/intel/e1000e/param.c
drivers/net/wireless/iwlwifi/iwl-agn-rx.c
drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
drivers/net/wireless/iwlwifi/iwl-trans.h

Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell. In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.

In e1000e a conflict arose in the validation code for settings of
adapter->itr. 'net-next' had more sophisticated logic so that
logic was used.

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 8 May 2012 03:05:13 +0000 (23:05 -0400)]

Merge branch 'vhost-net-next' of git://git./linux/kernel/git/mst/vhost

Michael S. Tsirkin says:

--------------------
There are mostly bugfixes here.
I hope to merge some more patches by 3.5, in particular
vlan support fixes are waiting for Eric's ack,
and a version of tracepoint patch might be
ready in time, but let's merge what's ready so it's testable.

This includes a ton of zerocopy fixes by Jason -
good stuff but too intrusive for 3.4 and zerocopy is experimental
anyway.

virtio supported delayed interrupt for a while now
so adding support to the virtio tool made sense
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jiri Pirko [Thu, 3 May 2012 22:37:45 +0000 (22:37 +0000)]

net: IP_MULTICAST_IF setsockopt now recognizes struct mreq

Until now, struct mreq has not been recognized and it was worked with
as with struct in_addr. That means imr_multiaddr was copied to
imr_address. So do recognize struct mreq here and copy that correctly.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Daney [Wed, 2 May 2012 15:16:39 +0000 (15:16 +0000)]

netdev/of/phy: Add MDIO bus multiplexer driven by GPIO lines.

The GPIO pins select which sub bus is connected to the master.

Initially tested with an sn74cbtlv3253 switch device wired into the
MDIO bus.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Daney [Wed, 2 May 2012 15:16:38 +0000 (15:16 +0000)]

netdev/of/phy: Add MDIO bus multiplexer support.

This patch adds a somewhat generic framework for MDIO bus
multiplexers.  It is modeled on the I2C multiplexer.

The multiplexer is needed if there are multiple PHYs with the same
address connected to the same MDIO bus adepter, or if there is
insufficient electrical drive capability for all the connected PHY
devices.

Conceptually it could look something like this:

                   ------------------
                   | Control Signal |
                   --------+---------
                           |
---------------   --------+------
| MDIO MASTER |---| Multiplexer |
---------------   --+-------+----
                     |       |
                     C       C
                     h       h
                     i       i
                     l       l
                     d       d
                     |       |
     ---------       A       B   ---------
     |       |       |       |   |       |
     | PHY@1 +-------+       +---+ PHY@1 |
     |       |       |       |   |       |
     ---------       |       |   ---------
     ---------       |       |   ---------
     |       |       |       |   |       |
     | PHY@2 +-------+       +---+ PHY@2 |
     |       |                   |       |
     ---------                   ---------

This framework configures the bus topology from device tree data.  The
mechanics of switching the multiplexer is left to device specific
drivers.

The follow-on patch contains a multiplexer driven by GPIO lines.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Daney [Wed, 2 May 2012 15:16:37 +0000 (15:16 +0000)]

netdev/of/phy: New function: of_mdio_find_bus().

Add of_mdio_find_bus() which allows an mii_bus to be located given its
associated the device tree node.

This is needed by the follow-on patch to add a driver for MDIO bus
multiplexers.

The of_mdiobus_register() function is modified so that the device tree
node is recorded in the mii_bus. Then we can find it again by
iterating over all mdio_bus_class devices.

Because the OF device tree has now become an integral part of the
kernel, this can live in mdio_bus.c (which contains the needed
mdio_bus_class structure) instead of of_mdio.c.

Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree