review.tizen.org Git - platform/kernel/linux-rpi.git/log

be2net: enable interrupts in EEH resume

On some BE3 FW versions, after a HW reset, interrupts will remain disabled
for each function. So, explicitly enable the interrupts in the eeh_resume
handler, else after an eeh recovery interrupts wouldn't work.

Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Fix unmap loop counting error:

In my recent fix (76a691d0a: fix dma unmap warning), Ben Hutchings noted that my
loop count was incorrect.  Where j started at startidx, it should have started
at zero, and gone on for count entries, not to endidx.  Additionally, a DMA
resource exhaustion should drop the frame and (for now), return
NETDEV_TX_OK, not NETEV_TX_BUSY.  This patch fixes both of those issues:

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ben Hutchings <ben@decadent.org.uk>
CC: "David S. Miller" <davem@davemloft.net>
CC: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fix calculation of option len in ip6_append_data

tot_len does specify the size of struct ipv6_txoptions. We need opt_flen +
opt_nflen to calculate the overall length of additional ipv6 extensions.

I found this while auditing the ipv6 output path for a memory corruption
reported by Alexey Preobrazhensky while he fuzzed an instrumented
AddressSanitizer kernel with trinity. This may or may not be the cause
of the original bug.

Fixes: 4df98e76cde7c6 ("ipv6: pmtudisc setting not respected with UFO/CORK")
Reported-by: Alexey Preobrazhensky <preobr@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: avoid dependency of net_get_random_once on nop patching

net_get_random_once depends on the static keys infrastructure to patch up
the branch to the slow path during boot. This was realized by abusing the
static keys api and defining a new initializer to not enable the call
site while still indicating that the branch point should get patched
up. This was needed to have the fast path considered likely by gcc.

The static key initialization during boot up normally walks through all
the registered keys and either patches in ideal nops or enables the jump
site but omitted that step on x86 if ideal nops where already placed at
static_key branch points. Thus net_get_random_once branches not always
became active.

This patch switches net_get_random_once to the ordinary static_key
api and thus places the kernel fast path in the - by gcc considered -
unlikely path. Microbenchmarks on Intel and AMD x86-64 showed that
the unlikely path actually beats the likely path in terms of cycle cost
and that different nop patterns did not make much difference, thus this
switch should not be noticeable.

Fixes: a48e42920ff38b ("net: introduce new macro net_get_random_once")
Reported-by: Tuomas Räsänen <tuomasjjrasanen@tjjr.fi>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: filter: x86: fix JIT address randomization

bpf_alloc_binary() adds 128 bytes of room to JITed program image
and rounds it up to the nearest page size. If image size is close
to page size (like 4000), it is rounded to two pages:
round_up(4000 + 4 + 128) == 8192
then 'hole' is computed as 8192 - (4000 + 4) = 4188
If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header)
then kernel will crash during bpf_jit_free():

kernel BUG at arch/x86/mm/pageattr.c:887!
Call Trace:
[<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460
[<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff810378ff>] set_memory_rw+0x2f/0x40
[<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60
[<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0
[<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0
[<ffffffff8106c90c>] worker_thread+0x11c/0x370

since bpf_jit_free() does:
  unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
  struct bpf_binary_header *header = (void *)addr;
to compute start address of 'bpf_binary_header'
and header->pages will pass junk to:
  set_memory_rw(addr, header->pages);

Fix it by making sure that &header->image[prandom_u32() % hole] and &header
are in the same page

Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- properly release neigh_ifinfo in batadv_iv_ogm_process_per_outif()
- properly release orig_ifinfo->router when freeing orig_ifinfo
- properly release neigh_node objects during periodic check
- properly release neigh_info objects when the related hard_iface
is free'd

These changes are all very important because they fix some
reference counting imbalances that lead to the
impossibility of releasing the netdev object used by
batman-adv on shutdown.
The consequence is that such object cannot be destroyed by
the networking stack (the refcounter does not reach zero)
thus bringing the system in hanging state during a normal
reboot operation or a network reconfiguration.

neigh: set nud_state to NUD_INCOMPLETE when probing router reachability

Since commit 7e98056964("ipv6: router reachability probing"), a router falls
into NUD_FAILED will be probed.

Now if function rt6_select() selects a router which neighbour state is NUD_FAILED,
and at the same time function rt6_probe() changes the neighbour state to NUD_PROBE,
then function dst_neigh_output() can directly send packets, but actually the
neighbour still is unreachable. If we set nud_state to NUD_INCOMPLETE instead
NUD_PROBE, packets will not be sent out until the neihbour is reachable.

In addition, because the route should be probes with a single NS, so we must
set neigh->probes to neigh_max_probes(), then the neigh timer timeout and function
neigh_timer_handler() will not send other NS Messages.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6_tunnel: fix potential NULL pointer dereference

The function ip6_tnl_validate assumes that the rtnl
attribute IFLA_IPTUN_PROTO always be filled . If this
attribute is not filled by the userspace application
kernel get crashed with NULL pointer dereference. This
patch fixes the potential kernel crash when
IFLA_IPTUN_PROTO is missing .

Signed-off-by: Susant Sahani <susant@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: fix calling of free_irq with already free vector

If the sfc driver is in legacy interrupt mode (either explicitly by
using interrupt_mode module param or by falling back to it) it will
hit a warning at kernel/irq/manage.c because it will try to free an irq
which wasn't allocated by it in the first place because the MSI(X) irqs are
zero and it'll try to free them unconditionally. So fix it by checking if
we're in legacy mode and freeing the appropriate irqs.

CC: Zenghui Shi <zshi@redhat.com>
CC: Ben Hutchings <ben@decadent.org.uk>
CC: <linux-net-drivers@solarflare.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: David S. Miller <davem@davemloft.net>
Fixes: 1899c111a535 ("sfc: Fix IRQ cleanup in case of a probe failure")
Reported-by: Zenghui Shi <zshi@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: Don't propagate IFF_ALLMULTI changes on down interfaces.

Clearing the IFF_ALLMULTI flag on a down interface could cause an allmulti
overflow on the underlying interface.

Attempting the set IFF_ALLMULTI on the underlying interface would cause an
error and the log message:

"allmulti touches root, set allmulti failed."

Signed-off-by: Peter Christensen <pch@ordbogen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptp: fix kconfig dependency warnings

Fix kconfig warnings:

PTP_1588_CLOCK selects NET_PTP_CLASSIFY, which depends on NET,
so PTP_1588_CLOCK should also depend on NET.

PTP_1588_CLOCK_PCH selects PTP_1588_CLOCK so the former should
depend on NET.

warning: (IXP4XX_ETH && PTP_1588_CLOCK) selects NET_PTP_CLASSIFY which has unmet direct dependencies (NET)

warning: (SFC && TILE_NET && BFIN_MAC_USE_HWSTAMP && TIGON3 && FEC && E1000E && IGB && IXGBE && I40E && MLX4_EN && SXGBE_ETH && STMMAC_ETH && TI_CPTS && PTP_1588_CLOCK_GIANFAR && PTP_1588_CLOCK_IXP46X && DP83640_PHY && PTP_1588_CLOCK_PCH) selects PTP_1588_CLOCK which has unmet direct dependencies (NET)
[This warning is caused by the new 'depends on NET' in PTP_1588_CLOCK.]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: fix removing neigh_ifinfo

When an interface is removed separately, all neighbors need to be
checked if they have a neigh_ifinfo structure for that particular
interface. If that is the case, remove that ifinfo so any references to
a hard interface can be freed.

This is a regression introduced by
89652331c00f43574515059ecbf262d26d885717
("batman-adv: split tq information in neigh_node struct")

Reported-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

batman-adv: always run purge_orig_neighbors

The current code will not execute batadv_purge_orig_neighbors() when an
orig_ifinfo has already been purged. However we need to run it in any
case. Fix that.

This is a regression introduced by
7351a4822d42827ba0110677c0cbad88a3d52585
("batman-adv: split out router from orig_node")

Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

batman-adv: fix neigh reference imbalance

When an interface is removed from batman-adv, the orig_ifinfo of a
orig_node may be removed without releasing the router first.
This will prevent the reference for the neighbor pointed at by the
orig_ifinfo->router to be released, and this leak may result in
reference leaks for the interface used by this neighbor. Fix that.

This is a regression introduced by
7351a4822d42827ba0110677c0cbad88a3d52585
("batman-adv: split out router from orig_node").

Reported-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

batman-adv: fix neigh_ifinfo imbalance

The neigh_ifinfo object must be freed if it has been used in
batadv_iv_ogm_process_per_outif().

This is a regression introduced by
89652331c00f43574515059ecbf262d26d885717
("batman-adv: split tq information in neigh_node struct")

Reported-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless

John W. Linville says:

====================
pull request: wireless 2014-05-08

This one is all from Johannes:

"Here are a few small fixes for the current cycle: radiotap TX flags were
wrong (fix by Bob), Chun-Yeow fixes an SMPS issue with mesh interfaces,
Eliad fixes a locking bug and a cfg80211 state problem and finally
Henning sent me a fix for IBSS rate information."

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: cassini: use nested lock annotation

In the cas_lock_tx function we acquire multiple locks in a loop and
need to use nested lock annotation to prevent lockdep warnings.

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Emil Goode <emilgoode@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Fix UNDI driver unload

Commit 91ebb928b "bnx2x: Add support for Multi-Function UNDI" contains a bug
which prevent the emptying of the device's Rx buffers before reset.
As a result, on new boards it is likely HW will reach some fatal assertion
once its interfaces load after UNDI was previously loaded.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mdio_net'

Johan Hovold says:

====================
net: cpsw and mdio-gpio fixes for v3.15-final

These patches against v3.15-rc4 fix a few issues in the cpsw and
mdio-gpio drivers.

Resend with proper stable CC (git send-email still fails to parse the

Sorry about the noise.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: cpsw: add missing of_node_put

Add missing of_node_put to avoid kref leak.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cpsw: fix null dereference at probe

Fix null-pointer dereference at probe when the mdio platform device is
missing (e.g. when it has been disabled in DT).

Cc: stable <stable@vger.kernel.org> # v3.8
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "net: eth: cpsw: Correctly attach to GPIO bitbang MDIO driver"

This reverts commit f8d56d8f892be43a2404356073e16401eb5a42e6 ("net:
eth: cpsw: Correctly attach to GPIO bitbang MDIO driver").

Fix potential null-pointer dereference at probe if the mdio-gpio device
has not been successfully probed yet.

The offending commit is plain wrong for a number of reasons. First of
all it accesses internal driver data of an unrelated device. Neither
does it check that the data is non-null (which it is in case the device
has not been probed yet).

Furthermore, the decision on whether to treat any driver data according
to the mdio-gpio driver's internals is made based on the node name. But
the name is not compared against "mdio" which is the normal name for the
node, but rather against "gpio" which the node does not have to be named
(and shouldn't be according to the binding documentation). [ If this
hack is to be kept out-of-tree it should at least be matching against
the compatible property. ]

Cc: Stefan Roese <sr@denx.de>
Cc: stable <stable@vger.kernel.org> # v3.14
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdio-gpio: warn about missing bus alias id

Use a sane default bus id (rather than -ENODEV) and print a warning when
the bus alias id is missing.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdio-gpio: fix device-tree binding documentation

Fix aliases syntax in device-tree binding example to avoid
copy-paste errors (the alias would be dropped silently).

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cdc_mbim: handle unaccelerated VLAN tagged frames

This driver maps 802.1q VLANs to MBIM sessions. The mapping is based on
a bogus assumption that all tagged frames will use the acceleration API
because we enable NETIF_F_HW_VLAN_CTAG_TX. This fails for e.g. frames
tagged in userspace using packet sockets. Such frames will erroneously
be considered as untagged and silently dropped based on not being IP.

Fix by falling back to looking into the ethernet header for a tag if no
accelerated tag was found.

Fixes: a82c7ce5bc5b ("net: cdc_ncm: map MBIM IPS SessionID to VLAN ID")
Cc: Greg Suarez <gsuarez@smithmicro.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains netfilter fixes for your net tree, they are:

1) Fix use after free in nfnetlink when sending a batch for some
   unsupported subsystem, from Denys Fedoryshchenko.

2) Skip autoload of the nat module if no binding is specified via
   ctnetlink, from Florian Westphal.

3) Set local_df after netfilter defragmentation to avoid a bogus ICMP
   fragmentation needed in the forwarding path, also from Florian.

4) Fix potential user after free in ip6_route_me_harder() when returning
   the error code to the upper layers, from Sergey Popovich.

5) Skip possible bogus ICMP time exceeded emitted from the router (not
   valid according to RFC) if conntrack zones are used, from Vasily Averin.

6) Fix fragment handling when nf_defrag_ipv4 is loaded but nf_conntrack
   is not present, also from Vasily.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Driver for Beckhoff CX5020 EtherCAT master module.

This driver adds support for EtherCAT master module located on CCAT
FPGA found on Beckhoff CX series industrial PCs. The driver exposes
EtherCAT master as an ethernet interface.

EtherCAT is a fieldbus protocol defined on top of ethernet and Beckhoff
CX5020 PCs come with built-in EtherCAT master module located on a FPGA,
which in turn is connected to a PCI bus.

Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

ping: move ping_group_range out of CONFIG_SYSCTL

Similarly, when CONFIG_SYSCTL is not set, ping_group_range should still
work, just that no one can change it. Therefore we should move it out of
sysctl_net_ipv4.c. And, it should not share the same seqlock with
ip_local_port_range.

BTW, rename it to ->ping_group_range instead.

Cc: David S. Miller <davem@davemloft.net>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Reported-by: Stefan de Konink <stefan@konink.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: move local_port_range out of CONFIG_SYSCTL

When CONFIG_SYSCTL is not set, ip_local_port_range should still work,
just that no one can change it. Therefore we should move it out of sysctl_inet.c.
Also, rename it to ->ip_local_ports instead.

Cc: David S. Miller <davem@davemloft.net>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Reported-by: Stefan de Konink <stefan@konink.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: Fix potential use after free in ip6_route_me_harder()

Dst is released one line before we access it again with dst->error.

Fixes: 58e35d147128 netfilter: ipv6: propagate routing errors from
ip6_route_me_harder()

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem

net: mdio: of_mdiobus_register(): fall back to mdiobus_register() for !CONFIG_OF

If CONFIG_OF is not set, make of_mdiobus_register() call
mdiobus_register() instead of returning -ENOSYS.

This way, we can just call of_mdiobus_register() from all DT-enabled
drivers to handle the compat cases.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: fib_semantics: increment fib_info_cnt after fib_info allocation

Increment fib_info_cnt in fib_create_info() right after successfuly
alllocating fib_info structure, overwise fib_metrics allocation failure
leads to fib_info_cnt incorrectly decremented in free_fib_info(), called
on error path from fib_create_info().

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qlcnic_net'

Rajesh Borundia says:

====================
qlcnic: Bug fixes.

This patch series contain following bug fixes.

* Fix panic where driver was accessing un-initialized crb_intr_mask
  in non Multi-Tx queue mode while dumping TX queue.
* Do not set netdev->real_num_tx_queues directly from driver instead
  use kernel defined netif_set_real_num_tx_queues() API. Also notify
  stack about change in number of Rx queues.

Please apply this series to net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Set real_num_{tx|rx}_queues properly

Do not set netdev->real_num_tx_queues directly,
let netif_set_real_num_tx_queues() take care of it.
Do not overwrite netdev->num_tx_queues everytime when driver
changes its Tx ring size through ethtool -L and also notify
stack to update number of Rx queues.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix panic while dumping TX queues on TX timeout

o In case of non-multi TX queue mode driver does not initialize "crb_intr_mask" pointer
and driver was accessing that un-initialized pointer while dumping TX queue.
So dump "crb_intr_mask" only when it is initilaized.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

jme: Fix DMA unmap warning

The jme driver forgot to check the return status from pci_map_page in its tx
path, causing a dma api warning on unmap. Easy fix, just do the check and
augment the tx path to tell the stack that the driver is busy so we re-queue the
frame.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Guo-Fu Tseng <cooldavid@cooldavid.org>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'gso_forward'

Florian Westphal says:

====================
net: ip: push gso skb forwarding handling down the stack

Turns out doing the segmentation in forwarding was not a bright idea,
there are corner-cases where this has unintended side-effects.

This patch pushes the segmentation downwards.

After this, netif_skb_dev_features() function can be removed
again, it was only added to fetch the features of the output device,
we can just use skb->dev after the pushdown.

Tested with following setup:

host -> kvm_router -> kvm_host
mtu 1500 mtu1280

- 'host' has route to kvm_host with locked mtu of 1500
- gso/gro enabled on all interfaces

Did tests with all of following combinations:
- netfilter conntrack off and on on kvm_router
- virtio-net and e1000 driver on kvm_router
- tcp and udp bulk xmit from host to kvm_host

for tcp, I added TCPMSS mangling on kvm_host to make it lie about tcp mss.

Also added a dummy '-t mangle -A POSTROUTING -p udp -f'
rule to make sure no udp fragments are seen in the 'conntrack on'
and 'virtio-net' case.

Also checked (with ping -M do -s 1400)' that it still sends the wanted
icmp error message when size exceeds 1280.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "net: core: introduce netif_skb_dev_features"

This reverts commit d206940319c41df4299db75ed56142177bb2e5f6,
there are no more callers.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ip: push gso skb forwarding handling down the stack

Doing the segmentation in the forward path has one major drawback:

When using virtio, we may process gso udp packets coming
from host network stack. In that case, netfilter POSTROUTING
will see one packet with udp header followed by multiple ip
fragments.

Delay the segmentation and do it after POSTROUTING invocation
to avoid this.

Fixes: fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in forwarding path")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv6: send pkttoobig immediately if orig frag size > mtu

If conntrack defragments incoming ipv6 frags it stores largest original
frag size in ip6cb and sets ->local_df.

We must thus first test the largest original frag size vs. mtu, and not
vice versa.

Without this patch PKTTOOBIG is still generated in ip6_fragment() later
in the stack, but

1) IPSTATS_MIB_INTOOBIGERRORS won't increment
2) packet did (needlessly) traverse netfilter postrouting hook.

Fixes: fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in forwarding path")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv4: ip_forward: fix inverted local_df test

local_df means 'ignore DF bit if set', so if its set we're
allowed to perform ip fragmentation.

This wasn't noticed earlier because the output path also drops such skbs
(and emits needed icmp error) and because netfilter ip defrag did not
set local_df until couple of days ago.

Only difference is that DF-packets-larger-than MTU now discarded
earlier (f.e. we avoid pointless netfilter postrouting trip).

While at it, drop the repeated test ip_exceeds_mtu, checking it once
is enough...

Fixes: fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in forwarding path")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cdc_mbim: __vlan_find_dev_deep need rcu_read_lock

Fixes this warning introduced by commit 5b8f15f78e6f
("net: cdc_mbim: handle IPv6 Neigbor Solicitations"):

===============================
[ INFO: suspicious RCU usage. ]
3.15.0-rc3 #213 Tainted: G W O
-------------------------------
net/8021q/vlan_core.c:69 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
no locks held by ksoftirqd/0/3.

stack backtrace:
CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G W O 3.15.0-rc3 #213
Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
0000000000000001 ffff880232533bf0 ffffffff813a5ee6 0000000000000006
ffff880232530090 ffff880232533c20 ffffffff81076b94 0000000000000081
0000000000000000 ffff8802085ac000 ffff88007fc8ea00 ffff880232533c50
Call Trace:
[<ffffffff813a5ee6>] dump_stack+0x4e/0x68
[<ffffffff81076b94>] lockdep_rcu_suspicious+0xfa/0x103
[<ffffffff813978a6>] __vlan_find_dev_deep+0x54/0x94
[<ffffffffa04a1938>] cdc_mbim_rx_fixup+0x379/0x66a [cdc_mbim]
[<ffffffff813ab76f>] ? _raw_spin_unlock_irqrestore+0x3a/0x49
[<ffffffff81079671>] ? trace_hardirqs_on_caller+0x192/0x1a1
[<ffffffffa059bd10>] usbnet_bh+0x59/0x287 [usbnet]
[<ffffffff8104067d>] tasklet_action+0xbb/0xcd
[<ffffffff81040057>] __do_softirq+0x14c/0x30d
[<ffffffff81040237>] run_ksoftirqd+0x1f/0x50
[<ffffffff8105f13e>] smpboot_thread_fn+0x172/0x18e
[<ffffffff8105efcc>] ? SyS_setgroups+0xdf/0xdf
[<ffffffff810594b0>] kthread+0xb5/0xbd
[<ffffffff813a84b1>] ? __wait_for_common+0x13b/0x170
[<ffffffff810593fb>] ? __kthread_parkme+0x5c/0x5c
[<ffffffff813b147c>] ret_from_fork+0x7c/0xb0
[<ffffffff810593fb>] ? __kthread_parkme+0x5c/0x5c

Fixes: 5b8f15f78e6f ("net: cdc_mbim: handle IPv6 Neigbor Solicitations")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/jberg/mac80211

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) e1000e computes header length incorrectly wrt vlans, fix from Vlad
    Yasevich.

2) ns_capable() check in sock_diag netlink code, from Andrew
    Lutomirski.

3) Fix invalid queue pairs handling in virtio_net, from Amos Kong.

4) Checksum offloading busted in sxgbe driver due to incorrect
    descriptor layout, fix from Byungho An.

5) Fix build failure with SMC_DEBUG set to 2 or larger, from Zi Shen
    Lim.

6) Fix uninitialized A and X registers in BPF interpreter, from Alexei
    Starovoitov.

7) Fix arch dependencies of candence driver.

8) Fix netlink capabilities checking tree-wide, from Eric W Biederman.

9) Don't dump IFLA_VF_PORTS if netlink request didn't ask for it in
    IFLA_EXT_MASK, from David Gibson.

10) IPV6 FIB dump restart doesn't handle table changes that happen
    meanwhile, causing the code to loop forever or emit dups, fix from
    Kumar Sandararajan.

11) Memory leak on VF removal in bnx2x, from Yuval Mintz.

12) Bug fixes for new Altera TSE driver from Vince Bridgers.

13) Fix route lookup key in SCTP, from Xugeng Zhang.

14) Use BH blocking spinlocks in SLIP, as per a similar fix to CAN/SLCAN
    driver.  From Oliver Hartkopp.

15) TCP doesn't bump retransmit counters in some code paths, fix from
    Eric Dumazet.

16) Clamp delayed_ack in tcp_cubic to prevent theoretical divides by
    zero.  Fix from Liu Yu.

17) Fix locking imbalance in error paths of HHF packet scheduler, from
    John Fastabend.

18) Properly reference the transport module when vsock_core_init() runs,
    from Andy King.

19) Fix buffer overflow in cdc_ncm driver, from Bjørn Mork.

20) IP_ECN_decapsulate() doesn't see a correct SKB network header in
    ip_tunnel_rcv(), fix from Ying Cai.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (132 commits)
  net: macb: Fix race between HW and driver
  net: macb: Remove 'unlikely' optimization
  net: macb: Re-enable RX interrupt only when RX is done
  net: macb: Clear interrupt flags
  net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP
  ip_tunnel: Set network header properly for IP_ECN_decapsulate()
  e1000e: Restrict MDIO Slow Mode workaround to relevant parts
  e1000e: Fix issue with link flap on 82579
  e1000e: Expand workaround for 10Mb HD throughput bug
  e1000e: Workaround for dropped packets in Gig/100 speeds on 82579
  net/mlx4_core: Don't issue PCIe speed/width checks for VFs
  net/mlx4_core: Load the Eth driver first
  net/mlx4_core: Fix slave id computation for single port VF
  net/mlx4_core: Adjust port number in qp_attach wrapper when detaching
  net: cdc_ncm: fix buffer overflow
  Altera TSE: ALTERA_TSE should depend on HAS_DMA
  vsock: Make transport the proto owner
  net: sched: lock imbalance in hhf qdisc
  net: mvmdio: Check for a valid interrupt instead of an error
  net phy: Check for aneg completion before setting state to PHY_RUNNING
  ...

Merge tag 'usb-3.15-rc4' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small fixes and device ids for 3.15-rc4.

  All have been in linux-next just fine"

* tag 'usb-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: Nokia 5300 should be treated as unusual dev
  USB: Nokia 305 should be treated as unusual dev
  fsl-usb: do not test for PHY_CLK_VALID bit on controller version 1.6
  usb: storage: shuttle_usbat: fix discs being detected twice
  usb: qcserial: add a number of Dell devices
  USB: OHCI: fix problem with global suspend on ATI controllers
  usb: gadget: at91-udc: fix irq and iomem resource retrieval
  usb: phy: fsm: change "|" to "||" for condition OTG_STATE_A_WAIT_BCON at statemachine
  usb: phy: fsm: update OTG HNP state transition

Merge tag 'tty-3.15-rc4' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
"Here are some tty and serial driver fixes for things reported
  recently"

* tag 'tty-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: Fix lockless tty buffer race
  Revert "tty: Fix race condition between __tty_buffer_request_room and flush_to_ldisc"
  drivers/tty/hvc: don't free hvc_console_setup after init
  n_tty: Fix n_tty_write crash when echoing in raw mode
  tty: serial: 8250_core.c Bug fix for Exar chips.

Merge tag 'staging-3.15-rc4' of git://git./linux/kernel/git/gregkh/staging

Pull staging / iio fixes from Greg KH:
"Here are some small IIO driver fixes for 3.15-rc4 that resolve some
  reported issues"

* tag 'staging-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: adc: Nothing in ADC should be a bool CONFIG
  iio: exynos_adc: use indio_dev->dev structure to handle child nodes
  iio:imu:mpu6050: Fixed segfault in Invensens MPU driver due to null dereference
  staging:iio:ad2s1200 fix missing parenthesis in a for statment.

Merge tag 'xtensa-next-20140503' of git://github.com/czankel/xtensa-linux

Pull Xtensa fixes from Chris Zankel:
- Fixes allmodconfig, allnoconfig builds
- Adds highmem support
- Enables build-time exception table sorting.

* tag 'xtensa-next-20140503' of git://github.com/czankel/xtensa-linux:
  xtensa: ISS: don't depend on CONFIG_TTY
  xtensa: xt2000: drop redundant sysmem initialization
  xtensa: add support for KC705
  xtensa: xtfpga: introduce SoC I/O bus
  xtensa: add HIGHMEM support
  xtensa: optimize local_flush_tlb_kernel_range
  xtensa: dump sysmem from the bootmem_init
  xtensa: handle memmap kernel option
  xtensa: keep sysmem banks ordered in mem_reserve
  xtensa: keep sysmem banks ordered in add_sysmem_bank
  xtensa: split bootparam and kernel meminfo
  xtensa: enable sorting extable at build time
  xtensa: export __{invalidate,flush}_dcache_range
  xtensa: Export __invalidate_icache_range

Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull Ceph fixes from Sage Weil:
"First, there is a critical fix for the new primary-affinity function
  that went into -rc1.

  The second batch of patches from Zheng fix a range of problems with
  directory fragmentation, readdir, and a few odds and ends for cephfs"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: reserve caps for file layout/lock MDS requests
  ceph: avoid releasing caps that are being used
  ceph: clear directory's completeness when creating file
  libceph: fix non-default values check in apply_primary_affinity()
  ceph: use fpos_cmp() to compare dentry positions
  ceph: check directory's completeness before emitting directory entry

net: macb: Fix race between HW and driver

Under "heavy" RX load, the driver cannot handle the descriptors fast
enough. In detail, when a descriptor is consumed, its used flag is
cleared and once the RX budget is consumed all descriptors with a
cleared used flag are prepared to receive more data. Under load though,
the HW may constantly receive more data and use those descriptors with a
cleared used flag before they are actually prepared for next usage.

The head and tail pointers into the RX-ring should always be valid and
we can omit clearing and checking of the used flag.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: Remove 'unlikely' optimization

Coverage data suggests that the unlikely case of receiving data while
the receive handler is running may not be that unlikely.
Coverage data after running iperf for a while:
    91320:  891: work_done = bp->macbgem_ops.mog_rx(bp, budget);
    91320:  892: if (work_done < budget) {
     2362:  893: napi_complete(napi);
        -:  894:
        -:  895: /* Packets received while interrupts were disabled */
     4724:  896: status = macb_readl(bp, RSR);
     2362:  897: if (unlikely(status)) {
      762:  898: if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
      762:  899: macb_writel(bp, ISR, MACB_BIT(RCOMP));
        -:  900: napi_reschedule(napi);
        -:  901: } else {
     1600:  902: macb_writel(bp, IER, MACB_RX_INT_FLAGS);
        -:  903: }
        -:  904: }

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: Re-enable RX interrupt only when RX is done

When data is received during the driver processing received data the
NAPI is re-scheduled. In that case the RX interrupt should not be
re-enabled.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: Clear interrupt flags

A few interrupt flags were not cleared in the ISR, resulting in a sytem
trapped in the ISR in cases one of those interrupts occurred. Clear all
flags to avoid such situations.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP

Just as commit "net: macb: DMA-unmap full rx-buffer"
(48330e08fa168395b9fd9f369f06cca1df204361), pass the size that
was used for mapping the memory also to the unmap routine to
avoid warnings from the DMA_API.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: Set network header properly for IP_ECN_decapsulate()

In ip_tunnel_rcv(), set skb->network_header to inner IP header
before IP_ECN_decapsulate().

Without the fix, IP_ECN_decapsulate() takes outer IP header as
inner IP header, possibly causing error messages or packet drops.

Note that this skb_reset_network_header() call was in this spot when
the original feature for checking consistency of ECN bits through
tunnels was added in eccc1bb8d4b4 ("tunnel: drop packet if ECN present
with not-ECT"). It was only removed from this spot in 3d7b46cd20e3
("ip_tunnel: push generic protocol handling to ip_tunnel module.").

Fixes: 3d7b46cd20e3 ("ip_tunnel: push generic protocol handling to ip_tunnel module.")
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Ying Cai <ycai@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to e1000e only.

David provides four fixes for e1000e, first is a workaround for a hardware
erratum on 82579 devices which experienced packet loss in gigabit and 100
speeds when interconnect between the PHY and MAC is exiting K1 power saving
state.  Second expands the scope of a workaround to include i217 and i218
parts as well to address over aggressive transmit behavior when connecting
at 10Mbs half-duplex.  Next is to resolve a reported link flap issue on
82579 parts which was root caused as an interoperability problem between
82579 and at least some Broadcom PHYs in the Energy Efficient Ethernet wake
mechanism.  Lastly, restricts the workaround of putting the PHY into MDIO
slow mode to access the PHY id to relevant parts since this issue has been
fixed on the newer hardware.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: Restrict MDIO Slow Mode workaround to relevant parts

It has been determined that the workaround of putting the PHY into MDIO
slow mode to access the PHY id is not necessary with Lynx Point and newer
parts. The issue that necessitated the workaround has been fixed on the
newer hardware.

We will maintains, as a last ditch attempt, the conversion to MDIO Slow
Mode in the failure branch when attempting to access the PHY id so as to
cover all contingencies.

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Fix issue with link flap on 82579

Several customers have reported a link flap issue on 82579. The symptoms
are random and intermittent link losses when 82579 is connected to specific
link partners. Issue has been root caused as interoperability problem
between 82579 and at least some Broadcom PHYs in the Energy Efficient
Ethernet wake mechanism.

To fix the issue, we are disabling the Phase Locked Loop shutdown in 100M
Low Power Idle. This solution will cause an increase of power in 100M EEE
link. It will cost additional 28mW in this specific mode.

Cc: Lukasz Adamczuk <lukasz.adamczuk@intel.com>
Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Expand workaround for 10Mb HD throughput bug

In commit 772d05c51c4f4896c120ad418b1e91144a2ac813 "e1000e: slow performance
between two 82579 connected via 10Mbit hub", a workaround was put into place
to address the overaggressive transmit behavior of 82579 parts when connecting
at 10Mbs half-duplex.

This same behavior is seen on i217 and i218 parts as well. This patch expands
the original workaround to encompass these parts.

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Workaround for dropped packets in Gig/100 speeds on 82579

This is a workaround for a HW erratum on 82579 devices.
Erratum is #23 in Intel 6 Series Chipset and Intel C200 Series Chipset
specification Update June 2013.

Problem: 82579 parts experience packet loss in Gig and 100 speeds
when interconnect between PHY and MAC is exiting K1 power saving state.
This was previously believed to only affect 1Gig speed, but has been observed
at 100Mbs also.

Workaround: Disable K1 for 82579 devices at Gig and 100 speeds.

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Merge branch 'mlx4'

Or Gerlitz says:

====================
This series contains fixes for 3.15-rc, mostly around SRIOV. The patches by Jack,
Matan and myself fix few issues related to mlx4 SRIOV support for RoCE and single
port VFs, and the patch from Eyal eliminates checking PCI caps for VFs which is misleading.

Patches done against the net tree, commit 014f1b2 "net: bonding: Fix format string
mismatch in bond_sysfs.c"

We'd be happy to get Eyal's patch queued in your -stable list for 3.14.y
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Don't issue PCIe speed/width checks for VFs

Carrying out PCI speed/width checks through pcie_get_minimum_link()
on VFs yield wrong results, so remove them.

Fixes: b912b2f ('net/mlx4_core: Warn if device doesn't have enough PCI bandwidth')
Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Load the Eth driver first

When running in SRIOV mode, VM that is assigned with a non-provisioned
Ethernet VFs get themselves a random mac when the Eth driver starts. In
this case, if the IB driver startup code that deals with RoCE runs first,
it will use a zero mac as the source mac for the Para-Virtual CM MADs
which is buggy. To handle that, we change the order of loading.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Fix slave id computation for single port VF

The code that deals with computing the slave id based on a given GID
gave wrong results when the number of single port VFs wasn't the
same for port 1 vs. port 2 and the relevant VF is single ported on
port 2. As a result, incoming CM MADs were dispatched to the wrong VF.
Fixed that and added documentation to clarify the computation steps.

Fixes: 449fc48 ('net/mlx4: Adapt code for N-Port VF')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Adjust port number in qp_attach wrapper when detaching

When using single ported VFs and the VF is using port 2, we need
to adjust the port accordingly (change it from 1 to 2).

Fixes: 449fc48 ('net/mlx4: Adapt code for N-Port VF')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cdc_ncm: fix buffer overflow

Commit 4d619f625a60 ("net: cdc_ncm: no point in filling up the NTBs
if we send ZLPs") changed the padding logic for devices with the ZLP
flag set. This meant that frames of any size will be sent without
additional padding, except for the single byte added if the size is
a multiple of the USB packet size. But if the unpadded size is
identical to the maximum frame size, and the maximum size is a
multiplum of the USB packet size, then this one-byte padding will
overflow the buffer.

Prevent padding if already at maximum frame size, letting usbnet
transmit a ZLP instead in this case.

Fixes: 4d619f625a60 ("net: cdc_ncm: no point in filling up the NTBs if we send ZLPs")
Reported by: Yu-an Shih <yshih@nvidia.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

Altera TSE: ALTERA_TSE should depend on HAS_DMA

If NO_DMA=y:

drivers/built-in.o: In function `altera_tse_probe':
altera_tse_main.c:(.text+0x25ec2e): undefined reference to `dma_set_mask'
altera_tse_main.c:(.text+0x25ec78): undefined reference to `dma_supported'
altera_tse_main.c:(.text+0x25ecb6): undefined reference to `dma_supported'
drivers/built-in.o: In function `sgdma_async_read':
altera_sgdma.c:(.text+0x25f620): undefined reference to `dma_sync_single_for_cpu'
drivers/built-in.o: In function `sgdma_uninitialize':
(.text+0x25f678): undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `sgdma_uninitialize':
(.text+0x25f696): undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `sgdma_initialize':
(.text+0x25f6f0): undefined reference to `dma_map_single'
drivers/built-in.o: In function `sgdma_initialize':
(.text+0x25f702): undefined reference to `dma_mapping_error'
drivers/built-in.o: In function `sgdma_tx_buffer':
(.text+0x25f92a): undefined reference to `dma_sync_single_for_cpu'
drivers/built-in.o: In function `sgdma_rx_status':
(.text+0x25fa24): undefined reference to `dma_sync_single_for_cpu'
make[3]: *** [vmlinux] Error 1

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vsock: Make transport the proto owner

Right now the core vsock module is the owner of the proto family. This
means there's nothing preventing the transport module from unloading if
there are open sockets, which results in a panic. Fix that by allowing
the transport to be the owner, which will refcount it properly.

Includes version bump to 1.0.1.0-k

Passes checkpatch this time, I swear...

Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless

John W. Linville says:

====================
pull request: wireless 2014-05-01

Please pull the following batch of fixes intended for the 3.15 stream!

For the Bluetooth bits, Gustavo says:

"Some fixes for 3.15. There is a revert for the intel driver, a new
device id, and two important SSP fixes from Johan."

On top of that...

Ben Hutchings gives us a fix for an unbalanced irq enable in an
rtl8192cu error path.

Colin Ian King provides an rtlwifi fix for an uninitialized variable.

Felix Fietkau brings a pair of ath9k fixes, one that corrects a
hardware initialization value and another that removes an (unnecessary)
flag that was being used in a way that led to a software tx queue
hang in ath9k.

Gertjan van Wingerde pushes a MAINTAINERS change to remove himself
from the rt2x00 maintainer team.

Hans de Goede fixes a brcmfmac firmware load hang.

Larry Finger changes rtlwifi to use the correct queue for V0 traffic
on rtl8192se.

Rajkumar Manoharan corrects a race in ath9k driver initialization.

Stanislaw Gruszka fixes an rt2x00 bug in which disabling beaconing
once on USB devices led to permanently disabling beaconing for those
devices.

Tim Harvey provides fixes for a pair of ath9k issues that can lead
to soft lockups in that driver.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

xtensa: ISS: don't depend on CONFIG_TTY

Build console support only when CONFIG_TTY is selected.
This restores ISS as the default platform for allnoconfig builds.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Chris Zankel <chris@zankel.net>

fix quoting of Ted's name in MAINTAINERS

Unpaired quotes really confuse mutt when copy & pasting it into the To:
form.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[ I'm going to remove all silly quotes entirely one day, but that day is
not today. So I'll just apply this - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'upstream-3.15-rc5' of git://git.infradead.org/linux-ubifs

Pull ubifs fixes from Artem Bityutskiy:
"This includes the following fixes:

   - two real bug-fixes from Tanya for the still "experimental" UBI
     fastmap feature
   - a one-liner from Kees which hardens kernel security
   - a small error-path fix, where we forget to free various resources
     in case of failure - spotted by the 'smatch' tool"

* tag 'upstream-3.15-rc5' of git://git.infradead.org/linux-ubifs:
  UBI: avoid workqueue format string leak
  UBI: fix ubi free PEBs count calculation
  UBI: fix error path in __wl_get_peb
  UBIFS: fix remount error path

floppy: don't write kernel-only members to FDRAWCMD ioctl output

Do not leak kernel-only floppy_raw_cmd structure members to userspace.
This includes the linked-list pointer and the pointer to the allocated
DMA space.

Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

floppy: ignore kernel-only members in FDRAWCMD ioctl input

Always clear out these floppy_raw_cmd struct members after copying the
entire structure from userspace so that the in-kernel version is always
valid and never left in an interdeterminate state.

Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bridge: superfluous skb->nfct check in br_nf_dev_queue_xmit

Currently bridge can silently drop ipv4 fragments.
If node have loaded nf_defrag_ipv4 module but have no nf_conntrack_ipv4,
br_nf_pre_routing defragments incoming ipv4 fragments
but nfct check in br_nf_dev_queue_xmit does not allow re-fragment combined
packet back, and therefore it is dropped in br_dev_queue_push_xmit without
incrementing of any failcounters

It seems the only way to hit the ip_fragment code in the bridge xmit
path is to have a fragment list whose reassembled fragments go over
the mtu. This only happens if nf_defrag is enabled. Thanks to
Florian Westphal for providing feedback to clarify this.

Defragmentation ipv4 is required not only in conntracks but at least in
TPROXY target and socket match, therefore #ifdef is changed from
NF_CONNTRACK_IPV4 to NF_DEFRAG_IPV4

Signed-off-by: Vasily Averin <vvs@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ipv4: fix "conntrack zones" support for defrag user check in ip_expire

Defrag user check in ip_expire was not updated after adding support for
"conntrack zones".

This bug manifests as a RFC violation, since the router will send
the icmp time exceeeded message when using conntrack zones.

Signed-off-by: Vasily Averin <vvs@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

mac80211: fix nested rtnl locking on ieee80211_reconfig

ieee80211_reconfig already holds rtnl, so calling
cfg80211_sched_scan_stopped results in deadlock.

Use the rtnl-version of this function instead.

Fixes: d43c6b6 ("mac80211: reschedule sched scan after HW restart")
Cc: stable@vger.kernel.org (3.14+)
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: add cfg80211_sched_scan_stopped_rtnl

Add locked-version for cfg80211_sched_scan_stopped.
This is used for some users that might want to
call it when rtnl is already locked.

Fixes: d43c6b6 ("mac80211: reschedule sched scan after HW restart")
Cc: stable@vger.kernel.org (3.14+)
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: free sme on connection failures

cfg80211 is notified about connection failures by
__cfg80211_connect_result() call. However, this
function currently does not free cfg80211 sme.

This results in hanging connection attempts in some cases

e.g. when mac80211 authentication attempt is denied,
we have this function call:
ieee80211_rx_mgmt_auth() -> cfg80211_rx_mlme_mgmt() ->
cfg80211_process_auth() -> cfg80211_sme_rx_auth() ->
__cfg80211_connect_result()

but cfg80211_sme_free() is never get called.

Fixes: ceca7b712 ("cfg80211: separate internal SME implementation")
Cc: stable@vger.kernel.org (3.10+)
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: Fix mac80211 station info rx bitrate for IBSS mode

Filter out incoming multicast packages before applying their bitrate
to the rx bitrate station info field to prevent them from setting the
rx bitrate to the basic multicast rate.

Signed-off-by: Henning Rogge <hrogge@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

UBI: avoid workqueue format string leak

When building the name for the workqueue thread, make sure a format
string cannot leak in from the disk name.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

UBI: fix ubi free PEBs count calculation

The ubi->free_count should be updated with every insert/remove to/from
the ubi->free list.

Signed-off-by: Tanya Brokhman <tlinder@codeaurora.org>
Reviewed-by: Dolev Raviv <draviv@codeaurora.org>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

UBI: fix error path in __wl_get_peb

In case of an error (if there are not free PEB's for example),
__wl_get_peb will return a negative value. In order to prevent access
violation we need to test the returned value prior to using it later on.

Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
Reviewed-by: Dolev Raviv <draviv@codeaurora.org>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

UBIFS: fix remount error path

Dan's "smatch" checker found out that there was a bug in the error path of the
'ubifs_remount_rw()' function. Instead of jumping to the "out" label which
cleans-things up, we just returned.

This patch fixes the problem.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Linux 3.15-rc4

net: sched: lock imbalance in hhf qdisc

hhf_change() takes the sch_tree_lock and releases it but misses the
error cases. Fix the missed case here.

To reproduce try a command like this,

# tc qdisc change dev p3p2 root hhf quantum 40960 non_hh_weight 300000

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'locks-v3.15-3' of git://git.samba.org/jlayton/linux

Pull file locking change from Jeff Layton:
"Only an email address change to the MAINTAINERS file"

* tag 'locks-v3.15-3' of git://git.samba.org/jlayton/linux:
MAINTAINERS: email address change for Jeff Layton

Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
"These are mostly arm64 fixes with an additional arm(64) platform fix
  for the initialisation of vexpress clocks (the latter only affecting
  arm64; the arch/arm64 code is SoC agnostic and does not rely on early
  SoC-specific calls)

   - vexpress platform clocks initialisation moved earlier following the
     arm64 move of of_clk_init() call in a previous commit
   - Default DMA ops changed to non-coherent to preserve compatibility
     with 32-bit ARM DT files.  The "dma-coherent" property can be used
     to explicitly mark a device coherent.  The Applied Micro DT file
     has been updated to avoid DMA cache maintenance for the X-Gene SATA
     controller (the only arm64 related driver with such assumption in
     -rc mainline)
   - Fixmap correction for earlyprintk
   - kern_addr_valid() fix for huge pages"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  vexpress: Initialise the sysregs before setting up the clocks
  arm64: Mark the Applied Micro X-Gene SATA controller as DMA coherent
  arm64: Use bus notifiers to set per-device coherent DMA ops
  arm64: Make default dma_ops to be noncoherent
  arm64: fixmap: fix missing sub-page offset for earlyprintk
  arm64: Fix for the arm64 kern_addr_valid() function

Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"This is two patches both fixing bugs in drivers (virtio-scsi and
  mpt2sas) causing an oops in certain circumstances"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  [SCSI] virtio-scsi: Skip setting affinity on uninitialized vq
  [SCSI] mpt2sas: Don't disable device twice at suspend.

netfilter: nfnetlink: Fix use after free when it fails to process batch

This bug manifests when calling the nft command line tool without
nf_tables kernel support.

kernel message:
[   44.071555] Netfilter messages via NETLINK v0.30.
[   44.072253] BUG: unable to handle kernel NULL pointer dereference at 0000000000000119
[   44.072264] IP: [<ffffffff8171db1f>] netlink_getsockbyportid+0xf/0x70
[   44.072272] PGD 7f2b74067 PUD 7f2b73067 PMD 0
[   44.072277] Oops: 0000 [#1] SMP
[...]
[   44.072369] Call Trace:
[   44.072373]  [<ffffffff8171fd81>] netlink_unicast+0x91/0x200
[   44.072377]  [<ffffffff817206c9>] netlink_ack+0x99/0x110
[   44.072381]  [<ffffffffa004b951>] nfnetlink_rcv+0x3c1/0x408 [nfnetlink]
[   44.072385]  [<ffffffff8171fde3>] netlink_unicast+0xf3/0x200
[   44.072389]  [<ffffffff817201ef>] netlink_sendmsg+0x2ff/0x740
[   44.072394]  [<ffffffff81044752>] ? __mmdrop+0x62/0x90
[   44.072398]  [<ffffffff816dafdb>] sock_sendmsg+0x8b/0xc0
[   44.072403]  [<ffffffff812f1af5>] ? copy_user_enhanced_fast_string+0x5/0x10
[   44.072406]  [<ffffffff816dbb6c>] ? move_addr_to_kernel+0x2c/0x50
[   44.072410]  [<ffffffff816db423>] ___sys_sendmsg+0x3c3/0x3d0
[   44.072415]  [<ffffffff811301ba>] ? handle_mm_fault+0xa9a/0xc60
[   44.072420]  [<ffffffff811362d6>] ? mmap_region+0x166/0x5a0
[   44.072424]  [<ffffffff817da84c>] ? __do_page_fault+0x1dc/0x510
[   44.072428]  [<ffffffff812b8b2c>] ? apparmor_capable+0x1c/0x60
[   44.072435]  [<ffffffff817d6e9a>] ? _raw_spin_unlock_bh+0x1a/0x20
[   44.072439]  [<ffffffff816dfc86>] ? release_sock+0x106/0x150
[   44.072443]  [<ffffffff816dc212>] __sys_sendmsg+0x42/0x80
[   44.072446]  [<ffffffff816dc262>] SyS_sendmsg+0x12/0x20
[   44.072450]  [<ffffffff817df616>] system_call_fastpath+0x1a/0x1f

Signed-off-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: ipv4: defrag: set local_df flag on defragmented skb

else we may fail to forward skb even if original fragments do fit
outgoing link mtu:

1. remote sends 2k packets in two 1000 byte frags, DF set
2. we want to forward but only see '2k > mtu and DF set'
3. we then send icmp error saying that outgoing link is 1500

But original sender never sent a packet that would not fit
the outgoing link.

Setting local_df makes outgoing path test size vs.
IPCB(skb)->frag_max_size, so we will still send the correct
error in case the largest original size did not fit
outgoing link mtu.

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Suggested-by: Maxime Bizon <mbizon@freebox.fr>
Fixes: 5f2d04f1f9 (ipv4: fix path MTU discovery with connection tracking)
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

vexpress: Initialise the sysregs before setting up the clocks

Following arm64 commit bc3ee18a7a57 (arm64: init: Move of_clk_init to
time_init()), vexpress_osc_of_setup() is called via of_clk_init() long
before initcalls are issued. Initialising the vexpress oscillators
requires the vespress sysregs to be already initialised, so this patch
adds an explicit call to vexpress_sysreg_of_early_init() in vexpress
oscillator setup function.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Mike Turquette <mturquette@linaro.org>

USB: Nokia 5300 should be treated as unusual dev

Signed-off-by: Daniele Forsi <dforsi@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: Nokia 305 should be treated as unusual dev

Signed-off-by: Victor A. Santos <victoraur.santos@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tty: Fix lockless tty buffer race

Commit 6a20dbd6caa2358716136144bf524331d70b1e03,
"tty: Fix race condition between __tty_buffer_request_room and flush_to_ldisc"
correctly identifies an unsafe race condition between
__tty_buffer_request_room() and flush_to_ldisc(), where the consumer
flush_to_ldisc() prematurely advances the head before consuming the
last of the data committed. For example:

           CPU 0                     |            CPU 1
__tty_buffer_request_room            | flush_to_ldisc
  ...                                |   ...
                                     |   count = head->commit - head->read
  n = tty_buffer_alloc()             |
  b->commit = b->used                |
  b->next = n                        |
                                     |   if (!count)                /* T */
                                     |     if (head->next == NULL)  /* F */
                                     |     buf->head = head->next

In this case, buf->head has been advanced but head->commit may have
been updated with a new value.

Instead of reintroducing an unnecessary lock, fix the race locklessly.
Read the commit-next pair in the reverse order of writing, which guarantees
the commit value read is the latest value written if the head is
advancing.

Reported-by: Manfred Schlaegl <manfred.schlaegl@gmx.at>
Cc: <stable@vger.kernel.org> # 3.12.x+
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "tty: Fix race condition between __tty_buffer_request_room and flush_to_ldisc"

This reverts commit 6a20dbd6caa2358716136144bf524331d70b1e03.

Although the commit correctly identifies an unsafe race condition
between __tty_buffer_request_room() and flush_to_ldisc(), the commit
fixes the race with an unnecessary spinlock in a lockless algorithm.

The follow-on commit, "tty: Fix lockless tty buffer race" fixes
the race locklessly.

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers/tty/hvc: don't free hvc_console_setup after init

When 'console=hvc0' is specified to the kernel parameter in x86 KVM guest,
hvc console is setup within a kthread. However, that will cause SEGV
and the boot will fail when the driver is builtin to the kernel,
because currently hvc_console_setup() is annotated with '__init'. This
patch removes '__init' to boot the guest successfully with 'console=hvc0'.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

n_tty: Fix n_tty_write crash when echoing in raw mode

The tty atomic_write_lock does not provide an exclusion guarantee for
the tty driver if the termios settings are LECHO & !OPOST.  And since
it is unexpected and not allowed to call TTY buffer helpers like
tty_insert_flip_string concurrently, this may lead to crashes when
concurrect writers call pty_write. In that case the following two
writers:
* the ECHOing from a workqueue and
* pty_write from the process
race and can overflow the corresponding TTY buffer like follows.

If we look into tty_insert_flip_string_fixed_flag, there is:
  int space = __tty_buffer_request_room(port, goal, flags);
  struct tty_buffer *tb = port->buf.tail;
  ...
  memcpy(char_buf_ptr(tb, tb->used), chars, space);
  ...
  tb->used += space;

so the race of the two can result in something like this:
              A                                B
__tty_buffer_request_room
                                  __tty_buffer_request_room
memcpy(buf(tb->used), ...)
tb->used += space;
                                  memcpy(buf(tb->used), ...) ->BOOM

B's memcpy is past the tty_buffer due to the previous A's tb->used
increment.

Since the N_TTY line discipline input processing can output
concurrently with a tty write, obtain the N_TTY ldisc output_lock to
serialize echo output with normal tty writes.  This ensures the tty
buffer helper tty_insert_flip_string is not called concurrently and
everything is fine.

Note that this is nicely reproducible by an ordinary user using
forkpty and some setup around that (raw termios + ECHO). And it is
present in kernels at least after commit
d945cb9cce20ac7143c2de8d88b187f62db99bdc (pty: Rework the pty layer to
use the normal buffering logic) in 2.6.31-rc3.

js: add more info to the commit log
js: switch to bool
js: lock unconditionally
js: lock only the tty->ops->write call

References: CVE-2014-0196
Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tty: serial: 8250_core.c Bug fix for Exar chips.

The sleep function was updated to put the serial port to sleep only when necessary.
This appears to resolve the errant behavior of the driver as described in
Kernel Bug 61961 – "My Exar Corp. XR17C/D152 Dual PCI UART modem does not
work with 3.8.0".

Signed-off-by: Michael Welling <mwelling@ieee.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>