platform/kernel/linux-exynos.git
8 years agonet: thunderx: Don't set mac address for secondary Qset VFs
Sunil Goutham [Fri, 12 Aug 2016 11:21:40 +0000 (16:51 +0530)]
net: thunderx: Don't set mac address for secondary Qset VFs

Set MAC addresses only for primary VFs and not for
secondary VFs.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Reset RXQ HW stats when interface is brought down
Jerin Jacob [Fri, 12 Aug 2016 11:21:39 +0000 (16:51 +0530)]
net: thunderx: Reset RXQ HW stats when interface is brought down

When an SQ/TXQ is reclaimed, i.e. reset, its stats are also automatically
reset by HW. This is not the case with an RQ. Also, a VF doesn't have write
access to the statistics counter registers. Hence a new mbox message is
introduced which supports resetting RQ, SQ and full Qset stats. Currently
only RQ stats are reset using this mbox message.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Improvement for MBX interface debug messages
Radoslaw Biernacki [Fri, 12 Aug 2016 11:21:38 +0000 (16:51 +0530)]
net: thunderx: Improvement for MBX interface debug messages

Add debug messages for the case where a mailbox message is NACKed, and
do some small cleanups.

Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Use skb_add_rx_frag() for split buffer Rx pkts
Sunil Goutham [Fri, 12 Aug 2016 11:21:37 +0000 (16:51 +0530)]
net: thunderx: Use skb_add_rx_frag() for split buffer Rx pkts

Instead of the roundabout way of converting buffers to SKBs and
combining them into a frag list, use the standard skb_add_rx_frag()
API to merge page fragments. This code is useful when incoming
packets are larger than RCV_FRAG_LEN, which is currently
set to 2048 bytes.
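
As a rough illustration of when the split-buffer path comes into play, the
sketch below counts how many receive buffers a packet spans when each buffer
holds RCV_FRAG_LEN bytes. The helper name rx_frag_count() is hypothetical and
is not taken from the driver; only the 2048-byte value comes from the message
above.

```c
#include <stddef.h>

/* Buffer size quoted in the commit message. */
#define RCV_FRAG_LEN 2048u

/*
 * Hypothetical helper (not driver code): number of RCV_FRAG_LEN-sized
 * buffers needed to hold a packet of pkt_len bytes, rounding up.
 */
static size_t rx_frag_count(size_t pkt_len)
{
    return (pkt_len + RCV_FRAG_LEN - 1) / RCV_FRAG_LEN;
}
```

Under this assumption a 1500-byte packet fits in one buffer, while a 9000-byte
jumbo frame spans five and would need its page fragments merged.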

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Use netdev's name for naming VF's interrupts
Sunil Goutham [Fri, 12 Aug 2016 11:21:36 +0000 (16:51 +0530)]
net: thunderx: Use netdev's name for naming VF's interrupts

This patch changes how VF IRQs appear in /proc/interrupts.
Instead of the VF id, the logical interface's netdev name is used for IRQ
naming, and in a multiqset config all secondary VFs' interrupts
use the primary VF's netdev name.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Support for 83xx mixed QLM/DLM config
Sunil Goutham [Fri, 12 Aug 2016 11:21:35 +0000 (16:51 +0530)]
net: thunderx: Support for 83xx mixed QLM/DLM config

83xx has 4 BGX blocks, which are enabled in mixed QLM/DLM
configs. BGX0/BGX1 are from QLM2/QLM3, BGX3 is DLM4
and BGX2 is split across DLM5 & DLM6.

This patch adds support for BGX2's split config and also
enables all 4 BGXs to be used.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Add support for 16 LMACs of 83xx
Sunil Goutham [Fri, 12 Aug 2016 11:21:34 +0000 (16:51 +0530)]
net: thunderx: Add support for 16 LMACs of 83xx

83xx will have 4 BGX blocks, i.e. 16 LMACs. To avoid changing
this for every platform, the nicpf struct elements which
track LMAC-related info are now allocated at runtime based
on the platform's max possible BGX count.

Also fixed configuring of the min packet size for all LMACs
supported on a platform.
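
The runtime sizing described above can be sketched roughly as follows. This is
only an illustration: struct lmac_info, its fields and alloc_lmac_info() are
hypothetical names, and the factor of four LMACs per BGX is inferred from
"4 BGX blocks, i.e. 16 LMACs".

```c
#include <stdint.h>
#include <stdlib.h>

/* Inferred from the message: 4 BGX blocks give 16 LMACs. */
#define MAX_LMAC_PER_BGX 4

/* Hypothetical per-LMAC bookkeeping; field names are illustrative only. */
struct lmac_info {
    uint8_t vf_id;
    uint8_t link_up;
};

/*
 * Allocate zero-initialised tracking entries for every LMAC the platform
 * can have. Stores the entry count in *count and returns NULL on
 * allocation failure; the caller releases the array with free().
 */
static struct lmac_info *alloc_lmac_info(unsigned int max_bgx, size_t *count)
{
    *count = (size_t)max_bgx * MAX_LMAC_PER_BGX;
    return calloc(*count, sizeof(struct lmac_info));
}
```

Sizing by BGX count means a platform with a different number of BGX blocks
needs no change to the bookkeeping code, only a different max_bgx value.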

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Add RGMII interface type support
Sunil Goutham [Fri, 12 Aug 2016 11:21:33 +0000 (16:51 +0530)]
net: thunderx: Add RGMII interface type support

This patch adds RGX/RGMII interface type support to BGX
driver. This type of interface is supported by 81xx SOC.

CN81XX VNIC has 8 VFs and the max possible number of LMAC interfaces is 9,
hence the RGMII interface will not work if all DLMs are in BGX mode
and all 8 LMACs are enabled.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Add QSGMII interface type support
Sunil Goutham [Fri, 12 Aug 2016 11:21:32 +0000 (16:51 +0530)]
net: thunderx: Add QSGMII interface type support

This patch adds support for QSGMII interface type to
the BGX driver. This type of interface is supported by
81xx SOC.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Add 81xx support to BGX driver
Sunil Goutham [Fri, 12 Aug 2016 11:21:31 +0000 (16:51 +0530)]
net: thunderx: Add 81xx support to BGX driver

This patch adds support for BGX module on 81xx where a BGX
can be split and have different LMACs configured in
different modes.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Support for different LMAC types within BGX
Sunil Goutham [Fri, 12 Aug 2016 11:21:30 +0000 (16:51 +0530)]
net: thunderx: Support for different LMAC types within BGX

On 88xx all LMACs in a BGX will be in the same mode, but on 81xx a
BGX can be split in two and its LMACs can be configured in
different modes.

These changes move the lmac_type and lane2serdes fields from the BGX
struct into the per-LMAC struct, and get rid of the qlm_mode field, which
has become redundant with these changes. The number of valid LMACs is now
read from CSRs configured by low-level firmware, and deriving it
from the QLM mode is dropped.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Enable mailbox interrupts on 81xx/83xx
Sunil Goutham [Fri, 12 Aug 2016 11:21:29 +0000 (16:51 +0530)]
net: thunderx: Enable mailbox interrupts on 81xx/83xx

88xx has 128 VFs, 81xx has 8 VFs and 83xx will have 32 VFs.
Made changes to the PF driver such that the mailbox interrupt enable
registers are configured based on the number of VFs the HW supports.
Also cleaned up the mailbox irq handler registration code.
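
A minimal sketch of deriving enable masks from the VF count is given below. It
assumes, purely for illustration, that each enable register is 64 bits wide
with one bit per VF; mbox_enable_mask() and VFS_PER_MBOX_REG are hypothetical
names and not the driver's.

```c
#include <stdint.h>

/* Assumption for this sketch: one enable bit per VF, 64 bits per register. */
#define VFS_PER_MBOX_REG 64u

/*
 * Hypothetical helper: bit mask to write into enable register reg_idx so
 * that exactly the first num_vfs VFs have their mailbox interrupt enabled.
 */
static uint64_t mbox_enable_mask(unsigned int num_vfs, unsigned int reg_idx)
{
    unsigned int first = reg_idx * VFS_PER_MBOX_REG;

    if (num_vfs <= first)
        return 0;                           /* no VFs map to this register */
    if (num_vfs - first >= VFS_PER_MBOX_REG)
        return UINT64_MAX;                  /* register fully populated */
    return (UINT64_C(1) << (num_vfs - first)) - 1;
}
```

With these assumptions, 128 VFs fill two registers, 32 VFs use the low 32 bits
of the first register, and 8 VFs use only its low byte.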

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Enable CQE_RX desc's extension fields
Sunil Goutham [Fri, 12 Aug 2016 11:21:28 +0000 (16:51 +0530)]
net: thunderx: Enable CQE_RX desc's extension fields

Unlike 88xx, the CQE_RX descriptor's tunnelling extension, i.e. CQE_RX2_S,
is always enabled on 81xx/83xx and HW does insert these fields into
CQE_RX. As a result receive buffer addresses will now be present at
the 7th word of CQE_RX instead of the 6th.

Enable CQE_RX2_S on 88xx pass 2.x as well.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Set queue count based on number of CPUs
Sunil Goutham [Fri, 12 Aug 2016 11:21:27 +0000 (16:51 +0530)]
net: thunderx: Set queue count based on number of CPUs

81xx has only 4 CPUs, so it doesn't make sense to initialize the
entire Qset, i.e. 8 queues, by default. Made changes to queue
initialization to init a number of queues equal to the number of CPUs or
8, whichever is less. This also applies to
VMs with a VNIC VF attached and fewer VCPUs.
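
The rule above amounts to taking a minimum. A small sketch, with the
hypothetical names default_queue_count() and MAX_QUEUES_PER_QSET standing in
for whatever the driver actually uses:

```c
/* A full Qset has 8 queues, per the message above. */
#define MAX_QUEUES_PER_QSET 8u

/*
 * Hypothetical helper: default number of queues to initialise, i.e. the
 * smaller of the CPU count and the Qset size.
 */
static unsigned int default_queue_count(unsigned int online_cpus)
{
    return online_cpus < MAX_QUEUES_PER_QSET ? online_cpus
                                             : MAX_QUEUES_PER_QSET;
}
```

A 4-CPU 81xx would therefore start with 4 queues, while a machine with many
more CPUs is still capped at 8.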

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Add support for 81xx and 83xx chips
Sunil Goutham [Fri, 12 Aug 2016 11:21:26 +0000 (16:51 +0530)]
net: thunderx: Add support for 81xx and 83xx chips

This patch adds info on HW maximums of 81xx/83xx and also
configures receive and transmit datapaths accordingly.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Add VNIC's PCI devid on future chips
Sunil Goutham [Fri, 12 Aug 2016 11:21:25 +0000 (16:51 +0530)]
net: thunderx: Add VNIC's PCI devid on future chips

This patch adds PCI device IDs of VNIC on newer chips and also
registers VF driver with them. Device id remains same for all
versions of chips but subsystem device id changes.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: thunderx: Moved HW capability info from macros to structure
Sunil Goutham [Fri, 12 Aug 2016 11:21:24 +0000 (16:51 +0530)]
net: thunderx: Moved HW capability info from macros to structure

The current driver has most of the HW maximums info, like the number of channels,
traffic limiters, RSS indices, etc., in the form of macros. These have
been moved into a 'hw_info' structure so that support for VNIC on
newer chips with a different set of HW maximums can be added.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'xgene-next'
David S. Miller [Sat, 13 Aug 2016 18:48:54 +0000 (11:48 -0700)]
Merge branch 'xgene-next'

Iyappan Subramanian says:

====================
Fix warning and issues

This patch set fixes the following warnings and issues:

  1. Fix compiler warnings
     - drivers: net: xgene: Fix compiler warnings
  2. Unmap DMA memory in xgene_enet_delete_bufpool()
     - drivers: net: xgene: fix: Add dma_unmap_single
  3. Delete descriptor rings and buffer pools on error
     - drivers: net: xgene: fix: Delete descriptor rings and buffer pools
  4. Fix error deconstruction on probe()
     - drivers: net: xgene: Fix error deconstruction path
  5. Fix RSS indirection table fields
     - drivers: net: xgene: Fix RSS indirection table fields
  6. Change the port init sequence as per hardware specification
     - drivers: net: xgene: Change port init sequence
  7. Fix link not recovering after the link goes down
     - drivers: net: xgene: XFI PCS reset when link is down
  8. Fix link up being reported when no SFP+ module is plugged in
     - drivers: net: xgene: Poll link status via GPIO
     - dtb: xgene: Add rxlos-gpios property
     - Documentation: dtb: xgene: Add rxlos GPIO mapping
  9. Fix backward compatibility when used with older driver
     - drivers: net: xgene: Fix backward compatibility
     - dtb: xgene: Fix backward compatibility

v2: Address review comments from v1
- Fixed compiler warnings
- Removed kbuild fix patch, since Arnd submitted the same
- Changed Kconfig to select GPIOLIB (to fix kbuild warning)
- Added rxlos-gpio documentation
- Fixed backward compatibility with older driver

v1:
- Initial version
====================

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodtb: xgene: Fix backward compatibility
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:48 +0000 (22:05 -0700)]
dtb: xgene: Fix backward compatibility

This patch fixes backward compatibility when used with an older kernel.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: Fix backward compatibility
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:47 +0000 (22:05 -0700)]
drivers: net: xgene: Fix backward compatibility

This patch fixes backward compatibility in the handling of phy_connect(), by
iterating over the phy-handle, when a new DT is used with an older kernel.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoDocumentation: dtb: xgene: Add rxlos GPIO mapping
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:46 +0000 (22:05 -0700)]
Documentation: dtb: xgene: Add rxlos GPIO mapping

Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodtb: xgene: Add rxlos-gpios property
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:45 +0000 (22:05 -0700)]
dtb: xgene: Add rxlos-gpios property

Added rxlos GPIO mapping by adding rxlos-gpios property.

Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: Poll link status via GPIO
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:44 +0000 (22:05 -0700)]
drivers: net: xgene: Poll link status via GPIO

When a 10GbE SFP+ module is not plugged in or the cable is not connected,
the link status register does not report the proper state due
to a floating signal. This patch checks the module-present status via a
GPIO to determine whether to ignore the link status register and report
link down.
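
The decision described above can be sketched as a tiny predicate. The function
and parameter names are hypothetical; how the driver actually reads the GPIO
and the register is not shown here.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: module_present is the state sampled from the GPIO,
 * reg_link_up is what the link status register claims.
 */
static bool effective_link_up(bool module_present, bool reg_link_up)
{
    if (!module_present)
        return false;       /* register may be floating, so ignore it */
    return reg_link_up;
}
```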

Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: XFI PCS reset when link is down
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:43 +0000 (22:05 -0700)]
drivers: net: xgene: XFI PCS reset when link is down

This patch fixes the link recovery issue by doing a PCS reset
when the link is down.

Signed-off-by: Fushen Chen <fchen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: Change port init sequence
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:42 +0000 (22:05 -0700)]
drivers: net: xgene: Change port init sequence

This patch rearranges the port initialization sequence as recommended by
the hardware specification. This patch also removes the mac_init() call from
xgene_enet_link_state(), as it was not required.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: Fix RSS indirection table fields
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:41 +0000 (22:05 -0700)]
drivers: net: xgene: Fix RSS indirection table fields

This patch fixes the length of the FPSel and NxtFPSel fields to 5 bits.
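
For readers unfamiliar with packed register fields, the sketch below shows
what a 5-bit field width means in practice. The helper names and the bit
offsets passed to them are hypothetical; the real positions of FPSel and
NxtFPSel are not given in this message.

```c
#include <stdint.h>

#define FIELD5_WIDTH 5u
#define FIELD5_MASK  ((1u << FIELD5_WIDTH) - 1)    /* 0x1f */

/* Hypothetical helper: store val in the 5-bit field at bit offset shift. */
static uint32_t set_field5(uint32_t word, unsigned int shift, uint32_t val)
{
    word &= ~(FIELD5_MASK << shift);        /* clear the old field value */
    word |= (val & FIELD5_MASK) << shift;   /* truncate val to 5 bits */
    return word;
}

/* Hypothetical helper: read back the 5-bit field at bit offset shift. */
static uint32_t get_field5(uint32_t word, unsigned int shift)
{
    return (word >> shift) & FIELD5_MASK;
}
```

A field declared narrower than 5 bits would silently drop the upper bits of
larger selector values, and a wider one would spill into its neighbour.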

Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Tested-by: Fushen Chen <fchen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: Fix error deconstruction path
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:40 +0000 (22:05 -0700)]
drivers: net: xgene: Fix error deconstruction path

Since the register_netdev() call in xgene_enet_probe() was moved down to
the end, the probe function doesn't properly handle errors that may occur;
it needs to deconstruct everything that was set up before the error occurred.
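
The usual C idiom for this kind of deconstruction is a chain of goto labels
that undo the setup steps in reverse order. The sketch below simulates it with
flags; struct probe_state, probe_sketch() and the three steps are hypothetical
and do not mirror the actual xgene_enet_probe() code.

```c
#include <stdbool.h>

/* Hypothetical record of which resources are currently held. */
struct probe_state {
    bool clocks_on;
    bool hw_ready;
    bool registered;
};

/*
 * fail_at selects which step (1..3) fails; 0 means every step succeeds.
 * On failure, everything set up before the failing step is undone.
 */
static int probe_sketch(struct probe_state *st, int fail_at)
{
    int ret;

    if (fail_at == 1)
        return -1;              /* nothing set up yet, nothing to undo */
    st->clocks_on = true;

    if (fail_at == 2) {
        ret = -2;
        goto err_clocks;
    }
    st->hw_ready = true;

    if (fail_at == 3) {         /* stands in for a failing final registration */
        ret = -3;
        goto err_hw;
    }
    st->registered = true;
    return 0;

err_hw:
    st->hw_ready = false;       /* undo step 2 */
err_clocks:
    st->clocks_on = false;      /* undo step 1 */
    return ret;
}
```

Because the registration step comes last, its failure path has to fall
through every earlier cleanup label.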

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: fix: Delete descriptor rings and buffer pools
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:39 +0000 (22:05 -0700)]
drivers: net: xgene: fix: Delete descriptor rings and buffer pools

xgene_enet_init_hw() should delete any descriptor rings and
buffer pools it has set up if cle_ops->cle_init() returns an error.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: fix: Add dma_unmap_single
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:38 +0000 (22:05 -0700)]
drivers: net: xgene: fix: Add dma_unmap_single

In addition to xgene_enet_delete_bufpool() freeing skbs, their associated
DMA memory should also be unmapped.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: Fix compiler warnings
Iyappan Subramanian [Sat, 13 Aug 2016 05:05:37 +0000 (22:05 -0700)]
drivers: net: xgene: Fix compiler warnings

Fixed compiler warnings reported with -Wmaybe-uninitialized W=1,

      /drivers/net/ethernet/apm/xgene/xgene_enet_main.c: In function ‘xgene_enet_rx_frame’:
      ../drivers/net/ethernet/apm/xgene/xgene_enet_main.c:455:27: warning: variable ‘pdata’ set but not used [-Wunused-but-set-variable]
      struct xgene_enet_pdata *pdata;
      ^
      ../drivers/net/ethernet/apm/xgene/xgene_enet_main.c: In function ‘xgene_enet_remove’:
      ../drivers/net/ethernet/apm/xgene/xgene_enet_main.c:1691:30: warning: variable ‘mac_ops’ set but not used [-Wunused-but-set-variable]
      const struct xgene_mac_ops *mac_ops;
                                   ^

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'bpf-improvements'
David S. Miller [Sat, 13 Aug 2016 04:57:13 +0000 (21:57 -0700)]
Merge branch 'bpf-improvements'

Alexei Starovoitov says:

====================
bpf improvements

Two bpf improvements:
1. allow bpf helpers like bpf_map_lookup_elem() to access packet data directly
  for XDP programs
2. enable bpf_get_prandom_u32() for tracing programs
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: allow bpf_get_prandom_u32() to be used in tracing
Alexei Starovoitov [Fri, 12 Aug 2016 01:17:18 +0000 (18:17 -0700)]
bpf: allow bpf_get_prandom_u32() to be used in tracing

bpf_get_prandom_u32() was initially introduced for socket filters
and was later requested numerous times for tracing bpf programs
for the same reason as in socket filters: to be able to randomly
select incoming events.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosamples/bpf: add verifier tests for the helper access to the packet
Aaron Yue [Fri, 12 Aug 2016 01:17:17 +0000 (18:17 -0700)]
samples/bpf: add verifier tests for the helper access to the packet

test various corner cases of the helper function access to the packet
via crafted XDP programs.

Signed-off-by: Aaron Yue <haoxuany@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: allow helpers access the packet directly
Alexei Starovoitov [Fri, 12 Aug 2016 01:17:16 +0000 (18:17 -0700)]
bpf: allow helpers access the packet directly

Helper functions like bpf_map_lookup_elem(map, key) only
allowed 'key' to point to the initialized stack area.
That causes performance degradation when programs need to process
millions of packets per second and need to copy contents of the packet
onto the stack just to pass the stack pointer into the lookup() function.
Allow such helpers to read from the packet directly.
All helpers that expect ARG_PTR_TO_MAP_KEY, ARG_PTR_TO_MAP_VALUE or
ARG_PTR_TO_STACK assume a byte-aligned pointer, so there are no alignment
concerns; we only need to check that the helper will not access beyond
the packet range verified by the prior 'if (ptr < data_end)' condition.
For now allow this feature for XDP programs only. Later it can be
relaxed for the clsact programs as well.
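
The range condition mentioned above can be pictured with the plain C sketch
below. It is a runtime illustration only: the verifier establishes the
equivalent fact statically when the program is loaded, and pkt_range_ok() is a
hypothetical name, not a kernel function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the len bytes starting at ptr lie entirely
 * inside the packet, i.e. inside [data, data_end).
 */
static bool pkt_range_ok(const uint8_t *data, const uint8_t *data_end,
                         const uint8_t *ptr, size_t len)
{
    if (ptr < data || ptr > data_end)
        return false;
    return len <= (size_t)(data_end - ptr);
}
```

A helper reading a 4-byte key at offset 14 of a 64-byte packet passes this
check, while the same read at offset 62 would run past data_end and fail it.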

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosit: make function ipip6_valid_ip_proto() static
Wei Yongjun [Sat, 13 Aug 2016 01:54:15 +0000 (01:54 +0000)]
sit: make function ipip6_valid_ip_proto() static

Fixes the following sparse warning:

net/ipv6/sit.c:1129:6: warning:
 symbol 'ipip6_valid_ip_proto' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'bpf-under-cgroup'
David S. Miller [Sat, 13 Aug 2016 04:49:42 +0000 (21:49 -0700)]
Merge branch 'bpf-under-cgroup'

Sargun Dhillon says:

====================
Add test_current_task_under_cgroup bpf helper and test

This patchset includes a helper and an example to determine whether the probe is
currently executing in the context of a specific cgroup based on a cgroup bpf
map / array. The helper checks the cgroupsv2 hierarchy based on the handle in
the map and if the current cgroup is equal to it, or a descendant of it. The
helper was tested with the example program, and it was verified that the correct
behaviour occurs in the interrupt context.

In an earlier version of this patchset I had added an "opensnoop"-like tool, and
I realized I was basically reimplementing a lot of the code that already exists
in the bcc repo. So, instead I decided to write a test that creates a new mount
namespace, mounts up the cgroupv2 hierarchy, and does some basic tests.  I used
the sync syscall as a canary for these tests because it's a simple, 0-arg
syscall. Once this patch is accepted, adding support to opensnoop will be easy.

I also added a task_under_cgroup_hierarchy function in cgroups.h, as this
pattern is used in a couple places. Converting those can be done in a later
patchset.

Thanks to Alexei, Tejun, and Daniel for providing review.

v1->v2: Clean up
v2->v3: Move around ifdefs out of *.c files, add an "integration" test
v3->v4: De-genericize arraymap fetching function;
rename helper from in_cgroup to under_cgroup (makes much more sense);
split out adding the cgroups task_under_cgroup_hierarchy function
v4->v5: Fix formatting
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosamples/bpf: Add test_current_task_under_cgroup test
Sargun Dhillon [Fri, 12 Aug 2016 15:57:04 +0000 (08:57 -0700)]
samples/bpf: Add test_current_task_under_cgroup test

This test has a BPF program which writes the last known pid to call the
sync syscall within a given cgroup to a map.

The user mode program creates its own mount namespace and mounts the
cgroupsv2 hierarchy in there, as on all current test systems
(Ubuntu 16.04, Debian) the cgroupsv2 vfs is unmounted by default.
Once it does this, it proceeds to test.

The test checks for both positive and negative conditions. It ensures that
when it's part of a given cgroup, its pid is captured in the map,
and that when it leaves the cgroup, this doesn't happen.

It populates a cgroups arraymap prior to execution in userspace. This means
that the program must be run in the same cgroups namespace as the programs
that are being traced.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: Add bpf_current_task_under_cgroup helper
Sargun Dhillon [Fri, 12 Aug 2016 15:56:52 +0000 (08:56 -0700)]
bpf: Add bpf_current_task_under_cgroup helper

This adds a bpf helper that's similar to the skb_in_cgroup helper to check
whether the probe is currently executing in the context of a specific
subset of the cgroupsv2 hierarchy. It does this based on a membership test
against a cgroup arraymap. It is invalid to call this in an interrupt, and
in that case it returns an error. The helper is primarily to be used in
debugging activities for containers, where you may have multiple programs
running in a given top-level "container".
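
The "subset of the hierarchy" test boils down to asking whether the current
cgroup is the given cgroup or one of its descendants. A minimal sketch with a
hypothetical parent-pointer tree (struct cg_node and cg_is_descendant() are
illustrative names, not kernel types or functions):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cgroup node: only the parent link matters for this test. */
struct cg_node {
    const struct cg_node *parent;
};

/*
 * True if cg is ancestor itself or lies below it in the tree, found by
 * walking parent links up to the root.
 */
static bool cg_is_descendant(const struct cg_node *cg,
                             const struct cg_node *ancestor)
{
    for (; cg != NULL; cg = cg->parent)
        if (cg == ancestor)
            return true;
    return false;
}
```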

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agocgroup: Add task_under_cgroup_hierarchy cgroup inline function to headers
Sargun Dhillon [Fri, 12 Aug 2016 15:56:40 +0000 (08:56 -0700)]
cgroup: Add task_under_cgroup_hierarchy cgroup inline function to headers

This commit adds an inline function to cgroup.h to check whether a given
task is under a given cgroup hierarchy. This is to avoid having to put
ifdefs in .c files to gate access to cgroups. When cgroups are disabled
this always returns true.
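
The header pattern being described looks roughly like the sketch below: a real
declaration when the feature is configured, and an inline stub otherwise, so
callers need no #ifdef of their own. SKETCH_CONFIG_CGROUPS and
sketch_task_under_hierarchy() are hypothetical stand-ins, not the kernel's
actual symbols.

```c
#include <stdbool.h>
#include <stddef.h>

struct task;      /* opaque stand-ins for the kernel types */
struct cgroup;

#ifdef SKETCH_CONFIG_CGROUPS
/* Real implementation lives elsewhere when the feature is enabled. */
bool sketch_task_under_hierarchy(const struct task *t,
                                 const struct cgroup *ancestor);
#else
/* Feature compiled out: every task counts as being under every hierarchy. */
static inline bool sketch_task_under_hierarchy(const struct task *t,
                                               const struct cgroup *ancestor)
{
    (void)t;
    (void)ancestor;
    return true;
}
#endif
```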

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge tag 'batadv-next-for-davem-20160812' of git://git.open-mesh.org/linux-merge
David S. Miller [Sat, 13 Aug 2016 03:55:41 +0000 (20:55 -0700)]
Merge tag 'batadv-next-for-davem-20160812' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature patchset includes the following changes (mostly
chronological order):

 - bump version strings, by Simon Wunderlich

 - kerneldoc clean up, by Sven Eckelmann

 - enable RTNL automatic loading and according documentation
   changes, by Sven Eckelmann (2 patches)

 - fix/improve interface removal and associated locking, by
   Sven Eckelmann (3 patches)

 - clean up unused variables, by Linus Luessing

 - implement Gateway selection code for B.A.T.M.A.N. V by
   Antonio Quartulli (4 patches)

 - rewrite TQ comparison by Markus Pargmann

 - fix Coccinelle warnings on bool vs integers (by Fengguang Wu/Intel's
   kbuild test robot) and bitwise arithmetic operations (by Linus
   Luessing)

 - rewrite packet creation for forwarding for readability and to avoid
   reference count mistakes, by Linus Luessing

 - use kmem_cache for translation table, which results in more efficient
   storing of translation table entries, by Sven Eckelmann

 - rewrite/clarify reference handling for send_skb_unicast, by Sven
   Eckelmann

 - fix debug messages when updating routes, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'sfc-SFN8000-support-improvements'
David S. Miller [Sat, 13 Aug 2016 03:42:20 +0000 (20:42 -0700)]
Merge branch 'sfc-SFN8000-support-improvements'

Bert Kenward says:

====================
sfc: SFN8000 support improvements

This series improves support for the recently released SFN8000 series
of adapters. Specifically, it retrieves interrupt moderation timer
settings directly from the adapter and uses those settings. It also
uses a new event queue initialisation interface, allowing specification
of a performance objective rather than enabling individual flags.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: get timer configuration from adapter
Bert Kenward [Thu, 11 Aug 2016 12:02:36 +0000 (13:02 +0100)]
sfc: get timer configuration from adapter

On SFN8000 series adapters the MC provides a method to get the timer
quantum and the maximum timer setting. We revert to the old values if the
new call is unavailable.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: set interrupt moderation via MCDI
Bert Kenward [Thu, 11 Aug 2016 12:02:09 +0000 (13:02 +0100)]
sfc: set interrupt moderation via MCDI

SFN8000-series NICs require a new method of setting interrupt moderation,
via MCDI. This is indicated by a workaround flag. This new MCDI command
takes an explicit time value rather than a number of ticks. It therefore
makes sense to also store the moderation values in terms of time, since
that is what the ethtool interface is interested in.
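
Storing moderation as time means a conversion to ticks is only needed for
hardware that still wants ticks. A sketch of such a conversion is below; the
function name, the nanosecond quantum parameter and the round-up policy are
all assumptions made for illustration and are not taken from the driver.

```c
/*
 * Hypothetical helper: convert a moderation time in microseconds into
 * timer ticks of quantum_ns nanoseconds each, rounding up so that a
 * non-zero time never becomes zero ticks.
 */
static unsigned int usecs_to_ticks(unsigned int usecs, unsigned int quantum_ns)
{
    unsigned long long ns;

    if (usecs == 0 || quantum_ns == 0)
        return 0;
    ns = (unsigned long long)usecs * 1000u;
    return (unsigned int)((ns + quantum_ns - 1) / quantum_ns);
}
```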

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: use new performance based event queue init
Bert Kenward [Thu, 11 Aug 2016 12:01:54 +0000 (13:01 +0100)]
sfc: use new performance based event queue init

Rather than explicitly specifying flags we can now specify a desired
performance target to the firmware, i.e. higher throughput or lower latency.
For now we use the default "auto" configuration.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: retrieve second word of datapath capabilities
Bert Kenward [Thu, 11 Aug 2016 12:01:35 +0000 (13:01 +0100)]
sfc: retrieve second word of datapath capabilities

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: allow asynchronous MCDI without completion function
Bert Kenward [Thu, 11 Aug 2016 12:01:21 +0000 (13:01 +0100)]
sfc: allow asynchronous MCDI without completion function

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: update MCDI protocol headers
Bert Kenward [Thu, 11 Aug 2016 12:01:01 +0000 (13:01 +0100)]
sfc: update MCDI protocol headers

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: mediatek: enhance the locking using the lightweight ones
Sean Wang [Thu, 11 Aug 2016 09:51:00 +0000 (17:51 +0800)]
net: ethernet: mediatek: enhance the locking using the lightweight ones

Since the critical sections protected by page_lock are all entered
from user context or bottom half context, spin_lock_irqsave() can be
replaced with spin_lock() or spin_lock_bh().

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Acked-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ena: Add a driver for Amazon Elastic Network Adapters (ENA)
Netanel Belgazal [Wed, 10 Aug 2016 11:03:22 +0000 (14:03 +0300)]
net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)

This is a driver for the ENA family of networking devices.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'xilinx-gmiitorgmii-converter'
David S. Miller [Fri, 12 Aug 2016 23:57:20 +0000 (16:57 -0700)]
Merge branch 'xilinx-gmiitorgmii-converter'

Kedareswara rao Appana says:

====================
net: phy: Add xilinx gmiitorgmii converter support

The Gigabit Media Independent Interface (GMII) to Reduced Gigabit Media
Independent Interface (RGMII) core provides the RGMII between RGMII-compliant
Ethernet physical media devices (PHY) and the Gigabit Ethernet controller.
This core can be used in all three modes of operation (10/100/1000 Mb/s).
The Management Data Input/Output (MDIO) interface is used to configure the
speed of operation. This core can switch dynamically between the three
different speed modes by configuring the converter register through an MDIO write.

The converter sits between the MAC and the external PHY as shown below:

MACB <==> GMII2RGMII <==> RGMII_PHY

        MDIO    <========> GMII2RGMII
MACB <=======>
                <========> RGMII

Using the MAC MDIO bus we can access both the converter and the external PHY.
We need to program the line speed of the converter at run time based
on the external PHY's negotiated speed.

This patch series does the below
---> Add mask for Control register 10Mbps speed.
---> Add support for xilinx gmiitorgmii converter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: phy: Add gmiitorgmii converter support
Appana Durga Kedareswara Rao [Wed, 10 Aug 2016 05:50:08 +0000 (11:20 +0530)]
net: phy: Add gmiitorgmii converter support

This patch adds support for the gmiitorgmii converter.

The GMII to RGMII IP core provides the Reduced Gigabit Media
Independent Interface (RGMII) between Ethernet physical media
devices and the Gigabit Ethernet controller. This core can
switch dynamically between the three different speed modes of
operation by configuring the converter register through an MDIO write.

The MDIO interface is used to set the operating speed of the Ethernet MAC.

This converter sits between the MAC and the external PHY:
MAC <==> GMII2RGMII <==> RGMII_PHY
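
A rough sketch of the speed programming step is shown below. It assumes the
converter's control register uses the standard MII control register speed
bits, whose values are redefined locally so the sketch is self-contained;
converter_ctrl_for_speed() is a hypothetical name and the actual MDIO
read/write calls are left out.

```c
#include <stdint.h>

/* Standard MII control register speed bits, redefined for this sketch. */
#define BMCR_SPEED10    0x0000
#define BMCR_SPEED100   0x2000
#define BMCR_SPEED1000  0x0040
#define BMCR_SPEED_BITS (BMCR_SPEED100 | BMCR_SPEED1000)

/*
 * Hypothetical helper: given the current control register value and the
 * speed negotiated by the external PHY (in Mb/s), return the value to
 * write back to the converter. Unknown speeds fall back to 10 Mb/s.
 */
static uint16_t converter_ctrl_for_speed(uint16_t ctrl, int speed_mbps)
{
    ctrl &= (uint16_t)~BMCR_SPEED_BITS;     /* clear both speed bits */

    switch (speed_mbps) {
    case 1000:
        ctrl |= BMCR_SPEED1000;
        break;
    case 100:
        ctrl |= BMCR_SPEED100;
        break;
    default:
        ctrl |= BMCR_SPEED10;               /* 10 Mb/s: no speed bit set */
        break;
    }
    return ctrl;
}
```

Since 10 Mb/s is encoded as both speed bits being clear, a named zero-valued
mask mainly makes the 10 Mb/s case explicit for the reader.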

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoDocumentation: DT: net: Add Xilinx gmiitorgmii converter device tree binding document...
Appana Durga Kedareswara Rao [Wed, 10 Aug 2016 05:50:07 +0000 (11:20 +0530)]
Documentation: DT: net: Add Xilinx gmiitorgmii converter device tree binding documentation

Device-tree binding documentation for xilinx gmiitorgmii converter.

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: Add mask for Control register 10Mbps speed
Appana Durga Kedareswara Rao [Wed, 10 Aug 2016 05:50:06 +0000 (11:20 +0530)]
net: Add mask for Control register 10Mbps speed

This patch adds mask for the Control register
10Mbps speed.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: renesas: sh_eth: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Tue, 9 Aug 2016 22:04:49 +0000 (00:04 +0200)]
net: ethernet: renesas: sh_eth: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: renesas: sh_eth: use phydev from struct net_device
Philippe Reynes [Tue, 9 Aug 2016 22:04:48 +0000 (00:04 +0200)]
net: ethernet: renesas: sh_eth: use phydev from struct net_device

The private structure contains a pointer to phydev, but the structure
net_device already contains such a pointer. So we can remove the pointer
phy_dev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosamples/bpf: fix bpf_perf_event_output prototype
Adam Barth [Wed, 10 Aug 2016 16:45:39 +0000 (09:45 -0700)]
samples/bpf: fix bpf_perf_event_output prototype

The commit 555c8a8623a3 ("bpf: avoid stack copy and use skb ctx for event output")
started using 20 of the initially reserved upper 32 bits of the 'flags' argument
in bpf_perf_event_output(). Adjust the corresponding prototype in samples/bpf/bpf_helpers.h.
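
A small sketch of why the width of the 'flags' parameter matters. The
mask values are defined locally and are assumed to match the uapi
definitions (index in the low 32 bits, a 20-bit length above it); the
helper names pack_flags() and through_32bit_param() are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: low 32 bits = event array index, next 20 bits = length. */
#define BPF_F_INDEX_MASK  0xffffffffULL
#define BPF_F_CTXLEN_MASK (0xfffffULL << 32)

/* Pack an index and a context length into one 64-bit flags word. */
static uint64_t pack_flags(uint32_t index, uint32_t ctx_len)
{
	return (((uint64_t)ctx_len << 32) & BPF_F_CTXLEN_MASK) |
	       ((uint64_t)index & BPF_F_INDEX_MASK);
}

/* What survives if a prototype declares the argument as 32 bits wide. */
static uint64_t through_32bit_param(uint64_t flags)
{
	return (uint32_t)flags;
}
```

With a 32-bit parameter the length bits are silently dropped and only
the index reaches the helper, which is the kind of mismatch a 64-bit
prototype avoids.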

Signed-off-by: Adam Barth <arb@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: macb: Add 64 bit addressing support for GEM
Harini Katakam [Tue, 9 Aug 2016 07:45:53 +0000 (13:15 +0530)]
net: macb: Add 64 bit addressing support for GEM

This patch adds support for 64 bit addressing and BDs.
-> Enable 64 bit addressing in DMACFG register.
-> Set DMA mask when design config register shows support for 64 bit addr.
-> Add new BD words for higher address when 64 bit DMA support is present.
-> Add and update TBQPH and RBQPH for MSB of BD pointers.
-> Change extraction and updating of buffer addresses to use
64 bit address.
-> In gem_rx extract address in one place instead of two and use a
separate flag for RXUSED.
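
The address handling listed above can be sketched as follows. The
struct bd64 layout, the BD_FLAG_MASK value and both helper names are
assumptions made for illustration; they are not taken from the driver.

```c
#include <stdint.h>

/*
 * Hypothetical buffer descriptor: the low address word shares its two
 * lowest bits with status flags, and the upper 32 address bits live in
 * an extra word that only exists when 64 bit DMA is supported.
 */
struct bd64 {
	uint32_t addr;   /* address bits 31:2 plus flag bits 1:0 */
	uint32_t ctrl;
	uint32_t addrh;  /* address bits 63:32 */
	uint32_t resvd;
};

#define BD_FLAG_MASK 0x3u /* assumed: bit 0 = used, bit 1 = wrap */

/* Store a 64 bit DMA address, keeping the flag bits separate. */
static void bd64_set_addr(struct bd64 *bd, uint64_t dma, uint32_t flags)
{
	bd->addr  = ((uint32_t)dma & ~BD_FLAG_MASK) | (flags & BD_FLAG_MASK);
	bd->addrh = (uint32_t)(dma >> 32);
}

/* Extract the full address in one place, with the flag bits masked off. */
static uint64_t bd64_get_addr(const struct bd64 *bd)
{
	return ((uint64_t)bd->addrh << 32) | (bd->addr & ~BD_FLAG_MASK);
}
```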

Signed-off-by: Harini Katakam <harinik@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqed*: Add support for ethtool link_ksettings callbacks.
Sudarsana Reddy Kalluru [Tue, 9 Aug 2016 07:51:23 +0000 (03:51 -0400)]
qed*: Add support for ethtool link_ksettings callbacks.

This patch adds the driver implementation for ethtool link_ksettings
callbacks. qed driver now defines/uses the qed specific masks for
representing link capability values. qede driver maps these values to
new link modes defined by the kernel implementation of link_ksettings.

Please consider applying this to 'net-next' branch.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'cpsw-refactor'
David S. Miller [Thu, 11 Aug 2016 00:27:41 +0000 (17:27 -0700)]
Merge branch 'cpsw-refactor'

Ivan Khoronzhuk says:

====================
net: ethernet: ti: cpsw: split driver data and per ndev data

In dual_emac mode the driver can handle 2 network devices. Each of them can use
its own private data and common data/resources. This patchset splits common driver
data/resources and private per net device data.
It leads to:
- reduced memory usage
- increased code readability
- a number of further simplifications
- prerequisites for adding multi-channel support,
  when channels are shared between net devices

It has no negative impact on performance.
v2: https://lkml.org/lkml/2016/8/6/108

Since v2:
- removed patch:
  net: ethernet: ti: cpsw: fix int dbg message
- replaced patch:
  "net: ethernet: ti: cpsw: remove redundant check in napi poll"
  on "net: ethernet: ti: cpsw: remove intr dbg msg from poll handlers"
- removed macro "cpsw_get_slave_ndev"
- corrected some commits

Since v1:
- added several patch improvements
- avoided variable reordering in structures
- removed static variable for common function
- split big patch on several patches:
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: move ale, cpts and drivers params under cpsw_common
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:44 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: move ale, cpts and drivers params under cpsw_common

The ale, cpts, version, rx_packet_max, bus_freq, interrupt pacing
parameters are common per net device that uses the same h/w. So,
move them to common driver structure.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: move napi struct to cpsw_common
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:43 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: move napi struct to cpsw_common

The napi structs are common for both net devices in dual_emac
mode. In order to not hold duplicate links to them, move them to
cpsw_common.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: move platform data and slaves info to cpsw_common
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:42 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: move platform data and slaves info to cpsw_common

These data are common for net devs in dual_emac mode. No need to hold
them for every priv instance, so move them under cpsw_common.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet; ethernet: ti: cpsw: move irq stuff under cpsw_common
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:41 +0000 (02:22 +0300)]
net; ethernet: ti: cpsw: move irq stuff under cpsw_common

The irq data are common for net devs in dual_emac mode. So no need to
hold these data in every priv struct, move them under cpsw_common.
Also delete irq_num var, as after optimization it's not needed.
Correct number of irqs to 2, as anyway, driver is using only 2,
at least for now.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: move cpdma resources to cpsw_common
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:40 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: move cpdma resources to cpsw_common

Every net device private struct holds links to shared cpdma resources.
There is no need to save these resources per net dev and synchronize
them every time. So, move them to the common driver struct.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: move links on h/w registers to cpsw_common
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:39 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: move links on h/w registers to cpsw_common

The pointers to h/w registers are common for every cpsw_private
instance, so no need to hold them for every ndev.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: replace pdev on dev
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:38 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: replace pdev on dev

No need to hold the pdev link when only dev is needed.
This allows simplifying a bunch of cpsw->pdev->dev uses now and further on.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: create common struct to hold shared driver data
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:37 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: create common struct to hold shared driver data

This patch simply creates a holder for common data and, as a start, moves
the pdev var to it.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: don't check slave num in runtime
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:36 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: don't check slave num in runtime

No need to check const slave num in runtime for every packet,
and ndev for slaves w/o ndev is anyway NULL. So remove redundant
check and macro.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: remove clk var from priv
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:35 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: remove clk var from priv

There is no need to hold a link to clk, it's used only once
during probe.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: remove priv from cpsw_get_slave_port() parameters list
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:34 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: remove priv from cpsw_get_slave_port() parameters list

There is no need for priv here.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: remove intr dbg msg from poll handlers
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:33 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: remove intr dbg msg from poll handlers

In the poll handler there is no way to figure out which network device
is handling packets, as cpdma channels are common for both network
devices in dual_emac mode. Currently, the messages are printed only
for one device when, in fact, there are two. This print msg is incorrect
and does not seem very useful, so drop it from the poll handlers.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ethernet: ti: cpsw: simplify submit routine
Ivan Khoronzhuk [Tue, 9 Aug 2016 23:22:32 +0000 (02:22 +0300)]
net: ethernet: ti: cpsw: simplify submit routine

As the second net dev is created only in dual_emac mode, the port
number can be figured out in a simpler way. Also, there is no need to
pass the redundant ndev struct.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agorps: Inspect PPTP encapsulated by GRE to get flow hash
Gao Feng [Tue, 9 Aug 2016 04:38:24 +0000 (12:38 +0800)]
rps: Inspect PPTP encapsulated by GRE to get flow hash

PPTP is encapsulated in a GRE header whose GRE_VERSION bits must
contain one. But the current GRE RPS code requires GRE_VERSION to be
zero. So RPS does not work for PPTP traffic.

In my test environment, there are four MIPS cores, and all traffic
passes through PPTP. As a result, only one core is 100% busy
while the other three cores are nearly idle. After this patch, the load
is well balanced across the four cores.
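
A minimal sketch of the version check, written as standalone C rather
than as the flow dissector change itself. The function name
pptp_gre_call_id() is hypothetical, the constants are defined locally
following the enhanced GRE header layout used by PPTP, and the choice
of the call ID as the per-flow value is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define GRE_VERSION_MASK 0x0007u /* low three bits of the flags word */
#define GRE_KEY_FLAG     0x2000u /* K bit: key field present */
#define GRE_PROTO_PPP    0x880Bu /* protocol type carried by PPTP */

/*
 * Return 1 and store the call ID if the buffer starts with a version 1
 * GRE header carrying PPP with a key field, otherwise return 0.
 */
static int pptp_gre_call_id(const uint8_t *p, size_t len, uint16_t *call_id)
{
	uint16_t flags, proto;

	if (len < 8)
		return 0;

	flags = (uint16_t)(p[0] << 8 | p[1]);
	proto = (uint16_t)(p[2] << 8 | p[3]);

	if ((flags & GRE_VERSION_MASK) != 1)
		return 0; /* version 0 is plain GRE, handled elsewhere */
	if (proto != GRE_PROTO_PPP || !(flags & GRE_KEY_FLAG))
		return 0;

	/* key field: payload length in bytes 4-5, call ID in bytes 6-7 */
	*call_id = (uint16_t)(p[6] << 8 | p[7]);
	return 1;
}
```

A value such as the call ID differs per PPTP session, so feeding it
into the flow hash lets different sessions land on different cores.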

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'qdisc-hashtable'
David S. Miller [Thu, 11 Aug 2016 00:19:07 +0000 (17:19 -0700)]
Merge branch 'qdisc-hashtable'

Jiri Kosina says:

====================
Convert qdisc linked list into a hashtable

This is a respin of the v6 of the original patch [1], split into two-patch
series as requested by davem; first patch fixes all symbol conflicts
that'd happen once netdevice.h starts to include hashtable.h, the second
one performs the actual switch to hashtable.

I've preserved Cong's Reviewed-by:, as code-wise this series is identical
to the original v6 of the patch.

[1] lkml.kernel.org/r/alpine.LNX.2.00.1608011220580.22028@cbobk.fhfr.pm
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: sched: convert qdisc linked list to hashtable
Jiri Kosina [Wed, 10 Aug 2016 09:05:15 +0000 (11:05 +0200)]
net: sched: convert qdisc linked list to hashtable

Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.

The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.
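
To illustrate why a hashtable makes the lookup cheap, here is a
userspace sketch of a small chained hash table keyed by a 32-bit
handle. It does not use the kernel's hashtable.h; the struct names,
the bucket count and the hash function are all assumptions made for
this example.

```c
#include <stddef.h>
#include <stdint.h>

#define QHASH_BITS 4
#define QHASH_SIZE (1u << QHASH_BITS)

struct qdisc_entry {              /* hypothetical stand-in for a qdisc */
	uint32_t handle;
	struct qdisc_entry *next; /* chain within one bucket */
};

struct qdisc_table {
	struct qdisc_entry *bucket[QHASH_SIZE];
};

/* Multiplicative hash that keeps the top QHASH_BITS bits. */
static unsigned int qhash(uint32_t handle)
{
	return (uint32_t)(handle * 0x9E3779B1u) >> (32 - QHASH_BITS);
}

static void qtable_add(struct qdisc_table *t, struct qdisc_entry *q)
{
	unsigned int b = qhash(q->handle);

	q->next = t->bucket[b];
	t->bucket[b] = q;
}

/* Only one bucket is walked instead of the whole per-device list. */
static struct qdisc_entry *qtable_lookup(const struct qdisc_table *t,
					 uint32_t handle)
{
	struct qdisc_entry *q;

	for (q = t->bucket[qhash(handle)]; q; q = q->next)
		if (q->handle == handle)
			return q;
	return NULL;
}
```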

Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: resolve symbol conflicts with generic hashtable.h
Jiri Kosina [Wed, 10 Aug 2016 09:03:35 +0000 (11:03 +0200)]
net: resolve symbol conflicts with generic hashtable.h

This is a preparatory patch for converting qdisc linked list into a
hashtable. As we'll need to include hashtable.h in netdevice.h, we first
have to make sure that this will not introduce symbol conflicts for any of
the netdevice.h users.

Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoravb: use proper names for suspend/resume functions
Niklas Söderlund [Wed, 10 Aug 2016 11:09:49 +0000 (13:09 +0200)]
ravb: use proper names for suspend/resume functions

The patch 'ravb: add sleep PM suspend/resume support' used incorrect
function names containing 'runtime' for the suspend and resume
functions.

Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ipconfig: fix use after free
Uwe Kleine-König [Wed, 10 Aug 2016 09:44:17 +0000 (11:44 +0200)]
net: ipconfig: fix use after free

ic_close_devs() calls kfree() for every device's ic_device. Since commit
2647cffb2bc6 ("net: ipconfig: Support using "delayed" DHCP replies")
the active device's ic_device is still used, however, to print the
ipconfig summary, which results in an oops if the memory has already
been changed. So delay freeing until after the autoconfig results are
reported.

Fixes: 2647cffb2bc6 ("net: ipconfig: Support using "delayed" DHCP replies")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoravb: add sleep PM suspend/resume support
Niklas Söderlund [Wed, 3 Aug 2016 13:56:47 +0000 (15:56 +0200)]
ravb: add sleep PM suspend/resume support

The interface would not function after the system had been woken up
from a suspend (echo mem > /sys/power/state) cycle. The
reason for this is that all device registers have been reset to their
default values. This patch adds sleep suspend and resume functions that
detach the interface at suspend, and restore the registers and reattach
the interface at resume.

Only the registers that are only configured at probe time need to be
explicitly restored by the resume handler. All other registers are
reconfigured by either reopening the device in the resume handler (if
the device was running when the system was suspended) or when the
interface is opened by a user at a later time.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: b53: constify b53_io_ops structures
Julia Lawall [Tue, 9 Aug 2016 17:09:45 +0000 (19:09 +0200)]
net: dsa: b53: constify b53_io_ops structures

The b53_io_ops structures are never modified, so declare them as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: Remove fib_local variable
David Ahern [Tue, 9 Aug 2016 13:51:06 +0000 (06:51 -0700)]
net: Remove fib_local variable

After commit 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse")
fib_local is set but not used. Remove it.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoppp: build ifname using unit identifier for rtnl based devices
Guillaume Nault [Tue, 9 Aug 2016 13:12:26 +0000 (15:12 +0200)]
ppp: build ifname using unit identifier for rtnl based devices

Userspace programs generally need to know the name of the ppp devices
they create. Both ioctl and rtnl interfaces use the ppp<suffix> scheme
to name them. But although the suffix used by the ioctl interface can
be known by userspace (it's the PPP unit identifier returned by the
PPPIOCGUNIT ioctl), the one used by the rtnl is only known by the
kernel.

This patch brings more consistency between ioctl and rtnl based ppp
devices by generating device names using the PPP unit identifier as
suffix in both cases. This way, userspace can always infer the name of
the devices it creates.
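
The naming rule can be sketched in userspace as below. The helper name
ppp_ifname() is hypothetical, and IFNAMSIZ is defined locally with the
usual interface name buffer size rather than taken from a system
header.

```c
#include <stdio.h>

#define IFNAMSIZ 16 /* usual interface name buffer size, defined locally */

/*
 * Build "ppp<unit>" into name. Returns 0 on success, -1 if the unit is
 * negative or the resulting name does not fit into the buffer.
 */
static int ppp_ifname(char name[IFNAMSIZ], int unit)
{
	int n;

	if (unit < 0)
		return -1;

	n = snprintf(name, IFNAMSIZ, "ppp%d", unit);
	return (n > 0 && n < IFNAMSIZ) ? 0 : -1;
}
```

Since the unit identifier is the only variable part, a program that
knows the unit can compute the interface name without asking the kernel.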

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobatman-adv: Fix consistency of update route messages
Sven Eckelmann [Wed, 29 Jun 2016 21:45:57 +0000 (23:45 +0200)]
batman-adv: Fix consistency of update route messages

The debug messages of _batadv_update_route were printed before the actual
route change is done. At this point it is not really known which
curr_router will be replaced. Thus the messages could print the wrong
operation.

Printing the debug messages after the operation was done avoids this
problem.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Use bitwise instead of arithmetic operator for flags
Linus Lüssing [Mon, 11 Jul 2016 09:16:36 +0000 (11:16 +0200)]
batman-adv: Use bitwise instead of arithmetic operator for flags

This silences the following coccinelle warning:

"WARNING: sum of probable bitmasks, consider |"
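
A short illustration of why the warning suggests '|'. The flag values
and helper names below are hypothetical. Both operators agree while
the operands share no bits, but they diverge as soon as a bit is
present on both sides.

```c
#include <stdint.h>

#define FLAG_A  0x01u /* hypothetical flag bits */
#define FLAG_B  0x02u
#define FLAG_AB (FLAG_A | FLAG_B)

/* Arithmetic combination: carries into other bits when flags overlap. */
static uint8_t combine_add(uint8_t x, uint8_t y)
{
	return (uint8_t)(x + y);
}

/* Bitwise combination: setting an already-set flag changes nothing. */
static uint8_t combine_or(uint8_t x, uint8_t y)
{
	return (uint8_t)(x | y);
}
```

Combining FLAG_AB with FLAG_A gives 0x03 with '|' but 0x04 with '+',
which no longer contains either flag.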

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Remove orig_node reference handling from send_skb_unicast
Sven Eckelmann [Mon, 27 Jun 2016 06:15:42 +0000 (08:15 +0200)]
batman-adv: Remove orig_node reference handling from send_skb_unicast

The function batadv_send_skb_unicast is not acquiring a reference for an
orig_node nor removing it from any datastructure. It still reduces the
reference counter for an object which is still in the hands of the caller.

This is confusing and can lead to problems in the reference handling
of the caller function in the future.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: use kmem_cache for translation table
Sven Eckelmann [Sat, 25 Jun 2016 14:44:06 +0000 (16:44 +0200)]
batman-adv: use kmem_cache for translation table

The translation table (global, local) is usually the part of batman-adv
which has the most dynamically allocated objects. Most of them
(tt_local_entry, tt_global_entry, tt_orig_list_entry, tt_change_node,
tt_req_node, tt_roam_node) are equally sized. So it makes sense to have
them allocated from a kmem_cache for each type.

This approach allowed a small wireless router (TP-Link TL-841NDv8; SLUB
allocator) to store 34% more translation table entries compared to the
current implementation.

[1] https://open-mesh.org/projects/batman-adv/wiki/Kmalloc-kmem-cache-tests

Reported-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Introduce forward packet creation helper
Linus Lüssing [Mon, 20 Jun 2016 19:39:54 +0000 (21:39 +0200)]
batman-adv: Introduce forward packet creation helper

This patch abstracts the forward packet creation into the new function
batadv_forw_packet_alloc().

The queue counting and interface reference counters are now handled
internally within batadv_forw_packet_alloc() and its
batadv_forw_packet_free() counterpart. This should reduce the risk of
having reference/queue counting bugs again and should increase
code readability.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: fix boolreturn.cocci warnings
kbuild test robot [Wed, 6 Jul 2016 02:49:29 +0000 (10:49 +0800)]
batman-adv: fix boolreturn.cocci warnings

net/batman-adv/bridge_loop_avoidance.c:1105:9-10: WARNING: return of 0/1 in function 'batadv_bla_process_claim' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci
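
For reference, the pattern the script asks for looks like this. The
function below is a hypothetical stand-in, not the code from
bridge_loop_avoidance.c.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate with return type bool. The flagged form would
 * end in "return 0;" and "return 1;" instead of false and true.
 */
static bool claim_is_valid(int type, int expected)
{
	if (type != expected)
		return false;

	return true;
}
```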

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: iv_ogm, Reduce code duplication
Markus Pargmann [Sun, 3 Jul 2016 09:07:14 +0000 (11:07 +0200)]
batman-adv: iv_ogm, Reduce code duplication

The difference between tq1 and tq2 is calculated the same way in two
separate functions.

This patch moves the common code to a separate function
'batadv_iv_ogm_neigh_diff' which handles everything necessary. The other
two functions can then handle errors and use the difference directly.
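
The shape of this refactoring can be sketched as follows. All names
and the threshold parameter are hypothetical; only the structure (one
shared helper returning bool plus the difference, two thin callers)
follows the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct neigh_info {
	uint8_t tq_avg; /* hypothetical per-neighbour quality value */
};

/* Shared helper: compute tq1 - tq2, return false if an input is missing. */
static bool neigh_diff(const struct neigh_info *n1,
		       const struct neigh_info *n2, int *diff)
{
	*diff = 0;
	if (!n1 || !n2)
		return false;

	*diff = (int)n1->tq_avg - (int)n2->tq_avg;
	return true;
}

/* Caller 1: comparison result; errors are treated as "equal". */
static int neigh_cmp(const struct neigh_info *n1,
		     const struct neigh_info *n2)
{
	int diff;

	neigh_diff(n1, n2, &diff);
	return diff;
}

/* Caller 2: is n1 similar to or better than n2, within a threshold? */
static bool neigh_is_similar_or_better(const struct neigh_info *n1,
				       const struct neigh_info *n2,
				       int threshold)
{
	int diff;

	if (!neigh_diff(n1, n2, &diff))
		return false;
	return diff > -threshold;
}
```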

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
[sven@narfation.org: rebased on current version, initialize return variable
in batadv_iv_ogm_neigh_diff, add kerneldoc, convert to bool return type]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: disable sysfs knobs when GW-mode is not implemented
Antonio Quartulli [Sun, 3 Jul 2016 10:46:35 +0000 (12:46 +0200)]
batman-adv: disable sysfs knobs when GW-mode is not implemented

Now that the GW-mode code is algorithm specific, batman-adv expects the
routing algorithm to implement some APIs to make it work.

However, such APIs are not mandatory, therefore we might have algorithms
not providing them. In this case all the sysfs knobs related to GW-mode
should be deactivated to make sure that settings injected by the user
for this feature are rejected.

Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: B.A.T.M.A.N. V - implement GW selection logic
Antonio Quartulli [Sun, 3 Jul 2016 10:46:34 +0000 (12:46 +0200)]
batman-adv: B.A.T.M.A.N. V - implement GW selection logic

Since the GW selection logic has been made routing protocol specific
it is now possible for B.A.T.M.A.N. V to have its own mechanism by
providing the API implementation.

Implement the GW specific API in the B.A.T.M.A.N. V protocol in
order to provide a working GW selection mechanism.

Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: make GW election code protocol specific
Antonio Quartulli [Sun, 3 Jul 2016 10:46:33 +0000 (12:46 +0200)]
batman-adv: make GW election code protocol specific

Each routing protocol may have its own specific logic about
gateway election which is potentially based on the metric being
used.

Create two GW specific API functions and move the current election
logic into the B.A.T.M.A.N. IV specific code.

Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: make the GW selection class algorithm specific
Antonio Quartulli [Sun, 3 Jul 2016 10:46:32 +0000 (12:46 +0200)]
batman-adv: make the GW selection class algorithm specific

The B.A.T.M.A.N. V algorithm uses a different metric compared to its
predecessor and for this reason the logic used to compute the best
Gateway is also changed. This means that the GW selection class
fed to this logic has semantics that depend on the algorithm being
used.

Make the parsing and printing routine of the GW selection class
routing algorithm specific. Each algorithm can now parse (and print)
this value independently.

If no API is provided by any algorithm, the default is to use the
current mechanism of treating this value as an integer between
1 and 255.
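
The default behaviour can be sketched in userspace C as below. The
function name parse_sel_class() is hypothetical and strtoul() stands
in for whatever parsing helper the kernel code uses.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a selection class given as a decimal integer between 1 and 255.
 * A single trailing newline is tolerated, as is common for sysfs input.
 * Returns 0 and stores the value on success, -1 on invalid input.
 */
static int parse_sel_class(const char *s, unsigned char *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno || end == s)
		return -1;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -1;
	if (v < 1 || v > 255)
		return -1;

	*out = (unsigned char)v;
	return 0;
}
```

An algorithm that provides its own parser can interpret the same
string differently, which is the point of making the routine algorithm
specific.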

Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Remove unused primary_if and bat_priv variables
Linus Lüssing [Tue, 14 Jun 2016 20:56:50 +0000 (22:56 +0200)]
batman-adv: Remove unused primary_if and bat_priv variables

Fixes: ef0a937f7a14 ("batman-adv: consider outgoing interface in OGM sending")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Avoid sysfs name collision for netns moves
Sven Eckelmann [Mon, 13 Jun 2016 05:41:32 +0000 (07:41 +0200)]
batman-adv: Avoid sysfs name collision for netns moves

The kobject_put is only removing the sysfs entry and corresponding entries
when its reference counter becomes zero. This tends to lead to collisions
when a device is moved between two different network namespaces because
some of the sysfs files have to be removed first and then added again to
the already moved sysfs entry.

    WARNING: CPU: 0 PID: 290 at lib/kobject.c:240 kobject_add_internal+0x5ec/0x8a0
    kobject_add_internal failed for batman_adv with -EEXIST, don't try to register things with the same name in the same directory.

But the caller of kobject_put can already remove the sysfs entry before it
does the kobject_put. This removal is done even when the reference counter
is not yet zero and thus avoids the problem.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Revert "postpone sysfs removal when unregistering"
Sven Eckelmann [Mon, 13 Jun 2016 05:41:31 +0000 (07:41 +0200)]
batman-adv: Revert "postpone sysfs removal when unregistering"

Postponing the removal of the interface breaks the expected behavior of
NETDEV_UNREGISTER and NETDEV_PRE_TYPE_CHANGE. This is especially
problematic when an interface is removed and added in quick succession.

This reverts commit 5bc44dc8458c ("batman-adv: postpone sysfs removal when
unregistering").

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Modify mesh_iface outside sysfs context
Sven Eckelmann [Mon, 13 Jun 2016 05:41:30 +0000 (07:41 +0200)]
batman-adv: Modify mesh_iface outside sysfs context

The legacy sysfs interface to modify interfaces belonging to batman-adv
is run inside a region holding s_lock. And to add a net_device, it has
to also get the rtnl_lock. This is exactly the other way around than in
other virtual net_devices and conflicts with netdevice notifier which
executes inside rtnl_lock.

The inverted lock situation is currently solved by executing the removal
of netdevices via workqueue. The workqueue isn't executed inside
rtnl_lock and thus can independently get the s_lock and the rtnl_lock.

But this workaround fails when the netdevice notifier creates events in
quick succession and the earlier triggered removal of a net_device isn't
processed in the workqueue before the adding of the new netdevice (with
same name) event is issued.

Instead the legacy sysfs interface store events have to be enqueued in
a workqueue to lose the s_lock. The worker is then free to get the
required locks and the deadlock is avoided.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Use rtnl link in device creation example
Sven Eckelmann [Fri, 10 Jun 2016 21:00:56 +0000 (23:00 +0200)]
batman-adv: Use rtnl link in device creation example

The standard kernel API to add new virtual interfaces and attach other
interfaces to it is rtnl-link. batman-adv supports it since v3.10. This
functionality should be used instead of the legacy batman-adv-only sysfs
interface.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Define module rtnl link name
Sven Eckelmann [Fri, 10 Jun 2016 21:00:55 +0000 (23:00 +0200)]
batman-adv: Define module rtnl link name

The batman-adv module can automatically be loaded when operations over the
rtnl link are triggered. This requires only the correct rtnl link name in
the module header.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
8 years agobatman-adv: Document optional batadv_algo_ops
Sven Eckelmann [Tue, 7 Jun 2016 20:44:53 +0000 (22:44 +0200)]
batman-adv: Document optional batadv_algo_ops

Some operations in batadv_algo_ops are optional and marked as such in the
kerneldoc. But some of them miss the "(optional)" in their kerneldoc. These
have to also be marked to give an implementor of an algorithm the correct
background information without looking in the code calling these function
pointers.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>