platform/kernel/linux-exynos.git
7 years agoIB/ipoib: Enable ioctl for to IPoIB rdma netdevs
Feras Daoud [Wed, 23 Aug 2017 05:37:21 +0000 (08:37 +0300)]
IB/ipoib: Enable ioctl for to IPoIB rdma netdevs

Adds support for ioctl callback in the RDMA netdevs to allow
supporting functions not handled by the generic interface code.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoMerge branch 'k.o/for-4.13-rc' into k.o/for-next
Doug Ledford [Thu, 24 Aug 2017 19:58:26 +0000 (15:58 -0400)]
Merge branch 'k.o/for-4.13-rc' into k.o/for-next

Pick up -rc fixes.

Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/mlx5: Always return success for RoCE modify port
Majd Dibbiny [Wed, 23 Aug 2017 05:35:42 +0000 (08:35 +0300)]
IB/mlx5: Always return success for RoCE modify port

CM layer calls ib_modify_port() regardless of the link layer.

For the Ethernet ports, qkey violation and Port capabilities
are meaningless. Therefore, always return success for ib_modify_port
calls on the Ethernet ports.

Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/mlx5: Fix Raw Packet QP event handler assignment
Majd Dibbiny [Wed, 23 Aug 2017 05:35:41 +0000 (08:35 +0300)]
IB/mlx5: Fix Raw Packet QP event handler assignment

In case we have SQ and RQ for Raw Packet QP, the SQ's event handler
wasn't assigned.

Fixing this by assigning event handler for each WQ after creation.

[ 1877.145243] Call Trace:
[ 1877.148644] <IRQ>
[ 1877.150580] [<ffffffffa07987c5>] ? mlx5_rsc_event+0x105/0x210 [mlx5_core]
[ 1877.159581] [<ffffffffa0795bd7>] ? mlx5_cq_event+0x57/0xd0 [mlx5_core]
[ 1877.167137] [<ffffffffa079208e>] mlx5_eq_int+0x53e/0x6c0 [mlx5_core]
[ 1877.174526] [<ffffffff8101a679>] ? sched_clock+0x9/0x10
[ 1877.180753] [<ffffffff810f717e>] handle_irq_event_percpu+0x3e/0x1e0
[ 1877.188014] [<ffffffff810f735d>] handle_irq_event+0x3d/0x60
[ 1877.194567] [<ffffffff810f9fe7>] handle_edge_irq+0x77/0x130
[ 1877.201129] [<ffffffff81014c3f>] handle_irq+0xbf/0x150
[ 1877.207244] [<ffffffff815ed78a>] ? atomic_notifier_call_chain+0x1a/0x20
[ 1877.214829] [<ffffffff815f434f>] do_IRQ+0x4f/0xf0
[ 1877.220498] [<ffffffff815e94ad>] common_interrupt+0x6d/0x6d
[ 1877.227025] <EOI>
[ 1877.228967] [<ffffffff814834e2>] ? cpuidle_enter_state+0x52/0xc0
[ 1877.236990] [<ffffffff81483615>] cpuidle_idle_call+0xc5/0x200
[ 1877.243676] [<ffffffff8101bc7e>] arch_cpu_idle+0xe/0x30
[ 1877.249831] [<ffffffff810b4725>] cpu_startup_entry+0xf5/0x290
[ 1877.256513] [<ffffffff815cfee1>] start_secondary+0x265/0x27b
[ 1877.263111] Code: Bad RIP value.
[ 1877.267296] RIP [< (null)>] (null)
[ 1877.273264] RSP <ffff88046fd63df8>
[ 1877.277531] CR2: 0000000000000000

Fixes: 19098df2da78 ("IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/core: Avoid accessing non-allocated memory when inferring port type
Noa Osherovich [Wed, 23 Aug 2017 05:35:40 +0000 (08:35 +0300)]
IB/core: Avoid accessing non-allocated memory when inferring port type

Commit 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
introduced the concept of type in ah_attr:
 * During ib_register_device, each port is checked for its type which
   is stored in ib_device's port_immutable array.
 * During uverbs' modify_qp, the type is inferred using the port number
   in ib_uverbs_qp_dest struct (address vector) by accessing the
   relevant port_immutable array and the type is passed on to
   providers.

IB spec (version 1.3) enforces a valid port value only in Reset to
Init. During Init to RTR, the address vector must be valid but port
number is not mentioned as a field in the address vector, so its
value is not validated, which leads to accesses to a non-allocated
memory when inferring the port type.

Save the real port number in ib_qp during modify to Init (when the
comp_mask indicates that the port number is valid) and use this value
to infer the port type.

Avoid copying the address vector fields if the matching bit is not set
in the attr_mask. Address vector can't be modified before the port, so
no valid flow is affected.

Fixes: 44c58487d51a ('IB/core: Define 'ib' and 'roce' rdma_ah_attr types')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agordma: Autoload netlink client modules
Jason Gunthorpe [Mon, 14 Aug 2017 20:57:39 +0000 (14:57 -0600)]
rdma: Autoload netlink client modules

If a message comes in and we do not have the client in the table, then
try to load the module supplying that client using MODULE_ALIAS to find
it.

This duplicates the scheme seen in other netlink muxes (eg nfnetlink).

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agordma: Allow demand loading of NETLINK_RDMA
Jason Gunthorpe [Mon, 14 Aug 2017 20:57:38 +0000 (14:57 -0600)]
rdma: Allow demand loading of NETLINK_RDMA

Provide a module alias so that if userspace opens a netlink
socket for RDMA the kernel support is loaded automatically.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/mlx4: use kvmalloc_array to allocate wrid
Li Dongyang [Wed, 16 Aug 2017 13:31:23 +0000 (23:31 +1000)]
IB/mlx4: use kvmalloc_array to allocate wrid

We could use kvmalloc_array instead of the
kmalloc and __vmalloc combination.
After this we don't need to include linux/vmalloc.h

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/mlx5: use kvmalloc_array for mlx5_ib_wq
Li Dongyang [Wed, 16 Aug 2017 13:31:22 +0000 (23:31 +1000)]
IB/mlx5: use kvmalloc_array for mlx5_ib_wq

We observed multiple times on our Lustre OSS servers that when
the system memory is fragmented, kmalloc() in create_kernel_qp()
could fail order 4/5 allocations while we still have many free pages.

Switch to kvmalloc_array() to allow the operation to contine.

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA: Fix return value check for ib_get_eth_speed()
Selvin Xavier [Thu, 17 Aug 2017 14:58:07 +0000 (07:58 -0700)]
RDMA: Fix return value check for ib_get_eth_speed()

ib_get_eth_speed() return 0 on success. Fixing the condition checking
and prevent reporting failure for query_port verb.

Fixes: d41861942fc5 ("Add generic function to extract IB speed from netdev")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/pvrdma: Remove unused function
Yuval Shaia [Thu, 10 Aug 2017 21:12:11 +0000 (00:12 +0300)]
IB/pvrdma: Remove unused function

The function pvrdma_idx_ring_is_valid_idx is not in used so let's remove
it.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Improve CQP timeout logic
Shiraz Saleem [Wed, 9 Aug 2017 01:38:45 +0000 (20:38 -0500)]
i40iw: Improve CQP timeout logic

The current timeout logic for Control Queue-Pair (CQP) OPs
does not take into account whether CQP makes progress but
rather blindly waits for a large timeout value, 100000 jiffies
for the completion event. Improve this by setting the timeout
based on whether the CQP is making progress or not. If the CQP
is hung, the timeout will happen sooner, in 5000 jiffies. Each
time the CQP progress is detetcted, the timeout extends by 5000
jiffies.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add kernel receive context info to debugfs
Kaike Wan [Sun, 13 Aug 2017 15:09:04 +0000 (08:09 -0700)]
IB/hfi1: Add kernel receive context info to debugfs

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Remove HFI1_VERBS_31BIT_PSN option
Grzegorz Morys [Sun, 13 Aug 2017 15:08:58 +0000 (08:08 -0700)]
IB/hfi1: Remove HFI1_VERBS_31BIT_PSN option

Remove HFI1_VERBS_31BIT_PSN Kconfig option leaving only 31-bit PSNs
available. The option was implemented in the early days of the driver
and is no longer needed.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Grzegorz Morys <grzegorz.morys@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Remove pstate from hfi1_pportdata
Jakub Byczkowski [Sun, 13 Aug 2017 15:08:52 +0000 (08:08 -0700)]
IB/hfi1: Remove pstate from hfi1_pportdata

Do not track physical state separately from host_link_state.
Deduce physical state from host_link_state when required.
Change cache_physical_state to log_physical_state to make
sure host_link_state reflects hardwares physical state properly.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Stricter bounds checking of MAD trap index
Kamenee Arumugame [Sun, 13 Aug 2017 15:08:46 +0000 (08:08 -0700)]
IB/hfi1: Stricter bounds checking of MAD trap index

The macro size is valid. This change makes it less ambiguous.
Bounds check trap type for better security.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Load fallback platform configuration per HFI device
Jakub Byczkowski [Sun, 13 Aug 2017 15:08:40 +0000 (08:08 -0700)]
IB/hfi1: Load fallback platform configuration per HFI device

Currently fallback configuration is loaded once per driver instance.
With multiple HFI devices in the same system the current code may not
load the platform config data for the device. Change fallback platform
config data loading to be per device.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add flag for platform config scratch register read
Jakub Byczkowski [Sun, 13 Aug 2017 15:08:34 +0000 (08:08 -0700)]
IB/hfi1: Add flag for platform config scratch register read

Add flag in pport data structure to determine when platform config was
read from scratch registers. Change conditions in parse_platform_config
and get_platform_config_field to use the new flag.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Document phys port state bits not used in IB
Dennis Dalessandro [Sun, 13 Aug 2017 15:08:28 +0000 (08:08 -0700)]
IB/hfi1: Document phys port state bits not used in IB

A couple bits are used by OPA for link physical state that are not present
as part of InfiniBand. Add a short blurb what those states mean and removed
an unused state.

Cc: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Todd Rimmer <todd.rimmer@intel.com>
Reviewed-by: Brent Rothermel <brent.r.rothermel@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Check xchg returned value for queuing link down entry
Sebastian Sanchez [Sun, 13 Aug 2017 15:08:22 +0000 (08:08 -0700)]
IB/hfi1: Check xchg returned value for queuing link down entry

Check xchg returned value for queuing link down entry
to guarantee proper atomic value reads.

Fixes: 626c077c025f ("IB/hfi1: Prevent link down request double queuing")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: fix spelling mistake: "Maximim" -> "Maximum"
Colin Ian King [Sat, 5 Aug 2017 13:11:50 +0000 (14:11 +0100)]
IB/hfi1: fix spelling mistake: "Maximim" -> "Maximum"

Trivial fix to spelling mistake in pr_warn_ratelimited warning message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs
Dasaratharaman Chandramouli [Fri, 4 Aug 2017 20:54:53 +0000 (13:54 -0700)]
IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs

Enabling this bit helps core components query for extended address
support using the rdma_cap_opa_ah interface.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Enhance PIO/SDMA send for 16B
Don Hiatt [Fri, 4 Aug 2017 20:54:47 +0000 (13:54 -0700)]
IB/hfi1: Enhance PIO/SDMA send for 16B

PIO/SDMA send logic now uses the hdr_type field to determine
the type of packet that has been constructed. Based on the hdr_type,
certain things such as PBC flags, padding count and the LT extra
trailing bytes are determined.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add 16B RC/UC support
Don Hiatt [Fri, 4 Aug 2017 20:54:41 +0000 (13:54 -0700)]
IB/hfi1: Add 16B RC/UC support

Add 16B bypass packet support for RC/UC traffic types.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids
Dasaratharaman Chandramouli [Fri, 4 Aug 2017 20:54:35 +0000 (13:54 -0700)]
IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids

Increase lid used in hfi1 driver to 32 bits. qib continues
to use 16 bit lids.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add 16B trace support
Don Hiatt [Fri, 4 Aug 2017 20:54:29 +0000 (13:54 -0700)]
IB/hfi1: Add 16B trace support

Add trace support to 16B bypass packets during send and
receive.

Sample input header trace:
<idle>-0     [000] d.h. 271742.509477: input_ibhdr: [0000:05:00.0] (16B)
len:24 sc:0 dlid:0xf0000b slid:0x10002 age:0 becn:0 fecn:0 l4:10 rc:0
sc:0 pkey:0x8001 entropy:0x0000 op:0x65,UD_SEND_ONLY_WITH_IMMEDIATE se:0
m:1 pad:3 tver:0 qpn:0xffffff a:0 psn:0x00000001 hlen:248 deth qkey
0x01234567 sqpn 0x000004

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add 16B UD support
Don Hiatt [Fri, 4 Aug 2017 20:54:23 +0000 (13:54 -0700)]
IB/hfi1: Add 16B UD support

Add 16B bypass packet support for UD traffic types.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Determine 9B/16B L2 header type based on Address handle
Don Hiatt [Fri, 4 Aug 2017 20:54:16 +0000 (13:54 -0700)]
IB/hfi1: Determine 9B/16B L2 header type based on Address handle

When address handle attributes are initialized, the LIDs are
transformed to be in the 32 bit LID space.
When constructing the header, hfi1 driver will look at the LID
to determine the packet header to be created.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add support to process 16B header errors
Don Hiatt [Fri, 4 Aug 2017 20:54:10 +0000 (13:54 -0700)]
IB/hfi1: Add support to process 16B header errors

Enhance hdr_rcverr() to also handle errors during
16B bypass packet receive.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add support to send 16B bypass packets
Don Hiatt [Fri, 4 Aug 2017 20:54:04 +0000 (13:54 -0700)]
IB/hfi1: Add support to send 16B bypass packets

We introduce struct hfi1_opa_header as a union
of ib (9B) and 16B headers.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Add support to receive 16B bypass packets
Don Hiatt [Fri, 4 Aug 2017 20:53:58 +0000 (13:53 -0700)]
IB/hfi1: Add support to receive 16B bypass packets

We introduce a struct hfi1_16b_header to support 16B headers.
16B bypass packets are received by the driver and processed
similar to 9B packets. Add basic support to handle 16B packets.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs
Don Hiatt [Fri, 4 Aug 2017 20:53:51 +0000 (13:53 -0700)]
IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs

rvt_check_ah() delegates lid verification to underlying
driver. Underlying driver uses different conditions to
check for dlid depending on whether the device supports
extended LIDs

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hf1: User context locking is inconsistent
Michael J. Ruhl [Fri, 4 Aug 2017 20:52:44 +0000 (13:52 -0700)]
IB/hf1: User context locking is inconsistent

There is a mixture of mutex and spinlocks to protect receive context
(rcd/uctxt) information.  This is not used consistently.

Use the mutex to protect device receive context information only.
Use the spinlock to protect sub context information only.

Protect access to items in the rcd array with a spinlock and
reference count.

Remove spinlock around dd->rcd array cleanup.  Since interrupts are
disabled and cleaned up before this point, this lock is not useful.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Protect context array set/clear with spinlock
Michael J. Ruhl [Fri, 4 Aug 2017 20:52:38 +0000 (13:52 -0700)]
IB/hfi1: Protect context array set/clear with spinlock

The rcd array can be accessed from user context or during interrupts.
Protecting this with a mutex isn't a good idea because the mutex should
not be used from an IRQ.

Protect the allocation and freeing of rcd array elements with a
spinlock.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Use host_link_state to read state when DC is shut down
Bartlomiej Dudek [Fri, 4 Aug 2017 20:52:32 +0000 (13:52 -0700)]
IB/hfi1: Use host_link_state to read state when DC is shut down

When DC is shut down (by e.g.  disconnecting the cable), the
driver should use host_link_state to get port's current
physical state. This is due to the fact that physical state
is read from DC's CSRs and when DC is shut down and state is
changed, its registers are not impacted.

Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Remove lstate from hfi1_pportdata
Byczkowski, Jakub [Fri, 4 Aug 2017 20:52:26 +0000 (13:52 -0700)]
IB/hfi1: Remove lstate from hfi1_pportdata

Do not track logical state separately from host_link_state. Deduce
logical state from host_link_state when required. Transitions in
set_link_state and goto_offline already make sure host_link_state
reflects hardware's logical state properly.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Remove pmtu from the QP structure
Sebastian Sanchez [Fri, 4 Aug 2017 20:52:20 +0000 (13:52 -0700)]
IB/hfi1: Remove pmtu from the QP structure

The pmtu field doens't have be stored in the QP structure
as it can easily be calculated when needed.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: Revert egress pkey check enforcement
Alex Estrin [Fri, 4 Aug 2017 20:52:13 +0000 (13:52 -0700)]
IB/hfi1: Revert egress pkey check enforcement

Current code has some serious flaws. Disarm the flag
pending an appropriate patch.

Fixes: 53526500f301 ("IB/hfi1: Permanently enable P_Key checking in HFI")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/core: Fix input len in multiple user verbs
Amrani, Ram [Tue, 27 Jun 2017 14:04:42 +0000 (17:04 +0300)]
IB/core: Fix input len in multiple user verbs

Most user verbs pass user data to the kernel with the inclusion of the
ib_uverbs_cmd_hdr structure. This is problematic because the vendor has
no ideas if the verb was called by a legacy verb or an extended verb.
Also, the incosistency between the verbs is confusing.

Fixes: 565197dd8fb1 ("IB/core: Extend ib_uverbs_create_cq")
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agomlx5: Replace PCI pool old API
Romain Perier [Tue, 22 Aug 2017 11:46:59 +0000 (13:46 +0200)]
mlx5: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agomlx4: Replace PCI pool old API
Romain Perier [Tue, 22 Aug 2017 11:46:58 +0000 (13:46 +0200)]
mlx4: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/mthca: Replace PCI pool old API
Romain Perier [Tue, 22 Aug 2017 11:46:56 +0000 (13:46 +0200)]
IB/mthca: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA/uverbs: Initialize cq_context appropriately
Bharat Potnuri [Tue, 1 Aug 2017 05:28:35 +0000 (10:58 +0530)]
RDMA/uverbs: Initialize cq_context appropriately

Initializing cq_context with ev_queue in create_cq(), leads to NULL pointer
dereference in ib_uverbs_comp_handler(), if application doesnot use completion
channel. This patch fixes the cq_context initialization.

Fixes: 1e7710f3f65 ("IB/core: Change completion channel to use the reworked")
Cc: stable@vger.kernel.org # 4.12
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit 699a2d5b1b880b4e4e1c7d55fa25659322cf5b51)

7 years agoRDMA/bnxt_re: Implement the alloc/get_hw_stats callback
Somnath Kotur [Wed, 2 Aug 2017 08:46:19 +0000 (01:46 -0700)]
RDMA/bnxt_re: Implement the alloc/get_hw_stats callback

Expose HW counters using the get_hw_stats callback

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA/bnxt_re: Allocate multiple notification queues
Selvin Xavier [Wed, 2 Aug 2017 08:46:18 +0000 (01:46 -0700)]
RDMA/bnxt_re: Allocate multiple notification queues

Enables multiple Interrupt vectors. Driver is requesting the max
MSIX vectors based on the number of online  cpus and creates upto
9 MSIx vectors (1 for control path and 8 for data path).
A tasklet is created for each of these vectors. NQs are assigned
to CQs in round robin fashion.
This patch also adds IRQ affinity hint for the MSIX vector of each NQ.

Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoAdd OPA extended LID support
Hiatt, Don [Mon, 14 Aug 2017 18:17:43 +0000 (14:17 -0400)]
Add OPA extended LID support

This patch series primarily increases sizes of variables that hold
lid values from 16 to 32 bits. Additionally, it adds a check in
the IB mad stack to verify a properly formatted MAD when OPA
extended LIDs are used.

Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoMerge branch 'k.o/for-4.13-rc' into k.o/for-next
Doug Ledford [Fri, 18 Aug 2017 18:12:04 +0000 (14:12 -0400)]
Merge branch 'k.o/for-4.13-rc' into k.o/for-next

Merging our (hopefully) final -rc pull branch into our for-next branch
because some of our pending patches won't apply cleanly without having
the -rc patches in our tree.

Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoMerge branch 'misc' into k.o/for-next
Doug Ledford [Fri, 18 Aug 2017 18:10:23 +0000 (14:10 -0400)]
Merge branch 'misc' into k.o/for-next

Conflicts:
drivers/infiniband/core/iwcm.c - The rdma_netlink patches in
HEAD and the iwarp cm workqueue fix (don't use WQ_MEM_RECLAIM,
we aren't safe for that context) touched the same code.

Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: add const to bin_attribute structures
Bhumika Goyal [Wed, 2 Aug 2017 10:01:30 +0000 (15:31 +0530)]
IB/hfi1: add const to bin_attribute structures

Add const to bin_attribute structures as they are only passed to the
functions sysfs_{remove/create}_bin_file. The arguments passed are of
type const, so declare the structures to be const.

Done using Coccinelle.

@m disable optional_qualifier@
identifier s;
position p;
@@
static struct bin_attribute s@p={...};

@okay1@
position p;
identifier m.s;
@@
(
sysfs_create_bin_file(...,&s@p,...)
|
sysfs_remove_bin_file(...,&s@p,...)
)

@bad@
position p!={m.p,okay1.p};
identifier m.s;
@@
s@p

@change depends on !bad disable optional_qualifier@
identifier m.s;
@@
static
+const
struct bin_attribute s={...};

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/qib: add const to bin_attribute structures
Bhumika Goyal [Wed, 2 Aug 2017 10:01:29 +0000 (15:31 +0530)]
IB/qib: add const to bin_attribute structures

Add const to bin_attribute structures as they are only passed to the
functions sysfs_{remove/create}_bin_file. The arguments passed are of
type const, so declare the structures to be const.

Done using Coccinelle.

@m disable optional_qualifier@
identifier s;
position p;
@@
static struct bin_attribute s@p={...};

@okay1@
position p;
identifier m.s;
@@
(
sysfs_create_bin_file(...,&s@p,...)
|
sysfs_remove_bin_file(...,&s@p,...)
)

@bad@
position p!={m.p,okay1.p};
identifier m.s;
@@
s@p

@change depends on !bad disable optional_qualifier@
identifier m.s;
@@
static
+const
struct bin_attribute s={...};

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA/uverbs: Initialize cq_context appropriately
Bharat Potnuri [Tue, 1 Aug 2017 05:28:35 +0000 (10:58 +0530)]
RDMA/uverbs: Initialize cq_context appropriately

Initializing cq_context with ev_queue in create_cq(), leads to NULL pointer
dereference in ib_uverbs_comp_handler(), if application doesnot use completion
channel. This patch fixes the cq_context initialization.

Fixes: 1e7710f3f65 ("IB/core: Change completion channel to use the reworked")
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoinfiniband: avoid overflow warning
Arnd Bergmann [Mon, 31 Jul 2017 06:50:05 +0000 (08:50 +0200)]
infiniband: avoid overflow warning

A sockaddr_in structure on the stack getting passed into rdma_ip2gid
triggers this warning, since we memcpy into a larger sockaddr_in6
structure:

In function 'memcpy',
    inlined from 'rdma_ip2gid' at include/rdma/ib_addr.h:175:3,
    inlined from 'addr_event.isra.4.constprop' at drivers/infiniband/core/roce_gid_mgmt.c:693:2,
    inlined from 'inetaddr_event' at drivers/infiniband/core/roce_gid_mgmt.c:716:9:
include/linux/string.h:305:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter

The warning seems appropriate here, but the code is also clearly
correct, so we really just want to shut up this instance of the
output.

The best way I found so far is to avoid the memcpy() call and instead
replace it with a struct assignment.

Fixes: 6974f0c4555e ("include/linux/string.h: add the option of fortified string.h functions")
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: fix spelling mistake: "allloc_buf" -> "alloc_buf"
Colin Ian King [Fri, 21 Jul 2017 22:19:33 +0000 (23:19 +0100)]
i40iw: fix spelling mistake: "allloc_buf" -> "alloc_buf"

Trivial fix to spelling mistake in i40iw_debug  message and
also split up a couple of lines that are too long and cause
checkpatch warnings

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/rxe: Remove unneeded check
Yuval Shaia [Fri, 21 Jul 2017 19:20:50 +0000 (22:20 +0300)]
IB/rxe: Remove unneeded check

Port validation is performed in ib_core, no need to duplicate it here.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/rxe: Convert pr_info to pr_warn
Yuval Shaia [Fri, 21 Jul 2017 19:14:09 +0000 (22:14 +0300)]
IB/rxe: Convert pr_info to pr_warn

This message is warning so let's print it accordingly.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Fixes for static checker warnings
Shiraz Saleem [Wed, 19 Jul 2017 18:55:26 +0000 (13:55 -0500)]
i40iw: Fixes for static checker warnings

Remove NULL check for cm_node->listener in i40iw_accept
as listener is always present at this point.

Remove the check for cm_node->accept_pend and related code
in i40iw_cm_event_connected as the cm_node in this context
is only pertinent to active node and cm_node->accept_pend
is always 0.

This fixes the following smatch warnings,

drivers/infiniband/hw/i40iw/i40iw_cm.c:3691 i40iw_accept()
error: we previously assumed 'cm_node->listener' could be null

drivers/infiniband/hw/i40iw/i40iw_cm.c:4061 i40iw_cm_event_connected()
error: we previously assumed 'cm_node->listener' could be null

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Simplify code
Christophe Jaillet [Sun, 16 Jul 2017 11:09:23 +0000 (13:09 +0200)]
i40iw: Simplify code

Axe a few lines of code and re-use existing error handling path to avoid
code duplication.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoinfiniband: pvrdma: constify pci_device_id.
Arvind Yadav [Sun, 16 Jul 2017 06:30:46 +0000 (12:00 +0530)]
infiniband: pvrdma: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  10774    1872       8   12654    316e infiniband/hw/vmw_pvrdma/pvrdma_main.o

File size After adding 'const':
   text    data     bss     dec     hex filename
  10838    1808       8   12654    316e infiniband/hw/vmw_pvrdma/pvrdma_main.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoinfiniband: nes: constify pci_device_id.
Arvind Yadav [Sun, 16 Jul 2017 06:30:45 +0000 (12:00 +0530)]
infiniband: nes: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  10429     780      33   11242    2bea drivers/infiniband/hw/nes/nes.o

File size After adding 'const':
   text    data     bss     dec     hex filename
  10541     668      33   11242    2bea drivers/infiniband/hw/nes/nes.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoinfiniband: mthca: constify pci_device_id.
Arvind Yadav [Sun, 16 Jul 2017 06:30:44 +0000 (12:00 +0530)]
infiniband: mthca: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  13067     805       4   13876    3634 infiniband/hw/mthca/mthca_main.o

File size After adding 'const':
   text    data     bss     dec     hex filename
  13419     453       4   13876    3634 infiniband/hw/mthca/mthca_main.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoPCI/IB: add support for pci driver attribute groups
Greg Kroah-Hartman [Wed, 19 Jul 2017 13:01:06 +0000 (15:01 +0200)]
PCI/IB: add support for pci driver attribute groups

Some drivers (specifically the nes IB driver), want to create a lot of
sysfs driver attributes.  Instead of open-coding the creation and
removal of these files (and getting it wrong btw), it's a better idea to
let the driver core handle all of this logic for us.

So add a new field to the pci driver structure, **groups, that allows
pci drivers to specify an attribute group list it wishes to have created
when it is registered with the driver core.

Big bonus is now the driver doesn't race with userspace when the sysfs
files are created vs. when the kobject is announced, so any script/tool
that actually wanted to use these files will not have to poll waiting
for them to show up.

Cc: Faisal Latif <faisal.latif@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA/bnxt_re: fix spelling mistake: "Deallocte" -> "Deallocate"
Colin Ian King [Fri, 14 Jul 2017 07:30:10 +0000 (08:30 +0100)]
RDMA/bnxt_re: fix spelling mistake: "Deallocte" -> "Deallocate"

Trivial fix to spelling mistake in dev_err error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hfi1: fix spelling mistake in variable name continious
Colin Ian King [Thu, 13 Jul 2017 22:13:38 +0000 (23:13 +0100)]
IB/hfi1: fix spelling mistake in variable name continious

Trivial fix to spelling mistake, rename variable 'continious'
to the correct spelling 'continuous'

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/qib: fix spelling mistake: "failng" -> "failing"
Colin Ian King [Mon, 3 Jul 2017 09:23:47 +0000 (10:23 +0100)]
IB/qib: fix spelling mistake: "failng" -> "failing"

Trivial fix to spelling mistake in qib_dev_err error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoiwcm: Don't allocate iwcm workqueue with WQ_MEM_RECLAIM
Sagi Grimberg [Tue, 15 Aug 2017 19:20:38 +0000 (22:20 +0300)]
iwcm: Don't allocate iwcm workqueue with WQ_MEM_RECLAIM

Its very likely that iwcm work execution will yield memory
allocations (for example cm connection request).

Reported-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agocm: Don't allocate ib_cm workqueue with WQ_MEM_RECLAIM
Sagi Grimberg [Tue, 15 Aug 2017 19:20:37 +0000 (22:20 +0300)]
cm: Don't allocate ib_cm workqueue with WQ_MEM_RECLAIM

create_workqueue always creates the workqueue with WQ_MEM_RECLAIM
and silences a flush dependency warn for WQ_LEGACY. Instead, we
want to keep the warn in case the allocator tries to flush the
cm workqueue because its very likely that cm work execution will
yield memory allocations (for example cm connection requests).

Reported-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agonvmet-rdma: remove redundant empty device add callout
Sagi Grimberg [Sun, 2 Jul 2017 08:20:52 +0000 (11:20 +0300)]
nvmet-rdma: remove redundant empty device add callout

Now that its not needed, we can simply not assign it.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agonvme-rdma: remove redundant empty device add callout
Sagi Grimberg [Sun, 2 Jul 2017 08:20:51 +0000 (11:20 +0300)]
nvme-rdma: remove redundant empty device add callout

Now that its not needed, we can simply not assign it.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA/core: make ib_device.add method optional
Sagi Grimberg [Sun, 2 Jul 2017 08:20:50 +0000 (11:20 +0300)]
RDMA/core: make ib_device.add method optional

ib_clients can indeed fill .add to NULL, but then they will not see
any device removal notifications. The reason is that that
ib_register_client and ib_register_device checked existence of .add
before adding the creating a corresponding client_data and adding
it to the list. Simple condition reverse fixes the issue.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agocxgb4: Remove some dead code
Christophe Jaillet [Sat, 10 Jun 2017 09:19:20 +0000 (11:19 +0200)]
cxgb4: Remove some dead code

This 'BUG_ON(!ep)' can never trigger because we have:
   if (!ep)
      return 0;
just a few lines above. So it can be removed safely.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/uverbs: Fix NULL pointer dereference during device removal
Maor Gottlieb [Wed, 16 Aug 2017 15:57:04 +0000 (18:57 +0300)]
IB/uverbs: Fix NULL pointer dereference during device removal

As part of ib_uverbs_remove_one which might be triggered upon
reset flow, we trigger IB_EVENT_DEVICE_FATAL event to userspace
application.
If device was removed after uverbs fd was opened but before
ib_uverbs_get_context was called, the event file will be accessed
before it was allocated, result in NULL pointer dereference:

[ 72.325873] BUG: unable to handle kernel NULL pointer dereference at (null)
...
[ 72.325984] IP: _raw_spin_lock_irqsave+0x22/0x40
[ 72.327123] Call Trace:
[ 72.327168] ib_uverbs_async_handler.isra.8+0x2e/0x160 [ib_uverbs]
[ 72.327216] ? synchronize_srcu_expedited+0x27/0x30
[ 72.327269] ib_uverbs_remove_one+0x120/0x2c0 [ib_uverbs]
[ 72.327330] ib_unregister_device+0xd0/0x180 [ib_core]
[ 72.327373] mlx5_ib_remove+0x74/0x140 [mlx5_ib]
[ 72.327422] mlx5_remove_device+0xfb/0x110 [mlx5_core]
[ 72.327466] mlx5_unregister_interface+0x3c/0xa0 [mlx5_core]
[ 72.327509] mlx5_ib_cleanup+0x10/0x962 [mlx5_ib]
[ 72.327546] SyS_delete_module+0x155/0x230
[ 72.328472] ? exit_to_usermode_loop+0x70/0xa6
[ 72.329370] do_syscall_64+0x54/0xc0
[ 72.330262] entry_SYSCALL64_slow_path+0x25/0x25

Fix it by checking that user context was allocated before
trigger the event.

Fixes: 036b10635739 ('IB/uverbs: Enable device removal when there are active user space applications')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/core: Protect sysfs entry on ib_unregister_device
Shiraz Saleem [Mon, 17 Jul 2017 19:03:50 +0000 (14:03 -0500)]
IB/core: Protect sysfs entry on ib_unregister_device

ib_unregister_device is not protecting removal of sysfs entries.
A call to ib_register_device in that window can result in
duplicate sysfs entry warning. Move mutex_unlock to after
ib_device_unregister_sysfs to protect against sysfs entry creation.

This issue is exposed during driver load/unload stress test.

WARNING: CPU: 5 PID: 4445 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5f/0x70
sysfs: cannot create duplicate filename '/class/infiniband/i40iw0'
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H
BIOS F7 01/17/2014
Workqueue: i40e i40e_service_task [i40e]
Call Trace:
dump_stack+0x67/0x98
__warn+0xcc/0xf0
warn_slowpath_fmt+0x4a/0x50
? kernfs_path_from_node+0x4b/0x60
sysfs_warn_dup+0x5f/0x70
sysfs_do_create_link_sd.isra.2+0xb7/0xc0
sysfs_create_link+0x20/0x40
device_add+0x28c/0x600
ib_device_register_sysfs+0x58/0x170 [ib_core]
ib_register_device+0x325/0x570 [ib_core]
? i40iw_register_rdma_device+0x1f4/0x400 [i40iw]
? kmem_cache_alloc_trace+0x143/0x330
? __raw_spin_lock_init+0x2d/0x50
i40iw_register_rdma_device+0x2dc/0x400 [i40iw]
i40iw_open+0x10a6/0x1950 [i40iw]
? i40iw_open+0xeab/0x1950 [i40iw]
? i40iw_make_cm_node+0x9c0/0x9c0 [i40iw]
i40e_client_subtask+0xa4/0x110 [i40e]
i40e_service_task+0xc2d/0x1320 [i40e]
process_one_work+0x203/0x710
? process_one_work+0x16f/0x710
worker_thread+0x126/0x4a0
? trace_hardirqs_on+0xd/0x10
kthread+0x112/0x150
? process_one_work+0x710/0x710
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x2e/0x40
---[ end trace fd11b69e21ea7653 ]---
Couldn't register device i40iw0 with driver model

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoiw_cxgb4: fix misuse of integer variable
Steve Wise [Tue, 25 Jul 2017 13:51:15 +0000 (06:51 -0700)]
iw_cxgb4: fix misuse of integer variable

Fixes: ee30f7d507c0 ("iw_cxgb4: Max fastreg depth depends on DSGL support")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hns: fix memory leak on ah on error return path
Colin Ian King [Tue, 8 Aug 2017 17:41:02 +0000 (18:41 +0100)]
IB/hns: fix memory leak on ah on error return path

When dmac is NULL, ah is not being freed on the error return path. Fix
this by kfree'ing it.

Detected by CoverityScan, CID#1452636 ("Resource Leak")

Fixes: d8966fcd4c25 ("IB/core: Use rdma_ah_attr accessor functions")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Fix potential fcn_id_array out of bounds
Christopher N Bednarz [Wed, 9 Aug 2017 01:38:48 +0000 (20:38 -0500)]
i40iw: Fix potential fcn_id_array out of bounds

Avoid out of bounds error by utilizing I40IW_MAX_STATS_COUNT
instead of I40IW_INVALID_FCN_ID.

Signed-off-by: Christopher N Bednarz <christoper.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Use correct alignment for CQ0 memory
Christopher N Bednarz [Wed, 9 Aug 2017 01:38:47 +0000 (20:38 -0500)]
i40iw: Use correct alignment for CQ0 memory

Utilize correct alignment variable when allocating
DMA memory for CQ0.

Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Fix typecast of tcp_seq_num
Mustafa Ismail [Wed, 9 Aug 2017 01:38:46 +0000 (20:38 -0500)]
i40iw: Fix typecast of tcp_seq_num

The typecast of tcp_seq_num incorrectly uses u8. Fix by
casting to u32.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Correct variable names
Mustafa Ismail [Wed, 9 Aug 2017 01:38:44 +0000 (20:38 -0500)]
i40iw: Correct variable names

Fix incorrect naming of status code and struct. Use inline
instead of immediate.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoi40iw: Fix parsing of query/commit FPM buffers
Chien Tin Tung [Wed, 9 Aug 2017 01:38:43 +0000 (20:38 -0500)]
i40iw: Fix parsing of query/commit FPM buffers

Parsing of commit/query Host Memory Cache Function Private Memory
is not skipping over reserved fields and incorrectly assigning
those values into object's base/cnt/max_cnt fields. Skip over
reserved fields and set correct values. Also correct memory
alignment requirement for commit/query FPM buffers.

Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA/vmw_pvrdma: Report CQ missed events
Bryan Tan [Thu, 10 Aug 2017 19:05:02 +0000 (12:05 -0700)]
RDMA/vmw_pvrdma: Report CQ missed events

There is a chance of a race between arming the CQ and receiving
completions. By reporting CQ missed events any ULPs should poll
again to get the completions.

Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver")
Acked-by: Aditya Sarwade <asarwade@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoIB/hns: Avoid compile test under non 64bit environments
Matan Barak [Tue, 25 Jul 2017 14:29:06 +0000 (17:29 +0300)]
IB/hns: Avoid compile test under non 64bit environments

The hns driver uses __raw_writeq which is only defined in 64BIT
environments. Trying to compile the driver in a 32BIT environment
results in errors. Only COMPILE_TEST when 64BIT is defined.

Fixes: 7d1b6a678e0b ("IB/hns: Support compile test for hns RoCE driver")
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRevert "RDMA/hns: fix build regression"
Doug Ledford [Mon, 14 Aug 2017 15:15:26 +0000 (11:15 -0400)]
Revert "RDMA/hns: fix build regression"

This reverts commit ecd840ff9b793ac60e3e6658414525535349a17b.

7 years agoMerge branch 'rdma-netlink' into k.o/merge-test
Doug Ledford [Thu, 10 Aug 2017 18:34:18 +0000 (14:34 -0400)]
Merge branch 'rdma-netlink' into k.o/merge-test

Conflicts:
include/rdma/ib_verbs.h - Modified a function signature adjacent
to a newly added function signature from a previous merge

Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoMerge branches '32bit_lid' and 'irq_affinity' into k.o/merge-test
Doug Ledford [Thu, 10 Aug 2017 18:31:29 +0000 (14:31 -0400)]
Merge branches '32bit_lid' and 'irq_affinity' into k.o/merge-test

Conflicts:
drivers/infiniband/hw/mlx5/main.c - Both add new code
include/rdma/ib_verbs.h - Both add new code

Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoMerge tag 'rdma-next-2017-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Doug Ledford [Thu, 10 Aug 2017 17:43:11 +0000 (13:43 -0400)]
Merge tag 'rdma-next-2017-08-10' of git://git./linux/kernel/git/leon/linux-rdma into rdma-netlink

RDMA netlink infrastructure v2

Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoRDMA/netlink: Export node_type
Leon Romanovsky [Thu, 29 Jun 2017 13:01:29 +0000 (16:01 +0300)]
RDMA/netlink: Export node_type

Add ability to get node_type for RDAM netlink users.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA/netlink: Provide port state and physical link state
Leon Romanovsky [Thu, 29 Jun 2017 10:12:45 +0000 (13:12 +0300)]
RDMA/netlink: Provide port state and physical link state

Add port state and physical link state to the users of RDMA netlink.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA/netlink: Export LID mask control (LMC)
Leon Romanovsky [Wed, 28 Jun 2017 12:49:30 +0000 (15:49 +0300)]
RDMA/netlink: Export LID mask control (LMC)

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA/netink: Export lids and sm_lids
Leon Romanovsky [Wed, 28 Jun 2017 12:38:36 +0000 (15:38 +0300)]
RDMA/netink: Export lids and sm_lids

According to the IB specification, the LID and SM_LID
are 16-bit wide, but to support OmniPath users, export
it as 32-bit value from the beginning.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA/netlink: Advertise IB subnet prefix
Leon Romanovsky [Wed, 28 Jun 2017 12:05:14 +0000 (15:05 +0300)]
RDMA/netlink: Advertise IB subnet prefix

Add IB subnet prefix to the port properties exported
by RDMA netlink.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA/netlink: Export node_guid and sys_image_guid
Leon Romanovsky [Wed, 28 Jun 2017 11:01:37 +0000 (14:01 +0300)]
RDMA/netlink: Export node_guid and sys_image_guid

Add Node GUID and system image GUID to the device properties
exported by RDMA netlink, to be used by RDMAtool.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA/netlink: Export FW version
Leon Romanovsky [Tue, 27 Jun 2017 13:58:59 +0000 (16:58 +0300)]
RDMA/netlink: Export FW version

Add FW version to the device properties exported
by RDMA netlink, to be used by RDMAtool.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA: Simplify get firmware interface
Leon Romanovsky [Tue, 27 Jun 2017 13:49:53 +0000 (16:49 +0300)]
RDMA: Simplify get firmware interface

There is a need to forward FW version to user space
application through RDMA netlink. In order to make it safe, there
is need to declare nla_policy and limit the size of FW string.

The new define IB_FW_VERSION_NAME_MAX will limit the size of
FW version string. That define was chosen to be equal to
ETHTOOL_FWVERS_LEN, because many drivers anyway are limited
by that value indirectly.

The introduction of this define allows us to remove the string size
from get_fw_str function signature.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
7 years agoRDMA/netlink: Expose device and port capability masks
Leon Romanovsky [Tue, 20 Jun 2017 11:47:08 +0000 (14:47 +0300)]
RDMA/netlink: Expose device and port capability masks

The port capability mask is exposed to user space via sysfs interface,
while device capabilities are available for verbs only.

This patch provides those capabilities through netlink interface.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
7 years agoRDMA/netlink: Implement nldev port doit callback
Leon Romanovsky [Thu, 22 Jun 2017 13:10:38 +0000 (16:10 +0300)]
RDMA/netlink: Implement nldev port doit callback

Provide ability to get specific to device and port information.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
7 years agoRDMA/netlink: Add nldev port dumpit implementation
Leon Romanovsky [Tue, 20 Jun 2017 08:30:33 +0000 (11:30 +0300)]
RDMA/netlink: Add nldev port dumpit implementation

This patch implements the query interface to get all
ports data for the specific device.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
7 years agoRDMA/netlink: Add nldev device doit implementation
Leon Romanovsky [Thu, 15 Jun 2017 17:33:08 +0000 (20:33 +0300)]
RDMA/netlink: Add nldev device doit implementation

Provide ability to query specific device.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
7 years agoRDMA/netlink: Implement nldev device dumpit calback
Leon Romanovsky [Tue, 20 Jun 2017 06:59:14 +0000 (09:59 +0300)]
RDMA/netlink: Implement nldev device dumpit calback

This patch adds the ability to return all available devices
together with their properties.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
7 years agoRDMA/netlink: Add nldev initialization flows
Leon Romanovsky [Tue, 20 Jun 2017 06:14:15 +0000 (09:14 +0300)]
RDMA/netlink: Add nldev initialization flows

Add nldev init and exit flows to the RDMA/core.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
7 years agoRDMA/netlink: Add netlink device definitions to UAPI
Leon Romanovsky [Tue, 20 Jun 2017 04:55:53 +0000 (07:55 +0300)]
RDMA/netlink: Add netlink device definitions to UAPI

Introduce new defines to rdma_netlink.h, so the RDMA configuration tool
will be able to communicate with RDMA subsystem by using the shared defines.

The addition of new client (NLDEV) revealed the fact that we exposed by
mistake the RDMA_NL_I40IW define which is not backed by any RDMA netlink
by now and it won't be exposed in the future too. So this patch reuses
the value and deletes the old defines.

The NLDEV operates with objects. The struct ib_device has two straightforward
objects: device itself and ports of that device.

This brings us to propose the following commands to work on those objects:
 * RDMA_NLDEV_CMD_{GET,SET,NEW,DEL} - works on ib_device itself
 * RDMA_NLDEV_CMD_PORT_{GET,SET,NEW,DEL} - works on ports of specific ib_device

Those commands receive/return the device index (RDMA_NLDEV_ATTR_DEV_INDEX)
and port index (RDMA_NLDEV_ATTR_PORT_INDEX). For device object accesses,
the RDMA_NLDEV_ATTR_PORT_INDEX will return the maximum number of ports
for specific ib_device and for port access the actual port index.

The port index starts from 1 to follow RDMA/core internal semantics and
the sysfs exposed knobs.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>