platform/kernel/linux-starfive.git
4 years agoRDMA/hns: Remove redundant assignment of wc->smac when polling cq
Weihang Li [Fri, 20 Mar 2020 03:23:41 +0000 (11:23 +0800)]
RDMA/hns: Remove redundant assignment of wc->smac when polling cq

The field smac in ib_wc was used for create AH and then it will be treated
as destination mac address in UD sqwqe, but related code about filling smac
into AH has been removed in core. Actually, the dmac in UD sqwqe is parsed
from the dgid in grh which is passed in by ULP now, so this assignment
should be removed.

Link: https://lore.kernel.org/r/1584674622-52773-10-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Remove redundant qpc setup operations
Lang Cheng [Fri, 20 Mar 2020 03:23:40 +0000 (11:23 +0800)]
RDMA/hns: Remove redundant qpc setup operations

Before calling modify_qp_reset_to_init(), the entire qpc mask has been
cleared, so it is no longer necessary to clear the specific fields in the
mask.

Link: https://lore.kernel.org/r/1584674622-52773-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Remove meaningless prints
Wenpeng Liang [Fri, 20 Mar 2020 03:23:39 +0000 (11:23 +0800)]
RDMA/hns: Remove meaningless prints

ceq and aeq is a ring buffer, consumer index of them will be set to zero
after reaching the maximum value. The warning should be removed or it may
mislead the users.

Link: https://lore.kernel.org/r/1584674622-52773-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Remove definition of cq doorbell structure
Lang Cheng [Fri, 20 Mar 2020 03:23:38 +0000 (11:23 +0800)]
RDMA/hns: Remove definition of cq doorbell structure

The struct hns_roce_v2_cq_db is unused, it should be removed.

Link: https://lore.kernel.org/r/1584674622-52773-7-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Adjust the qp status value sequence of the hardware
Lang Cheng [Fri, 20 Mar 2020 03:23:37 +0000 (11:23 +0800)]
RDMA/hns: Adjust the qp status value sequence of the hardware

Interchange SQD and SQE to match the protocol.

Link: https://lore.kernel.org/r/1584674622-52773-6-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize hns_roce_alloc_vf_resource()
Lijun Ou [Fri, 20 Mar 2020 03:23:36 +0000 (11:23 +0800)]
RDMA/hns: Optimize hns_roce_alloc_vf_resource()

The capbilities of hardware should be got at first and then used in
hns_roce_alloc_vf_resource(). Also removes an unnecessary if ... else
condition in it.

Link: https://lore.kernel.org/r/1584674622-52773-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Simplify attribute judgment code
Lang Cheng [Fri, 20 Mar 2020 03:23:35 +0000 (11:23 +0800)]
RDMA/hns: Simplify attribute judgment code

Combine attribute flags before masking them.

Link: https://lore.kernel.org/r/1584674622-52773-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Fix a wrong judgment of return value
Weihang Li [Fri, 20 Mar 2020 03:23:34 +0000 (11:23 +0800)]
RDMA/hns: Fix a wrong judgment of return value

hns_roce_alloc_mtt_range() never return -1, ret should be checked
whether it is zero instead of -1.

Fixes: 1ceb0b11a8a2 ("RDMA/hns: Fix non-standard error codes")
Link: https://lore.kernel.org/r/1584674622-52773-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Unify format of prints
Lijun Ou [Fri, 20 Mar 2020 03:23:33 +0000 (11:23 +0800)]
RDMA/hns: Unify format of prints

Use ibdev_err/dbg/warn() instead of dev_err/dbg/warn(), and modify some
prints into format of "failed to do something, ret = n".

Link: https://lore.kernel.org/r/1584674622-52773-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/iser: Always check sig MR before putting it to the free pool
Sergey Gorenko [Wed, 25 Mar 2020 15:12:10 +0000 (15:12 +0000)]
IB/iser: Always check sig MR before putting it to the free pool

libiscsi calls the check_protection transport handler only if SCSI-Respose
is received. So, the handler is never called if iSCSI task is completed
for some other reason like a timeout or error handling. And this behavior
looks correct. But the iSER does not handle this case properly because it
puts a non-checked signature MR to the free pool. Then the error occurs at
reusing the MR because it is not allowed to invalidate a signature MR
without checking.

This commit adds an extra check to iser_unreg_mem_fastreg(), which is a
part of the task cleanup flow. Now the signature MR is checked there if it
is needed.

Link: https://lore.kernel.org/r/20200325151210.1548-1-sergeygo@mellanox.com
Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rxe: Set sys_image_guid to be aligned with HW IB devices
Zhu Yanjun [Mon, 23 Mar 2020 11:28:00 +0000 (13:28 +0200)]
RDMA/rxe: Set sys_image_guid to be aligned with HW IB devices

The RXE driver doesn't set sys_image_guid and user space applications see
zeros. This causes to pyverbs tests to fail with the following traceback,
because the IBTA spec requires to have valid sys_image_guid.

 Traceback (most recent call last):
   File "./tests/test_device.py", line 51, in test_query_device
     self.verify_device_attr(attr)
   File "./tests/test_device.py", line 74, in verify_device_attr
     assert attr.sys_image_guid != 0

In order to fix it, set sys_image_guid to be equal to node_guid.

Before:
 5: rxe0: ... node_guid 5054:00ff:feaa:5363 sys_image_guid
 0000:0000:0000:0000

After:
 5: rxe0: ... node_guid 5054:00ff:feaa:5363 sys_image_guid
 5054:00ff:feaa:5363

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200323112800.1444784-1-leon@kernel.org
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Use scnprintf() for avoiding potential buffer overflow
Takashi Iwai [Thu, 19 Mar 2020 15:46:41 +0000 (16:46 +0100)]
IB/hfi1: Use scnprintf() for avoiding potential buffer overflow

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200319154641.23711-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Update num_paths in cma_resolve_iboe_route error flow
Avihai Horon [Wed, 18 Mar 2020 10:17:41 +0000 (12:17 +0200)]
RDMA/cm: Update num_paths in cma_resolve_iboe_route error flow

After a successful allocation of path_rec, num_paths is set to 1, but any
error after such allocation will leave num_paths uncleared.

This causes to de-referencing a NULL pointer later on. Hence, num_paths
needs to be set back to 0 if such an error occurs.

The following crash from syzkaller revealed it.

  kasan: CONFIG_KASAN_INLINE enabled
  kasan: GPF could be caused by NULL-ptr deref or user memory access
  general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
  CPU: 0 PID: 357 Comm: syz-executor060 Not tainted 4.18.0+ #311
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
  rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
  RIP: 0010:ib_copy_path_rec_to_user+0x94/0x3e0
  Code: f1 f1 f1 f1 c7 40 0c 00 00 f4 f4 65 48 8b 04 25 28 00 00 00 48 89
  45 c8 31 c0 e8 d7 60 24 ff 48 8d 7b 4c 48 89 f8 48 c1 e8 03 <42> 0f b6
  14 30 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85
  RSP: 0018:ffff88006586f980 EFLAGS: 00010207
  RAX: 0000000000000009 RBX: 0000000000000000 RCX: 1ffff1000d5fe475
  RDX: ffff8800621e17c0 RSI: ffffffff820d45f9 RDI: 000000000000004c
  RBP: ffff88006586fa50 R08: ffffed000cb0df73 R09: ffffed000cb0df72
  R10: ffff88006586fa70 R11: ffffed000cb0df73 R12: 1ffff1000cb0df30
  R13: ffff88006586fae8 R14: dffffc0000000000 R15: ffff88006aff2200
  FS: 00000000016fc880(0000) GS:ffff88006d000000(0000)
  knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000020000040 CR3: 0000000063fec000 CR4: 00000000000006b0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
  ? ib_copy_path_rec_from_user+0xcc0/0xcc0
  ? __mutex_unlock_slowpath+0xfc/0x670
  ? wait_for_completion+0x3b0/0x3b0
  ? ucma_query_route+0x818/0xc60
  ucma_query_route+0x818/0xc60
  ? ucma_listen+0x1b0/0x1b0
  ? sched_clock_cpu+0x18/0x1d0
  ? sched_clock_cpu+0x18/0x1d0
  ? ucma_listen+0x1b0/0x1b0
  ? ucma_write+0x292/0x460
  ucma_write+0x292/0x460
  ? ucma_close_id+0x60/0x60
  ? sched_clock_cpu+0x18/0x1d0
  ? sched_clock_cpu+0x18/0x1d0
  __vfs_write+0xf7/0x620
  ? ucma_close_id+0x60/0x60
  ? kernel_read+0x110/0x110
  ? time_hardirqs_on+0x19/0x580
  ? lock_acquire+0x18b/0x3a0
  ? finish_task_switch+0xf3/0x5d0
  ? _raw_spin_unlock_irq+0x29/0x40
  ? _raw_spin_unlock_irq+0x29/0x40
  ? finish_task_switch+0x1be/0x5d0
  ? __switch_to_asm+0x34/0x70
  ? __switch_to_asm+0x40/0x70
  ? security_file_permission+0x172/0x1e0
  vfs_write+0x192/0x460
  ksys_write+0xc6/0x1a0
  ? __ia32_sys_read+0xb0/0xb0
  ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
  ? do_syscall_64+0x1d/0x470
  do_syscall_64+0x9e/0x470
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices")
Link: https://lore.kernel.org/r/20200318101741.47211-1-leon@kernel.org
Signed-off-by: Avihai Horon <avihaih@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx5: Generally use the WC auto detection test result
Yishai Hadas [Wed, 18 Mar 2020 10:03:23 +0000 (12:03 +0200)]
IB/mlx5: Generally use the WC auto detection test result

Now that we have direct and reliable detection of WC support by the
system, use is broadly. The only case we have to worry about is when the
WC autodetector cannot run.

For this fringe case generally assume that that WC is available, except in
the well defined case of no PAT support on x86 which is tested by calling
arch_can_pci_mmap_wc().

If WC is wrongly assumed to be available then it causes a small
performance hit on paths in userspace that are tuned to the assumption
that WC is available. There is no functional loss.

It is very unlikely that any platforms exist that lack WC and also care
about the micro optimization of WC in the fringe case where autodetection
does not work.

By removing the fairly bogus CONFIG tests this makes WC work broadly on
all arches and all platforms.

Link: https://lore.kernel.org/r/20200318100323.46659-1-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize mhop put flow for multi-hop addressing
Xi Wang [Tue, 17 Mar 2020 03:55:24 +0000 (11:55 +0800)]
RDMA/hns: Optimize mhop put flow for multi-hop addressing

Optimizes hns_roce_table_mhop_get() by encapsulating code about clearing
hem into clear_mhop_hem(), which will make the code flow clearer.

Link: https://lore.kernel.org/r/1584417324-2255-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize mhop get flow for multi-hop addressing
Xi Wang [Tue, 17 Mar 2020 03:55:23 +0000 (11:55 +0800)]
RDMA/hns: Optimize mhop get flow for multi-hop addressing

Splits hns_roce_table_mhop_get() into 4 sub-functions to make the code flow
clearer.

Link: https://lore.kernel.org/r/1584417324-2255-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/bnxt_re: Wait for all the CQ events before freeing CQ data structures
Selvin Xavier [Fri, 13 Mar 2020 17:34:02 +0000 (10:34 -0700)]
RDMA/bnxt_re: Wait for all the CQ events before freeing CQ data structures

Destroy CQ command to firmware returns the num_cnq_events as a
response. This indicates the driver about the number of CQ events
generated for this CQ. Driver should wait for all these events before
freeing the CQ host structures.  Also, add routine to clean all the
pending notification for the CQs getting destroyed. This avoids the
possibility of accessing the CQ data structures after its freed.

Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1584120842-3200-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx5: Fix a NULL vs IS_ERR() check
Dan Carpenter [Fri, 20 Mar 2020 13:26:41 +0000 (16:26 +0300)]
IB/mlx5: Fix a NULL vs IS_ERR() check

The kzalloc() function returns NULL, not error pointers.

Fixes: 30f2fe40c72b ("IB/mlx5: Introduce UAPIs to manage packet pacing")
Link: https://lore.kernel.org/r/20200320132641.GF95012@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/siw: Suppress uninitialized var warning
Andrew Morton [Mon, 23 Mar 2020 18:46:27 +0000 (11:46 -0700)]
RDMA/siw: Suppress uninitialized var warning

drivers/infiniband/sw/siw/siw_qp_rx.c: In function siw_proc_send:
./include/linux/spinlock.h:288:3: warning: flags may be used uninitialized in this function [-Wmaybe-uninitialized]
   _raw_spin_unlock_irqrestore(lock, flags); \
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/sw/siw/siw_qp_rx.c:335:16: note: flags was declared here
  unsigned long flags;

Link: https://lore.kernel.org/r/20200323184627.ZWPg91uin%akpm@linux-foundation.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/efa: Use in-kernel offsetofend() to check field availability
Leon Romanovsky [Tue, 10 Mar 2020 09:14:30 +0000 (11:14 +0200)]
RDMA/efa: Use in-kernel offsetofend() to check field availability

Remove custom and duplicated variant of offsetofend().

Link: https://lore.kernel.org/r/20200310091438.248429-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Remove kobj from hfi1_devdata
Kaike Wan [Mon, 16 Mar 2020 21:05:00 +0000 (17:05 -0400)]
IB/hfi1: Remove kobj from hfi1_devdata

The field kobj was added to hfi1_devdata structure to manage the life time
of the hfi1_devdata structure for PSM accesses:

commit e11ffbd57520 ("IB/hfi1: Do not free hfi1 cdev parent structure early")

Later another mechanism user_refcount/user_comp was introduced to provide
the same functionality:

commit acd7c8fe1493 ("IB/hfi1: Fix an Oops on pci device force remove")

This patch will remove this kobj field, as it is no longer needed.

Link: https://lore.kernel.org/r/20200316210500.7753.4145.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/rdmavt: Delete unused routine
Mike Marciniszyn [Mon, 16 Mar 2020 21:04:54 +0000 (17:04 -0400)]
IB/rdmavt: Delete unused routine

This routine was obsoleted by the patch below.

Delete it.

Fixes: a2a074ef396f ("RDMA: Handle ucontext allocations by IB/core")
Link: https://lore.kernel.org/r/20200316210454.7753.94689.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Check if depth of qp is 0 before configure
Lang Cheng [Thu, 12 Mar 2020 09:50:24 +0000 (17:50 +0800)]
RDMA/hns: Check if depth of qp is 0 before configure

Depth of qp shouldn't be allowed to be set to zero, after ensuring that,
subsequent process can be simplified. And when qp is changed from reset to
reset, the capability of minimum qp depth was used to identify hardware of
hip06, it should be changed into a more readable form.

Link: https://lore.kernel.org/r/1584006624-11846-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoi40iw: Report correct firmware version
Sindhu, Devale [Fri, 13 Mar 2020 21:44:06 +0000 (16:44 -0500)]
i40iw: Report correct firmware version

The driver uses a hard-coded value for FW version and reports an
inconsistent FW version between ibv_devinfo and
/sys/class/infiniband/i40iw/fw_ver.

Retrieve the FW version via a Control QP (CQP) operation and report it
consistently across sysfs and query device.

Fixes: d37498417947 ("i40iw: add files for iwarp interface")
Link: https://lore.kernel.org/r/20200313214406.2159-1-shiraz.saleem@intel.com
Reported-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Sindhu, Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize wqe buffer set flow for post send
Xi Wang [Tue, 10 Mar 2020 11:18:04 +0000 (19:18 +0800)]
RDMA/hns: Optimize wqe buffer set flow for post send

Splits hns_roce_v2_post_send() into three sub-functions: set_rc_wqe(),
set_ud_wqe() and update_sq_db() to simplify the code.

Link: https://lore.kernel.org/r/1583839084-31579-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize base address table config flow for qp buffer
Xi Wang [Tue, 10 Mar 2020 11:18:03 +0000 (19:18 +0800)]
RDMA/hns: Optimize base address table config flow for qp buffer

Currently, before the qp is created, a page size needs to be calculated
for the base address table to store all base addresses in the mtr. As a
result, the parameter configuration of the mtr is complex. So integrate
the process of calculating the base table page size into the hem related
interface to simplify the process of using mtr.

Link: https://lore.kernel.org/r/1583839084-31579-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize the wr opcode conversion from ib to hns
Xi Wang [Tue, 10 Mar 2020 11:18:02 +0000 (19:18 +0800)]
RDMA/hns: Optimize the wr opcode conversion from ib to hns

Simplify the wr opcode conversion from ib to hns by using a map table
instead of the switch-case statement.

Link: https://lore.kernel.org/r/1583839084-31579-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize wqe buffer filling process for post send
Xi Wang [Tue, 10 Mar 2020 11:18:01 +0000 (19:18 +0800)]
RDMA/hns: Optimize wqe buffer filling process for post send

Encapsulates the wqe buffer process details for datagram seg, fast mr seg
and atomic seg.

Link: https://lore.kernel.org/r/1583839084-31579-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Rename wqe buffer related functions
Xi Wang [Tue, 10 Mar 2020 11:18:00 +0000 (19:18 +0800)]
RDMA/hns: Rename wqe buffer related functions

There are serval global functions related to wqe buffer in the hns driver
and are called in different files. These symbols cannot directly represent
the namespace they belong to. So add prefix 'hns_roce_' to 3 wqe buffer
related global functions: get_recv_wqe(), get_send_wqe(), and
get_send_extend_sge().

Link: https://lore.kernel.org/r/1583839084-31579-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/bnxt_re: Remove unnecessary sched count
Selvin Xavier [Fri, 13 Mar 2020 16:33:27 +0000 (09:33 -0700)]
RDMA/bnxt_re: Remove unnecessary sched count

Since the lifetime of bnxt_re_task is controlled by the kref of device,
sched_count is no longer required.  Remove it.

Link: https://lore.kernel.org/r/1584117207-2664-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/bnxt_re: Fix lifetimes in bnxt_re_task
Jason Gunthorpe [Fri, 13 Mar 2020 16:33:26 +0000 (09:33 -0700)]
RDMA/bnxt_re: Fix lifetimes in bnxt_re_task

A work queue cannot just rely on the ib_device not being freed, it must
hold a kref on the memory so that the BNXT_RE_FLAG_IBDEV_REGISTERED check
works.

Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1584117207-2664-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/bnxt_re: Use ib_device_try_get()
Jason Gunthorpe [Fri, 13 Mar 2020 16:33:25 +0000 (09:33 -0700)]
RDMA/bnxt_re: Use ib_device_try_get()

There are a couple places in this driver running from a work queue that
need the ib_device to be registered. Instead of using a broken internal
bit rely on the new core code to guarantee device registration.

Link: https://lore.kernel.org/r/1584117207-2664-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Make sure the cm_id is in the IB_CM_IDLE state in destroy
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:45 +0000 (11:25 +0200)]
RDMA/cm: Make sure the cm_id is in the IB_CM_IDLE state in destroy

The first switch statement in cm_destroy_id() tries to move the ID to
either IB_CM_IDLE or IB_CM_TIMEWAIT. Both states will block concurrent
MAD handlers from progressing.

Previous patches removed the unreliably lock/unlock sequences in this
flow, this patch removes the extra locking steps and adds the missing
parts to guarantee that destroy reaches IB_CM_IDLE. There is no point in
leaving the ID in the IB_CM_TIMEWAIT state the memory about to be kfreed.

Rework things to hold the lock across all the state transitions and
directly assert when done that it ended up in IB_CM_IDLE as expected.

This was accompanied by a careful audit of all the state transitions here,
which generally did end up in IDLE on their success and non-racy paths.

Link: https://lore.kernel.org/r/20200310092545.251365-16-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Allow ib_send_cm_sidr_rep() to be done under lock
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:44 +0000 (11:25 +0200)]
RDMA/cm: Allow ib_send_cm_sidr_rep() to be done under lock

The first thing ib_send_cm_sidr_rep() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

Get rid of the cm_reject_sidr_req() wrapper so each call site can call the
locked or unlocked version as required.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-15-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Allow ib_send_cm_rej() to be done under lock
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:43 +0000 (11:25 +0200)]
RDMA/cm: Allow ib_send_cm_rej() to be done under lock

The first thing ib_send_cm_rej() does is obtain the lock, so use the usual
unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

While here simplify some of the logic in the implementation.

Link: https://lore.kernel.org/r/20200310092545.251365-14-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Allow ib_send_cm_drep() to be done under lock
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:42 +0000 (11:25 +0200)]
RDMA/cm: Allow ib_send_cm_drep() to be done under lock

The first thing ib_send_cm_drep() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-13-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Allow ib_send_cm_dreq() to be done under lock
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:41 +0000 (11:25 +0200)]
RDMA/cm: Allow ib_send_cm_dreq() to be done under lock

The first thing ib_send_cm_dreq() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-12-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Add some lockdep assertions for cm_id_priv->lock
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:40 +0000 (11:25 +0200)]
RDMA/cm: Add some lockdep assertions for cm_id_priv->lock

These functions all touch state, so must be called under the lock.
Inspection shows this is currently true.

Link: https://lore.kernel.org/r/20200310092545.251365-11-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Add missing locking around id.state in cm_dup_req_handler
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:39 +0000 (11:25 +0200)]
RDMA/cm: Add missing locking around id.state in cm_dup_req_handler

All accesses to id.state must be done under the spinlock.

Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20200310092545.251365-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Make it clearer how concurrency works in cm_req_handler()
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:38 +0000 (11:25 +0200)]
RDMA/cm: Make it clearer how concurrency works in cm_req_handler()

ib_crate_cm_id() immediately places the id in the xarray, and publishes it
into the remote_id and remote_qpn rbtrees. This makes it visible to other
threads before it is fully set up.

It appears the thinking here was that the states IB_CM_IDLE and
IB_CM_REQ_RCVD do not allow any MAD handler or lookup in the remote_id and
remote_qpn rbtrees to advance.

However, cm_rej_handler() does take an action on IB_CM_REQ_RCVD, which is
not really expected by the design.

Make the whole thing clearer:
 - Keep the new cm_id out of the xarray until it is completely set up.
   This directly prevents MAD handlers and all rbtree lookups from seeing
   the pointer.
 - Move all the trivial setup right to the top so it is obviously done
   before any concurrency begins
 - Move the mutation of the cm_id_priv out of cm_match_id() and into the
   caller so the state transition is obvious
 - Place the manipulation of the work_list at the end, under lock, after
   the cm_id is placed in the xarray. The work_count cannot change on an
   ID outside the xarray.
 - Add some comments

Link: https://lore.kernel.org/r/20200310092545.251365-9-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Make it clear that there is no concurrency in cm_sidr_req_handler()
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:37 +0000 (11:25 +0200)]
RDMA/cm: Make it clear that there is no concurrency in cm_sidr_req_handler()

ib_create_cm_id() immediately places the id in the xarray, so it is visible
to network traffic.

The state is initially set to IB_CM_IDLE and all the MAD handlers will
test this state under lock and refuse to advance from IDLE, so adding to
the xarray is harmless.

Further, the set to IB_CM_SIDR_REQ_RCVD also excludes all MAD handlers.

However, the local_id isn't even used for SIDR mode, and there will be no
input MADs related to the newly created ID.

So, make the whole flow simpler so it can be understood:
 - Do not put the SIDR cm_id in the xarray. This directly shows that there
   is no concurrency
 - Delete the confusing work_count and pending_list manipulations. This
   mechanism is only used by MAD handlers and timewait, neither of which
   apply to SIDR.
 - Add a few comments and rename 'cur_cm_id_priv' to 'listen_cm_id_priv'
 - Move other loose sets up to immediately after cm_id creation so that
   the cm_id is fully configured right away. This fixes an oversight where
   the service_id will not be returned back on a IB_SIDR_UNSUPPORTED
   reject.

Link: https://lore.kernel.org/r/20200310092545.251365-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Read id.state under lock when doing pr_debug()
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:36 +0000 (11:25 +0200)]
RDMA/cm: Read id.state under lock when doing pr_debug()

The lock should not be dropped before doing the pr_debug() print as it is
accessing data protected by the lock, such as id.state.

Fixes: 119bf81793ea ("IB/cm: Add debug prints to ib_cm")
Link: https://lore.kernel.org/r/20200310092545.251365-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Simplify establishing a listen cm_id
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:35 +0000 (11:25 +0200)]
RDMA/cm: Simplify establishing a listen cm_id

Any manipulation of cm_id->state must be done under the cm_id_priv->lock,
the two routines that added listens did not follow this rule, because they
never participate in any concurrent access around the state.

However, since this exception makes the code hard to understand, simplify
the flow so that it can be fully locked:
 - Move manipulation of listen_sharecount into cm_insert_listen() so it is
   trivially under the cm.lock without having to expose the cm.lock to the
   caller.
 - Push the cm.lock down into cm_insert_listen() and have the function
   increment the reference count before returning an existing pointer.
 - Split ib_cm_listen() into an cm_init_listen() and do not call
   ib_cm_listen() from ib_cm_insert_listen()
 - Make both ib_cm_listen() and ib_cm_insert_listen() directly call
   cm_insert_listen() under their cm_id_priv->lock which does both a
   collision detect and, if needed, the insert (atomically)
 - Enclose all state manipulation within the cm_id_priv->lock, notice this
   set can be done safely after cm_insert_listen() as no reader is allowed
   to read the state without holding the lock.
 - Do not set the listen cm_id in the xarray, as it is never correct to
   look it up. This makes the concurrency simpler to understand.

Many needless error unwinds are removed in the process.

Link: https://lore.kernel.org/r/20200310092545.251365-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Make the destroy_id flow more robust
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:34 +0000 (11:25 +0200)]
RDMA/cm: Make the destroy_id flow more robust

Too much of the destruction is very carefully sensitive to the state
and various other things. Move more code to the unconditional path and
add several WARN_ONs to check consistency.

Link: https://lore.kernel.org/r/20200310092545.251365-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Remove a race freeing timewait_info
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:33 +0000 (11:25 +0200)]
RDMA/cm: Remove a race freeing timewait_info

When creating a cm_id during REQ the id immediately becomes visible to the
other MAD handlers, and shortly after the state is moved to IB_CM_REQ_RCVD

This allows cm_rej_handler() to run concurrently and free the work:

        CPU 0                                CPU1
 cm_req_handler()
  ib_create_cm_id()
  cm_match_req()
    id_priv->state = IB_CM_REQ_RCVD
                                       cm_rej_handler()
                                         cm_acquire_id()
                                         spin_lock(&id_priv->lock)
                                         switch (id_priv->state)
      case IB_CM_REQ_RCVD:
                                            cm_reset_to_idle()
                                             kfree(id_priv->timewait_info);
   goto destroy
  destroy:
    kfree(id_priv->timewait_info);
                                             id_priv->timewait_info = NULL

Causing a double free or worse.

Do not free the timewait_info without also holding the
id_priv->lock. Simplify this entire flow by making the free unconditional
during cm_destroy_id() and removing the confusing special case error
unwind during creation of the timewait_info.

This also fixes a leak of the timewait if cm_destroy_id() is called in
IB_CM_ESTABLISHED with an XRC TGT QP. The state machine will be left in
ESTABLISHED while it needed to transition through IB_CM_TIMEWAIT to
release the timewait pointer.

Also fix a leak of the timewait_info if the caller mis-uses the API and
does ib_send_cm_reqs().

Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20200310092545.251365-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Fix checking for allowed duplicate listens
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:32 +0000 (11:25 +0200)]
RDMA/cm: Fix checking for allowed duplicate listens

The test here typod the cm_id_priv to use, it used the one that was
freshly allocated. By definition the allocated one has the matching
cm_handler and zero context, so the condition was always true.

Instead check that the existing listening ID is compatible with the
proposed handler so that it can be shared, as was originally intended.

Fixes: 067b171b8679 ("IB/cm: Share listening CM IDs")
Link: https://lore.kernel.org/r/20200310092545.251365-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Fix ordering of xa_alloc_cyclic() in ib_create_cm_id()
Jason Gunthorpe [Tue, 10 Mar 2020 09:25:31 +0000 (11:25 +0200)]
RDMA/cm: Fix ordering of xa_alloc_cyclic() in ib_create_cm_id()

xa_alloc_cyclic() is a SMP release to be paired with some later acquire
during xa_load() as part of cm_acquire_id().

As such, xa_alloc_cyclic() must be done after the cm_id is fully
initialized, in particular, it absolutely must be after the
refcount_set(), otherwise the refcount_inc() in cm_acquire_id() may not
see the set.

As there are several cases where a reader will be able to use the
id.local_id after cm_acquire_id in the IB_CM_IDLE state there needs to be
an unfortunate split into a NULL allocate and a finalizing xa_store.

Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20200310092545.251365-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Fix wrong judgments of udata->outlen
Weihang Li [Tue, 10 Mar 2020 13:06:09 +0000 (21:06 +0800)]
RDMA/hns: Fix wrong judgments of udata->outlen

These judgments were used to keep the compatibility with older versions of
userspace that don't have the field named "cap_flags" in structure
hns_roce_ib_create_cq_resp. But it will be wrong to compare outlen with
the size of resp if another new field were added in resp. oulen should be
compared with the end offset of cap_flags in resp.

Fixes: 4f8f0d5e33dd ("RDMA/hns: Package the flow of creating cq")
Link: https://lore.kernel.org/r/1583845569-47257-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMerge branch 'mlx5_mr_cache' into rdma.git for-next
Jason Gunthorpe [Fri, 13 Mar 2020 14:11:07 +0000 (11:11 -0300)]
Merge branch 'mlx5_mr_cache' into rdma.git for-next

Leon Romanovsky says:

====================
This series fixes various corner cases in the mlx5_ib MR cache
implementation, see specific commit messages for more information.
====================

Based on the mlx5-next branch at
 git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Due to dependencies

* branch 'mlx5_mr-cache':
  RDMA/mlx5: Allow MRs to be created in the cache synchronously
  RDMA/mlx5: Revise how the hysteresis scheme works for cache filling
  RDMA/mlx5: Fix locking in MR cache work queue
  RDMA/mlx5: Lock access to ent->available_mrs/limit when doing queue_work
  RDMA/mlx5: Fix MR cache size and limit debugfs
  RDMA/mlx5: Always remove MRs from the cache before destroying them
  RDMA/mlx5: Simplify how the MR cache bucket is located
  RDMA/mlx5: Rename the tracking variables for the MR cache
  RDMA/mlx5: Replace spinlock protected write with atomic var
  {IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib
  {IB,net}/mlx5: Assign mkey variant in mlx5_ib only
  {IB,net}/mlx5: Setup mkey variant before mr create command invocation

4 years agoRDMA/mlx5: Allow MRs to be created in the cache synchronously
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:38 +0000 (10:22 +0200)]
RDMA/mlx5: Allow MRs to be created in the cache synchronously

If the cache is completely out of MRs, and we are running in cache mode,
then directly, and synchronously, create an MR that is compatible with the
cache bucket using a sleeping mailbox command. This ensures that the
thread that is waiting for the MR absolutely will get one.

When a MR allocated in this way becomes freed then it is compatible with
the cache bucket and will be recycled back into it.

Deletes the very buggy ent->compl scheme to create a synchronous MR
allocation.

Link: https://lore.kernel.org/r/20200310082238.239865-13-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Revise how the hysteresis scheme works for cache filling
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:37 +0000 (10:22 +0200)]
RDMA/mlx5: Revise how the hysteresis scheme works for cache filling

Currently if the work queue is running then it is in 'hysteresis' mode and
will fill until the cache reaches the high water mark. This implicit state
is very tricky and doesn't interact with pending very well.

Instead of self re-scheduling the work queue after the add_keys() has
started to create the new MR, have the queue scheduled from
reg_mr_callback() only after the requested MR has been added.

This avoids the bad design of an in-rush of queue'd work doing back to
back add_keys() until EAGAIN then sleeping. The add_keys() will be paced
one at a time as they complete, slowly filling up the cache.

Also, fix pending to be only manipulated under lock.

Link: https://lore.kernel.org/r/20200310082238.239865-12-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Fix locking in MR cache work queue
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:36 +0000 (10:22 +0200)]
RDMA/mlx5: Fix locking in MR cache work queue

All of the members of mlx5_cache_ent must be accessed while holding the
spinlock, add the missing spinlock in the __cache_work_func().

Using cache->stopped and flush_workqueue() is an inherently racy way to
shutdown self-scheduling work on a queue. Replace it with ent->disabled
under lock, and always check disabled before queuing any new work. Use
cancel_work_sync() to shutdown the queue.

Use READ_ONCE/WRITE_ONCE for dev->last_add to manage concurrency as
coherency is less important here.

Split fill_delay from the bitfield. C bitfield updates are not atomic and
this is just a mess. Use READ_ONCE/WRITE_ONCE, but this could also use
test_bit()/set_bit().

Link: https://lore.kernel.org/r/20200310082238.239865-11-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Lock access to ent->available_mrs/limit when doing queue_work
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:35 +0000 (10:22 +0200)]
RDMA/mlx5: Lock access to ent->available_mrs/limit when doing queue_work

Accesses to these members needs to be locked. There is no reason not to
hold a spinlock while calling queue_work(), so move the tests into a
helper and always call it under lock.

The helper should be called when available_mrs is adjusted.

Link: https://lore.kernel.org/r/20200310082238.239865-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Fix MR cache size and limit debugfs
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:34 +0000 (10:22 +0200)]
RDMA/mlx5: Fix MR cache size and limit debugfs

The size_write function is supposed to adjust the total_mr's to match the
user's request, but lacks locking and safety checking.

total_mrs can only be adjusted by at most available_mrs. mrs already
assigned to users cannot be revoked. Ensure that the user provides a
target value within the range of available_mrs and within the high/low
water mark.

limit_write has confusing and wrong sanity checking, and doesn't have the
ability to deallocate on limit reduction.

Since both functions use the same algorithm to adjust the available_mrs,
consolidate it into one function and write it correctly. Fix the locking
and by holding the spinlock for all accesses to ent->X.

Always fail if the user provides a malformed string.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20200310082238.239865-9-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Always remove MRs from the cache before destroying them
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:33 +0000 (10:22 +0200)]
RDMA/mlx5: Always remove MRs from the cache before destroying them

The cache bucket tracks the total number of MRs that exists, both inside
and outside of the cache. Removing a MR from the cache (by setting
cache_ent to NULL) without updating total_mrs will cause the tracking to
leak and be inflated.

Further fix the rereg_mr path to always destroy the MR. reg_create will
always overwrite all the MR data in mlx5_ib_mr, so the MR must be
completely destroyed, in all cases, before this function can be
called. Detach the MR from the cache and unconditionally destroy it to
avoid leaking HW mkeys.

Fixes: afd1417404fb ("IB/mlx5: Use direct mkey destroy command upon UMR unreg failure")
Fixes: 56e11d628c5d ("IB/mlx5: Added support for re-registration of MRs")
Link: https://lore.kernel.org/r/20200310082238.239865-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Simplify how the MR cache bucket is located
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:32 +0000 (10:22 +0200)]
RDMA/mlx5: Simplify how the MR cache bucket is located

There are many bad APIs here that are accepting a cache bucket index
instead of a bucket pointer. Many of the callers already have a bucket
pointer, so this results in a lot of confusing uses of order2idx().

Pass the struct mlx5_cache_ent into add_keys(), remove_keys(), and
alloc_cached_mr().

Once the MR is in the cache, store the cache bucket pointer directly in
the MR, replacing the 'bool allocated_from cache'.

In the end there is only one place that needs to form index from order,
alloc_mr_from_cache(). Increase the safety of this function by disallowing
it from accessing cache entries in the ODP special area.

Link: https://lore.kernel.org/r/20200310082238.239865-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Rename the tracking variables for the MR cache
Jason Gunthorpe [Tue, 10 Mar 2020 08:22:31 +0000 (10:22 +0200)]
RDMA/mlx5: Rename the tracking variables for the MR cache

The old names do not clearly indicate the intent.

Link: https://lore.kernel.org/r/20200310082238.239865-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Replace spinlock protected write with atomic var
Saeed Mahameed [Tue, 10 Mar 2020 08:22:29 +0000 (10:22 +0200)]
RDMA/mlx5: Replace spinlock protected write with atomic var

mkey variant calculation was spinlock protected to make it atomic, replace
that with one atomic variable.

Link: https://lore.kernel.org/r/20200310082238.239865-4-leon@kernel.org
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years ago{IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib
Michael Guralnik [Tue, 10 Mar 2020 08:22:30 +0000 (10:22 +0200)]
{IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib

As mlx5_ib is the only user of the mlx5_core_create_mkey_cb, move the
logic inside mlx5_ib and cleanup the code in mlx5_core.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years ago{IB,net}/mlx5: Assign mkey variant in mlx5_ib only
Saeed Mahameed [Tue, 10 Mar 2020 08:22:28 +0000 (10:22 +0200)]
{IB,net}/mlx5: Assign mkey variant in mlx5_ib only

mkey variant is not required for mlx5_core use, move the mkey variant
counter to mlx5_ib.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years ago{IB,net}/mlx5: Setup mkey variant before mr create command invocation
Saeed Mahameed [Tue, 10 Mar 2020 08:22:27 +0000 (10:22 +0200)]
{IB,net}/mlx5: Setup mkey variant before mr create command invocation

On reg_mr_callback() mlx5_ib is recalculating the mkey variant which is
wrong and will lead to using a different key variant than the one
submitted to firmware on create mkey command invocation.

To fix this, we store the mkey variant before invoking the firmware
command and use it later on completion (reg_mr_callback).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoRDMA/cm: Delete not implemented CM peer to peer communication
Leon Romanovsky [Tue, 10 Mar 2020 09:14:32 +0000 (11:14 +0200)]
RDMA/cm: Delete not implemented CM peer to peer communication

Peer to peer support was never implemented, so delete it to make code less
clutter.

Link: https://lore.kernel.org/r/20200310091438.248429-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Use offsetofend() instead of duplicated variant
Leon Romanovsky [Tue, 10 Mar 2020 09:14:31 +0000 (11:14 +0200)]
RDMA/mlx5: Use offsetofend() instead of duplicated variant

Convert mlx5 driver to use offsetofend() instead of its duplicated
variant.

Link: https://lore.kernel.org/r/20200310091438.248429-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx4: Delete duplicated offsetofend implementation
Leon Romanovsky [Tue, 10 Mar 2020 09:14:29 +0000 (11:14 +0200)]
RDMA/mlx4: Delete duplicated offsetofend implementation

Convert mlx4 to use in-kernel offsetofend() instead
of its duplicated implementation.

Link: https://lore.kernel.org/r/20200310091438.248429-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx5: Replace tunnel mpls capability bits for tunnel_offloads
Alex Vesker [Thu, 5 Mar 2020 12:38:41 +0000 (14:38 +0200)]
IB/mlx5: Replace tunnel mpls capability bits for tunnel_offloads

Until now the flex parser capability was used in ib_query_device() to
indicate tunnel_offloads_caps support for mpls_over_gre/mpls_over_udp.

Newer devices and firmware will have configurations with the flexparser
but without mpls support.

Testing for the flex parser capability was a mistake, the tunnel_stateless
capability was intended for detecting mpls and was introduced at the same
time as the flex parser capability.

Otherwise userspace will be incorrectly informed that a future device
supports MPLS when it does not.

Link: https://lore.kernel.org/r/20200305123841.196086-1-leon@kernel.org
Cc: <stable@vger.kernel.org> # 4.17
Fixes: e818e255a58d ("IB/mlx5: Expose MPLS related tunneling offloads")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Remove duplicate definitions of SW_ICM macros
Erez Shitrit [Tue, 10 Mar 2020 07:57:06 +0000 (09:57 +0200)]
RDMA/mlx5: Remove duplicate definitions of SW_ICM macros

Those macros are already defined in include/linux/mlx5/driver.h, so delete
their duplicate variants.

Link: https://lore.kernel.org/r/20200310075706.238592-1-leon@kernel.org
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Remove the duplicate header file
Zhu Yanjun [Tue, 10 Mar 2020 09:16:56 +0000 (11:16 +0200)]
RDMA/core: Remove the duplicate header file

The header file rdma_core.h is duplicate, so let's remove it.

Fixes: 622db5b6439a ("RDMA/core: Add trace points to follow MR allocation")
Link: https://lore.kernel.org/r/20200310091656.249696-1-leon@kernel.org
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/bnxt_re: Remove a redundant 'memset'
Christophe JAILLET [Sun, 8 Mar 2020 06:54:42 +0000 (07:54 +0100)]
RDMA/bnxt_re: Remove a redundant 'memset'

'wqe' is already zeroed at the top of the 'while' loop, just a few lines
below, and is not used outside of the loop.

So there is no need to zero it again, or for the variable to be declared
outside the loop.

Link: https://lore.kernel.org/r/20200308065442.5415-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cma: Teach lockdep about the order of rtnl and lock
Jason Gunthorpe [Thu, 27 Feb 2020 20:36:51 +0000 (16:36 -0400)]
RDMA/cma: Teach lockdep about the order of rtnl and lock

This lock ordering only happens when bonding is enabled and a certain
bonding related event fires. However, since it can happen this is a global
restriction on lock ordering.

Teach lockdep about the order directly and unconditionally so bugs here
are found quickly.

See https://syzkaller.appspot.com/bug?extid=55de90ab5f44172b0c90

Link: https://lore.kernel.org/r/20200227203651.GA27185@ziepe.ca
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rw: map P2P memory correctly for signature operations
Max Gurtovoy [Thu, 20 Feb 2020 10:08:19 +0000 (12:08 +0200)]
RDMA/rw: map P2P memory correctly for signature operations

Since RDMA rw API support operations with P2P memory sg list, make sure to
map/unmap the scatter list for signature operation correctly.

Link: https://lore.kernel.org/r/20200220100819.41860-2-maxg@mellanox.com
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMerge tag 'v5.6-rc5' into rdma.git for-next
Jason Gunthorpe [Tue, 10 Mar 2020 15:49:09 +0000 (12:49 -0300)]
Merge tag 'v5.6-rc5' into rdma.git for-next

Required due to dependencies in following patches.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMerge branch 'mlx5_packet_pacing' into rdma.git for-next
Jason Gunthorpe [Tue, 10 Mar 2020 14:54:17 +0000 (11:54 -0300)]
Merge branch 'mlx5_packet_pacing' into rdma.git for-next

Yishai Hadas Says:

====================
Expose raw packet pacing APIs to be used by DEVX based applications.  The
existing code was refactored to have a single flow with the new raw APIs.
====================

Based on the mlx5-next branch at
 git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Due to dependencies

* branch 'mlx5_packet_pacing':
  IB/mlx5: Introduce UAPIs to manage packet pacing
  net/mlx5: Expose raw packet pacing APIs

4 years agoIB/mlx5: Introduce UAPIs to manage packet pacing
Yishai Hadas [Wed, 19 Feb 2020 19:05:18 +0000 (21:05 +0200)]
IB/mlx5: Introduce UAPIs to manage packet pacing

Introduce packet pacing uobject and its alloc and destroy
methods.

This uobject holds mlx5 packet pacing context according to the device
specification and enables managing packet pacing device entries that are
needed by DEVX applications.

Link: https://lore.kernel.org/r/20200219190518.200912-3-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoLinux 5.6-rc5
Linus Torvalds [Mon, 9 Mar 2020 00:44:44 +0000 (17:44 -0700)]
Linux 5.6-rc5

4 years agoMerge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Mon, 9 Mar 2020 00:36:22 +0000 (17:36 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Olof Johansson:
 "We've been accruing these for a couple of weeks, so the batch is a bit
  bigger than usual.

  Largest delta is due to a led-bl driver that is added -- there was a
  miscommunication before the merge window and the driver didn't make it
  in. Due to this, the platforms needing it regressed. At this point, it
  seemed easier to add the new driver than unwind the changes.

  Besides that, there are a handful of various fixes:

   - AMD tee memory leak fix

   - A handful of fixlets for i.MX SCU communication

   - A few maintainers woke up and realized DEBUG_FS had been missing
     for a while, so a few updates of that.

  ... and the usual collection of smaller fixes to various platforms"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (37 commits)
  ARM: socfpga_defconfig: Add back DEBUG_FS
  arm64: dts: socfpga: agilex: Fix gmac compatible
  ARM: bcm2835_defconfig: Explicitly restore CONFIG_DEBUG_FS
  arm64: dts: meson: fix gxm-khadas-vim2 wifi
  arm64: dts: meson-sm1-sei610: add missing interrupt-names
  ARM: meson: Drop unneeded select of COMMON_CLK
  ARM: dts: bcm2711: Add pcie0 alias
  ARM: dts: bcm283x: Add missing properties to the PWR LED
  tee: amdtee: fix memory leak in amdtee_open_session()
  ARM: OMAP2+: Fix compile if CONFIG_HAVE_ARM_SMCCC is not set
  arm: dts: dra76x: Fix mmc3 max-frequency
  ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes
  bus: ti-sysc: Fix 1-wire reset quirk
  ARM: dts: r8a7779: Remove deprecated "renesas, rcar-sata" compatible value
  soc: imx-scu: Align imx sc msg structs to 4
  firmware: imx: Align imx_sc_msg_req_cpu_start to 4
  firmware: imx: scu-pd: Align imx sc msg structs to 4
  firmware: imx: misc: Align imx sc msg structs to 4
  firmware: imx: scu: Ensure sequential TX
  ARM: dts: imx7-colibri: Fix frequency for sd/mmc
  ...

4 years agoMerge tag 'edac_urgent-2020-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 9 Mar 2020 00:33:52 +0000 (17:33 -0700)]
Merge tag 'edac_urgent-2020-03-08' of git://git./linux/kernel/git/ras/ras

Pull EDAC fix from Borislav Petkov:
 "Error reporting fix for synopsys_edac: do not overwrite partial
  decoded error message (Sherry Sun)"

* tag 'edac_urgent-2020-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/synopsys: Do not print an error with back-to-back snprintf() calls

4 years agoMerge tag 'char-misc-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 8 Mar 2020 15:49:44 +0000 (10:49 -0500)]
Merge tag 'char-misc-5.6-rc5' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
 "Here are four small char/misc driver fixes for reported issues for
  5.6-rc5.

  These fixes are:

   - binder fix for a potential use-after-free problem found (took two
     tries to get it right)

   - interconnect core fix

   - altera-stapl driver fix

  All four of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  binder: prevent UAF for binderfs devices II
  interconnect: Handle memory allocation errors
  altera-stapl: altera_get_note: prevent write beyond end of 'key'
  binder: prevent UAF for binderfs devices

4 years agoMerge tag 'driver-core-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Mar 2020 15:39:40 +0000 (10:39 -0500)]
Merge tag 'driver-core-5.6-rc5' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core and debugfs fixes from Greg KH:
 "Here are four small driver core / debugfs patches for 5.6-rc3:

   - debugfs api cleanup now that all debugfs_create_regset32() callers
     have been fixed up. This was waiting until after the -rc1 merge as
     these fixes came in through different trees

   - driver core sync state fixes based on reports of minor issues found
     in the feature

  All of these have been in linux-next with no reported issues"

* tag 'driver-core-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: Skip unnecessary work when device doesn't have sync_state()
  driver core: Add dev_has_sync_state()
  driver core: Call sync_state() even if supplier has no consumers
  debugfs: remove return value of debugfs_create_regset32()

4 years agoMerge tag 'tty-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 8 Mar 2020 15:35:04 +0000 (10:35 -0500)]
Merge tag 'tty-5.6-rc5' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are some small tty/serial fixes for 5.6-rc5

  Just some small serial driver fixes, and a vt core fixup, full details
  are:

   - vt fixes for issues found by syzbot

   - serdev fix for Apple boxes

   - fsl_lpuart serial driver fixes

   - MAINTAINER update for incorrect serial files

   - new device ids for 8250_exar driver

   - mvebu-uart fix

  All of these have been in linux-next with no reported issues"

* tag 'tty-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: serial: fsl_lpuart: free IDs allocated by IDA
  Revert "tty: serial: fsl_lpuart: drop EARLYCON_DECLARE"
  serdev: Fix detection of UART devices on Apple machines.
  MAINTAINERS: Add missed files related to Synopsys DesignWare UART
  serial: 8250_exar: add support for ACCES cards
  tty:serial:mvebu-uart:fix a wrong return
  vt: selection, push sel_lock up
  vt: selection, push console lock down

4 years agoMerge tag 'usb-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 8 Mar 2020 15:32:23 +0000 (10:32 -0500)]
Merge tag 'usb-5.6-rc5' of git://git./linux/kernel/git/gregkh/usb

Pull USB/PHY fixes from Greg KH:
 "Here are some small USB and PHY driver fixes for reported issues for
  5.6-rc5.

  Included in here are:

   - phy driver fixes

   - new USB quirks

   - USB cdns3 gadget driver fixes

   - USB hub core fixes

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: dwc3: gadget: Update chain bit correctly when using sg list
  usb: core: port: do error out if usb_autopm_get_interface() fails
  usb: core: hub: do error out if usb_autopm_get_interface() fails
  usb: core: hub: fix unhandled return by employing a void function
  usb: storage: Add quirk for Samsung Fit flash
  usb: quirks: add NO_LPM quirk for Logitech Screen Share
  usb: usb251xb: fix regulator probe and error handling
  phy: allwinner: Fix GENMASK misuse
  usb: cdns3: gadget: toggle cycle bit before reset endpoint
  usb: cdns3: gadget: link trb should point to next request
  phy: mapphone-mdm6600: Fix timeouts by adding wake-up handling
  phy: brcm-sata: Correct MDIO operations for 40nm platforms
  phy: ti: gmii-sel: do not fail in case of gmii
  phy: ti: gmii-sel: fix set of copy-paste errors
  phy: core: Fix phy_get() to not return error on link creation failure
  phy: mapphone-mdm6600: Fix write timeouts with shorter GPIO toggle interval

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Sun, 8 Mar 2020 01:52:55 +0000 (19:52 -0600)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Nothing particularly exciting, some small ODP regressions from the mmu
  notifier rework, another bunch of syzkaller fixes, and a bug fix for a
  botched syzkaller fix in the first rc pull request.

   - Fix busted syzkaller fix in 'get_new_pps' - this turned out to
     crash on certain HW configurations

   - Bug fixes for various missed things in error unwinds

   - Add a missing rcu_read_lock annotation in hfi/qib

   - Fix two ODP related regressions from the recent mmu notifier
     changes

   - Several more syzkaller bugs in siw, RDMA netlink, verbs and iwcm

   - Revert an old patch in CMA as it is now shown to not be allocating
     port numbers properly"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/iwcm: Fix iwcm work deallocation
  RDMA/siw: Fix failure handling during device creation
  RDMA/nldev: Fix crash when set a QP to a new counter but QPN is missing
  RDMA/odp: Ensure the mm is still alive before creating an implicit child
  RDMA/core: Fix protection fault in ib_mr_pool_destroy
  IB/mlx5: Fix implicit ODP race
  IB/hfi1, qib: Ensure RCU is locked when accessing list
  RDMA/core: Fix pkey and port assignment in get_new_pps
  RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()
  RDMA/rw: Fix error flow during RDMA context initialization
  RDMA/core: Fix use of logical OR in get_new_pps
  Revert "RDMA/cma: Simplify rdma_resolve_addr() error flow"

4 years agonet/mlx5: HW bit for goto chain offload support
Eli Cohen [Tue, 3 Mar 2020 00:15:22 +0000 (16:15 -0800)]
net/mlx5: HW bit for goto chain offload support

Add the HW bit definition indecating goto chain offload support.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Expose link speed directly
Mark Bloch [Tue, 3 Mar 2020 00:15:21 +0000 (16:15 -0800)]
net/mlx5: Expose link speed directly

Expose port rate as part of the port speed register fields.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Introduce TLS and IPSec objects enums
Saeed Mahameed [Tue, 3 Mar 2020 00:15:20 +0000 (16:15 -0800)]
net/mlx5: Introduce TLS and IPSec objects enums

Expose the TLS encryption key general object type enum correctly,
and add the IPSec encryption key general object type enum.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Introduce egress acl forward-to-vport capability
Vu Pham [Tue, 3 Mar 2020 00:15:19 +0000 (16:15 -0800)]
net/mlx5: Introduce egress acl forward-to-vport capability

Add HCA_CAP.egress_acl_forward_to_vport field to check whether HW
supports e-switch vport's egress acl to forward packets to other
e-switch vport or not.

By default E-Switch egress ACL forwards eswitch vports egress packets
to their corresponding NIC/VF vports.

With this cap enabled, the driver is allowed to alter this behavior
and forward packets to arbitrary NIC/VF vports with the following
limitations:

   a. Multiple processing paths are supported if all of the following
      conditions are met:
      - HCA_CAP.egress_acl_forward_to_vport is set ==1.
      - A destination of type Flow Table only appears once, as the
        last destination in the list.
      - Vport destination is supported if
        HCA_CAP.egress_acl_forward_to_vport==1. Vport must not be
        the Uplink.
   b. Flow_tag not supported.
   c. This table is only applicable after an FDB table is created.
   d. Push VLAN action is not supported.
   e. Pop VLAN action cannot be added concurrently to this table and
      FDB table.

This feature will be used during port failover in bonding scenario
where two VFs representors are bonded to handle failover egress traffic
(VM's ingress/receive traffic).

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoMerge tag 'io_uring-5.6-2020-03-07' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 7 Mar 2020 20:20:29 +0000 (14:20 -0600)]
Merge tag 'io_uring-5.6-2020-03-07' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "Here are a few io_uring fixes that should go into this release. This
  contains:

   - Removal of (now) unused io_wq_flush() and associated flag (Pavel)

   - Fix cancelation lockup with linked timeouts (Pavel)

   - Fix for potential use-after-free when freeing percpu ref for fixed
     file sets

   - io-wq cancelation fixups (Pavel)"

* tag 'io_uring-5.6-2020-03-07' of git://git.kernel.dk/linux-block:
  io_uring: fix lockup with timeouts
  io_uring: free fixed_file_data after RCU grace period
  io-wq: remove io_wq_flush and IO_WQ_WORK_INTERNAL
  io-wq: fix IO_WQ_WORK_NO_CANCEL cancellation

4 years agoMerge tag 'block-5.6-2020-03-07' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 7 Mar 2020 20:14:38 +0000 (14:14 -0600)]
Merge tag 'block-5.6-2020-03-07' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Here are a few fixes that should go into this release. This contains:

   - Revert of a bad bcache patch from this merge window

   - Removed unused function (Daniel)

   - Fixup for the blktrace fix from Jan from this release (Cengiz)

   - Fix of deeper level bfqq overwrite in BFQ (Carlo)"

* tag 'block-5.6-2020-03-07' of git://git.kernel.dk/linux-block:
  block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group()
  blktrace: fix dereference after null check
  Revert "bcache: ignore pending signals when creating gc and allocator thread"
  block: Remove used kblockd_schedule_work_on()

4 years agoMerge tag 'media/v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Sat, 7 Mar 2020 18:00:13 +0000 (12:00 -0600)]
Merge tag 'media/v5.6-2' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - a fix for the media controller links in both hantro driver and in
   v4l2-mem2mem core

 - some fixes for the pulse8-cec driver

 - vicodec: handle alpha channel for RGB32 formats, as it may be used

 - mc-entity.c: fix handling of pad flags

* tag 'media/v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: hantro: Fix broken media controller links
  media: mc-entity.c: use & to check pad flags, not ==
  media: v4l2-mem2mem.c: fix broken links
  media: vicodec: process all 4 components for RGB32 formats
  media: pulse8-cec: close serio in disconnect, not adap_free
  media: pulse8-cec: INIT_DELAYED_WORK was called too late

4 years agoio_uring: fix lockup with timeouts
Pavel Begunkov [Fri, 6 Mar 2020 22:15:22 +0000 (01:15 +0300)]
io_uring: fix lockup with timeouts

There is a recipe to deadlock the kernel: submit a timeout sqe with a
linked_timeout (e.g.  test_single_link_timeout_ception() from liburing),
and SIGKILL the process.

Then, io_kill_timeouts() takes @ctx->completion_lock, but the timeout
isn't flagged with REQ_F_COMP_LOCKED, and will try to double grab it
during io_put_free() to cancel the linked timeout. Probably, the same
can happen with another io_kill_timeout() call site, that is
io_commit_cqring().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoMerge tag 's390-5.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sat, 7 Mar 2020 14:12:47 +0000 (08:12 -0600)]
Merge tag 's390-5.6-5' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Fix panic in gup_fast on large pud by providing an implementation of
   pud_write. This has been overlooked during migration to common gup
   code.

 - Fix unexpected write combining on PCI stores.

* tag 's390-5.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: Fix unexpected write combine on resource
  s390/mm: fix panic in gup_fast on large pud

4 years agoMerge tag 'powerpc-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 7 Mar 2020 14:10:34 +0000 (08:10 -0600)]
Merge tag 'powerpc-5.6-4' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Some more powerpc fixes for 5.6:

   - One fix for a recent regression to our breakpoint/watchpoint code.

   - Another fix for our KUAP support, this time a missing annotation in
     a rarely used path in signal handling.

   - A fix for our handling of a CPU feature that effects the PMU, when
     booting guests in some configurations.

   - A minor fix to our linker script to explicitly include the .BTF
     section.

  Thanks to: Christophe Leroy, Desnes A. Nunes do Rosario, Leonardo
  Bras, Naveen N. Rao, Ravi Bangoria, Stefan Berger"

* tag 'powerpc-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Fix missing KUAP disable in flush_coherent_icache()
  powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems
  powerpc: Include .BTF section
  powerpc/watchpoint: Don't call dar_within_range() for Book3S

4 years agoMerge tag 'for-linus-5.6b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 7 Mar 2020 14:04:54 +0000 (08:04 -0600)]
Merge tag 'for-linus-5.6b-rc5-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Four fixes and a small cleanup patch:

   - two fixes by Dongli Zhang fixing races in the xenbus driver

   - two fixes by me fixing issues introduced in 5.6

   - a small cleanup by Gustavo Silva replacing a zero-length array with
     a flexible-array"

* tag 'for-linus-5.6b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/blkfront: fix ring info addressing
  xen/xenbus: fix locking
  xenbus: req->err should be updated before req->state
  xenbus: req->body should be updated before req->state
  xen: Replace zero-length array with flexible-array member

4 years agoMerge tag 'for-linus-2020-03-07' of gitolite.kernel.org:pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 7 Mar 2020 14:01:43 +0000 (08:01 -0600)]
Merge tag 'for-linus-2020-03-07' of gitolite.pub/scm/linux/kernel/git/brauner/linux

Pull thread fixes from Christian Brauner:
 "Here are a few hopefully uncontroversial fixes:

   - Use RCU_INIT_POINTER() when initializing rcu protected members in
     task_struct to fix sparse warnings.

   - Add pidfd_fdinfo_test binary to .gitignore file"

* tag 'for-linus-2020-03-07' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
  selftests: pidfd: Add pidfd_fdinfo_test in .gitignore
  exit: Fix Sparse errors and warnings
  fork: Use RCU_INIT_POINTER() instead of rcu_access_pointer()

4 years agoMerge tag 'sound-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Sat, 7 Mar 2020 13:59:30 +0000 (07:59 -0600)]
Merge tag 'sound-5.6-rc5' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "The regular "bump-in-the-middle" updates, containing mostly ASoC-
  related fixes at this time. All changes are reasonably small.

  A few entries are for ASoC and ALSA core parts (DAPM, PCM, topology)
  for followups of the recent changes and potential buffer overflow by
  snprintf(), while the rest are (both new and old) device-specific
  fixes for Intel, meson, tas2562, rt1015, as well as the usual HD-audio
  quirks"

* tag 'sound-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (25 commits)
  ALSA: sgio2audio: Remove usage of dropped hw_params/hw_free functions
  ALSA: hda/realtek - Enable the headset of ASUS B9450FA with ALC294
  ALSA: hda/realtek - Fix silent output on Gigabyte X570 Aorus Master
  ALSA: hda/realtek - Add Headset Button supported for ThinkPad X1
  ALSA: hda/realtek - Add Headset Mic supported
  ASoC: wm8741: Fix typo in Kconfig prompt
  ASoC: stm32: sai: manage rebind issue
  ASoC: SOF: Fix snd_sof_ipc_stream_posn()
  ASoC: rt1015: modify pre-divider for sysclk
  ASoC: rt1015: add operation callback function for rt1015_dai[]
  ASoC: soc-component: tidyup snd_soc_pcm_component_sync_stop()
  ASoC: dapm: Correct DAPM handling of active widgets during shutdown
  ASoC: tas2562: Fix sample rate error message
  ASoC: Intel: Skylake: Fix available clock counter incrementation
  ASoC: soc-pcm/soc-compress: don't use snd_soc_dapm_stream_stop()
  ASoC: meson: g12a: add tohdmitx reset
  ASoC: pcm512x: Fix unbalanced regulator enable call in probe error path
  ASoC: soc-core: fix for_rtd_codec_dai_rollback() macro
  ASoC: topology: Fix memleak in soc_tplg_manifest_load()
  ASoC: topology: Fix memleak in soc_tplg_link_elems_load()
  ...

4 years agoMerge tag 'asoc-fix-v5.6-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git...
Takashi Iwai [Sat, 7 Mar 2020 06:24:36 +0000 (07:24 +0100)]
Merge tag 'asoc-fix-v5.6-rc4' of https://git./linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.6

More fixes that have arrived since the merge window, spread out all
over.  There's a few things like the operation callback addition for
rt1015 and the meson reset addition which add small new bits of
functionality to fix non-working systems, they're all very small and for
parts of newly added functionality.

4 years agoMerge tag 'linux-kselftest-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 6 Mar 2020 23:03:37 +0000 (17:03 -0600)]
Merge tag 'linux-kselftest-5.6-rc5' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kselftest update from Shuah Khan:
 "This consists of a cleanup patch to undo changes to global .gitignore
  that added selftests/lkdtm objects and add them to a local
  selftests/lkdtm/.gitignore.

  Summary of Linus's comments on local vs. global gitignore scope:

   - Keep local gitignore patterns in local files.

   - Put only global gitignore patterns in the top-level gitignore file.

  Local scope keeps things much better separated. It also incidentally
  means that if a directory gets renamed, the gitignore file continues
  to work unless in the case of renaming the actual files themselves
  that are named in the gitignore"

* tag 'linux-kselftest-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftest/lkdtm: Use local .gitignore

4 years agoMerge tag 'riscv-for-linus-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 6 Mar 2020 22:38:33 +0000 (16:38 -0600)]
Merge tag 'riscv-for-linus-5.6-rc5' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "This contains a handful of fixes that I would like to target for 5.6:

   - A pair of fixes to module loading, which we hope solve the last of
     the issues with module text being loaded too sparsely for our call
     relocations.

   - A Kconfig fix that disallows selecting memory models not supported
     by NOMMU.

   - A series of Kconfig updates to ease selecting the drivers necessary
     to run on QEMU's virt platform.

   - DTS updates for SiFive's HiFive Unleashed.

   - A fix to our seccomp support that avoids mangling restartable
     syscalls"

* tag 'riscv-for-linus-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: fix seccomp reject syscall code path
  riscv: dts: Add GPIO reboot method to HiFive Unleashed DTS file
  RISC-V: Select Goldfish RTC driver for QEMU virt machine
  RISC-V: Select SYSCON Reboot and Poweroff for QEMU virt machine
  RISC-V: Enable QEMU virt machine support in defconfigs
  RISC-V: Add kconfig option for QEMU virt machine
  riscv: Fix range looking for kernel image memblock
  riscv: Force flat memory model with no-mmu
  riscv: Change code model of module to medany to improve data accessing
  riscv: avoid the PIC offset of static percpu data in module beyond 2G limits

4 years agoparse-maintainers: Mark as executable
Jonathan Neuschäfer [Fri, 6 Mar 2020 22:13:11 +0000 (23:13 +0100)]
parse-maintainers: Mark as executable

This makes the script more convenient to run.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'devicetree-fixes-for-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 6 Mar 2020 22:11:34 +0000 (16:11 -0600)]
Merge tag 'devicetree-fixes-for-5.6-3' of git://git./linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:
 "Another batch of DT fixes. I think this should be the last of it, but
  sending pull requests seems to cause people to send more fixes.

  Summary:

   - Fixes for warnings introduced by hierarchical PSCI binding changes

   - Fixes for broken doc references due to DT schema conversions

   - Several grammar and typo fixes

   - Fix a bunch of dtc warnings in examples"

* tag 'devicetree-fixes-for-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: arm: Fixup the DT bindings for hierarchical PSCI states
  dt-bindings: power: Extend nodename pattern for power-domain providers
  MAINTAINERS: update ALLWINNER CPUFREQ DRIVER entry
  dt-bindings: bus: Drop empty compatible string in example
  dt-bindings: power: Convert domain-idle-states bindings to json-schema
  dt-bindings: arm: Fix cpu compatibles in the hierarchical example for PSCI
  dt-bindings: arm: Correct links to idle states definitions
  dt-bindings: mfd: Fix typo in file name of twl-familly.txt
  dt-bindings: mfd: tps65910: Improve grammar
  dt-bindings: mfd: zii,rave-sp: Fix a typo ("onborad")
  dt-bindings: arm: fsl: fix APF6Dev compatible
  dt-bindings: Fix dtc warnings in examples
  docs: dt: fix several broken doc references
  docs: dt: fix several broken references due to renames
  MAINTAINERS: clean up PCIE DRIVER FOR CAVIUM THUNDERX

4 years agoMerge tag 'drm-fixes-2020-03-06-1' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 6 Mar 2020 22:08:48 +0000 (16:08 -0600)]
Merge tag 'drm-fixes-2020-03-06-1' of git://anongit.freedesktop.org/drm/drm

Pull vgacon fix from Daniel Vetter:
 "One vgacon input check for stable"

* tag 'drm-fixes-2020-03-06-1' of git://anongit.freedesktop.org/drm/drm:
  vgacon: Fix a UAF in vgacon_invert_region