Xi Wang [Tue, 6 Apr 2021 13:25:14 +0000 (21:25 +0800)]
RDMA/hns: Remove duplicated hem page size config code
Remove duplicated code for setting hem page size in PF and VF.
Link: https://lore.kernel.org/r/1617715514-29039-7-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wei Xu [Tue, 6 Apr 2021 13:25:13 +0000 (21:25 +0800)]
RDMA/hns: Enable RoCE on virtual functions
Introduce the VF support by adding code changes to allow VF PCI device
initialization, assgining the reserved resource of the PF to the active
VFs, setting the default abilities, applying the interruptions, resetting
and reducing the default QP/GID number to aovid exceeding the hardware
limitation.
Link: https://lore.kernel.org/r/1617715514-29039-6-git-send-email-liweihang@huawei.com
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Shengming Shu <shushengming1@huawei.com>
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wei Xu [Tue, 6 Apr 2021 13:25:12 +0000 (21:25 +0800)]
RDMA/hns: Set parameters of all the functions belong to a PF
Switch parameters of all functions belong to a PF should be set including
VFs.
Link: https://lore.kernel.org/r/1617715514-29039-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Shengming Shu <shushengming1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wei Xu [Tue, 6 Apr 2021 13:25:11 +0000 (21:25 +0800)]
RDMA/hns: Reserve the resource for the VFs
Query the resource including EQC/SMAC/SGID from the firmware in the PF and
distribute fairly among all the functions belong to the PF.
Link: https://lore.kernel.org/r/1617715514-29039-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Shengming Shu <shushengming1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wei Xu [Tue, 6 Apr 2021 13:25:10 +0000 (21:25 +0800)]
RDMA/hns: Query the number of functions supported by the PF
Query how many functions are supported by the PF from the FW and store it
in the hns_roce_dev structure which will be used to support the
configuration of virtual functions.
Link: https://lore.kernel.org/r/1617715514-29039-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Shengming Shu <shushengming1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Xi Wang [Tue, 6 Apr 2021 13:25:09 +0000 (21:25 +0800)]
RDMA/hns: Simplify function's resource related command
Use hr_reg_write/read() to simplify codes about configuring function's
resource. And because the design of PF/VF fields is same, they can be
defined only once.
Link: https://lore.kernel.org/r/1617715514-29039-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Tue, 6 Apr 2021 12:36:39 +0000 (14:36 +0200)]
RDMA/rtrs-clt: Simplify error message
Two error messages are only different message but have common
code to generate the path string.
Link: https://lore.kernel.org/r/20210406123639.202899-4-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Tue, 6 Apr 2021 12:36:38 +0000 (14:36 +0200)]
RDMA/rtrs-srv: More debugging info when fail to send reply
It does not help to debug if it only print error message
without any debugging information which session and connection
the error happened.
Link: https://lore.kernel.org/r/20210406123639.202899-3-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Tue, 6 Apr 2021 12:36:37 +0000 (14:36 +0200)]
RDMA/rtrs-clt: Print more info when an error happens
Client prints only error value and it is not enough for debugging.
1. When client receives an error from server: the client does not only
print the error value but also more information of server connection.
2. When client failes to send IO: the client gets an error from RDMA
layer. It also print more information of server connection.
Link: https://lore.kernel.org/r/20210406123639.202899-2-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Wed, 7 Apr 2021 11:34:44 +0000 (13:34 +0200)]
Documentation/ABI/rtrs-clt: Add descriptions for min-latency policy
Describe new multipath policy min-latency of the RTRS client.
Link: https://lore.kernel.org/r/20210407113444.150961-5-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Wed, 7 Apr 2021 11:34:42 +0000 (13:34 +0200)]
RDMA/rtrs-clt: New sysfs attribute to print the latency of each path
It shows the latest latency that the client checked when sending the
heart-beat.
Link: https://lore.kernel.org/r/20210407113444.150961-3-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Wed, 7 Apr 2021 11:34:41 +0000 (13:34 +0200)]
RDMA/rtrs-clt: Add a minimum latency multipath policy
This patch adds new multipath policy: min-latency. Client checks the
latency of each path when it sends the heart-beat. And it sends IO to the
path with the minimum latency.
Link: https://lore.kernel.org/r/20210407113444.150961-2-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Jason Gunthorpe [Tue, 13 Apr 2021 22:37:17 +0000 (19:37 -0300)]
Merge branch 'mlx5_memic_ops' of git://git./linux/kernel/git/mellanox/linux
Maor Gottlieb says:
====================
This series from Maor extends MEMIC to support atomic operations from the
host in addition to already supported regular read/write.
====================
* 'memic_ops':
RDMA/mlx5: Expose UAPI to query DM
RDMA/mlx5: Add support in MEMIC operations
RDMA/mlx5: Add support to MODIFY_MEMIC command
RDMA/mlx5: Re-organize the DM code
RDMA/mlx5: Move all DM logic to separate file
RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number
net/mlx5: Add MEMIC operations related bits
Maor Gottlieb [Sun, 11 Apr 2021 12:29:24 +0000 (15:29 +0300)]
RDMA/mlx5: Expose UAPI to query DM
Expose UAPI to query MEMIC DM, this will let user space application
that didn't allocate the DM but has access to by owning the matching
command FD to retrieve its information.
Link: https://lore.kernel.org/r/20210411122924.60230-8-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Maor Gottlieb [Sun, 11 Apr 2021 12:29:23 +0000 (15:29 +0300)]
RDMA/mlx5: Add support in MEMIC operations
MEMIC buffer, in addition to regular read and write operations, can
support atomic operations from the host.
Introduce and implement new UAPI to allocate address space for MEMIC
operations such as atomic. This includes:
1. Expose new IOCTL for request mapping of MEMIC operation.
2. Hold the operations address in a list, so same operation to same DM
will be allocated only once.
3. Manage refcount on the mlx5_ib_dm object, so it would be keep valid
until all addresses were unmapped.
Link: https://lore.kernel.org/r/20210411122924.60230-7-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Maor Gottlieb [Sun, 11 Apr 2021 12:29:22 +0000 (15:29 +0300)]
RDMA/mlx5: Add support to MODIFY_MEMIC command
Add two functions to allocate and deallocate MEMIC operations
by using the MODIFY_MEMIC command.
Link: https://lore.kernel.org/r/20210411122924.60230-6-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Maor Gottlieb [Sun, 11 Apr 2021 12:29:21 +0000 (15:29 +0300)]
RDMA/mlx5: Re-organize the DM code
1. Inline the checks from check_dm_type_support() into their
respective allocation functions.
2. Fix use after free when driver fails to copy the MEMIC address to the
user by moving the allocation code into their respective functions,
hence we avoid the explicit call to free the DM in the error flow.
3. Split mlx5_ib_dm struct to memic and icm proper typesafety
throughout.
Fixes:
dc2316eba73f ("IB/mlx5: Fix device memory flows")
Link: https://lore.kernel.org/r/20210411122924.60230-5-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Maor Gottlieb [Sun, 11 Apr 2021 12:29:20 +0000 (15:29 +0300)]
RDMA/mlx5: Move all DM logic to separate file
Move all device memory related code to a separate file.
Link: https://lore.kernel.org/r/20210411122924.60230-4-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Maor Gottlieb [Sun, 11 Apr 2021 12:29:19 +0000 (15:29 +0300)]
RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number
In order to support multiple methods declaration in the same file we
should use the line number as part of the name.
Link: https://lore.kernel.org/r/20210411122924.60230-3-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Maor Gottlieb [Sun, 11 Apr 2021 12:29:18 +0000 (15:29 +0300)]
net/mlx5: Add MEMIC operations related bits
Add the MEMIC operations bits and structures to the mlx5_ifc file.
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:16 +0000 (09:54 -0400)]
IB/hfi1: Rework AIP and VNIC dummy netdev usage
All other users of the dummy netdevice embed the netdev in other
structures:
init_dummy_netdev(&mal->dummy_dev);
init_dummy_netdev(ð->dummy_dev);
init_dummy_netdev(&ar->napi_dev);
init_dummy_netdev(&irq_grp->napi_ndev);
init_dummy_netdev(&wil->napi_ndev);
init_dummy_netdev(&trans_pcie->napi_dev);
init_dummy_netdev(&dev->napi_dev);
init_dummy_netdev(&bus->mux_dev);
The AIP and VNIC implementation turns that model inside out and used a
kfree() to free what appears to be a netdev struct when in reality, it is
a struct that enbodies the rx state as well as the dummy netdev used to
support napi_poll across disparate receive contexts. The relationship is
infered by the odd allocation:
const int netdev_size = sizeof(*dd->dummy_netdev) +
sizeof(struct hfi1_netdev_priv);
<snip>
dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node);
Correct the issue by:
- Correctly naming the alloc and free functions
- Renaming hfi1_netdev_priv to hfi1_netdev_rx
- Replacing dd dummy_netdev with a netdev_rx pointer
- Embedding the net_device in hfi1_netdev_rx
- Moving the init_dummy_netdev to the alloc routine
- Adjusting wrappers to fit the new model
Fixes:
6991abcb993c ("IB/hfi1: Add functions to receive accelerated ipoib packets")
Link: https://lore.kernel.org/r/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Jiapeng Chong [Tue, 13 Apr 2021 09:11:03 +0000 (17:11 +0800)]
RDMA/qib: Remove useless qib_read_ureg() function
Fix the following clang warning:
drivers/infiniband/hw/qib/qib_iba7322.c:803:19: warning: unused function 'qib_read_ureg' [-Wunused-function].
Link: https://lore.kernel.org/r/1618305063-29007-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yixian Liu [Tue, 13 Apr 2021 09:11:27 +0000 (17:11 +0800)]
RDMA/hns: Remove unnecessary flush operation for workqueue
As a flush operation is implemented inside destroy_workqueue(), there is
no need to do flush operation before.
Fixes:
bfcc681bd09d ("IB/hns: Fix the bug when free mr")
Fixes:
0425e3e6e0c7 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Link: https://lore.kernel.org/r/1618305087-30799-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Mon, 12 Apr 2021 08:40:02 +0000 (10:40 +0200)]
RDMA/rtrs-clt: destroy sysfs after removing session from active list
A session can be removed dynamically by sysfs interface "remove_path" that
eventually calls rtrs_clt_remove_path_from_sysfs function. The current
rtrs_clt_remove_path_from_sysfs first removes the sysfs interfaces and
frees sess->stats object. Second it removes the session from the active
list.
Therefore some functions could access non-connected session and access the
freed sess->stats object even-if they check the session status before
accessing the session.
For instance rtrs_clt_request and get_next_path_min_inflight check the
session status and try to send IO to the session. The session status
could be changed when they are trying to send IO but they could not catch
the change and update the statistics information in sess->stats object,
and generate use-after-free problem.
(see: "RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its
stats")
This patch changes the rtrs_clt_remove_path_from_sysfs to remove the
session from the active session list and then destroy the sysfs
interfaces.
Each function still should check the session status because closing or
error recovery paths can change the status.
Fixes:
6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210412084002.33582-1-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wang Wensheng [Thu, 8 Apr 2021 11:31:32 +0000 (11:31 +0000)]
RDMA/srpt: Fix error return code in srpt_cm_req_recv()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes:
db7683d7deb2 ("IB/srpt: Fix login-related race conditions")
Link: https://lore.kernel.org/r/20210408113132.87250-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Håkon Bugge [Wed, 31 Mar 2021 18:43:14 +0000 (20:43 +0200)]
rds: ib: Remove two ib_modify_qp() calls
For some HCAs, ib_modify_qp() is an expensive operation running
virtualized.
For both the active and passive side, the QP returned by the CM has the
state set to RTS, so no need for this excess RTS -> RTS transition. With
IB Core's ability to set the RNR Retry timer, we use this interface to
shave off another ib_modify_qp().
Fixes:
ec16227e1414 ("RDS/IB: Infiniband transport")
Link: https://lore.kernel.org/r/1617216194-12890-3-git-send-email-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Håkon Bugge [Wed, 31 Mar 2021 18:43:13 +0000 (20:43 +0200)]
IB/cma: Introduce rdma_set_min_rnr_timer()
Introduce the ability for kernel ULPs to adjust the minimum RNR Retry
timer. The INIT -> RTR transition executed by RDMA CM will be used for
this adjustment. This avoids an additional ib_modify_qp() call.
rdma_set_min_rnr_timer() must be called before the call to rdma_connect()
on the active side and before the call to rdma_accept() on the passive
side.
The default value of RNR Retry timer is zero, which translates to 655
ms. When the receiver is not ready to accept a send messages, it encodes
the RNR Retry timer value in the NAK. The requestor will then wait at
least the specified time value before retrying the send.
The 5-bit value to be supplied to the rdma_set_min_rnr_timer() is
documented in IBTA Table 45: "Encoding for RNR NAK Timer Field".
Link: https://lore.kernel.org/r/1617216194-12890-2-git-send-email-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Ye Bin [Fri, 9 Apr 2021 09:51:47 +0000 (17:51 +0800)]
RDMA/i40iw: Use DEFINE_SPINLOCK() for spinlock
spinlock can be initialized automatically with DEFINE_SPINLOCK() rather
than explicitly calling spin_lock_init().
Link: https://lore.kernel.org/r/20210409095147.2294269-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wang Wensheng [Thu, 8 Apr 2021 11:31:37 +0000 (11:31 +0000)]
RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes:
1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/20210408113137.97202-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wang Wensheng [Thu, 8 Apr 2021 11:31:40 +0000 (11:31 +0000)]
IB/hfi1: Fix error return code in parse_platform_config()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes:
7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210408113140.103032-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wang Wensheng [Thu, 8 Apr 2021 11:31:35 +0000 (11:31 +0000)]
RDMA/qedr: Fix error return code in qedr_iw_connect()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes:
82af6d19d8d9 ("RDMA/qedr: Fix synchronization methods and memory leaks in qedr")
Link: https://lore.kernel.org/r/20210408113135.92165-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Wed, 7 Apr 2021 08:15:52 +0000 (16:15 +0800)]
RDMA/core: Correct format of block comments
Block comments should not use a trailing */ on a separate line and every
line of a block comment should start with an '*'.
Link: https://lore.kernel.org/r/1617783353-48249-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Wed, 7 Apr 2021 08:15:51 +0000 (16:15 +0800)]
RDMA/core: Correct format of braces
Do following cleanups about braces:
- Add the necessary braces to maintain context alignment.
- Fix the open '{' that is not on the same line as "switch".
- Remove braces that are not necessary for single statement blocks.
- Fix "else" that doesn't follow close brace '}'.
Link: https://lore.kernel.org/r/1617783353-48249-6-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Wed, 7 Apr 2021 08:15:50 +0000 (16:15 +0800)]
RDMA/core: Remove redundant spaces
Space is not required after '(', before ')', before ',' and between '*'
and symbol name of a definition.
Link: https://lore.kernel.org/r/1617783353-48249-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Wed, 7 Apr 2021 08:15:49 +0000 (16:15 +0800)]
RDMA/core: Add necessary spaces
Space is required before '(' of switch statements and around '='.
Link: https://lore.kernel.org/r/1617783353-48249-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Wed, 7 Apr 2021 08:15:48 +0000 (16:15 +0800)]
RDMA/core: Remove the redundant return statements
The return statements at the end of a void function is meaningless.
Link: https://lore.kernel.org/r/1617783353-48249-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Wed, 7 Apr 2021 08:15:47 +0000 (16:15 +0800)]
RDMA/core: Print the function name by __func__ instead of an fixed string
It's better to use __func__ than a fixed string to print a function's
name.
Link: https://lore.kernel.org/r/1617783353-48249-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Jason Gunthorpe [Mon, 12 Apr 2021 16:49:48 +0000 (13:49 -0300)]
Merge branch 'mlx5-next' of git://git./linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
This pr contains changes from mlx5-next branch,
already reviewed on netdev and rdma mailing lists, links below.
1) From Leon, Dynamically assign MSI-X vectors count
Already Acked by Bjorn Helgaas.
https://patchwork.kernel.org/project/netdevbpf/cover/
20210314124256.70253-1-leon@kernel.org/
2) Cleanup series:
https://patchwork.kernel.org/project/netdevbpf/cover/
20210311070915.321814-1-saeed@kernel.org/
From Mark, E-Switch cleanups and refactoring, and the addition
of single FDB mode needed HW bits.
From Mikhael, Remove unused struct field
From Saeed, Cleanup W=1 prototype warning
From Zheng, Esw related cleanup
From Tariq, User order-0 page allocation for EQs
====================
* mlx5-next:
net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
net/mlx5: Dynamically assign MSI-X vectors count
net/mlx5: Add dynamic MSI-X capabilities bits
PCI/IOV: Add sysfs MSI-X vector assignment interface
net/mlx5: Use order-0 allocations for EQs
net/mlx5: Add IFC bits needed for single FDB mode
net/mlx5: E-Switch, Refactor send to vport to be more generic
RDMA/mlx5: Use representor E-Switch when getting netdev and metadata
net/mlx5: E-Switch, Add eswitch pointer to each representor
net/mlx5: E-Switch, Add match on vhca id to default send rules
net/mlx5: Remove unused mlx5_core_health member recover_work
net/mlx5: simplify the return expression of mlx5_esw_offloads_pair()
net/mlx5: Cleanup prototype warning
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Lang Cheng [Fri, 2 Apr 2021 09:07:34 +0000 (17:07 +0800)]
RDMA/hns: Prevent le32 from being implicitly converted to u32
Replace BUILD_BUG_ON_ZERO() with BUILD_BUG_ON() to avoid sparse
complaining "restricted __le32 degrades to integer".
Link: https://lore.kernel.org/r/1617354454-47840-10-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yixing Liu [Fri, 2 Apr 2021 09:07:33 +0000 (17:07 +0800)]
RDMA/hns: Simplify the function config_eqc()
Use "hr_reg_write" replace "roce_set_filed".
Link: https://lore.kernel.org/r/1617354454-47840-9-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Fri, 2 Apr 2021 09:07:32 +0000 (17:07 +0800)]
RDMA/hns: Add XRC subtype in QPC and XRC type in SRQC
A field to distuiguish basic SRQ from XRC SRQ in SRQC and a field in QPC
to determine whether a QP is XRC TGT QP or XRC INI QP are missing.
Fixes:
32548870d438 ("RDMA/hns: Add support for XRC on HIP09")
Link: https://lore.kernel.org/r/1617354454-47840-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Fri, 2 Apr 2021 09:07:31 +0000 (17:07 +0800)]
RDMA/hns: Remove unsupported QP types
The hns ROCEE does not support UC QP currently.
Link: https://lore.kernel.org/r/1617354454-47840-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yangyang Li [Fri, 2 Apr 2021 09:07:30 +0000 (17:07 +0800)]
RDMA/hns: Delete unused members in the structure hns_roce_hw
Some structure members in hns_roce_hw have never been used and need to be
deleted.
Fixes:
9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Fixes:
b156269d88e4 ("RDMA/hns: Add modify CQ support for hip08")
Fixes:
c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1617354454-47840-6-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Fri, 2 Apr 2021 09:07:29 +0000 (17:07 +0800)]
RDMA/hns: Delete redundant abnormal interrupt status
The hardware supports only two types of abnormal interrupts.
Fixes:
a5073d6054f7 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1617354454-47840-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yangyang Li [Fri, 2 Apr 2021 09:07:28 +0000 (17:07 +0800)]
RDMA/hns: Delete redundant condition judgment related to eq
The register value related to the eq interrupt depends only on
enable_flag, so the redundant condition judgment is deleted.
Fixes:
a5073d6054f7 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1617354454-47840-4-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Weihang Li [Fri, 2 Apr 2021 09:07:27 +0000 (17:07 +0800)]
RDMA/hns: Fix missing assignment of max_inline_data
When querying QP, the ULPs should be informed of the max length of inline
data supported by the hardware.
Fixes:
30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC")
Link: https://lore.kernel.org/r/1617354454-47840-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Weihang Li [Fri, 2 Apr 2021 09:07:26 +0000 (17:07 +0800)]
RDMA/hns: Avoid enabling RQ inline on UD
RQ inline is not supported on UD/GSI QP, it should be disabled in QPC.
Link: https://lore.kernel.org/r/1617354454-47840-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wenpeng Liang [Thu, 1 Apr 2021 07:32:21 +0000 (15:32 +0800)]
RDMA/hns: Modify prints for mailbox and command queue
Use ratelimited print in mbox and cmq. And print mailbox operation if
mailbox fails because it's useful information for the user.
Link: https://lore.kernel.org/r/1617262341-37571-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Lang Cheng [Thu, 1 Apr 2021 07:32:20 +0000 (15:32 +0800)]
RDMA/hns: Support more return types of command queue
Add error code definition according to the return code from firmware to
help find out more detailed reasons why a command fails to be sent.
Link: https://lore.kernel.org/r/1617262341-37571-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Lang Cheng [Thu, 1 Apr 2021 07:32:19 +0000 (15:32 +0800)]
RDMA/hns: Enable all CMDQ context
Fix error of cmd's context number calculation algorithm to enable all of
32 cmd entries and support 32 concurrent accesses.
Link: https://lore.kernel.org/r/1617262341-37571-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Bob Pearson [Fri, 2 Apr 2021 00:10:17 +0000 (19:10 -0500)]
RDMA/rxe: Fix missing acks from responder
All responder errors from request packets that do not consume a receive
WQE fail to generate acks for RC QPs. This patch corrects this behavior
by making the flow follow the same path as request packets that do consume
a WQE after the completion.
Link: https://lore.kernel.org/r/20210402001016.3210-1-rpearson@hpe.com
Link: https://lore.kernel.org/linux-rdma/1a7286ac-bcea-40fb-2267-480134dd301b@gmail.com/
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yixian Liu [Tue, 6 Apr 2021 08:46:12 +0000 (16:46 +0800)]
RDMA/core: Make the wc status prompt message clearer
Local invalidate is also a kind of memory management operation, not only
memory bind operation. Furthermore, as invalidate operations include local
and remote, add prefix to the prompt message to make it clearer.
Link: https://lore.kernel.org/r/1617698772-13871-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wei Yongjun [Wed, 7 Apr 2021 15:49:00 +0000 (15:49 +0000)]
RDMA/hns: Use GFP_ATOMIC under spin lock
A spin lock is taken here so we should use GFP_ATOMIC.
Fixes:
f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Link: https://lore.kernel.org/r/20210407154900.3486268-1-weiyongjun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Praveen Kumar Kannoju [Sat, 3 Apr 2021 04:53:55 +0000 (04:53 +0000)]
IB/mlx5: Reduce max order of memory allocated for xlt update
To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to
1 MB (order-8) memory, depending on the size of the MR. This costly
allocation can sometimes take very long to return (a few seconds). This
causes the calling application to hang for a long time, especially when
the system is fragmented. To avoid these long latency spikes, the calls
the higher order allocations need to fail faster in case they are not
available.
In order to acheive this we need __GFP_NORETRY flag in the gfp_mask before
during fetching the free pages. Allow the algorithm to automatically fall
back to smaller page sizes.
Link: https://lore.kernel.org/r/1617425635-35631-1-git-send-email-praveen.kannoju@oracle.com
Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Kaike Wan [Mon, 29 Mar 2021 14:06:31 +0000 (10:06 -0400)]
IB/hfi1: Remove unused function
Remove the unused function sdma_iowait_schedule().
Fixes:
7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/1617026791-89997-1-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:14 +0000 (09:54 -0400)]
IB/hfi1: Use kzalloc() for mmu_rb_handler allocation
The code currently assumes that the mmu_notifier struct
embedded in mmu_rb_handler only contains two fields.
There are now extra fields:
struct mmu_notifier {
struct hlist_node hlist;
const struct mmu_notifier_ops *ops;
struct mm_struct *mm;
struct rcu_head rcu;
unsigned int users;
};
Given that there in no init for the mmu_notifier, a kzalloc() should
be used to insure that any newly added fields are given a predictable
initial value of zero.
Fixes:
06e0ffa69312 ("IB/hfi1: Re-factor MMU notification code")
Link: https://lore.kernel.org/r/1617026056-50483-9-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:13 +0000 (09:54 -0400)]
IB/hfi1: Add additional usdma traces
Add traces that were vital in isolating an issue with pq waitlist in
commit
fa8dac396863 ("IB/hfi1: Fix another case where pq is left on
waitlist")
Link: https://lore.kernel.org/r/1617026056-50483-8-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:11 +0000 (09:54 -0400)]
IB/hfi1: Remove indirect call to hfi1_ipoib_send_dma()
hfi1_ipoib_send() directly calls hfi1_ipoib_send_dma() with no value add.
Fix by renaming hfi1_ipoib_send_dma() to hfi1_ipoib_send().
Link: https://lore.kernel.org/r/1617026056-50483-6-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:10 +0000 (09:54 -0400)]
IB/hfi1: Use napi_schedule_irqoff() for tx napi
The call is from an ISR context and napi_schedule_irqoff() can be used.
Change the call to the more efficient type.
Link: https://lore.kernel.org/r/1617026056-50483-5-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:09 +0000 (09:54 -0400)]
IB/hfi1: Correct oversized ring allocation
The completion ring for tx is using the wrong size to size the ring,
oversizing the ring by two orders of magniture.
Correct the allocation size and use kcalloc_node() to allocate the ring.
Fix mistaken GFP defines in similar allocations.
Link: https://lore.kernel.org/r/1617026056-50483-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:08 +0000 (09:54 -0400)]
IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdev
The current rdma_netdev handling in ipoib hooks the tx_timeout handler,
but prints out a totally useless message that prevents effective debugging
especially when multiple transmit queues are being used.
Add a tx_timeout rdma_netdev hook and implement the callback in the hfi1
to print additional information.
The existing non-helpful message is avoided when the driver has presented
a callback.
Link: https://lore.kernel.org/r/1617026056-50483-3-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Mike Marciniszyn [Mon, 29 Mar 2021 13:54:07 +0000 (09:54 -0400)]
IB/hfi1: Add AIP tx traces
Add traces to allow for debugging issues with AIP tx.
Link: https://lore.kernel.org/r/1617026056-50483-2-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Leon Romanovsky [Sun, 14 Mar 2021 12:42:56 +0000 (14:42 +0200)]
net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
The mlx5 implementation executes a firmware command on the PF to change
the configuration of the selected VF.
Link: https://lore.kernel.org/linux-pci/20210314124256.70253-5-leon@kernel.org
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Leon Romanovsky [Sun, 14 Mar 2021 12:42:55 +0000 (14:42 +0200)]
net/mlx5: Dynamically assign MSI-X vectors count
The number of MSI-X vectors is a PCI property visible through lspci. The
field is read-only and configured by the device. The mlx5 devices work in
a static or dynamic assignment mode.
Static assignment means that all newly created VFs have a preset number of
MSI-X vectors determined by device configuration parameters. This can
result in some VFs having too many or too few MSI-X vectors. Till now this
has been the only means of fine-tuning the MSI-X vector count and it was
acceptable for small numbers of VFs.
With dynamic assignment the inefficiency of having a fixed number of MSI-X
vectors can be avoided with each VF having exactly the required
vectors. Userspace will provide this information while provisioning the VF
for use, based on the intended use. For instance if being used with a VM,
the MSI-X vector count might be matched to the CPU count of the VM.
For compatibility mlx5 continues to start up with MSI-X vector assignment,
but the kernel can now access a larger dynamic vector pool and assign more
vectors to created VFs.
Link: https://lore.kernel.org/linux-pci/20210314124256.70253-4-leon@kernel.org
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Leon Romanovsky [Sun, 14 Mar 2021 12:42:54 +0000 (14:42 +0200)]
net/mlx5: Add dynamic MSI-X capabilities bits
These new fields declare the number of MSI-X vectors that is possible to
allocate on the VF through PF configuration.
Value must be in range defined by min_dynamic_vf_msix_table_size and
max_dynamic_vf_msix_table_size.
The driver should continue to query its MSI-X table through PCI
configuration header.
Link: https://lore.kernel.org/linux-pci/20210314124256.70253-3-leon@kernel.org
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Leon Romanovsky [Sun, 4 Apr 2021 07:22:18 +0000 (10:22 +0300)]
PCI/IOV: Add sysfs MSI-X vector assignment interface
A typical cloud provider SR-IOV use case is to create many VFs for use by
guest VMs. The VFs may not be assigned to a VM until a customer requests a
VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors
proportional to the number of CPUs in the VM, but there is no standard way
to change the number of MSI-X vectors supported by a VF.
Some Mellanox ConnectX devices support dynamic assignment of MSI-X vectors
to SR-IOV VFs. This can be done by the PF driver after VFs are enabled,
and it can be done without affecting VFs that are already in use. The
hardware supports a limited pool of MSI-X vectors that can be assigned to
the PF or to individual VFs. This is device-specific behavior that
requires support in the PF driver.
Add a read-only "sriov_vf_total_msix" sysfs file for the PF and a writable
"sriov_vf_msix_count" file for each VF. Management software may use these
to learn how many MSI-X vectors are available and to dynamically assign
them to VFs before the VFs are passed through to a VM.
If the PF driver implements the ->sriov_get_vf_total_msix() callback,
"sriov_vf_total_msix" contains the total number of MSI-X vectors available
for distribution among VFs.
If no driver is bound to the VF, writing "N" to "sriov_vf_msix_count" uses
the PF driver ->sriov_set_msix_vec_count() callback to assign "N" MSI-X
vectors to the VF. When a VF driver subsequently reads the MSI-X Message
Control register, it will see the new Table Size "N".
Link: https://lore.kernel.org/linux-pci/20210314124256.70253-2-leon@kernel.org
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Yixian Liu [Sat, 27 Mar 2021 10:25:38 +0000 (18:25 +0800)]
RDMA/hns: Reorganize doorbell update interfaces for all queues
The doorbell update interfaces are very similar for different queues, such
as SQ, RQ, SRQ, CQ and EQ. So reorganize these code and also fix some
inappropriate naming.
Link: https://lore.kernel.org/r/1616840738-7866-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yixian Liu [Sat, 27 Mar 2021 10:25:37 +0000 (18:25 +0800)]
RDMA/hns: Support configuring doorbell mode of RQ and CQ
HIP08 supports both normal and record doorbell mode for RQ and CQ, SQ
record doorbell for userspace is also supported by the software for
flushing CQE process. As now the capability of HIP08 are exposed to the
user and are configurable, the support of normal doorbell should be added
back.
Note that, if switching to normal doorbell, the kernel will report "flush
cqe is unsupported" if modify qp to error status as the flush is based on
record doorbell.
Link: https://lore.kernel.org/r/1616840738-7866-2-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Xi Wang [Sat, 27 Mar 2021 03:21:34 +0000 (11:21 +0800)]
RDMA/hns: Simplify command fields for HEM base address configuration
Use hr_reg_write() instead of roce_set_field() to simplify codes about
configuring HEM BA.
Link: https://lore.kernel.org/r/1616815294-13434-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Xi Wang [Sat, 27 Mar 2021 03:21:33 +0000 (11:21 +0800)]
RDMA/hns: Reorganize process of setting HEM
Encapsulate configuring GMV base address and other type of HEM table into
two separate functions to make process of setting HEM clearer.
Link: https://lore.kernel.org/r/1616815294-13434-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Xi Wang [Sat, 27 Mar 2021 03:21:32 +0000 (11:21 +0800)]
RDMA/hns: Refactor reset state checking flow
The 'HNS_ROCE_OPC_QUERY_MB_ST' command will response the mailbox complete
status and hardware busy flag, and the complete status is only valid when
the busy flag is 0, so it's better to query these two fields at a time
rather than separately.
Link: https://lore.kernel.org/r/1616815294-13434-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yixing Liu [Sat, 27 Mar 2021 03:21:31 +0000 (11:21 +0800)]
RDMA/hns: Reorganize hns_roce_create_cq()
Encapsulate two subprocesses into functions.
Link: https://lore.kernel.org/r/1616815294-13434-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Weihang Li [Sat, 27 Mar 2021 03:21:30 +0000 (11:21 +0800)]
RDMA/hns: Refactor hns_roce_v2_poll_one()
Encapsulate the process of obtaining the current QP and filling WC as
functions, also merge some duplicate code.
Link: https://lore.kernel.org/r/1616815294-13434-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Weihang Li [Mon, 29 Mar 2021 08:46:24 +0000 (16:46 +0800)]
MAINTAINERS: remove Xavier as maintainer of HISILICON ROCE DRIVER
Wei Hu(Xavier) has left Hisilicon and his email address is invalid now.
I'd be glad to add him back with another address if he wants to continue
maintain this module.
Link: https://lore.kernel.org/r/1617007584-39842-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Jack Wang [Thu, 25 Mar 2021 15:33:05 +0000 (16:33 +0100)]
RDMA/rtrs-clt: Cap max_io_size
Max io size is limited by both remote buffer size and the max fr pages per
mr.
Link: https://lore.kernel.org/r/20210325153308.1214057-20-gi-oh.kim@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Jack Wang [Thu, 25 Mar 2021 15:33:03 +0000 (16:33 +0100)]
RDMA/rtrs: Cleanup unused 's' variable in __alloc_sess
Link: https://lore.kernel.org/r/20210325153308.1214057-18-gi-oh.kim@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Thu, 25 Mar 2021 15:33:02 +0000 (16:33 +0100)]
RDMA/rtrs-srv: Report temporary sessname for error message
Before receiving the session name, the error message cannot include any
information about which connection generates the error.
This patch stores the addresses of source and target in the sessname field
to show which generates the error. That field will be over-written
when receiving the session name from client.
Link: https://lore.kernel.org/r/20210325153308.1214057-17-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gioh Kim [Thu, 25 Mar 2021 15:32:59 +0000 (16:32 +0100)]
RDMA/rtrs: New function converting rtrs_addr to string
There is common code converting addresses of source machine and
destination machine to a string. We already have a struct rtrs_addr to
store two addresses. This patch introduces a new function that converts
two addresses into one string with struct rtrs_addr.
Link: https://lore.kernel.org/r/20210325153308.1214057-14-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Guoqing Jiang [Thu, 25 Mar 2021 15:32:56 +0000 (16:32 +0100)]
RDMA/rtrs: Cleanup the code in rtrs_srv_rdma_cm_handler
Let these cases share the same path since all of them need to close
session.
Link: https://lore.kernel.org/r/20210325153308.1214057-11-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Guoqing Jiang [Thu, 25 Mar 2021 15:32:55 +0000 (16:32 +0100)]
RDMA/rtrs: Remove sessname and sess_kobj from rtrs_attrs
The two members are not used in the code, so remove them.
Link: https://lore.kernel.org/r/20210325153308.1214057-10-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Guoqing Jiang [Thu, 25 Mar 2021 15:32:54 +0000 (16:32 +0100)]
RDMA/rtrs: Kill the put label in rtrs_srv_create_once_sysfs_root_folders
We can remove the label after move put_device to the right place.
Link: https://lore.kernel.org/r/20210325153308.1214057-9-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Guoqing Jiang [Thu, 25 Mar 2021 15:32:53 +0000 (16:32 +0100)]
RDMA/rtrs-clt: Remove redundant code from rtrs_clt_read_req
There is no need to dereference 's' from 'sess', since we have "sess =
to_clt_sess(s)" before.
And we can deference 'dev' from 's' earlier.
Link: https://lore.kernel.org/r/20210325153308.1214057-8-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Danil Kipnis [Thu, 25 Mar 2021 15:32:47 +0000 (16:32 +0100)]
MAINTAINERS: Change maintainer for rtrs module
Danil will step down, Haris will take over. Also update to email address
to ionos.com, cloud.ionos.com will still work for sometime.
Link: https://lore.kernel.org/r/20210325153308.1214057-2-gi-oh.kim@ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Acked-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
YueHaibing [Thu, 1 Apr 2021 02:10:28 +0000 (10:10 +0800)]
RDMA/uverbs: Fix -Wunused-function warning
make W=1 warns this:
In file included from drivers/infiniband/sw/rdmavt/mmap.c:51:0:
./include/rdma/uverbs_ioctl.h:937:1:
warning: ‘_uverbs_get_const_unsigned’ defined but not used [-Wunused-function]
_uverbs_get_const_unsigned(u64 *to,
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/rdma/uverbs_ioctl.h:930:1:
warning: ‘_uverbs_get_const_signed’ defined but not used [-Wunused-function]
_uverbs_get_const_signed(s64 *to, const struct uverbs_attr_bundle *attrs_bundle,
^~~~~~~~~~~~~~~~~~~~~~~~
Make these functions inline to fix this warnings.
Fixes:
2904bb37b35d ("IB/core: Split uverbs_get_const/default to consider target type")
Link: https://lore.kernel.org/r/20210401021028.25720-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yangyang Li [Thu, 25 Mar 2021 13:33:56 +0000 (21:33 +0800)]
RDMA/hns: Support congestion control type selection according to the FW
The type of congestion control algorithm includes DCQCN, LDCP, HC3 and
DIP. The driver will select one of them according to the firmware when
querying PF capabilities, and then set the related configuration fields
into QPC.
Link: https://lore.kernel.org/r/1616679236-7795-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wei Xu [Thu, 25 Mar 2021 13:33:55 +0000 (21:33 +0800)]
RDMA/hns: Support query information of functions from FW
Add a new type of command to query mac id of functions from the firmware,
it is used to select the template of congestion algorithm. More info will
be supported in the future.
Link: https://lore.kernel.org/r/1616679236-7795-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Shengming Shu <shushengming1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Håkon Bugge [Mon, 22 Mar 2021 13:35:32 +0000 (14:35 +0100)]
RDMA/core: Fix corrupted SL on passive side
On RoCE systems, a CM REQ contains a Primary Hop Limit > 1 and Primary
Subnet Local is zero.
In cm_req_handler(), the cm_process_routed_req() function is called. Since
the Primary Subnet Local value is zero in the request, and since this is
RoCE (Primary Local LID is permissive), the following statement will be
executed:
IBA_SET(CM_REQ_PRIMARY_SL, req_msg, wc->sl);
This corrupts SL in req_msg if it was different from zero. In other words,
a request to setup a connection using an SL != zero, will not be honored,
and a connection using SL zero will be created instead.
Fixed by not calling cm_process_routed_req() on RoCE systems, the
cm_process_route_req() is only for IB anyhow.
Fixes:
3971c9f6dbf2 ("IB/cm: Add interim support for routed paths")
Link: https://lore.kernel.org/r/1616420132-31005-1-git-send-email-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Kamal Heib [Wed, 31 Mar 2021 10:20:43 +0000 (13:20 +0300)]
RDMA/rxe: Remove rxe_dma_device declaration
The function isn't implemented - delete the declaration.
Fixes:
a9d2e9ae953f ("RDMA/siw,rxe: Make emulated devices virtual in the device tree")
Link: https://lore.kernel.org/r/20210331102043.691950-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tang Yizhou [Wed, 31 Mar 2021 02:01:05 +0000 (10:01 +0800)]
RDMA/iw_cxgb4: Use DEFINE_SPINLOCK() for spinlock
spinlock can be initialized automatically with DEFINE_SPINLOCK() rather
than explicitly calling spin_lock_init().
Link: https://lore.kernel.org/r/20210331020105.4858-1-tangyizhou@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Tang Yizhou <tangyizhou@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Wan Jiabing [Fri, 26 Mar 2021 11:33:46 +0000 (19:33 +0800)]
RDMA/iser: struct iscsi_iser_task is declared twice
struct iscsi_iser_task has been declared at 201st line. Remove the
duplicate.
Link: https://lore.kernel.org/r/20210326113347.903976-1-wanjiabing@vivo.com
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Ruiqi Gong [Tue, 30 Mar 2021 12:29:12 +0000 (08:29 -0400)]
RDMA/hns: Fix a spelling mistake in hns_roce_hw_v1.c
s/caculating/calculating
Link: https://lore.kernel.org/r/20210330122912.19989-1-gongruiqi1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Bob Pearson [Thu, 25 Mar 2021 21:24:26 +0000 (16:24 -0500)]
RDMA/rxe: Split MEM into MR and MW
In the original rxe implementation it was intended to use a common object
to represent MRs and MWs but they are different enough to separate these
into two objects.
This allows replacing the mem name with mr for MRs which is more
consistent with the style for the other objects and less likely to be
confusing. This is a long patch that mostly changes mem to mr where it
makes sense and adds a new rxe_mw struct.
Link: https://lore.kernel.org/r/20210325212425.2792-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Gal Pressman [Mon, 29 Mar 2021 12:01:28 +0000 (15:01 +0300)]
RDMA/efa: Use strscpy instead of strlcpy
The strlcpy function doesn't limit the source length, use the preferred
strscpy function instead.
Link: https://lore.kernel.org/r/20210329120131.18793-1-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Lv Yunlong [Mon, 22 Mar 2021 16:13:25 +0000 (09:13 -0700)]
IB/isert: Fix a use after free in isert_connect_request
The device is got by isert_device_get() with refcount is 1, and is
assigned to isert_conn by
isert_conn->device = device.
When isert_create_qp() failed, device will be freed with
isert_device_put().
Later, the device is used in isert_free_login_buf(isert_conn) by the
isert_conn->device->ib_device statement.
Free the device in the correct order.
Fixes:
ae9ea9ed38c9 ("iser-target: Split some logic in isert_connect_request to routines")
Link: https://lore.kernel.org/r/20210322161325.7491-1-lyl2019@mail.ustc.edu.cn
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Bhaskar Chowdhury [Mon, 22 Mar 2021 06:43:22 +0000 (12:13 +0530)]
RDMA: Fix a typo
s/struture/structure/
Link: https://lore.kernel.org/r/20210322064322.3933985-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Bhaskar Chowdhury [Mon, 22 Mar 2021 06:29:23 +0000 (11:59 +0530)]
IB/hfi1: Fix a typo
s/struture/structure/
And add the missing colon for kdoc
Link: https://lore.kernel.org/r/20210322062923.3306167-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Yangyang Li [Fri, 19 Mar 2021 09:55:49 +0000 (17:55 +0800)]
RDMA/core: Correct misspellings of two words in comments
Correct the following spelling errors:
1. shold -> should
2. uncontext -> ucontext
Link: https://lore.kernel.org/r/1616147749-49106-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Shay Drory [Thu, 18 Mar 2021 13:52:59 +0000 (15:52 +0200)]
RDMA/mlx5: Set ODP caps only if device profile support ODP
Currently, ODP caps are set during the init stage of mlx5_ib_dev,
regardless of whether the device profile supports ODP or not. There is no
point in setting ODP caps if the device profile doesn't support
ODP. Hence, move setting the ODP caps to the odp_init stage.
Link: https://lore.kernel.org/r/20210318135259.681264-1-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Maor Gottlieb [Thu, 18 Mar 2021 13:51:23 +0000 (15:51 +0200)]
RDMA/mlx5: Fix drop packet rule in egress table
Initial drop action support missed that drop action can be added to egress
flow tables as well. Add the missing support.
This requires making sure that dest_type isn't set to PORT which in turn
exposes a possibility of passing dst while indicating number of dsts as
zero. Explicitly check for number of dsts and pass the appropriate
pointer.
Fixes:
f29de9eee782 ("RDMA/mlx5: Add support for drop action in DV steering")
Link: https://lore.kernel.org/r/20210318135123.680759-1-leon@kernel.org
Reviewed-by: Mark Bloch <markb@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Patrisious Haddad [Thu, 18 Mar 2021 11:05:02 +0000 (13:05 +0200)]
RDMA/uverbs: Refactor rdma_counter_set_auto_mode and __counter_set_mode
Success is returned in the following flows:
* New mode is the same as the current one.
* Switched to new mode and there are no bound counters yet.
Link: https://lore.kernel.org/r/20210318110502.673676-1-leon@kernel.org
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>