platform/kernel/linux-starfive.git
5 years agoIB/mlx5: Update the supported DEVX commands
Yishai Hadas [Mon, 26 Nov 2018 06:28:37 +0000 (08:28 +0200)]
IB/mlx5: Update the supported DEVX commands

Update the supported DEVX commands. This includes additions to the
query/modify command lists and to the encoding handling.

In addition, a valid range for general commands was added to be used for
future commands.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoIB/mlx5: Enforce DEVX privilege by firmware
Yishai Hadas [Mon, 26 Nov 2018 06:28:36 +0000 (08:28 +0200)]
IB/mlx5: Enforce DEVX privilege by firmware

Enforce DEVX privilege by firmware. This enables future device
functionality without the need to make driver changes, unless a new
privilege type is introduced.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoIB/mlx5: Enable modify and query verbs objects via DEVX
Yishai Hadas [Mon, 26 Nov 2018 06:28:35 +0000 (08:28 +0200)]
IB/mlx5: Enable modify and query verbs objects via DEVX

Enable modifying and querying verbs objects via the DEVX interface.
To support this, the DEVX modify and query handlers were changed to
accept any object type via the UVERBS_IDR_ANY_OBJECT mechanism.

The type checking and handling is done per object as part of the
driver code.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoIB/core: Enable getting an object type from a given uobject
Yishai Hadas [Mon, 26 Nov 2018 06:28:34 +0000 (08:28 +0200)]
IB/core: Enable getting an object type from a given uobject

Enable getting an object type from a given uobject. The type is saved
upon tree merging and is returned by a helper function.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoIB/core: Introduce UVERBS_IDR_ANY_OBJECT
Yishai Hadas [Mon, 26 Nov 2018 06:28:33 +0000 (08:28 +0200)]
IB/core: Introduce UVERBS_IDR_ANY_OBJECT

Introduce the UVERBS_IDR_ANY_OBJECT type to match any IDR object.

Once used, the infrastructure skips checking the IDR type; the check
becomes the responsibility of the driver handler.

This enables a driver method to accept an object of any of several
types.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoMerge 'mlx5-next' into mlx5-devx
Doug Ledford [Tue, 4 Dec 2018 18:36:57 +0000 (13:36 -0500)]
Merge 'mlx5-next' into mlx5-devx

The enhanced devx support series needs commit:
9d43faac02e3 ("net/mlx5: Update mlx5_ifc with DEVX UCTX capabilities bits")

Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agonet/mlx5: Update mlx5_ifc with DEVX UCTX capabilities bits
Yishai Hadas [Mon, 26 Nov 2018 06:28:32 +0000 (08:28 +0200)]
net/mlx5: Update mlx5_ifc with DEVX UCTX capabilities bits

Expose device capabilities for the DEVX user context. This includes
which caps the device supports and a matching bit to set as part of
user context creation.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Unfold modify RMP function
Leon Romanovsky [Wed, 28 Nov 2018 18:53:43 +0000 (20:53 +0200)]
RDMA/mlx5: Unfold modify RMP function

There is no need to perform modify_rmp in two separate functions,
where one uses the stack as a placeholder for data while the other
allocates it dynamically. Combine the two functions into one.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Unfold create RMP function
Leon Romanovsky [Wed, 28 Nov 2018 18:53:42 +0000 (20:53 +0200)]
RDMA/mlx5: Unfold create RMP function

There is no need to perform create_rmp in two separate functions, where
one uses the stack as a placeholder for data while the other allocates
it dynamically. Combine the two functions into one.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Initialize SRQ tables on mlx5_ib
Leon Romanovsky [Wed, 28 Nov 2018 18:53:41 +0000 (20:53 +0200)]
RDMA/mlx5: Initialize SRQ tables on mlx5_ib

Transfer initialization and cleanup from mlx5_priv struct of
mlx5_core_dev to be part of mlx5_ib_dev. This completes removal
of SRQ from mlx5_core.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Update SRQ functions signatures to mlx5_ib format
Leon Romanovsky [Wed, 28 Nov 2018 18:53:40 +0000 (20:53 +0200)]
RDMA/mlx5: Update SRQ functions signatures to mlx5_ib format

Reflect the move of SRQ code from mlx5_core to mlx5_ib by updating
function signatures so that they do not require mlx5_core_dev as an
input, because all operations in mlx5_ib are supposed to use mlx5_ib_dev.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Use stages for callback to setup and release DEVX
Leon Romanovsky [Wed, 28 Nov 2018 18:53:39 +0000 (20:53 +0200)]
RDMA/mlx5: Use stages for callback to setup and release DEVX

Reuse existing infrastructure to initialize and release DEVX uid.
The DevX interface is intended for user space access, so it is supposed
to be initialized before ib_register_device(). It also isn't supported
in switchdev mode, so there is no need to initialize it in that mode.

Fixes: 76dc5a8406bf ("IB/mlx5: Manage device uid for DEVX white list commands")
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Remove SRQ signature global flag
Leon Romanovsky [Wed, 28 Nov 2018 18:53:38 +0000 (20:53 +0200)]
RDMA/mlx5: Remove SRQ signature global flag

SRQ signature is not supported, hence there is no need for a special
static global variable to announce it.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agonet/mlx5: Move SRQ functions to RDMA part
Leon Romanovsky [Wed, 28 Nov 2018 18:53:37 +0000 (20:53 +0200)]
net/mlx5: Move SRQ functions to RDMA part

There is no need to keep SRQ, which is an RDMA object, in mlx5_core.
In this patch, we partially move the execution code, while the next
patches will move the table initialization/release logic too.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agonet/mlx5: Remove references to local mlx5_core functions
Leon Romanovsky [Wed, 28 Nov 2018 18:53:36 +0000 (20:53 +0200)]
net/mlx5: Remove references to local mlx5_core functions

As a preparation to move SRQ functionality to RDMA, drop all references
to mlx5_core logic and make SRQ be dependent on shared code only.

Most of the time, we are interested in knowing whether events are
working or not, and that is possible with the previous commit
("net/mlx5: Debug print for forwarded async events").

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agonet/mlx5: Remove not-used lib/eq.h header file
Leon Romanovsky [Wed, 28 Nov 2018 18:53:35 +0000 (20:53 +0200)]
net/mlx5: Remove not-used lib/eq.h header file

lib/eq.h is needed for EQ manipulations, which are not performed in SRQ.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agonet/mlx5: Remove dead transobj code
Leon Romanovsky [Wed, 28 Nov 2018 18:53:34 +0000 (20:53 +0200)]
net/mlx5: Remove dead transobj code

Delete functions which are not called and not needed.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agonet/mlx5: Align SRQ licenses and copyright information
Leon Romanovsky [Wed, 28 Nov 2018 18:53:33 +0000 (20:53 +0200)]
net/mlx5: Align SRQ licenses and copyright information

Ensure that both the RDMA and netdev parts of the SRQ implementation
have the same copyright and license information annotated by SPDX
tags.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/nldev: Export to user space number of contexts
Leon Romanovsky [Wed, 28 Nov 2018 11:16:45 +0000 (13:16 +0200)]
RDMA/nldev: Export to user space number of contexts

[leonro@server ~]$ rdma res show
1: mlx5_0: pd 3 cq 5 qp 4 cm_id 0 mr 0 ctx 0

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Annotate alloc/deallloc paths with context tracking
Leon Romanovsky [Wed, 28 Nov 2018 11:16:44 +0000 (13:16 +0200)]
RDMA/uverbs: Annotate alloc/deallloc paths with context tracking

Add restrack annotations to track allocations of ucontexts.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/restrack: Track ucontext
Leon Romanovsky [Wed, 28 Nov 2018 11:16:43 +0000 (13:16 +0200)]
RDMA/restrack: Track ucontext

Add the ability to track allocated ib_ucontext objects, which are a
limited resource and worth making visible to users.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoMerge branch 'write-handler-consistent-flow' into for-next
Doug Ledford [Mon, 3 Dec 2018 17:20:53 +0000 (12:20 -0500)]
Merge branch 'write-handler-consistent-flow' into for-next

Make all of the write() handlers use a consistent flow

From Jason,

This series unifies all the write handlers to use a flow that is very
similar to the ioctl handler flow, including having the same basic
assumptions about extensible buffer handling and the same handler
function call signature.

Along the way this consolidates all the copy_to/from_user into a small
set of safe buffer accessor functions tailored to the usage here. These
accessors use the new dispatcher-controlled calling convention for ucore
data, and support a placement of the response that does not rely on the
cmd.response value.

Overall this brings in strong bounds checking to all the write()
handlers and consistent enforcement of the zero-fill/zero-check
methodology for buffer extension.

The end result is a significant complexity reduction for all of the
handlers and creates a high degree of uniformity between the write,
write_ex, and ioctl handlers and dispatch flow.

Thanks

Jason Gunthorpe (12):
  RDMA/uverbs: Remove out_len checks that are now done by the core
  RDMA/uverbs: Use uverbs_attr_bundle to pass ucore for write/write_ex
  RDMA/uverbs: Get rid of the 'callback' scheme in the compat path
  RDMA/uverbs: Use uverbs_response() for remaining response copying
  RDMA/uverbs: Use uverbs_request() for request copying
  RDMA/uverbs: Use uverbs_request() and core for write_ex handlers
  RDMA/uverbs: Fill in the response for IB_USER_VERBS_EX_CMD_MODIFY_QP
  RDMA/uverbs: Simplify ib_uverbs_ex_query_device
  RDMA/uverbs: Add a simple iterator interface for reading the command
  RDMA/uverbs: Use the iterator for ib_uverbs_unmarshall_recv()
  RDMA/uverbs: Do not check the input length on create_cq/qp paths
  RDMA/uverbs: Use only attrs for the write() handler signature

 drivers/infiniband/core/rdma_core.h   |    5 +-
 drivers/infiniband/core/uverbs_cmd.c  | 1165 ++++++++++---------------
 drivers/infiniband/core/uverbs_main.c |   23 +-
 drivers/infiniband/core/uverbs_uapi.c |   23 +-
 include/rdma/uverbs_ioctl.h           |    9 +-
 5 files changed, 479 insertions(+), 746 deletions(-)

Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Use only attrs for the write() handler signature
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:45 +0000 (20:58 +0200)]
RDMA/uverbs: Use only attrs for the write() handler signature

All of the old arguments can be derived from the uverbs_attr_bundle
structure, so get rid of the redundant arguments. Most of the prior work
has been removing users of the arguments to allow this to be a simple
patch.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Do not check the input length on create_cq/qp paths
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:44 +0000 (20:58 +0200)]
RDMA/uverbs: Do not check the input length on create_cq/qp paths

If the user did not provide a long enough command buffer then the missing
bytes are forced to zero. There is no reason to check the length if a zero
value is OK.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Use the iterator for ib_uverbs_unmarshall_recv()
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:43 +0000 (20:58 +0200)]
RDMA/uverbs: Use the iterator for ib_uverbs_unmarshall_recv()

This has a very complicated memory layout, with two flex arrays. Use
the iterator API to make reading it clearer.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Add a simple iterator interface for reading the command
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:42 +0000 (20:58 +0200)]
RDMA/uverbs: Add a simple iterator interface for reading the command

Several methods have a command with a trailing flex array, and they
all open code some extraction scheme. Centralize this into a simple
iterator API.
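
As a rough user-space illustration of the idea, the sketch below walks a
command buffer made of a fixed header followed by trailing variable-length
data. The names req_iter, req_iter_start, req_iter_next and req_iter_done
are hypothetical and are not the helpers added by this patch; the -ENOSPC
return value for a short buffer is likewise an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over a command buffer (not the kernel structure). */
struct req_iter {
	const unsigned char *cur;	/* next unread byte */
	const unsigned char *end;	/* one past the last valid byte */
};

static void req_iter_start(struct req_iter *it, const void *buf, size_t len)
{
	it->cur = buf;
	it->end = it->cur + len;
}

/* Copy the next len bytes into dst, refusing to read past the end. */
static int req_iter_next(struct req_iter *it, void *dst, size_t len)
{
	if ((size_t)(it->end - it->cur) < len)
		return -ENOSPC;	/* assumed error code for a short buffer */
	memcpy(dst, it->cur, len);
	it->cur += len;
	return 0;
}

/* Returns nonzero once every byte of the buffer has been consumed. */
static int req_iter_done(const struct req_iter *it)
{
	return it->cur == it->end;
}
```

A handler written this way reads the header first, then calls
req_iter_next() once per array element, so the bounds check lives in one
place instead of being open coded in every method.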

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Simplify ib_uverbs_ex_query_device
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:41 +0000 (20:58 +0200)]
RDMA/uverbs: Simplify ib_uverbs_ex_query_device

We truncate the response structure if there is not enough room in the
user buffer so there is no reason to have all the mess with finely managing
response_length. Just fully fill the attrs and truncate on copy.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Fill in the response for IB_USER_VERBS_EX_CMD_MODIFY_QP
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:40 +0000 (20:58 +0200)]
RDMA/uverbs: Fill in the response for IB_USER_VERBS_EX_CMD_MODIFY_QP

A response struct was defined, and userspace is providing it (but not
checking it). Fill it in and write it out.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Use uverbs_request() and core for write_ex handlers
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:39 +0000 (20:58 +0200)]
RDMA/uverbs: Use uverbs_request() and core for write_ex handlers

The write_ex handlers have this horrible boilerplate in every function to
do the zero extend/zero check and min size checks. This is now handled in
the core code via the meta-data, and the zero checks are handled by
uverbs_request(). Replace all the occurrences.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Use uverbs_request() for request copying
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:38 +0000 (20:58 +0200)]
RDMA/uverbs: Use uverbs_request() for request copying

This function properly zero-extends, and zero-checks if the user
buffer is not the same size as the kernel command struct.
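
The zero-extend/zero-check behaviour can be sketched in user space as
follows. This is only an illustration: copy_request is a hypothetical
name, plain memcpy stands in for the user copy, and the -EOPNOTSUPP
return value for non-zero trailing bytes is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a caller-supplied request of src_len bytes
 * into a fixed-size command struct of dst_len bytes.
 */
static int copy_request(void *dst, size_t dst_len,
			const unsigned char *src, size_t src_len)
{
	size_t n = src_len < dst_len ? src_len : dst_len;
	size_t i;

	memcpy(dst, src, n);

	if (src_len < dst_len) {
		/* Zero-extend: fields the caller did not supply read as 0. */
		memset((unsigned char *)dst + src_len, 0, dst_len - src_len);
		return 0;
	}

	/* Zero-check: bytes beyond the known struct must all be zero. */
	for (i = dst_len; i < src_len; i++)
		if (src[i] != 0)
			return -EOPNOTSUPP;	/* assumed error code */
	return 0;
}
```

With this shape, an older caller that sends a shorter struct and a newer
caller that sends a longer one with unused fields left at zero are both
accepted by the same handler.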

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Use uverbs_response() for remaining response copying
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:37 +0000 (20:58 +0200)]
RDMA/uverbs: Use uverbs_response() for remaining response copying

This function properly truncates and zero-fills the response which is the
standard used by the ioctl uAPI when working with user data.
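
A minimal user-space sketch of the truncate and zero-fill rule is shown
below. The name copy_response is hypothetical, and memcpy/memset stand in
for the copies to user memory that the real code would need.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write a response of resp_len bytes into a caller
 * buffer of out_len bytes.
 */
static void copy_response(unsigned char *out, size_t out_len,
			  const void *resp, size_t resp_len)
{
	size_t n = resp_len < out_len ? resp_len : out_len;

	/* Truncate: never write more than the caller provided room for. */
	memcpy(out, resp, n);

	/* Zero-fill: bytes past the known response read as zero. */
	if (out_len > resp_len)
		memset(out + resp_len, 0, out_len - resp_len);
}
```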

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Get rid of the 'callback' scheme in the compat path
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:36 +0000 (20:58 +0200)]
RDMA/uverbs: Get rid of the 'callback' scheme in the compat path

There is no reason for this. For response processing we simply need to
copy, truncate, and zero fill the response into whatever output buffer
was provided. Add a function uverbs_response() that does this
consistently.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Use uverbs_attr_bundle to pass ucore for write/write_ex
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:35 +0000 (20:58 +0200)]
RDMA/uverbs: Use uverbs_attr_bundle to pass ucore for write/write_ex

This creates a consistent way to access the two core buffers across write
and write_ex handlers.

Remove the open coded ucore conversion in the write/ex compatibility
handlers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/uverbs: Remove out_len checks that are now done by the core
Jason Gunthorpe [Sun, 25 Nov 2018 18:58:34 +0000 (20:58 +0200)]
RDMA/uverbs: Remove out_len checks that are now done by the core

write() methods must work with fixed sized structures as that is the only
way to know where the udata segment starts. The common udata code now
rejects any write() that has a response buffer shorter than the core's
response.

Thus all the checks of out_len for write methods are redundant and can be
removed.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agonet/mlx5: Debug print for forwarded async events
Saeed Mahameed [Mon, 26 Nov 2018 22:39:08 +0000 (14:39 -0800)]
net/mlx5: Debug print for forwarded async events

Print a debug message for every async FW event forwarded to mlx5
interfaces (mlx5e netdev and mlx5_ib rdma module).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Forward SRQ resource events
Saeed Mahameed [Mon, 26 Nov 2018 22:39:07 +0000 (14:39 -0800)]
net/mlx5: Forward SRQ resource events

Allow forwarding of SRQ events to mlx5_core interfaces, e.g. mlx5_ib.
Use mlx5_notifier_register/unregister in srq.c in order to allow seamless
transition of srq.c to infiniband subsystem.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Forward QP/WorkQueues resource events
Saeed Mahameed [Mon, 26 Nov 2018 22:39:06 +0000 (14:39 -0800)]
net/mlx5: Forward QP/WorkQueues resource events

Allow forwarding QP and WQ events to mlx5_core interfaces, e.g. mlx5_ib

Use mlx5_notifier_register/unregister in qp.c in order to allow seamless
transition of qp.c to infiniband subsystem.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Remove all deprecated software versions of FW events
Saeed Mahameed [Mon, 26 Nov 2018 22:39:05 +0000 (14:39 -0800)]
net/mlx5: Remove all deprecated software versions of FW events

Before the new mlx5 event notification infrastructure and API,
mlx5_core used to process all events before forwarding them to mlx5
interfaces (mlx5e/mlx5_ib) and used to translate the event type enum
to a software defined enum. This is not needed anymore, since it is ok
for mlx5e and mlx5_ib to receive FW events as is, at least the few
that mlx5 core allows.

mlx5e and mlx5_ib have already moved to the new API and only handle FW
event types, so it is now safe to remove all equivalent software defined
events and the logic around them.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoIB/mlx5: Handle raw delay drop general event
Saeed Mahameed [Mon, 26 Nov 2018 22:39:04 +0000 (14:39 -0800)]
IB/mlx5: Handle raw delay drop general event

Handle FW general event rq delay drop as it was received from FW via mlx5
notifiers API, instead of handling the processed software version of that
event. After this patch we can safely remove all software processed FW
events types and definitions.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Allow forwarding event type general event as is
Saeed Mahameed [Mon, 26 Nov 2018 22:39:03 +0000 (14:39 -0800)]
net/mlx5: Allow forwarding event type general event as is

FW general event is used by mlx5_ib for RQ delay drop timeout event
handling. In this patch we allow forwarding the FW general event type
to the mlx5 notifiers chain, so that mlx5_ib can handle it and the
software version of it can be deprecated.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoIB/mlx5: Handle raw port change event rather than the software version
Saeed Mahameed [Mon, 26 Nov 2018 22:39:02 +0000 (14:39 -0800)]
IB/mlx5: Handle raw port change event rather than the software version

Use the FW version of the port change event as forwarded via new mlx5
notifiers API.

After this patch, processed software version of the port change event
will become deprecated and will be totally removed in downstream
patches.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Remove unused events callback and logic
Saeed Mahameed [Mon, 26 Nov 2018 22:39:01 +0000 (14:39 -0800)]
net/mlx5: Remove unused events callback and logic

The mlx5_interface->event callback is not used by mlx5e/mlx5_ib anymore.

We totally remove the delayed events logic workaround, since it is not
needed anymore with the dynamic notifier registration API: mlx5_ib
can register its notifier and start receiving events exactly at the
moment it is ready to handle them.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoIB/mlx5: Use the new mlx5 core notifier API
Saeed Mahameed [Mon, 26 Nov 2018 22:39:00 +0000 (14:39 -0800)]
IB/mlx5: Use the new mlx5 core notifier API

Remove the deprecated mlx5_interface->event mlx5_ib callback and use new
mlx5 notifier API to subscribe for mlx5 events.

For native mlx5_ib device profiles (pf_profile/nic_rep_profile), register
the notifier callback mlx5_ib_handle_event, which treats the notifier
context as mlx5_ib_dev.

For vport representors, don't register any notifier. As before, they
do not receive any mlx5 events.

For a slave port (mlx5_ib_multiport_info), register a different notifier
callback, mlx5_ib_event_slave_port, which knows that the event is coming
for mlx5_ib_multiport_info and prepares the event job accordingly.
Before this, the event handler work had to ask mlx5_core whether this is
a slave port via mlx5_core_is_mp_slave(work->dev); now that is not
needed anymore.
mlx5_ib_multiport_info notifier registration is done in
mlx5_ib_bind_slave_port and de-registration is done in
mlx5_ib_unbind_slave_port.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Forward all mlx5 events to mlx5 notifiers chain
Saeed Mahameed [Mon, 26 Nov 2018 22:38:59 +0000 (14:38 -0800)]
net/mlx5: Forward all mlx5 events to mlx5 notifiers chain

This is to allow seamless migration to the new notifier chain API, and
to eventually deprecate the interfaces' dev->event callback.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Use the new mlx5 core notifier API
Saeed Mahameed [Mon, 26 Nov 2018 22:38:58 +0000 (14:38 -0800)]
net/mlx5e: Use the new mlx5 core notifier API

Remove the deprecated mlx5_interface->event mlx5e callback and use new
mlx5 notifier API to subscribe for mlx5 events, handle port change event
as received from FW rather than handling the mlx5 core processed port
change software version event.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Allow port change event to be forwarded to driver notifiers chain
Saeed Mahameed [Mon, 26 Nov 2018 22:38:57 +0000 (14:38 -0800)]
net/mlx5: Allow port change event to be forwarded to driver notifiers chain

The idea is to allow mlx5 core interfaces (mlx5e/mlx5_ib) to be able to
receive some allowed FW events as is via the new notifier API.

In this patch we allow forwarding port change event to mlx5 core interfaces
(mlx5e/mlx5_ib) as it was received from FW.
Once mlx5e and mlx5_ib start using this event we can safely remove the
redundant software version of it and its translation logic.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Driver events notifier API
Saeed Mahameed [Mon, 26 Nov 2018 22:38:56 +0000 (14:38 -0800)]
net/mlx5: Driver events notifier API

Use atomic notifier chain to fire events to mlx5 core driver
consumers (mlx5e/mlx5_ib) and provide mlx5 register/unregister notifier
API.

This API will replace the current mlx5_interface->event callback and all
the logic around it, especially the delayed events logic introduced by
commit 97834eba7c19 ("net/mlx5: Delay events till ib registration ends").

That logic is not needed anymore with this new API, where the mlx5
interface can dynamically register/unregister its notifier.
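
The register/unregister pattern can be sketched in user space as below.
This is a simplified, single-threaded stand-in with no locking, so it does
not reproduce the atomic notifier chain used by the kernel, and every name
in it (notifier, notifier_head, notifier_register, notifier_unregister,
notifier_call_chain, counting_consumer) is hypothetical.

```c
#include <stddef.h>

/* One registered consumer; embedded in the consumer's own structure. */
struct notifier {
	int (*call)(struct notifier *nb, unsigned long event, void *data);
	struct notifier *next;
};

struct notifier_head {
	struct notifier *first;
};

static void notifier_register(struct notifier_head *h, struct notifier *nb)
{
	nb->next = h->first;
	h->first = nb;
}

static void notifier_unregister(struct notifier_head *h, struct notifier *nb)
{
	struct notifier **p;

	for (p = &h->first; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			nb->next = NULL;
			return;
		}
	}
}

/* Fire an event at every registered consumer; returns how many ran. */
static int notifier_call_chain(struct notifier_head *h, unsigned long event,
			       void *data)
{
	struct notifier *nb;
	int calls = 0;

	for (nb = h->first; nb; nb = nb->next) {
		nb->call(nb, event, data);
		calls++;
	}
	return calls;
}

/* Example consumer that records what it has seen. */
struct counting_consumer {
	struct notifier nb;	/* must stay the first member */
	unsigned long last_event;
	int seen;
};

static int count_event(struct notifier *nb, unsigned long event, void *data)
{
	struct counting_consumer *c = (struct counting_consumer *)nb;

	(void)data;
	c->last_event = event;
	c->seen++;
	return 0;
}
```

Because a consumer only receives events between its own register and
unregister calls, the producer has no need to queue events for a consumer
that is not ready yet.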

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoIB/mlx5: Use fragmented QP's buffer for in-kernel users
Guy Levi [Mon, 26 Nov 2018 06:15:50 +0000 (08:15 +0200)]
IB/mlx5: Use fragmented QP's buffer for in-kernel users

The current implementation of create QP requires contiguous memory. Such
a requirement is problematic once the memory is fragmented or the system
is low in memory, as it causes failures in dma_zalloc_coherent().

This patch takes advantage of the new mlx5_core API which allocates a
fragmented buffer. This makes the QP creation much more resilient to
memory fragmentation. Data-path code was adapted to the fact that WQEs can
cross buffers.

We also use the opportunity to fix some cosmetic legacy coding convention
errors which were in the feature scope.

Signed-off-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx5: Use fragmented SRQ's buffer for in-kernel users
Guy Levi [Mon, 26 Nov 2018 06:15:39 +0000 (08:15 +0200)]
IB/mlx5: Use fragmented SRQ's buffer for in-kernel users

The current implementation of create SRQ requires contiguous memory. Such
a requirement is problematic once the memory is fragmented or the system
is low in memory, as it causes failures in dma_zalloc_coherent().

This patch takes advantage of the new mlx5_core API which allocates a
fragmented buffer, and makes the SRQ creation much more resilient to
memory fragmentation. Data-path code was adapted to the fact that WQEs can
cross buffers.

Signed-off-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agorxe: IB_WR_REG_MR does not capture MR's iova field
Chuck Lever [Sun, 25 Nov 2018 22:13:08 +0000 (17:13 -0500)]
rxe: IB_WR_REG_MR does not capture MR's iova field

FRWR memory registration is done with a series of calls and WRs.
1. ULP invokes ib_dma_map_sg()
2. ULP invokes ib_map_mr_sg()
3. ULP posts an IB_WR_REG_MR on the Send queue

Step 2 generates an iova. It is permissible for ULPs to change this
iova (with certain restrictions) between steps 2 and 3.

rxe_map_mr_sg captures the MR's iova but later when rxe processes the
REG_MR WR, it ignores the MR's iova field. If a ULP alters the MR's iova
after step 2 but before step 3, rxe never captures that change.

When the remote sends an RDMA Read targeting that MR, rxe looks up the
R_key, but the altered iova does not match the iova stored in the MR,
causing the RDMA Read request to fail.

Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/mlx5: Attach a DEVX counter via raw flow creation
Mark Bloch [Tue, 20 Nov 2018 18:31:08 +0000 (20:31 +0200)]
RDMA/mlx5: Attach a DEVX counter via raw flow creation

Allow a user to attach a DEVX counter via mlx5 raw flow creation. In order
to attach a counter we introduce a new attribute:

MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX

A counter can be attached to multiple flow steering rules.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/qib: Remove all occurrences of BUG_ON()
Leon Romanovsky [Thu, 29 Nov 2018 12:15:28 +0000 (14:15 +0200)]
RDMA/qib: Remove all occurrences of BUG_ON()

The QIB driver was added in 2010 with many BUG_ON() calls. Most of them
were cleaned out over years of development and usage.

It looks safe now to remove the rest of the BUG_ON() calls.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/usnic: fix spelling mistake "miniumum" -> "minimum"
Colin Ian King [Thu, 29 Nov 2018 10:42:13 +0000 (10:42 +0000)]
IB/usnic: fix spelling mistake "miniumum" -> "minimum"

There is a spelling mistake in a usnic_err error message, fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: fix ptr_ret.cocci warnings
kbuild test robot [Tue, 27 Nov 2018 23:21:30 +0000 (07:21 +0800)]
RDMA/uverbs: fix ptr_ret.cocci warnings

drivers/infiniband/core/uverbs_cmd.c:1095:1-3: WARNING: PTR_ERR_OR_ZERO can be used

 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
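
For illustration, the sketch below contrasts the two forms in user space.
The IS_ERR, PTR_ERR, ERR_PTR and PTR_ERR_OR_ZERO definitions here are
simplified stand-ins written for this example rather than the kernel
header, and finish_open_coded/finish_helper are hypothetical functions.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Open-coded form that the warning flags. */
static int finish_open_coded(const void *obj)
{
	if (IS_ERR(obj))
		return (int)PTR_ERR(obj);
	return 0;
}

/* Equivalent form using the helper. */
static int finish_helper(const void *obj)
{
	return PTR_ERR_OR_ZERO(obj);
}
```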

Generated by: scripts/coccinelle/api/ptr_ret.cocci

Fixes: 7106a9769715 ("RDMA/uverbs: Make write() handlers return 0 on success")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/drivers: Fix spelling mistake "initalize" -> "initialize"
Colin Ian King [Wed, 28 Nov 2018 15:11:16 +0000 (15:11 +0000)]
RDMA/drivers: Fix spelling mistake "initalize" -> "initialize"

Fix spelling mistake in usnic_err error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: Use uverbs_attr_bundle to pass udata for ioctl()
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:20 +0000 (20:51 +0200)]
RDMA/uverbs: Use uverbs_attr_bundle to pass udata for ioctl()

Have the core code initialize the driver_udata if the method has a udata
description. This is done using the same create_udata the handler was
supposed to call.

This makes ioctl consistent with the write and write_ex paths.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Use uverbs_attr_bundle to pass udata for write
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:19 +0000 (20:51 +0200)]
RDMA/uverbs: Use uverbs_attr_bundle to pass udata for write

Now that we have metadata describing the command format the core code can
directly compute the udata pointers and all the really ugly
ib_uverbs_init_udata() calls can be removed from the handlers.

This means all the write() handlers are no longer sensitive to the layout
of the command buffer.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Use uverbs_attr_bundle to pass udata for write_ex
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:18 +0000 (20:51 +0200)]
RDMA/uverbs: Use uverbs_attr_bundle to pass udata for write_ex

The core code needs to compute the udata so we may as well pass it in the
uverbs_attr_bundle instead of on the stack. This converts the simple case
of write_ex() which already has a core calculation.

Also change the write() path to use the attrs for ib_uverbs_init_udata()
instead of on the stack. This lets the write to write_ex compatibility
path continue to follow the lead of the _ex path.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Prohibit write() calls with too small buffers
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:17 +0000 (20:51 +0200)]
RDMA/uverbs: Prohibit write() calls with too small buffers

The size meta-data in the prior patch describes the smallest acceptable
buffer for the write() interface. Globally check this in the core code.

This is necessary in the case of write() methods that have a driver udata
to prevent computing a negative udata buffer length.
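
A minimal sketch of such a core-side check, assuming a hypothetical
per-command descriptor that records the size of the fixed request struct.
The names write_cmd_desc, req_size, has_udata and check_write_len are
invented for this example and are not the kernel's.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical descriptor: req_size is the smallest acceptable buffer. */
struct write_cmd_desc {
	size_t req_size;
	int has_udata;
};

/*
 * Returns 0 and stores the driver udata length, or -ENOSPC if the buffer
 * is too small. Without the check, in_len - req_size would wrap around.
 */
int check_write_len(const struct write_cmd_desc *desc, size_t in_len,
		    size_t *udata_len)
{
	if (in_len < desc->req_size)
		return -ENOSPC;
	*udata_len = desc->has_udata ? in_len - desc->req_size : 0;
	return 0;
}
```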

The return code of -ENOSPC is chosen here as some of the handlers already
use this code; however, many other handlers use -EINVAL.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Add structure size info to write commands
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:16 +0000 (20:51 +0200)]
RDMA/uverbs: Add structure size info to write commands

We need the structure sizes to compute the location of the udata in the
core code. Annotate the sizes into the new macro language.

This is generated largely by script and checked by comparing against the
similar list in rdma-core.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Do not pass ib_uverbs_file to ioctl methods
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:15 +0000 (20:51 +0200)]
RDMA/uverbs: Do not pass ib_uverbs_file to ioctl methods

The uverbs_attr_bundle already contains this pointer, and most methods
don't actually need it. Get rid of the redundant function argument.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Make write() handlers return 0 on success
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:14 +0000 (20:51 +0200)]
RDMA/uverbs: Make write() handlers return 0 on success

Currently they return the command length, while all other handlers return
0. This makes the write path closer to the write_ex and ioctl path.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Replace ib_uverbs_file with uverbs_attr_bundle for write
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:13 +0000 (20:51 +0200)]
RDMA/uverbs: Replace ib_uverbs_file with uverbs_attr_bundle for write

Now that we can add meta-data to the description of write() methods we
need to pass the uverbs_attr_bundle into all write based handlers so
future patches can use it as a container for any new data transferred out
of the core.

This is the first step to bringing the write() and ioctl() methods to a
common interface signature.

This is a simple search/replace, and we push the attr down into the uobj
and other APIs to keep changes minimal.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Add missing driver_data
Jason Gunthorpe [Sun, 25 Nov 2018 18:51:12 +0000 (20:51 +0200)]
RDMA/uverbs: Add missing driver_data

If the struct is used with a driver_udata it should have a trailing
driver_data flex array to mark it as having udata.

In most cases this forces the end of the struct to be aligned to u64 which
is needed to make the trailing driver_data naturally aligned.

Unfortunately, we have a few cases where the base struct is not aligned to
8 bytes. These are marked with a u32 driver_data, and userspace will check
for alignment issues when it compiles the driver.
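
The layout rule can be illustrated with two hypothetical structs (the
struct and field names below are invented for this sketch and are not the
uapi definitions). A trailing flexible array adds no payload bytes but
takes part in alignment, so a u64 array keeps the struct a multiple of 8
bytes while a u32 array leaves a 4-byte struct unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical response whose fixed part is already a multiple of 8 bytes. */
struct resp_aligned {
	uint32_t handle;
	uint32_t flags;
	uint64_t driver_data[];	/* marks the start of the driver udata */
};

/* Hypothetical response whose fixed part is only 4-byte aligned. */
struct resp_unaligned {
	uint32_t handle;
	uint32_t driver_data[];	/* u32 keeps the struct size at 4 bytes */
};

/* The flex array starts exactly where the fixed part ends. */
_Static_assert(offsetof(struct resp_aligned, driver_data) == 8,
	       "udata follows the 8-byte fixed part");
_Static_assert(sizeof(struct resp_aligned) % 8 == 0,
	       "u64 driver_data keeps the struct a multiple of 8 bytes");
_Static_assert(sizeof(struct resp_unaligned) == 4,
	       "u32 driver_data does not change the struct size");
```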

Also remove the empty ib_uverbs_modify_qp_resp as nothing uses this.

pahole says there is no change to any struct sizes by this change.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoIB/qib: fix spelling mistake "colescing" -> "coalescing"
Colin Ian King [Mon, 26 Nov 2018 16:23:20 +0000 (16:23 +0000)]
IB/qib: fix spelling mistake "colescing" -> "coalescing"

There is a spelling mistake in the module description text, fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agonet/mlx5: Improve core device events handling
Saeed Mahameed [Tue, 20 Nov 2018 22:12:28 +0000 (14:12 -0800)]
net/mlx5: Improve core device events handling

Register a separate handler per event type, rather than listening for all
events and looking for the events to handle in a switch case.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Device events, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:27 +0000 (14:12 -0800)]
net/mlx5: Device events, Use async events chain

Move all the generic async event handling into a new, dedicated file,
events.c, to keep eq.c free of concrete event handling logic.

Use the new API to register for NOTIFY_ANY to handle generic events and
dispatch allowed events to mlx5_core consumers (mlx5_ib and mlx5e).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: CQ ERR, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:26 +0000 (14:12 -0800)]
net/mlx5: CQ ERR, Use async events chain

Remove the explicit call to mlx5_eq_cq_event on MLX5_EVENT_TYPE_CQ_ERROR
and register a specific CQ ERROR handler via the new API.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Resource tables, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:25 +0000 (14:12 -0800)]
net/mlx5: Resource tables, Use async events chain

Remove the explicit call to QP/SRQ resources events handlers on several FW
events and let resources logic register resources events notifiers via the
new API.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: CmdIF, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:24 +0000 (14:12 -0800)]
net/mlx5: CmdIF, Use async events chain

Remove the explicit call to mlx5_cmd_comp_handler on MLX5_EVENT_TYPE_CMD
and let the command interface register its own handler when it's ready.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: FWPage, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:23 +0000 (14:12 -0800)]
net/mlx5: FWPage, Use async events chain

Remove the explicit call to mlx5_core_req_pages_handler on
MLX5_EVENT_TYPE_PAGE_REQUEST and let the FW page logic register its own
handler when it's ready.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: E-Switch, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:22 +0000 (14:12 -0800)]
net/mlx5: E-Switch, Use async events chain

Remove the explicit call to mlx5_eswitch_vport_event on
MLX5_EVENT_TYPE_NIC_VPORT_CHANGE and let the eswitch register its own
handler when it's ready.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Clock, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:21 +0000 (14:12 -0800)]
net/mlx5: Clock, Use async events chain

Remove the explicit call to mlx5_pps_event on MLX5_EVENT_TYPE_PPS_EVENT
and let the clock logic register its own handler when it's ready.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: FPGA, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:20 +0000 (14:12 -0800)]
net/mlx5: FPGA, Use async events chain

Remove the explicit call to mlx5_fpga_event on
MLX5_EVENT_TYPE_FPGA_ERROR or MLX5_EVENT_TYPE_FPGA_QP_ERROR and
let the fpga core register its own handler when it's ready.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: FWTrace, Use async events chain
Saeed Mahameed [Tue, 20 Nov 2018 22:12:19 +0000 (14:12 -0800)]
net/mlx5: FWTrace, Use async events chain

Remove the explicit call to mlx5_fw_tracer_event on
MLX5_EVENT_TYPE_DEVICE_TRACER and let the fw tracer register
its own handler when it's ready.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: EQ, Introduce atomic notifier chain subscription API
Saeed Mahameed [Tue, 20 Nov 2018 22:12:18 +0000 (14:12 -0800)]
net/mlx5: EQ, Introduce atomic notifier chain subscription API

Use atomic_notifier_chain to fire firmware events at internal mlx5 core
components such as eswitch/fpga/clock/FW tracer/etc. This avoids
explicit calls from low level mlx5_core to upper components and
simplifies the mlx5_core API for future developments.

Simply provide register/unregister notifiers API and call the notifier
chain on firmware async events.

Example: to subscribe to a FW event:
struct mlx5_nb port_event;

MLX5_NB_INIT(&port_event, port_event_handler, PORT_CHANGE);
mlx5_eq_notifier_register(mdev, &port_event);

where:
 - port_event_handler is the notifier block callback.
 - PORT_CHANGE is the suffix of MLX5_EVENT_TYPE_PORT_CHANGE.

The above will guarantee that port_event_handler will receive all FW
events of the type MLX5_EVENT_TYPE_PORT_CHANGE.

To receive all FW/HW events one can subscribe to
MLX5_EVENT_TYPE_NOTIFY_ANY.
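
The dispatch model behind this API can be sketched in plain userspace C.
Everything below (the event enum, struct nb, nb_register(), fire_event())
is a hypothetical simplification written for this example; the real code
uses the kernel's atomic_notifier_chain and the mlx5_nb helpers.

```c
#include <stddef.h>

/* Hypothetical event types; EVENT_NOTIFY_ANY subscribes to everything. */
enum event_type {
	EVENT_PORT_CHANGE,
	EVENT_CQ_ERROR,
	EVENT_NOTIFY_ANY,
	EVENT_MAX
};

struct nb {
	void (*call)(struct nb *nb, enum event_type type, void *data);
	enum event_type type;
	struct nb *next;
};

/* One singly linked chain per event type, plus one for NOTIFY_ANY. */
static struct nb *chains[EVENT_MAX];

void nb_register(struct nb *nb)
{
	nb->next = chains[nb->type];
	chains[nb->type] = nb;
}

static void call_chain(struct nb *head, enum event_type type, void *data)
{
	for (struct nb *nb = head; nb; nb = nb->next)
		nb->call(nb, type, data);
}

/* Fire an event at its own subscribers and at the NOTIFY_ANY subscribers. */
void fire_event(enum event_type type, void *data)
{
	call_chain(chains[type], type, data);
	call_chain(chains[EVENT_NOTIFY_ANY], type, data);
}

/* Example callback: counts the events it sees through the data pointer. */
void count_event(struct nb *nb, enum event_type type, void *data)
{
	(void)nb;
	(void)type;
	++*(int *)data;
}
```

A subscriber registered for EVENT_PORT_CHANGE only sees that type, while a
subscriber registered for EVENT_NOTIFY_ANY sees every fired event.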

The next few patches will start moving all mlx5 core components to use
this new API and clean up the mlx5_eq_async_int MSI-X handler from
component explicit calls and specific logic.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoRDMA/core: Sync unregistration with netlink commands
Parav Pandit [Fri, 16 Nov 2018 01:50:57 +0000 (03:50 +0200)]
RDMA/core: Sync unregistration with netlink commands

When the rdma device is getting removed, get resource info can race with
device removal, as below:

      CPU-0                                  CPU-1
    --------                               --------
    rdma_nl_rcv_msg()
       nldev_res_get_cq_dumpit()
          mutex_lock(device_lock);
          get device reference
          mutex_unlock(device_lock);        [..]
                                            ib_unregister_device()
                                            /* Valid reference to
                                             * device->dev exists.
                                             */
                                             ib_dealloc_device()

          [..]
          provider->fill_res_entry();

Even though the device object is not freed, fill_res_entry() can get called
on a device which doesn't have a driver anymore. The kernel core device
reference count is not sufficient, as this only keeps the structure valid
and doesn't guarantee the driver is still loaded.

A similar race can occur between device renaming and device removal, where
device_rename() tries to rename an unregistered device. While this is fine
for devices of a class that is not net namespace aware, it is incorrect
for the net namespace aware class coming in a subsequent series. If a
class is net namespace aware, then the call trace in [1] below is observed
in the above situation.

Therefore, to avoid the race, keep a reference count and let device
unregistration wait until all netlink users drop the reference.
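
The idea can be sketched in userspace C as a counter plus a completion
flag. The names below (struct dev, dev_try_get(), dev_put(),
dev_unregister_begin()) are hypothetical, and this single-threaded sketch
omits the locking and sleeping that real code needs.

```c
#include <stdbool.h>

/* Hypothetical device: one initial reference is held by registration. */
struct dev {
	int refcount;
	bool registered;
	bool unreg_done;	/* stands in for a completion to wait on */
};

void dev_init(struct dev *d)
{
	d->refcount = 1;
	d->registered = true;
	d->unreg_done = false;
}

/* Netlink-style users take a reference only while the device is registered. */
bool dev_try_get(struct dev *d)
{
	if (!d->registered)
		return false;
	d->refcount++;
	return true;
}

/* The last put signals that unregistration may finish. */
void dev_put(struct dev *d)
{
	if (--d->refcount == 0)
		d->unreg_done = true;
}

/* Unregistration blocks new users and drops the registration reference. */
void dev_unregister_begin(struct dev *d)
{
	d->registered = false;
	dev_put(d);
}
```

In this sketch unregistration would only proceed to free driver state once
unreg_done is set, that is, after the last in-flight user has called
dev_put().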

[1] Call trace:
kernfs: ns required in 'infiniband' for 'mlx5_0'
WARNING: CPU: 18 PID: 44270 at fs/kernfs/dir.c:842 kernfs_find_ns+0x104/0x120
libahci i2c_core mlxfw libata dca [last unloaded: devlink]
RIP: 0010:kernfs_find_ns+0x104/0x120
Call Trace:
kernfs_find_and_get_ns+0x2e/0x50
sysfs_rename_link_ns+0x40/0xb0
device_rename+0xb2/0xf0
ib_device_rename+0xb3/0x100 [ib_core]
nldev_set_doit+0x165/0x190 [ib_core]
rdma_nl_rcv_msg+0x249/0x250 [ib_core]
? netlink_deliver_tap+0x8f/0x3e0
rdma_nl_rcv+0xd6/0x120 [ib_core]
netlink_unicast+0x17c/0x230
netlink_sendmsg+0x2f0/0x3e0
sock_sendmsg+0x30/0x40
__sys_sendto+0xdc/0x160

Fixes: da5c85078215 ("RDMA/nldev: add driver-specific resource tracking")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cma: Move cma module specific functions to cma_priv.h
Parav Pandit [Mon, 12 Nov 2018 22:45:24 +0000 (00:45 +0200)]
RDMA/cma: Move cma module specific functions to cma_priv.h

Currently several rdma_cm module specific functions are declared in
core_priv.h file. Now that we have cma_priv.h file specific to rdma_cm
kernel module, move them from core_priv.h to cma_priv.h

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: Check for NULL driver methods for every write call
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:58 +0000 (22:59 +0200)]
RDMA/uverbs: Check for NULL driver methods for every write call

Add annotations to the uverbs_api structure indicating which driver
methods are called by the implementation. If the required method
is NULL the write API will not be callable.

This effectively duplicates the cmd_mask system, however it does it by
expressing invariants required by the core code, not by delegating
decision making to the driver. This is another step toward eliminating
cmd_mask.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Make all the method functions in uverbs_cmd static
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:57 +0000 (22:59 +0200)]
RDMA/uverbs: Make all the method functions in uverbs_cmd static

Now that we use struct uverbs_uapi to link the method functions to the
dispatcher there is no reason to have them be extern symbols.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Convert the write interface to use uverbs_api
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:56 +0000 (22:59 +0200)]
RDMA/uverbs: Convert the write interface to use uverbs_api

This organizes the write commands into objects and links them to the
uverbs_api data structure. The command path is reworked to use uapi
instead of its internal structures.

The command mask is moved from a runtime check to a registration time
check in the uapi.

Since the write interface does not have the object ID as part of the
command, the radix bins are converted into linear lists to support the
lookup.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/verbs: Store the write/write_ex uapi entry points in the uverbs_api
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:55 +0000 (22:59 +0200)]
RDMA/verbs: Store the write/write_ex uapi entry points in the uverbs_api

Bringing all uapi entry points into one place lets us deal with them
consistently. For instance the write, write_ex and ioctl paths can be
disabled when an API is not supported by the driver.

This will replace the uverbs_cmd_table static arrays.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Require all objects to have a driver destroy function
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:54 +0000 (22:59 +0200)]
RDMA/uverbs: Require all objects to have a driver destroy function

If we can't destroy the object then we certainly shouldn't allow it to be
created or used. Remove it from the uverbs_uapi in this case.

This also disables methods of other objects that have mandatory object
handle inputs, i.e. REG_DM_MR is now automatically removed if DM objects
cannot be created.

Typically drivers not supporting an interface will mark all of the
supporting functions as NULL, including destroy.

This is intended to automatically eliminate entire corner cases in the API
that are difficult to test.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Use the uapi disablement APIs instead of code
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:53 +0000 (22:59 +0200)]
RDMA/mlx5: Use the uapi disablement APIs instead of code

Rely on UAPI_DEF_IS_OBJ_SUPPORTED instead of manipulating the contents of
the driver's definition list.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Add helpers to mark uapi functions as unsupported
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:52 +0000 (22:59 +0200)]
RDMA/uverbs: Add helpers to mark uapi functions as unsupported

We have many cases where parts of the uapi are not supported in a driver,
needs a certain protocol, or whatever. It is best to reflect this directly
into the struct uverbs_api when it is built so that everything is simply
blocked off, and future introspection can report a proper supported list.

This is done by adding some additional helpers to the definition list
language that disable objects based on a 'supported' call back, and a
helper that disables based on a NULL struct ib_device function pointer.

Disablement is global. For instance, if a driver disables an object then
everything connected to that object is removed, including core methods.
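
A rough userspace sketch of the two kinds of checks, loosely modeled on
the description above. The types and names (device_ops, obj_def,
obj_enabled()) are hypothetical and much simpler than the real definition
list language.

```c
#include <stdbool.h>
#include <stddef.h>

typedef int (*op_fn)(void);

/* Hypothetical device with optional driver methods. */
struct device_ops {
	op_fn alloc_dm;
	op_fn create_cq;
};

/* Hypothetical definition entry: an object is kept only if both checks pass. */
struct obj_def {
	const char *name;
	size_t needs_method;	/* offset of a required function pointer */
	bool (*is_supported)(const struct device_ops *ops);	/* optional */
};

/* Decide at build time of the API whether an object stays enabled. */
bool obj_enabled(const struct obj_def *def, const struct device_ops *ops)
{
	const op_fn *fn = (const op_fn *)((const char *)ops + def->needs_method);

	if (*fn == NULL)
		return false;	/* driver did not provide the method */
	if (def->is_supported && !def->is_supported(ops))
		return false;	/* 'supported' callback said no */
	return true;
}

/* Placeholder driver method used to populate device_ops in examples. */
int dummy_op(void)
{
	return 0;
}
```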

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Factor out the add/get pattern into a helper
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:51 +0000 (22:59 +0200)]
RDMA/uverbs: Factor out the add/get pattern into a helper

The next patch needs another copy of this, provide a simple helper to
reduce the coding. uapi_add_get_elm() returns an existing entry or adds a
new one.
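
The add/get pattern itself can be sketched as follows. This is not the
kernel helper: it assumes a small fixed array instead of a radix tree, and
the names struct table, struct elm and add_get_elm() are made up for the
example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ELMS 16

struct elm {
	uint32_t key;
	int value;
};

struct table {
	struct elm elms[MAX_ELMS];
	size_t count;
};

/*
 * Return the entry for key, creating a zeroed one if it does not exist.
 * *exists reports which case happened; NULL means the table is full.
 */
struct elm *add_get_elm(struct table *t, uint32_t key, int *exists)
{
	for (size_t i = 0; i < t->count; i++) {
		if (t->elms[i].key == key) {
			*exists = 1;
			return &t->elms[i];
		}
	}
	*exists = 0;
	if (t->count == MAX_ELMS)
		return NULL;
	t->elms[t->count] = (struct elm){ .key = key, .value = 0 };
	return &t->elms[t->count++];
}
```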

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/uverbs: Use a linear list to describe the compiled-in uapi
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:50 +0000 (22:59 +0200)]
RDMA/uverbs: Use a linear list to describe the compiled-in uapi

The 'tree' data structure is very hard to build at compile time, and this
makes it very limited. The new radix tree based compiler can handle a more
complex input language that does not require the compiler to perfectly
group everything into a neat tree structure.

Instead use a simple list to describe the input, where the list elements
can be of various different 'opcodes' instructing the radix compiler what
to do. Start out with opcodes chaining to other definition lists and
chaining to the existing 'tree' definition.
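
As a rough illustration of an opcode-driven definition list, the sketch
below walks a flat array and follows chain entries into other lists. The
opcodes, struct def and compile_defs() are hypothetical and only mimic
the shape of the idea, not the actual uapi definitions.

```c
#include <stddef.h>

/* Hypothetical opcodes understood by the list "compiler". */
enum def_op { DEF_END, DEF_OBJECT, DEF_CHAIN };

struct def {
	enum def_op op;
	int object_id;			/* used by DEF_OBJECT */
	const struct def *chain;	/* used by DEF_CHAIN */
};

/*
 * Walk a definition list, following chains, and collect object ids in
 * the order they are encountered. Returns the number of ids stored.
 */
int compile_defs(const struct def *list, int *ids, int max)
{
	int n = 0;

	for (; list->op != DEF_END; list++) {
		if (list->op == DEF_CHAIN)
			n += compile_defs(list->chain, ids + n, max - n);
		else if (n < max)
			ids[n++] = list->object_id;
	}
	return n;
}
```

Because the input is just a list, a driver list can be chained from the
core list without either side having to form a tree.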

Replace the very top level of the 'object tree' with this list type and
get rid of struct uverbs_object_tree_def and DECLARE_UVERBS_OBJECT_TREE.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/mlx5: Do not generate the uabi specs unconditionally
Jason Gunthorpe [Mon, 12 Nov 2018 20:59:49 +0000 (22:59 +0200)]
RDMA/mlx5: Do not generate the uabi specs unconditionally

For DM there is no reason not to add the spec for the START_OFFSET, if DM
is not supported then ib_dev.alloc_dm is already set to NULL which ensures
we do not call the method.

For IPSEC, the core code should be setting ib_dev.create_flow_action_esp
to NULL to disable it, not relying on wonky manipulation of the specs.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agomlx4: trigger IB events needed by SMC
Ursula Braun [Mon, 12 Nov 2018 11:41:55 +0000 (12:41 +0100)]
mlx4: trigger IB events needed by SMC

The mlx4 driver does not trigger an IB_EVENT_PORT_ACTIVE when the RoCE
network interface is activated. When SMC determines the RoCE device port
to be used, it checks the port states. This patch triggers IB events for
NETDEV_UP and NETDEV_DOWN.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoiw_cxgb4: only reconnect with MPAv1 if the peer aborts
Steve Wise [Sat, 10 Nov 2018 13:27:39 +0000 (05:27 -0800)]
iw_cxgb4: only reconnect with MPAv1 if the peer aborts

Only retry connection setup with MPAv1 if the peer actually aborted the
connection upon receiving the MPAv2 start message.  This avoids retrying
with MPAv1 in the case where the connection was aborted due to retransmit
timeouts.

Fixes: d2fe99e86bb2 ("RDMA/cxgb4: Add support for MPAv2 Enhanced RDMA Negotiation")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/core: Make function ib_fmr_pool_unmap return void
Yuval Shaia [Wed, 21 Nov 2018 11:47:02 +0000 (13:47 +0200)]
IB/core: Make function ib_fmr_pool_unmap return void

Since the function always returns 0 make it void.

Reported-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/srpt: Drop pointless static qualifier in srpt_make_tpg()
Yue Haibing [Thu, 15 Nov 2018 10:55:00 +0000 (10:55 +0000)]
IB/srpt: Drop pointless static qualifier in srpt_make_tpg()

There is no need to have the 'struct se_portal_group *tpg' variable static
since a new value is always assigned before use.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/core: Remove unused header files mm.h, socket.h, scatterlist.h
Parav Pandit [Thu, 15 Nov 2018 02:03:35 +0000 (04:03 +0200)]
RDMA/core: Remove unused header files mm.h, socket.h, scatterlist.h

Structures of ib_verbs.h don't use fields/structures of mm.h, socket.h or
scatterlist.h, so remove the inclusion of these header files.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoinfiniband/hw/cxgb4/qp.c: Use dma_zalloc_coherent
Sabyasachi Gupta [Mon, 12 Nov 2018 15:21:59 +0000 (20:51 +0530)]
infiniband/hw/cxgb4/qp.c: Use dma_zalloc_coherent

Replaced dma_alloc_coherent + memset with dma_zalloc_coherent

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoinfiniband/hw/cxgb3/cxio_hal.c: Use dma_zalloc_coherent
Sabyasachi Gupta [Fri, 9 Nov 2018 16:50:29 +0000 (22:20 +0530)]
infiniband/hw/cxgb3/cxio_hal.c: Use dma_zalloc_coherent

Replaced dma_alloc_coherent + memset with dma_zalloc_coherent

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoMerge branch 'mlx5-next' into rdma.git
Jason Gunthorpe [Wed, 21 Nov 2018 21:29:40 +0000 (14:29 -0700)]
Merge branch 'mlx5-next' into rdma.git

From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

mlx5 updates taken for dependencies on later ODP patches.

Conflict resolved by deleting mlx5_ib_get_vector_affinity()

* branch 'mlx5-next': (21 commits)
  net/mlx5: EQ, Make EQE access methods inline
  {net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA
  net/mlx5: EQ, Generic EQ
  net/mlx5: EQ, Different EQ types
  net/mlx5: EQ, Privatize eq_table and friends
  net/mlx5: EQ, irq_info and rmap belong to eq_table
  net/mlx5: EQ, Create all EQs in one place
  net/mlx5: EQ, Move all EQ logic to eq.c
  net/mlx5: EQ, Remove redundant completion EQ list lock
  net/mlx5: EQ, No need to store eq index as a field
  net/mlx5: EQ, Remove unused fields and structures
  net/mlx5: EQ, Use the right place to store/read IRQ affinity hint
  IB/mlx5: Improve ODP debugging messages
  net/mlx5: Use multi threaded workqueue for page fault handling
  net/mlx5: Return success for PAGE_FAULT_RESUME in internal error state
  IB/mlx5: Lock QP during page fault handling
  net/mlx5: Enumerate page fault types
  net/mlx5: Add interface to hold and release core resources
  net/mlx5: Release resource on error flow
  net/mlx5: Fix offsets of ifc reserved fields
  ...

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx5: Allow modify AV in DCI QP to RTR
Artemy Kovalyov [Mon, 5 Nov 2018 06:12:07 +0000 (08:12 +0200)]
IB/mlx5: Allow modify AV in DCI QP to RTR

This is required so the user can set the SL on the DC QP.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Yossi Itigin <yosefe@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agonet/mlx5: EQ, Make EQE access methods inline
Saeed Mahameed [Mon, 19 Nov 2018 18:52:42 +0000 (10:52 -0800)]
net/mlx5: EQ, Make EQE access methods inline

These are one/two liner generic EQ access methods, better have them
declared static inline in eq.h.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years ago{net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA
Saeed Mahameed [Mon, 19 Nov 2018 18:52:41 +0000 (10:52 -0800)]
{net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA

Use the new generic EQ API to move all ODP RDMA data structures and logic
from the mlx5 core driver into the mlx5_ib driver.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agonet/mlx5: EQ, Generic EQ
Saeed Mahameed [Mon, 19 Nov 2018 18:52:40 +0000 (10:52 -0800)]
net/mlx5: EQ, Generic EQ

Add mlx5_eq_{create/destroy}_generic APIs and EQE access methods for
generic EQs owned by mlx5 core consumers.

This API will be used in a downstream patch to move page fault (RDMA ODP)
EQ logic into the mlx5_ib RDMA driver, which will therefore use a generic EQ.

Current mlx5 EQ allocation scheme:
On load mlx5 allocates 4 (for async) + #cores (for data completions)
MSIX vectors, mlx5 core will assign 3 MSIX vectors for internal async
EQs and will use all of the #cores MSIX vectors for completion EQs,
(One vector is going to be reserved for a generic EQ).
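
The vector budget described above works out as follows. This is only
arithmetic on the numbers given in this message; the struct and function
names are hypothetical.

```c
/* Vector budget described above; the struct and names are hypothetical. */
struct vec_budget {
	int total;		/* 4 async + one per core */
	int async_eqs;		/* used by mlx5 core internally */
	int reserved_generic;	/* left for a generic EQ user such as mlx5_ib */
	int completion_eqs;	/* one per core */
};

struct vec_budget msix_budget(int num_cores)
{
	struct vec_budget b;

	b.total = 4 + num_cores;
	b.async_eqs = 3;
	b.reserved_generic = 1;
	b.completion_eqs = num_cores;
	return b;
}
```

For example, with 8 cores the total is 12 vectors: 3 async, 1 reserved and
8 for completions.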

After this patch an external user (e.g. mlx5_ib) of mlx5_core
can use this new API to create new generic EQs with the reserved MSI-X
vector index for that EQ.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>