IB/rdmavt: Handle dereg of inuse MRs properly
authorMike Marciniszyn <mike.marciniszyn@intel.com>
Mon, 28 Aug 2017 18:24:10 +0000 (11:24 -0700)
committerDoug Ledford <dledford@redhat.com>
Mon, 28 Aug 2017 23:12:31 +0000 (19:12 -0400)
commit0208da90def5776cef940f9de4ffe6ecef346207
tree05833a7dffa3de7e1eea51a004871ae9780e40d5
parent557fafe1bfecb50c3da0bc4948ebcbc4d19f1619
IB/rdmavt: Handle dereg of inuse MRs properly

A destroy of an MR prior to destroying the QP can cause the following
diagnostic if the QP is referencing the MR being de-registered:

hfi1 0000:05:00.0: hfi1_0: rvt_dereg_mr timeout mr ffff8808562108
              00 pd ffff880859b20b00

The solution is to when the a non-zero refcount is encountered when
the MR is destroyed the QPs needs to be iterated looking for QPs in
the same PD as the MR.  If rvt_qp_mr_clean() detects any such QP
references the rkey/lkey, the QP needs to be put into an error state
via a call to rvt_qp_error() which will trigger the clean up of any
stuck references.

This solution is as specified in IBTA 1.3 Volume 1 11.2.10.5.

[This is reproduced with the 0.4.9 version of qperf and the rc_bw test]

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
drivers/infiniband/sw/rdmavt/mr.c
drivers/infiniband/sw/rdmavt/qp.c
include/rdma/rdmavt_mr.h
include/rdma/rdmavt_qp.h