RDMA/srp: Fix support for unpopulated and unbalanced NUMA nodes
author Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Fri, 5 Feb 2021 08:14:28 +0000 (09:14 +0100)
committer Jason Gunthorpe <jgg@nvidia.com>
Wed, 17 Feb 2021 00:01:50 +0000 (20:01 -0400)
commit 2b5715fc17386a6223490d5b8f08d031999b0c0b
tree 532a56c17078e9d8eb4859e9871aed64eaad1868
parent 68ad4d1cc679c1704faf9db6ddd0550702b5d093

The current code computes a number of channels per SRP target and spreads
them equally across all online NUMA nodes.  Each channel is then assigned
a CPU within its node.

When nodes are unbalanced, or even unpopulated, some channels are never
assigned a CPU and thus never get connected.  This causes the SRP
connection to fail.
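
The failure mode can be illustrated with a simplified userspace model.
This is only a sketch, not the driver code: channels_with_cpu() and its
parameters are hypothetical names, and it assumes each channel in a
node's slice needs its own CPU on that node.

```c
/*
 * Simplified model of the pre-patch scheme (hypothetical helper, not
 * taken from ib_srp.c): channels are split evenly across nodes, and each
 * channel in a node's slice is assumed to need its own CPU on that node.
 * Returns how many channels end up with a CPU.
 */
static int channels_with_cpu(int ch_count, const int *cpus_per_node,
                             int nodes)
{
    int assigned = 0;

    for (int n = 0; n < nodes; n++) {
        /* Slice of channels that belongs to node n. */
        int ch_start = n * ch_count / nodes;
        int ch_end = (n + 1) * ch_count / nodes;
        int slice = ch_end - ch_start;

        /* A node cannot serve more channels than it has CPUs. */
        assigned += cpus_per_node[n] < slice ? cpus_per_node[n] : slice;
    }
    return assigned;
}
```

In this model, 8 channels on two nodes with 4 CPUs each all get a CPU,
while the same 8 channels on one node with 8 CPUs and one with none
leave 4 channels without a CPU, even though enough CPUs are online.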

This patch solves the issue by rewriting channel computation and
allocation:

- Drop channel to node/CPU association as it had no real effect on
  locality but added unnecessary complexity.

- Tweak the number of channels allocated to reduce CPU contention when
  possible:
  - Up to one channel per CPU (instead of up to 4 per node)
  - At least 4 channels per node, unless the ch_count module parameter
    is used.
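
The resulting bounds can be sketched as below.  This is a minimal
userspace illustration under stated assumptions, not the code from the
patch: pick_channel_count() is a hypothetical helper, hw_queues stands in
for whatever device-side hint may raise the count above 4 per node, and
capping a user-supplied ch_count at the CPU count is assumed here.

```c
/* Small helpers so the sketch does not depend on kernel macros. */
static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/*
 * Hypothetical helper: choose a channel count for one SRP target.
 *
 * user_ch_count: value of the ch_count module parameter, 0 if unset
 * online_nodes:  number of online NUMA nodes
 * online_cpus:   number of online CPUs
 * hw_queues:     assumed device-side hint (not named in the commit text)
 */
static int pick_channel_count(int user_ch_count, int online_nodes,
                              int online_cpus, int hw_queues)
{
    /* Without ch_count, ask for at least 4 channels per node. */
    int wanted = user_ch_count > 0
        ? user_ch_count
        : max_int(4 * online_nodes, hw_queues);

    /* Never use more than one channel per online CPU. */
    return min_int(wanted, online_cpus);
}
```

For example, with 2 nodes, 64 CPUs and no ch_count, the sketch asks for
8 channels; on a 2-node machine with only 4 CPUs it is capped at 4.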

Link: https://lore.kernel.org/r/9cb4d9d3-30ad-2276-7eff-e85f7ddfb411@suse.com
Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
drivers/infiniband/ulp/srp/ib_srp.c