net/smc: Avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT
authorWen Gu <guwen@linux.alibaba.com>
Thu, 1 Jun 2023 08:41:52 +0000 (16:41 +0800)
committerDavid S. Miller <davem@davemloft.net>
Sat, 3 Jun 2023 19:51:04 +0000 (20:51 +0100)
SMCRv1 has a similar issue to SMCRv2 (see link below) that may access
invalid MRs of RMBs when construct LLC ADD LINK CONT messages.

 BUG: kernel NULL pointer dereference, address: 0000000000000014
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 5 PID: 48 Comm: kworker/5:0 Kdump: loaded Tainted: G W   E      6.4.0-rc3+ #49
 Workqueue: events smc_llc_add_link_work [smc]
 RIP: 0010:smc_llc_add_link_cont+0x160/0x270 [smc]
 RSP: 0018:ffffa737801d3d50 EFLAGS: 00010286
 RAX: ffff964f82144000 RBX: ffffa737801d3dd8 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff964f81370c30
 RBP: ffffa737801d3dd4 R08: ffff964f81370000 R09: ffffa737801d3db0
 R10: 0000000000000001 R11: 0000000000000060 R12: ffff964f82e70000
 R13: ffff964f81370c38 R14: ffffa737801d3dd3 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff9652bfd40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000014 CR3: 000000008fa20004 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  smc_llc_srv_rkey_exchange+0xa7/0x190 [smc]
  smc_llc_srv_add_link+0x3ae/0x5a0 [smc]
  smc_llc_add_link_work+0xb8/0x140 [smc]
  process_one_work+0x1e5/0x3f0
  worker_thread+0x4d/0x2f0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xe5/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  </TASK>

When an alernate RNIC is available in system, SMC will try to add a new
link based on the RNIC for resilience. All the RMBs in use will be mapped
to the new link. Then the RMBs' MRs corresponding to the new link will
be filled into LLC messages. For SMCRv1, they are ADD LINK CONT messages.

However smc_llc_add_link_cont() may mistakenly access to unused RMBs which
haven't been mapped to the new link and have no valid MRs, thus causing a
crash. So this patch fixes it.

Fixes: 87f88cda2128 ("net/smc: rkey processing for a new link as SMC client")
Link: https://lore.kernel.org/r/1685101741-74826-3-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/smc/smc_llc.c

index 7a8d9163d186e176808818488ceae05c3cc81701..90f0b60b196ab4a75f0b6a856eb9cdce8a22ae26 100644 (file)
@@ -851,6 +851,8 @@ static int smc_llc_add_link_cont(struct smc_link *link,
        addc_llc->num_rkeys = *num_rkeys_todo;
        n = *num_rkeys_todo;
        for (i = 0; i < min_t(u8, n, SMC_LLC_RKEYS_PER_CONT_MSG); i++) {
+               while (*buf_pos && !(*buf_pos)->used)
+                       *buf_pos = smc_llc_get_next_rmb(lgr, buf_lst, *buf_pos);
                if (!*buf_pos) {
                        addc_llc->num_rkeys = addc_llc->num_rkeys -
                                              *num_rkeys_todo;
@@ -867,8 +869,6 @@ static int smc_llc_add_link_cont(struct smc_link *link,
 
                (*num_rkeys_todo)--;
                *buf_pos = smc_llc_get_next_rmb(lgr, buf_lst, *buf_pos);
-               while (*buf_pos && !(*buf_pos)->used)
-                       *buf_pos = smc_llc_get_next_rmb(lgr, buf_lst, *buf_pos);
        }
        addc_llc->hd.common.llc_type = SMC_LLC_ADD_LINK_CONT;
        addc_llc->hd.length = sizeof(struct smc_llc_msg_add_link_cont);