misc: bcm_vk: Fix potential deadlock on &vk->ctx_lock
authorChengfeng Ye <dg573847474@gmail.com>
Thu, 29 Jun 2023 18:29:41 +0000 (18:29 +0000)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Fri, 4 Aug 2023 13:45:19 +0000 (15:45 +0200)
commit1bae5c0e2c8d48c1c1306912dfae41d165508f58
treec2392e8da306274959729121bd9f5652243c59fe
parentacdbfa04816a3b9e990a9b07d5978f35d67f19c3
misc: bcm_vk: Fix potential deadlock on &vk->ctx_lock

As &vk->ctx_lock is acquired by timer bcm_vk_hb_poll() under softirq
context, other process context code should disable irq or bottom-half
before acquire the same lock, otherwise deadlock could happen if the
timer preempt the execution while the lock is held in process context
on the same CPU.

Possible deadlock scenario
bcm_vk_open()
    -> bcm_vk_get_ctx()
    -> spin_lock(&vk->ctx_lock)
<timer iterrupt>
-> bcm_vk_hb_poll()
-> bcm_vk_blk_drv_access()
-> spin_lock_irqsave(&vk->ctx_lock, flags) (deadlock here)

This flaw was found using an experimental static analysis tool we are
developing for irq-related deadlock, which reported the following
warning when analyzing the linux kernel 6.4-rc7 release.

[Deadlock]: &vk->ctx_lock
  [Interrupt]: bcm_vk_hb_poll
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
  [Locking Unit]: bcm_vk_ioctl
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:1181
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512

[Deadlock]: &vk->ctx_lock
  [Interrupt]: bcm_vk_hb_poll
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
  [Locking Unit]: bcm_vk_ioctl
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:1169

[Deadlock]: &vk->ctx_lock
  [Interrupt]: bcm_vk_hb_poll
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
  [Locking Unit]: bcm_vk_open
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:216

[Deadlock]: &vk->ctx_lock
  [Interrupt]: bcm_vk_hb_poll
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:176
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_dev.c:512
  [Locking Unit]: bcm_vk_release
    -->/root/linux/drivers/misc/bcm-vk/bcm_vk_msg.c:306

As suggested by Arnd, the tentative patch fix the potential deadlocks
by replacing the timer with delay workqueue. x86_64 allyesconfig using
GCC shows no new warning. Note that no runtime testing was performed
due to no device on hand.

Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Tested-by: Desmond Yan <desmond.branden@broadcom.com>
Tested-by: Desmond Yan <desmond.yan@broadcom.com>
Link: https://lore.kernel.org/r/20230629182941.13045-1-dg573847474@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/misc/bcm-vk/bcm_vk.h
drivers/misc/bcm-vk/bcm_vk_msg.c