RDMA/mlx5: Fix mkey cache WQ flush
author Moshe Shemesh <moshe@nvidia.com>
Wed, 25 Oct 2023 17:49:59 +0000 (20:49 +0300)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 10 Jan 2024 16:16:55 +0000 (17:16 +0100)
[ Upstream commit a53e215f90079f617360439b1b6284820731e34c ]

The cited patch tries to ensure there are no pending works on the mkey
cache workqueue by disabling the queueing of new works and calling
flush_workqueue(). But this workqueue also has delayed works, which may
still be waiting out their delay time and so are not yet queued.

Add cancel_delayed_work() for the delayed works that are still waiting
to be queued, so that the subsequent flush_workqueue() flushes all works
that are already queued or running.
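
For context, the resulting teardown ordering follows the pattern
sketched below. This is a minimal illustration, not the actual mlx5
code: the my_cache/my_ent types and their fields are hypothetical
stand-ins.

  #include <linux/workqueue.h>
  #include <linux/mutex.h>
  #include <linux/spinlock.h>
  #include <linux/list.h>

  struct my_ent {
          struct list_head        list;
          spinlock_t              lock;
          bool                    disabled;       /* blocks requeueing */
          struct delayed_work     dwork;          /* per-entry delayed work */
  };

  struct my_cache {
          struct mutex            lock;
          struct list_head        ents;
          struct delayed_work     remove_ent_dwork; /* periodic scrubber */
          struct workqueue_struct *wq;
  };

  static void my_cache_cleanup(struct my_cache *cache)
  {
          struct my_ent *ent;

          mutex_lock(&cache->lock);
          /* Deactivate the scrubber's pending timer so it cannot requeue. */
          cancel_delayed_work(&cache->remove_ent_dwork);
          list_for_each_entry(ent, &cache->ents, list) {
                  spin_lock_irq(&ent->lock);
                  ent->disabled = true;   /* no new works after this point */
                  spin_unlock_irq(&ent->lock);
                  /*
                   * A timer-pending delayed work is not yet on the
                   * workqueue, so flush_workqueue() alone would miss it;
                   * cancel it here.
                   */
                  cancel_delayed_work(&ent->dwork);
          }
          mutex_unlock(&cache->lock);

          /* Only already-queued or running works remain to wait for. */
          flush_workqueue(cache->wq);
          destroy_workqueue(cache->wq);
  }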

Fixes: 374012b00457 ("RDMA/mlx5: Fix mkey cache possible deadlock on cleanup")
Link: https://lore.kernel.org/r/b8722f14e7ed81452f791764a26d2ed4cfa11478.1698256179.git.leon@kernel.org
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/infiniband/hw/mlx5/mr.c

index 8a3762d..e062989 100644
@@ -1026,11 +1026,13 @@ void mlx5_mkey_cache_cleanup(struct mlx5_ib_dev *dev)
                return;
 
        mutex_lock(&dev->cache.rb_lock);
+       cancel_delayed_work(&dev->cache.remove_ent_dwork);
        for (node = rb_first(root); node; node = rb_next(node)) {
                ent = rb_entry(node, struct mlx5_cache_ent, node);
                xa_lock_irq(&ent->mkeys);
                ent->disabled = true;
                xa_unlock_irq(&ent->mkeys);
+               cancel_delayed_work(&ent->dwork);
        }
        mutex_unlock(&dev->cache.rb_lock);
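
Note that the fix uses cancel_delayed_work(), not
cancel_delayed_work_sync(): the non-sync variant only deactivates a
timer-pending work and never sleeps waiting for a running one, so it is
safe to call while holding rb_lock. A work that has already started
executing observes ent->disabled and is waited out by the later
flush_workqueue(). Given that the Fixes commit addressed a cleanup
deadlock, avoiding the sync variant under the lock is presumably
deliberate.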