NFSv4: Fix dropped lock for racing OPEN and delegation return
authorBenjamin Coddington <bcodding@redhat.com>
Fri, 30 Jun 2023 13:18:13 +0000 (09:18 -0400)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 30 Aug 2023 14:11:05 +0000 (16:11 +0200)
commit 1cbc11aaa01f80577b67ae02c73ee781112125fd upstream.

Commmit f5ea16137a3f ("NFSv4: Retry LOCK on OLD_STATEID during delegation
return") attempted to solve this problem by using nfs4's generic async error
handling, but introduced a regression where v4.0 lock recovery would hang.
The additional complexity introduced by overloading that error handling is
not necessary for this case.  This patch expects that commit to be
reverted.

The problem as originally explained in the above commit is:

    There's a small window where a LOCK sent during a delegation return can
    race with another OPEN on client, but the open stateid has not yet been
    updated.  In this case, the client doesn't handle the OLD_STATEID error
    from the server and will lose this lock, emitting:
    "NFS: nfs4_handle_delegation_recall_error: unhandled error -10024".

Fix this by using the old_stateid refresh helpers if the server replies
with OLD_STATEID.

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fs/nfs/nfs4proc.c

index d673836..1044305 100644 (file)
@@ -7170,8 +7170,15 @@ static void nfs4_lock_done(struct rpc_task *task, void *calldata)
                } else if (!nfs4_update_lock_stateid(lsp, &data->res.stateid))
                        goto out_restart;
                break;
-       case -NFS4ERR_BAD_STATEID:
        case -NFS4ERR_OLD_STATEID:
+               if (data->arg.new_lock_owner != 0 &&
+                       nfs4_refresh_open_old_stateid(&data->arg.open_stateid,
+                                       lsp->ls_state))
+                       goto out_restart;
+               if (nfs4_refresh_lock_old_stateid(&data->arg.lock_stateid, lsp))
+                       goto out_restart;
+               fallthrough;
+       case -NFS4ERR_BAD_STATEID:
        case -NFS4ERR_STALE_STATEID:
        case -NFS4ERR_EXPIRED:
                if (data->arg.new_lock_owner != 0) {