We can race where we have added work to the work_list, but
vhost_task_fn has passed that check but not yet set us into
TASK_INTERRUPTIBLE. wake_up_process will see us in TASK_RUNNING and
just return.
This bug was intoduced in commit
f9010dbdce91 ("fork, vhost: Use
CLONE_THREAD to fix freezer/ps regression") when I moved the setting
of TASK_INTERRUPTIBLE to simplfy the code and avoid get_signal from
logging warnings about being in the wrong state. This moves the setting
of TASK_INTERRUPTIBLE back to before we test if we need to stop the
task to avoid a possible race there as well. We then have vhost_worker
set TASK_RUNNING if it finds work similar to before.
Fixes:
f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <
20230607192338.6041-3-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
node = llist_del_all(&worker->work_list);
if (node) {
+ __set_current_state(TASK_RUNNING);
+
node = llist_reverse_order(node);
/* make sure flag is seen after deletion */
smp_wmb();
for (;;) {
bool did_work;
- /* mb paired w/ vhost_task_stop */
- if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags))
- break;
-
if (!dead && signal_pending(current)) {
struct ksignal ksig;
/*
clear_thread_flag(TIF_SIGPENDING);
}
+ /* mb paired w/ vhost_task_stop */
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
+ __set_current_state(TASK_RUNNING);
+ break;
+ }
+
did_work = vtsk->fn(vtsk->data);
- if (!did_work) {
- set_current_state(TASK_INTERRUPTIBLE);
+ if (!did_work)
schedule();
- }
}
complete(&vtsk->exited);