srcu: Prevent redundant __srcu_read_unlock() wakeup
authorPaul E. McKenney <paulmck@kernel.org>
Thu, 23 Sep 2021 17:07:14 +0000 (10:07 -0700)
committerPaul E. McKenney <paulmck@kernel.org>
Wed, 1 Dec 2021 01:28:16 +0000 (17:28 -0800)
Tiny SRCU readers can appear at task level, but also in interrupt and
softirq handlers.  Because Tiny SRCU is selected only in kernels built
with CONFIG_SMP=n and CONFIG_PREEMPTION=n, it is not possible for a grace
period to start while there is a non-task-level SRCU reader executing.
This means that it does not make sense for __srcu_read_unlock() to awaken
the Tiny SRCU grace period, because that can only happen when the grace
period is waiting for one value of ->srcu_idx and __srcu_read_unlock()
is ending the last reader for some other value of ->srcu_idx.  After all,
any such wakeup will be redundant.

Worse yet, in some cases, such wakeups generate lockdep splats:

======================================================
WARNING: possible circular locking dependency detected
5.15.0-rc1+ #3758 Not tainted
------------------------------------------------------
rcu_torture_rea/53 is trying to acquire lock:
ffffffff9514e6a8 (srcu_ctl.srcu_wq.lock){..-.}-{2:2}, at:
xa/0x30

but task is already holding lock:
ffff95c642479d80 (&p->pi_lock){-.-.}-{2:2}, at:
_extend+0x370/0x400

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&p->pi_lock){-.-.}-{2:2}:
       _raw_spin_lock_irqsave+0x2f/0x50
       try_to_wake_up+0x50/0x580
       swake_up_locked.part.7+0xe/0x30
       swake_up_one+0x22/0x30
       rcutorture_one_extend+0x1b6/0x400
       rcu_torture_one_read+0x290/0x5d0
       rcu_torture_timer+0x1a/0x70
       call_timer_fn+0xa6/0x230
       run_timer_softirq+0x493/0x4c0
       __do_softirq+0xc0/0x371
       irq_exit+0x73/0x90
       sysvec_apic_timer_interrupt+0x63/0x80
       asm_sysvec_apic_timer_interrupt+0x12/0x20
       default_idle+0xb/0x10
       default_idle_call+0x5e/0x170
       do_idle+0x18a/0x1f0
       cpu_startup_entry+0xa/0x10
       start_kernel+0x678/0x69f
       secondary_startup_64_no_verify+0xc2/0xcb

-> #0 (srcu_ctl.srcu_wq.lock){..-.}-{2:2}:
       __lock_acquire+0x130c/0x2440
       lock_acquire+0xc2/0x270
       _raw_spin_lock_irqsave+0x2f/0x50
       swake_up_one+0xa/0x30
       rcutorture_one_extend+0x387/0x400
       rcu_torture_one_read+0x290/0x5d0
       rcu_torture_reader+0xac/0x200
       kthread+0x12d/0x150
       ret_from_fork+0x22/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&p->pi_lock);
       lock(srcu_ctl.srcu_wq.lock);
       lock(&p->pi_lock);
  lock(srcu_ctl.srcu_wq.lock);

 *** DEADLOCK ***

1 lock held by rcu_torture_rea/53:
 #0: ffff95c642479d80 (&p->pi_lock){-.-.}-{2:2}, at:
_extend+0x370/0x400

stack backtrace:
CPU: 0 PID: 53 Comm: rcu_torture_rea Not tainted 5.15.0-rc1+

Hardware name: Red Hat KVM/RHEL-AV, BIOS
e_el8.5.0+746+bbd5d70c 04/01/2014
Call Trace:
 check_noncircular+0xfe/0x110
 ? find_held_lock+0x2d/0x90
 __lock_acquire+0x130c/0x2440
 lock_acquire+0xc2/0x270
 ? swake_up_one+0xa/0x30
 ? find_held_lock+0x72/0x90
 _raw_spin_lock_irqsave+0x2f/0x50
 ? swake_up_one+0xa/0x30
 swake_up_one+0xa/0x30
 rcutorture_one_extend+0x387/0x400
 rcu_torture_one_read+0x290/0x5d0
 rcu_torture_reader+0xac/0x200
 ? rcutorture_oom_notify+0xf0/0xf0
 ? __kthread_parkme+0x61/0x90
 ? rcu_torture_one_read+0x5d0/0x5d0
 kthread+0x12d/0x150
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x22/0x30

This is a false positive because there is only one CPU, and both locks
are raw (non-preemptible) spinlocks.  However, it is worthwhile getting
rid of the redundant wakeup, which has the side effect of breaking
the theoretical deadlock cycle.  This commit therefore eliminates the
redundant wakeups.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
kernel/rcu/srcutiny.c

index a0ba2ed..92c002d 100644 (file)
@@ -99,7 +99,7 @@ void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
        int newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
 
        WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
-       if (!newval && READ_ONCE(ssp->srcu_gp_waiting))
+       if (!newval && READ_ONCE(ssp->srcu_gp_waiting) && in_task())
                swake_up_one(&ssp->srcu_wq);
 }
 EXPORT_SYMBOL_GPL(__srcu_read_unlock);