powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs
author    Nicholas Piggin <npiggin@gmail.com>
          Wed, 24 May 2023 06:08:21 +0000 (16:08 +1000)
committer Michael Ellerman <mpe@ellerman.id.au>
          Wed, 2 Aug 2023 12:22:19 +0000 (22:22 +1000)
This performs the lazy tlb mm shootdown as part of the exit TLB flush,
which runs when all mm users go away and user mappings are removed.
That avoids having to send the lazy tlb mm shootdown IPIs on the final
mmput, when the last kernel references disappear.
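
Roughly, each shootdown IPI makes the target CPU stop using the dying
mm as its lazy tlb mm and flush its local TLB for it. A simplified
sketch of the IPI handler, modelled on do_exit_flush_lazy_tlb() in
radix_tlb.c (kthread races and irq disabling elided):

	static void do_exit_flush_lazy_tlb(void *arg)
	{
		struct mm_struct *mm = arg;
		unsigned long pid = mm->context.id;

		if (current->active_mm == mm) {
			/* Kernel thread using mm as its lazy tlb mm:
			 * switch to init_mm before mm can be freed. */
			mmgrab_lazy_tlb(&init_mm);
			current->active_mm = &init_mm;
			switch_mm_irqs_off(mm, &init_mm, current);
			mmdrop_lazy_tlb(mm);
		}

		/* Flush this CPU's TLB entries for the exiting mm. */
		_tlbiel_pid(pid, RIC_FLUSH_ALL);
	}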

powerpc/64s uses a broadcast TLBIE for the exit TLB flush if remote
CPUs need to be invalidated (unless TLBIE is disabled), so this doesn't
necessarily save IPIs, but it does avoid a broadcast TLBIE, which is
quite expensive.
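
For comparison, the exit flush for an mm with remote users previously
came down to a choice between the two global invalidation strategies;
a simplified sketch of what __flush_all_mm(mm, true) boils down to for
a non-local mm (helper names are from radix_tlb.c):

	if (cputlb_use_tlbie())
		/* One broadcast invalidation: a single instruction
		 * sequence, but it occupies the fabric and slows
		 * translation on every CPU in the system. */
		_tlbie_pid(mm->context.id, RIC_FLUSH_ALL);
	else
		/* IPI the CPUs in mm_cpumask(mm) to run tlbiel
		 * locally instead. */
		_tlbiel_pid_multicast(mm, mm->context.id, RIC_FLUSH_ALL);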

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Squash in preempt_disable/enable() fix from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230524060821.148015-5-npiggin@gmail.com
arch/powerpc/mm/book3s64/radix_tlb.c

index 4e72d80..dd12708 100644
@@ -1313,7 +1313,35 @@ void radix__tlb_flush(struct mmu_gather *tlb)
         * See the comment for radix in arch_exit_mmap().
         */
        if (tlb->fullmm) {
-               __flush_all_mm(mm, true);
+               if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_SHOOTDOWN)) {
+                       /*
+                        * Shootdown based lazy tlb mm refcounting means we
+                        * have to IPI everyone in the mm_cpumask anyway soon
+                        * when the mm goes away, so might as well do it as
+                        * part of the final flush now.
+                        *
+                        * If lazy shootdown was improved to reduce IPIs (e.g.,
+                        * by batching), then it may end up being better to use
+                        * tlbies here instead.
+                        */
+                       preempt_disable();
+
+                       smp_mb(); /* see radix__flush_tlb_mm */
+                       exit_flush_lazy_tlbs(mm);
+                       _tlbiel_pid(mm->context.id, RIC_FLUSH_ALL);
+
+                       /*
+                        * It should not be possible to have coprocessors still
+                        * attached here.
+                        */
+                       if (WARN_ON_ONCE(atomic_read(&mm->context.copros) > 0))
+                               __flush_all_mm(mm, true);
+
+                       preempt_enable();
+               } else {
+                       __flush_all_mm(mm, true);
+               }
+
        } else if ( (psize = radix_get_mmu_psize(page_size)) == -1) {
                if (!tlb->freed_tables)
                        radix__flush_tlb_mm(mm);
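
The preempt_disable()/preempt_enable() pair (squashed in as a fix, see
above) matters because _tlbiel_pid() only invalidates the local CPU's
TLB: the task must not migrate between exit_flush_lazy_tlbs() sending
the IPIs and the local flush, or the CPU it migrated away from could be
left with stale entries. The smp_mb() mirrors the barrier at the top of
radix__flush_tlb_mm(), ordering the preceding PTE clearing stores
against the loads of mm_cpumask() that decide which CPUs to IPI.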