[AMDGPU] Add s_nop WaitStates between neighboring mfma
author Austin Kerbow <Austin.Kerbow@amd.com>
Fri, 4 Mar 2022 08:33:21 +0000 (00:33 -0800)
committer Austin Kerbow <Austin.Kerbow@amd.com>
Wed, 23 Mar 2022 20:56:09 +0000 (13:56 -0700)
commit 1e15adba62a9fbc00a9999d75818ef8b1fbb8cd7
tree 6ab03d7f8dfc9a65f2ff3b977901318bb6bf8add
parent ee94a4a3d02f0cc7496eb91ea7d5c0819a6b32a0

In some cases, padding the bubbles between sequential MFMA instructions
may improve inter-wave performance. Add an option to fill a portion of
these stall cycles with s_nops.
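
The arithmetic behind "pad a portion of the stall cycles" can be sketched
as below. This is a minimal illustration, not the code from this patch:
the function name, its parameters, and the idea of expressing the portion
as a percentage are all assumptions made for the example, and the hazard
recognizer may compute the value differently.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not taken from GCNHazardRecognizer.cpp).
//   latency    - stall cycles between two neighboring MFMAs (assumed known)
//   padPercent - portion of that latency to fill with s_nops, 0..100 (assumed knob)
//   elapsed    - wait states already issued since the previous MFMA
// Returns how many additional wait states of s_nop padding to request.
int mfmaPaddingWaitStates(int latency, int padPercent, int elapsed) {
  assert(padPercent >= 0 && padPercent <= 100 && "percentage out of range");
  // Integer division rounds the requested padding down.
  int requested = latency * padPercent / 100;
  // Instructions already issued between the two MFMAs count toward the padding.
  return std::max(0, requested - elapsed);
}
```

With these assumed inputs, a latency of 8, a portion of 50 percent, and one
wait state already elapsed would request 3 further wait states; once enough
independent instructions sit between the two MFMAs, nothing is added.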

Fixes: SWDEV-326925

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D121437
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
llvm/test/CodeGen/AMDGPU/neighboring-mfma-padding.mir [new file with mode: 0644]