[AMDGPU] Reintroduce CC exception for non-inlined functions in Promote Alloca limits
author pvanhout <pierre.vanhoutryve@amd.com>
Mon, 15 May 2023 09:23:09 +0000 (11:23 +0200)
committer pvanhout <pierre.vanhoutryve@amd.com>
Tue, 23 May 2023 07:01:39 +0000 (09:01 +0200)
commit f104eb6e15503b770734e3a59937c9df865b2814
tree 9b02490b33d11644c7b922337a8d7a7a7381e9d6
parent cd0e9383fc346a8a4ac2d30acf069f4f03fe83d9

This is a partial revert of https://reviews.llvm.org/D145586 (fd1d60873fdc).

D145586 was originally introduced to help with SWDEV-363662, and it did, but
it also caused a 25% drop in performance in
some MIOpen benchmarks where, it seems,
functions are inlined more conservatively.

This patch restores the pre-D145586 behavior
for PromoteAlloca: functions with a non-entry CC
have a 32-VGPR threshold, but only if the function
is not marked with "alwaysinline".
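
The rule above can be sketched as a small standalone model. This is an
illustration only, not the code in AMDGPUPromoteAlloca.cpp: the FunctionInfo
struct, its fields, and the promoteAllocaVGPRBudget name are hypothetical, and
clamping with std::min against a subtarget-provided maximum is an assumption
about how the 32-VGPR threshold is applied.

```cpp
#include <algorithm>

// Hypothetical stand-in for the function properties the heuristic looks at.
struct FunctionInfo {
  bool HasEntryCC;            // true for entry-point (kernel-like) calling conventions
  bool IsAlwaysInline;        // true if the function is marked "alwaysinline"
  unsigned SubtargetMaxVGPRs; // VGPR budget the subtarget would otherwise allow
};

// Returns the VGPR budget PromoteAlloca may assume for this function.
unsigned promoteAllocaVGPRBudget(const FunctionInfo &F) {
  unsigned MaxVGPRs = F.SubtargetMaxVGPRs;
  // Non-entry functions get the 32-VGPR threshold, unless they are marked
  // "alwaysinline" and are therefore expected to be inlined into a caller.
  if (!F.HasEntryCC && !F.IsAlwaysInline)
    MaxVGPRs = std::min(MaxVGPRs, 32u);
  return MaxVGPRs;
}
```

With these assumptions, a non-entry function without "alwaysinline" is limited
to 32 VGPRs, while an entry function or an "alwaysinline" function keeps the
full subtarget budget.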

A good amount of AMDGPU code makes use of
the AMDGPUAlwaysInline pass anyway, so in our
backend "alwaysinline" seems very common.

This change does not affect SWDEV-363662 (the motivating issue for introducing D145586).

Fixes SWDEV-399519

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D150551
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
llvm/test/CodeGen/AMDGPU/vector-alloca-limits.ll