AMDGPU: Fix computation for getOccupancyWithLocalMemSize
author Matt Arsenault <Matthew.Arsenault@amd.com>
Mon, 2 Mar 2020 14:43:06 +0000 (09:43 -0500)
committer Matt Arsenault <arsenm2@gmail.com>
Tue, 3 Mar 2020 22:15:57 +0000 (17:15 -0500)
commit 88aced1e454195e038560abb3a0732d020aa4295
tree 1fa025f64ffecbc2ab3a3dcb5348babb7a4a9c6a
parent 27a3ecee45584f6e78b46741111ebbbe5554faad

The computation here didn't really make sense to me, and reported
wildly different results depending on the flat work group size
attribute.

I think this should really report a range derived from the possible
work group size bounds, and only allow an occupancy that is a multiple
of the group size.
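
As a rough illustration of the idea in the paragraph above (not the code
in this patch), the following self-contained sketch computes an occupancy
for each bound of the work group size range and rounds it down to a
multiple of the waves one group needs. The HwLimits struct, both function
names, and the limit values in the usage note are hypothetical, and all
limits are treated as belonging to a single unit for simplicity. Taking
only the two bounds assumes they are enough to describe the range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical hardware limits; not taken from any real subtarget.
struct HwLimits {
  uint32_t LocalMemBytes; // local memory (LDS) available
  uint32_t WaveSize;      // work items per wave
  uint32_t MaxWaves;      // cap on resident waves
  uint32_t MaxGroups;     // cap on resident work groups
};

struct OccupancyRange {
  uint32_t Min;
  uint32_t Max;
};

// Occupancy in waves for one candidate work group size.
// Returns 0 if not even one group fits (an assumption of this sketch).
inline uint32_t occupancyForGroupSize(const HwLimits &HW, uint32_t LdsBytes,
                                      uint32_t GroupSize) {
  // Waves needed to hold one work group, rounded up.
  const uint32_t WavesPerGroup = (GroupSize + HW.WaveSize - 1) / HW.WaveSize;
  if (WavesPerGroup == 0)
    return 0;

  // Groups that fit in local memory; treat 0 bytes as 1 to avoid dividing
  // by zero, then apply the cap on resident groups.
  uint32_t Groups = HW.LocalMemBytes / std::max<uint32_t>(LdsBytes, 1);
  Groups = std::min(Groups, HW.MaxGroups);

  // Convert to waves and clamp to the wave limit.
  uint32_t Waves = std::min(Groups * WavesPerGroup, HW.MaxWaves);

  // Only whole work groups can be resident: round down to a multiple of
  // the per-group wave count.
  return WavesPerGroup * (Waves / WavesPerGroup);
}

// Range of occupancies over the work group size bounds [MinSize, MaxSize].
inline OccupancyRange occupancyRange(const HwLimits &HW, uint32_t LdsBytes,
                                     uint32_t MinSize, uint32_t MaxSize) {
  const uint32_t A = occupancyForGroupSize(HW, LdsBytes, MinSize);
  const uint32_t B = occupancyForGroupSize(HW, LdsBytes, MaxSize);
  return {std::min(A, B), std::max(A, B)};
}
```

With the made-up limits HwLimits{65536, 64, 40, 16}, a kernel using 16384
bytes of local memory and a work group size range of [64, 256] gets the
range [4, 16] from this sketch. A group size of 192 with no local memory
use gets 39 rather than 40, because only whole groups of 3 waves count.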
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
llvm/test/CodeGen/AMDGPU/occupancy-levels.ll