AMDGPU: Optimize outgoing workitem ID based on reqd_work_group_size
author Matt Arsenault <Matthew.Arsenault@amd.com>
Sun, 9 Jan 2022 22:28:41 +0000 (17:28 -0500)
committer Matt Arsenault <Matthew.Arsenault@amd.com>
Thu, 13 Jan 2022 17:08:18 +0000 (12:08 -0500)
commit a6f49423c1ecad4b414c204822d26d9025da2599
tree 1790b412394d661acb04d7a59e2a5da45123094d
parent c719a8596d01cef9b54f0585bd2d68d657d8659a

If we know from reqd_work_group_size that we aren't using a workitem ID
component from the kernel, we can save a few bit packing instructions
when setting up the outgoing workitem ID.

We're still enabling the VGPR input to the kernel though.
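
As a rough illustration of the idea, and not the code in this commit, the sketch below models packing the three workitem ID components into one 32-bit value (X in bits 0-9, Y in bits 10-19, Z in bits 20-29) and skipping any component whose required work-group dimension is 1, since that ID is always 0. The names WorkItemID and packWorkItemID are hypothetical, and using 0 for an unknown dimension size is an assumption of this sketch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model for illustration only; this is not LLVM code.
struct WorkItemID {
  std::uint32_t X, Y, Z;
};

// Packs X into bits [9:0], Y into bits [19:10] and Z into bits [29:20].
// ReqdSize holds the reqd_work_group_size dimensions; in this sketch an entry
// of 0 means "unknown". A dimension of exactly 1 means the matching ID is
// always 0, so its mask/shift/or can be skipped.
constexpr std::uint32_t
packWorkItemID(const WorkItemID &ID,
               const std::array<std::uint32_t, 3> &ReqdSize) {
  std::uint32_t Packed = 0;
  if (ReqdSize[0] != 1)
    Packed |= ID.X & 0x3ffu;
  if (ReqdSize[1] != 1)
    Packed |= (ID.Y & 0x3ffu) << 10;
  if (ReqdSize[2] != 1)
    Packed |= (ID.Z & 0x3ffu) << 20;
  return Packed;
}

// Unknown sizes: all three components are packed.
static_assert(packWorkItemID({5, 3, 2}, {0, 0, 0}) ==
                  (5u | (3u << 10) | (2u << 20)),
              "full packing");

// reqd_work_group_size(64, 1, 1): only X needs to be passed through.
static_assert(packWorkItemID({5, 0, 0}, {64, 1, 1}) == 5u, "X only");
```

With a required size of (64, 1, 1), the Y and Z branches are never taken, which corresponds to the shift and or instructions the optimization avoids emitting.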
llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-implicit-args.ll
llvm/test/CodeGen/AMDGPU/call-reqd-group-size.ll