AMDGPU: Use the implicit kernargs for code object version 5
author Changpeng Fang <Changpeng.Fang@amd.com>
Thu, 17 Mar 2022 21:12:36 +0000 (14:12 -0700)
committer Changpeng Fang <Changpeng.Fang@amd.com>
Thu, 17 Mar 2022 21:12:36 +0000 (14:12 -0700)
commit dd5895cc39864393f8ca357bc4e23e8d7b5b9723
tree 8e7f42ff744fb17083ce3507f54c691c2a7a52bd
parent 2c9995c117f521a13f47fedf9c5648a82318d29d

Summary:
  For trap handling on targets that do not support getDoorbellID, we load
queue_ptr from the implicit kernarg and move it to s[0:1]. To get the aperture
bases on targets that do not have aperture registers, we load private_base or
shared_base directly from the implicit kernarg. In clang, we use
implicitarg_ptr + offsets to implement __builtin_amdgcn_workgroup_size_{x,y,z}.
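
The "implicitarg_ptr + offset" addressing described above can be sketched on the host as plain loads from a byte buffer. This is a minimal illustration, not the code from this patch: the helper names (loadAt, workgroupSize) are hypothetical, and the byte offsets (12 for the three 16-bit group sizes, 192/196 for private_base/shared_base, 200 for queue_ptr) are assumptions based on the code object v5 implicit kernarg layout rather than values shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed byte offsets into the code object v5 implicit kernarg segment.
// These constants and their names are illustrative, not taken from the patch.
constexpr std::size_t kGroupSizeXOffset = 12;    // x, y, z stored as consecutive 16-bit values
constexpr std::size_t kPrivateBaseOffset = 192;  // 32-bit value
constexpr std::size_t kSharedBaseOffset = 196;   // 32-bit value
constexpr std::size_t kQueuePtrOffset = 200;     // 64-bit value

// Hypothetical helper: read a value of type T at implicitArgPtr + offset.
template <typename T>
T loadAt(const unsigned char *implicitArgPtr, std::size_t offset) {
  T value;
  std::memcpy(&value, implicitArgPtr + offset, sizeof(T));
  return value;
}

// Hypothetical helper: workgroup size for dimension dim (0 = x, 1 = y, 2 = z).
std::uint16_t workgroupSize(const unsigned char *implicitArgPtr, unsigned dim) {
  assert(dim < 3 && "dimension must be x, y, or z");
  return loadAt<std::uint16_t>(implicitArgPtr, kGroupSizeXOffset + dim * 2);
}
```

Under the same assumptions, loadAt<std::uint32_t>(ptr, kSharedBaseOffset) and loadAt<std::uint64_t>(ptr, kQueuePtrOffset) would model the shared_base and queue_ptr loads; in the backend these are emitted as loads relative to the implicit kernarg pointer rather than host-side memcpy calls.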

Reviewers: arsenm, sameerds, yaxunl

Differential Revision: https://reviews.llvm.org/D120265
13 files changed:
clang/lib/CodeGen/CGBuiltin.cpp
clang/test/CodeGenCUDA/amdgpu-workgroup-size.cu
llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
llvm/lib/Target/AMDGPU/SIDefines.h
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
llvm/test/CodeGen/AMDGPU/GlobalISel/implicit-kernarg-backend-usage-global-isel.ll [new file with mode: 0644]
llvm/test/CodeGen/AMDGPU/implicit-kernarg-backend-usage.ll [new file with mode: 0644]