AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler
author Tom Stellard <thomas.stellard@amd.com>
Mon, 29 Aug 2016 19:15:22 +0000 (19:15 +0000)
committer Tom Stellard <thomas.stellard@amd.com>
Mon, 29 Aug 2016 19:15:22 +0000 (19:15 +0000)
commit c2ff0eb69762f0c87545b74c89d99cfdbb0913e9
tree 23670976c67603007a5f4b62773205d746b8b9dc
parent c10c33444e9c02124eb39d1521646ee1bc8a5525

Summary:
The SILoadStoreOptimizer can now look ahead more than one instruction when
searching for instructions to merge, which greatly increases the number of
loads/stores that we are able to merge.
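
As a rough illustration of the look-ahead idea (not the code in
SILoadStoreOptimizer.cpp), the sketch below scans forward from a load for a
later load off the same base address, instead of checking only the next
instruction. The Inst model and the findPairableLoad name are hypothetical,
and the rule of giving up at the first store is a conservative assumption
made for this example; the real pass decides legality with its own checks.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified instruction model; this is not LLVM's MachineInstr.
enum class Kind { Load, Store, Other };

struct Inst {
  Kind kind;
  int base;   // id of the address register (meaningful for Load/Store only)
  int offset; // constant offset from the base address
};

// Returns the index of a later load that could be paired with the load at
// index i, or std::nullopt if no candidate is found.
std::optional<std::size_t> findPairableLoad(const std::vector<Inst> &insts,
                                            std::size_t i) {
  if (i >= insts.size() || insts[i].kind != Kind::Load)
    return std::nullopt;

  // Look ahead past unrelated instructions rather than stopping at i + 1.
  for (std::size_t j = i + 1; j < insts.size(); ++j) {
    const Inst &cand = insts[j];

    // Assumption for this sketch: any intervening store might alias the
    // loaded memory, so stop searching.
    if (cand.kind == Kind::Store)
      return std::nullopt;

    // Same base and a different offset: a candidate for merging.
    if (cand.kind == Kind::Load && cand.base == insts[i].base &&
        cand.offset != insts[i].offset)
      return j;
  }
  return std::nullopt;
}
```

With a one-instruction window, a sequence such as load, unrelated ALU
instruction, load would never be paired; with the forward scan above, the
second load is found at index 2.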

Moving the pass before scheduling avoids increasing register pressure after
the scheduler, so that the scheduler's register pressure estimates will be
more accurate.  It also gives more consistent results, since it is no longer
affected by minor scheduling changes.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23814

llvm-svn: 279991
12 files changed:
llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
llvm/test/CodeGen/AMDGPU/ds_read2_offset_order.ll
llvm/test/CodeGen/AMDGPU/ds_write2.ll
llvm/test/CodeGen/AMDGPU/fceil64.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rsq.clamp.ll
llvm/test/CodeGen/AMDGPU/load-local-i16.ll
llvm/test/CodeGen/AMDGPU/load-local-i32.ll
llvm/test/CodeGen/AMDGPU/local-memory.amdgcn.ll
llvm/test/CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll
llvm/test/CodeGen/AMDGPU/store-v3i64.ll
llvm/test/CodeGen/AMDGPU/use-sgpr-multiple-times.ll