AMDGPU: Don't report 2-byte alignment as fast
author Matt Arsenault <Matthew.Arsenault@amd.com>
Mon, 10 Feb 2020 15:30:34 +0000 (10:30 -0500)
committer Matt Arsenault <arsenm2@gmail.com>
Tue, 11 Feb 2020 23:35:00 +0000 (18:35 -0500)
commit 86f9117d476bcef2f5e0eabae4781e99877ce7b5
tree 0928e0493bbb31d255ed9bce634c8f15626f3aae
parent b2c44de956cca22efa374cfb587912b38c41ed67

2-byte alignment is apparently worse than 1-byte alignment. This change
does not attempt to decompose 2-byte aligned wide stores, but it stops
trying to produce them.
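
As an illustration only, the intended reporting behavior can be sketched
as a small standalone predicate. This is not the code in
SIISelLowering.cpp: the names MisalignedAccess and queryMisalignedAccess
are hypothetical, and the sketch assumes a target where underaligned
accesses are legal, so only the "fast" answer varies.

```cpp
#include <cstdint>

// Hypothetical result type; this is not the LLVM TargetLowering interface.
struct MisalignedAccess {
  bool Allowed; // the access may be emitted at all
  bool Fast;    // the access is worth forming when merging or widening
};

// Hypothetical predicate. Assumption: every underaligned access is legal on
// this target, and 2-byte alignment is not reported as fast for accesses
// wider than 2 bytes.
MisalignedAccess queryMisalignedAccess(uint64_t SizeInBytes,
                                       uint64_t AlignInBytes) {
  // Naturally aligned or over-aligned accesses are always fine.
  if (AlignInBytes >= SizeInBytes)
    return {true, true};

  // A wide access with only 2-byte alignment stays legal, but callers that
  // ask for "fast" will no longer try to create it.
  if (AlignInBytes == 2)
    return {true, false};

  // Other underaligned cases (including 1-byte) keep their "fast" answer
  // in this sketch.
  return {true, true};
}
```

In this sketch an 8-byte access with 2-byte alignment is still allowed,
which mirrors the note above that existing wide stores are not decomposed.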

Also fix a bug in LoadStoreVectorizer, which was decreasing the alignment
and vectorizing stack accesses. It assumed that a stack object was an
alloca whose base alignment could be changed, which is not true if the
pointer is derived from a function argument.
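
The distinction can be sketched with a toy model. This is not the
LoadStoreVectorizer code: BaseKind, StackBase and ensureBaseAlignment are
hypothetical names, and the model assumes a stack pointer comes from
either a local alloca or a function argument.

```cpp
#include <cstdint>

// Hypothetical model of where a stack pointer comes from.
enum class BaseKind { Alloca, FunctionArgument };

struct StackBase {
  BaseKind Kind;
  uint64_t Align; // known alignment of the base object, in bytes
};

// Returns true if the base already has, or can be given, WantedAlign.
// Only a local alloca can have its alignment raised. The alignment of an
// object reached through a function argument is fixed by the caller.
bool ensureBaseAlignment(StackBase &Base, uint64_t WantedAlign) {
  if (Base.Align >= WantedAlign)
    return true; // already sufficiently aligned

  if (Base.Kind == BaseKind::Alloca) {
    Base.Align = WantedAlign; // safe: this function owns the allocation
    return true;
  }

  // Argument-derived pointer: leave the alignment alone and report failure,
  // so the caller does not form the wider access.
  return false;
}
```

A caller in this model would vectorize a stack access only when
ensureBaseAlignment returns true.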
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll [new file with mode: 0644]
llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll [new file with mode: 0644]
llvm/test/CodeGen/AMDGPU/unaligned-load-store.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/adjust-alloca-alignment.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores-private.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll