[LoadStoreVectorizer] Change VectorSet to Vector to match head and tail positions...
author Alina Sbirlea <asbirlea@google.com>
Tue, 30 Aug 2016 23:53:59 +0000 (23:53 +0000)
committer Alina Sbirlea <asbirlea@google.com>
Tue, 30 Aug 2016 23:53:59 +0000 (23:53 +0000)
commit 3f8f7840bf12ffa4bfd558e5115acbd66b39280a
tree a20c66b655f152f81902a1037d847e83c3a3402f
parent fdb32d566a5e81deb9a3e4c0714f74337edcfbb7
[LoadStoreVectorizer] Change VectorSet to Vector to match head and tail positions. Resolves PR29148.

Summary:
LSV was using two vector sets (heads and tails) to track pairs of adjacent positions to vectorize.
A recent optimization tries to obtain the longest chain to vectorize and assumes the positions
in heads (H) and tails (T) match, which is not the case if there are multiple tails for the same head.

e.g.:
i1: store a[0]
i2: store a[1]
i3: store a[1]
Leads to:
H: i1
T: i2 i3
Instead of:
H: i1 i1
T: i2 i3
So the positions for instructions that follow i3 will have different indexes in H/T.
This patch resolves PR29148.
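
The index mismatch described above can be reproduced outside LLVM with a small standalone sketch. This is not the LoadStoreVectorizer code: the names `collect`, `positionsMatch`, and `examplePairs` are hypothetical, instructions are plain strings, and the de-duplicating insert only approximates how a set-like vector behaves. The extra pair (i4, i5) is an assumed follow-on pair, added to show what happens to positions after i3.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Inst = std::string;
using Pair = std::pair<Inst, Inst>;  // (head, tail) of one adjacent pair

// Hypothetical input: i1 has two tails (i2 and i3), then an unrelated pair.
std::vector<Pair> examplePairs() {
  return {{"i1", "i2"}, {"i1", "i3"}, {"i4", "i5"}};
}

// Collects the heads (or tails) of all pairs in insertion order.
// With dedup == true, repeated elements are dropped, which mimics a
// set-like vector; with dedup == false, this is a plain vector.
std::vector<Inst> collect(const std::vector<Pair> &pairs, bool takeHead,
                          bool dedup) {
  std::vector<Inst> out;
  for (const Pair &p : pairs) {
    const Inst &v = takeHead ? p.first : p.second;
    if (dedup && std::find(out.begin(), out.end(), v) != out.end())
      continue;  // already present: a set keeps only the first occurrence
    out.push_back(v);
  }
  return out;
}

// Returns true if heads[i] and tails[i] describe the i-th pair for every i.
bool positionsMatch(const std::vector<Inst> &heads,
                    const std::vector<Inst> &tails,
                    const std::vector<Pair> &pairs) {
  if (heads.size() != pairs.size() || tails.size() != pairs.size())
    return false;
  for (std::size_t i = 0; i < pairs.size(); ++i)
    if (heads[i] != pairs[i].first || tails[i] != pairs[i].second)
      return false;
  return true;
}
```

With de-duplication, H becomes {i1, i4} and T becomes {i2, i3, i5}, so index 1 pairs i4 with i3, which is not a real pair. With plain vectors, H is {i1, i1, i4} and T is {i2, i3, i5}, and every index names one actual (head, tail) pair.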

This issue also surfaced the fact that if the chain is too long, and TLI
returns a "not-fast" answer, the whole chain will be abandoned for
vectorization, even though a smaller one would be beneficial.
Added a testcase and FIXME for this.

Reviewers: tstellarAMD, arsenm, jlebar

Subscribers: mzolotukhin, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24057

llvm-svn: 280179
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/multiple_tails.ll [new file with mode: 0644]
llvm/test/Transforms/LoadStoreVectorizer/X86/subchain-interleaved.ll