author Ayal Zaks <ayal.zaks@intel.com>
Thu, 18 Oct 2018 15:03:15 +0000 (15:03 +0000)
committer Ayal Zaks <ayal.zaks@intel.com>
Thu, 18 Oct 2018 15:03:15 +0000 (15:03 +0000)
commit b0b5312e677ccbe568ffe4ea8247c4384d30b000
tree 8842218ea6576623d45b67fcf7125a660841b436
parent a1e6e65b9fd68490db04530a45c9333cf69b6213
[LV] Fold tail by masking to vectorize loops of arbitrary trip count under opt for size

When optimizing for size, a loop is vectorized only if the resulting vector loop
completely replaces the original scalar loop. This holds if no runtime guards
are needed, if the original trip-count TC does not overflow, and if TC is a
known constant that is a multiple of the VF. The last two TC-related conditions
can be overcome by
1. rounding the trip-count of the vector loop up from TC to a multiple of VF;
2. masking the vector body under a newly introduced "if (i <= TC-1)" condition.
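
The two steps above can be sketched as a scalar emulation in plain C++. This is
only an illustration of the idea, not the code the vectorizer emits: the names
VF, roundUpToVF and addOneTailFolded are hypothetical, the lane loop stands in
for a single masked vector operation, and the sketch ignores the case where
rounding TC up would itself overflow.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vectorization factor, chosen only for illustration.
constexpr std::size_t VF = 4;

// Step 1: round the trip count up from TC to a multiple of VF.
// Assumption: TC + VF - 1 does not overflow.
constexpr std::size_t roundUpToVF(std::size_t TC) {
  return (TC + VF - 1) / VF * VF;
}

// Scalar emulation of a tail-folded vector loop computing Dst[i] = Src[i] + 1.
// Returns the number of "vector" iterations executed; no scalar remainder
// loop is needed because the tail is folded into the last masked iteration.
std::size_t addOneTailFolded(const std::vector<int> &Src,
                             std::vector<int> &Dst) {
  assert(Dst.size() >= Src.size() && "destination too small");
  const std::size_t TC = Src.size();
  const std::size_t VectorTC = roundUpToVF(TC);
  std::size_t VectorIterations = 0;

  for (std::size_t Base = 0; Base < VectorTC; Base += VF) {
    // Step 2: compute the per-lane mask "i <= TC-1". The loop body only runs
    // when TC > 0, so TC - 1 cannot wrap around here.
    std::array<bool, VF> Mask{};
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      Mask[Lane] = (Base + Lane <= TC - 1);

    // Masked body: inactive lanes perform no memory access at all.
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      if (Mask[Lane])
        Dst[Base + Lane] = Src[Base + Lane] + 1;

    ++VectorIterations;
  }
  return VectorIterations;
}
```

For example, with TC = 6 and the assumed VF = 4, the rounded-up trip count is 8,
so the loop runs two vector iterations and the last two lanes of the second
iteration are masked off.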

The patch allows loops with arbitrary trip counts to be vectorized under -Os,
subject to the existing cost model considerations. It also applies to loops with
small trip counts (under -O2) which are currently handled as if under -Os.

The patch does not handle loops with reductions or live-outs, or loops without
a primary induction variable, and it disallows interleave groups.

(Third, final and main part of -)
Differential Revision: https://reviews.llvm.org/D50480

llvm-svn: 344743
llvm/include/llvm/Analysis/VectorUtils.h
llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/lib/Transforms/Vectorize/VPlan.cpp
llvm/lib/Transforms/Vectorize/VPlan.h
llvm/test/Transforms/LoopVectorize/X86/optsize.ll
llvm/test/Transforms/LoopVectorize/X86/small-size.ll
llvm/test/Transforms/LoopVectorize/X86/vect.omp.force.small-tc.ll