[LoopVectorize] Add non-zero check for MaxPowerOf2RuntimeVF in computeMaxVF
author David Sherwood <david.sherwood@arm.com>
Wed, 29 Mar 2023 08:54:56 +0000 (08:54 +0000)
committer David Sherwood <david.sherwood@arm.com>
Wed, 29 Mar 2023 10:08:32 +0000 (10:08 +0000)
This one-line patch tightens up the code added in
1c4fedfa35aeb8b456e2d8f4f826c0e026b9d863,
where we try to avoid tail-folding if we know the trip count
will always be a multiple of the runtime VF. The extra check
ensures MaxPowerOf2RuntimeVF is non-zero before it is used.
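
For illustration only, here is a minimal standalone sketch of why the extra
check matters. It is not LLVM code: hasNoTail and its parameters are
hypothetical names, and a plain remainder stands in for whatever analysis
the real code performs. It assumes the optional can be engaged while
holding zero, in which case testing the optional alone is not enough.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper, not part of LLVM: returns true when a loop of
// TripCount iterations leaves no scalar tail for the given maximum
// power-of-2 VF.
bool hasNoTail(unsigned TripCount, std::optional<unsigned> MaxPowerOf2VF) {
  // An engaged optional that holds 0 still converts to true, so the value
  // must be tested separately before it is used as a divisor.
  if (MaxPowerOf2VF && *MaxPowerOf2VF > 0)
    return TripCount % *MaxPowerOf2VF == 0;
  // Unknown or zero VF: conservatively report that a tail may exist.
  return false;
}
```

With only the `if (MaxPowerOf2VF)` test, a zero value would reach the
remainder and divide by zero in this sketch.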

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

index c391882..21858cc 100644
@@ -5170,7 +5170,7 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
       MaxPowerOf2RuntimeVF = std::nullopt; // Stick with tail-folding for now.
   }
 
-  if (MaxPowerOf2RuntimeVF) {
+  if (MaxPowerOf2RuntimeVF && *MaxPowerOf2RuntimeVF > 0) {
     assert((UserVF.isNonZero() || isPowerOf2_32(*MaxPowerOf2RuntimeVF)) &&
            "MaxFixedVF must be a power of 2");
     unsigned MaxVFtimesIC =