From 6600e1759be1626965a26cf1da8d8f8fc73344ca Mon Sep 17 00:00:00 2001
From: Philip Reames <listmail@philipreames.com>
Date: Tue, 31 Aug 2021 08:43:29 -0700
Subject: [PATCH] [SCEV] If max BTC is zero, then so is the exact BTC [1 of N]

This patch covers only the howManyLessThans case.  A couple of follow-on patches will handle the other codepaths.

The subtle bit is explaining why the two codepaths produce different results while both are correct. The modified test case is a good example, so let's discuss in terms of it.
* The previous exact bound for this example, (-126 + (126 smax %n))<nsw>, can evaluate to either 0 or 1. Both are "correct" results, but only one of them corresponds to a well defined loop. If %n were 127 (the only possible value producing a trip count of 1), then the loop must execute undefined behavior. As a result, we can ignore the TC computed when %n is 127. All other values produce 0.
* The max taken count computation uses the limit (i.e. the maximum value END can take without resulting in UB) to restrict the bound computation. As a result, it returns 0, which is also correct.
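
To make the first bullet concrete, here is a minimal standalone sketch (not part of the patch) that evaluates the old exact bound for every i8 value of %n. The helper names are hypothetical and exist only for illustration; the arithmetic is done in int, so it models the expression's value and not the <nsw> flag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: value of (-126 + (126 smax %n)) for one i8 %n.
// Widened to int so the evaluation itself cannot overflow.
int exactBackedgeTakenCount(int8_t N) {
  return -126 + std::max(126, static_cast<int>(N));
}

// Hypothetical helper: every i8 value of %n for which the old exact
// bound evaluates to Count.
std::vector<int> valuesProducing(int Count) {
  std::vector<int> Result;
  for (int N = INT8_MIN; N <= INT8_MAX; ++N)
    if (exactBackedgeTakenCount(static_cast<int8_t>(N)) == Count)
      Result.push_back(N);
  return Result;
}

// Checks the claim from the description: only %n == 127 yields 1,
// and the remaining 255 values all yield 0.
void checkOldExactBound() {
  std::vector<int> Ones = valuesProducing(1);
  assert(Ones.size() == 1 && Ones[0] == 127);
  assert(valuesProducing(0).size() == 255);
}
```

Calling checkOldExactBound() from a small driver should pass without tripping an assertion. Since %n == 127 is the case the description rules out as UB, 0 is the only count a well defined execution can observe.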

WARNING: The logic above only holds for a single exit loop. The current logic for max trip count would be incorrect for multiple exit loops, but we only call computeMaxBECountForLT when we can prove either a) no overflow occurs in this IV before exit, or b) this is the sole exit.

An alternate approach here would be to add the limit logic to the symbolic path. I haven't played with this extensively, but I'm hesitant because a) the term is optional and b) I'm not sure it'll reliably simplify away. As such, the resulting code quality from expansion might actually get worse.

This was noticed while trying to figure out why D108848 wasn't NFC, but is otherwise standalone.

Differential Revision: https://reviews.llvm.org/D108921
---
 llvm/lib/Analysis/ScalarEvolution.cpp                | 4 ++++
 llvm/test/Analysis/ScalarEvolution/max-trip-count.ll | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index d77934c..0dfa2e2 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -11939,6 +11939,10 @@ ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,
   } else {
     MaxBECount = computeMaxBECountForLT(
         Start, Stride, RHS, getTypeSizeInBits(LHS->getType()), IsSigned);
+    // If we prove the max count is zero, so is the symbolic bound.  This can
+    // happen due to differences in how we reason about bounds implied by UB.
+    if (MaxBECount->isZero())
+      BECount = MaxBECount;
   }
 
   if (isa<SCEVCouldNotCompute>(MaxBECount) &&
diff --git a/llvm/test/Analysis/ScalarEvolution/max-trip-count.ll b/llvm/test/Analysis/ScalarEvolution/max-trip-count.ll
index 75fd30c..f50ff14 100644
--- a/llvm/test/Analysis/ScalarEvolution/max-trip-count.ll
+++ b/llvm/test/Analysis/ScalarEvolution/max-trip-count.ll
@@ -458,7 +458,7 @@ loop.exit:
 
 define void @max_overflow_se(i8 %n) mustprogress {
 ; CHECK-LABEL: Determining loop execution counts for: @max_overflow_se
-; CHECK: Loop %loop: backedge-taken count is (-126 + (126 smax %n))<nsw>
+; CHECK: Loop %loop: backedge-taken count is 0
 ; CHECK: Loop %loop: max backedge-taken count is 0
 entry:
   br label %loop
-- 
2.7.4