From 0c2d23933f06ed048191f84ecde889e9da93609c Mon Sep 17 00:00:00 2001
From: Jonas Paulsson
Date: Thu, 10 Dec 2020 01:56:45 +0100
Subject: [PATCH] [SystemZTTIImpl] Allow some non-prefetched accesses in getMinPrefetchStride().

The performance improvement on LBM previously achieved with improved software
prefetching (36d4421) has recently been lost with e00f189. There is now one
memory access in the loop that LoopDataPrefetch cannot handle (previously there
were none), so the heuristic rejects the loop.

This patch adds a small margin by allowing 1 non-prefetched memory access for
every 32 prefetched ones, so that the heuristic doesn't bail in this type of
case.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D92985
---
 llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp b/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
index 2c6659b..e7ac239 100644
--- a/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
@@ -341,8 +341,8 @@ unsigned SystemZTTIImpl::getMinPrefetchStride(unsigned NumMemAccesses,
 
   // Emit prefetch instructions for smaller strides in cases where we think
   // the hardware prefetcher might not be able to keep up.
-  if (NumStridedMemAccesses > 32 &&
-      NumStridedMemAccesses == NumMemAccesses && !HasCall)
+  if (NumStridedMemAccesses > 32 && !HasCall &&
+      (NumMemAccesses - NumStridedMemAccesses) * 32 <= NumStridedMemAccesses)
     return 1;
 
   return ST->hasMiscellaneousExtensions3() ? 8192 : 2048;
-- 
2.7.4
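
Note on the new condition: the margin of "1 non-prefetched access for every 32 prefetched ones" can be checked with a small standalone sketch. This is not the LLVM source. It mirrors only the lines visible in the hunk above, and the names allowsSmallStride, minPrefetchStrideSketch and HasMiscExt3 are hypothetical stand-ins (the last one for ST->hasMiscellaneousExtensions3()). It also assumes NumStridedMemAccesses <= NumMemAccesses, since the unsigned subtraction would wrap otherwise.

```cpp
// Illustrative sketch only; mirrors the condition shown in the hunk.
// All names here are hypothetical and not part of LLVM.

// True when the loop takes the small-stride path: more than 32 strided
// accesses, no call, and at most one non-strided access per 32 strided ones.
// Assumes NumStridedMemAccesses <= NumMemAccesses (unsigned subtraction).
constexpr bool allowsSmallStride(unsigned NumMemAccesses,
                                 unsigned NumStridedMemAccesses,
                                 bool HasCall) {
  return NumStridedMemAccesses > 32 && !HasCall &&
         (NumMemAccesses - NumStridedMemAccesses) * 32 <= NumStridedMemAccesses;
}

// Stand-in for the tail of getMinPrefetchStride() as shown in the hunk.
// HasMiscExt3 replaces the subtarget query used in the real code.
constexpr unsigned minPrefetchStrideSketch(unsigned NumMemAccesses,
                                           unsigned NumStridedMemAccesses,
                                           bool HasCall, bool HasMiscExt3) {
  return allowsSmallStride(NumMemAccesses, NumStridedMemAccesses, HasCall)
             ? 1u
             : (HasMiscExt3 ? 8192u : 2048u);
}

// Worked cases, checked at compile time.
static_assert(allowsSmallStride(33, 33, false),
              "all accesses strided: accepted, same as the old condition");
static_assert(allowsSmallStride(34, 33, false),
              "1 non-strided, 33 strided: 1 * 32 <= 33");
static_assert(!allowsSmallStride(35, 33, false),
              "2 non-strided, 33 strided: 2 * 32 > 33");
static_assert(allowsSmallStride(66, 64, false),
              "2 non-strided, 64 strided: 2 * 32 <= 64");
static_assert(!allowsSmallStride(32, 32, false),
              "still requires more than 32 strided accesses");
static_assert(!allowsSmallStride(34, 33, true),
              "a call in the loop still disables the small stride");
```

With the old equality test, the 34/33 case would have fallen through to the 8192/2048 stride. Under the new test it returns 1, which is the situation the commit message describes.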