review.tizen.org Git - platform/upstream/llvm.git/commit

author	David Sherwood <david.sherwood@arm.com>
	Wed, 18 Aug 2021 08:40:21 +0000 (09:40 +0100)
committer	David Sherwood <david.sherwood@arm.com>
	Wed, 18 Aug 2021 16:01:56 +0000 (17:01 +0100)
commit	219d4518fce9aafcb5eba9b92fb778837f0a4827
tree	9b5b13f75e9f160783327c45a351b2facfae459a
parent	13d8f000d7271226e5dfc6c0dc25b91cf6233349

[Analysis][AArch64] Make fixed-width ordered reductions slightly more expensive

For tight loops like this:

  float r = 0;
  for (int i = 0; i < n; i++) {
    r += a[i];
  }

it's better not to vectorise at -O3 using fixed-width ordered reductions
on AArch64 targets. Although the vectorised code ends up with a number of
instructions comparable to the scalar loop, there may be additional costs
on some CPUs; for example, the scheduling may be worse. It therefore makes
sense to deter vectorisation of such tight loops by making fixed-width
ordered reductions slightly more expensive in the cost model.
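
As a rough illustration of why an ordered reduction saves so little here, the sketch below models the three variants in plain C. It is not taken from the patch: the function names and the 4-lane width are assumptions made for the example, and the "vector" versions only imitate the order of operations a vectoriser would have to preserve or be allowed to change.

```c
#include <stddef.h>

/* Scalar loop from the commit message: strict left-to-right float adds. */
float sum_scalar(const float *a, size_t n) {
    float r = 0.0f;
    for (size_t i = 0; i < n; i++)
        r += a[i];
    return r;
}

/* Hypothetical model of a fixed-width, 4-lane ordered reduction.
 * The elements are fetched four at a time, but each lane is still added
 * to the single accumulator in source order, so the chain of dependent
 * additions is exactly as long as in the scalar loop. */
float sum_ordered_vf4(const float *a, size_t n) {
    float r = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        float lane[4] = {a[i], a[i + 1], a[i + 2], a[i + 3]};
        r += lane[0];
        r += lane[1];
        r += lane[2];
        r += lane[3];
    }
    for (; i < n; i++) /* scalar remainder */
        r += a[i];
    return r;
}

/* Hypothetical model of an unordered (reassociated) 4-lane reduction,
 * which is only legal when floating-point reassociation is permitted.
 * Four independent partial sums are combined at the end, so the result
 * may differ from the scalar loop. */
float sum_unordered_vf4(const float *a, size_t n) {
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (int l = 0; l < 4; l++)
            acc[l] += a[i + l];
    float r = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; i++) /* scalar remainder */
        r += a[i];
    return r;
}
```

For an input such as {1e8f, 1.0f, -1e8f, 1.0f}, the ordered variant matches the scalar loop bit for bit, while the unordered variant can produce a different value. That requirement to keep the original order is what leaves the ordered form with a serial chain of additions and little to gain over the scalar loop.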

Differential Revision: https://reviews.llvm.org/D108292

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/AArch64/reduce-fadd.ll
llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll