[TTI][AArch64][SLP] Sets the cost of an ADD reduction 2xi64 to 2.
author Vasileios Porpodas <vporpodas@google.com>
Fri, 29 Jul 2022 00:01:15 +0000 (17:01 -0700)
committer Vasileios Porpodas <vporpodas@google.com>
Mon, 1 Aug 2022 20:03:14 +0000 (13:03 -0700)
commit f6690303732e04f43715799154f86999b49c8cff
tree 5df210a3875e0270fdb807c3f4555ec35db91df8
parent 5fd03b00ee029b4cc958ae8e6c970a6123bd12f6

2xi64 is the legalized type for wide i64 reductions (like 16xi64), so setting
the cost of a 2xi64 ADD reduction to 2 makes the `load-reduce` and
`load-zext-reduce` patterns profitable to vectorize.
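
For illustration only, the two patterns named above might come from source code
roughly like the sketch below. The function names, the trip count of 16, and
the `uint8_t` source width are assumptions made for this example and are not
taken from the patch or its tests. Whether a particular compiler build
vectorizes these functions depends on the rest of the cost model.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical "load-reduce" shape: load 16 contiguous i64 values and
// add them all into a single scalar.
std::int64_t load_reduce(const std::int64_t *p) {
  std::int64_t sum = 0;
  for (std::size_t i = 0; i < 16; ++i)
    sum += p[i];
  return sum;
}

// Hypothetical "load-zext-reduce" shape: load 16 narrower unsigned values,
// zero-extend each one to 64 bits, then add them into a single scalar.
// The uint8_t element type is an assumption chosen for this sketch.
std::uint64_t load_zext_reduce(const std::uint8_t *p) {
  std::uint64_t sum = 0;
  for (std::size_t i = 0; i < 16; ++i)
    sum += static_cast<std::uint64_t>(p[i]);
  return sum;
}
```

Once such loops are fully unrolled, they become straight-line chains of loads
and adds, and a 16-element i64 sum no longer fits in one 128-bit vector
register, which is why the cost of the legalized 2xi64 piece matters.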

The few performance measurements that I did on an AArch64 machine confirm that
these patterns are actually faster when vectorized.

Differential Revision: https://reviews.llvm.org/D130740
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/AArch64/reduce-add.ll
llvm/test/Transforms/SLPVectorizer/AArch64/reduce-add-i64.ll