[ARM] Add larger than legal ICmp costs
author David Green <david.green@arm.com>
Thu, 18 Feb 2021 11:42:17 +0000
committer David Green <david.green@arm.com>
Thu, 18 Feb 2021 11:42:17 +0000
commit 1a6744e3dc671d92662c94a77bbae50a6a34d316
tree f03166389b5c7c1143d25b2230ce860291e619bb
parent 059cfe30939db19ed042c80c8cba349f8a4d3c7f

A v8i32 compare produces a v8i1 predicate, but during codegen the
v8i32 is split into two v4i32 halves, potentially requiring the two
resulting v4i1 predicates to be merged into a single v8i1. Because
merging two v4i1s into a v8i1 is very expensive, the cost of the
compare needs to be equally high.

This patch adds that cost to ARMTTIImpl::getCmpSelInstrCost. Because we
don't know whether the user of the predicate can also be split, and the
cost model is mostly per-instruction, the result may be pessimistic, but
only for larger than legal types. This also adds min/max detection to
the cost model where it can be detected, to keep those in line with the
cost of simple min/max instructions. Otherwise, for the most part,
costs that were already expensive have become more expensive.
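
As a rough illustration of the idea (not the code in this patch), the
sketch below counts how many legal 128-bit pieces a compare is split
into and, when there is more than one, adds a penalty for merging the
partial predicates. The constants and the names numLegalParts and
sketchICmpCost are hypothetical and chosen only for this example; the
real costs come from ARMTTIImpl::getCmpSelInstrCost.

```cpp
#include <cassert>

// Hypothetical constants, for illustration only; they are not the
// values used by the ARM cost model.
constexpr unsigned kLegalVectorBits = 128;  // MVE vector registers are 128 bits wide
constexpr unsigned kBaseCmpCost = 1;        // assumed cost of one legal-width compare
constexpr unsigned kPerLaneMergeCost = 2;   // assumed cost per predicate lane merged

// Number of legal-width pieces a <NumElts x iEltBits> vector is split into.
inline unsigned numLegalParts(unsigned NumElts, unsigned EltBits) {
  assert(NumElts > 0 && EltBits > 0 && "expected a non-empty vector type");
  unsigned TotalBits = NumElts * EltBits;
  return (TotalBits + kLegalVectorBits - 1) / kLegalVectorBits;  // round up
}

// Sketch of a compare cost: a legal type costs one compare; a larger
// than legal type costs one compare per piece plus a merge penalty that
// grows with the number of predicate lanes.
inline unsigned sketchICmpCost(unsigned NumElts, unsigned EltBits) {
  unsigned Parts = numLegalParts(NumElts, EltBits);
  if (Parts <= 1)
    return kBaseCmpCost;
  return Parts * kBaseCmpCost + NumElts * kPerLaneMergeCost;
}
```

Under these assumed constants, a v4i32 compare stays at the base cost,
while a v8i32 compare costs well over twice that, which is the
qualitative effect described above.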

Differential Revision: https://reviews.llvm.org/D96692
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
llvm/test/Analysis/CostModel/ARM/arith-overflow.ll
llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll
llvm/test/Analysis/CostModel/ARM/mve-cmp.ll
llvm/test/Analysis/CostModel/ARM/reduce-smax.ll
llvm/test/Analysis/CostModel/ARM/reduce-smin.ll
llvm/test/Analysis/CostModel/ARM/reduce-umax.ll
llvm/test/Analysis/CostModel/ARM/reduce-umin.ll
llvm/test/CodeGen/ARM/vselect_imax.ll
llvm/test/Transforms/LoopVectorize/ARM/mve-icmpcost.ll