[SLP][NFC] Pre-commit test showing vectorization preventing FMA

When we generate a horizontal reduction of floating-point adds fed by a
vectorized tree rooted at floating-point multiplies, we should account for
the cost of no longer being able to generate scalar FMAs. Similarly, if we
vectorize a list of floating-point multiplies that each feed a single
floating-point add, we should again account for this cost.
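
For illustration only, here is a rough C++ sketch of the two code shapes being
compared. It is not the test added by this commit; the function names and the
4-element width are hypothetical, and it assumes fast-math style flags that
permit both contraction of a multiply/add pair into an FMA and reassociation
of the adds into a reduction.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical illustration; names and width are assumptions.
// Scalar shape: each multiply feeds exactly one add, so every
// multiply/add pair can be contracted into one fused multiply-add.
double dot4_scalar_fma(const double *a, const double *b, double acc) {
  for (std::size_t i = 0; i < 4; ++i)
    acc = std::fma(a[i], b[i], acc); // one FMA per element
  return acc;
}

// Vectorized shape: the multiplies are grouped into one vector multiply
// and the adds become a horizontal reduction over its lanes, so no
// multiply is left next to an add that it could fuse with.
double dot4_mul_then_reduce(const double *a, const double *b, double acc) {
  double prod[4];
  for (std::size_t i = 0; i < 4; ++i)
    prod[i] = a[i] * b[i]; // stands in for a single 4-wide vector multiply
  // Stands in for the horizontal add reduction (reassociated order).
  double sum = (prod[0] + prod[1]) + (prod[2] + prod[3]);
  return acc + sum;
}
```

The first shape needs four FMAs and nothing else; the second needs a vector
multiply plus the reduction sequence, which is the cost the paragraph above
says should be accounted for.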
The first test was reduced from a case where the vectorizable tree looked
barely profitable (cost -1) with a horizontal reduction, but produced
substantially worse code than would result from allowing the FMAs to be
generated. The second test was derived from the first: we again generate a
horizontal reduction here, but even if the horizontal reduction is forced to
be unprofitable, we still try to vectorize the multiplies. I have follow-up
patches to address these issues.
Differential Revision: https://reviews.llvm.org/D124867