[LoopVectorize] Take vscale into account when deciding to create epilogues
In LoopVectorizationCostModel::isEpilogueVectorizationProfitable we
check to see if the chosen main vector loop VF >= 16. If so, we
decide to create a vector epilogue loop. However, this doesn't
take VScaleForTuning into account because we could be targeting a
CPU where vscale > 1, and hence the runtime VF would be a multiple
of the known minimum value.
This patch multiplies scalable VFs by VScaleForTuning and several
tests have been updated that now produce vector epilogues.
Differential Revision: https://reviews.llvm.org/D147522