author Roman Lebedev <lebedev.ri@gmail.com>
Mon, 15 Nov 2021 15:55:45 +0000 (18:55 +0300)
committer Roman Lebedev <lebedev.ri@gmail.com>
Mon, 15 Nov 2021 16:04:02 +0000 (19:04 +0300)
commit 5c7255fe3a8570a329d894c22421b54a5e5d5dc7
tree dd1e4aa3be837edbfecbd49970efe7661821dcc2
parent a468c39c90192aeff9b5dde9eb16a383d29b808b
[X86][Costmodel] `getReplicationShuffleCost()`: promote 8 bit-wide elements to 32 bit when no AVX512VBMI

Currently, `X86TTIImpl::getInterleavedMemoryOpCostAVX512()` queries this cost with an i8 element type,
so this change does affect vectorization. Eventually, it will query with i1 instead.

We should also try promoting to i16 when AVX512BW is available; I'll do that in a follow-up.
All costs here look good; I've added the missing truncation costs in preparatory patches.
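
For illustration only, here is a minimal standalone sketch of the rule described above. It is not the code in `X86TargetTransformInfo.cpp`: the `Features` struct and the helpers `promotedEltBits`, `replicationMask` and `promotedCost` are hypothetical names, and the assumption that a promoted query is costed as extend + wide shuffle + truncate is inferred from the mention of truncation costs, not taken from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical feature flags; NOT the real X86Subtarget interface.
struct Features {
  bool hasVBMI = false;
};

// Element width (in bits) at which the replication shuffle is costed.
// Models only the rule stated in the commit message:
// 8-bit elements without AVX512VBMI are promoted to 32 bits.
unsigned promotedEltBits(unsigned eltBits, const Features &f) {
  if (eltBits == 8 && !f.hasVBMI)
    return 32;
  return eltBits;
}

// Replication shuffle mask: each of the `vf` source lanes is repeated
// `factor` times in a row, e.g. factor=3, vf=2 -> {0,0,0,1,1,1}.
std::vector<int> replicationMask(unsigned factor, unsigned vf) {
  std::vector<int> mask;
  mask.reserve(factor * vf);
  for (unsigned lane = 0; lane < vf; ++lane)
    for (unsigned r = 0; r < factor; ++r)
      mask.push_back(static_cast<int>(lane));
  return mask;
}

// Assumed cost composition for the promoted path: extend the source,
// shuffle at the wider element type, then truncate the result.
unsigned promotedCost(unsigned extCost, unsigned wideShuffleCost,
                      unsigned truncCost) {
  return extCost + wideShuffleCost + truncCost;
}
```

With these helpers, an i8 query on a target without AVX512VBMI would be costed at 32 bits, while i16 and wider queries keep their original width in this sketch.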

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113853
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/interleaved-store-accesses-with-gaps.ll
llvm/test/Analysis/CostModel/X86/shuffle-replication-i8.ll