[X86][LV] X86 does *not* prefer vectorized addressing
authorRoman Lebedev <lebedev.ri@gmail.com>
Sat, 16 Oct 2021 09:25:08 +0000 (12:25 +0300)
committerRoman Lebedev <lebedev.ri@gmail.com>
Sat, 16 Oct 2021 09:32:18 +0000 (12:32 +0300)
commitd137f1288e2c2169b53a1baef0d5cd94a4bb3999
tree625dbf4540c39a85df2f83d827afbfaf9ed736fb
parent9bf6bef9951a1c230796ccad2c5c0195ce4c4dff
[X86][LV] X86 does *not* prefer vectorized addressing

And another attempt to start untangling this ball of threads around gather.
There's `TTI::prefersVectorizedAddressing()`hoop, which confusingly defaults to `true`,
which tells LV to try to vectorize the addresses that lead to loads,
but X86 generally can not deal with vectors of addresses,
the only instructions that support that are GATHER/SCATTER,
but even those aren't available until AVX2, and aren't really usable until AVX512.

This specializes the hook for X86, to return true only if we have AVX512 or AVX2 w/ fast gather.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111546
35 files changed:
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/lib/Target/X86/X86TargetTransformInfo.h
llvm/test/Analysis/CostModel/X86/gather-i16-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/gather-i32-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/gather-i64-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/gather-i8-with-i8-index.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-f32-stride-3.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-f32-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-f64-stride-2.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-f64-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i16-stride-5.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i16-stride-6.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-2-indices-0u.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-3-indices-01u.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-3-indices-0uu.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-3.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-4-indices-012u.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-4-indices-01uu.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-4-indices-0uuu.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i32-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i64-stride-2.ll
llvm/test/Analysis/CostModel/X86/interleaved-load-i64-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-f32-stride-3.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-f32-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-f64-stride-2.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-f64-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-i16-stride-5.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-i16-stride-6.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-i32-stride-3.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-i32-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-i64-stride-2.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-i64-stride-4.ll
llvm/test/Transforms/LoopVectorize/X86/cost-model.ll
llvm/test/Transforms/LoopVectorize/X86/parallel-loops.ll
llvm/test/Transforms/LoopVectorize/X86/uniform_mem_op.ll