[X86] Prefer blendi over movss/sd when avx512 is enabled unless optimizing for size.
author Craig Topper <craig.topper@intel.com>
Sat, 14 Jul 2018 02:05:08 +0000 (02:05 +0000)
committer Craig Topper <craig.topper@intel.com>
Sat, 14 Jul 2018 02:05:08 +0000 (02:05 +0000)
commit f0b164415c383be9d20031d52cf5db9ae71cded8
tree 26beee3fa7ce875eab5f2aa1d6b23cab85871f70
parent 70993d37e8dd2031b623071372d7aebefb18c74c
[X86] Prefer blendi over movss/sd when avx512 is enabled unless optimizing for size.

AVX512 doesn't have an immediate-controlled blend instruction, but blend throughput is still better than that of movss/sd on SKX.

This commit changes AVX512 to use the AVX blend instructions instead of MOVSS/MOVSD. This constrains register allocation, since the blends can't use XMM16-31, but hopefully the increased throughput and reduced port 5 pressure make up for that.
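
As a rough illustration (the function name below is hypothetical, not taken from the changed tests), the kind of pattern affected is a shuffle that inserts the low element of one vector into another. Such a shuffle previously selected vmovss on AVX-512 targets; with this change the VEX-encoded vblendps is preferred instead, unless optimizing for size:

    define <4 x float> @insert_low_elt(<4 x float> %a, <4 x float> %b) {
      ; Take lane 0 from %b and lanes 1-3 from %a: a movss/blendps-style pattern.
      %v = shufflevector <4 x float> %a, <4 x float> %b, <4 x i32> <i32 4, i32 1, i32 2, i32 3>
      ret <4 x float> %v
    }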

llvm-svn: 337083
22 files changed:
llvm/lib/Target/X86/X86InstrAVX512.td
llvm/lib/Target/X86/X86InstrSSE.td
llvm/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll
llvm/test/CodeGen/X86/avx512-insert-extract.ll
llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
llvm/test/CodeGen/X86/avx512-shuffles/partial_permute.ll
llvm/test/CodeGen/X86/coalesce_commute_movsd.ll
llvm/test/CodeGen/X86/fmsubadd-combine.ll
llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll
llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll
llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
llvm/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll
llvm/test/CodeGen/X86/sse2.ll
llvm/test/CodeGen/X86/sse41-intrinsics-fast-isel.ll
llvm/test/CodeGen/X86/sse41-intrinsics-x86-upgrade.ll
llvm/test/CodeGen/X86/sse41.ll
llvm/test/CodeGen/X86/vec_ss_load_fold.ll
llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll
llvm/test/CodeGen/X86/vector-shuffle-128-v4.ll
llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll
llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll
llvm/test/CodeGen/X86/vector-shuffle-combining-ssse3.ll