[X86] Make v32i16/v64i8 legal types without avx512bw. Use custom splitting instead.
author: Craig Topper <craig.topper@intel.com>
Wed, 15 Apr 2020 19:03:39 +0000 (12:03 -0700)
committer: Craig Topper <craig.topper@intel.com>
Wed, 15 Apr 2020 19:17:18 +0000 (12:17 -0700)
commit: 8dfb9627b7be27e7b37ab4200c60f65f5af95256
tree: 16bb0ea7158762a0cd0c229b0b8df4380cbaec83
parent: 3fbc9c7b51e427a549109f092d3a822b70e1e679

This moves v32i16/v64i8 to a model consistent with how we
treat integer types with avx1.
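
As a rough illustration of what "custom splitting" means here (this is not code from the patch, and it does not use any LLVM API), the idea can be sketched in plain C++: the 512-bit value is kept as a single legal type, but an operation that is only available at 256 bits is applied to each half and the two results are concatenated. The type aliases and the helper names `extractHalf`, `concatHalves`, `add256`, and `add512BySplitting` are hypothetical and exist only for this sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the vector types; these are NOT LLVM types.
using V16i16 = std::array<std::uint16_t, 16>; // models a 256-bit vector
using V32i16 = std::array<std::uint16_t, 32>; // models a 512-bit vector

// Stand-in for an operation that exists natively at 256 bits.
V16i16 add256(const V16i16 &a, const V16i16 &b) {
  V16i16 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint16_t>(a[i] + b[i]); // wraps modulo 2^16
  return r;
}

// Extract the low (high == false) or high (high == true) 256-bit half.
V16i16 extractHalf(const V32i16 &v, bool high) {
  V16i16 r{};
  const std::size_t base = high ? r.size() : 0;
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = v[base + i];
  return r;
}

// Concatenate two 256-bit halves back into one 512-bit value.
V32i16 concatHalves(const V16i16 &lo, const V16i16 &hi) {
  V32i16 r{};
  for (std::size_t i = 0; i < lo.size(); ++i) {
    r[i] = lo[i];
    r[lo.size() + i] = hi[i];
  }
  return r;
}

// The 512-bit operation expressed as split, operate on halves, concatenate.
V32i16 add512BySplitting(const V32i16 &a, const V32i16 &b) {
  const V16i16 lo = add256(extractHalf(a, false), extractHalf(b, false));
  const V16i16 hi = add256(extractHalf(a, true), extractHalf(b, true));
  return concatHalves(lo, hi);
}
```

The sketch only models the data flow; in the backend the split and concatenation are expressed on DAG nodes rather than on arrays.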

This does change the ABI for vXi16/vXi8 vectors larger than
512 bits: they are now passed in multiple zmms instead of multiple
ymms. We'd already hacked some code to make v64i8/v32i16 pass in zmm.

The cost model is still a bit of a mess. In some places I tried to
match existing behavior, but really we need to account for
splitting and concatenation costs. The cost model for shuffles is
especially pessimistic.

Differential Revision: https://reviews.llvm.org/D76212
82 files changed:
llvm/docs/ReleaseNotes.rst
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/arith-fix.ll
llvm/test/Analysis/CostModel/X86/arith-overflow.ll
llvm/test/Analysis/CostModel/X86/arith.ll
llvm/test/Analysis/CostModel/X86/fshl.ll
llvm/test/Analysis/CostModel/X86/fshr.ll
llvm/test/Analysis/CostModel/X86/icmp.ll
llvm/test/Analysis/CostModel/X86/masked-intrinsic-cost.ll
llvm/test/Analysis/CostModel/X86/reduce-add.ll
llvm/test/Analysis/CostModel/X86/reduce-and.ll
llvm/test/Analysis/CostModel/X86/reduce-mul.ll
llvm/test/Analysis/CostModel/X86/reduce-or.ll
llvm/test/Analysis/CostModel/X86/reduce-smax.ll
llvm/test/Analysis/CostModel/X86/reduce-smin.ll
llvm/test/Analysis/CostModel/X86/reduce-umax.ll
llvm/test/Analysis/CostModel/X86/reduce-umin.ll
llvm/test/Analysis/CostModel/X86/reduce-xor.ll
llvm/test/Analysis/CostModel/X86/rem.ll
llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll
llvm/test/Analysis/CostModel/X86/shuffle-reverse.ll
llvm/test/Analysis/CostModel/X86/shuffle-two-src.ll
llvm/test/Analysis/CostModel/X86/trunc.ll
llvm/test/Analysis/CostModel/X86/vector-extract.ll
llvm/test/Analysis/CostModel/X86/vector-insert.ll
llvm/test/CodeGen/X86/avg-mask.ll
llvm/test/CodeGen/X86/avg.ll
llvm/test/CodeGen/X86/avx512-calling-conv.ll
llvm/test/CodeGen/X86/avx512-ext.ll
llvm/test/CodeGen/X86/avx512-insert-extract.ll
llvm/test/CodeGen/X86/avx512-logic.ll
llvm/test/CodeGen/X86/avx512-mask-op.ll
llvm/test/CodeGen/X86/avx512-select.ll
llvm/test/CodeGen/X86/avx512-trunc.ll
llvm/test/CodeGen/X86/avx512-vbroadcasti128.ll
llvm/test/CodeGen/X86/avx512-vbroadcasti256.ll
llvm/test/CodeGen/X86/avx512-vec-cmp.ll
llvm/test/CodeGen/X86/avx512-vselect.ll
llvm/test/CodeGen/X86/avx512vl-vec-masked-cmp.ll
llvm/test/CodeGen/X86/bitcast-and-setcc-512.ll
llvm/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll
llvm/test/CodeGen/X86/bitcast-setcc-512.ll
llvm/test/CodeGen/X86/fast-isel-nontemporal.ll
llvm/test/CodeGen/X86/kshift.ll
llvm/test/CodeGen/X86/madd.ll
llvm/test/CodeGen/X86/masked_store_trunc.ll
llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
llvm/test/CodeGen/X86/merge-consecutive-loads-512.ll
llvm/test/CodeGen/X86/midpoint-int-vec-512.ll
llvm/test/CodeGen/X86/movmsk-cmp.ll
llvm/test/CodeGen/X86/nontemporal-loads-2.ll
llvm/test/CodeGen/X86/nontemporal-loads.ll
llvm/test/CodeGen/X86/pmaddubsw.ll
llvm/test/CodeGen/X86/pmul.ll
llvm/test/CodeGen/X86/pmulh.ll
llvm/test/CodeGen/X86/pr45443.ll
llvm/test/CodeGen/X86/var-permute-512.ll
llvm/test/CodeGen/X86/vector-compare-results.ll
llvm/test/CodeGen/X86/vector-fshl-512.ll
llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
llvm/test/CodeGen/X86/vector-fshr-512.ll
llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
llvm/test/CodeGen/X86/vector-idiv-sdiv-512.ll
llvm/test/CodeGen/X86/vector-idiv-udiv-512.ll
llvm/test/CodeGen/X86/vector-popcnt-512.ll
llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
llvm/test/CodeGen/X86/vector-reduce-mul.ll
llvm/test/CodeGen/X86/vector-reduce-or-bool.ll
llvm/test/CodeGen/X86/vector-reduce-xor-bool.ll
llvm/test/CodeGen/X86/vector-rotate-512.ll
llvm/test/CodeGen/X86/vector-sext.ll
llvm/test/CodeGen/X86/vector-shift-ashr-512.ll
llvm/test/CodeGen/X86/vector-shift-lshr-512.ll
llvm/test/CodeGen/X86/vector-shift-shl-512.ll
llvm/test/CodeGen/X86/vector-shuffle-512-v32.ll
llvm/test/CodeGen/X86/vector-shuffle-512-v64.ll
llvm/test/CodeGen/X86/vector-shuffle-v1.ll
llvm/test/CodeGen/X86/vector-tzcnt-512.ll
llvm/test/CodeGen/X86/vector-zext.ll
llvm/test/CodeGen/X86/viabs.ll