[X86] Fix the cost model for v16i16->v16i32 zero_extend/sign_extend with AVX2
author Craig Topper <craig.topper@intel.com>
Wed, 29 Jan 2020 19:40:52 +0000 (11:40 -0800)
committer Craig Topper <craig.topper@intel.com>
Wed, 29 Jan 2020 23:52:10 +0000 (15:52 -0800)
commit 35625464c6ddef557c2369946681be5cfb42d5c1
tree ca866d9fcc159e21930d953dae4007b9fe53647a
parent 228ea1a46cc82aed60b1b3c8263bed60c4d48f05

With AVX2 we seem to be inheriting the cost from SSE4.1. But if we have 256-bit registers, we should be able to do this with just one extract to split the v16i16 into two v8i16 halves, plus two v8i16->v8i32 extend operations, so the cost should be 3, not 4.
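
The arithmetic above can be sketched as a tiered table lookup. This is only an illustration: `Ext`, `CostEntry`, `SSE41Tbl`, `AVX2Tbl`, `lookup`, and `extendCost` are hypothetical stand-ins, not the actual contents of X86TargetTransformInfo.cpp. The unit costs for the extract and for each v8i16->v8i32 extend are assumptions inferred from the "1 extract + 2 extends = 3" reasoning.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-ins for illustration; these are not LLVM's real types.
enum class Ext { Zero, Sign };

struct CostEntry {
  Ext op;
  std::string_view dst;
  std::string_view src;
  int cost;
};

// SSE4.1 tier: 4 is the cost the commit message says was being inherited.
constexpr std::array<CostEntry, 2> SSE41Tbl{{
    {Ext::Zero, "v16i32", "v16i16", 4},
    {Ext::Sign, "v16i32", "v16i16", 4},
}};

// Assumed unit costs: one 128-bit extract and one v8i16->v8i32 extend.
constexpr int kExtractCost = 1;
constexpr int kV8ExtendCost = 1;

// AVX2 tier: one extract to split the v16i16, then two v8i16->v8i32 extends.
constexpr std::array<CostEntry, 2> AVX2Tbl{{
    {Ext::Zero, "v16i32", "v16i16", kExtractCost + 2 * kV8ExtendCost},
    {Ext::Sign, "v16i32", "v16i16", kExtractCost + 2 * kV8ExtendCost},
}};

// Linear search of one tier; returns no value if the tier has no entry.
template <std::size_t N>
std::optional<int> lookup(const std::array<CostEntry, N> &tbl, Ext op,
                          std::string_view dst, std::string_view src) {
  for (const auto &e : tbl)
    if (e.op == op && e.dst == dst && e.src == src)
      return e.cost;
  return std::nullopt;
}

// Consult the most capable tier first and fall back to SSE4.1. Without an
// AVX2 entry, an AVX2 target would silently pick up the SSE4.1 cost of 4.
int extendCost(bool hasAVX2, Ext op) {
  if (hasAVX2)
    if (auto c = lookup(AVX2Tbl, op, "v16i32", "v16i16"))
      return *c;
  auto c = lookup(SSE41Tbl, op, "v16i32", "v16i16");
  assert(c && "SSE4.1 tier is expected to cover this conversion");
  return *c;
}
```

Under these assumptions, `extendCost(true, Ext::Zero)` yields 3 while `extendCost(false, Ext::Zero)` yields 4, which mirrors the before/after costs described above.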

Differential Revision: https://reviews.llvm.org/D73646
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/arith-fix.ll
llvm/test/Analysis/CostModel/X86/arith-overflow.ll
llvm/test/Analysis/CostModel/X86/cast.ll
llvm/test/Analysis/CostModel/X86/extend.ll
llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll