[X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs...
author Craig Topper <craig.topper@intel.com>
Thu, 10 Oct 2019 19:40:44 +0000 (19:40 +0000)
committer Craig Topper <craig.topper@intel.com>
Thu, 10 Oct 2019 19:40:44 +0000 (19:40 +0000)
commit 0e561437c5873a0406fab6dd7e1ba8247847bb92
tree f62dd063da09d3b7d17441f59cfd924c32ef18ac
parent ff5640caea906c61f9ecc48e14b37eacdde3c521
[X86] Use packusdw+vpmovuswb to implement v16i32->v16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX.

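If we've disabled zmm registers, the v16i32 will need to be split. This split will propagate through the min/max and the truncate. This creates two sequences that need to be concatenated back to v16i8. We can instead use packusdw to do part of the clamping, truncating, and concatenating all at once: it saturates each signed 32-bit element to an unsigned 16-bit value while packing both halves into a single v16i16. Then we can use a vpmovuswb to finish off the clamp by saturating each 16-bit element to the 0-255 range.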
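To illustrate why the two saturating steps compose into the full clamp, here is a minimal scalar sketch in C++. It is not the code from X86ISelLowering.cpp; the function names (packusdw_elem, vpmovuswb_elem, clamp_then_trunc, two_step, self_check) are hypothetical and only model the per-element arithmetic. It deliberately ignores the per-128-bit-lane element ordering of a 256-bit vpackusdw, which the real lowering has to account for.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of one PACKUSDW element:
// signed 32-bit input, unsigned-saturated to 16 bits ([0, 65535]).
inline std::uint16_t packusdw_elem(std::int32_t x) {
    if (x < 0) return 0;
    if (x > 65535) return 65535;
    return static_cast<std::uint16_t>(x);
}

// Hypothetical scalar model of one VPMOVUSWB element:
// unsigned 16-bit input, unsigned-saturated to 8 bits ([0, 255]).
inline std::uint8_t vpmovuswb_elem(std::uint16_t x) {
    return x > 255 ? std::uint8_t{255} : static_cast<std::uint8_t>(x);
}

// Reference pattern: clamp the signed value to [0, 255], then truncate.
inline std::uint8_t clamp_then_trunc(std::int32_t x) {
    if (x < 0) x = 0;
    if (x > 255) x = 255;
    return static_cast<std::uint8_t>(x);
}

// Element-wise model of the two-instruction sequence on sixteen inputs.
// Assumption: lane interleaving is ignored; only the arithmetic is modeled.
inline std::array<std::uint8_t, 16> two_step(const std::array<std::int32_t, 16>& in) {
    std::array<std::uint8_t, 16> out{};
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = vpmovuswb_elem(packusdw_elem(in[i]));
    return out;
}

// Checks that both formulations agree on values around every saturation boundary.
inline void self_check() {
    const std::array<std::int32_t, 16> in = {
        INT32_MIN, -70000, -256, -1, 0, 1, 127, 128,
        254, 255, 256, 65535, 65536, 70000, 1 << 24, INT32_MAX};
    const std::array<std::uint8_t, 16> out = two_step(in);
    for (std::size_t i = 0; i < in.size(); ++i)
        assert(out[i] == clamp_then_trunc(in[i]));
}
```

Calling self_check() from a test driver exercises inputs on both sides of 0, 255, and 65535. The first step already handles every negative input, so the second step only has to deal with values in the range 256 to 65535.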

Differential Revision: https://reviews.llvm.org/D68763

llvm-svn: 374431
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/min-legal-vector-width.ll