author Craig Topper <craig.topper@intel.com>
Thu, 10 May 2018 05:43:43 +0000 (05:43 +0000)
committer Craig Topper <craig.topper@intel.com>
Thu, 10 May 2018 05:43:43 +0000 (05:43 +0000)
commit 74ac0eda685e2a2e286b02cb679b68fd57c636b2
tree 5efd2c185c0eca69871b8fc6bcab63a58495780a
parent ae56a957afd5e4e51a7e9374ef91a8e143b6e6c0
[X86] Change the implementation of scalar masked load/store intrinsics to not use a 512-bit intermediate vector.

The 512-bit intermediate is unnecessary on CPUs that support AVX512VL, such as SKX. We can just emit a 128-bit masked load/store here no matter what. The backend will widen it to 512 bits on KNL CPUs.
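
For illustration only, the semantics being targeted can be modeled in portable C. This is a sketch of the documented behavior of a scalar masked load such as _mm_mask_load_ss, expressed as a 128-bit masked load that uses only mask bit 0; it is not the code from this commit. The names vec4f, masked_load_128, mask_load_ss_model and mask_load_ss_example are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector of four floats. */
typedef struct { float lane[4]; } vec4f;

/* Model of a 128-bit masked load: lane i is read from memory only when
 * mask bit i is set; otherwise it keeps the pass-through value. */
static vec4f masked_load_128(const float *mem, uint8_t mask, vec4f passthru) {
    vec4f r = passthru;
    for (int i = 0; i < 4; ++i) {
        if (mask & (1u << i)) {
            r.lane[i] = mem[i];
        }
    }
    return r;
}

/* Model of the scalar masked load: lane 0 comes from memory when bit 0 of k
 * is set, else from src; the upper three lanes are zeroed. Only bit 0 of the
 * mask is honored, so no 512-bit intermediate vector is needed. */
static vec4f mask_load_ss_model(vec4f src, uint8_t k, const float *mem) {
    vec4f passthru = { { src.lane[0], 0.0f, 0.0f, 0.0f } };
    return masked_load_128(mem, (uint8_t)(k & 1u), passthru);
}

/* Small usage example for the model above. */
void mask_load_ss_example(void) {
    const float mem[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    vec4f src = { { 10.0f, 20.0f, 30.0f, 40.0f } };

    vec4f loaded = mask_load_ss_model(src, 1, mem);  /* bit 0 set */
    assert(loaded.lane[0] == 1.0f && loaded.lane[1] == 0.0f);

    vec4f kept = mask_load_ss_model(src, 0, mem);    /* bit 0 clear */
    assert(kept.lane[0] == 10.0f && kept.lane[1] == 0.0f);
}
```

In this model the higher mask bits are discarded before the 128-bit load, so memory beyond the first element is never read.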

Fixes the frontend portion of PR37386. The backend still needs to be fixed to optimize the new sequences well.

llvm-svn: 331958
clang/include/clang/Basic/BuiltinsX86.def
clang/lib/CodeGen/CGBuiltin.cpp
clang/lib/Headers/avx512fintrin.h
clang/test/CodeGen/avx512f-builtins.c