[x86] improve codegen for non-splat bit-masked vector compare and select (PR46531)
vselect ((X & Pow2C) == 0), LHS, RHS --> vselect ((shl X, C') < 0), RHS, LHS
Follow-up to D83073 - the non-splat mask cases where we actually see an
improvement are quite limited from what I can tell. AVX1 needs multiply
and blend capabilities and AVX2 needs vector shift and blend capabilities.
The intersection of those 2 constraints is only vectors with 32-bit or
64-bit elements.
XOP is/was better.
Differential Revision: https://reviews.llvm.org/D83181