[llvm] Widen vector equality (OP_PCMPEQx) (mono/mono#14292)
* [llvm] Widen vector equality (OP_PCMPEQx)
Fixes https://github.com/mono/mono/issues/14261
Fixes https://github.com/mono/mono/issues/14143
This was causing an issue when trying to inline
System.Numerics.Vectors.Vector`1<byte>:Equals into System.Span.IndexOf
from System.Memory.dll as provided by NuGet. I believe the code there
compares the result of a SIMD operation on two SIMD registers against a
SIMD register full of zero bytes.
When we emit an actual call to Equals (no inlining), the caller spills
the SIMD <4 x i32> vector to the stack, and it is later loaded
with the widened type used here, <16 x i8>.
When we inline, the args become XMOV opcodes, and the "pcmpeqb"
mini opcode operates on the results of those operations. Each XMOV
becomes a register-to-register no-op copy, so the original <4 x i32>
value gets compared directly against the <16 x i8> type:
```
%116 = icmp eq <16 x i8> zeroinitializer, <4 x i32> %115
```
The operand types do not match, so the comparison does not behave as
intended. IndexOf reliably breaks: it only finds the desired byte in the
portion not processed by the SIMD code (the last few elements of the array).
This fix checks whether the operand vector types differ and, if they do,
bitcasts the mismatched operand to the expected type.
This produces the following IR:
```
%114 = bitcast <4 x i32> %99 to <16 x i8>
%115 = icmp eq <16 x i8> %114, %113
%116 = sext <16 x i1> %115 to <16 x i8>
%117 = icmp eq <16 x i8> zeroinitializer, %116
```
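For readers less familiar with the IR, here is a minimal C sketch of what
the first three instructions compute, under stated assumptions. It is a
scalar model only, not the code added to the LLVM backend, and the helper
names bitcast_4xi32_to_16xi8 and pcmpeqb_model are hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret four 32-bit lanes as sixteen 8-bit
 * lanes. Like `bitcast <4 x i32> ... to <16 x i8>`, no bits change. */
static void bitcast_4xi32_to_16xi8(const uint32_t in[4], uint8_t out[16])
{
    memcpy(out, in, 16);
}

/* Hypothetical scalar model of pcmpeqb: `icmp eq` per byte lane followed
 * by `sext <16 x i1> to <16 x i8>`, so equal lanes become 0xFF and
 * unequal lanes become 0x00. */
static void pcmpeqb_model(const uint8_t a[16], const uint8_t b[16],
                          uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (a[i] == b[i]) ? 0xFF : 0x00;
}
```

The point of the sketch is that the comparison is only meaningful once both
operands have the same lane layout, which is what the added bitcast provides.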
* [llvm] Widen vector greater-than
This applies the same trick as the fix above to the parallel greater-than operation.
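A minimal C sketch of the greater-than case, again as a scalar model rather
than the backend change itself. It assumes byte lanes and a signed compare
(as x86 PCMPGTB does); the name pcmpgtb_model is hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model: the left operand arrives as <4 x i32> and is
 * reinterpreted as sixteen signed bytes before the lane-wise compare.
 * Lanes where a > b become 0xFF, all others 0x00. */
static void pcmpgtb_model(const uint32_t a_i32[4], const int8_t b[16],
                          uint8_t out[16])
{
    int8_t a[16];
    memcpy(a, a_i32, sizeof a);   /* models the bitcast to <16 x i8> */
    for (int i = 0; i < 16; i++)
        out[i] = (a[i] > b[i]) ? 0xFF : 0x00;
}
```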
Commit migrated from https://github.com/mono/mono/commit/b15631d641e1c296b58ae744226c9076ca81b0ab