[AArch64][CostModel] Detects that {extract,insert}-element at lane 0 has the same cost as the other lane for vector instructions in the IR.
Currently, {extract,insert}-element has zero cost at lane 0 [1]. However, there is a cost (by fmov instruction [2], or ext/ins instruction) to move values from SIMD registers to GPR registers, when the element is used explicitly as integers.
See https://godbolt.org/z/faPE1nTn8, when fmov is generated for d* register -> x* register conversion.
Implementation-wise, add a private method `AArch64TTIImpl::getVectorInstrCostHelper` as a helper function. This way, instruction-based method could share the core logic (e.g.,
returning zero cost if type is legalized to scalar).
[1] https://github.com/llvm/llvm-project/blob/
2cf320d41ed708679e01eeeb93f58d6c5c88ba7a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp#L1853
[2] https://github.com/llvm/llvm-project/blob/
2cf320d41ed708679e01eeeb93f58d6c5c88ba7a/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L8150-L8157
Differential Revision: https://reviews.llvm.org/D128302