[ARM] Fix Crashes in fp16/bf16 Inline Asm
We were still seeing occasional crashes with inline assembly blocks
using fp16/bf16 after my previous patches:
- https://reviews.llvm.org/rGff4027d152d0
- https://reviews.llvm.org/rG7d15212b8c0c
- https://reviews.llvm.org/rG20b2d11896d9
It turns out:
- The original two commits were wrong, and we should have always been
choosing the SPR register class, not the HPR register class, so that
LLVM's SelectionDAGBuilder correctly did the right splits/joins.
- The `splitValueIntoRegisterParts`/`joinRegisterPartsIntoValue` changes
from rG20b2d11896d9 are still correct, even though they sometimes
result in inefficient codegen of casts between fp16/bf16 and i32/f32
(which is visible in these tests).
This patch fixes crashes in `getCopyToParts` and when trying to select
`(bf16 (bitconvert (fp16 ...)))` dags when Neon is enabled.
This patch also adds support for passing fp16/bf16 values using the 'x'
constraint that is LLVM-specific. This should broadly match how we pass
with 't' and 'w', but with a different set of valid S registers.
Differential Revision: https://reviews.llvm.org/D147715