[RISCV] Fold low 12 bits into instruction during frame index elimination
Fold the low 12 bits of an immediate offset into the offset field of the using instruction. That using instruction will be a load, store, or addi which performs an add of a signed 12-bit immediate as part of it's operation. Splitting out the low bits allows the high bits to be generated via a single LUI instead of needing an LUI/ADDI pair.
The codegen effect of this is mostly converting cases where "split addi" kicks in to using LUI + a folded offset. There are a couple of straight dynamic instruction count wins, and using a canonical LUI is probably better than a chain of SP adds if the dynamic instruction count is equal.
Differential Revision: https://reviews.llvm.org/D139037