ARM64: Enable Long Address
Fixes https://github.com/dotnet/coreclr/issues/3668
Currently ARM64 codegen can only reference targets within +/-1 MB due to the encoding
restrictions of the `b<cond>/adr/ldr` instructions. This is normally fine
as long as each function is reasonably small, but it does not work for large methods, which
can also result from aggressive inlining, probably in crossgen/corert scenarios.
In addition, long address is a prerequisite for hot/cold code separation,
since a reference can cross regions that are placed arbitrarily.
In fact, that also needs additional relocations, which are not in this change yet.
In detail, this change uses the long address form for conditional jump/address loading/constant
loading operations by default; they can be shortened later by
`emitJumpDistBind()` if they fit into the smaller encoding. Logically
these operations can now reach within a +/-4GB address range.
Note I haven't extended unconditional jumps in this change for simplicity,
so they can still reach only within +/-128MB, same as before.
`emitOutputLJ` is extended to do the final encoding of these operations.
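The ranges quoted above follow from the immediate field widths of the instructions involved. A minimal sketch of the resulting short/long decision is below; the names (`Form`, `fitsScaledImm`, `chooseCondJumpForm`, and so on) are hypothetical and are not the JIT's actual helpers, and the real check in `emitJumpDistBind()` may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not the JIT's actual types.
enum class Form { Short, Long };

// True if byteDelta is a multiple of `scale` and fits a signed immediate
// of `bits` bits once divided by `scale`.
constexpr bool fitsScaledImm(int64_t byteDelta, unsigned bits, int64_t scale)
{
    return byteDelta % scale == 0 &&
           byteDelta >= -((int64_t{1} << (bits - 1)) * scale) &&
           byteDelta <   ((int64_t{1} << (bits - 1)) * scale);
}

// b<cond> and ldr (literal): imm19 scaled by 4 bytes -> +/-1MB.
constexpr bool fitsCondBranch(int64_t d)   { return fitsScaledImm(d, 19, 4); }
// adr: imm21 in bytes -> +/-1MB.
constexpr bool fitsAdr(int64_t d)          { return fitsScaledImm(d, 21, 1); }
// b: imm26 scaled by 4 bytes -> +/-128MB.
constexpr bool fitsUncondBranch(int64_t d) { return fitsScaledImm(d, 26, 4); }
// adrp: imm21 scaled by the 4KB page size -> +/-4GB (delta between page bases).
constexpr bool fitsAdrp(int64_t pageAlignedDelta)
{
    return fitsScaledImm(pageAlignedDelta, 21, 4096);
}

// Start from the long form and shorten only when the short encoding reaches.
constexpr Form chooseCondJumpForm(int64_t byteDelta)
{
    return fitsCondBranch(byteDelta) ? Form::Short : Form::Long;
}

inline void selfCheck()
{
    assert(chooseCondJumpForm(1024) == Form::Short);
    assert(chooseCondJumpForm(int64_t{1} << 20) == Form::Long); // exactly 1MB is out of range
}
```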
Three pseudo instructions are introduced. Each can be expanded to either a
short or a long form.
1. Conditional jump. See `emitIns_J()`
a. Short form(`IF_BI_0B`): `b<cond> rel_addr`
b. Long form(`IF_LARGEJMP`):
```
b<rev cond> $LABEL
b rel_addr (unconditional jump)
$LABEL:
```
2. Load label(address computation). See `emitIns_R_L()`
a. Short form(`IF_DI_1E`): `adr x, [rel_addr]`
b. Long form(`IF_LARGEADR`):
```
adrp x, [rel_page_addr]
add x, x, page_offs
```
3. Load constant (from JIT data). See `emitIns_R_C()`
a. Short form(`IF_LS_1A`): `ldr x, [rel_addr]`
b. Long form(`IF_LARGELDC`):
```
adrp x, [rel_page_addr]
ldr x, [x, page_offs]
(fmov v, x in case loading vector constant)
```
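To make the conditional jump expansion in item 1 concrete, here is a rough sketch of how the two forms could be encoded as raw A64 instruction words. It is not the code in `emitOutputLJ`; `encodeBCond`, `encodeB`, `reverseCond`, and `emitCondJump` are hypothetical helpers, and the sketch assumes 4-byte-aligned addresses and a condition other than AL/NV.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A64 condition codes (subset). Each code and its inverse differ in bit 0.
enum Cond : uint32_t { EQ = 0x0, NE = 0x1, GE = 0xA, LT = 0xB };

// Hypothetical helper: invert a condition by flipping its low bit
// (not meaningful for AL/NV).
constexpr uint32_t reverseCond(uint32_t cond) { return cond ^ 1u; }

// b.<cond>: 0x54000000 | imm19 << 5 | cond, where imm19 = byteDelta / 4.
// The right shift of a negative delta is assumed to be arithmetic.
constexpr uint32_t encodeBCond(uint32_t cond, int64_t byteDelta)
{
    return 0x54000000u |
           ((static_cast<uint32_t>(byteDelta >> 2) & 0x7FFFFu) << 5) |
           (cond & 0xFu);
}

// b: 0x14000000 | imm26, where imm26 = byteDelta / 4.
constexpr uint32_t encodeB(int64_t byteDelta)
{
    return 0x14000000u | (static_cast<uint32_t>(byteDelta >> 2) & 0x3FFFFFFu);
}

// Emit the short form when the target is within +/-1MB, otherwise the long form:
//   b.<rev cond> +8   ; skip the next instruction
//   b target          ; unconditional, +/-128MB
std::vector<uint32_t> emitCondJump(uint32_t cond, int64_t pc, int64_t target)
{
    const int64_t delta      = target - pc;
    const int64_t shortLimit = int64_t{1} << 20;
    if (delta >= -shortLimit && delta < shortLimit)
    {
        return {encodeBCond(cond, delta)};
    }
    // The unconditional branch sits at pc + 4, so its delta is relative to that.
    return {encodeBCond(reverseCond(cond), 8), encodeB(target - (pc + 4))};
}
```

The long form costs one extra instruction, which is why shortening it when the target turns out to be close is worthwhile.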
In addition, JIT data is now aligned to 8 bytes so that it is accessible from
the long-form load. `JitLargeBranches` is replaced by `JitLongAddress` to
stress-test these operations.
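The alignment requirement comes from how the `adrp`-based long forms address memory: `adrp` supplies the 4KB page of the target, and the following `add` or `ldr` supplies the offset within that page. For a 64-bit `ldr` with an unsigned immediate offset, the offset is encoded in units of 8 bytes, so the page offset of the constant has to be a multiple of 8. The sketch below illustrates the arithmetic; `PageRel`, `splitPageRelative`, `ldrX64OffsetEncodable`, and `ldrX64Imm12` are hypothetical names, not functions from the JIT.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: page delta for adrp, byte offset within the page
// for the following add/ldr.
struct PageRel
{
    int64_t  pageDelta; // in 4KB pages, relative to the page containing pc
    uint32_t pageOffs;  // low 12 bits of the target address
};

// Split an absolute target into the adrp page delta and the in-page offset.
inline PageRel splitPageRelative(uint64_t pc, uint64_t target)
{
    const int64_t pageDelta =
        static_cast<int64_t>(target >> 12) - static_cast<int64_t>(pc >> 12);
    return {pageDelta, static_cast<uint32_t>(target & 0xFFFu)};
}

// A 64-bit ldr with unsigned immediate offset scales imm12 by 8, so only
// 8-byte-aligned page offsets are encodable.
inline bool ldrX64OffsetEncodable(uint32_t pageOffs) { return pageOffs % 8 == 0; }

// The imm12 field value for an encodable page offset.
inline uint32_t ldrX64Imm12(uint32_t pageOffs) { return pageOffs / 8; }
```

With the data section base and every 8-byte constant aligned to 8 bytes, the offset check always passes; an unaligned constant would need a different addressing sequence.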
Commit migrated from https://github.com/dotnet/coreclr/commit/61fe4641665e84089dcceeabbea3e5faa0f693ce