# Jump Stubs

## Overview

On 64-bit platforms (AMD64 (x64) and ARM64), we have a 64-bit address
space. When the CLR formulates code and data addresses, it generally
uses short (<64 bit) relative addresses, and attempts to pack all code
and data relatively close together at runtime, to reduce code size. For
example, on x64, the JIT generates 32-bit relative call instruction
sequences, which can refer to a target address +/- 2GB from the source
address, and which are 5 bytes in size: 1 byte for opcode and 4 bytes
for a 32-bit IP-relative offset (called a rel32 offset). A call sequence
with a full 64-bit target address requires 12 bytes, and in addition
requires a register. Jumps have the same characteristics as calls: there
are rel32 jumps as well.
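
To make the size and range figures concrete, here is a minimal Python
sketch (illustrative only, not CLR code) that encodes a 5-byte
`call rel32` and checks whether the offset fits. The helper names
`fits_rel32` and `encode_call_rel32` are hypothetical.

```python
import struct

REL32_MIN = -(1 << 31)      # -2GB
REL32_MAX = (1 << 31) - 1   # +2GB - 1

def fits_rel32(offset: int) -> bool:
    """Return True if offset is representable as a signed 32-bit value."""
    return REL32_MIN <= offset <= REL32_MAX

def encode_call_rel32(instr_addr: int, target: int) -> bytes:
    """Encode `call rel32` (opcode E8) for a call located at instr_addr."""
    # The offset is measured from the end of the 5-byte instruction.
    offset = target - (instr_addr + 5)
    if not fits_rel32(offset):
        raise OverflowError("target is not reachable with a rel32 offset")
    return b"\xE8" + struct.pack("<i", offset)
```

A target that lies 2GB or more past the end of the instruction makes
`encode_call_rel32` raise, which is the situation the rest of this
document is concerned with.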

In case the short relative address is insufficient to address the target
from the source address, we have two options: (1) for data, we must
generate full 64-bit sized addresses, (2) for code, we insert a "jump
stub", so the short relative call or jump targets a "jump stub" which
then jumps directly to the target using a full 64-bit address (and
trashes a register to load that address). Since calls are so common, and
the need for full 64-bit call sequences so rare, using this design
drastically improves code size. The need for jump stubs only arises when
jumps of greater than 2GB range (on x64; 128MB on arm64) are required.
This only happens when the amount of code in a process is very large,
such that all the related code can't be packed tightly together, or when
the address space is otherwise tightly packed in the range where code is
normally allocated, once again preventing code from being packed
together.

An important issue arises, though: these jump stubs themselves must be
allocated within short relative range of the small call or jump
instruction. If that isn't possible, we encounter a fatal error
condition, because the already generated instruction has no way to
reach its intended target.

ARM64 has a similar issue: it has a 28-bit relative branch that is the
preferred branch instruction. The JIT always generates this instruction,
and relies on the VM to generate jump stubs when needed. However, the VM
does not use this form in any of its stubs; it always uses large form
branches. The remainder of this document will only describe the AMD64
case.

This document will describe the design and implementation of jump stubs,
their various users, the design of their allocation, and how we can
address the problem of failure to allocate required jump stubs (which in
this document I call "mitigation"), for each case.

## Jump stub creation and management

A jump stub looks like this:
```
mov rax, <8-byte address>
jmp rax
```

It is 12 bytes in size. Note that it trashes the RAX register. Since it
is normally used to interpose on a call instruction, and RAX is a
callee-trashed (volatile) register for amd64 (for both Windows and Linux
/ System V ABI), this is not a problem. For calls with custom calling
conventions, like profiler hooks, the VM is careful not to use jump
stubs that might interfere with those conventions.
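
The 12-byte figure can be checked by assembling the stub by hand. The
following Python sketch (illustrative only; `encode_jump_stub` is a
hypothetical name) emits the standard x64 encodings of the two
instructions: `mov rax, imm64` is 10 bytes and `jmp rax` is 2 bytes.

```python
import struct

def encode_jump_stub(target: int) -> bytes:
    """Return the machine code for `mov rax, target; jmp rax`."""
    # mov rax, imm64: REX.W prefix (48), opcode B8, 8-byte little-endian immediate
    mov_rax = b"\x48\xB8" + struct.pack("<Q", target)
    # jmp rax: opcode FF with ModRM byte E0
    jmp_rax = b"\xFF\xE0"
    return mov_rax + jmp_rax
```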

Jump stub creation goes through the function `rel32UsingJumpStub()`. It
takes the rel32 data address and the target address, computes the
offset from the source to the target address, and returns this offset.
Note that the source, or "base", address is the address of the rel32
data plus 4 bytes. This follows from the rules of the x86/x64
instruction set, which state that the "base" address for computing a
branch offset is the instruction pointer value, or address, of the
following instruction, which is the rel32 address plus 4.

If the offset doesn't fit, it computes the allowed address range (e.g.,
[low ... high]) where a jump stub must be located to create a legal
rel32 offset, and calls `ExecutionManager::jumpStub()` to create or find
an appropriate jump stub.
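
The offset and range computation described above can be sketched as
follows. This is a simplified Python model, not the VM's code: the
function names and the `find_or_create_stub` callback (standing in for
`ExecutionManager::jumpStub()`) are assumptions made for illustration.

```python
REL32_MIN = -(1 << 31)
REL32_MAX = (1 << 31) - 1
ADDR_MAX = (1 << 64) - 1

def jump_stub_range(rel32_addr: int):
    """Return (low, high): where a stub must start to be reachable."""
    base = rel32_addr + 4                    # address of the next instruction
    low = max(0, base + REL32_MIN)           # clamp to the address space
    high = min(ADDR_MAX, base + REL32_MAX)
    return low, high

def rel32_using_jump_stub(rel32_addr: int, target: int, find_or_create_stub) -> int:
    """Return the rel32 value to store, routing through a stub if needed."""
    base = rel32_addr + 4
    offset = target - base
    if REL32_MIN <= offset <= REL32_MAX:
        return offset                        # direct reference fits
    low, high = jump_stub_range(rel32_addr)
    stub_addr = find_or_create_stub(target, low, high)
    return stub_addr - base                  # offset to the stub instead
```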

Jump stubs are allocated in the loader heap associated with a particular
use: either the `LoaderCodeHeap` for normal code, or the `HostCodeHeap`
for DynamicMethod / LCG functions. Dynamic methods cannot share jump
stubs, to support unloading individual methods and reclaiming their
memory. For normal code, jump stubs are reused. In fact, we maintain a
hash table mapping from jump stub target to the jump stub itself, and
look up in this table to find a jump stub to reuse.
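
The reuse scheme for normal code can be modeled as a small cache keyed
by target. This Python sketch is an assumption-laden illustration: the
class name, the `allocate` callback, and the choice to keep a list of
stubs per target (so that a stub out of range for one caller does not
block another) are not taken from the VM.

```python
class JumpStubCache:
    """Maps a jump target to the stubs that already jump to it."""

    def __init__(self, allocate):
        self._allocate = allocate   # callable(low, high) -> address or None
        self._by_target = {}        # target -> list of stub addresses

    def find_or_create(self, target: int, low: int, high: int) -> int:
        # Reuse an existing stub for this target if one is within range.
        for addr in self._by_target.get(target, []):
            if low <= addr <= high:
                return addr
        # Otherwise allocate a new stub inside [low, high].
        addr = self._allocate(low, high)
        if addr is None:
            raise MemoryError("out of memory within range")
        self._by_target.setdefault(target, []).append(addr)
        return addr
```

A dynamic method would bypass such a cache and always allocate its own
stub, so that the stub can be reclaimed together with the method.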

In case there is no space left for a jump stub in any existing code heap
in the correct range, we attempt to create a new code heap in the
range required by the new jump stub, using the function
`ClrVirtualAllocWithinRange()`. This function walks the acceptable address
space range, using OS virtual memory query/allocation APIs, to find and
allocate a new block of memory in the acceptable range. If this function
can't find and allocate space in the required range, we have, on AMD64,
one more fallback: if an emergency jump stub reserve was created using
the `COMPlus_NGenReserveForJumpStubs` configuration (see below), we
attempt to find an appropriate, in range, allocation from that emergency
pool. If all attempts fail to create an allocation in the appropriate
range, we encounter a fatal error (and tear down the process), with a
distinguished "out of memory within range" message (using the
`ThrowOutOfMemoryWithinRange()` function).

## Jump stub allocation failure mitigation

Several strategies have already been created to attempt to lessen the
occurrence of jump stub allocation failure. The following CLR
configuration variables are relevant (these can be set in the registry
as well as the environment, as usual):

* `COMPlus_CodeHeapReserveForJumpStubs`. This value specifies a percentage
of every code heap to reserve for jump stubs. When a non-jump stub
allocation in the code heap would eat into the reserved percentage, a
new code heap is allocated instead, leaving some buffer in the existing
code heap. The default value is 2.
* `COMPlus_NGenReserveForJumpStubs`. This value, when non-zero, creates an
"emergency jump stub reserve". For each NGEN image loaded, an emergency
jump stub reserve space is calculated by multiplying this number, as a
percentage, against the loaded native image size. This amount of space
is allocated, within rel32 range of the NGEN image. The allocation
granularity for these emergency code heaps exceeds the specific
requirement, but multiple NGEN images can share the same jump stub
emergency space heap if it is in range. If an emergency jump stub space
can't be allocated, the failure is ignored (hopefully in this case any
required jump stub will be able to be allocated somewhere else). When
looking to allocate jump stubs, the normal mechanisms for finding jump
stub space are followed, and only if they fail to find appropriate space
are the emergency jump stub reserve heaps tried. The default value is
zero.
* `COMPlus_BreakOnOutOfMemoryWithinRange`. When set to 1, this breaks into
the debugger when the specific jump stub allocation failure condition
occurs.
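
The two percentage-based reserves amount to simple arithmetic. The
Python sketch below is only a model of the policy as described here:
the function names are hypothetical, and rounding down to whole bytes
is an assumption (the real heaps also round to an allocation
granularity, which is not modeled).

```python
def fits_outside_reserve(heap_size: int, used: int, request: int,
                         reserve_percent: int = 2) -> bool:
    """True if a non-jump-stub allocation leaves the reserve untouched."""
    reserve = heap_size * reserve_percent // 100
    return used + request <= heap_size - reserve

def emergency_reserve_size(image_size: int, reserve_percent: int) -> int:
    """Bytes of emergency jump stub space requested for one NGEN image."""
    return image_size * reserve_percent // 100
```

When `fits_outside_reserve` is false, the code allocation would go to a
new code heap, leaving the remaining space in the existing heap for
jump stubs.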

The `COMPlus_NGenReserveForJumpStubs` mitigation is described publicly
here:
https://support.microsoft.com/en-us/help/3152158/out-of-memory-exception-in-a-managed-application-that-s-running-on-the-64-bit-.net-framework.
(It also mentions, in passing, `COMPlus_CodeHeapReserveForJumpStubs`, but
only to say not to use it.)

## Jump stubs and the JIT

As the JIT generates code on AMD64, it starts by generating all data and
code addresses as rel32 IP-relative offsets. At the end of code
generation, the JIT determines how much code will be generated, and
requests buffers from the VM to hold the generated artifacts: a buffer
for the "hot" code, a buffer for the "cold" code (only used in the case
of hot/cold splitting during NGEN), and a buffer for the read-only data
(see `ICorJitInfo::allocMem()`). The VM finds allocation space in either
existing code heaps, or in newly created code heaps, to satisfy this
request. It is only at this point that the actual addresses where the
generated code will live are known. Note that the JIT has finalized the
exact generated code sequences in the function before calling
`allocMem()`. Then, the JIT issues (or "emits") the generated instruction
bytes into the provided buffers, as well as telling the VM about
exception handling ranges, GC information, and debug information.
When the JIT emits an instruction that includes a rel32 offset (as well
as for other cases of global pointer references), it calls the VM
function `ICorJitInfo::recordRelocation()` to tell the VM the address of
the rel32 data and the intended target address of the rel32 offset. How
this is handled in the VM depends on whether we are JIT-compiling, or
compiling for NGEN.

For JIT compilation, the function `CEEJitInfo::recordRelocation()`
determines the actual rel32 value to use, and fills in the rel32 data in
the generated code buffer. However, what if the offset doesn't fit in a
32-bit rel32 space?

Up to this point, the VM has allowed the JIT to always generate rel32
addresses. The JIT learns this by calling
`ICorJitInfo::getRelocTypeHint()`. If this function returns
`IMAGE_REL_BASED_REL32`, then the JIT generates a rel32 address. The first
time in the lifetime of the process when `recordRelocation()` fails to
compute an offset that fits in a rel32 space, the VM aborts the
compilation, and restarts it in a mode where
`ICorJitInfo::getRelocTypeHint()` never returns `IMAGE_REL_BASED_REL32`.
That is, the VM no longer allows the JIT to generate rel32 addresses. This
is "rel32 overflow" mode. However, this restriction only applies to data
addresses. The JIT will then load up full 64-bit data addresses in the
code (which are also subject to relocation), and use those. These 64-bit
data addresses are guaranteed to reach the entire address space.

The JIT continues to generate rel32 addresses for call instructions.
After the process is in rel32 overflow mode, if the VM gets a
`ICorJitInfo::recordRelocation()` that overflows rel32 space, it assumes
the rel32 address is for a call instruction, and it attempts to build a
jump stub, and patch the rel32 with the offset to the generated jump
stub.

Note that in rel32 overflow mode, most call instructions are likely to
still reach their intended target with a rel32 offset, so jump stubs are
not expected to be required in most cases.

If this attempt to create a jump stub fails, then the generated code
cannot be used, and the VM restarts the compilation, this time reserving
extra space in the code heap for jump stubs. The reserved extra space
ensures that the retry succeeds with high probability.
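
The sequence of decisions above can be summarized in a small Python
model. It is a sketch under stated assumptions, not the VM's
implementation: `RelocationPolicy`, `RestartCompilation`, and the
`try_jump_stub` callback are hypothetical names, and
`allow_rel32_data()` merely stands in for whether
`getRelocTypeHint()` would return `IMAGE_REL_BASED_REL32`.

```python
REL32_MIN = -(1 << 31)
REL32_MAX = (1 << 31) - 1

class RestartCompilation(Exception):
    """Signals that the current compilation must be discarded and retried."""

class RelocationPolicy:
    def __init__(self):
        self.rel32_overflow = False   # process-wide; never reset once set

    def allow_rel32_data(self) -> bool:
        return not self.rel32_overflow

    def record_relocation(self, rel32_addr, target, try_jump_stub):
        base = rel32_addr + 4
        offset = target - base
        if REL32_MIN <= offset <= REL32_MAX:
            return offset
        if not self.rel32_overflow:
            # First overflow in the process: switch modes and recompile.
            self.rel32_overflow = True
            raise RestartCompilation("entering rel32 overflow mode")
        # In overflow mode, an overflowing rel32 is assumed to be a call.
        stub = try_jump_stub(target, base + REL32_MIN, base + REL32_MAX)
        if stub is None:
            raise RestartCompilation("retry with extra jump stub space")
        return stub - base
```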

There are several problems with this system:
1. Because the VM doesn't know whether an `IMAGE_REL_BASED_REL32`
relocation is for data or for code, in the normal case (before "rel32
overflow" mode), it assumes the worst, that it is for data. If the VM
could distinguish between code and data references, then in the case
where all rel32 data accesses fit and only code offsets don't, we could
generate jump stubs for the too-large code offsets, and never go into
the "rel32 overflow" mode that leads to generating 64-bit data
addresses.
2. We can't stress jump stub creation functionality for JIT-generated
code because the JIT generates `IMAGE_REL_BASED_REL32` relocations for
intra-function jumps and calls that it expects and, in fact, requires,
not be replaced with jump stubs, because it doesn't expect the register
used by jump stubs (RAX) to be trashed.

In the NGEN case, rel32 calls are guaranteed to always reach, as PE
image files are limited to 2GB in size, meaning a rel32 offset is
sufficient to reach from any location in the image to any other
location. In addition, all control transfers to locations outside the
image go through indirection stubs. These stubs themselves might require
jump stubs, as described later.

### Failure mitigation

There are several possible alternative mitigations for JIT failure to
allocate jump stubs.
1. When we get into "rel32 overflow" mode, the JIT could always generate
large calls, and never generate rel32 offsets. This is obviously
somewhat expensive, as every external call, such as every call to a JIT
helper, would increase from 5 to 12 bytes. Since it would only occur
once you are in "rel32 overflow" mode, you already know that the process
is quite large, so this is perhaps justifiable, though also perhaps
could be optimized somewhat. This is very simple to implement.
2. Note that you get into "rel32 overflow" mode even for data addresses.
It would be useful to verify that the need for large data addresses
doesn't happen much more frequently than large code addresses.
3. An alternative is to have two separate overflow modes: "data rel32
overflow" and "code rel32 overflow", as follows:
   1. "data rel32 overflow" is entered by not being able to generate a
      rel32 offset for a data address. Restart the compile, and all subsequent
      data addresses will be large.
   2. "code rel32 overflow" is entered by not being able to generate a
      rel32 offset or jump stub for a code address. Restart the compile, and
      all subsequent external call/jump sequences will be large.

   These could be independent, which would require distinguishing code and
   data rel32 to the VM (which might be useful for other reasons, such as
   enabling better stress modes). Or, we could layer them: "data rel32
   overflow" would be the current "rel32 overflow" we have today, which we
   must enter before attempting to generate a jump stub. If a jump stub
   fails to be created, we fail and retry the compilation again, enter
   "code rel32 overflow" mode, and all subsequent code (and data) addresses
   would be large. We would need to add the ability to communicate this new
   mode from the VM to the JIT, implement large call/jump generation in the
   JIT, and implement another type of retry in the VM.
4. Another alternative: The JIT could determine the total number of
unique external call/jump targets from a function, and report that to
the VM. Jump stub space for exactly this number would be allocated,
perhaps along with the function itself (such as at the end), and only if
we are in a "rel32 overflow" mode. Any jump stub required would come
from this space (and identical targets would share the same jump stub;
note that sharing is optional). Since jump stubs would not be shared
between functions, this requires more space than the current jump stub
system but would be guaranteed to work and would only kick in when we
are already experiencing large system behavior.
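
To give a feel for the cost of the last alternative, the following
Python sketch computes the per-function reserve and assigns one slot
per unique external target, placed directly after the function's code.
The function names and the layout are assumptions made for
illustration; nothing like this exists in the JIT today.

```python
JUMP_STUB_SIZE = 12

def stub_reserve_bytes(external_targets) -> int:
    """Bytes to reserve so every unique external target can get a stub."""
    return len(set(external_targets)) * JUMP_STUB_SIZE

def layout_stub_slots(code_end: int, external_targets) -> dict:
    """Assign each unique target a stub slot starting at code_end."""
    slots = {}
    for target in external_targets:
        if target not in slots:   # identical targets share one slot
            slots[target] = code_end + len(slots) * JUMP_STUB_SIZE
    return slots
```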

## Other jump stub creation paths

The VM has several other locations that dynamically generate code or
patch previously generated code, not related to the JIT generating code.
These also must use the jump stub mechanism to possibly create jump
stubs for large distance jumps. The following sections describe these
cases.

## ReJIT

ReJIT is a CLR profiler feature, currently only implemented for x86 and
amd64, that allows a profiler to request a function be re-compiled with
different IL, given by the profiler, and have that newly compiled code
be used instead of the originally compiled IL. This happens within a
live process. A single function can be ReJIT compiled more than once,
and in fact, any number of times. The VM currently implements the
transfer of control to the ReJIT compiled function by replacing the
first five bytes of the generated code of the original function with a
"jmp rel32" to the newly generated code. Call this the "jump patch"
space. One fundamental requirement for this to work is that (a) every
function be at least 5 bytes long, and (b) none of the first 5 bytes of
a function (except the first, which is the address of the function
itself) be the target of any branch. (As an implementation detail, the JIT
currently pads the function prolog out to 5 bytes with NOP instructions,
if required, even if there is enough code following the prolog to
satisfy the 5-byte requirement if those non-prolog bytes are also not
branch targets.)

If the newly ReJIT generated code is at an address that doesn't fit in a
rel32 in the "jmp rel32" patch, then a jump stub is created.
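
The patching step can be illustrated with a short Python sketch that
overwrites the first 5 bytes of a function with `jmp rel32` (opcode
E9), falling back to a jump stub when the new code is out of range.
This is a model only: `apply_jump_patch` and the `get_jump_stub`
callback are hypothetical, and the thread-safety concerns of patching
live code are not addressed.

```python
import struct

REL32_MIN = -(1 << 31)
REL32_MAX = (1 << 31) - 1
JUMP_PATCH_SIZE = 5

def apply_jump_patch(code: bytearray, func_addr: int, new_code_addr: int,
                     get_jump_stub) -> int:
    """Patch `code` (located at func_addr) to jump to new_code_addr."""
    if len(code) < JUMP_PATCH_SIZE:
        raise ValueError("function is shorter than the jump patch space")
    base = func_addr + JUMP_PATCH_SIZE   # address of the following instruction
    dest = new_code_addr
    if not REL32_MIN <= dest - base <= REL32_MAX:
        # Too far for a direct jump: go through a jump stub instead.
        dest = get_jump_stub(new_code_addr, base + REL32_MIN, base + REL32_MAX)
        if dest is None or not REL32_MIN <= dest - base <= REL32_MAX:
            raise MemoryError("no jump stub available within rel32 range")
    code[:JUMP_PATCH_SIZE] = b"\xE9" + struct.pack("<i", dest - base)
    return dest   # the address the patch now jumps to
```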

The JIT only creates the required jump patch space if the
`CORJIT_FLG_PROF_REJIT_NOPS` flag is passed to the JIT. For dynamic
compilation, this flag is only passed if a profiler is attached and has
also requested ReJIT services. Note that currently, to enable ReJIT, the
profiler must be present from process launch, and must opt-in to enable
ReJIT at process launch, meaning that all JIT generated functions will
have the jump patch space under these conditions. There will never be a
mix of functions with and without jump patch space in the process if a
profiler has enabled ReJIT. A desirable future state from the profiler
perspective would be to support profiler attach-to-process and ReJIT
(with function swapping) at any time thereafter. This goal may or may
not be achieved via the jump patch space design.

All NGEN and Ready2Run images are currently built with the
`CORJIT_FLG_PROF_REJIT_NOPS` flag set, to always enable ReJIT using native
images.

A single function can be ReJIT compiled many times. Only the last ReJIT
generated function can be active; the previous compilations consume
address space in the process, but are not collected until the AppDomain
unloads. Each ReJIT event must update the "jmp rel32" patch to point to
the new function, and thus each ReJIT event might require a new jump
stub.

If a situation arises where a single function is ReJIT compiled many
times, and each time requires a new jump stub, it's possible that all
jump stub space near the original function can be consumed simply by the
"leaked" jump stubs created by all the ReJIT compilations for a single
function. The "leaked" ReJIT compiled functions (since they aren't
collected until AppDomain unload) also make it more likely that "close"
code heap address space gets filled up.

### Failure mitigation

A simple mitigation would be to increase the size of the required
function jump patch space from 5 to 12 bytes. This is a two line change
in the `CodeGen::genPrologPadForReJit()` function in the JIT. However,
this would increase the size of all NGEN and Ready2Run images. Note that
many managed code functions are very small, with very small prologs, so
this could significantly impact code size (the change could easily be
measured). For JIT-generated code, where the additional size would only
be added once a profiler has enabled ReJIT, it seems like the additional
code size would be easily justified.

Note that a function has at most one active ReJIT companion function.
Once that ReJIT function is superseded, it is never used again, and
the associated jump stub is also "leaked", and never used again. We
could reserve space for a single jump stub for each function, to be used
by ReJIT, and then, if a jump stub is required for ReJIT, always use
that space. The JIT could pad the function end by 12 bytes when the
`CORJIT_FLG_PROF_REJIT_NOPS` flag is passed, and the ReJIT patching code
could use this reserved space any time it required a jump stub. This
would require 12 extra bytes to be allocated for every function
generated when the `CORJIT_FLG_PROF_REJIT_NOPS` flag is passed. These 12
bytes could also be allocated at the end of the code heap, in the
address space, but not in the normal working set.

For NGEN and Ready2Run, this would require 12 bytes for every function
in the image. This is quite a bit more space than the suggested
mitigation of increasing prolog padding to 12 bytes but only if
necessary (meaning, only if they aren't already 12 bytes in size).
Alternatively, NGEN could allocate this space itself in the native
image, putting it in some distant jump stub data area or section that
would be guaranteed to be within range (due to the 2GB PE file size
limitation) but wouldn't consume physical memory unless needed. This
option would require more complex logic to allocate and find the
associated jump stub during ReJIT. This would be similar to the JIT
case, above, of reserving the jump stub in a distant portion of the code
heap.

## NGEN

NGEN images are built with several tables of code addresses that must be
patched when the NGEN image is loaded.

### CLR Helpers

During NGEN, the JIT generates either direct or indirect calls to CLR
helpers. Most are direct calls. When NGEN constructs the PE file, it
causes these all to branch to (or through, in the case of indirect
calls) the helper table. When a native image is loaded, it replaces the
helper number in the table with a 5-byte "jmp rel32" sequence. If the
rel32 doesn't fit, a jump stub is created. Note that each helper table
entry is allocated with 8 bytes (only 5 are needed for "jmp rel32", but
presumably 8 bytes are reserved to improve alignment).

The code for filling out the helper table is `Module::LoadHelperTable()`.

#### Failure mitigation

A simple fix is to change NGEN to reserve 12 bytes for each direct call
table entry, to accommodate the 12-byte jump stub sequence. A 5-byte
"jmp rel32" sequence could still be used, if it fits, but the full 12
bytes would be used if necessary.

There are fewer than 200 helpers, so a maximum additional overhead would
be about `200 * (12 - 8) = 800` bytes. That is by far a worst-case
scenario. Mscorlib.ni.dll itself has 72 entries in the helper table.
System.XML.ni.dll has 51 entries, which would lead to 288 and 204 bytes
of additional space, out of 34MB and 12MB total NI file size,
respectively.
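
The overhead figures above follow from multiplying the entry count by
the 4 extra bytes per entry. A trivial Python check (the function name
is hypothetical) reproduces them:

```python
def helper_table_overhead(entries: int, old_entry_size: int = 8,
                          new_entry_size: int = 12) -> int:
    """Extra bytes needed when every helper table entry grows to 12 bytes."""
    return entries * (new_entry_size - old_entry_size)

# Worst case (200 helpers), Mscorlib.ni.dll (72), System.XML.ni.dll (51)
overheads = [helper_table_overhead(n) for n in (200, 72, 51)]
```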

An alternative is to change all helper calls in NGEN to be indirect:
```
call [rel32]
```
where the [rel32] offset points to an 8-byte address stored in the
helper table. This method is already used by exactly one helper on
AMD64: `CORINFO_HELP_STOP_FOR_GC`, in particular because this helper
doesn't allow us to trash RAX, which a jump stub would do.
Similarly, Ready2Run images use:
```
call [rel32]
```
for "hot" helpers and:
```
call rel32
```
to a shared:
```
jmp [rel32]
```
for cold helpers. We could change NGEN to use the Ready2Run scheme.

Alternatively, we might handle all NGEN jump stub issues by reserving a
section in the image for jump stubs that reserves virtual address space
but does not increase the size of the image (in C++ this is the ".bss"
section). The size of this section could be calculated precisely from
all the required possible jump stub contributions to the image. Then,
the jump stub code would allocate jump stubs from this space when
required for a NGEN image.

### Cross-module inherited methods

Per the comments on `VirtualMethodFixupWorker()`, in an NGEN image,
virtual slots inherited from cross-module dependencies point to jump
thunks. The jump thunk invokes code to ensure the method is loaded and
has a stable entry point, at which point the jump thunk is replaced by a
"jmp rel32" to that stable entrypoint. This is represented by
`CORCOMPILE_VIRTUAL_IMPORT_THUNK`. This can require a jump stub.

Similarly, `CORCOMPILE_EXTERNAL_METHOD_THUNK` represents another kind of
jump thunk in the NGEN image that also can require a jump stub.

#### Failure mitigation

Both kinds of thunk could be changed to reserve 12 bytes instead
of just 5 for the jump thunk, to provide the space required for any
potential jump stub.

## Precode

Precodes are used as temporary entrypoints for functions that will be
JIT compiled. They are also used for temporary entrypoints in NGEN
images for methods that need to be restored (i.e., the method code has
external references that need to be loaded before the code runs). There
are `StubPrecode`, `FixupPrecode`, `RemotingPrecode`, and
`ThisPtrRetBufPrecode`. Each of these generates a rel32 jump and/or call
that might require a jump stub.

StubPrecode is the "normal" general case. FixupPrecode is the most
common, and has been heavily size optimized. Each FixupPrecode is 8
bytes. Generated code calls the FixupPrecode address. Initially, the
precode invokes code to generate or fix up the method being called, and
then "fixes up" the FixupPrecode itself to jump directly to the native
code. This final code will be a "jmp rel32", possibly via a jump stub.
DynamicMethod / LCG uses FixupPrecode. This code path has been found to
fail in large customer installations.

### Failure mitigation

An implementation has been made which changes the allocation of
FixupPrecode to pre-allocate space for jump stubs, but only in the case
of DynamicMethod. (See https://github.com/dotnet/coreclr/pull/9883).
Currently, FixupPrecodes are allocated in "chunks" that share a
MethodDesc pointer. For LCG, each chunk will have an additional set of
bytes allocated, to reserve space for one jump stub per FixupPrecode in
the chunk. When the FixupPrecode is patched, for LCG methods it will use
the pre-allocated space if a jump stub is required.

For non-LCG, we reserve, but do not allocate, space at the end
of the code heap. This is similar and in addition to the reservation done by
`COMPlus_CodeHeapReserveForJumpStubs`. (See https://github.com/dotnet/coreclr/pull/15296).

## Ready2Run

There are several DynamicHelpers class methods, used by Ready2Run, which
may create jump stubs (not all do, but many do). The helpers are
allocated dynamically when the helper in question is needed.

### Failure mitigation

These helpers could easily be changed to allocate additional, reserved,
unshared space for a potential jump stub, and that space could be used
when creating the rel32 offset.

## Compact entrypoints

The compact entrypoints implementation might create jump stubs. However,
compact entrypoints are not enabled for AMD64 currently.

## Stress modes

Setting `COMPlus_ForceRelocs=1` forces jump stubs to be created in all
scenarios except for JIT generated code. As described previously, the
VM doesn't know whether the JIT is reporting a rel32 data address or a
code address, and in addition the JIT reports relocations for
intra-function jumps and calls for which it doesn't expect the register
used by the jump stub to be trashed. Thus, we don't force jump stubs to
be created for all JIT-reported jumps or calls.

We should improve the communication between the JIT and VM such that we
can reliably force jump stub creation for every rel32 call or jump. In
addition, we should make sure to enable code to stress the creation of
jump stubs for every mitigation that is implemented, whether that be
using the existing `COMPlus_ForceRelocs` configuration, or the creation of
a new configuration option.