- [Introduction to the Common Language Runtime](intro-to-clr.md)
- [Garbage Collection Design](garbage-collection.md)
- [Threading](threading.md)
-- [RyuJIT Overview](ryujit-overview.md)
- - [Porting RyuJIT to other platforms](porting-ryujit.md)
+- [RyuJIT Overview](../jit/ryujit-overview.md)
+ - [Porting RyuJIT to other platforms](../jit/porting-ryujit.md)
- [Type System](type-system.md)
- [Type Loader](type-loader.md)
- [Method Descriptor](method-descriptor.md)
be able to devirtualize these calls), and plot the times as a function of `p`.
The result is something like the following:
-![two classes baseline perf](TwoClassesBaseline.JPG)
+![two classes baseline perf](images/TwoClassesBaseline.JPG)
Modern hardware includes an indirect branch target predictor and we can see it
in action here. When the array element type is predictable (`p` very close to
performance should perhaps be slightly worse than the un-optimized case as there is
now an extra check to run before the call.
-![two classes devirt perf](TwoClassesDevirt.JPG)
+![two classes devirt perf](images/TwoClassesDevirt.JPG)
However, as you can see, the performance of the devirtualized case (blue line) is as
good or better than the un-optimized case for all values of `p`. This is perhaps
average the results for the various values of `p1` and plot performance as a
function of `p`:
-![three classes devirt perf](ThreeClassesDevirt.JPG)
+![three classes devirt perf](images/ThreeClassesDevirt.JPG)
The right-hand side (`p` near 1.0) looks a lot like the previous chart. This is
not surprising as there are relatively few instances of that third class. But the
The following chart shows the min and max values as well as the average, and also
shows the two-class result (dashed lines).
-![three classes devirt perf ranges](ThreeClassesDevirtFull.JPG)
+![three classes devirt perf ranges](images/ThreeClassesDevirtFull.JPG)
You can see the minimum values are very similar to the two class case; these
are cases where `p1` is close to 0 or close to 1. And that makes sense because
and `E`. Here's some detail (the x axis now shows `p1` and `p` as upper and
lower values respectively).
-![three classes devirt perf detail](ThreeClassesDevirtDetail.JPG)
+![three classes devirt perf detail](images/ThreeClassesDevirtDetail.JPG)
The worst case for perf for both is when the mixture of `D` and `E` is
unpredictably 50-50 and there are no `B`s. Once we mix in just 10% of `B` then
guesses the wrong class, but the break-even point (not shown) is at a relatively
small probability of a correct guess.
-![two classes interface devirt](TwoClassesInterface.JPG)
+![two classes interface devirt](images/TwoClassesInterface.JPG)
### Interface Calls: The Three Class Case
correct, guessing wins on average, and around 30% correct guessing is always a
perf win.
-![three classes interface devirt](ThreeClassesInterface.JPG)
+![three classes interface devirt](images/ThreeClassesInterface.JPG)
### Delegate Speculation
- should we test for multiple types? Once we've peeled off the "most likely" case
if the conditional probability of the next most likely case is high it is probably
worth testing for it too. I believe the C++ compiler will test up to 3 candidates
-this way... but that's a lot of code expansion.
\ No newline at end of file
+this way... but that's a lot of code expansion.
### Improved Struct Handling
-Most of the work required to improve struct handling in RyuJIT is captered in the [first-class structs](https://github.com/dotnet/runtime/blob/master/docs/design/features/first-class-structs.md)
+Most of the work required to improve struct handling in RyuJIT is captured in the [first-class structs](first-class-structs.md)
document, though that document needs to be updated to reflect the work that has already been done.
Recent improvements include the struct promotion improvements that went in for `Span<T>`.
This is increasingly important as C# language constructs like async/await and
certain `foreach` incantations are implemented with EH constructs, making them
difficult to avoid at source level. The recent work on finally cloning, empty
-finally removal, and empty try removal targeted this. [Writethrough](https://github.com/dotnet/runtime/blob/master/docs/design/features/eh-writethru.md)
+finally removal, and empty try removal targeted this. [Writethrough](eh-writethru.md)
is another key optimization enabler here, and we are actively pursuing it. Other
things we've discussed include inlining methods with EH and computing funclet
callee-save register usage independently of main function callee-save register
This document provides additional detail on the linear scan register
allocator (LSRA) in RyuJIT. It is expected that the reader has already
-read the [RyuJIT Overview document](https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/ryujit-overview.md).
+read the [RyuJIT Overview document](ryujit-overview.md).
Register allocation is performed using a linear scan register allocation
scheme, implemented by the `LinearScan` class.
Below is the portion of the call graph the escape analysis will have to consider when proving this allocation is not escaping.
Green arrows correspond to the call sites that are inlined and red arrows correspond to the call sites that are not inlined.
-![Call Graph](GreenNode_WriteTo_CallGraph.png)
+![Call Graph](images/GreenNode_WriteTo_CallGraph.png)
## Roadmap
set "COMPlus_JitDump=Main GetEnumerator"
```
-See [Setting configuration variables](../../../workflow/testing/coreclr/viewing-jit-dumps.md#setting-configuration-variables) for more
+See [Setting configuration variables](viewing-jit-dumps.md#setting-configuration-variables) for more
details on this.
Full instructions for dumping the compilation of some managed code can be found here:
-[viewing-jit-dumps.md](../../../workflow/testing/coreclr/viewing-jit-dumps.md)
+[viewing-jit-dumps.md](viewing-jit-dumps.md)
## Reading expression trees
- It then imports the IL for the candidate, producing IR
- This is inserted at the call site, if successful
- This phase has been undergoing significant refactoring and enhancement:
- - https://github.com/dotnet/runtime/blob/master/docs/design/features/inlining-plans.md
+ - See [inlining plans](inlining-plans.md)
#### Notes
The inliner re-invokes the importer for each method that is considered a suitable candidate. Along the way, it may determine that the method cannot, or should not, be inlined, in which case it abandons the constructed IR and leaves the callsite as-is. Otherwise, it inserts the newly created IR at the callsite, adds the local variables of the called method to the caller, and fixes up the arguments and returns.
- Run & capture jitdump3.out, search for optCloneLoops
### Reference
-- The RyuJIT Overview document is available here:
- https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/ryujit-overview.md
+- The RyuJIT overview document is available [here](ryujit-overview.md)
## Backup
This document is intended for people interested in seeing the disassembly, GC info, or other details the JIT generates for a managed program.
-To make sense of the results, it is recommended you also read the [Reading a JitDump](/docs/design/coreclr/botr/ryujit-overview.md#reading-a-jitdump) section of the RyuJIT Overview.
+To make sense of the results, it is recommended you also read the [Reading a JitDump](ryujit-overview.md#reading-a-jitdump) section of the RyuJIT Overview.
## Setting up our environment
Below are some of the most useful `COMPlus` variables. Where {method-list} is specified in the list below, you can supply a space-separated list of either fully-qualified or simple method names (the former is useful when running something that has many methods of the same name), or you can specify `*` to mean all methods.
-* `COMPlus_JitDump`={method-list} – dump lots of useful information about what the JIT is doing. See [Reading a JitDump](/docs/design/coreclr/botr/ryujit-overview.md#reading-a-jitdump) for more on how to analyze this data.
+* `COMPlus_JitDump`={method-list} – dump lots of useful information about what the JIT is doing. See [Reading a JitDump](ryujit-overview.md#reading-a-jitdump) for more on how to analyze this data.
* `COMPlus_JitDisasm`={method-list} – dump a disassembly listing of each method.
* `COMPlus_JitDiffableDasm` – set to 1 to tell the JIT to avoid printing things like pointer values that can change from one invocation to the next, so that the disassembly can be more easily compared.
* `COMPlus_JitGCDump`={method-list} – dump the GC information.
### Lowering
-As described here: https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/ryujit-overview.md#lowering, Lowering is responsible for transforming the IR in such a way that the control flow, and any register requirements, are fully exposed. This includes determining what instructions can be "contained" in another, such as immediates or addressing modes. For the hardware intrinsics, these are done in the target-specific methods `Lowering::LowerHWIntrinsic()` and `Lowering::ContainCheckHWIntrinsic()`.
+As described [here](../coreclr/jit/ryujit-overview.md#lowering), Lowering is responsible for transforming the IR in such a way that the control flow, and any register requirements, are fully exposed. This includes determining what instructions can be "contained" in another, such as immediates or addressing modes. For the hardware intrinsics, these are done in the target-specific methods `Lowering::LowerHWIntrinsic()` and `Lowering::ContainCheckHWIntrinsic()`.
The main consideration here is whether there are child nodes that are folded into the generated instruction. These may be:
* An immediate operand
| EE | Execution Engine. |
| GC | [Garbage Collector](https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/garbage-collection.md). |
| IPC | Inter-Process Communication. |
-| JIT | [Just-in-Time](https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/ryujit-overview.md) compiler. RyuJIT is the code name for the next generation Just-in-Time(aka "JIT") for the .NET runtime. |
+| JIT | [Just-in-Time](https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/jit/ryujit-overview.md) compiler. RyuJIT is the code name for the next-generation Just-in-Time (aka "JIT") compiler for the .NET runtime. |
| LCG | Lightweight Code Generation. An early name for [dynamic methods](https://github.com/dotnet/runtime/blob/master/src/coreclr/src/System.Private.CoreLib/src/System/Reflection/Emit/DynamicMethod.cs). |
| MD | MetaData. |
| MDA | Managed Debugging Assistant - see [details](https://docs.microsoft.com/en-us/dotnet/framework/debug-trace-profile/diagnosing-errors-with-managed-debugging-assistants) (Note: Not in .NET Core, equivalent diagnostic functionality is made available on a case-by-case basis, e.g. [#15465](https://github.com/dotnet/coreclr/issues/15465)) |