--- /dev/null
+How to Debug Crossgen2
+=================
+
+Crossgen2 brings with it a number of new challenges for debugging the compilation process. Fortunately, in addition to challenges, crossgen2 is designed to enhance various parts of the debugging experience.
+
+Important concerns to be aware of when debugging Crossgen2
+---------------------------------
+
+- Other than the JIT, Crossgen2 is a managed application
+- By default Crossgen2 uses a multi-core compilation strategy
+- A Crossgen2 process will have 2 copies of the JIT in the process at the same time, the one used to compile the target, and the one used to compile Crossgen2 itself.
+- Crossgen2 does not parse environment variables for controlling the JIT (or any other behavior), all behavior is controlled via the command line
+- The Crossgen2 command line as generated by the project system is quite complex
+
+
+Built in debugging aids in Crossgen2
+---------------------------------
+
+- When debugging a multi-threaded component of Crossgen2 and not investigating a multi-threading issue itself, it is generally advisable to disable the use of multiple threads.
+To do this use the `--parallelism 1` switch to specify that the maximum parallelism of the process shall be 1.
+
+- When debugging the behavior of compiling a single method, the compiler may be instructed to only compile a single method. This is done via the various --singlemethod options
+
+ - These options work by specifying a specific method by type, method name, generic method arguments, and if those are insufficiently descriptive to uniquely identify the method, by index. Types are described using the same format that the managed Type.GetType(string) function uses, which is documented in the normal .NET documentation. As this format can be quite verbose, the compiler provides a `--print-repro-instructions` switch which will print the arguments necessary to compile a function to the console.
+ - `--singlemethodindex` is used in cases where the method signature is the only distinguishing factor about the method. An index is used instead of a series of descriptive arguments, as specifying a signature exactly is extraordinarily complicated.
+ - Repro args will look like the following ``--singlemethodtypename "Internal.Runtime.CompilerServices.Unsafe" --singlemethodname As --singlemethodindex 2 --singlemethodgenericarg "System.Runtime.Intrinsics.Vector256`1[[System.SByte]]" --singlemethodgenericarg "System.Runtime.Intrinsics.Vector256`1[[System.Double]]"``
+
+- Since Crossgen2 is by default multi-threaded, it produces results fairly quickly even when compiling using a Debug variant of the JIT. In general, when debugging JIT issues we recommend using the debug JIT regardless of which environment caused a problem.
+
+- Crossgen2 supports nearly arbitrary cross-targetting, including OS and architecture cross targeting. The only restriction is that 32bit architecture cannot compile targetting a 64bit architecture. This allows the use of the debugging environment most convenient to the developer. In particular, if there is an issue which crosses the managed/native boundary, it is often convenient to debug using the mixed mode debugger on Windows X64.
+ - If the correct set of assemblies/command line arguments are passed to the compiler Crossgen2 should produce binary identical output on all platforms.
+ - The compiler does not check the OS/Architecture specified for input assemblies, which allows compiling using a non-architecture/OS matched version of the framework to target an arbitrary target. While this isn't useful for producing the diagnosing all issues, it can be cheaply used to identify the general behavior of a change on the full swath of supported architectures.
+
+Control compilation behavior by using the `--targetos` and `--targetarch` switches. The default behavior is to target the crossgen2's own OS/Arch pair, but all 64bit versions of crossgen2 are capable of targetting arbitrary OS/Arch combinations.
+At the time of writing the current supported sets of valid arguments are:
+| Command line arguments
+| --- |
+| `--targetos windows --targetarch x86` |
+| `--targetos windows --targetarch x64` |
+| `--targetos windows --targetarch arm` |
+|`--targetos windows --targetarch arm64` |
+|`--targetos linux --targetarch x64` |
+|`--targetos linux --targetarch arm` |
+|`--targetos linux --targetarch arm64` |
+|`--targetos osx --targetarch x64` |
+|`--targetos osx --targetarch arm64` |
+
+- Passing special jit behavior flags to the compiler is done via the `--codegenopt` switch. As an example to turn on tailcall loop optimizations and dump all code compiled use a pair of them like `--codegenopt NgenDump=* --codegenopt TailCallLoopOpt=1`.
+
+- When using the NgenDump feature of the JIT, disable parallelism as described above or specify a single method to be compiled. Otherwise, output from multiple functions will be interleaved and inscrutable.
+
+- Since there are 2 jits in the process, when debugging in the JIT, if the source files match up, there is a decent chance that a native debugger will stop at unfortunate and unexpected locations. This is extremely annoying, and to combat this, we generally recommend making a point of using a runtime which doesn't exactly match that of the crossgen2 in use. However, if that isn't feasible, it is also possible to disable symbol loading in most native debuggers. For instance, in Visual Studio, one would use the "Specify excluded modules" feature.
+
+- Crossgen2 identifies the JIT to use by the means of a naming convention. By default it will use a JIT located in the same directory as the crossgen2.dll file. In addition there is support for a `--jitpath` switch to use a specific JIT. This option is intended to support A/B testing by the JIT team. The `--jitpath` option should only be used if the jit interface has not been changed. The JIT specified by the `--jitpath` switch must be compatible with the current settings of the `--targetos` and `--targetarch` switches.
+
+- In parallel to the crossgen2 project, there is a tool known as r2rdump. This tool can be used to dump the contents of a produced image to examine what was actually produced in the final binary. It has a large multitude of options to control exactly what is dumped, but in general it is able to dump any image produced by crossgen2, and display its contents in a human readable fashion. Specify `--disasm` to display disassembly.
+
+- If there is a need to debug the dependency graph of crossgen2 (which is a very rare need at this time), there is a visualization tool located in the corert repo. https://github.com/dotnet/corert/tree/master/src/ILCompiler.DependencyAnalysisFramework/ILCompiler-DependencyGraph-Viewer To use that tool, get the sources from the CoreRT repo, compile it, and run it on Windows before the crossgen2 compilation begins. It will present a live view of the graph as it is generated and allow for exploration to determine why some node is in the graph. Every node in the graph has a unique id that is visible to this tool, and it can be used in parallel with a debugger to understand what is happening in the crossgen2 process. Changes to move this tool to a more commonly built location and improve the fairly horrible UI are encouraged.
+
+- When used in the official build system, the set of arguments passed to crossgen2 is extremely complex, especially with regards to the set of reference paths (each assembly is specified individually). To make it easier to use crossgen2 from the command line manually the tool will accept wildcards in its parsing of references. Please note that on Unix that the shell will happily expand these arguments by itself, which will not work correctly. In those situations enclose the argument in quotes to prevent the shell expansion.
+
+- Crossgen2 supports a `--map` and `--mapcsv` arguments to produce map files of the produced output. These are primarily used for diagnosing size issues, as they describe the generated file in fairly high detail, as well as providing a number of interesting statistics about the produced output.
+
+- Diagnosing why a specific method failed to compile in crossgen2 can be done by passing the `--verbose` switch to crossgen2. This will print many things, but in particular it will print the reason why a compilation was aborted due to an R2R format limitation.
+
+- Crossgen2 can use either the version of dotnet that is used to build the product (as found by the dotnet.cmd or dotnet.sh script found in the root of the runtime repo) or it can use a sufficiently recent corerun.exe produced by constructing a test build. It is strongly recommended if using corerun.exe to use a release build of corerun for this purpose, as crossgen2 runs a very large amount of managed code. The version of corerun used does not need to come from the same build as the crossgen2.dll that is being debugging. In fact, I would recommnend using a different enlistment to build that corerun to avoid any confusion.
+
+- In the runtime testbed, each test can be commanded to compile with crossgen2 by using environment variables. Just set the `RunCrossgen2` variable to 1, and optionally set the `CompositeBuildMode` variable to 1 if you wish to see the R2R behavior with composite image creation. By default this will simply use `dotnet` to run crossgen2. If you run the test batch script from the root of the enlistment on Windows this will just work; otherwise, you must set the `__TestDotNetCmd` environment variable to point at copy of `dotnet` or `corerun` that can run crossgen2. This is often the easiest way to run a simple test with crossgen2 for developers practiced in the CoreCLR testbed. See the example below of various techniques to use when diagnosing issues under crossgen2.
+
+- When attempting to build crossgen2, you must build the clr.tools subset. If rebuilding a component of the JIT and wanting to use that in your inner loop, you must build as well with either the clr.jit or clr.alljits subsets. If the jit interface is changed, the clr.runtime subset must also be rebuilt.
+
+- After completion of a product build, a functional copy of crossgen2.dll will be located in a bin directory in a path like `bin\coreclr\windows.x64.Debug\crossgen2`. After creating a test native layout via a command such as `src\tests\build generatelayoutonly` then there will be a copy of crossgen2 located in the `%CORE_ROOT%\crossgen2` directory. The version of crossgen2 in the test core_root directory will have the appropriate files for running under either an x64 dotnet.exe or under the target architecture. This was done to make it somewhat easier to do cross platform development, and assumes the primary development machine is x64,
+
+Example of debugging a test application in Crossgen2
+================================================
+This example is to demonstrate debugging of a simple test in the CoreCLR testbed.
+
+The example assumes that `CORE_ROOT` is set appropriately, and that `__TestDotNetCmd` is set appropriately. See comments above for details on `__TestDotNetCmd`
+
+The test begins by setting `RunCrossgen2=1` This will instruct the test batch script to run crossgen2 on input binaries. It will also create a copy of the input binaries which needs to be deleted if you modify the test. To do so, delete the directory where the test binaries are, and rebuild the test.
+
+```
+C:\git2\runtime>set RunCrossgen2=1
+
+C:\git2\runtime>c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\Complex1.cmd
+BEGIN EXECUTION
+Complex1.dll
+ 1 file(s) copied.
+Could Not Find c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\Complex1.dll.rsp
+Response file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp
+c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\Complex1.dll
+-o:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
+--targetarch:x64
+--verify-type-and-field-layout
+-O
+-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\System.*.dll
+-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\Microsoft.*.dll
+-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\mscorlib.dll
+-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\netstandard.dll
+" "dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp" -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll"
+Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
+ "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\corerun.exe" Complex1.dll
+Starting...
+Everything Worked!
+Expected: 100
+Actual: 100
+END EXECUTION - PASSED
+PASSED
+```
+From that invocation you can see that crossgen2 was launched with a response file containing a list of arguments including all of the details for references. Then you can manually run the actual crossgen2 command, which is prefixed with the value of the `__TestDotNetCmd` environment variable. For instance, once I saw the above output, I copied and pasted the last command, and ran it.
+```
+C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp" -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll
+C:\git2\runtime\.dotnet
+Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
+```
+
+And then wanted to debug the individual method compilation, and ran it with the `--print-repro-instructions` switch
+
+```
+C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp" -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions
+C:\git2\runtime\.dotnet
+Single method repro args:--singlemethodtypename "Complex,Complex1" --singlemethodname mul_em --singlemethodindex 1
+Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname .ctor --singlemethodindex 1
+Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1
+Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
+```
+
+I then wanted to see some more detail from the jit. To keep the size of this example small, I'm just using the `NgenOrder=1`switch, but jit developers would more likely use `NgenDump=*` switch.
+```
+C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp" -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions --singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1 --codegenopt NgenOrder=1
+C:\git2\runtime\.dotnet
+Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1
+ | Profiled | Method | Method has | calls | Num |LclV |AProp| CSE | Perf |bytes | x64 codesize|
+ mdToken | CNT | RGN | Hash | EH | FRM | LOOP | NRM | IND | BBs | Cnt | Cnt | Cnt | Score | IL | HOT | CLD | method name
+---------+------+------+----------+----+-----+------+-----+-----+-----+-----+-----+-----+---------+------+-------+-----+
+06000002 | | | f656934b | | rsp | LOOP | 3 | 0 | 35 | 56 | 48 | 7 | 43056 | 490 | 761 | 0 | Complex_Array_Test:Main(System.String[]):int
+Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
+```
+
+And finally, as the last `--targetarch` and `--targetos` switch is the meaningful one, it is simple to target a different architecture for ad hoc exploration...
+
+```
+C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp" -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions --singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1 --codegenopt NgenOrder=1 --targetarch arm64
+C:\git2\runtime\.dotnet
+Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1
+ | Profiled | Method | Method has | calls | Num |LclV |AProp| CSE | Perf |bytes | arm64 codesize|
+ mdToken | CNT | RGN | Hash | EH | FRM | LOOP | NRM | IND | BBs | Cnt | Cnt | Cnt | Score | IL | HOT | CLD | method name
+---------+------+------+----------+----+-----+------+-----+-----+-----+-----+-----+-----+---------+------+-------+-----+
+06000002 | | | f656934b | | fp | LOOP | 3 | 0 | 35 | 59 | 48 | 10 | 63828 | 490 | 1048 | 0 | Complex_Array_Test:Main(System.String[]):int
+Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
+```
+Note that the only difference in the command line was to pass the `--targetarch arm64` switch, and the JIT now compiles the method as arm64.
+
+Finally, attaching a debugger to crossgen2.
+
+Since this example uses `dotnet` as the `__TestDotNetCmd` you will need to debug the `c:\git2\runtime\.dotnet\dotnet.exe` process.
+
+```
+devenv /debugexe C:\git2\runtime\.dotnet\dotnet.exe "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp" -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions --singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1 --codegenopt NgenOrder=1 --targetarch arm64
+```
+
+This will launch the Visual Studio debugger, with a solution setup for debugging the dotnet.exe process. By default this solution will debug the native code of the process only. To debug the managed components, edit the properties on the solution and set the `Debugger Type` to `Managed (.NET Core, .NET 5+)` or `Mixed (.NET Core, .NET 5+)`.
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;
using System.Runtime.InteropServices;
+using System.Text;
using Internal.CommandLine;
using Internal.IL;
.UseCompilationRoots(compilationRoots)
.UseOptimizationMode(_optimizationMode);
+ if (_commandLineOptions.PrintReproInstructions)
+ builder.UsePrintReproInstructions(CreateReproArgumentString);
+
compilation = builder.ToCompilation();
}
return method;
}
- private static bool DumpReproArguments(CodeGenerationFailedException ex)
+ private static string CreateReproArgumentString(MethodDesc method)
{
- Console.WriteLine(SR.DumpReproInstructions);
+ StringBuilder sb = new StringBuilder();
- MethodDesc failingMethod = ex.Method;
-
- var formatter = new CustomAttributeTypeNameFormatter((IAssemblyDesc)failingMethod.Context.SystemModule);
+ var formatter = new CustomAttributeTypeNameFormatter((IAssemblyDesc)method.Context.SystemModule);
- Console.Write($"--singlemethodtypename \"{formatter.FormatName(failingMethod.OwningType, true)}\"");
- Console.Write($" --singlemethodname {failingMethod.Name}");
+ sb.Append($"--singlemethodtypename \"{formatter.FormatName(method.OwningType, true)}\"");
+ sb.Append($" --singlemethodname {method.Name}");
{
int curIndex = 0;
- foreach (var searchMethod in failingMethod.OwningType.GetMethods())
+ foreach (var searchMethod in method.OwningType.GetMethods())
{
- if (searchMethod.Name != failingMethod.Name)
+ if (searchMethod.Name != method.Name)
continue;
curIndex++;
- if (searchMethod == failingMethod.GetMethodDefinition())
+ if (searchMethod == method.GetMethodDefinition())
{
- Console.Write($" --singlemethodindex {curIndex}");
+ sb.Append($" --singlemethodindex {curIndex}");
}
}
}
- for (int i = 0; i < failingMethod.Instantiation.Length; i++)
- Console.Write($" --singlemethodgenericarg \"{formatter.FormatName(failingMethod.Instantiation[i], true)}\"");
+ for (int i = 0; i < method.Instantiation.Length; i++)
+ sb.Append($" --singlemethodgenericarg \"{formatter.FormatName(method.Instantiation[i], true)}\"");
+
+ return sb.ToString();
+ }
+
+ private static bool DumpReproArguments(CodeGenerationFailedException ex)
+ {
+ Console.WriteLine(SR.DumpReproInstructions);
- Console.WriteLine();
+ MethodDesc failingMethod = ex.Method;
+ Console.WriteLine(CreateReproArgumentString(failingMethod));
return false;
}