Add description of TraceLog text format
authorAleksei Vereshchagin <avereschagin@dev.rtsoft.ru>
Tue, 10 Apr 2018 21:25:53 +0000 (00:25 +0300)
committerAleksei Vereshchagin <avereschagin@dev.rtsoft.ru>
Fri, 13 Apr 2018 14:59:28 +0000 (17:59 +0300)
doc/tracelog-format.md [new file with mode: 0644]

diff --git a/doc/tracelog-format.md b/doc/tracelog-format.md
new file mode 100644 (file)
index 0000000..e20b558
--- /dev/null
@@ -0,0 +1,466 @@
+# Format description of trace data file
+
+This document describes format of the performance data dumped by CoreProfiler.
+
+## Table of Contents
+
+<!-- TOC START min:2 max:3 link:true update:true -->
+- [Table of Contents](#table-of-contents)
+- [Format rules](#format-rules)
+- [Profiler records](#profiler-records)
+  - [Profiler Start Time: `prf stm`](#profiler-start-time-prf-stm)
+  - [Profiler Configuration: `prf cfg`](#profiler-configuration-prf-cfg)
+  - [Profiler Pause/Resume: `prf tps`/`prf tps`](#profiler-pauseresume-prf-tpsprf-tps)
+- [Process records](#process-records)
+  - [Process CPU Usage: `prc cpu`](#process-cpu-usage-prc-cpu)
+- [Thread records](#thread-records)
+  - [Thread Created: `thr crt`](#thread-created-thr-crt)
+  - [Thread Assigned To OS Thread: `thr aos`](#thread-assigned-to-os-thread-thr-aos)
+  - [Thread Destroyed: `thr crt`](#thread-destroyed-thr-crt)
+  - [Thread CPU Usage: `thr cpu`](#thread-cpu-usage-thr-cpu)
+- [Execution Engine records](#execution-engine-records)
+  - [Module Load Finished: `mod ldf`](#module-load-finished-mod-ldf)
+  - [Module Attached To Assembly: `mod ata`](#module-attached-to-assembly-mod-ata)
+  - [Assembly Load Finished: `asm ldf`](#assembly-load-finished-asm-ldf)
+  - [Application Domain Creation Finished: `apd crf`](#application-domain-creation-finished-apd-crf)
+- [Class records](#class-records)
+  - [Class Load Finished: `cls ldf`](#class-load-finished-cls-ldf)
+  - [Class Name: `cls nam`](#class-name-cls-nam)
+- [Function records](#function-records)
+  - [Function info: `fun inf`](#function-info-fun-inf)
+  - [Function name: `fun nam`](#function-name-fun-nam)
+- [JIT records](#jit-records)
+  - [JIT Compilation Started: `jit cms`](#jit-compilation-started-jit-cms)
+  - [JIT Compilation Finished: `jit cmf`](#jit-compilation-finished-jit-cmf)
+  - [JIT Cached Function Search Started: `jit css`](#jit-cached-function-search-started-jit-css)
+  - [JIT Cached Function Search Finished: `jit csf`](#jit-cached-function-search-finished-jit-csf)
+- [GC records](#gc-records)
+  - [Garbage Collection Started: `gch gcs`](#garbage-collection-started-gch-gcs)
+  - [Garbage Collection Finished: `gch gcf`](#garbage-collection-finished-gch-gcf)
+  - [GC Heap Allocation Table: `gch alt`](#gc-heap-allocation-table-gch-alt)
+- [Sample records](#sample-records)
+  - [Stack Trace Sample: `sam str`](#stack-trace-sample-sam-str)
+  - [Memory Sample: `sam mem`](#memory-sample-sam-mem)
+
+<!-- TOC END -->
+
+## Format rules
+
+Each dump record is one text line prefixed with `<aaa> <bbb>`,
+where `<aaa>` is the record type and `<bbb>` is the record sub-type.
+All other data on the line depends on the pair of type and sub-type.
+
+Different item types can be described by the dump format. This table enumerates
+this types, its format and description.
+
+<!-- '∣' (U+2223, Symbol divides) instead of `|` (U+01C0, Latin letter dental
+click) used in tables to avoid parser problems. -->
+
+| Item           | Format                                                 | Description                                                                       |
+| -------------- | ------------------------------------------------------ | --------------------------------------------------------------------------------- |
+| `<num>`        | `%d`                                                   | Decimal number                                                                    |
+| `<hex>`        | `0x%x`                                                 | Hexadecimal number                                                                |
+| `<hex32>`      | `0x%08X`                                               | 32-bit hexadecimal number                                                         |
+| `<hex64>`      | `0x%016X`                                              | 64-bit hexadecimal number                                                         |
+| `<str>`        | `%s`                                                   | Text string                                                                       |
+| `<qtn>`        | `"%s"`                                                 | Quoted text string                                                                |
+| `<systime>`    | `%Y-%m-%d %H:%M:%S.%03f`                               | System time in format of year, month, day, hour, minute, seconds and milliseconds |
+| `<ms>`         | `<num>`                                                | Time (in milliseconds)                                                            |
+| `<us>`         | `<num>`                                                | Time (in microseconds)                                                            |
+| `<ptr>`        | `<hex64>`                                              | Memory pointer                                                                    |
+| `<pid>`        | `<num>`                                                | System process identifier                                                         |
+| `<ip>`         | `<ptr>∣?`                                              | Instruction pointer, `?` if undefined                                             |
+| `<hr>`         | `<hex32>`                                              | HRESULT                                                                           |
+| `<id>`         | `<hex64>`                                              | Profiling API ID                                                                  |
+| `<iid>`        | `<hex32>∣?`                                            | Internal CoreProfiler ID, `?` if undefined                                        |
+| `<token>`      | `<hex32>`                                              | .NET Metadata Token                                                               |
+| `<code_info>`  | `<start_addr:hex64>:<size:hex>`                        | Item of Code Info table                                                           |
+| `<il_map>`     | `<ilOffset:hex>:<nativeStart:hex>:<nativeEnd:hex>`     | Item of IL mapping table                                                          |
+| `<alt_item>`   | `<classIid:iid>:<allocCount:num>:<memSize:num>`        | Item of GC allocation table                                                       |
+| `<frame>`      | `<functionIid:iid>[:<ip>]`                             | Call stack frame                                                                  |
+| `<alloc_item>` | `<classIid:iid>:<allocCount:num>:<memSize:num>[:<ip>]` | Item of memory sample                                                             |
+
+Many records have special attribute named 'Internal ID'. The internal ID
+can be used as link to such objects as threads, classes, functions, etc.
+All objects of one type have unique Internal ID within data log. Current
+implementation uses incremental counters for Internal ID started from zero.
+This distinguishes Internal ID from CoreCLR ID, whose values can be reused
+after end of object lifetime, and, in fact, represented by memory pointers.
+
+## Profiler records
+
+### Profiler Start Time: `prf stm`
+
+```
+prf stm <systime>
+```
+
+Local time of the profiling session start-up.
+
+### Profiler Configuration: `prf cfg`
+
+```
+prf cfg <name:str> <value:str>
+```
+
+Used to dump profiler configuration. First string is the configuration parameter
+name. Second string is the configuration parameter value. `<num>` format is used
+for numbers. `T|F` is used for Boolean values.
+
+### Profiler Pause/Resume: `prf tps`/`prf tps`
+
+```
+prf tps <timestamp:ms>
+prf trs <timestamp:ms>
+```
+
+Dumped to trace log when profiling is paused (`tps`) or resumed (`trs`).
+Followed by the event timestamp in milliseconds from the start of profiling.
+
+## Process records
+
+### Process CPU Usage: `prc cpu`
+
+```
+prc cpu <timestamp:ms> <usage:us>
+```
+
+Used to dump process CPU usage.
+
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `usage` - the CPU usage time in microseconds from last timestamp.
+
+For the first record `CpuTraceTimeoutMs` value from the configuration should
+be used as measurement interval duration. In the pause mode CPU usage is still
+counted and will be dumped after resume event relatively to the last timestamp
+before pause.
+
+Process CPU usage in general is total CPU usage of process threads
+(so for parallel case it can be greater than measurement interval duration).
+
+## Thread records
+
+### Thread Created: `thr crt`
+
+```
+thr crt <threadId:id> <threadIid:iid>
+```
+
+Notifies that a thread has been created.
+
+* `threadId` - the ID of the thread that has been created.
+* `threadIid` - the Internal ID of the thread that has been created.
+
+### Thread Assigned To OS Thread: `thr aos`
+
+```
+thr aos <threadIid:iid> <osThreadId:pid>
+```
+
+Notifies that a managed thread is being implemented using a particular operating
+system thread.
+
+* `threadIid` - the Internal ID of the managed thread.
+* `osThreadId` - the PID of the operating system thread.
+
+### Thread Destroyed: `thr crt`
+
+```
+thr crt <threadIid:iid>
+```
+
+Notifies that a thread has been destroyed.
+
+* `threadIid` - the Internal ID of the thread that has been destroyed.
+
+### Thread CPU Usage: `thr cpu`
+
+```
+thr cpu <threadIid:iid> <timestamp:ms> <usage:us>
+```
+
+Used to dump CPU usage of managed thread.
+
+* `threadIid` - the thread Internal ID.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `usage` - the CPU usage time in microseconds from last timestamp.
+
+For the first record `CpuTraceTimeoutMs` value from the configuration should
+be used as measurement interval duration. In the pause mode CPU usage is still
+counted and will be dumped after resume event relatively to the last timestamp
+before pause.
+
+## Execution Engine records
+
+### Module Load Finished: `mod ldf`
+
+```
+mod ldf <moduleId:id> <baseLoadAddress:ptr> <assemblyId:id> <status:hr> <moduleName:qtn>
+```
+
+Notifies that a module has finished loading.
+
+* `moduleId` - the ID of the module that has finished loading.
+* `baseLoadAddress` - the base address at which the module is loaded.
+* `assemblyId` - the ID of the module's parent assembly.
+* `status` - an HRESULT that indicates whether the module was loaded
+  successfully.
+* `moduleName` - the file name of the module.
+
+### Module Attached To Assembly: `mod ata`
+
+```
+mod ata <moduleId:id> <assemblyId:id>
+```
+
+Notifies that a module is being attached to its parent assembly.
+
+* `moduleId` - the ID of the module that is being attached.
+* `assemblyId` - the ID of the parent assembly to which the module is attached.
+
+### Assembly Load Finished: `asm ldf`
+
+```
+asm ldf <assemblyId:id> <appDomainId:id> <moduleId:id> <status:hr> <assemblyName:qtn>
+```
+
+Notifies that an assembly has finished loading.
+
+* `assemblyId` - the ID of the assembly that was loaded.
+* `appDomainId` - the ID of the application domain that contains the assembly.
+* `moduleId` - the ID of the assembly's manifest module.
+* `status` - an HRESULT that indicates whether the assembly finished loading
+  successfully.
+* `assemblyName` - the assembly's name.
+
+### Application Domain Creation Finished: `apd crf`
+
+```
+apd crf <appDomainId:id> <processId:id> <status:hr> <appDomainName:qtn>
+```
+
+Notifies that an application domain has been created.
+
+* `appDomainId` - the ID of the application domain which has been created.
+* `processId` - the ID of the process that contains the application domain.
+* `status` - an HRESULT that indicates whether creation of the application
+  domain completed successfully.
+* `appDomainName` - the full or partial application domain name.
+
+## Class records
+
+### Class Load Finished: `cls ldf`
+
+```
+cls ldf <classId:id> <classIid:iid> <moduleId:id> <classToken:token> <status:hr>
+```
+
+Notifies that a class has finished loading.
+
+* `classId` - the ID of the class that was loaded.
+* `classIid` - the Internal ID of the class that was loaded.
+* `moduleId` - the ID of the parent module of the class.
+* `classToken` - the metadata token for the class.
+* `status` - an HRESULT that indicates whether the class loaded successfully.
+
+### Class Name: `cls nam`
+
+```
+cls nam <internalId:iid> <fullName:qtn>
+```
+
+Dump class name.
+
+## Function records
+
+### Function info: `fun inf`
+
+```
+fun inf <internalId:iid> <functionId:id> <classId:id> <moduleId:id> <funcToken:token> <code_info>... <il_map>...
+```
+
+Describes function information.
+
+* `internalId` - the Internal ID of the function.
+* `functionId` - the ID of the function.
+* `classId` - the ID of the parent class of the function.
+* `moduleId` - the ID of the module in which the function's parent class
+  is defined.
+* `funcToken` - the metadata token for the function.
+* `code_info` - zero or more contiguous block of native code stored in memory.
+  * `start_addr` - the starting address of the contiguous block of code.
+  * `size` - the size of the block.
+* `il_map` - zero or more MSIL to native code map items.
+  * `ilOffset` - the offset of the MSIL code.
+  * `nativeStart` - the offset of the start of the native code.
+  * `nativeEnd` - the offset of the end of the native code.
+
+### Function name: `fun nam`
+
+```
+fun nam <internalId:iid> <fullName:qtn> <returnType:qtn> <signature:qtn>
+```
+
+Dump function name, return type and signature.
+
+## JIT records
+
+### JIT Compilation Started: `jit cms`
+
+```
+jit cms <threadIid:iid> <timestamp:ms> <functionId:id>
+```
+
+Notifies that the JIT compiler has started to compile a function.
+
+* `threadIid` - the Internal ID of the thread running JIT.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `functionId` - the ID of the function for which the compilation is starting.
+
+### JIT Compilation Finished: `jit cmf`
+
+```
+jit cmf <threadIid:iid> <timestamp:ms> <functionId:id> <status:hr>
+```
+
+Notifies that the JIT compiler has finished compiling a function.
+
+* `threadIid` - the Internal ID of the thread running JIT.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `functionId` - the ID of the function that was compiled.
+* `status` - an HRESULT that indicates whether compilation was successful.
+
+### JIT Cached Function Search Started: `jit css`
+
+```
+jit css <threadIid:iid> <timestamp:ms> <functionId:id>
+```
+
+Notifies that a search has started for a function that was compiled previously
+using the Native Image Generator (NGen).
+
+* `threadIid` - the Internal ID of the thread running JIT.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `functionId` - the ID of the function for which the search is being performed.
+
+### JIT Cached Function Search Finished: `jit csf`
+
+```
+jit csf <threadIid:iid> <timestamp:ms> <functionId:id>
+```
+
+Notifies that a search has finished for a function that was compiled previously
+using the Native Image Generator (NGen).
+
+* `threadIid` - the Internal ID of the thread running JIT.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `functionId` - the ID of the function for which the search was performed.
+
+<!-- TODO: add 'result' field to the record. -->
+
+## GC records
+
+### Garbage Collection Started: `gch gcs`
+
+```
+gch gcs <threadIid:iid> <timestamp:ms> <reason:str> <gen_collect:t|f>...
+```
+
+Notifies that garbage collection has started.
+
+* `threadIid` - the Internal ID of the thread running JIT.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `reason` - the reason the garbage collection was induced.
+  * `induced` - the garbage collection was induced by a Collect method.
+  * `?` - the reason is unspecified.
+* `gen_collect` - an array of Boolean values, which are true (`t`)
+  if the generation that corresponds to the array index is being collected
+  by this garbage collection; otherwise, false (`f`).
+  * `[0]` - generation 0;
+  * `[1]` - generation 1;
+  * `[2]` - generation 2;
+  * `[3]` - large-object heap.
+
+### Garbage Collection Finished: `gch gcf`
+
+```
+gch gcf <threadIid:iid> <timestamp:ms>
+```
+
+Notifies that garbage collection has completed.
+
+* `threadIid` - the Internal ID of the thread running JIT.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+
+### GC Heap Allocation Table: `gch alt`
+
+```
+gch alt <timestamp:ms> <alt_item>...
+```
+
+Used to dump the GC allocation table, that describes how many objects of
+different classes remaining in the heap after a garbage collection has completed
+and how many memory is used by them.
+
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `alt_item` - zero of more allocation table items.
+  * `classIid` - the Internal ID of the some class, which instances remaining
+    in the heap.
+  * `allocCount` - the number of instances of the class remaining in the heap.
+  * `memSize` - memory volume (in bytes) used by instances of the class
+    remaining in the heap.
+
+<!-- TODO: remove timestamp - it is dublicated by `gch gcf` -->
+
+## Sample records
+
+### Stack Trace Sample: `sam str`
+
+```
+sam str <threadIid:iid> <timestamp:ms> <count:num> <matchPrefixSize:num>:<stackSize:num>[:<ip>] <frame>...
+```
+
+This record describes stack changes occured from the last known stack state
+(it is considered, that empty stack precedes the first stack trace sample
+record).
+
+* `threadIid` - the Internal ID of the thread that owns stack.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `count` - the number of sampling ticks (with rate, corresponding to the
+  `PROF_SAMPLING_TIMEOUT`) that has been generated before sample was dumped.
+  For the `PROF_HIGH_GRAN=1` mode this is generally `1`.
+  For the `Instrumentaion` mode this is `0`.
+* `matchPrefixSize` - the number of stack frames (from the bottom of the stack)
+  that was not changed.
+* `stackSize` - the number of stack frames in the previous stack state.
+* `ip` - if 'call point' of last matched frame was changed, new instruction
+  pointer will be dumped (`?` for undefined). 'Call point' is an instruction
+  pointer which from the next stack frame was called. For the top frame this is
+  current instruction pointer.
+* `frame` - zero of more new frames added to stack.
+  * `functionIid:iid` - the Internal ID of the called function.
+  * `ip` - 'call point' of the frame. Undefined values is not printed.
+
+`ip` is printed only for `PROF_LINE_TRACE=1` mode.
+
+### Memory Sample: `sam mem`
+
+```
+sam mem <threadIid:iid> <timestamp:ms> <alloc_item>...
+```
+
+This record describes memory allocations occured from the previos memory sample
+(or from start of programm for the first sample record).
+
+* `threadIid` - the Internal ID of the thread that owns stack.
+* `timestamp` - the event timestamp in milliseconds from the start of profiling.
+* `alloc_item`- zero or more allocation items.
+  * `classIid` - the Internal ID of the some class, which instances has been
+    allocated.
+  * `allocCount` - the number of instances of the class,
+    that has been allocated.
+  * `memSize` - memory volume (in bytes) used by instances of the class, that
+    has been allocated.
+  * `ip` - the instruction pointer of corresponding allocation operations.
+
+`ip` is printed only for `PROF_LINE_TRACE=1` mode.