- tools/[inject](tools/inject.py): Targeted error injection with call chain and predicates [Examples](tools/inject_example.txt).
- tools/[killsnoop](tools/killsnoop.py): Trace signals issued by the kill() syscall. [Examples](tools/killsnoop_example.txt).
- tools/[klockstat](tools/klockstat.py): Traces kernel mutex lock events and display locks statistics. [Examples](tools/klockstat_example.txt).
+- tools/[kvmexit](tools/kvmexit.py): Display the exit_reason and its statistics of each vm exit. [Examples](tools/kvmexit_example.txt).
- tools/[llcstat](tools/llcstat.py): Summarize CPU cache references and misses by process. [Examples](tools/llcstat_example.txt).
- tools/[mdflush](tools/mdflush.py): Trace md flush events. [Examples](tools/mdflush_example.txt).
- tools/[memleak](tools/memleak.py): Display outstanding memory allocations to find memory leaks. [Examples](tools/memleak_example.txt).
--- /dev/null
+.TH kvmexit 8 "2021-07-08" "USER COMMANDS"
+.SH NAME
+kvmexit \- Display the exit_reason and its statistics of each vm exit.
+.SH SYNOPSIS
+.B kvmexit [\-h] [\-p PID [\-v VCPU | \-a] ] [\-t TID | \-T 'TID1,TID2'] [duration]
+.SH DESCRIPTION
+Frequent VM exits can cause performance problems. This tool helps locate the
+most frequent exit reasons, so that they can be reduced or avoided, by
+displaying the detailed exit reasons and the count of each VM exit for all
+VMs running on one physical machine.
+
+This tool uses a PERCPU_ARRAY (pcpuArrayA) and a percpu_hash (hashA) together
+to store each kvm exit reason and its count. The reason is that when a vcpu
+exits and re-enters, it tends to keep running on the same physical cpu as in
+the last cycle, which is called a 'cache hit' here. The PERCPU_ARRAY records
+the 'cache hit' case to speed things up; all other cases use the percpu_hash.
+
+As RAW_TRACEPOINT_PROBE(kvm_exit) consumes fewer cpu cycles, this tool first
+tries to use raw tracepoints in modules and, if that fails, falls back to the
+regular tracepoint.
+
+Limitation: Hardware-assisted virtualization differs between architectures,
+and currently only VMX on Intel is supported.
+
+Since this uses BPF, only the root user can use this tool.
+.SH REQUIREMENTS
+CONFIG_BPF and bcc.
+
+This also requires Linux 4.7+ (BPF_PROG_TYPE_TRACEPOINT support).
+.SH OPTIONS
+.TP
+\-h
+Print usage message.
+.TP
+\-p PID
+Display the process with this PID only, collapsing all TIDs, with exit reasons
+sorted in descending order.
+.TP
+\-v VCPU
+Display this VCPU only for this PID.
+.TP
+\-a
+Display all TIDS for this PID.
+.TP
+\-t TID
+Display thread with this TID only with exit reasons sorted in descending order.
+.TP
+\-T 'TID1,TID2'
+Display the threads in a set of TIDs, like {395490, 395491}.
+.TP
+duration
+Number of seconds to sleep before displaying the results.
+.SH EXAMPLES
+.TP
+Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end:
+#
+.B kvmexit
+.TP
+Display kvm exit reasons and statistics for all threads after sleeping 6 secs:
+#
+.B kvmexit 6
+.TP
+Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs:
+#
+.B kvmexit -p 1273795 5
+.TP
+Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs:
+#
+.B kvmexit -p 1273795 5 -a
+.TP
+Display kvm exit reasons and statistics for PID 1273795 VCPU 0... Hit Ctrl-C to end:
+#
+.B kvmexit -p 1273795 -v 0
+.TP
+Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs:
+#
+.B kvmexit -p 1273795 -v 0 4
+.TP
+Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs:
+#
+.B kvmexit -t 1273819 10
+.TP
+Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end:
+#
+.B kvmexit -T '1273820,1273819'
+.SH OVERHEAD
+This traces the "kvm_exit" kernel tracepoint, records the exit reason and
+counts its occurrences. Compared with adding more vm-exit reason debug entries
+to the kernel, this tool is easier and more flexible: the bcc python logic
+provides kernel aggregation and custom output, and the in-kernel BPF
+percpu_array and percpu_hash further improve performance.
+
+The impact of using this tool on the host should be negligible. While this
+tool is very efficient, it does affect the guest virtual machine itself. The
+average test results on a guest vm are as follows:
+           | cpu cycles
+no TP      | 1127
+regular TP | 1277 (13% slower)
+RAW TP     | 1187 (5% slower)
+
+Host: echo 1 > /proc/sys/net/core/bpf_jit_enable
+.SH SOURCE
+This is from bcc.
+.IP
+https://github.com/iovisor/bcc
+.PP
+Also look in the bcc distribution for a companion _examples.txt file containing
+example usage, output, and commentary for this tool.
+.SH OS
+Linux
+.SH STABILITY
+Unstable - in development.
+.SH AUTHOR
+Fei Li <lifei.shirley@bytedance.com>
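
As a quick check of the OVERHEAD figures in the man page above, the percentages follow directly from the quoted cycle counts. A minimal sketch; `overhead_pct` is a hypothetical helper used only for this arithmetic and is not part of kvmexit:

```python
def overhead_pct(baseline_cycles, traced_cycles):
    """Relative slowdown, in percent, of a traced run against the baseline."""
    return 100.0 * (traced_cycles - baseline_cycles) / baseline_cycles


if __name__ == "__main__":
    baseline = 1127  # "no TP" row of the man page table
    for label, cycles in (("regular TP", 1277), ("RAW TP", 1187)):
        print("%-10s %d%%" % (label, round(overhead_pct(baseline, cycles))))
```

The rounded results agree with the 13% and 5% stated in the table.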
--- /dev/null
+#!/usr/bin/env python
+#
+# kvmexit.py
+#
+# Display the exit_reason and its statistics of each vm exit
+# for all vcpus of all virtual machines. For example:
+# $./kvmexit.py
+# PID TID KVM_EXIT_REASON COUNT
+# 1273551 1273568 EXIT_REASON_MSR_WRITE 6
+# 1274253 1274261 EXIT_REASON_EXTERNAL_INTERRUPT 1
+# 1274253 1274261 EXIT_REASON_HLT 12
+# ...
+#
+# Besides, we also allow users to specify one pid, tid(s), or one
+# pid and its vcpu. See kvmexit_example.txt for more examples.
+#
+# @PID: each virtual machine's pid in the user space.
+# @TID: the user space's thread of each vcpu of that virtual machine.
+# @KVM_EXIT_REASON: the reason why the vm exits.
+# @COUNT: the count of each @KVM_EXIT_REASON.
+#
+# REQUIRES: Linux 4.7+ (BPF_PROG_TYPE_TRACEPOINT support)
+#
+# Copyright (c) 2021 ByteDance Inc. All rights reserved.
+#
+# Author(s):
+# Fei Li <lifei.shirley@bytedance.com>
+
+
+from __future__ import print_function
+from time import sleep, strftime
+from bcc import BPF
+import argparse
+import multiprocessing
+import os
+import signal
+import subprocess
+
+#
+# Process Arguments
+#
+def valid_args_list(args):
+    args_list = args.split(",")
+    for arg in args_list:
+        try:
+            int(arg)
+        except ValueError:
+            raise argparse.ArgumentTypeError("must be valid integer")
+    return args_list
+
+# arguments
+examples = """examples:
+ ./kvmexit # Display kvm_exit_reason and its statistics in real-time until Ctrl-C
+ ./kvmexit 5 # Display in real-time after sleeping 5s
+ ./kvmexit -p 3195281 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order
+ ./kvmexit -p 3195281 20 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s
+ ./kvmexit -p 3195281 -v 0 # Display only vcpu0 for pid 3195281, descending sort by default
+ ./kvmexit -p 3195281 -a # Display all tids for pid 3195281
+ ./kvmexit -t 395490 # Display only for tid 395490 with exit reasons sorted in descending order
+ ./kvmexit -t 395490 20 # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s
+ ./kvmexit -T '395490,395491' # Display for a union like {395490, 395491}
+"""
+parser = argparse.ArgumentParser(
+ description="Display kvm_exit_reason and its statistics at a timed interval",
+ formatter_class=argparse.RawDescriptionHelpFormatter,
+ epilog=examples)
+parser.add_argument("duration", nargs="?", default=99999999, type=int, help="show delta for next several seconds")
+parser.add_argument("-p", "--pid", type=int, help="trace this PID only")
+exgroup = parser.add_mutually_exclusive_group()
+exgroup.add_argument("-t", "--tid", type=int, help="trace this TID only")
+exgroup.add_argument("-T", "--tids", type=valid_args_list, help="trace a comma separated series of tids with no space in between")
+exgroup.add_argument("-v", "--vcpu", type=int, help="trace this vcpu only")
+exgroup.add_argument("-a", "--alltids", action="store_true", help="trace all tids for this pid")
+args = parser.parse_args()
+duration = int(args.duration)
+
+#
+# Setup BPF
+#
+
+# load BPF program
+bpf_text = """
+#include <linux/delay.h>
+
+#define REASON_NUM 69
+#define TGID_NUM 1024
+
+struct exit_count {
+ u64 exit_ct[REASON_NUM];
+};
+BPF_PERCPU_ARRAY(init_value, struct exit_count, 1);
+BPF_TABLE("percpu_hash", u64, struct exit_count, pcpu_kvm_stat, TGID_NUM);
+
+struct cache_info {
+ u64 cache_pid_tgid;
+ struct exit_count cache_exit_ct;
+};
+BPF_PERCPU_ARRAY(pcpu_cache, struct cache_info, 1);
+
+FUNC_ENTRY {
+ int cache_miss = 0;
+ int zero = 0;
+ u32 er = GET_ER;
+ if (er >= REASON_NUM) {
+ return 0;
+ }
+
+ u64 cur_pid_tgid = bpf_get_current_pid_tgid();
+ u32 tgid = cur_pid_tgid >> 32;
+ u32 pid = cur_pid_tgid;
+
+ if (THREAD_FILTER)
+ return 0;
+
+ struct exit_count *tmp_info = NULL, *initial = NULL;
+ struct cache_info *cache_p;
+ cache_p = pcpu_cache.lookup(&zero);
+ if (cache_p == NULL) {
+ return 0;
+ }
+
+ if (cache_p->cache_pid_tgid == cur_pid_tgid) {
+ //a. If the cur_pid_tgid hit this physical cpu consecutively, save it to pcpu_cache
+ tmp_info = &cache_p->cache_exit_ct;
+ } else {
+ //b. If another pid_tgid matches this pcpu for the last hit, OR it is the first time to hit this physical cpu.
+ cache_miss = 1;
+
+ // b.a Try to load the last cache struct if exists.
+ tmp_info = pcpu_kvm_stat.lookup(&cur_pid_tgid);
+
+ // b.b If it is the first time for the cur_pid_tgid to hit this pcpu, employ a
+ // per_cpu array to initialize pcpu_kvm_stat's exit_count with each exit reason's count is zero
+ if (tmp_info == NULL) {
+ initial = init_value.lookup(&zero);
+ if (initial == NULL) {
+ return 0;
+ }
+
+ pcpu_kvm_stat.update(&cur_pid_tgid, initial);
+ tmp_info = pcpu_kvm_stat.lookup(&cur_pid_tgid);
+ // To pass the verifier
+ if (tmp_info == NULL) {
+ return 0;
+ }
+ }
+ }
+
+ if (er < REASON_NUM) {
+ tmp_info->exit_ct[er]++;
+ if (cache_miss == 1) {
+ if (cache_p->cache_pid_tgid != 0) {
+ // b.*.a Let's save the last hit cache_info into kvm_stat.
+ pcpu_kvm_stat.update(&cache_p->cache_pid_tgid, &cache_p->cache_exit_ct);
+ }
+ // b.* As the cur_pid_tgid meets current pcpu_cache_array for the first time, save it.
+ cache_p->cache_pid_tgid = cur_pid_tgid;
+ bpf_probe_read(&cache_p->cache_exit_ct, sizeof(*tmp_info), tmp_info);
+ }
+ return 0;
+ }
+
+ return 0;
+}
+"""
+
+# format output
+exit_reasons = (
+ "EXCEPTION_NMI",
+ "EXTERNAL_INTERRUPT",
+ "TRIPLE_FAULT",
+ "INIT_SIGNAL",
+ "N/A",
+ "N/A",
+ "N/A",
+ "INTERRUPT_WINDOW",
+ "NMI_WINDOW",
+ "TASK_SWITCH",
+ "CPUID",
+ "N/A",
+ "HLT",
+ "INVD",
+ "INVLPG",
+ "RDPMC",
+ "RDTSC",
+ "N/A",
+ "VMCALL",
+ "VMCLEAR",
+ "VMLAUNCH",
+ "VMPTRLD",
+ "VMPTRST",
+ "VMREAD",
+ "VMRESUME",
+ "VMWRITE",
+ "VMOFF",
+ "VMON",
+ "CR_ACCESS",
+ "DR_ACCESS",
+ "IO_INSTRUCTION",
+ "MSR_READ",
+ "MSR_WRITE",
+ "INVALID_STATE",
+ "MSR_LOAD_FAIL",
+ "N/A",
+ "MWAIT_INSTRUCTION",
+ "MONITOR_TRAP_FLAG",
+ "N/A",
+ "MONITOR_INSTRUCTION",
+ "PAUSE_INSTRUCTION",
+ "MCE_DURING_VMENTRY",
+ "N/A",
+ "TPR_BELOW_THRESHOLD",
+ "APIC_ACCESS",
+ "EOI_INDUCED",
+ "GDTR_IDTR",
+ "LDTR_TR",
+ "EPT_VIOLATION",
+ "EPT_MISCONFIG",
+ "INVEPT",
+ "RDTSCP",
+ "PREEMPTION_TIMER",
+ "INVVPID",
+ "WBINVD",
+ "XSETBV",
+ "APIC_WRITE",
+ "RDRAND",
+ "INVPCID",
+ "VMFUNC",
+ "ENCLS",
+ "RDSEED",
+ "PML_FULL",
+ "XSAVES",
+ "XRSTORS",
+ "N/A",
+ "N/A",
+ "UMWAIT",
+ "TPAUSE"
+)
+
+#
+# Do some checks
+#
+try:
+    # Currently, only the Intel architecture is supported
+    cmd = "cat /proc/cpuinfo | grep vendor_id | head -n 1"
+    arch_info = subprocess.check_output(cmd, shell=True).strip()
+    if b"Intel" in arch_info:
+        pass
+    else:
+        raise Exception("Currently only the Intel architecture is supported; please extend this tool if you need more.")
+
+    # Check if kvm module is loaded
+    if os.access("/dev/kvm", os.R_OK | os.W_OK):
+        pass
+    else:
+        raise Exception("Please insmod kvm module to use kvmexit tool.")
+except Exception as e:
+    raise Exception("Failed to do precondition check, due to: %s." % e)
+
+try:
+    if BPF.support_raw_tracepoint_in_module():
+        # Let's firstly try raw_tracepoint_in_module
+        func_entry = "RAW_TRACEPOINT_PROBE(kvm_exit)"
+        get_er = "ctx->args[0]"
+    else:
+        # If raw_tp_in_module is not supported, fall back to regular tp
+        func_entry = "TRACEPOINT_PROBE(kvm, kvm_exit)"
+        get_er = "args->exit_reason"
+except Exception as e:
+    raise Exception("Failed to catch kvm exit reasons due to: %s" % e)
+
+
+def find_tid(tgt_dir, tgt_vcpu):
+    for tid in os.listdir(tgt_dir):
+        path = tgt_dir + "/" + tid + "/comm"
+        # Close the comm file promptly instead of leaking the handle
+        with open(path, "r") as fp:
+            comm = fp.read()
+        if (comm.find(tgt_vcpu) != -1):
+            return tid
+    return -1
+
+# set process/thread filter
+thread_context = ""
+header_format = ""
+need_collapse = not args.alltids
+if args.tid is not None:
+    thread_context = "TID %s" % args.tid
+    thread_filter = 'pid != %s' % args.tid
+elif args.tids is not None:
+    thread_context = "TIDS %s" % args.tids
+    thread_filter = "pid != " + " && pid != ".join(args.tids)
+    header_format = "TIDS "
+elif args.pid is not None:
+    thread_context = "PID %s" % args.pid
+    thread_filter = 'tgid != %s' % args.pid
+    if args.vcpu is not None:
+        thread_context = "PID %s VCPU %s" % (args.pid, args.vcpu)
+        # transfer vcpu to tid
+        tgt_dir = '/proc/' + str(args.pid) + '/task'
+        tgt_vcpu = "CPU " + str(args.vcpu)
+        args.tid = find_tid(tgt_dir, tgt_vcpu)
+        if args.tid == -1:
+            raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid))
+        thread_filter = 'pid != %s' % args.tid
+    elif args.alltids:
+        thread_context = "PID %s and its all threads" % args.pid
+        header_format = "TID "
+else:
+    thread_context = "all threads"
+    thread_filter = '0'
+    header_format = "PID TID "
+bpf_text = bpf_text.replace('THREAD_FILTER', thread_filter)
+
+# For kernel >= 5.0, use RAW_TRACEPOINT_MODULE for performance consideration
+bpf_text = bpf_text.replace('FUNC_ENTRY', func_entry)
+bpf_text = bpf_text.replace('GET_ER', get_er)
+b = BPF(text=bpf_text)
+
+
+# header
+print("Display kvm exit reasons and statistics for %s" % thread_context, end="")
+if duration < 99999999:
+    print(" after sleeping %d secs." % duration)
+else:
+    print("... Hit Ctrl-C to end.")
+print("%s%-35s %s" % (header_format, "KVM_EXIT_REASON", "COUNT"))
+
+# signal handler
+def signal_ignore(signal, frame):
+    print()
+try:
+    sleep(duration)
+except KeyboardInterrupt:
+    signal.signal(signal.SIGINT, signal_ignore)
+
+
+# Currently, sorting multiple tids in descending order is not supported.
+if (args.pid or args.tid):
+    ct_reason = []
+    if args.pid:
+        tgid_exit = [0 for i in range(len(exit_reasons))]
+
+# output
+pcpu_kvm_stat = b["pcpu_kvm_stat"]
+pcpu_cache = b["pcpu_cache"]
+for k, v in pcpu_kvm_stat.items():
+    tgid = k.value >> 32
+    pid = k.value & 0xffffffff
+    for i in range(0, len(exit_reasons)):
+        sum1 = 0
+        for inner_cpu in range(0, multiprocessing.cpu_count()):
+            cachePIDTGID = pcpu_cache[0][inner_cpu].cache_pid_tgid
+            # Take priority to check if it is in cache
+            if cachePIDTGID == k.value:
+                sum1 += pcpu_cache[0][inner_cpu].cache_exit_ct.exit_ct[i]
+            # If not in cache, find from kvm_stat
+            else:
+                sum1 += v[inner_cpu].exit_ct[i]
+        if sum1 == 0:
+            continue
+
+        if (args.pid and args.pid == tgid and need_collapse):
+            tgid_exit[i] += sum1
+        elif (args.tid and args.tid == pid):
+            ct_reason.append((sum1, i))
+        elif not need_collapse or args.tids:
+            print("%-8u %-35s %-8u" % (pid, exit_reasons[i], sum1))
+        else:
+            print("%-8u %-8u %-35s %-8u" % (tgid, pid, exit_reasons[i], sum1))
+
+    # Display only for the target tid in descending sort
+    if (args.tid and args.tid == pid):
+        ct_reason.sort(reverse=True)
+        for i in range(0, len(ct_reason)):
+            if ct_reason[i][0] == 0:
+                continue
+            print("%-35s %-8u" % (exit_reasons[ct_reason[i][1]], ct_reason[i][0]))
+        break
+
+
+# Aggregate all tids' counts for this args.pid in descending sort
+if args.pid and need_collapse:
+    for i in range(0, len(exit_reasons)):
+        ct_reason.append((tgid_exit[i], i))
+    ct_reason.sort(reverse=True)
+    for i in range(0, len(ct_reason)):
+        if ct_reason[i][0] == 0:
+            continue
+        print("%-35s %-8u" % (exit_reasons[ct_reason[i][1]], ct_reason[i][0]))
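
The -v option in the script above maps a vcpu number to a thread ID by scanning /proc/PID/task/*/comm for the substring "CPU N". A minimal sketch of a stricter variant of that lookup, under the assumption that vcpu threads carry QEMU-style names such as "CPU 0/KVM" (the patch itself only relies on the substring). Anchoring on the trailing "/" keeps vcpu 1 from also matching "CPU 10/KVM". `find_vcpu_tid` and `make_fake_task_dir` are hypothetical names, and the fake directory exists only so the sketch runs without a real VM:

```python
import os
import re
import tempfile


def find_vcpu_tid(task_dir, vcpu):
    """Return the TID whose comm starts with 'CPU <vcpu>/', or -1."""
    # Assumption: vcpu threads are named like "CPU 0/KVM".
    pattern = re.compile(r"CPU %d/" % vcpu)
    for tid in os.listdir(task_dir):
        comm_path = os.path.join(task_dir, tid, "comm")
        try:
            with open(comm_path) as fp:
                comm = fp.read()
        except OSError:
            # The thread may have exited between listdir() and open().
            continue
        if pattern.match(comm):
            return int(tid)
    return -1


def make_fake_task_dir(comms):
    """Build a throwaway /proc/PID/task look-alike for demonstration."""
    root = tempfile.mkdtemp()
    for tid, comm in comms.items():
        os.mkdir(os.path.join(root, str(tid)))
        with open(os.path.join(root, str(tid), "comm"), "w") as fp:
            fp.write(comm + "\n")
    return root


if __name__ == "__main__":
    fake = make_fake_task_dir(
        {100: "qemu-kvm", 101: "CPU 10/KVM", 102: "CPU 1/KVM"})
    print(find_vcpu_tid(fake, 1))
```

With a plain substring search, vcpu 1 could resolve to either thread 101 or 102 depending on directory order; the anchored pattern only accepts 102.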
--- /dev/null
+Demonstrations of kvm exit reasons, the Linux eBPF/bcc version.
+
+
+Frequent VM exits can cause performance problems. This tool helps locate the
+most frequent exit reasons, so that they can be reduced or avoided, by
+displaying the detailed exit reasons and the count of each VM exit for all
+VMs running on one physical machine.
+
+
+Features of this tool
+=====================
+
+- Although there is a patch: [KVM: x86: add full vm-exit reason debug entries]
+ (https://patchwork.kernel.org/project/kvm/patch/1555939499-30854-1-git-send-email-pizhenwei@bytedance.com/)
+ that tries to add more vm-exit reason debug entries, as the comments on it
+ said, that code allocates lots of memory that may never be consumed, misses
+ some arch-specific kvm causes, and cannot do kernel aggregation. bcc, as a
+ user space tool, can instead implement all of these functions more easily
+ and flexibly.
+- The bcc python logic provides kernel aggregation and custom output, such as
+ collapsing all tids for one pid (i.e. one vm's qemu process id) with exit
+ reasons sorted in descending order. For more information, see the USAGE
+ message below.
+- The bpf in-kernel percpu_array and percpu_hash further improve performance.
+ For more information, see "Help to understand" below.
+
+
+Limitations
+===========
+
+Hardware-assisted virtualization differs between architectures, and
+currently only VMX on Intel is supported. AMD support is on the way.
+
+
+Example output:
+===============
+
+# ./kvmexit.py
+Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end.
+PID TID KVM_EXIT_REASON COUNT
+^C1273551 1273568 EXIT_REASON_HLT 12
+1273551 1273568 EXIT_REASON_MSR_WRITE 6
+1274253 1274261 EXIT_REASON_EXTERNAL_INTERRUPT 1
+1274253 1274261 EXIT_REASON_HLT 12
+1274253 1274261 EXIT_REASON_MSR_WRITE 4
+
+# ./kvmexit.py 6
+Display kvm exit reasons and statistics for all threads after sleeping 6 secs.
+PID TID KVM_EXIT_REASON COUNT
+1273903 1273922 EXIT_REASON_EXTERNAL_INTERRUPT 175
+1273903 1273922 EXIT_REASON_CPUID 10
+1273903 1273922 EXIT_REASON_HLT 6043
+1273903 1273922 EXIT_REASON_IO_INSTRUCTION 24
+1273903 1273922 EXIT_REASON_MSR_WRITE 15025
+1273903 1273922 EXIT_REASON_PAUSE_INSTRUCTION 11
+1273903 1273922 EXIT_REASON_EOI_INDUCED 12
+1273903 1273922 EXIT_REASON_EPT_VIOLATION 6
+1273903 1273922 EXIT_REASON_EPT_MISCONFIG 380
+1273903 1273922 EXIT_REASON_PREEMPTION_TIMER 194
+1273551 1273568 EXIT_REASON_EXTERNAL_INTERRUPT 18
+1273551 1273568 EXIT_REASON_HLT 989
+1273551 1273568 EXIT_REASON_IO_INSTRUCTION 10
+1273551 1273568 EXIT_REASON_MSR_WRITE 2205
+1273551 1273568 EXIT_REASON_PAUSE_INSTRUCTION 1
+1273551 1273568 EXIT_REASON_EOI_INDUCED 5
+1273551 1273568 EXIT_REASON_EPT_MISCONFIG 61
+1273551 1273568 EXIT_REASON_PREEMPTION_TIMER 14
+
+# ./kvmexit.py -p 1273795 5
+Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs.
+KVM_EXIT_REASON COUNT
+MSR_WRITE 13467
+HLT 5060
+PREEMPTION_TIMER 345
+EPT_MISCONFIG 264
+EXTERNAL_INTERRUPT 169
+EPT_VIOLATION 18
+PAUSE_INSTRUCTION 6
+IO_INSTRUCTION 4
+EOI_INDUCED 2
+
+# ./kvmexit.py -p 1273795 5 -a
+Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs.
+TID KVM_EXIT_REASON COUNT
+1273819 EXTERNAL_INTERRUPT 64
+1273819 HLT 2802
+1273819 IO_INSTRUCTION 4
+1273819 MSR_WRITE 7196
+1273819 PAUSE_INSTRUCTION 2
+1273819 EOI_INDUCED 2
+1273819 EPT_VIOLATION 6
+1273819 EPT_MISCONFIG 162
+1273819 PREEMPTION_TIMER 194
+1273820 EXTERNAL_INTERRUPT 78
+1273820 HLT 2054
+1273820 MSR_WRITE 5199
+1273820 EPT_VIOLATION 2
+1273820 EPT_MISCONFIG 77
+1273820 PREEMPTION_TIMER 102
+
+# ./kvmexit.py -p 1273795 -v 0
+Display kvm exit reasons and statistics for PID 1273795 VCPU 0... Hit Ctrl-C to end.
+KVM_EXIT_REASON COUNT
+^CMSR_WRITE 2076
+HLT 795
+PREEMPTION_TIMER 86
+EXTERNAL_INTERRUPT 20
+EPT_MISCONFIG 10
+PAUSE_INSTRUCTION 2
+IO_INSTRUCTION 2
+EPT_VIOLATION 1
+EOI_INDUCED 1
+
+# ./kvmexit.py -p 1273795 -v 0 4
+Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs.
+KVM_EXIT_REASON COUNT
+MSR_WRITE 4726
+HLT 1827
+PREEMPTION_TIMER 78
+EPT_MISCONFIG 67
+EXTERNAL_INTERRUPT 28
+IO_INSTRUCTION 4
+EOI_INDUCED 2
+PAUSE_INSTRUCTION 2
+
+# ./kvmexit.py -p 1273795 -v 4 4
+Traceback (most recent call last):
+ File "tools/kvmexit.py", line 306, in <module>
+ raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid))
+ Exception: There's no vCPU 4 for PID 1273795.
+
+# ./kvmexit.py -t 1273819 10
+Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs.
+KVM_EXIT_REASON COUNT
+MSR_WRITE 13318
+HLT 5274
+EPT_MISCONFIG 263
+PREEMPTION_TIMER 171
+EXTERNAL_INTERRUPT 109
+IO_INSTRUCTION 8
+PAUSE_INSTRUCTION 5
+EOI_INDUCED 4
+EPT_VIOLATION 2
+
+# ./kvmexit.py -T '1273820,1273819'
+Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end.
+TIDS KVM_EXIT_REASON COUNT
+^C1273819 EXTERNAL_INTERRUPT 300
+1273819 HLT 13718
+1273819 IO_INSTRUCTION 26
+1273819 MSR_WRITE 37457
+1273819 PAUSE_INSTRUCTION 13
+1273819 EOI_INDUCED 13
+1273819 EPT_VIOLATION 53
+1273819 EPT_MISCONFIG 654
+1273819 PREEMPTION_TIMER 958
+1273820 EXTERNAL_INTERRUPT 212
+1273820 HLT 9002
+1273820 MSR_WRITE 25495
+1273820 PAUSE_INSTRUCTION 2
+1273820 EPT_VIOLATION 64
+1273820 EPT_MISCONFIG 396
+1273820 PREEMPTION_TIMER 268
+
+
+Help to understand
+==================
+
+We use a PERCPU_ARRAY (pcpuArrayA) and a percpu_hash (hashA) together to
+store each kvm exit reason and its count. The reason is that when a vcpu exits
+and re-enters, it tends to keep running on the same physical cpu (pcpu below)
+as in the last cycle, which is called a 'cache hit' here. The PERCPU_ARRAY
+records the 'cache hit' case to speed things up; all other cases use the
+percpu_hash.
+
+BTW, we originally used a common hash to do this, with a u64(exit_reason)
+key and a struct exit_info {tgid_pid, exit_reason} value. But due to
+the big lock in bpf_hash, each update was quite expensive.
+
+Now imagine that pid_tgidA (vcpu A) exits and is going to run on
+pcpuArrayA. The BPF code flow is as follows:
+
+            pid_tgidA keeps running on the same pcpu
+                    //                \\
+                   //                  \\
+                  //        Y      N    \\
+                 //                      \\
+   a. cache_hit                       b. cache_miss
+(cacheA's pid_tgid matches pid_tgidA)       ||
+       |                                    ||
+       |                                    ||
+"increase percpu exit_ct and return"        ||
+    [*Note*]                                ||
+                      pid_tgidA ever been exited on pcpuArrayA?
+                                   //                \\
+                                  //                  \\
+                                 //                    \\
+                                //       Y      N       \\
+                               //                        \\
+                  b.a load_last_hashA            b.b initialize_hashA_with_zero
+                                  \                       /
+                                   \                     /
+                                    \                   /
+                                   "increase percpu exit_ct"
+                                             ||
+                                             ||
+                      is another pid_tgid been running on pcpuArrayA?
+                                   //                \\
+                                  //      Y     N     \\
+                                 //                    \\
+                    b.*.a save_theLastHit_hashB      do_nothing
+                                \\                    //
+                                 \\                  //
+                                  \\                //
+                                    b.* save_to_pcpuArrayA
+
+
+[*Note*] we do not update the table in "a." above, in case the vcpu hits the
+same pcpu again on its next exit; instead we only update it once this pcpu is
+hit by a different tgidpid (vcpu), which happens in "b.*.a" and "b.*".
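
The flow above can be modelled in plain Python to check that no counts are lost when a vcpu moves between pcpus. This is only a sketch of the bookkeeping, not the BPF program: `PerCpuExitStats`, `record` and `totals` are hypothetical names, lists and dicts stand in for the BPF maps, and a pid_tgid of 0 is taken to mean "cache slot unused", as in the patch.

```python
REASON_NUM = 69  # same bound as REASON_NUM in the BPF text of this patch


class PerCpuExitStats:
    """Pure-Python model of the per-cpu cache plus per-cpu hash pair."""

    def __init__(self, ncpus):
        self.cache_owner = [0] * ncpus  # pid_tgid cached on each pcpu
        self.cache_counts = [[0] * REASON_NUM for _ in range(ncpus)]
        self.table = [dict() for _ in range(ncpus)]  # per-cpu hash

    def record(self, cpu, pid_tgid, reason):
        if reason >= REASON_NUM:
            return
        if self.cache_owner[cpu] == pid_tgid:
            # a. cache hit: only the per-cpu cache is touched.
            self.cache_counts[cpu][reason] += 1
            return
        # b. cache miss: load the saved counts (b.a) or start from zero (b.b).
        counts = self.table[cpu].setdefault(pid_tgid, [0] * REASON_NUM)
        counts[reason] += 1
        previous = self.cache_owner[cpu]
        if previous != 0:
            # b.*.a: flush the previous owner's cached counts to the hash.
            self.table[cpu][previous] = list(self.cache_counts[cpu])
        # b.*: the current pid_tgid becomes the cached one on this pcpu.
        self.cache_owner[cpu] = pid_tgid
        self.cache_counts[cpu] = list(counts)

    def totals(self, pid_tgid):
        """Sum over pcpus, preferring the cache where it holds this key."""
        total = [0] * REASON_NUM
        for cpu in range(len(self.table)):
            if self.cache_owner[cpu] == pid_tgid:
                source = self.cache_counts[cpu]
            else:
                source = self.table[cpu].get(pid_tgid)
            if source is None:
                continue
            for reason, count in enumerate(source):
                total[reason] += count
        return total


if __name__ == "__main__":
    vcpu_a, vcpu_b = (7 << 32) | 8, (7 << 32) | 9
    stats = PerCpuExitStats(ncpus=2)
    for cpu, key, reason in [(0, vcpu_a, 12), (0, vcpu_a, 12), (0, vcpu_b, 32),
                             (1, vcpu_a, 12), (0, vcpu_a, 32), (0, vcpu_a, 12)]:
        stats.record(cpu, key, reason)
    print(stats.totals(vcpu_a)[12], stats.totals(vcpu_a)[32])
```

The reader side mirrors the output loop in kvmexit.py: for each pcpu it takes the cached counts when the cache currently belongs to the key, and the hash entry otherwise, which is why the hash may lag behind during a run of cache hits without affecting the final sums.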
+
+
+USAGE message:
+==============
+
+# ./kvmexit.py -h
+usage: kvmexit.py [-h] [-p PID [-v VCPU | -a] ] [-t TID | -T 'TID1,TID2'] [duration]
+
+Display kvm_exit_reason and its statistics at a timed interval
+
+optional arguments:
+ -h, --help show this help message and exit
+ -p PID, --pid PID display process with this PID only, collapse all tids with exit reasons sorted in descending order
+ -v VCPU, --vcpu VCPU display this VCPU only for this PID
+ -a, --alltids display all TIDS for this PID
+ -t TID, --tid TID display thread with this TID only with exit reasons sorted in descending order
+ -T 'TID1,TID2', --tids 'TID1,TID2'
+ display threads for a union like {395490, 395491}
+ duration duration of display, after sleeping several seconds
+
+examples:
+ ./kvmexit # Display kvm_exit_reason and its statistics in real-time until Ctrl-C
+ ./kvmexit 5 # Display in real-time after sleeping 5s
+ ./kvmexit -p 3195281 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order
+ ./kvmexit -p 3195281 20 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s
+ ./kvmexit -p 3195281 -v 0 # Display only vcpu0 for pid 3195281, descending sort by default
+ ./kvmexit -p 3195281 -a # Display all tids for pid 3195281
+ ./kvmexit -t 395490 # Display only for tid 395490 with exit reasons sorted in descending order
+ ./kvmexit -t 395490 20 # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s
+ ./kvmexit -T '395490,395491' # Display for a union like {395490, 395491}
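
For the -T option, the comma-separated TIDs are validated and then joined into the C expression that replaces THREAD_FILTER in the BPF text, so the probe returns early for any thread not in the set. A minimal sketch of that string handling; `parse_tids` and `build_thread_filter` are hypothetical names for logic that the script performs inline:

```python
def parse_tids(text):
    """Split 'TID1,TID2' and reject any entry that is not an integer."""
    tids = text.split(",")
    for tid in tids:
        int(tid)  # raises ValueError for a non-numeric entry
    return tids


def build_thread_filter(tids):
    """Build the C condition under which the BPF probe skips an event."""
    # In the BPF text, "pid" is the lower 32 bits of pid_tgid (the thread ID).
    return "pid != " + " && pid != ".join(tids)


if __name__ == "__main__":
    print(build_thread_filter(parse_tids("1273820,1273819")))
```

The TIDs stay as strings after validation because they are only ever pasted into the BPF source text.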