- tools/[biosnoop](tools/biosnoop): Trace block device I/O with PID and latency. [Examples](tools/biosnoop_example.txt).
- tools/[funccount](tools/funccount): Count kernel function calls. [Examples](tools/funccount_example.txt).
- tools/[funclatency](tools/funclatency): Time kernel functions and show their latency distribution. [Examples](tools/funclatency_example.txt).
+- tools/[hardirqs](tools/hardirqs): Measure hard IRQ (hard interrupt) event time. [Examples](tools/hardirqs_example.txt).
- tools/[killsnoop](tools/killsnoop): Trace signals issued by the kill() syscall. [Examples](tools/killsnoop_example.txt).
- tools/[opensnoop](tools/opensnoop): Trace open() syscalls. [Examples](tools/opensnoop_example.txt).
- tools/[pidpersec](tools/pidpersec): Count new processes (via fork). [Examples](tools/pidpersec_example.txt).
+- tools/[softirqs](tools/softirqs): Measure soft IRQ (soft interrupt) event time. [Examples](tools/softirqs_example.txt).
- tools/[syncsnoop](tools/syncsnoop): Trace sync() syscall. [Examples](tools/syncsnoop_example.txt).
- tools/[tcpaccept](tools/tcpaccept): Trace TCP passive connections (accept()). [Examples](tools/tcpaccept_example.txt).
- tools/[tcpconnect](tools/tcpconnect): Trace TCP active connections (connect()). [Examples](tools/tcpconnect_example.txt).
--- /dev/null
+.TH hardirqs 8 "2015-10-20" "USER COMMANDS"
+.SH NAME
+hardirqs \- Measure hard IRQ (hard interrupt) event time. Uses Linux eBPF/bcc.
+.SH SYNOPSIS
+.B hardirqs [\-h] [\-T] [\-N] [\-d] [interval] [count]
+.SH DESCRIPTION
+This summarizes the time spent servicing hard IRQs (hard interrupts), and can
+show this time as either totals or histogram distributions. A system-wide
+summary of this time is shown by the %irq column of mpstat(1), and event
+counts (but not times) are shown by /proc/interrupts.
+
+WARNING: This currently uses dynamic tracing of hard interrupts. You should
+understand what this means before use. Try in a test environment. Future
+versions should switch to tracepoints.
+
+Since this uses BPF, only the root user can use this tool.
+.SH REQUIREMENTS
+CONFIG_BPF and bcc.
+.SH OPTIONS
+.TP
+\-h
+Print usage message.
+.TP
+\-T
+Include timestamps on output.
+.TP
+\-N
+Output in nanoseconds
+.TP
+\-d
+Show IRQ time distribution as histograms
+.SH EXAMPLES
+.TP
+Sum hard IRQ event time until Ctrl-C:
+#
+.B hardirqs
+.TP
+Show hard IRQ event time as histograms:
+#
+.B hardirqs \-d
+.TP
+Print 1 second summaries, 10 times:
+#
+.B hardirqs 1 10
+.TP
+1 second summaries, printed in nanoseconds, with timestamps:
+#
+.B hardirqs \-NT 1
+.SH FIELDS
+.TP
+HARDIRQ
+The irq action name for this hard IRQ.
+.TP
+TOTAL_usecs
+Total time spent in this hard IRQ in microseconds.
+.TP
+TOTAL_nsecs
+Total time spent in this hard IRQ in nanoseconds.
+.TP
+usecs
+Range of microseconds for this bucket.
+.TP
+nsecs
+Range of nanoseconds for this bucket.
+.TP
+count
+Number of hard IRQs in this time range.
+.TP
+distribution
+ASCII representation of the distribution (the count column).
+.SH OVERHEAD
+This traces kernel functions and maintains in-kernel counts, which
+are asynchronously copied to user-space. While the rate of interrupts
+be very high (>1M/sec), this is a relatively efficient way to trace these
+events, and so the overhead is expected to be small for normal workloads, but
+could become noticable for heavy workloads. Measure in a test environment
+before use.
+.SH SOURCE
+This is from bcc.
+.IP
+https://github.com/iovisor/bcc
+.PP
+Also look in the bcc distribution for a companion _examples.txt file containing
+example usage, output, and commentary for this tool.
+.SH OS
+Linux
+.SH STABILITY
+Unstable - in development.
+.SH AUTHOR
+Brendan Gregg
+.SH SEE ALSO
+softirqs(8)
--- /dev/null
+.TH softirqs 8 "2015-10-20" "USER COMMANDS"
+.SH NAME
+softirqs \- Measure soft IRQ (soft interrupt) event time. Uses Linux eBPF/bcc.
+.SH SYNOPSIS
+.B softirqs [\-h] [\-T] [\-N] [\-d] [interval] [count]
+.SH DESCRIPTION
+This summarizes the time spent servicing soft IRQs (soft interrupts), and can
+show this time as either totals or histogram distributions. A system-wide
+summary of this time is shown by the %soft column of mpstat(1), and soft IRQ
+event counts (but not times) are available in /proc/softirqs.
+
+WARNING: This currently uses dynamic tracing of various soft interrupt
+functions, and can easily not work with different kernel versions. Check and
+adjust the code as necessary. Also try in a test environment and ensure this
+tool is safe before use. Future versions should switch to tracepoints.
+
+Since this uses BPF, only the root user can use this tool.
+.SH REQUIREMENTS
+CONFIG_BPF and bcc.
+.SH OPTIONS
+.TP
+\-h
+Print usage message.
+.TP
+\-T
+Include timestamps on output.
+.TP
+\-N
+Output in nanoseconds
+.TP
+\-d
+Show IRQ time distribution as histograms
+.SH EXAMPLES
+.TP
+Sum soft IRQ event time until Ctrl-C:
+#
+.B softirqs
+.TP
+Show soft IRQ event time as histograms:
+#
+.B softirqs \-d
+.TP
+Print 1 second summaries, 10 times:
+#
+.B softirqs 1 10
+.TP
+1 second summaries, printed in nanoseconds, with timestamps:
+#
+.B softirqs \-NT 1
+.SH FIELDS
+.TP
+SOFTIRQ
+The kernel function name that performs the soft IRQ action.
+.TP
+TOTAL_usecs
+Total time spent in this soft IRQ function in microseconds.
+.TP
+TOTAL_nsecs
+Total time spent in this soft IRQ function in nanoseconds.
+.TP
+usecs
+Range of microseconds for this bucket.
+.TP
+nsecs
+Range of nanoseconds for this bucket.
+.TP
+count
+Number of soft IRQs in this time range.
+.TP
+distribution
+ASCII representation of the distribution (the count column).
+.SH OVERHEAD
+This traces kernel functions and maintains in-kernel counts, which
+are asynchronously copied to user-space. While the rate of interrupts
+be very high (>1M/sec), this is a relatively efficient way to trace these
+events, and so the overhead is expected to be small for normal workloads, but
+could become noticable for heavy workloads. Measure in a test environment
+before use.
+.SH SOURCE
+This is from bcc.
+.IP
+https://github.com/iovisor/bcc
+.PP
+Also look in the bcc distribution for a companion _examples.txt file containing
+example usage, output, and commentary for this tool.
+.SH OS
+Linux
+.SH STABILITY
+Unstable - in development.
+.SH AUTHOR
+Brendan Gregg
+.SH SEE ALSO
+hardirqs(8)
--- /dev/null
+#!/usr/bin/python
+#
+# hardirqs Summarize hard IRQ (interrupt) event time.
+# For Linux, uses BCC, eBPF.
+#
+# USAGE: hardirqs [-h] [-T] [-Q] [-m] [-D] [interval] [count]
+#
+# Thanks Amer Ather for help understanding irq behavior.
+#
+# Copyright (c) 2015 Brendan Gregg.
+# Licensed under the Apache License, Version 2.0 (the "License")
+#
+# 19-Oct-2015 Brendan Gregg Created this.
+
+from __future__ import print_function
+from bcc import BPF
+from time import sleep, strftime
+import argparse
+
+### arguments
+examples = """examples:
+ ./hardirqs # sum hard irq event time
+ ./hardirqs -d # show hard irq event time as histograms
+ ./hardirqs 1 10 # print 1 second summaries, 10 times
+ ./hardirqs -NT 1 # 1s summaries, nanoseconds, and timestamps
+"""
+parser = argparse.ArgumentParser(
+ description="Summarize hard irq event time as histograms",
+ formatter_class=argparse.RawDescriptionHelpFormatter,
+ epilog=examples)
+parser.add_argument("-T", "--timestamp", action="store_true",
+ help="include timestamp on output")
+parser.add_argument("-N", "--nanoseconds", action="store_true",
+ help="output in nanoseconds")
+parser.add_argument("-d", "--dist", action="store_true",
+ help="show distributions as histograms")
+parser.add_argument("interval", nargs="?", default=99999999,
+ help="output interval, in seconds")
+parser.add_argument("count", nargs="?", default=99999999,
+ help="number of outputs")
+args = parser.parse_args()
+countdown = int(args.count)
+if args.nanoseconds:
+ factor = 1
+ label = "nsecs"
+else:
+ factor = 1000
+ label = "usecs"
+debug = 0
+
+### define BPF program
+bpf_text = """
+#include <uapi/linux/ptrace.h>
+#include <linux/irq.h>
+#include <linux/irqdesc.h>
+#include <linux/interrupt.h>
+
+typedef struct irq_key {
+ char name[32];
+ u64 slot;
+} irq_key_t;
+BPF_HASH(start, u32);
+BPF_HASH(irqdesc, u32, struct irq_desc *);
+BPF_HISTOGRAM(dist, irq_key_t);
+
+// time IRQ
+int trace_start(struct pt_regs *ctx, struct irq_desc *desc)
+{
+ u32 pid = bpf_get_current_pid_tgid();
+ u64 ts = bpf_ktime_get_ns();
+ start.update(&pid, &ts);
+ irqdesc.update(&pid, &desc);
+ return 0;
+}
+
+int trace_completion(struct pt_regs *ctx)
+{
+ u64 *tsp, delta;
+ struct irq_desc **descp;
+ u32 pid = bpf_get_current_pid_tgid();
+
+ // fetch timestamp and calculate delta
+ tsp = start.lookup(&pid);
+ descp = irqdesc.lookup(&pid);
+ if (tsp == 0 || descp == 0) {
+ return 0; // missed start
+ }
+ // Note: descp is a value from map, so '&' can be done without
+ // probe_read, but the next level irqaction * needs a probe read.
+ // Do these steps first after reading the map, otherwise some of these
+ // pointers may get pushed onto the stack and verifier will fail.
+ struct irqaction *action = 0;
+ bpf_probe_read(&action, sizeof(action), &(*descp)->action);
+ const char **namep = &action->name;
+ char *name = 0;
+ bpf_probe_read(&name, sizeof(name), namep);
+ delta = bpf_ktime_get_ns() - *tsp;
+
+ // store as sum or histogram
+ STORE
+
+ start.delete(&pid);
+ irqdesc.delete(&pid);
+ return 0;
+}
+"""
+
+### code substitutions
+if args.dist:
+ bpf_text = bpf_text.replace('STORE',
+ 'irq_key_t key = {.slot = bpf_log2l(delta)};' +
+ 'bpf_probe_read(&key.name, sizeof(key.name), name);' +
+ 'dist.increment(key);')
+else:
+ bpf_text = bpf_text.replace('STORE',
+ 'irq_key_t key = {.slot = 0 /* ignore */};' +
+ 'bpf_probe_read(&key.name, sizeof(key.name), name);' +
+ 'u64 zero = 0, *vp = dist.lookup_or_init(&key, &zero);' +
+ '(*vp) += delta;')
+if debug:
+ print(bpf_text)
+
+### load BPF program
+b = BPF(text=bpf_text)
+
+# these should really use irq:irq_handler_entry/exit tracepoints:
+b.attach_kprobe(event="handle_irq_event_percpu", fn_name="trace_start")
+b.attach_kretprobe(event="handle_irq_event_percpu", fn_name="trace_completion")
+
+print("Tracing hard irq event time... Hit Ctrl-C to end.")
+
+### output
+exiting = 0 if args.interval else 1
+dist = b.get_table("dist")
+while (1):
+ try:
+ sleep(int(args.interval))
+ except KeyboardInterrupt:
+ exiting=1
+
+ print()
+ if args.timestamp:
+ print("%-8s\n" % strftime("%H:%M:%S"), end="")
+
+ if args.dist:
+ dist.print_log2_hist(label, "hardirq")
+ else:
+ print("%-26s %11s" % ("HARDIRQ", "TOTAL_" + label))
+ for k, v in sorted(dist.items(), key=lambda dist: dist[1].value):
+ print("%-26s %11d" % (k.name, v.value / factor))
+ dist.clear()
+
+ countdown -= 1
+ if exiting or countdown == 0:
+ exit()
--- /dev/null
+Demonstrations of hardirqs, the Linux eBPF/bcc version.
+
+
+This program traces hard interrupts (irqs), and stores timing statistics
+in-kernel for efficiency. For example:
+
+# ./hardirqs
+Tracing hard irq event time... Hit Ctrl-C to end.
+^C
+HARDIRQ TOTAL_usecs
+callfuncsingle0 2
+callfuncsingle5 5
+callfuncsingle6 5
+callfuncsingle7 21
+blkif 66
+timer7 84
+resched5 94
+resched0 97
+resched3 102
+resched7 111
+resched6 255
+timer3 362
+resched4 367
+timer5 474
+timer1 529
+timer6 679
+timer2 746
+timer4 943
+resched1 1048
+timer0 1558
+resched2 1750
+eth0 11441
+
+The HARDIRQ column prints the interrupt action name. While tracing, the eth0
+hard irq action ran for 11441 microseconds (11 milliseconds) in total.
+
+Many other interrupts are visible in the output: this is an 8 CPU system, and
+some of these interrupts have a separate action per-CPU (eg, "timer",
+"resched").
+
+
+An interval can be provided, and also optionally a count. Eg, printing output
+every 1 second, and including timestamps (-T):
+
+# ./hardirqs -T 1 3
+Tracing hard irq event time... Hit Ctrl-C to end.
+
+22:16:14
+HARDIRQ TOTAL_usecs
+callfuncsingle0 2
+callfuncsingle7 5
+callfuncsingle3 5
+callfuncsingle2 5
+callfuncsingle6 6
+callfuncsingle1 11
+resched0 32
+blkif 51
+resched5 71
+resched7 71
+resched4 72
+resched6 82
+timer7 172
+resched1 187
+resched2 236
+timer3 252
+resched3 282
+timer1 320
+timer2 374
+timer6 396
+timer5 427
+timer4 470
+timer0 1430
+eth0 7498
+
+22:16:15
+HARDIRQ TOTAL_usecs
+callfuncsingle7 6
+callfuncsingle5 11
+callfuncsingle4 13
+timer2 17
+callfuncsingle6 18
+resched0 21
+blkif 33
+resched3 40
+resched5 60
+resched4 69
+resched6 70
+resched7 74
+timer7 86
+resched2 91
+timer3 134
+resched1 293
+timer5 354
+timer1 433
+timer6 497
+timer4 1112
+timer0 1768
+eth0 6972
+
+22:16:16
+HARDIRQ TOTAL_usecs
+callfuncsingle7 5
+callfuncsingle3 5
+callfuncsingle2 6
+timer3 10
+resched0 18
+callfuncsingle4 22
+resched5 27
+resched6 44
+blkif 45
+resched7 65
+resched4 69
+timer4 77
+resched2 97
+timer7 98
+resched3 103
+timer2 169
+resched1 226
+timer5 525
+timer1 691
+timer6 697
+timer0 1415
+eth0 7152
+
+This can be useful for quantifying where CPU cycles are spent among the hard
+interrupts (summarized as the %irq column from mpstat(1)). The output above
+shows that most time was spent processing for eth0 (network interface), which
+was around 7 milliseconds per second (total across all CPUs).
+
+Note that the time spent among the "timer" interrupts was low, and usually less
+than one microsecond per second. Here's the hardirq per-second output when the
+perf tool is performing a 999 Hertz CPU profile ("perf record -F 999 -a ..."):
+
+22:13:59
+HARDIRQ TOTAL_usecs
+callfuncsingle7 5
+callfuncsingle5 5
+callfuncsingle3 6
+callfuncsingle4 7
+callfuncsingle6 19
+blkif 66
+resched0 66
+resched2 82
+resched7 87
+resched3 96
+resched4 118
+resched5 120
+resched6 130
+resched1 230
+timer3 946
+timer1 1981
+timer7 2618
+timer5 3063
+timer6 3141
+timer4 3511
+timer2 3554
+timer0 5044
+eth0 16015
+
+This sheds some light into the CPU overhead of the perf profiler, which cost
+around 3 milliseconds per second. Note that I'm usually profiling at a much
+lower rate, 99 Hertz, which looks like this:
+
+22:22:12
+HARDIRQ TOTAL_usecs
+callfuncsingle3 5
+callfuncsingle6 5
+callfuncsingle5 22
+blkif 46
+resched6 47
+resched5 57
+resched4 66
+resched7 78
+resched2 97
+resched0 214
+timer2 326
+timer0 498
+timer5 536
+timer6 576
+timer1 600
+timer4 982
+resched1 1315
+timer7 1364
+timer3 1825
+resched3 5708
+eth0 9743
+
+Much lower (and remember to compare this to the baseline). Note that perf has
+other overheads (non-irq CPU cycles, file system storage).
+
+
+The distribution of interrupt run time can be printed as a histogram with the -d
+option. Eg:
+
+# ./hardirqs -d
+Tracing hard irq event time... Hit Ctrl-C to end.
+^C
+
+hardirq = 'callfuncsingle1'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 0 | |
+ 8192 -> 16383 : 1 |****************************************|
+
+hardirq = 'callfuncsingle0'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 1 |****************************************|
+
+hardirq = 'callfuncsingle3'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 3 |****************************************|
+
+hardirq = 'callfuncsingle2'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 2 |****************************************|
+
+hardirq = 'callfuncsingle5'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 5 |****************************************|
+
+hardirq = 'callfuncsingle4'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 6 |****************************************|
+
+hardirq = 'callfuncsingle7'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 4 |****************************************|
+
+hardirq = 'callfuncsingle6'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 4 |****************************************|
+
+hardirq = 'eth0'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 5102 |********* |
+ 1024 -> 2047 : 20617 |****************************************|
+ 2048 -> 4095 : 4832 |********* |
+ 4096 -> 8191 : 12 | |
+
+hardirq = 'timer7'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 9 |*** |
+ 2048 -> 4095 : 70 |***************************** |
+ 4096 -> 8191 : 94 |****************************************|
+
+hardirq = 'timer6'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 1 | |
+ 2048 -> 4095 : 86 |*********** |
+ 4096 -> 8191 : 295 |****************************************|
+ 8192 -> 16383 : 28 |*** |
+
+hardirq = 'timer5'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 1 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 137 |****************************************|
+ 4096 -> 8191 : 123 |*********************************** |
+ 8192 -> 16383 : 8 |** |
+
+hardirq = 'timer4'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 46 |********* |
+ 4096 -> 8191 : 198 |****************************************|
+ 8192 -> 16383 : 49 |********* |
+
+hardirq = 'timer3'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 4 | |
+ 2048 -> 4095 : 210 |****************************************|
+ 4096 -> 8191 : 186 |*********************************** |
+
+hardirq = 'timer2'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 245 |****************************************|
+ 4096 -> 8191 : 227 |************************************* |
+ 8192 -> 16383 : 6 | |
+
+hardirq = 'timer1'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 6 |* |
+ 2048 -> 4095 : 112 |************************ |
+ 4096 -> 8191 : 181 |****************************************|
+ 8192 -> 16383 : 7 |* |
+
+hardirq = 'timer0'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 887 |****************************************|
+ 8192 -> 16383 : 92 |**** |
+
+hardirq = 'blkif'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 0 | |
+ 2048 -> 4095 : 0 | |
+ 4096 -> 8191 : 9 |****************************************|
+ 8192 -> 16383 : 7 |******************************* |
+ 16384 -> 32767 : 2 |******** |
+
+hardirq = 'resched4'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 104 |****************************************|
+ 2048 -> 4095 : 80 |****************************** |
+
+hardirq = 'resched5'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 27 |***** |
+ 1024 -> 2047 : 216 |****************************************|
+ 2048 -> 4095 : 27 |***** |
+ 4096 -> 8191 : 1 | |
+
+hardirq = 'resched6'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 480 |******************* |
+ 1024 -> 2047 : 1003 |****************************************|
+ 2048 -> 4095 : 64 |** |
+
+hardirq = 'resched7'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 46 |********* |
+ 1024 -> 2047 : 190 |****************************************|
+ 2048 -> 4095 : 42 |******** |
+
+hardirq = 'resched0'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 11 |**** |
+ 1024 -> 2047 : 100 |****************************************|
+ 2048 -> 4095 : 23 |********* |
+
+hardirq = 'resched1'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 96 |******** |
+ 1024 -> 2047 : 462 |****************************************|
+ 2048 -> 4095 : 36 |*** |
+
+hardirq = 'resched2'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 120 |************************** |
+ 1024 -> 2047 : 183 |****************************************|
+ 2048 -> 4095 : 41 |******** |
+
+hardirq = 'resched3'
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 0 | |
+ 1024 -> 2047 : 789 |****************************************|
+ 2048 -> 4095 : 39 |* |
+
+
+USAGE message:
+
+# ./hardirqs -h
+usage: hardirqs [-h] [-T] [-N] [-d] [interval] [count]
+
+Summarize hard irq event time as histograms
+
+positional arguments:
+ interval output interval, in seconds
+ count number of outputs
+
+optional arguments:
+ -h, --help show this help message and exit
+ -T, --timestamp include timestamp on output
+ -N, --nanoseconds output in nanoseconds
+ -d, --dist show distributions as histograms
+
+examples:
+ ./hardirqs # sum hard irq event time
+ ./hardirqs -d # show hard irq event time as histograms
+ ./hardirqs 1 10 # print 1 second summaries, 10 times
+ ./hardirqs -NT 1 # 1s summaries, nanoseconds, and timestamps
--- /dev/null
+#!/usr/bin/python
+#
+# softirqs Summarize soft IRQ (interrupt) event time.
+# For Linux, uses BCC, eBPF.
+#
+# USAGE: softirqs [-h] [-T] [-N] [-d] [interval] [count]
+#
+# Copyright (c) 2015 Brendan Gregg.
+# Licensed under the Apache License, Version 2.0 (the "License")
+#
+# 20-Oct-2015 Brendan Gregg Created this.
+
+from __future__ import print_function
+from bcc import BPF
+from time import sleep, strftime
+import argparse
+
+### arguments
+examples = """examples:
+ ./softirqs # sum soft irq event time
+ ./softirqs -d # show soft irq event time as histograms
+ ./softirqs 1 10 # print 1 second summaries, 10 times
+ ./softirqs -NT 1 # 1s summaries, nanoseconds, and timestamps
+"""
+parser = argparse.ArgumentParser(
+ description="Summarize soft irq event time as histograms",
+ formatter_class=argparse.RawDescriptionHelpFormatter,
+ epilog=examples)
+parser.add_argument("-T", "--timestamp", action="store_true",
+ help="include timestamp on output")
+parser.add_argument("-N", "--nanoseconds", action="store_true",
+ help="output in nanoseconds")
+parser.add_argument("-d", "--dist", action="store_true",
+ help="show distributions as histograms")
+parser.add_argument("interval", nargs="?", default=99999999,
+ help="output interval, in seconds")
+parser.add_argument("count", nargs="?", default=99999999,
+ help="number of outputs")
+args = parser.parse_args()
+countdown = int(args.count)
+if args.nanoseconds:
+ factor = 1
+ label = "nsecs"
+else:
+ factor = 1000
+ label = "usecs"
+debug = 0
+
+### define BPF program
+bpf_text = """
+#include <uapi/linux/ptrace.h>
+
+typedef struct irq_key {
+ u64 ip;
+ u64 slot;
+} irq_key_t;
+BPF_HASH(start, u32);
+BPF_HASH(iptr, u32);
+BPF_HISTOGRAM(dist, irq_key_t);
+
+// time IRQ
+int trace_start(struct pt_regs *ctx)
+{
+ u32 pid = bpf_get_current_pid_tgid();
+ u64 ip = ctx->ip, ts = bpf_ktime_get_ns();
+ start.update(&pid, &ts);
+ iptr.update(&pid, &ip);
+ return 0;
+}
+
+int trace_completion(struct pt_regs *ctx)
+{
+ u64 *tsp, delta, ip, *ipp;
+ u32 pid = bpf_get_current_pid_tgid();
+
+ // fetch timestamp and calculate delta
+ tsp = start.lookup(&pid);
+ ipp = iptr.lookup(&pid);
+ if (tsp == 0 || ipp == 0) {
+ return 0; // missed start
+ }
+ delta = bpf_ktime_get_ns() - *tsp;
+ ip = *ipp;
+
+ // store as sum or histogram
+ STORE
+
+ start.delete(&pid);
+ iptr.delete(&pid);
+ return 0;
+}
+"""
+
+### code substitutions
+if args.dist:
+ bpf_text = bpf_text.replace('STORE',
+ 'irq_key_t key = {.ip = ip, .slot = bpf_log2l(delta)};' +
+ 'dist.increment(key);')
+else:
+ bpf_text = bpf_text.replace('STORE',
+ 'irq_key_t key = {.ip = ip, .slot = 0 /* ignore */};' +
+ 'u64 zero = 0, *vp = dist.lookup_or_init(&key, &zero);' +
+ '(*vp) += delta;')
+if debug:
+ print(bpf_text)
+
+### load BPF program
+b = BPF(text=bpf_text)
+
+# this should really use irq:softirq_entry/exit tracepoints; for now the
+# soft irq functions are individually traced (search your kernel for
+# open_softirq() calls, and adjust the following list as needed).
+for softirqfunc in ("blk_iopoll_softirq", "blk_done_softirq",
+ "rcu_process_callbacks", "run_rebalance_domains", "tasklet_action",
+ "tasklet_hi_action", "run_timer_softirq", "net_tx_action",
+ "net_rx_action"):
+ b.attach_kprobe(event=softirqfunc, fn_name="trace_start")
+ b.attach_kretprobe(event=softirqfunc, fn_name="trace_completion")
+
+print("Tracing soft irq event time... Hit Ctrl-C to end.")
+
+### output
+exiting = 0 if args.interval else 1
+dist = b.get_table("dist")
+while (1):
+ try:
+ sleep(int(args.interval))
+ except KeyboardInterrupt:
+ exiting=1
+
+ print()
+ if args.timestamp:
+ print("%-8s\n" % strftime("%H:%M:%S"), end="")
+
+ if args.dist:
+ dist.print_log2_hist(label, "softirq", section_print_fn=b.ksym)
+ else:
+ print("%-26s %11s" % ("SOFTIRQ", "TOTAL_" + label))
+ for k, v in sorted(dist.items(), key=lambda dist: dist[1].value):
+ print("%-26s %11d" % (b.ksym(k.ip), v.value / factor))
+ dist.clear()
+
+ countdown -= 1
+ if exiting or countdown == 0:
+ exit()
--- /dev/null
+Demonstrations of softirqs, the Linux eBPF/bcc version.
+
+
+This program traces soft interrupts (irqs), and stores timing statistics
+in-kernel for efficiency. For example:
+
+# ./softirqs
+Tracing soft irq event time... Hit Ctrl-C to end.
+^C
+SOFTIRQ TOTAL_usecs
+rcu_process_callbacks 974
+run_rebalance_domains 1809
+run_timer_softirq 2615
+net_tx_action 14605
+tasklet_action 38692
+net_rx_action 88188
+
+The SOFTIRQ column prints the interrupt action function name. While tracing,
+the net_rx_action() soft interrupt ran for 20199 microseconds (20 milliseconds)
+in total.
+
+This tool works by dynamic tracing the individual softirq functions, and will
+need to be adjusted to match kernel/module changes. Future versions should
+use the softirq tracepoints instead.
+
+
+An interval can be provided, and also optionally a count. Eg, printing output
+every 1 second, and including timestamps (-T):
+
+# ./softirqs -T 1 3
+Tracing soft irq event time... Hit Ctrl-C to end.
+
+22:29:16
+SOFTIRQ TOTAL_usecs
+rcu_process_callbacks 456
+run_rebalance_domains 1005
+run_timer_softirq 1196
+net_tx_action 2796
+tasklet_action 5534
+net_rx_action 15075
+
+22:29:17
+SOFTIRQ TOTAL_usecs
+rcu_process_callbacks 456
+run_rebalance_domains 839
+run_timer_softirq 1142
+net_tx_action 1912
+tasklet_action 4428
+net_rx_action 14652
+
+22:29:18
+SOFTIRQ TOTAL_usecs
+rcu_process_callbacks 502
+run_rebalance_domains 840
+run_timer_softirq 1192
+net_tx_action 2341
+tasklet_action 5496
+net_rx_action 15656
+
+This can be useful for quantifying where CPU cycles are spent among the soft
+interrupts (summarized as the %softirq column from mpstat(1), and shown as
+event counts in /proc/softirqs). The output above shows that most time was spent
+processing net_rx_action(), which was around 15 milleconds per second (total
+time across all CPUs).
+
+
+The distribution of interrupt run time can be printed as a histogram with the -d
+option. Eg:
+
+# ./softirqs -d
+Tracing soft irq event time... Hit Ctrl-C to end.
+^C
+
+softirq = net_tx_action
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 440 | |
+ 512 -> 1023 : 27613 |****************************************|
+ 1024 -> 2047 : 5728 |******** |
+ 2048 -> 4095 : 439 | |
+ 4096 -> 8191 : 53 | |
+ 8192 -> 16383 : 2 | |
+
+softirq = net_rx_action
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 6 | |
+ 1024 -> 2047 : 35 | |
+ 2048 -> 4095 : 3562 |**************** |
+ 4096 -> 8191 : 7023 |******************************** |
+ 8192 -> 16383 : 8770 |****************************************|
+ 16384 -> 32767 : 1780 |******** |
+ 32768 -> 65535 : 216 | |
+ 65536 -> 131071 : 4 | |
+
+softirq = tasklet_action
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 95 | |
+ 512 -> 1023 : 12521 |****************************************|
+ 1024 -> 2047 : 1068 |*** |
+ 2048 -> 4095 : 1077 |*** |
+ 4096 -> 8191 : 12349 |*************************************** |
+ 8192 -> 16383 : 464 |* |
+ 16384 -> 32767 : 1 | |
+
+softirq = rcu_process_callbacks
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 0 | |
+ 512 -> 1023 : 708 |****************************************|
+ 1024 -> 2047 : 495 |*************************** |
+ 2048 -> 4095 : 98 |***** |
+ 4096 -> 8191 : 62 |*** |
+ 8192 -> 16383 : 4 | |
+
+softirq = run_timer_softirq
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 2 | |
+ 512 -> 1023 : 366 |********* |
+ 1024 -> 2047 : 1525 |****************************************|
+ 2048 -> 4095 : 629 |**************** |
+ 4096 -> 8191 : 87 |** |
+ 8192 -> 16383 : 1 | |
+
+softirq = run_rebalance_domains
+ usecs : count distribution
+ 0 -> 1 : 0 | |
+ 2 -> 3 : 0 | |
+ 4 -> 7 : 0 | |
+ 8 -> 15 : 0 | |
+ 16 -> 31 : 0 | |
+ 32 -> 63 : 0 | |
+ 64 -> 127 : 0 | |
+ 128 -> 255 : 0 | |
+ 256 -> 511 : 3 | |
+ 512 -> 1023 : 18 |* |
+ 1024 -> 2047 : 80 |******** |
+ 2048 -> 4095 : 374 |****************************************|
+ 4096 -> 8191 : 257 |*************************** |
+ 8192 -> 16383 : 50 |***** |
+ 16384 -> 32767 : 24 |** |
+
+
+USAGE message:
+
+# ./softirqs -h
+usage: softirqs [-h] [-T] [-N] [-d] [interval] [count]
+
+Summarize soft irq event time as histograms
+
+positional arguments:
+ interval output interval, in seconds
+ count number of outputs
+
+optional arguments:
+ -h, --help show this help message and exit
+ -T, --timestamp include timestamp on output
+ -N, --nanoseconds output in nanoseconds
+ -d, --dist show distributions as histograms
+
+examples:
+ ./softirqs # sum soft irq event time
+ ./softirqs -d # show soft irq event time as histograms
+ ./softirqs 1 10 # print 1 second summaries, 10 times
+ ./softirqs -NT 1 # 1s summaries, nanoseconds, and timestamps